Get the LinkedIn stats of Paul Iusztin and many other LinkedIn influencers by Taplio.
I am a senior machine learning engineer and contractor with 6+ years of experience. I design and implement modular, scalable, and production-ready ML systems for startups worldwide. My central mission is to build data-intensive AI/ML products that serve the world.

Since training my first neural network in 2017, I have two passions that fuel my mission:
- Designing and implementing production AI/ML systems using MLOps best practices.
- Teaching people about the process.

I currently develop production-ready Deep Learning products at Metaphysic, a leading GenAI platform. In the past, I built Computer Vision and MLOps solutions for CoreAI, Everseen, and Continental.

I am also the Founder of Decoding ML, a channel for battle-tested content on learning how to design, code, and deploy production-grade ML and MLOps systems. I write articles and posts each week on:
- LinkedIn: 29k+ followers
- Medium: 2.5k+ followers ~ https://medium.com/@pauliusztin
- Substack (newsletter): 6k+ followers ~ https://decodingml.substack.com/

If you want to learn how to build an end-to-end production-ready LLM & RAG system using MLOps best practices, you can take Decoding ML's self-guided free course:
- LLM Twin Course: Building Your Production-Ready AI Replica ~ https://github.com/decodingml/llm-twin-course

If you need machine learning solutions for your business, let's discuss! I am only open to full-remote positions as a contractor.

Contact:
- Phone: +40 732 509 516
- Email: p.b.iusztin@gmail.com
- Decoding ML: https://linktr.ee/decodingml
- Personal site & socials: https://www.pauliusztin.me/
Fine-tuning isn't hard. Here's where most pipelines fall apart: integrating it into a full LLM system.

So here's how we architected our training pipeline:

Inputs and outputs

The training pipeline has one job:
- Input: a dataset from the data registry and a base model from the model registry
- Output: a fine-tuned model registered in the model registry, ready for deployment

In our case:
- Base: Llama 3.1 8B Instruct
- Dataset: custom summarization data generated from web documents
- Output: a specialized model that summarizes web content

Pipeline steps
1. Load the base model → apply LoRA adapters
2. Load the dataset → format it using Alpaca-style instructions
3. Fine-tune with Unsloth AI on T4 GPUs (via Colab)
4. Track training and eval metrics with Comet
5. If performance is good → push to the Hugging Face model registry
6. If not → iterate with new data or hyperparameters

Most research happens in notebooks (and that's okay). So we kept our training pipeline in a Jupyter Notebook on Colab. Why?
- Let researchers feel at home
- No SSH friction
- Visualize results fast
- Enable rapid iteration
- Plug into the rest of the system via registries

Just because it's manual doesn't mean it's isolated. Here's how it connects:
- Data registry: feeds in the right fine-tuning set
- Model registry: stores the fine-tuned weights
- Inference service: serves the fine-tuned model solely through the model registry
- Eval tracker: logs metrics and compares runs in real time

The notebook is completely decoupled from the rest of the LLM system.

Can it be automated? Yes... and we're almost there. With ZenML already managing our offline pipelines, the training code can be converted into a deployable pipeline. The only barrier? Cost and compute. That's why continuous training (CT) in the LLM space is more of a dream than something you actually want to do in practice.

TL;DR: If you're thinking of training your own LLMs, don't just ask "how do I fine-tune this?" Ask:
- How does it integrate?
- What data version did I use?
- Where do I store the weights?
- How do I track experiments across runs?
- How can I decouple fine-tuning from deployment?

That's what separates model builders from AI engineers.

Full breakdown here: https://lnkd.in/de_ndNbQ
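To make the pipeline steps concrete, here is a minimal sketch of steps 1-5, assuming Unsloth, TRL, and Comet are installed, that the dataset already carries Alpaca-formatted `text` fields, and that the `your-org/...` identifiers are placeholders rather than the actual artifacts from the post (exact parameter names can drift across library versions).

```python
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset

# 1. Load the base model and attach LoRA adapters.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,  # fits on a Colab T4
)
model = FastLanguageModel.get_peft_model(
    model, r=16, lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
)

# 2. Load the (already Alpaca-formatted) dataset from the data registry.
dataset = load_dataset("your-org/web-summarization", split="train")  # placeholder name

# 3-4. Fine-tune and report metrics to Comet (requires COMET_API_KEY in the environment).
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="outputs",
        dataset_text_field="text",
        per_device_train_batch_size=2,
        num_train_epochs=1,
        report_to="comet_ml",
    ),
)
trainer.train()

# 5. If the run looks good, push the adapters to the Hugging Face model registry.
model.push_to_hub("your-org/llama-3.1-8b-web-summarizer")      # placeholder repo
tokenizer.push_to_hub("your-org/llama-3.1-8b-web-summarizer")
```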
Fine-tuning should NEVER be the first step when building an AI system.

Here's the only time you should do it: when nothing else works.

But let's face it... most teams jump straight into fine-tuning. Why? Because it feels technical. Custom. Smart. In reality, it's often just unnecessary complexity.

Before you spend hours generating synthetic data and burning through GPUs, ask yourself three questions:
- Can I solve this with smart prompt engineering?
- Can I improve it further by adding RAG?
- Have I even built an evaluatable system yet?

If the answer to those isn't a solid "YES," you have no business fine-tuning anything.

I say this all the time: "You don't need your own model; you need better system design."
- Prompt engineering handles ~30-50% of cases
- RAG handles another ~30-40%
- Fine-tuning? Reserve it for the last ~10%, when the problem demands it

For example, in our work at Decoding ML, we only fine-tune when:
- The context window is too small for RAG to help
- The task requires domain-specific tone, behavior, or reasoning
- The system is mature enough to warrant the extra complexity

Anything sooner is overkill.

Thanks to Maxime Labonne for helping sharpen this thinking during our work on The LLM Engineer's Handbook (especially when mapping the tradeoffs between fine-tuning, prompting, and RAG).

Want to learn more? Check out Lesson 4 of the Second Brain AI Assistant course. Link in the comments.
90% of RAG systems struggle with the same bottleneck (and better LLMs are not the solution): retrieval.

And most teams don't realize it because they rush to build without proper evaluation.

Before I tell you how to fix this, let me make something clear: Naive RAG is easy. You chunk some docs, embed them, drop a top_k retriever on top, and call it a pipeline. Getting it production-ready? That's where most teams stall.
- They get hallucinations.
- They miss key info.
- Their outputs feel... off.

Why? Because the quality of generation is downstream of the quality of context, and naive RAG often pulls in irrelevant or partial chunks that confuse the LLM.

If you're serious about improving your system, here's the progression that actually works:

Step 1: Fix the Basics
These "table-stakes" upgrades outperform fancy models most of the time:
- Smarter chunking: dynamic over fixed-size; respect document structure.
- Chunk size tuning: too long = lost in the middle; too short = fragmented context.
- Metadata filtering: boosts precision by narrowing scope semantically and structurally.
- Hybrid search: combine vector and keyword retrieval.

Step 2: Layer on Advanced Retrieval
When the basic techniques aren't enough:
- Re-ranking (learned or rule-based)
- Small-to-big retrieval: retrieve sentences, synthesize larger windows.
- Recursive retrieval (e.g., LlamaIndex)
- Multi-hop and agentic retrieval: when you need reasoning across documents.

Step 3: Evaluate or Die Trying
There's no point iterating blindly. Do the following:
- End-to-end eval: is the output good? Use ground truths, synthetic evals, and user feedback.
- Component-level eval: does the retriever return the right chunks? Use ranking metrics like MRR, NDCG, and success@k.

Step 4: Fine-Tuning = Last Resort
Don't start here. Do this only when:
- Your domain is so specific that general embeddings fail.
- Your LLM is too weak to synthesize even when the context is correct.
- You've squeezed all the juice out of prompt and retrieval optimizations.

Fine-tuning adds cost, latency, and infra complexity. It's powerful, but only when everything else is dialed in.

Note: these notes are from a talk over a year old. And yet... most teams are still stuck at step 0. That tells you something: the surface area of RAG is small, but building good RAG is still an unsolved craft. Let's change that.

Want to learn to implement advanced RAG systems yourself? The link is in the comments.

Image credit: LlamaIndex and Jerry Liu
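As a concrete example of the "hybrid search" upgrade from Step 1, here is a minimal, framework-free sketch that merges a vector-search ranking and a keyword-search ranking with reciprocal rank fusion (RRF); the two input rankings and the k constant are illustrative assumptions, not something prescribed in the post.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one hybrid ranking.

    Each document scores sum(1 / (k + rank_i)) over the lists it appears in,
    so items ranked highly by either retriever float to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Illustrative rankings returned by two independent retrievers.
vector_hits = ["doc_7", "doc_2", "doc_9", "doc_4"]   # semantic similarity order
keyword_hits = ["doc_2", "doc_5", "doc_7", "doc_1"]  # BM25 / keyword order

print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# doc_2 and doc_7 rank first because both retrievers agree on them.
```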
Everyone's building agents. But very few are building them for production...

And that's the gap we wanted to close with Lesson 2 of the PhiloAgents course.

Too often, agentic demos look impressive until you try scaling them. Then comes the hidden complexity:
- Orchestrating LLM calls
- Managing memory
- Debugging emergent behavior
- Building in retrieval without breaking the flow

That's why this lesson doesn't stop at toy demos. We show you how to build a real, production-ready RAG agent inside a gaming simulation: an agent that can impersonate philosophers, carry context-aware conversations, and dynamically adapt to user input. Not just an NPC, but a character.

Here's what you'll build:
- An agentic RAG system powered by LangGraph
- A memory architecture backed by MongoDB
- Persona-specific prompt templates streamed via Groq LLM APIs
- Observability and evaluation pipelines instrumented with Opik (by Comet)
- A system designed to scale, recover, and impersonate in real time

This is how you go from scripts to systems. From chatbots to characters.

Lesson 2 is live. (Link in the comments)

P.S. A massive shout-out and thanks to Miguel Otero Pedrido for the collab
Finding the right open-source LLMs to work with is a pain in the backside.

98% of LLM leaderboards are bloated. Too many closed models. Too many broken repos. Too little clarity on what actually works in production. It's frustrating.

Fortunately, I found something to help mitigate this issue...

If you're looking for open-source LLMs that just run, for fine-tuning, quantization, and deployment, Unsloth AI has done the hard work for you.

They've compiled a list of all the popular, supported, and production-viable models that:
- Fine-tune easily (with Unsloth + QLoRA)
- Quantize to GGUFs for local inference (Ollama, llama.cpp, OpenWebUI)
- Play well with Hugging Face and Python
- Come with working code and notebook examples
- Deploy easily to Hugging Face Inference Endpoints, AWS, GCP, Modal, and more

No more jumping between broken GitHub repos or guessing which models will survive a production pipeline. It's the fastest way to stay current without losing your mind.

If you're working with open-source LLMs, just bookmark this list. Link in the comments!
90% of AI engineers are dangerously abstracted from reality.

They work with:
- Prebuilt models
- High-level APIs
- Auto-magical cloud tools

But here's the thing: if you don't understand how these tools actually work, you'll always be guessing when something breaks.

That's why the best AI engineers I know go deeper. They understand how Git actually tracks changes, how Redis handles memory, and how Docker isolates environments.

If you're serious about engineering, go build the tools you use. That's why I recommend CodeCrafters.io (YC S22).

You won't just learn tools. You'll rebuild them (from scratch):
- Git, Redis, Docker, Kafka, SQLite, Shell...
- Step by step, test by test
- In your favorite language (Rust, Python, Go, etc.)

It's perfect for AI engineers who want to:
- Level up their backend and system design skills
- Reduce debugging time in production
- Build apps that actually scale under load

And most importantly...
- Stop being a model user
- Start being a systems thinker

If I had to level up my engineering foundations today, CodeCrafters is where I'd start. The link is in the comments.

P.S. We only promote tools we use or would personally take.
P.P.S. Subscribe with my affiliate link to get a 40% discount :)
The difference between RAG and Agentic RAG isn't technical. It's philosophical...

RAG assumes answers are linear. Agentic RAG assumes thinking is iterative. That single belief changes how you architect everything.

Let me explain. Most RAG pipelines follow this recipe:
- Embed a bunch of documents
- Retrieve the top-K chunks
- Slam them into a prompt
- Pray the model gets it right

It works until the query gets complex. Then the whole thing falls apart. Why? Because RAG is passive. It retrieves once and hopes for the best. But real questions aren't solved in one shot. They evolve. They require clarification, follow-ups, and refined context.

That's where Agentic RAG comes in... Agentic RAG doesn't just retrieve, it also reasons:
- Do I have enough context?
- Should I re-query with a better search?
- Should I ask the user for clarification?
- Which tool should I use next?

The result? A system that thinks before it speaks.

If you're building copilots, assistants, or long-form Q&A tools, this matters. Because reliability comes from better decisions. Agentic RAG introduces that decision loop. It turns workflows into systems. It trades static pipelines for dynamic reasoning. And that mindset shift is where real GenAI builders separate themselves from the hype.

Want to see what Agentic RAG looks like in action? We break it down with code, graphs, and production use cases in the Second Brain AI Assistant course.

Link: https://lnkd.in/dA465E_J
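To show the decision loop in plain code, here is a minimal, framework-free sketch; `llm()` and `search()` are hypothetical stand-ins for your model call and retriever, and the policy (retrieve, judge sufficiency, optionally rewrite the query) is a simplification of what the post describes.

```python
def agentic_rag(question: str, llm, search, max_steps: int = 3) -> str:
    """Retrieve-then-reason loop: keep searching until the agent judges the
    gathered context sufficient, then answer grounded in that context."""
    context: list[str] = []
    query = question
    for _ in range(max_steps):
        context += search(query)  # hypothetical retriever: returns a list of text chunks
        verdict = llm(
            f"Question: {question}\nContext: {context}\n"
            "Is this context enough to answer? Reply ENOUGH or propose a better search query."
        )
        if verdict.strip().upper().startswith("ENOUGH"):
            break
        query = verdict  # re-query with the refined search the LLM proposed
    return llm(
        f"Answer the question using only this context.\nContext: {context}\nQuestion: {question}"
    )
```

A classic RAG pipeline is this loop with max_steps fixed to 1 and the sufficiency check removed; the loop is the "decision" the post talks about.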
The #1 mistake in building LLM agents? Thinking the project ends at reasoning.

Here's when it actually ends: when your agent can talk to the world securely, reliably, and in real time.

And that's what Lesson 4 of the PhiloAgents course is all about.

Up to this point, we focused on making our agents think:
- Philosophical worldviews
- Context-aware reasoning
- Memory-backed conversations

But intelligence alone isn't enough. To be useful, agents need a voice. To be deployable, they need an interface. To be real, they need to exist as APIs.

This lesson is the bridge from the local prototype to the live system. Here's what you'll learn:
- How to deploy your agent as a REST API using FastAPI
- How to stream responses token by token with WebSockets
- How to wire up a clean backend-frontend architecture using FastAPI (web server) + Phaser (game interface)
- How to think about agent interfaces in real-world products (not just demos)

In short: this is how you ship an agent that reasons AND responds in production.

Shout-out to Anca-Ioana Martin for helping shape this lesson and write the deep-dive article. And of course... big thanks to my co-creator Miguel Otero Pedrido for the ongoing collab.

Link to Lesson 4 in the comments.
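Here is a minimal sketch of the WebSocket streaming pattern described above, assuming FastAPI and uvicorn are installed; `generate_tokens()` is a hypothetical stand-in for the agent's streaming output, not the course's actual backend.

```python
import asyncio
from fastapi import FastAPI, WebSocket

app = FastAPI()

async def generate_tokens(message: str):
    """Hypothetical agent call that yields the reply token by token."""
    for token in f"(echoing) {message}".split():
        await asyncio.sleep(0.05)  # simulate model latency
        yield token + " "

@app.websocket("/ws/chat")
async def chat(websocket: WebSocket):
    await websocket.accept()
    while True:
        user_message = await websocket.receive_text()
        async for token in generate_tokens(user_message):
            await websocket.send_text(token)   # stream tokens as they are produced
        await websocket.send_text("[END]")     # tell the frontend the turn is over

# Run with: uvicorn app:app --reload
# then connect from the game UI via ws://localhost:8000/ws/chat
```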
Everyone chunks documents for retrieval. But what if that's the wrong unit? Let me explain...

In standard RAG, we embed small text chunks and pass those into the LLM as context. It's simple, but flawed. Why? Because small chunks are great for retrieval precision, but terrible for generation context.

That's where Parent Retrieval comes in (aka small-to-big retrieval). Here's how it works:
- You split your documents into small chunks
- You embed and retrieve using those small chunks
- But you don't pass the chunk to the LLM...
- You pass the parent document that the chunk came from

The result?
- Precise semantic retrieval (thanks to small, clean embeddings that encode a single entity)
- Rich generation context (because the LLM sees the broader section)
- Fewer hallucinations
- Less tuning needed around chunk size and top-k

It's one of the few advanced RAG techniques that work in production. No fancy agents. No latency bombs. No retraining.

We break it all down (with diagrams and code examples) in Lesson 5 of the Second Brain AI Assistant course.

Link to the full lesson in the comments.
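Here is a minimal, framework-free sketch of the small-to-big idea, assuming an `embed()` function that returns unit-normalized vectors (from any sentence-embedding model you like); it indexes small chunks but hands the parent document to the LLM at query time.

```python
import numpy as np

def build_index(docs: dict[str, str], embed, chunk_size: int = 300):
    """Split each parent doc into small chunks and remember which parent each chunk came from."""
    chunks, parents = [], []
    for doc_id, text in docs.items():
        for i in range(0, len(text), chunk_size):
            chunks.append(text[i : i + chunk_size])
            parents.append(doc_id)
    vectors = np.array([embed(c) for c in chunks])  # one embedding per small chunk
    return vectors, parents

def parent_retrieve(query: str, docs, vectors, parents, embed, top_k: int = 3):
    """Score the small chunks, but return the full parent documents they belong to."""
    scores = vectors @ embed(query)                  # cosine similarity (vectors are normalized)
    best = np.argsort(scores)[::-1][:top_k]
    parent_ids = list(dict.fromkeys(parents[i] for i in best))  # dedupe, keep ranking order
    return [docs[pid] for pid in parent_ids]         # parent docs become the generation context
```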
Here's the problem with most AI books: they teach the model, not the system.

Which is fine... until you try to deploy that model in production. That's where everything breaks:
- Your RAG pipeline is duct-taped together
- Your eval framework is an afterthought
- Your prompts aren't versioned
- Your architecture can't scale

That's why Maxime and I wrote the LLM Engineer's Handbook. We wanted to create a practical guide for AI engineers who build real-world AI applications. This isn't just another guide... it's a practical roadmap for designing and deploying real-world LLM systems.

In the book, we cover:
- Efficient fine-tuning workflows
- RAG architectures
- Evaluation pipelines with LLM-as-a-judge
- Scaling strategies for serving and infra
- MLOps and LLMOps patterns baked in

Whether you're building your first assistant or scaling your 10th RAG app, this book gives you the mental models and engineering scaffolding to do it right.

Here's the link to get your copy: https://lnkd.in/dVgFJtzF
Back in 2023, I was struggling to keep track of my notes. So I did something the Black Mirror producers would be proud of... I built a second brain.

All I wanted was an AI-powered assistant connected to my knowledge base. Something I could use to recall notes, surface ideas, and help me think. But making it real wasn't as simple as connecting a chatbot to Notion.

To get it working, I had to build a full system:
- A modular RAG pipeline to retrieve from custom notes
- Ingestion that crawls and cleans all my noisy resources, regardless of their form
- Real-time APIs to stream responses as I typed
- A memory layer to track context across conversations
- Observability and evaluation to measure what worked

No hacks. No hardcoded prompts. Just an LLM agent that understood my notes and helped me reason through them.

After building it, I open-sourced the entire thing: code and lessons. And over the past year, thousands of engineers have cloned, forked, and built on top of it.

This week, the GitHub repo passed 1,000 stars. I just want to say a massive thank you to everyone who tried it, shared it, or built something new with it.

And to those who haven't seen it yet, the link's in the comments.

P.S. Let me know what you'd create with it.
98% of people consume AI content. But only 2% are actually building with it (and we wanted to change that)...

So we created 5 open-source, project-based AI courses that teach you how to go from zero to production. Each course is built with developers in mind, backed by best practices from MLOps, LLMOps, and modern software engineering. And 100% free.

Here's what's inside:

PhiloAgents (with The Neural Maze)
Build a character simulation engine that brings AI agents to life with memory, retrieval, and real-time dialogue, powered by Groq, LangGraph (by LangChain), and Opik (by Comet).
Learn agents, RAG, persona design, and modular LLM architecture.

Second Brain AI Assistant
Build an AI assistant that chats with your personal knowledge base.
Learn end-to-end agentic RAG pipelines, fine-tuning, modular design, and full-stack AI integration.

Amazon Tabular Semantic Search
Master vector search over structured data by building a natural-language search engine for e-commerce products.
Learn how to embed, index, and retrieve relevant product data using semantic search.

LLM Twin: Your Production-Ready AI Replica
Create your own LLM-powered twin from scratch, designed to reflect your knowledge and communication style.
Learn fine-tuning, embedding, vector databases, and serving production-grade AI.

H&M Real-Time Recommender System
Deploy a neural recommender system for fashion items with real-time serving using Hopsworks and KServe.
Learn feature engineering, MLOps, Kubernetes deployment, and retrieval-augmented recsys.

Just:
- Clone the repo
- Open the Substack lesson
- Follow the guide and run the code
- Remix it, fork it, and make it your own

If you're tired of learning in isolation and want to actually build production AI, these courses are for you.

Link to all 5 courses: https://lnkd.in/d8gP9cxC
Evaluation is the bottleneck of every serious GenAI system. And 90% of teams are still treating it as an afterthought...

If you're building LLM apps, especially with RAG or agentic systems, you've probably hit the same wall:
- Messy prompt changes with zero version control
- Vector search that "feels" right but fails silently
- Outputs that kinda work, but you have no way to quantify why
- No strategy to measure the impact of new features

So ahead of my upcoming Open Data Science Conference (ODSC) 2025 webinar, I'm releasing the full open-source evaluation playbook. If you want to explore the code before the talk drops, here's your chance... Note: you don't have to attend the webinar to use it; the README is detailed enough to guide you.

Here's what you'll get:

Module 1: Prompt Monitoring + Versioning
Track every LLM call and prompt change using Opik by Comet. Visualize agent traces, compare versions, and finally debug with confidence.

Module 2: Retrieval Evaluation for RAG
Use UMAP/t-SNE to visualize embeddings. Compute retrieval recall/precision with LLM-as-a-judge.

Module 3: Application-Level Metrics
Detect hallucinations, moderation issues, and quality drops with custom judges. Log everything into Opik to track iterations across builds.

Module 4: Collecting Real User Feedback
Capture structured feedback from users to fuel future eval splits or fine-tuning jobs (e.g., preference alignment).

Why am I doing this? Because evaluation is hard. And most teams don't have a mental model for how to think about these moving parts, let alone code for it. This project brings structure, tooling, and clarity to that chaos.

Link to the repo in the comments.

P.S. If you're joining my session at ODSC, keep it bookmarked; we'll walk through the full stack live.
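As a small taste of Module 2, here is a sketch of the embedding-visualization step, assuming `umap-learn` and `matplotlib` are installed and that the chunk and query embeddings are stand-ins you would replace with your own; it is illustrative, not the playbook's exact code.

```python
import numpy as np
import umap
import matplotlib.pyplot as plt

# Stand-ins for embeddings you already computed (e.g. 768-dim chunk and query vectors).
chunk_vectors = np.random.rand(500, 768)
query_vectors = np.random.rand(20, 768)

# Project both sets with the same reducer so they land in one shared 2D space.
reducer = umap.UMAP(n_components=2, random_state=42)
points = reducer.fit_transform(np.vstack([chunk_vectors, query_vectors]))

plt.scatter(points[:500, 0], points[:500, 1], s=5, label="chunks")
plt.scatter(points[500:, 0], points[500:, 1], s=30, marker="x", label="queries")
plt.legend()
plt.title("Do queries land near the chunks they should retrieve?")
plt.show()
```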
Here's why 98% of agent demos break after three turns (hint: it's not because the prompts are bad):

The agent doesn't remember what just happened.

Without memory, you don't get reasoning. Without reasoning, you don't get believable agents... you just get brittle demos that fall apart under pressure.

That's why we made Lesson 3 of the PhiloAgents course all about memory. In this lesson, we cover how memory enables:
- Conversational flow via short-term memory
- Grounded reasoning via long-term memory (with agentic RAG)
- Semantic vs. episodic vs. procedural long-term memory
- Scalable architecture across threads, users, and interactions
- Fast, focused context handling through smart summarization

We also break down the critical design choices:
- What kind of memory structures you actually need
- How to avoid bloated infra with a single vector DB
- Why long-term memory ≠ just sticking RAG on top

Interested? Lesson 3 is now live. You'll build all of this directly into a philosopher NPC simulation. (Link in the comments)

P.S. Huge thanks to Miguel Otero Pedrido for the collab on this one. This was one of the most fun pieces to build, and it's a piece most agentic builders overlook.
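Here is a minimal sketch of the short-term-memory piece (trim old turns, carry a running summary forward); `summarize()` is a hypothetical LLM call, and the keep-last-N policy is an assumption rather than the lesson's exact implementation.

```python
def compact_memory(messages: list[str], summary: str, summarize, keep_last: int = 6):
    """Keep the context window small: fold everything except the most recent
    turns into a running summary, and return (recent_messages, new_summary)."""
    if len(messages) <= keep_last:
        return messages, summary
    old, recent = messages[:-keep_last], messages[-keep_last:]
    summary = summarize(f"Existing summary: {summary}\nNew turns to fold in: {old}")
    return recent, summary

def build_prompt(summary: str, recent: list[str], user_message: str) -> str:
    """Short-term memory = running summary + last few verbatim turns."""
    return (
        f"Conversation so far (summarized): {summary}\n"
        + "\n".join(recent)
        + f"\nUser: {user_message}\nAssistant:"
    )
```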
Writing a book felt like a gamble. But looking back, it was one of the best decisions I've ever made.

As of today, the LLM Engineer's Handbook has:
- Sold 12,000+ copies
- Become an Amazon bestseller
- Given me the freedom to build without pressure

When I completely renounced my social life to focus on writing, I didn't know if anyone would read it. I didn't know if it would open any doors. I didn't know if it would be worth the effort.

Fortunately, it all paid off. The book gave me breathing room to focus, reinvest, and go all-in on what I love:
- Content
- AI & software
- Building Decoding ML

But the impact went far beyond the numbers...
- It gave me the confidence that my content is good
- It led to speaking invites at QCon, ODSC, and DataCamp
- It connected me to incredible collaborators like [@whats-ai], which sparked our next course on agents
- And it directly led to my current consulting role (plus many more I've had to turn down)

In short: it's been the catalyst for almost everything I'm building today.

I'm extremely grateful to Maxime Labonne for co-authoring this journey and to Gebin George for trusting me with the opportunity.

TL;DR: If you're thinking about writing a book, do it. You're not just publishing words... you're publishing proof of who you are and what you stand for.
Here's the best piece of advice you need to build real-world agents: "Stop thinking in prompts; start thinking in graphs."

Because under the hood, serious agentic systems aren't just string manipulation. They're structured, dynamic workflows. And that's exactly how the PhiloAgent works... It's not a prompt wrapped in a Python script. It's a full agentic RAG system. Let's break it down...

We use a stateful execution graph to drive our philosopher NPCs. Here's how:

1. Conversation Node
Handles the primary logic. It merges incoming messages, the current state, and the philosopher's identity (style, tone, perspective) to generate the next reply.

2. Retrieval Tool Node
If the agent needs more information, it calls a MongoDB-powered vector search to fetch relevant facts about the philosopher's life and work. This turns simple RAG into agentic RAG, since the LLM dynamically chooses tool calls.

3. Summarize Context Node
We summarize long retrieved passages before injecting them into the prompt. This keeps prompts clean and focused, avoiding dumping in whole Wikipedia pages.

4. Summarize Conversation Node
If the conversation gets long, we summarize and trim earlier messages and keep only recent context, while preserving meaning. The agent needs the summary to stay consistent and reference earlier topics from the conversation. This keeps the context window short and focused, lowering costs and latency and improving accuracy.

5. End Node
Wraps up the cycle. Memory is updated, context evolves, and the agent grows with every message.

Here are the implementation details:
- The short-term memory is kept as a Pydantic in-memory state: the PhilosopherState
- Tool orchestration with LangChain
- Low-latency LLMs, such as Llama 70B, served by Groq
- Smaller 8B models used for summarization tasks
- Prompt templates dynamically generated per philosopher
- Served as a real-time REST API through FastAPI & WebSockets to power the game UI
- Monitoring and evaluation wired through Opik by Comet

In short: agents come alive through structure, memory, tools, and flow control. You can adapt this exact system to build:
- Context-aware assistants
- Multi-turn RAG copilots
- NPCs, tutors, or internal tools that think and retrieve

We walk through every step (with code) in Lesson 2 of the PhiloAgents course.

Link in the comments
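Here is a condensed sketch of how such a graph can be wired with LangGraph; the state fields, node bodies, and routing rule are simplified assumptions for illustration, not the PhiloAgents implementation itself.

```python
from pydantic import BaseModel
from langgraph.graph import StateGraph, END

class PhilosopherState(BaseModel):        # short-term memory lives in this state object
    messages: list[str] = []
    retrieved: list[str] = []
    summary: str = ""

def conversation_node(state: PhilosopherState) -> dict:
    # Merge persona, summary, and retrieved facts into the next reply (LLM call omitted here).
    reply = f"[philosopher reply using {len(state.retrieved)} retrieved facts]"
    return {"messages": state.messages + [reply]}

def retrieval_node(state: PhilosopherState) -> dict:
    # Hypothetical vector search over the philosopher's corpus (e.g. MongoDB-backed).
    return {"retrieved": state.retrieved + ["fact about the philosopher"]}

def summarize_conversation_node(state: PhilosopherState) -> dict:
    # Compress older turns and keep only the recent tail of the conversation.
    return {"summary": f"summary of {len(state.messages)} messages", "messages": state.messages[-4:]}

def route_after_converse(state: PhilosopherState) -> str:
    # If no facts have been fetched yet, go retrieve; otherwise compact memory and finish.
    return "retrieve" if not state.retrieved else "summarize"

graph = StateGraph(PhilosopherState)
graph.add_node("converse", conversation_node)
graph.add_node("retrieve", retrieval_node)
graph.add_node("summarize", summarize_conversation_node)
graph.set_entry_point("converse")
graph.add_conditional_edges("converse", route_after_converse, {"retrieve": "retrieve", "summarize": "summarize"})
graph.add_edge("retrieve", "converse")   # retrieved facts flow back into the conversation node
graph.add_edge("summarize", END)

agent = graph.compile()
final_state = agent.invoke({"messages": ["User: What is virtue?"]})
```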
Here's the most annoying thing about MLOps pipelines (and it's contrary to popular belief): most break at the last mile.

Not during training. Not during evaluation. But at the moment of deployment, more specifically, when testing your local models in production.

It's the part where:
- DevOps gets looped in late
- ML engineers get blocked by infra
- Debugging takes forever

And worst of all? You might wait 20 minutes just to find out your endpoint doesn't work. The long cycles to test your ML deployments kill productivity and, most importantly, the inspiration and experimentation speed that are critical to building AI solutions.

But there's a better way to approach this... Instead of treating deployment as someone else's problem, ML teams can take control by testing their models locally before handing them off. Here's what that looks like in practice:

1. Train and log your model using MLflow
2. Wrap it with a custom class that defines your prediction logic (e.g., convert labels to readable outputs)
3. Download the model artifact using MLflow's CLI
4. Serve it locally using the MLflow inference server
5. Test the /invocations endpoint with real requests to ensure contract correctness
6. Validate edge cases (e.g., malformed input) to catch failures early

This flow ensures your deployment logic works before involving production infra, speeding up the development cycle by 10x. A simple shift in mindset. A massive win in practice.

Thanks to Maria Vechtomova and Başak Tuğçe Eskili for outlining this workflow so clearly in their latest article.

P.S. I highly recommend their course, End-to-end MLOps with Databricks.
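Here is a minimal sketch of that workflow using MLflow's pyfunc flavor; the wrapper class, label map, and example payload are assumptions for illustration, and the exact serving flags and request schema can vary by MLflow version.

```python
import mlflow
import mlflow.pyfunc

LABELS = {0: "cat", 1: "dog"}  # illustrative label map

class ReadableModel(mlflow.pyfunc.PythonModel):
    """Step 2: wrap prediction logic so the served output is human-readable."""
    def predict(self, context, model_input):
        # Stand-in for a real classifier: even row sums -> cat, odd -> dog.
        return [LABELS[int(sum(row)) % 2] for row in model_input.values.tolist()]

# Step 1: log the wrapped model so it lands in the tracking server / registry.
with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(artifact_path="model", python_model=ReadableModel())
print(f"Model URI: runs:/{run.info.run_id}/model")

# Steps 3-5, from a terminal (local inference server + contract test):
#   mlflow models serve -m "runs:/<run_id>/model" -p 5001 --env-manager local
#   curl -X POST http://127.0.0.1:5001/invocations \
#        -H "Content-Type: application/json" \
#        -d '{"dataframe_split": {"columns": ["x1", "x2"], "data": [[1, 2], [3, 3]]}}'
# Step 6: repeat with malformed payloads to confirm the error handling you expect.
```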
If you're thinking about consulting in AI, think twice. Here's what it looks like:

In week 1 with one of my clients, they shared the product vision and roadmap, then dropped a task list that said:
- Deploy the product to AWS with CI/CD
- Support multiple deployment modes
- Add LLM observability
- Optimize and stabilize the core system

That's it. No onboarding, no long handovers. From there, it was all on me to:
- Reverse-engineer the code and architecture
- Identify missing pieces in the infra
- Learn whatever tool or system I needed on the fly
- Ship fast in an environment with low resources and zero handholding

This is the reality of working as a contractor in early-stage AI teams. You don't get to ask for the perfect setup. You make decisions in ambiguity. You learn fast, adapt faster, and ship before you feel ready.

Here's what I've learned:
- You'll never know everything going in
- Mastering fundamentals matters more than mastering tools
- You need to balance speed with systems thinking
- Your job isn't to follow a process; it's to create one that works under fire

If you're looking to freelance or consult in AI, prepare to be thrown into the fire. Your value is in how fast you find clarity, not how much you already know.

No better prep than building, breaking, and repeating.
The most underestimated part of building LLM applications? Evaluation.

Evaluation can take up to 80% of your development time (because it's HARD).

Most people obsess over prompts. They tweak models. Tune embeddings. But when it's time to test whether the whole system actually works? That's where it breaks. Especially in agentic RAG systems, where you're orchestrating retrieval, reasoning, memory, tools, and APIs into one seamless flow. Implementation might take a week. Evaluation takes longer. (And it's what makes or breaks the product.)

Let's clear up a common confusion: LLM evaluation ≠ RAG evaluation.

LLM eval tests reasoning in isolation: useful, but incomplete. In production, your model isn't reasoning in a vacuum. It's pulling context from a vector DB, reacting to user input, and shaped by memory and tools. That's why RAG evaluation takes a system-level view. It asks: did this app respond correctly, given the user input and the retrieved context?

Here's how to break it down:

Step 1: Evaluate retrieval.
- Are the retrieved docs relevant? Ranked correctly?
- Use LLM judges to compute context precision and recall
- If ranking matters, compute NDCG and MRR metrics
- Visualize embeddings (e.g., with UMAP)

Step 2: Evaluate generation.
- Did the LLM ground its answer in the right info?
- Use heuristics, LLM-as-a-judge, and contextual scoring.

In practice, treat your app as a black box and log:
- User query
- Retrieved context
- Model output
- (Optional) Expected output

This lets you debug the whole system, not just the model.

How many samples are enough? 5-10? Too few. 30-50? A good start. 400+? Now you're capturing real patterns and edge cases. Still, start with however many samples you have available, and keep expanding your evaluation split. It's better to have an imperfect evaluation layer than nothing.

Also track latency, cost, throughput, and business metrics (like conversion or retention).

Some battle-tested tools:
- RAGAS (retrieval-grounding alignment)
- ARES (factual grounding)
- Opik by Comet (end-to-end open-source eval + monitoring)
- LangSmith, Langfuse, Phoenix (observability + tracing)

TL;DR: Agentic systems are complex. Success = making evaluation part of your design from day 0.

We unpack this in full in Lesson 5 of the PhiloAgents course.

Check it out here: https://lnkd.in/dA465E_J
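For the retrieval step, here is a minimal sketch of two of the ranking metrics mentioned above (MRR and precision@k), computed over illustrative judge labels; the sample data is made up.

```python
def mrr(results: list[list[bool]]) -> float:
    """Mean reciprocal rank: average of 1/rank of the first relevant hit per query."""
    total = 0.0
    for hits in results:
        for rank, relevant in enumerate(hits, start=1):
            if relevant:
                total += 1.0 / rank
                break
    return total / len(results)

def precision_at_k(results: list[list[bool]], k: int) -> float:
    """Fraction of the top-k retrieved chunks judged relevant, averaged over queries."""
    return sum(sum(hits[:k]) / k for hits in results) / len(results)

# Relevance labels per query (e.g. produced by an LLM judge), in retrieval order.
judged = [
    [False, True, False, False],   # first relevant chunk at rank 2
    [True, False, True, False],    # first relevant chunk at rank 1
    [False, False, False, False],  # retrieval missed entirely
]
print(f"MRR: {mrr(judged):.2f}, precision@3: {precision_at_k(judged, 3):.2f}")
# MRR: 0.50, precision@3: 0.33
```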
95% of agents never leave the notebook. And it's not because the code is bad... it's because the system around them doesn't exist.

Here's my point: anyone can build an agent that works in isolation. The real challenge is shipping one that survives real-world conditions (e.g., live traffic, unpredictable users, scaling demands, and messy data).

That's exactly what we tackled in Lesson 1 of the PhiloAgents course.

We started by asking, "What does an agent need to survive in production?" and decided on four things: it needs an LLM to run in real time, a memory to understand what just happened, a brain that can reason and retrieve factual information, and a monitor to ensure it all works under load. So we designed a system around those needs.

The frontend is where the agent comes to life. We used Phaser to simulate a browser-based world. But more important than the tool is the fact that this layer is completely decoupled from the backend (so game logic and agent logic evolve independently).

The backend, built in FastAPI, is where the agent thinks. We stream responses token by token using WebSockets. All decisions, tool calls, and memory management happen server-side.

Inside that backend sits the agentic core: a dynamic state graph that lets the agent reason step by step. The agent is orchestrated by LangGraph and powered by Groq for real-time inference speeds. It can ask follow-up questions, query external knowledge, or summarize what's already been said (all in a loop).

When the agent needs facts, it queries long-term memory. We built a retrieval system that mixes semantic and keyword search, using cleaned, de-duplicated philosophical texts crawled from the open web. That memory lives in MongoDB and gets queried in real time. Meanwhile, short-term memory tracks the conversation thread across turns. Without it, every new message would be a reset. With it, the agent knows what's been said, what's been missed, and how to respond.

But here's the part most people skip: observability. If you want to improve your system, you need to see and measure what it's doing. Using Opik (by Comet), we track every prompt, log every decision, and evaluate multi-turn outputs using automatically generated test sets.

Put it all together and you get a complete framework that remembers, retrieves, reasons, and responds in a real-world environment.

Oh... and we made the whole thing open source.

Link: https://lnkd.in/d8-QbhCd

P.S. Special shout-out to my co-creator Miguel Otero Pedrido
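For the long-term-memory retrieval piece, here is a sketch of a semantic query against MongoDB Atlas Vector Search via pymongo; the connection string, database, collection, and index names are placeholders, `embed()` is a hypothetical embedding call, and it presumes a vector index already exists on the `embedding` field.

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")  # placeholder URI
collection = client["philoagents"]["philosopher_chunks"]            # assumed db/collection names

def semantic_search(query_vector: list[float], top_k: int = 5):
    """Nearest-neighbor search over pre-embedded text chunks (Atlas $vectorSearch stage)."""
    pipeline = [
        {
            "$vectorSearch": {
                "index": "chunk_vector_index",   # assumed index name
                "path": "embedding",
                "queryVector": query_vector,
                "numCandidates": 100,
                "limit": top_k,
            }
        },
        {"$project": {"text": 1, "score": {"$meta": "vectorSearchScore"}, "_id": 0}},
    ]
    return list(collection.aggregate(pipeline))

# results = semantic_search(embed("What did Aristotle say about friendship?"))  # embed() is hypothetical
# Keyword results (e.g. a text index query) could be merged with these via rank fusion.
```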
- Sabeeka Ashraf (@sabeekaashraf): 20k followers
- Sahil Bloom (@sahilbloom): 1m followers
- Izzy Prior (@izzyprior): 82k followers
- Richard Moore (@richardjamesmoore): 105k followers
- Shlomo Genchin (@shlomogenchin): 49k followers
- Sam G. Winsbury (@sam-g-winsbury): 49k followers
- Matt Gray (@mattgray1): 1m followers
- Daniel Murray (@daniel-murray-marketing): 150k followers
- Ash Rathod (@ashrathod): 73k followers
- Amelia Sordell (@ameliasordell): 228k followers
- Vaibhav Sisinty (@vaibhavsisinty): 451k followers
- Wes Kao (@weskao): 107k followers
- Austin Belcak (@abelcak): 1m followers
- Justin Welsh (@justinwelsh): 1m followers
- Luke Matthews (@lukematthws): 188k followers
- Tibo Louis-Lucas (@thibaultll): 6k followers