Blog

AI Engineering, AI Agent, and AI System Blog

2025

LLMs as Vector Program Databases: A New Mental Model

François Chollet presents a novel way to understand LLM prompting by drawing parallels with Word2Vec and vector programs.
Cloudflare Workers: A Natural Platform for AI Agent Infrastructure

Sunil Pai shares compelling technical and business reasons for building AI agent systems on Cloudflare Workers.
Task Tracking: The Missing Infrastructure for AI Agent Systems

Sunil Pai explores how traditional project management concepts could evolve into essential infrastructure for AI agent orchestration.
Qodo Merge 1.0: The Evolution of AI Code Review

Elana Krasner explores how Qodo Merge addresses key challenges in AI-assisted code reviews through context-aware, adaptive feedback systems.
Multi-Agent Design: Applying Human Organization Principles to AI Systems

The creator of Aster Agents shares insights on designing effective multi-agent systems using proven organizational principles.
BAML: Bringing Software Engineering Rigor to LLM Development

Prashanth Rao, AI Engineer working with graphs at Kùzu, Inc, explores BAML, a new domain-specific language that promises to streamline LLM integration with better developer experience.
The Hidden Cost of AI-Assisted Development: Skill Erosion

Namanyay Goel, a senior developer, shares a candid reflection on how AI tools are inadvertently degrading core programming skills.
AX: The Next Evolution in AI Agent Software Design

Mathias Biilmann, CEO of Netlify, introduces Agent Experience (AX) as a crucial new paradigm in software development.
MCP: A Universal Protocol for AI Tool Integration

Vincent Lambert introduces the Model Context Protocol (MCP), an open standard that simplifies how AI models connect to data sources and tools.
ReAG: Moving Beyond Traditional RAG Through Direct Reasoning

An exploration of Reasoning-Augmented Generation (ReAG), a new approach that replaces complex retrieval pipelines with direct LLM reasoning.
Custom AI Developer Tools: The Return of Malleable Software

Geoffrey Litt demonstrates how AI-generated development tools can enhance programming productivity and enjoyment without writing application code.
Local LLMs as Search Judges: Cost-Effective Relevance Evaluation

Doug Turnbull, a search engineer, demonstrates how to use Qwen 2.5 for high-precision, low-cost search relevance evaluation.
Building ReAct: A Simple Python Implementation

Simon Willison demonstrates how to implement the ReAct (Reason+Act) pattern with minimal code.
When is AI Actually Useful? A Practical Framework

Milan Cvitkovic takes a pragmatic look at evaluating AI system utility through total cost accounting.
Fuzzy APIs: A New Pattern in AI Engineering

Geoffrey Litt explores combining LLMs with specialized services through natural language interfaces.
Inside GitHub's AI Model Evaluation: Lessons from Copilot

GitHub shares its systematic approach to evaluating AI models for its flagship Copilot product.
Common Pitfalls in AI Engineering: Learning from Early Adopters

Chip Huyen shares valuable lessons about what not to do when building generative AI applications.
AgentEval: A Framework for Evaluating LLM Applications

Microsoft Research introduces a systematic approach to assess the utility of LLM-powered applications.
Engineering Complex AI Systems: Lessons from Software Engineering

Grant Slatton explores how traditional software engineering principles can guide the development of complex AI systems.
LLM Evaluation: Moving Beyond Manual Testing

Jeffrey Ip, co-founder of Confident AI, maker of the open-source LLM evaluation framework DeepEval, looks at how to systematically evaluate LLM outputs for production applications.
Memory and State Management in AI Agents: From Simple History to Event-Driven Systems

MotleyCrew.ai explores different patterns for managing memory and state in AI agent systems.
Adding Financial Capabilities to AI Agents: A Pattern Emerges From Stripe

Stripe introduces tools for integrating payments into AI agent workflows, signaling the maturation of AI engineering practices.
AI Engineering in 2025: The Gap Between Demos and Production

Sam Bhagwat, co-creator of Gatsby JS, shares insights on the current state of AI engineering and what it takes to build production-ready AI systems.
HCI Research Methods: The Missing Link in AI Engineering

Dr. Ian Arawjo explains why AI engineers should learn qualitative research methods from Human-Computer Interaction.
AI Tinkerers: The Modern Homebrew Computer Club for AI

A look at the global community bringing together technical AI practitioners in over 100 cities worldwide.
Autonomous AI Systems in Practice: Lessons from Pippin the Digital Unicorn

Analyzing Yohei Nakajima's open-source experiment in building autonomous AI systems through the lens of Pippin, a 24/7 AI influencer.
Evaluator-Optimizer LLM Workflow: A Pattern for Self-Improving AI Systems

Anthropic shares the evaluator-optimizer LLM workflow, a pattern for implementing autonomous improvement loops in LLM applications.
AI Agents, SaaS Margins, and Owning the Full Problem

Nikunj Kothari, Partner at Khosla Ventures, writes about why AI agents should break traditional SaaS models.
Notes on how a world-class software engineer uses LLMs - Part 2

David Crawshaw, Co-Founder & CTO of Tailscale and formerly a Staff Software Engineer at Google, writes about how he programs with LLMs.
Notes on how a world-class software engineer uses LLMs - Part 1

David Crawshaw, Co-Founder & CTO of Tailscale and formerly a Staff Software Engineer at Google, writes about how he programs with LLMs.
LLM Observability and Monitoring

Ari Bajo, a machine learning engineer, writes about LLM observability and monitoring.
Teaching LLMs to Code Review Like Senior Developers: A Context-First Approach

Namanyay Goel, who is building AI tools to enhance human potential, writes about teaching AI to read code like a senior developer, not a fresh bootcamp grad.
LLM RAG Query Expansion For Better Prompting Results

Tuana Çelik, DevRel & AI Engineering at weaviate_io, writes about query expansion, an advanced RAG technique that expands keyword queries to improve recall and provide more context to RAG.
Tree of Thoughts: Using Simple Search Algorithms with your AI Agents

Yao, Yu, Zhao, Shafran, Griffiths, Cao, and Narasimhan, researchers from Princeton University and Google DeepMind, introduce _Tree of Thoughts_ (ToT), which generalizes the popular 'Chain of Thought' (CoT) approach to prompting language models and enables exploration over coherent units of text ('thoughts').
Introducing R to Malawi - Establishing and Growing a Community

David Mwale, the R Users Malawi group organizer, recently spoke with the R Consortium about his efforts to establish and grow the R community in Malawi.
ReAct Prompting: A Strategic Look at Next-Gen LLM Interactions

Matt Payne, founder of Width AI, a machine learning and data science consulting firm, writes about how they prompt for high-quality results from LLMs.

2024