Notes on how a world-class software engineer uses LLMs - Part 1

Article: How I program with LLMs

HN Comments: How I program with LLMs (crawshaw.io)

What the article covers

This document is a summary of my personal experiences using generative models while programming over the past year. It has not been a passive process. I have intentionally sought ways to use LLMs while programming to learn about them. The result has been that I now regularly use LLMs while working and I consider their benefits net-positive on my productivity. (My attempts to go back to programming without them are unpleasant.)

It’s very early but so far the experience has been positive.

So I followed this curiosity, to see if a tool that can generate something mostly not wrong most of the time could be a net benefit in my daily work. The answer appears to be yes, generative models are useful for me when I program.

My Thoughts

Overall takeaway

It’s easy to go on Twitter/X and YouTube and find everyday software engineers building with LLMs.

It’s rarer to find a truly world-class software engineer like David Crawshaw, Co-Founder and CTO of Tailscale and formerly a Staff Software Engineer at Google, writing about how he’s using LLMs and what’s working for him.

The three ways David finds LLMs to be helpful are:

  1. Autocomplete: “This makes me more productive by doing a lot of the more-obvious typing for me”
  2. Search: “If I have a question about a complex environment…I will get a far better answer asking any consumer-based LLM, o1, sonnet 3.5, etc, than I do using an old fashioned web search engine and trying to parse the details out of whatever page I land on.”
  3. Chat-driven programming: “This is where I get the most value of LLMs, but also the one that bothers me the most. It involves learning a lot and adjusting how you program, and on principle I don’t like that.”

What makes this particularly interesting is that these use cases resonate across skill levels.

This reminds me of the quip “Stars! They’re Just Like Us,” in that I, too, use LLMs for Autocomplete, Search, and Chat-driven programming.

The paragraph that resonated with me the most is:

Let me try to motivate this for the skeptical. A lot of the value I personally get out of chat-driven programming is I reach a point in the day when I know what needs to be written, I can describe it, but I don’t have the energy to create a new file, start typing, then start looking up the libraries I need. (I’m an early-morning person, so this is usually any time after 11am for me, though it can also be any time I context-switch into a different language/framework/etc.) LLMs perform that service for me in programming. They give me a first draft, with some good ideas, with several of the dependencies I need, and often some mistakes. Often, I find fixing those mistakes is a lot easier than starting from scratch.

The key point here is that even world-class programmers eventually run out of energy to continue doing the work that they love.

The LLM lowers the barrier enough that it lets them continue programming (maybe not at full bore, but at least at some bore).

Until AGI / ASI takes over, the key is to find ways to use and build tools that make it easier for humans to accomplish their goals.

This blog post is part one, so I’ll go deeper into the different parts of the article tomorrow.

For now, here are some practical implementations for AI Engineering and AI Agents from the overall takeaways and the paragraph quoted above.

Practical Implementation

LLMs and AI Agents should lower the activation energy required to achieve a goal.

Goal achievement progression generally follows:

  1. Come up with a goal
  2. Look at ways to solve it
  3. Stress-test potential solutions
  4. Figure out how to achieve it within constraints
  5. Start the process
  6. React/Correct as you get closer to the end
  7. Achieve the goal

At each step, the LLM system should lower the activation energy, or remove the step for the user entirely.

The more sophisticated the goal, the longer each step takes, so breaking a step down, or partially solving it, keeps the human moving forward.
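To make that concrete, here’s a minimal Python sketch (my own illustration, not code from David’s article) of the progression above: each step can carry an optional LLM hook that produces a first draft for the human to review, and the lambda “assistants” are hypothetical stand-ins for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Step:
    name: str
    # Optional hook: the LLM drafts a starting point that the human
    # reviews, lowering activation energy without removing judgment.
    llm_assist: Optional[Callable[[str], str]] = None

def run_pipeline(goal: str, steps: list[Step]) -> None:
    for step in steps:
        if step.llm_assist is not None:
            draft = step.llm_assist(goal)  # cheap first draft to react to
            print(f"{step.name}: review LLM draft -> {draft}")
        else:
            print(f"{step.name}: human does this unaided")

steps = [
    Step("Come up with a goal"),  # human-only: intent stays with the person
    Step("Look at ways to solve it",
         llm_assist=lambda g: f"three candidate approaches to '{g}'"),
    Step("Stress-test potential solutions",
         llm_assist=lambda g: f"likely failure modes of '{g}'"),
    Step("Figure out how to achieve it within constraints",
         llm_assist=lambda g: f"a plan for '{g}' under current constraints"),
    Step("Start the process",
         llm_assist=lambda g: f"skeleton code for '{g}'"),
    Step("React/correct as you get closer to the end"),  # human-led review
    Step("Achieve the goal"),
]

run_pipeline("add rate limiting to the API", steps)
```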

Scale & Performance Considerations

When thinking about scaling LLM assistance for programming or other problem solving, a few factors are worth considering:

  1. Energy Levels & Cognitive Load
  • Different times of day/days of the week affect programming capacity
  • LLMs can help maintain productivity during low-energy periods
  • Balancing tool assistance with developer understanding
  2. Context Management Load
  • IDE integration can bring too much complexity
  • A clean slate approach (browser-based) often works better
  • Manage the scope of context the LLM receives
  3. Verification Overhead
  • Always need to compile and test generated code (see the sketch after this list)
  • More structure means more verification points
  • Trade-off between granularity and verification time
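Here’s a sketch of what that verification loop can look like (my own illustration; `llm_generate` is a placeholder for a real model call, and I’m assuming a Go workspace where `go build` and `go test` are the checkpoints):

```python
import pathlib
import subprocess

def llm_generate(prompt: str) -> str:
    """Placeholder for a real model call; wire up your provider here."""
    raise NotImplementedError

def verify(workdir: str) -> str | None:
    """Compile and test; return error output, or None if everything passes."""
    for cmd in (["go", "build", "./..."], ["go", "test", "./..."]):
        result = subprocess.run(cmd, cwd=workdir, capture_output=True, text=True)
        if result.returncode != 0:
            return result.stdout + result.stderr
    return None

def chat_program(prompt: str, workdir: str, max_rounds: int = 3) -> bool:
    """Generate a draft, verify it, and feed failures back into the prompt."""
    for _ in range(max_rounds):
        code = llm_generate(prompt)
        (pathlib.Path(workdir) / "generated.go").write_text(code)
        errors = verify(workdir)
        if errors is None:
            return True   # compiles and tests pass
        prompt += f"\n\nThe previous attempt failed:\n{errors}\nPlease fix it."
    return False          # give up and hand the draft back to the human
```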

Having an LLM lower the barrier to problem solving is great, but given the current technology, it doesn’t offload the work; it just makes it less arduous.

Tools, no matter how advanced, require thoughtful integration.

The goal is to manage trade-offs, not eliminate them.

Trade-offs & Limitations

David’s article looks at some key trade-offs:

  1. Flexibility vs. Reliability
  • Chat-driven programming requires learning new workflows
  • Non-deterministic service with changing behaviors
  • Benefits may not justify adjustment costs for all developers
  2. Scope vs. Quality
  • LLMs do better with exam-style, contained questions
  • Large, complex workspaces can confuse LLMs
  • Need to carefully scope requests for best results (see the sketch after this list)
  3. Structure vs. Speed
  • More package boundaries mean more precise context for LLMs
  • Additional structure adds overhead
  • Need to balance organization with development velocity
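One way to carefully scope a request, sketched under my own assumptions (a repo organized into small Go packages; `package_context` is a hypothetical helper, not anything David names): build the prompt from a single package’s files instead of the whole workspace, keeping the question exam-style and contained.

```python
import pathlib

def package_context(pkg_dir: str, max_chars: int = 20_000) -> str:
    """Concatenate one package's source files into a bounded prompt context."""
    parts = [
        f"// file: {path.name}\n{path.read_text()}"
        for path in sorted(pathlib.Path(pkg_dir).glob("*.go"))
    ]
    return "\n\n".join(parts)[:max_chars]  # small packages fit comfortably

# An exam-style question against one package, not the whole repo.
prompt = (
    "Given only this package, add a function that deduplicates entries, "
    "plus table-driven tests:\n\n" + package_context("./internal/dedupe")
)
```

More package boundaries make this kind of scoping cheap, which is the Structure vs. Speed trade-off expressed in code.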

David concludes that while it’s a net positive, there’s still much work to be done.

In fact, he plugs a project he’s working on to solve some of these issues.

Strategic Implications

For technical leaders and programmers, some strategic considerations to think through:

  1. Tool Investment
  • Need specialized environments for LLM interaction
  • Different from traditional IDE integration
  • Teams need (re)training to get the most out of the tools
  2. Team Workflow
  • Changes how code structure decisions are made
  • Affects package organization strategies
  • Influences test writing approaches
  3. Code Organization
  • Trend toward more specialized, smaller packages
  • Less emphasis on the DRY principle
  • More focus on readability and maintainability

As LLMs speed up development, teams will have to decide what that speed buys:

  • more robust code written in the same time it takes today, or
  • more code written in the same time, at the current (non-LLM-aided) standard of robustness

Team Adoption Patterns

Based on David’s experiences, teams adopting LLM coding as the standard should consider the following:

  1. Progressive Integration
  • Start with autocomplete, as it’s the lowest risk and has immediate benefits
  • Move to search-based usage
  • Finally, experiment with chat-driven programming
  2. Environment Setup
  • Create clean, isolated environments for LLM interaction
  • Establish clear guidelines for when to use LLMs
  • Build feedback loops for what works/doesn’t work (see the sketch after this list)
  3. Cultural Shift
  • Accept that LLMs produce imperfect first drafts
  • Embrace verification and more rigorous testing as part of the workflow
  • Focus on review and improvement rather than perfect generation
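For the feedback loops in point 2, a team can start with nothing fancier than an append-only log of which suggestions were accepted or rejected. A minimal sketch (my own; the `llm_feedback.jsonl` file name is made up):

```python
import json
import pathlib
import time

LOG = pathlib.Path("llm_feedback.jsonl")

def record(kind: str, prompt: str, accepted: bool, note: str = "") -> None:
    """Append one accept/reject event so the team can review what works."""
    event = {
        "ts": time.time(),
        "kind": kind,            # "autocomplete" | "search" | "chat"
        "prompt": prompt[:200],  # enough to recognize the task later
        "accepted": accepted,
        "note": note,
    }
    with LOG.open("a") as f:
        f.write(json.dumps(event) + "\n")

# e.g. after reviewing a chat-driven draft:
record("chat", "write a rate limiter middleware", accepted=True,
       note="needed two fixes to error handling")
```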

This shift in mindset, more than any single tool choice, is what will drive results.

David shares in the article that based on his (non-rigorous) data, he has seen:

it appears from my records that for every two hours of programming I do now, I accept more than 10 autocomplete suggestions, use LLM for a search-like task once, and program in a chat session once.

Key Takeaways for AI Agent Development

David’s experiences with LLMs in programming suggest some fundamental principles for AI Agent development:

  1. Energy Management
  • AI Agents should reduce cognitive load
  • Focus on making tasks more manageable, not just automated
  • Help users maintain productivity through energy peaks and valleys
  2. Context Awareness
  • Keep interactions focused and well-scoped
  • Provide clean environments for specific tasks
  • Avoid overwhelming users with complexity
  3. Verification First
  • Build verification into the workflow
  • Make errors easy to spot and fix
  • Design for iteration rather than perfection

The goal isn’t to replace human judgment but to lower barriers and maintain productivity throughout the day.

As AI Agent tools evolve, success will come from finding the right balance between human insight and AI assistance.