In Praise of “Normal” Engineers

This article was originally commissioned by Luca Rossi (paywalled) for refactoring.fm, on February 11th, 2025. Luca edited a version of it that emphasized the importance of building “10x engineering teams”. It was later picked up by IEEE Spectrum (!!!), which scrapped most of the teams content and published a different, shorter piece on March 13th.

This is my personal edit. It is not exactly identical to either of the versions that have been publicly released to date. It contains a lot of the source material for the talk I gave last week at #LDX3 in London, “In Praise of ‘Normal’ Engineers” (slides), and a couple weeks ago at CraftConf. 

In Praise of “Normal” Engineers

Most of us have encountered a few engineers who seem practically magician-like, a class apart from the rest of us in their ability to reason about complex mental models, leap to non-obvious yet elegant solutions, or emit waves of high-quality code at unreal velocity.

I have run into any number of these incredible beings over the course of my career. I think this is what explains the curious durability of the “10x engineer” meme. It may be based on flimsy, shoddy research, and the claims people have made to defend it have often been risible (e.g. “10x engineers have dark backgrounds, are rarely seen doing UI work, are poor mentors and interviewers”), or blatantly double down on stereotypes (“we look for young dudes in hoodies that remind us of Mark Zuckerberg”). But damn if it doesn’t resonate with experience. It just feels true.

The problem is not the idea that there are engineers who are 10x as productive as other engineers. I don’t have a problem with this statement; in fact, that much seems self-evidently true. The problems I do have are twofold.

Measuring productivity is fraught and imperfect

First: how are you measuring productivity? I have a problem with the implication that there is One True Metric of productivity that you can standardize and sort people by. Consider, for a moment, the sheer combinatorial magnitude of skills and experiences at play:

  • Are you working on microprocessors, IoT, database internals, web services, user experience, mobile apps, consulting, embedded systems, cryptography, animation, training models for gen AI… what?
  • Are you using golang, python, COBOL, lisp, perl, React, or brainfuck? What version, which libraries, which frameworks, what data models? What other software and build dependencies must you have mastered?
  • What adjacent skills, market segments, or product subject matter expertise are you drawing upon…design, security, compliance, data visualization, marketing, finance, etc?
  • What stage of development? What scale of usage? What matters most — giving good advice in a consultative capacity, prototyping rapidly to find product-market fit, or writing code that is maintainable and performant over many years of amortized maintenance? Or are you writing for the Mars Rover, or shrinkwrapped software you can never change?

Also: people and their skills and abilities are not static. At one point, I was a pretty good DBRE (I even co-wrote the book on it). Maybe I was even a 10x DB engineer then, but certainly not now. I haven’t debugged a query plan in years.

“10x engineer” makes it sound like 10x productivity is an immutable characteristic of a person. But someone who is a 10x engineer in a particular skill set is still going to have infinitely more areas where they are normal or average (or less). I know a lot of world class engineers, but I’ve never met anyone who is 10x better than everyone else across the board, in every situation.

Engineers don’t own software, teams own software

Second, and even more importantly: So what? It doesn’t matter. Individual engineers don’t own software, teams own software. The smallest unit of software ownership and delivery is the engineering team. It doesn’t matter how fast an individual engineer can write software, what matters is how fast the team can collectively write, test, review, ship, maintain, refactor, extend, architect, and revise the software that they own.

Everyone uses the same software delivery pipeline. If it takes the slowest engineer at your company five hours to ship a single line of code, it’s going to take the fastest engineer at your company five hours to ship a single line of code. The time spent writing code is typically dwarfed by the time spent on every other part of the software development lifecycle.

If you have services or software components that are owned by a single engineer, that person is a single point of failure.

I’m not saying this should never happen. It’s quite normal at startups to have individuals owning software, because the biggest existential risk that you face is not moving fast enough, not finding product market fit, and going out of business. But as you start to grow up as a company, as users start to demand more from you, and you start planning for the survival of the company to extend years into the future…ownership needs to get handed over to a team. Individual engineers get sick, go on vacation, and leave the company, and the business has got to be resilient to that.

If teams own software, then the key job of any engineering leader is to craft high-performing engineering teams. If you must 10x something, 10x this. Build 10x engineering teams.

The best engineering orgs are the ones where normal engineers can do great work

When people talk about world-class engineering orgs, they often have in mind teams that are top-heavy with staff and principal engineers, or recruiting heavily from the ranks of ex-FAANG employees or top universities.

But I would argue that a truly great engineering org is one where you don’t HAVE to be one of the “best” or most pedigreed engineers in the world to get shit done and have a lot of impact on the business.

I think it’s actually the other way around. A truly great engineering organization is one where perfectly normal, workaday software engineers, with decent software engineering skills and an ordinary amount of expertise, can consistently move fast, ship code, respond to users, understand the systems they’ve built, and move the business forward a little bit more, day by day, week by week.

Any asshole can build an org where the most experienced, brilliant engineers in the world can build product and make progress. That is not hard. And putting all the spotlight on individual ability has a way of letting your leaders off the hook for doing their jobs. It is a HUGE competitive advantage if you can build sociotechnical systems where less experienced engineers can convert their effort and energy into product and business momentum.

A truly great engineering org also happens to be one that mints world-class software engineers. But we’re getting ahead of ourselves, here.

Let’s talk about “normal” for a moment

A lot of us technical people got really attached to our identities as smart kids. The software industry tends to reflect and reinforce this preoccupation at every turn, from Netflix’s “we look for the top 10% of global talent” to Amazon’s talk about “bar-raising” or Coinbase’s recent claim to “hire the top .1%”. (Seriously, guys? Ok, well, Honeycomb is going to hire only the top .00001%!)

In this essay, I would like to challenge us to set that baggage to the side and think about ourselves as normal people.

It can be humbling to think of ourselves as normal people, but most of us are in fact pretty normal people (albeit with many years of highly specialized practice and experience), and there is nothing wrong with that. Even those of us who are certified geniuses on certain criteria are likely quite normal in other ways — kinesthetic, emotional, spatial, musical, linguistic, etc.

Software engineering both selects for and develops certain types of intelligence, particularly around abstract reasoning, but nobody is born a great software engineer. Great engineers are made, not born. I just don’t think there’s a lot more we can get out of thinking of ourselves as a special class of people, compared to the value we can derive from thinking of ourselves collectively as relatively normal people who have practiced a fairly niche craft for a very long time.

Build sociotechnical systems with “normal people” in mind

When it comes to hiring talent and building teams, yes, absolutely, we should focus on identifying the ways people are exceptional and talented and strong. But when it comes to building sociotechnical systems for software delivery, we should focus on all the ways people are normal.

Normal people have cognitive biases — confirmation bias, recency bias, hindsight bias. We work hard, we care, and we do our best; but we also forget things, get impatient, and zone out. Our eyes are inexorably drawn to the color red (unless we are colorblind). We develop habits and ways of doing things, and resist changing them. When we see the same text block repeatedly, we stop reading it.

We are embodied beings who can get overwhelmed and fatigued. If an alert wakes us up at 3 am, we are much more likely to make mistakes while responding to that alert than if we tried to do the same thing at 3pm. Our emotional state can affect the quality of our work. Our relationships impact our ability to get shit done.

When your systems are designed to be used by normal engineers, all that excess brilliance they have can get poured into the product itself, instead of being wasted on navigating the system.

How do you turn normal engineers into 10x engineering teams?

None of this should be terribly surprising; it’s all well known wisdom. In order to build the kind of sociotechnical systems for software delivery that enable normal engineers to move fast, learn continuously, and deliver great results as a team, you should:

Shrink the interval between when you write the code and when the code goes live.

Make it as short as possible; the shorter the better. I’ve written and given talks about this many, many times. The shorter the interval, the lower the cognitive carrying costs. The faster you can iterate, the better. The more of your brain can go into the product instead of the process of building it.

One of the most powerful things you can do is have a short, fast enough deploy cycle that you can ship one commit per deploy. I’ve referred to this as the “software engineering death spiral” … when the deploy cycle takes so long that you end up batching together a bunch of engineers’ diffs in every build. The slower it gets, the more you batch up, and the harder it becomes to figure out what happened or roll back. The longer it takes, the more people you need, the higher the coordination costs, and the more slowly everyone moves.

Deploy time is the feedback loop at the heart of the development process. It is almost impossible to overstate the centrality of keeping this short and tight.

Make it easy and fast to roll back or recover from mistakes.

Developers should be able to deploy their own code, figure out if it’s working as intended or not, and if not, roll forward or back swiftly and easily. No muss, no fuss, no thinking involved.

Make it easy to do the right thing and hard to do the wrong thing.

Wrap designers and design thinking into all the touch points your engineers have with production systems. Use your platform engineering team to think about how to empower people to swiftly make changes and self-serve, but also remember that a lot of times people will be engaging with production late at night or when they’re very stressed, tired, and possibly freaking out. Build guard rails. The fastest way to ship a single line of code should also be the easiest way to ship a single line of code.

Invest in instrumentation and observability.

You’ll never know — not really — what the code you wrote does just by reading it. The only way to be sure is by instrumenting your code and watching real users run it in production. Good, friendly sociotechnical systems invest heavily in tools for sense-making.
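To make the instrumentation idea concrete, here is a minimal sketch of emitting one wide, structured event per unit of work — the pattern this kind of observability is built on. This is an illustration in plain Python, not any particular vendor’s API; all the field names are made up.

```python
# A toy "wide event": accumulate context for a single request in one
# structured record, then emit it when the work completes.
import json
import time


class WideEvent:
    """Collects key/value context for a single unit of work."""

    def __init__(self, service):
        self.fields = {"service": service, "start_ts": time.time()}

    def add(self, **kwargs):
        # Every interesting fact about this request gets attached here,
        # as it becomes known, instead of being scattered across log lines.
        self.fields.update(kwargs)

    def emit(self):
        self.fields["duration_ms"] = round(
            (time.time() - self.fields["start_ts"]) * 1000, 2
        )
        print(json.dumps(self.fields, default=str))


def handle_request(user_id, cart_size):
    ev = WideEvent(service="checkout")
    ev.add(user_id=user_id, cart_size=cart_size)
    try:
        ev.add(status="ok")  # business logic would go here
    except Exception as e:
        ev.add(status="error", error=repr(e))
        raise
    finally:
        ev.emit()  # exactly one wide event per request
```

The payoff is at debugging time: because every field rides on the same event, you can slice by any combination of them after the fact.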

Being able to visualize your work is what makes engineering abstractions accessible to actual engineers. You shouldn’t have to be a world-class engineer just to debug your own damn code.

Devote engineering cycles to internal tooling and enablement.

If fast, safe deploys, with guard rails, instrumentation, and highly parallelized test suites are “everybody’s job”, they will end up being nobody’s job. Engineering productivity isn’t something you can outsource. Managing the interfaces between your software vendors and your own teams is both a science and an art. Making it look easy and intuitive is really hard. It needs an owner.

Build an inclusive culture.

Growth is the norm, growth is the baseline. People do their best work when they feel a sense of belonging. An inclusive culture is one where everyone feels safe to ask questions, explore, and make mistakes; where everyone is held to the same high standard, and given the support and encouragement they need to achieve their goals.

Diverse teams are resilient teams.

Yeah, a team of super-senior engineers who all share a similar background can move incredibly fast, but a monoculture is fragile. Someone gets sick, someone gets pregnant, you start to grow and you need to integrate people from other backgrounds and the whole team can get derailed — fast.

When your teams are used to operating with a mix of genders, racial backgrounds, identities, age ranges, family statuses, geographical locations, skill sets, etc — when this is just table stakes, standard operating procedure — you’re better equipped to roll with it when life happens.

Assemble engineering teams from a range of levels.

The best engineering teams aren’t top-heavy with staff engineers and principal engineers. The best engineering teams are ones where nobody is running on autopilot, banging out a login page for the 300th time; everyone is working on something that challenges them and pushes their boundaries. Everyone is learning, everyone is teaching, everyone is pushing their own boundaries and growing. All the time.

By the way — all of that work you put into making your systems resilient, well-designed, and humane is the same work you would need to do to help onboard new engineers, develop junior talent, or let engineers move between teams.

It gets used and reused. Over and over and over again.

The only meaningful measure of productivity is impact to the business

The only thing that actually matters when it comes to engineering productivity is whether or not you are moving the business materially forward.

Which means…we can’t do this in a vacuum. The most important question is whether or not we are working on the right thing, which is a problem engineering can’t answer without help from product, design, and the rest of the business.

Software engineering isn’t about writing lots of lines of code, it’s about solving business problems using technology.

Senior and intermediate engineers are actually the workhorses of the industry. They move the business forward, step by step, day by day. They get to put their heads down and crank instead of constantly looking around the org and solving coordination problems. If you have to be a staff+ engineer to move the product forward, something is seriously wrong.

Great engineering orgs mint world-class engineers

A great engineering org is one where you don’t HAVE to be one of the best engineers in the world to have a lot of impact. But — rather ironically — great engineering orgs mint world class engineers like nobody’s business.

The best engineering orgs are not the ones with the smartest, most experienced people in the world, they’re the ones where normal software engineers can consistently make progress, deliver value to users, and move the business forward, day after day.

Places where engineers can get shit done and have a lot of impact are a magnet for top performers. Nothing makes engineers happier than building things, solving problems, making progress.

If you’re lucky enough to have world-class engineers in your org, good for you! Your role as a leader is to leverage their brilliance for the good of your customers and your other engineers, without coming to depend on their brilliance. After all, these people don’t belong to you. They may walk out the door at any moment, and that has to be okay.

These people can be phenomenal assets, assuming they can be team players and keep their egos in check. Which is probably why so many tech companies seem to obsess over identifying and hiring them, especially in Silicon Valley.

But companies categorically overindex on finding these people after they’ve already been minted, which ends up reinforcing and replicating all the prejudices and inequities of the world at large. Talent may be evenly distributed across populations, but opportunity is not.

Don’t hire the “best” people. Hire the right people.

We (by which I mean the entire human race) place too much emphasis on individual agency and characteristics, and not enough on the systems that shape us and inform our behaviors.

I feel like a whole slew of issues (candidates self-selecting out of the interview process, diversity of applicants, etc) would be improved simply by shifting the focus on engineering hiring and interviewing away from this inordinate emphasis on hiring the BEST PEOPLE and realigning around the more reasonable and accurate RIGHT PEOPLE.

It’s a competitive advantage to build an environment where people can be hired for their unique strengths, not their lack of weaknesses; where the emphasis is on composing teams rather than hiring the BEST people; where inclusivity is a given both for ethical reasons and because it raises the bar for performance for everyone. Inclusive culture is what actual meritocracy depends on.

This is the kind of place that engineering talent (and good humans) are drawn to like a moth to a flame. It feels good to ship. It feels good to move the business forward. It feels good to sharpen your skills and improve your craft. It’s the kind of place that people go when they want to become world class engineers. And it’s the kind of place where world class engineers want to stick around, to train up the next generation.

<3, charity

 


Another observability 3.0 appears on the horizon

Groan. Well, it’s not like I wasn’t warned. When I first started teasing out the differences between the pillars model and the single unified storage model and applying “2.0” to the latter, Christine was like “so what is going to stop the next vendor from slapping 3.0, 4.0, 5.0 on whatever they’re doing?”

Matt Klein dropped a new blog post last week called “Observability 3.0”, in which he argues that bitdrift’s Capture — a ring buffer storage on mobile devices — deserves that title. This builds on his previous blog posts: “1000x the telemetry at 0.01x the cost”, “Why is observability so expensive?”, and “Reality check: Open Telemetry is not going to solve your observability woes”, wherein he argues that the model of sending your telemetry to a remote aggregator is fundamentally flawed.

I love Matt Klein’s writing — it’s opinionated, passionate, and deeply technical. It’s a joy to read, full of fun, fiery statements about the “logging industrial complex” and backhanded… let’s call them “compliments”… about companies like ours. I’m a fan, truly.

In retrospect, I semi regret the “o11y 2.0” framing

Yeah, it’s cheap and terribly overdone to use semantic versioning as a marketing technique. (It worked for Tim O’Reilly with “Web 2.0”, but Tim O’Reilly is Tim O’Reilly — the exception that proves the rule.) But that’s not actually why I regret it.

I regret it because a bunch of people — vendors mostly, but not entirely — got really bristly about having “1.0” retroactively applied to describe the multiple pillars model. It reads like a subtle diss, or a devaluation of their tools.

One of the principles I live my life by is that you should generally call people, or groups of people, what they want to be called.

That is why, moving forwards, I am going to mostly avoid referring to the multiple pillars model as “o11y 1.0”, and instead I will call it the … multiple pillars model. And I will refer to the unified storage model as the “unified or consolidated storage model, sometimes called ‘o11y 2.0’”.

(For reference, I’ve previously written about why it’s time to version observability, what the key difference is between o11y 1.0 vs 2.0, and had a fun volley back and forth with Hazel Weakly on versioning observabilities: mine, hers.)

Why do we need new language?

It is clearer than ever that a sea change is underway when it comes to how telemetry gets collected and stored. Here is my evidence (if you have evidence to the contrary or would like to challenge me on this, please reach out — first name at honeycomb dot io, email me!!):

  • Every single observability startup that was founded before 2021, that still exists, was built using the multiple pillars model … storing each type of signal in a different location, with limited correlation ability across data sets. (With one exception: Honeycomb.)
  • Every single observability startup that was founded after 2021, that still exists, was built using the unified storage model, capturing wide, structured log events, stored in a columnar database. (With one exception: Chronosphere.)

The major cost drivers in an o11y 1.0 — oop, sorry, in a “multiple pillars” world, are 1) the number of tools you use, 2) cardinality of your data, and 3) dimensionality of your data — or in other words, the amount of context and detail you store about your data, which is the most valuable part of the data! You get locked in a zero sum game between cost and value.

The major cost drivers in a unified storage world, aka “o11y 2.0”, are 1) your traffic, 2) your architecture, and 3) the density of your instrumentation. This is important, because it means your cost growth should roughly align with the growth of your business and the value you get out of your telemetry.

This is a pretty huge shift in the way we think about instrumentation of services and levers of cost control, with a lot of downstream implications. If we just say “everything is observability”, it robs engineers of the language they need to make smart decisions about instrumentation, telemetry and tools choices. Language informs thinking and vice versa, and when our cognitive model changes, we need language to follow suit.

(Technically, we started out by defining observability as differentiated from monitoring, but the market has decided that everything is observability, so … we need to find new language, again. 😉)

Can we just … not send all that data?

My favorite of Matt’s blog posts is “Why is observability so expensive?” wherein he recaps the last 30 years of telemetry, gives some context about his work with Envoy and the separation of control planes / data planes, all leading up to this fiery proposition:

“What if by default we never send any telemetry at all?”

As someone who is always rooting for the contrarian underdog, I salute this. 🫡

As someone who has written and operated a ghastly amount of production services, I am not so sure.

Matt is the cofounder and CTO of Bitdrift, a startup for mobile observability. And in the context of mobile devices and IoT, I think it makes a lot of sense to gather all the data and store it at the origin, and only forward along summary statistics, until or unless that data is requested in fine granularity. Using the ring buffer is a stroke of genius.
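The core of the ring-buffer idea can be sketched in a few lines: recent telemetry lives on the device in fixed memory, old data falls off the back, and nothing leaves until it is explicitly requested. This is an illustration of the general approach, not bitdrift’s actual implementation.

```python
# A toy on-device ring buffer for telemetry capture: record() is cheap
# and local, and data is only shipped anywhere when flush() is called.
from collections import deque


class TelemetryRingBuffer:
    def __init__(self, capacity=10_000):
        # deque with maxlen silently evicts the oldest entry when full,
        # which caps memory use no matter how chatty the app gets
        self.buffer = deque(maxlen=capacity)

    def record(self, event):
        self.buffer.append(event)  # O(1); the oldest event is dropped if full

    def flush(self):
        """Called only when the backend explicitly asks for the detail."""
        events = list(self.buffer)
        self.buffer = deque(maxlen=self.buffer.maxlen)
        return events
```

By default nothing is sent; summary statistics (not shown here) would be forwarded continuously, and `flush()` runs only on demand.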

Mobile devices are strictly isolated from each other, they are not competing with each other for shared resources, and the debugging model is mostly offline and ad hoc. It happens whenever the mobile developer decides to dig in and start exploring.

It’s less clear to me that this model will ever serve us well in the environment of highly concurrent, massively multi-tenant services, where two of the most important questions are always what is happening right now, and what just changed?

Even the 60-second aggregation window for traditional metrics collectors is a painful amount of lag when the site is down. I can’t imagine waiting to pull all the data in from hundreds or thousands of remote devices just to answer a question. And taking service isolation to such an extreme effectively makes traces impossible.

The hunger for more cost control levers is real

I think there’s a kernel of truth there, which is that the desire to keep a ton of rich telemetry detail about a fast-expanding footprint of data in a central location is not ultimately compatible with what people are willing or able to pay.

The fatal flaw of the multiple pillars model is that your levers of control consist of deleting your most valuable data: context and detail. The unified storage (o11y 2.0) model advances the state of the art by giving you tools that let you delete your LEAST valuable data, via tail sampling.
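A tail-sampling decision can be sketched very simply: hold the whole trace until the request completes, then keep it or drop it based on what it turned out to contain. The thresholds and field names below are made up for illustration; real systems have far richer rules.

```python
# A hedged sketch of tail-based sampling: the decision happens *after*
# the request finishes, so errors and slow requests are never lost.
import random


def keep_trace(trace, baseline_rate=0.01):
    """Decide, post-completion, whether a trace is worth storing."""
    if any(span.get("error") for span in trace):
        return True  # always keep errors -- the most valuable data
    total_ms = sum(span["duration_ms"] for span in trace)
    if total_ms > 1_000:
        return True  # always keep slow requests
    # the boring, healthy majority gets sampled at a low baseline rate
    return random.random() < baseline_rate
```

This is the sense in which the unified model lets you delete your least valuable data: the fast, successful, unremarkable traffic is what gets thinned out.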

In a unified storage model, you also only have to store your data once, instead of once per tool. (Gartner data shows that most of their clients are using 10-20 tools, which is a hell of a cost multiplier.)

But I also think Matt’s right to say that these are only incremental improvements. And the cost levers I see emerging in the market that I’m most excited about are model agnostic.

Telemetry pipelines, tiered storage, data governance

The o11y 2.0 model (with no aggregation, no time bucketing, no indexing jobs) allows teams to get their telemetry faster than ever… but it does this by pushing all aggregation decisions from write time to read time. Instead of making a bunch of decisions at the instrumentation level about how to aggregate and organize your data… you store raw, wide structured event data, and perform ad hoc aggregations at query time.
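The write-time vs. read-time distinction is easy to show in miniature. In the sketch below (plain Python, purely illustrative), raw events are stored as-is, and a “p95 latency by endpoint” view is computed ad hoc when someone asks the question — rather than being baked in at instrumentation time.

```python
# Ad hoc aggregation at read time: nothing about "p95 by endpoint" was
# decided when the events were written; the question is posed at query time.
from collections import defaultdict


def p95_by_endpoint(events):
    """Compute an approximate p95 duration per endpoint from raw events."""
    by_endpoint = defaultdict(list)
    for ev in events:
        by_endpoint[ev["endpoint"]].append(ev["duration_ms"])
    out = {}
    for endpoint, durations in by_endpoint.items():
        durations.sort()
        idx = max(0, int(len(durations) * 0.95) - 1)  # naive p95 index
        out[endpoint] = durations[idx]
    return out
```

Tomorrow you could ask a completely different question — p99 by customer, error rate by build id — against the same stored events, with no re-instrumentation. That flexibility is the point, and the cost.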

Many engineers have argued that this is cost-prohibitive and unsustainable in the long run, and…I think they are probably right. Which is why I am so excited about telemetry pipelines.

Telemetry pipelines are the slider between aggregating metrics at write time (fast, cheap, painfully limited) and shipping all your raw, rich telemetry data off to a vendor, for aggregating at read time.
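One way to picture the slider: a pipeline stage that, per event stream, either forwards raw events at full fidelity, reduces them to a counter, or drops them entirely. The routing rules here are hypothetical; real pipelines route on much richer criteria.

```python
# A sketch of a telemetry pipeline stage: the "slider" between shipping
# raw events (expensive, rich) and write-time aggregation (cheap, lossy).
def pipeline_stage(event, rules, counters):
    action = rules.get(event["stream"], "forward")
    if action == "drop":
        return None  # nothing survives
    if action == "aggregate":
        # collapse the event into a counter; detail is gone, cost is tiny
        counters[event["stream"]] = counters.get(event["stream"], 0) + 1
        return None
    return event  # forward the raw event downstream, full fidelity
```

The interesting part operationally is that the rules live in the pipeline, not in the instrumentation — so you can move the slider without redeploying your services.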

Sampling, too, has come a long way from its clumsy, kludgey origins. Tail-based sampling is now the norm, where you make decisions about what to retain or not only after the request has completed. The combination of fine-grained sampling + telemetry pipelines + AI is incredibly promising.

I’m not going to keep going into detail here because I’m currently editing down a massive piece on levers of cost control, and I don’t want to duplicate all that work (or piss off my editors). Suffice it to say, there’s a lot of truth to what Matt writes… and also he has a way of skipping over all the details that would complicate or contradict his core thesis, in a way I don’t love. This has made me vow to be more careful in how I represent other vendors’ offerings and beliefs.

Money is not always the most expensive resource

I don’t think we’re going to get to “1000x the telemetry at 0.01x the cost”, as Matt put it, unless we are willing to sacrifice or seriously compromise some of the other things we hold dear, like the ability to debug complex systems in real time.

Gartner recently put out a webinar on controlling observability costs, which I very much appreciated, because it brought some real data to what has been a terribly vibes-based conversation. They pointed out that one of the biggest drivers of o11y costs has been that people get attached to it, and start using it heavily. You can’t claw it back.

I think this is a good thing — a long overdue grappling with the complexity of our systems and the fact that we need to observe them through our tools, not through our mental map of how we remember them looking or behaving, because they are constantly changing out from under us.

I think observability engineering teams are increasingly looking less like ops teams, and more like data governance teams, the purest embodiment of platform engineering goals.

When it comes to developer tooling, cost matters, but it is rarely the most important thing or the most valuable thing. The most important things are workflows and cognitive carrying costs.

Observability is moving towards a data lake model

Whatever you want to call it, whatever numeric label you want to slap on it, I think the industry is clearly moving in the direction of unified storage — a data lake, if you will, where signals are connected to each other, and particular use cases are mostly derived at read time instead of write time. Where you pay to store each request only one time, and there are no dead ends between signals.

Matt wrote another post about how OpenTelemetry wasn’t going to solve the cost crisis in o11y … but I think that misses the purpose of OTel. The point of OTel is to get rid of vendor lock-in, to make it so that o11y vendors compete for your business based on being awesome, instead of impossible to get rid of.

Getting everyone’s data into a structured, predictable format also opens up lots of possibilities for tooling to feel like “magic”, which is exciting. And opens some entirely different avenues for cost controls!

In my head, the longer term goals for observability involve unifying not just data for engineering, but for product analytics, business forecasting, marketing segmentation… There’s so much waste going on all over the org by storing these in siloed locations. It fragments people’s view of the world and reality. As much as I snarked on it at the time, I think Hazel Weakly’s piece on “The future of observability is observability 3.0” was incredibly on target.

One of my guiding principles is that ✨data is made valuable by context.✨ When you store it densely packed together — systems, app, product, marketing, sales — and derive insights from a single source of truth, how much faster might we move? How much value might we unlock?

I think the next few years are going to be pretty exciting.


Generative AI is not going to build your engineering team for you

Originally posted on the Stack Overflow blog on June 10th, 2024

When I was 19 years old, I dropped out of college and moved to San Francisco. I had a job offer in hand to be a Unix sysadmin for Taos Consulting. However, before my first day of work I was lured away to a startup in the city, where I worked as a software engineer on mail subsystems.

I never questioned whether or not I could find work. Jobs were plentiful, and more importantly, hiring standards were very low. If you knew how to sling HTML or find your way around a command line, chances were you could find someone to pay you.

Was I some kind of genius, born with my hands on a computer keyboard? Assuredly not. I was homeschooled in the backwoods of Idaho. I didn’t touch a computer until I was sixteen and in college. I escaped to university on a classical performance piano scholarship, which I later traded in for a peripatetic series of nontechnical majors: classical Latin and Greek, musical theory, philosophy. Everything I knew about computers I learned on the job, doing sysadmin work for the university and CS departments.

In retrospect, I was so lucky to enter the industry when I did. It makes me blanch to think of what would have happened if I had come along a few years later. Every one of the ladders my friends and I took into the industry has long since vanished.

The software industry is growing up

To some extent, this is just what happens as an industry matures. The early days of any field are something of a Wild West, where the stakes are low, regulation nonexistent, and standards nascent. If you look at the early history of other industries—medicine, cinema, radio—the similarities are striking.

There is a magical moment with any young technology where the boundaries between roles are porous and opportunity can be seized by anyone who is motivated, curious, and willing to work their asses off.

It never lasts. It can’t; it shouldn’t. The amount of prerequisite knowledge and experience you must have before you can enter the industry swells precipitously. The stakes rise, the magnitude of the mission increases, the cost of mistakes soars. We develop certifications, trainings, standards, legal rites. We wrangle over whether or not software engineers are really engineers.

Software is an apprenticeship industry

Nowadays, you wouldn’t want a teenaged dropout like me to roll out of junior year and onto your pager rotation. The prerequisite knowledge you need to enter the industry has grown, the pace is faster, and the stakes are much higher, so you can no longer learn literally everything on the job, as I once did.

However, it’s not like you can learn everything you need to know at college either. A CS degree typically prepares you better for a life of computing research than life as a workaday software engineer. A more practical path into the industry may be a good coding bootcamp, with its emphasis on problem solving and learning a modern toolkit. In either case, you don’t so much learn “how to do the job” as you do “learn enough of the basics to understand and use the tools you need to use to learn the job.”

Software is an apprenticeship industry. You can’t learn to be a software engineer by reading books. You can only learn by doing…and doing, and doing, and doing some more. No matter what your education consists of, most learning happens on the job—period. And it never ends! Learning and teaching are lifelong practices; they have to be, the industry changes so fast.

It takes a solid seven-plus years to forge a competent software engineer. (Or as most job ladders would call it, a “senior software engineer”.) That’s many years of writing, reviewing, and deploying code every day, on a team alongside more experienced engineers. That’s just how long it seems to take.

What does it mean to be a “senior engineer”?

Here is where I often get some very indignant pushback to my timelines, e.g.:

“Seven years?! Pfft, it took me two years!”

“I was promoted to Senior Software Engineer in less than five years!”

Good for you. True, there is nothing magic about seven years. But it takes time and experience to mature into an experienced engineer, the kind who can anchor a team. More than that, it takes practice.

I think we have come to use “Senior Software Engineer” as shorthand for engineers who can ship code and be a net positive in terms of productivity, and I think that’s a huge mistake. It implies that less senior engineers must be a net negative in terms of productivity, which is untrue. And it elides the real nature of the work of software engineering, of which writing code is only a small part.

To me, being a senior engineer is not primarily a function of your ability to write code. It has far more to do with your ability to understand, maintain, explain, and manage a large body of software in production over time, as well as the ability to translate business needs into technical implementation. So much of the work is around crafting and curating these large, complex sociotechnical systems, and code is just one representation of these systems.

What does it mean to be a senior engineer? It means you have learned how to learn, first and foremost, and how to teach; how to hold these models in your head and reason about them, and how to maintain, extend, and operate these systems over time. It means you have good judgment, and instincts you can trust.

Which brings us to the matter of AI.

We need to stop cannibalizing our own future

It is really, really tough to get your first role as an engineer. I didn’t realize how hard it was until I watched my little sister (new grad, terrific grades, some hands-on experience, fiendishly hard worker) struggle for nearly two years to land a real job in her field. That was a few years ago; anecdotally, it seems to have gotten even harder since then.

This past year, I have read a steady drip of articles about entry-level jobs in various industries being replaced by AI. Some of these claims absolutely have merit. Any job that consists of drudgery such as converting a document from one format to another, reading and summarizing a bunch of text, or replacing one set of icons with another, seems pretty obviously vulnerable. This doesn’t feel all that revolutionary to me; it just extends the existing boom in automation to cover textual material as well as mathy stuff.

Recently, however, a number of execs and so-called “thought leaders” in tech seem to have genuinely convinced themselves that generative AI is on the verge of replacing all the work done by junior engineers. I have read so many articles about how junior engineering work is being automated out of existence, or that the need for junior engineers is shriveling up. It has officially driven me bonkers.

All of this bespeaks a deep misunderstanding about what engineers actually do. By not hiring and training up junior engineers, we are cannibalizing our own future. We need to stop doing that.

Writing code is the easy part

People act like writing code is the hard part of software. It is not. It never has been, it never will be. Writing code is the easiest part of software engineering, and it’s getting easier by the day. The hard parts are what you do with that code—operating it, understanding it, extending it, and governing it over its entire lifecycle.

A junior engineer begins by learning how to write and debug lines, functions, and snippets of code. As you practice and progress towards being a senior engineer, you learn to compose systems out of software, and guide systems through waves of change and transformation.

Sociotechnical systems consist of software, tools, and people; understanding them requires familiarity with the interplay between software, users, production, infrastructure, and continuous changes over time. These systems are fantastically complex and subject to chaos, nondeterminism and emergent behaviors. If anyone claims to understand the system they are developing and operating, the system is either exceptionally small or (more likely) they don’t know enough to know what they don’t know. Code is easy, in other words, but systems are hard.

The present wave of generative AI tools has done a lot to help us generate lots of code, very fast. The easy parts are becoming even easier, at a truly remarkable pace. But it has not done a thing to aid in the work of managing, understanding, or operating that code. If anything, it has only made the hard jobs harder.

It’s easy to generate code, and hard to generate good code

If you read a lot of breathless think pieces, you may have a mental image of software engineers merrily crafting prompts for ChatGPT, or using Copilot to generate reams of code, then committing whatever emerges to GitHub and walking away. That does not resemble our reality.

The right way to think about tools like Copilot is more like a really fancy autocomplete or copy-paste function, or maybe like the unholy love child of Stack Overflow search results plus Google’s “I’m Feeling Lucky” button. You roll the dice, every time.

These tools are at their best when there’s already a parallel in the file, and you want to just copy-paste the thing with slight modifications. Or when you’re writing tests and you have a giant block of fairly repetitive YAML, and it repeats the pattern while inserting the right column and field names, like an automatic template.

However, you cannot trust generated code. I can’t emphasize this enough. AI-generated code always looks quite plausible, but even when it kind of “works”, it’s rarely congruent with your wants and needs. It will happily generate code that doesn’t parse or compile. It will make up variables, method names, function calls; it will hallucinate fields that don’t exist. Generated code will not follow your coding practices or conventions. It is not going to refactor or come up with intelligent abstractions for you. The more important, difficult or meaningful a piece of code is, the less likely you are to generate a usable artifact using AI.

You may save time by not having to type the code in from scratch, but you will need to step through the output line by line, revising as you go, before you can commit your code, let alone ship it to production. In many cases this will take as much or more time as it would take to simply write the code—especially these days, now that autocomplete has gotten so clever and sophisticated. It can be a LOT of work to bring AI-generated code into compliance and coherence with the rest of your codebase. It isn’t always worth the effort, quite frankly.

Generating code that can compile, execute, and pass a test suite isn’t especially hard; the hard part is crafting a code base that many people, teams, and successive generations of teams can navigate, mutate, and reason about for years to come.

How working engineers really use generative AI

So that’s the TLDR: you can generate a lot of code, really fast, but you can’t trust what comes out. At all. However, there are some use cases where generative AI consistently shines.

For example, it’s often easier to ask ChatGPT to generate example code using unfamiliar APIs than it is to piece that code together from the API docs—the corpus was trained on repositories where the APIs are being used for real-life workloads, after all.

Generative AI is also pretty good at producing code that is annoying or tedious to write, yet tightly scoped and easy to explain. The more predictable a scenario is, the better these tools are at writing the code for you. If what you need is effectively copy-paste with a template—any time you could generate the code you want using sed/awk or vi macros—generative AI is quite good at this.
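The “copy-paste with a template” pattern described above can be sketched in a few lines. This is a hypothetical illustration, not anything from the article: the field names and template are made up, but the shape of the task—stamping out tightly scoped, repetitive boilerplate from a pattern—is exactly the kind of work that sed/awk one-liners, vi macros, and generative AI all handle well.

```python
# Hypothetical example: generate one repetitive accessor function
# per field name, the way a sed/awk one-liner or vi macro would.
FIELDS = ["user_id", "order_id", "created_at"]

TEMPLATE = '''def get_{name}(record):
    """Return the '{name}' field, or None if missing."""
    return record.get("{name}")
'''

def generate_accessors(fields):
    # Render the template once per field and join the results
    # into a single block of source code.
    return "\n".join(TEMPLATE.format(name=f) for f in fields)

if __name__ == "__main__":
    print(generate_accessors(FIELDS))
```

The point is that the output is fully determined by the template plus a list of names—there is no judgment involved, which is why these tools rarely get it wrong here.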

It’s also very good at writing little functions for you to do things in unfamiliar languages or scenarios. If you have a snippet of Python code and you want the same thing in Java, but you don’t know Java, generative AI has got your back.

Again, remember, the odds are 50/50 that the result is completely made up. You always have to assume the results are incorrect until you can verify them by hand. But these tools can absolutely accelerate your work in countless ways.

Generative AI is a little bit like a junior engineer

One of the engineers I work with, Kent Quirk, describes generative AI as “an excitable junior engineer who types really fast”. I love that quote—it leaves an indelible mental image.

Generative AI is like a junior engineer in that you can’t just roll their code straight into production. You are responsible for it—legally, ethically, and practically. You still have to take the time to understand it, test it, instrument it, retrofit it stylistically and thematically to fit the rest of your code base, and ensure your teammates can understand and maintain it as well.

The analogy is a decent one, actually, but only if your code is disposable and self-contained, i.e. not meant to be integrated into a larger body of work, or to survive and be read or modified by others.

And hey—there are corners of the industry like this, where most of the code is write-only, throwaway code. There are agencies that spin out dozens of disposable apps per year, each written for a particular launch or marketing event and then left to wither on the vine. But that is not most software. Disposable code is rare; code that needs to work over the long term is the norm. Even when we think a piece of code will be disposable, we are often (urf) wrong.

But generative AI is not a member of your team

In that particular sense—generating code that you know is untrustworthy—GenAI is a bit like a junior engineer. But in every other way, the analogy fails. Because adding a person who writes code to your team is nothing like autogenerating code. That code could have come from anywhere—Stack Overflow, Copilot, whatever. You don’t know, and it doesn’t really matter. There’s no feedback loop, no person on the other end trying iteratively to learn and improve, and no impact to your team vibes or culture.

To state the supremely obvious: giving code review feedback to a junior engineer is not like editing generated code. Your effort is worth more when it is invested into someone else’s apprenticeship. It’s an opportunity to pass on the lessons you’ve learned in your own career. Even just the act of framing your feedback to explain and convey your message forces you to think through the problem in a more rigorous way, and has a way of helping you understand the material more deeply.

And adding a junior engineer to your team will immediately change team dynamics. It creates an environment where asking questions is normalized and encouraged, where teaching as well as learning is a constant. We’ll talk more about team dynamics in a moment.

The time you invest into helping a junior engineer level up can pay off remarkably quickly. Time flies. ☺️ When it comes to hiring, we tend to valorize senior engineers almost as much as we underestimate junior engineers. Neither stereotype is helpful.

We underestimate the cost of hiring seniors, and overestimate the cost of hiring juniors

People seem to think that once you hire a senior engineer, you can drop them onto a team and they will be immediately productive, while hiring a junior engineer will be a tax on team performance forever. Neither are true. Honestly, most of the work that most teams have to do is not that difficult, once it’s been broken down into its constituent parts. There’s plenty of room for lower level engineers to execute and flourish.

The grossly simplified perspective of your accountant goes something like this. “Why should we pay $100k for a junior engineer to slow things down, when we could pay $200k for a senior engineer to speed things up?” It makes no sense!

But you know and I know—every engineer who is paying attention should know—that’s not how engineering works. This is an apprenticeship industry, and productivity is defined by the output and carrying capacity of each team, not each person.

There are lots of ways a person can contribute to the overall velocity of a team, just like there are lots of ways a person can sap the energy out of a team or add friction and drag to everyone around them. These do not always correlate with the person’s level (at least not in the direction people tend to assume), and writing code is only one way.

Furthermore, every engineer you hire requires ramp time and investment before they can contribute. Hiring and training new engineers is a costly endeavor, no matter what level they are. It will take any senior engineer time to build up their mental model of the system, familiarize themselves with the tools and technology, and get up to speed. How long? It depends on how clean and organized the codebase is, their past experience with your tools and technologies, how good you are at onboarding new engineers, and more—but likely around 6-9 months. They probably won’t reach cruising altitude for about a year.

Yes, the ramp will be longer for a junior engineer, and yes, it will require more investment from the team. But it’s not indefinite. Your junior engineer should be a net positive within roughly the same time frame, six months to a year, and they develop far more rapidly than more senior contributors. (Don’t forget, their contributions may vastly exceed the code they personally write.)

You do not have to be a senior engineer to add value

In terms of writing and shipping features, some of the most productive engineers I’ve ever known have been intermediate engineers. Not yet bogged down with all the meetings and curating and mentoring and advising and architecture, their calendars not yet pockmarked with interruptions, they can just build stuff. You see them put their headphones on first thing in the morning, write code all day, and cruise out the door in the evening having made incredible progress.

Intermediate engineers sit in this lovely, temporary state where they have gotten good enough at programming to be very productive, but they are still learning how to build and care for systems. All they do is write code, reams and reams of code.

And they’re energized…engaged. They’re having fun! They aren’t bored with writing a web form or a login page for the 1000th time. Everything is new, interesting, and exciting, which typically means they will do a better job, especially under the light direction of someone more experienced. Having intermediate engineers on a team is amazing. The only way you get them is by hiring junior engineers.

Having junior and intermediate engineers on a team is a shockingly good inoculation against overengineering and premature complexity. They don’t yet know enough about a problem to imagine all the infinite edge cases that need to be solved for. They help keep things simple, which is one of the hardest things to do.

The long term arguments for hiring junior engineers

If you ask, nearly everybody will wholeheartedly agree that hiring junior engineers is a good thing…and someone else should do it. This is because the long-term arguments for hiring junior engineers are compelling and fairly well understood.

  1. We need more senior engineers as an industry
  2. Somebody has to train them
  3. Junior engineers are cheaper
  4. They may add some much-needed diversity
  5. They are often very loyal to companies who invest in training them, and will stick around for years instead of job hopping
  6. Did we already mention that somebody needs to do it?

But long-term thinking is not a thing that companies, or capitalism in general, are typically great at. Framed this way, it makes it sound like you hire junior engineers as a selfless act of public service, at great cost to yourself. Companies are much more likely to want to externalize costs like those, which is how we got to where we are now.

The short term arguments for hiring junior engineers

However, there are at least as many arguments to be made for hiring junior engineers in the short term—selfish, hard-nosed, profitable reasons for why it benefits the team and the company to do so. You just have to shift your perspective slightly, from individuals to teams, to bring them into focus.

Let’s start here: hiring engineers is not a process of “picking the best person for the job”. Hiring engineers is about composing teams. The smallest unit of software ownership is not the individual, it’s the team. Only teams can own, build, and maintain a corpus of software. It is inherently a collaborative, cooperative activity.

If hiring engineers were about picking the “best people”, it would make sense to hire the most senior, experienced individual you can get for the money you have, because we are using “senior” and “experienced” as a proxy for “productivity”. (Questionable, but let’s not nitpick.) But the productivity of each individual is not what we should be optimizing for. The productivity of the team is all that matters.

And the best teams are always the ones with a diversity of strengths, perspectives, and levels of expertise. A monoculture can be spectacularly successful in the short term—it may even outperform a diverse team. But they do not scale well, and they do not adapt to unfamiliar challenges gracefully. The longer you wait to diversify, the harder it will be.

We need to hire junior engineers, and not just once, but consistently. We need to keep feeding the funnel from the bottom up. Junior engineers only stay junior for a couple years, and intermediate engineers turn into senior engineers. Super-senior engineers are not actually the best people to mentor junior engineers; the most effective mentor is usually someone just one level ahead, who vividly remembers what it was like in your shoes.

A healthy, high-performing team has a range of levels

A healthy team is an ecosystem. You wouldn’t staff a product engineering team with six DB experts and one mobile developer. Nor should you staff it with six staff+ engineers and one junior developer. A good team is composed of a range of skills and levels.

Have you ever been on a team packed exclusively with staff or principal engineers? It is not fun. That is not a high-functioning team. There is only so much high-level architecture and planning work to go around, there are only so many big decisions that need to be made. These engineers spend most of their time doing work that feels boring and repetitive, so they tend to over-engineer solutions and/or cut corners—sometimes at the same time. They compete for the “fun” stuff and find reasons to pick technical fights with each other. They chronically under-document and under-invest in the work that makes systems simple and tractable.

Teams that only have intermediate engineers (or beginners, or seniors, or whatever) will have different pathologies, but similar problems with contention and blind spots. The work itself has a wide range in complexity and difficulty—from simple, tightly scoped functions to tough, high-stakes architecture decisions. It makes sense for the people doing the work to occupy a similar range.

The best teams are ones where no one is bored, because every single person is working on something that challenges them and pushes their boundaries. The only way you can get this is by having a range of skill levels on the team.

The bottleneck we face is hiring, not training

The bottleneck we face now is not our ability to train up new junior engineers and give them skills. Nor is it about juniors learning to hustle harder; I see a lot of solid, well-meaning advice on this topic, but it’s not going to solve the problem. The bottleneck is giving them their first jobs. The bottleneck consists of companies who see them as a cost to externalize, not an investment in their—the company’s—future.

After their first job, an engineer can usually find work. But getting that first job, from what I can see, is murder. It is all but impossible—if you didn’t graduate from a top college, and you aren’t entering the feeder system of Big Tech, then it’s a roll of the dice, a question of luck or who has the best connections. It was rough before the chimera of “Generative AI can replace junior engineers” rose up from the swamp. And now…oof.

Where would you be, if you hadn’t gotten into tech when you did?

I know where I would be, and it is not here.

The internet loves to make fun of Boomers, the generation that famously coasted to college, home ownership, and retirement, then pulled the ladder up after them while mocking younger people as snowflakes. “Ok, Boomer” may be here to stay, but can we try to keep “Ok, Staff Engineer” from becoming a thing?

Nobody thinks we need fewer senior engineers

Lots of people seem to think we don’t need junior engineers, but nobody is arguing that we need fewer senior engineers, or will need fewer senior engineers in the foreseeable future.

I think it’s safe to assume that anything deterministic and automatable will eventually be automated. Software engineering is no different—we are ground zero! Of course we’re always looking for ways to automate and improve efficiency, as we should be.

But large software systems are unpredictable and nondeterministic, with emergent behaviors. The mere existence of users injects chaos into the system. Components can be automated, but complexity can only be managed.

Even if systems could be fully automated and managed by AI, the fact that we cannot understand how AI makes decisions is a huge, possibly insurmountable problem. Running your business on a system that humans can’t debug or understand seems like a risk so existential that no security, legal or finance team would ever sign off on it. Maybe some version of this future will come to pass, but it’s hard to see it from here. I would not bet my career or my company on it happening.

In the meantime, we still need more senior engineers. The only way to grow them is by fixing the funnel.

Should every company hire junior engineers?

No. You need to be able to set them up for success. Some factors that disqualify you from hiring junior engineers:

  • You have less than two years of runway
  • Your team is constantly in firefighting mode, or you have no slack in your system
  • You have no experienced managers, or you have bad managers, or no managers at all
  • You have no product roadmap
  • Nobody on your team has any interest in being their mentor or point person

The only thing worse than never hiring any junior engineers is hiring them into an awful experience where they can’t learn anything. (I wouldn’t set the bar quite as high as Cindy does in this article; while I understand where she’s coming from, it is so much easier to land your second job than your first job that I think most junior engineers would frankly choose a crappy first job over none at all.)

Being a fully distributed company isn’t a complete dealbreaker, but it does make things even harder. I would counsel junior engineers to seek out office jobs if at all possible. You learn so much faster when you can soak up casual conversations and technical chatter, and you lose that working from home. If you are a remote employer, know that you will need to work harder to compensate for this. I suggest connecting with others who have done this successfully (they exist!) for advice.

I also advise companies not to start by hiring a single junior engineer. If you’re going to hire one, hire two or three. Give them a cohort of peers, so it’s a little less intimidating and isolating.

Nobody is coming to fix our problems for us

I have come to believe that the only way this will ever change is if engineers and engineering managers across our industry take up this fight and make it personal.

Most of the places I know that do have a program for hiring and training entry level engineers, have it only because an engineer decided to fight for it. Engineers—sometimes engineering managers—were the ones who made the case and pushed for resources, then designed the program, interviewed and hired the junior engineers, and set them up with mentors. This is not an exotic project, it is well within the capabilities of most motivated, experienced engineers (and good for your career as well).

Finance isn’t going to lobby for this. Execs aren’t likely to step in. The more a person’s role inclines them to treat engineers like fungible resources, the less likely they are to understand why this matters.

AI is not coming to solve all our problems and write all our code for us—and even if it were, it wouldn’t matter. Writing code is but a sliver of what professional software engineers do, and arguably the easiest part. Only we have the context and the credibility to drive the changes we know form the bedrock of great teams and engineering excellence.

Great teams are how great engineers get made. Nobody knows this better than engineers and EMs. It’s time for us to make the case, and make it happen.