MPX Group
Operator Brief · No. 02

How Your Business Runs in Software 3.0

By Michael Pietrzak · Reading 13 min · Published May 2026

You started using Claude this year. Your team did too. There is a whole vocabulary getting thrown around and nobody is stopping to define it. This brief is the cheat sheet — and the framework to put you a year ahead.

You signed up for Claude or ChatGPT this year. Your team probably did too. You have already noticed there is a whole vocabulary getting thrown around as if everyone knows what it means. Context window. Vibe coding. Agentic engineering. Spec. Prompt. Most operators are nodding along and hoping nobody asks them to define any of it.

That stops here. This brief is the cheat sheet. I am going to define every term that matters, in plain English, with examples from running an actual business. By the end, you will know more about how to use this stuff than ninety percent of the business owners in your market. That is not a flex. It is just that almost nobody is taking the time to actually explain it.

The frame for this brief comes from a man named Andrej Karpathy. He helped found OpenAI. He ran the self-driving program at Tesla. He is one of three or four people on the planet whose opinion on where this technology is going actually matters. Last month he gave a talk naming what just happened to software and where it is headed next. Most of what I am about to walk you through started with what he said in that thirty minutes. The rest is what I have learned running it in my own companies.

Read this once. The next time a vendor, a consultant, or somebody on your team uses one of these words, you will know exactly what they mean — and whether they are doing it right.

I. Three Generations

The three generations of software

Software has gone through three generations. Most operators are still operating like the first one. Some are using the second without realizing it. The third is what just changed everything.

Software 1.0: You write code.
Somebody types instructions in a programming language and the machine runs them exactly as written. The spreadsheet macro in your accounting workbook. The script that emails your weekly report.

Software 2.0: You teach a network.
Instead of writing rules, engineers feed millions of examples to a machine and let it learn the rules. The spam filter on your inbox. Face unlock on your phone. Lane-keeping in your truck.

Software 3.0: You write English.
The machine is already trained. You program it by typing what you want in plain English. "Draft a client follow-up about the design changes from these meeting notes."

Three generations. Same machine. Different programmer.

That last one is the one that matters. When you type "draft a client follow-up email about the design changes" into Claude, you are writing a Software 3.0 program. The English is the program. The model is the computer. The output is the result. You did not need a developer. You did not need a database. You wrote a sentence and got back work.

In Software 1.0, the programmer wrote the program. In Software 2.0, the math nerd wrote the program. In Software 3.0, you write the program. The person running the company. You no longer need a developer in the room to make software do what you want. You need somebody who can write down clearly what they want done — and that somebody is you.

The bottleneck stopped being engineering talent. It became the ability to describe a business clearly. That is the part you have been doing your whole career.

II. The Context Window

The context window — Claude's desk

The first new word: context window. It sounds technical. It is not.

Picture Claude as a brilliant consultant who started ten seconds ago and is going to be let go in twenty minutes. He has no memory of yesterday. No relationship with your team. No idea what your business does. He has one thing — a desk. Whatever you put on that desk, he can see and use. Whatever is not on the desk, he cannot see at all.

The desk is the context window. It is the block of text Claude can hold in his head during one conversation. Whatever you type, paste, upload, or attach is on the desk. Whatever you did not put on the desk — your client history, your brand guide, your operating manual, the email from a vendor last month — does not exist for him.

This is why most owners using Claude get mediocre results. They walk in, give him no desk and no context, ask a vague question, and judge the technology by the answer they got back. Of course it was mediocre. The consultant did not know anything about the business.

Your operating docs stopped being documentation. They became the materials you put on the desk every morning.

The operators who win the next decade are the ones who treat every important written-down thing in their company — the brand voice, the customer-fit criteria, the bid template, the decision rules — as material that goes onto the desk before any work begins. Without that, the agent is guessing. With it, the agent is running the company the way you would.
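A rough sketch of what "putting materials on the desk" can look like in practice. This is an illustration under assumptions, not how Claude or any vendor works internally: the helper name `build_desk`, the sample documents, and the character budget are all made up for the example, and real context limits are counted in tokens rather than characters.

```python
def build_desk(documents, request, max_chars=20_000):
    """Assemble operating documents plus the request into one prompt string.

    documents: list of (title, text) pairs, most important first.
    max_chars: illustrative size budget standing in for the context window.
    Returns the prompt and the titles that did not fit.
    """
    parts = []
    skipped = []
    used = 0
    for title, text in documents:
        block = f"## {title}\n{text.strip()}\n"
        if used + len(block) > max_chars:
            skipped.append(title)  # not on the desk, so the model never sees it
            continue
        parts.append(block)
        used += len(block)
    parts.append(f"## Request\n{request.strip()}\n")
    return "\n".join(parts), skipped


# Hypothetical sample documents
docs = [
    ("Brand voice", "Plain, direct, no jargon."),
    ("Customer-fit criteria", "Owner-operated firms with recurring revenue."),
]
prompt, skipped = build_desk(docs, "Draft a client follow-up about the design changes.")
print(prompt)
```

The useful part is the `skipped` list. Anything that lands there does not exist for the model during that conversation, which is the desk idea in one line.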

III. Verifiable or Not

The one sort that decides everything

The next word you will hear: verifiable. It means one thing — can a machine check whether the answer is right?

A math problem is verifiable. Two plus two equals four; the machine knows. Code that has to pass a test is verifiable — it passes or it doesn't. Reconciling a bank statement is verifiable — the totals balance or they don't.
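To make "the totals balance or they don't" concrete, here is a minimal sketch of a machine check. It assumes amounts arrive as text such as "1200.00"; the function name `reconciles` and the sample figures are hypothetical, and nothing here talks to a real bank or accounting system.

```python
from decimal import Decimal


def reconciles(ledger_amounts, statement_total):
    """Return (ok, difference). ok is True only when the totals match exactly."""
    # Decimal avoids the rounding surprises of floating-point money math.
    ledger_total = sum((Decimal(a) for a in ledger_amounts), Decimal("0"))
    difference = ledger_total - Decimal(statement_total)
    return difference == 0, difference


# Hypothetical sample data
ok, diff = reconciles(["1200.00", "-350.25", "89.99"], "939.74")
print(ok, diff)
```

There is no judgment call in that function. It returns a yes or a no plus the gap, which is what makes the work verifiable.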

A well-written email is not verifiable. There is no machine that can grade tone. Whether a client is a fit for your business is not verifiable. Whether to walk away from a deal is not verifiable. These are matters of taste, judgment, and context.

This matters because of how Claude learns. The model gets dramatically better at verifiable work and barely better at non-verifiable work. Nobody can tell the model "your taste improved this quarter." So the model improves like a savant: incredible at things that can be checked, surprisingly shaky at things that cannot.

It is why the same model that can rewrite a 100,000-line code base will tell you to walk to the car wash a block away when you obviously should drive. The first task has a scoring function. The second does not.

For an operator the implication is simple. Every recurring workflow in your company is either checkable by a machine, or it is not.

Verifiable

Push hardest here.
  • Reconciling QuickBooks — totals balance or they don't
  • Assembling documents against a checklist
  • Translating a contractor bid into a lender's template
  • Schedule conflicts and double-bookings
  • Verifying invoice line items against a contract
  • Compliance checks against a fixed list

Not verifiable

Stay in the loop yourself.
  • Brand voice on outbound communications
  • How to negotiate a particular deal
  • Whether a new client is a fit
  • When to walk away from a project
  • Which vendor to trust
  • When to hire and who

The verifiable list is where you push hardest. The non-verifiable list is where you stay in the loop yourself. Confusing the two is the most common mistake an operator will make this year. Companies that automate the non-verifiable work end up shipping confident, wrong, expensive output. Companies that refuse to automate the verifiable work pay a person to do something a script does better and faster.

Sort your workflows by this one filter. The answer to "where do I start with AI" falls right out of the sort.

IV. Apps That Shouldn't Exist

The reports that should not exist anymore

Here is a story worth knowing. There is a small app called MenuGen. You take a photo of a restaurant menu, the app sends back pictures of every dish — useful for travelers reading menus in a language they cannot read. To build it the old way you needed three pieces of software stitched together. First, a piece that reads text out of images (this is called OCR — optical character recognition). Second, an image generator that draws each dish. Third, code that lays the pictures back onto the menu image.

That was the old way. Then the new way showed up. Take the same photo. Hand it to Claude or Gemini. Type "make me a version of this menu with a picture of every dish next to each item." Get the finished image back. No three pieces of software. No OCR. No image-stitching code. One sentence. One model. The finished output.

The lesson is bigger than menus. There is a whole category of apps, dashboards, and reports your business is paying for, or paying people to produce, that are about to disappear — because the model can just produce the output in one shot from the same inputs.

Half the dashboards and reports your company maintains are about to look like that menu app. Nobody re-reads them, and the model can render them from scratch every time you ask.

The test is short. For every dashboard you pay for and every recurring report your team produces, ask one question: could the model render this fresh on demand from the same inputs, with one prompt? If yes, the artifact does not need to exist. The team is maintaining something that one sentence into Claude can produce from scratch in three seconds, every time you want it.

This does not mean kill your mission control. Anything that captures input or holds state — your CRM, your scheduler, your project management software — stays. The test applies to the read-once, read-only stuff. The weekly summary nobody opens twice. The dashboard that gets a Monday glance. The report that piles up in an inbox. Most companies are carrying ten of them.

V. Vibe vs Agentic

The two ways to use AI — and only one compounds

This is the most important section in the brief. There are two ways to use AI in a business. They look identical from the outside. They are completely different jobs underneath.

Way one: vibe coding. Karpathy coined the term in early 2025. It means you walk up to Claude, you describe what you want, the model gives you something back, and you use what it gave you without checking it carefully. The cold email goes out. The query runs. The contract draft gets sent to the lawyer. The dashboard ships. You "vibed" with it: it felt right, so you sent it.

Vibe coding raises the floor. Anyone can do it now. The salesperson with no engineering background can write a working little app over the weekend. The bookkeeper can have Claude generate a custom report. That is a real change and it is mostly good. The catch is that quality is whatever the model happened to land on that day. Nobody read it. Mistakes sneak in. Tone drifts. Wrong dates, wrong dollar amounts, wrong names propagate downstream.

Way two: agentic engineering. Different game entirely. Same tools — same Claude, same ChatGPT. Different discipline. Here, the work is set up the way a small engineering team would set it up, except the engineers are agents. There is a written-down standard for what the work should look like. There is a way to check the output before it ships. There is a place to log mistakes so they do not repeat. There is one source of truth the agent reads every time it runs. The team is two people and a stack of agents, and they are producing what used to take twenty.

Vibe coding

Raises the floor.
  • Anyone can do it, no training required
  • Type a request, paste the answer back, send it
  • Nobody reads what came out carefully
  • Quality is wherever the model landed that day
  • Mistakes, drift, and surprises slip through

Agentic engineering

Raises the ceiling.
  • Written-down standards the agent reads every time
  • A check before any work ships
  • One source of truth all the agents read from
  • A way for mistakes to feed back into the standard
  • Same quality bar as before, ten to twenty times the output

Vibe coding is doing the same work a little faster. Agentic engineering is running a different company.

Almost every operator is doing vibe coding without realizing it. Somebody on the team uses ChatGPT to draft a contract and sends it without re-reading. Somebody has Claude write a database query and runs it on the live database. Somebody generates outbound emails in a voice the brand would never have approved. None of that is wrong on its face. None of it compounds either. There is no system underneath, just a person and a model.

The operators who figured this out early are on the other side. They have written-down standards. They have a check before output ships. They have one place every agent reads from. They are running entire departments with two people instead of twenty. The gap widens every week.
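Here is a small sketch of "a check before output ships" and "a place to log mistakes." It is an assumption-heavy illustration: the `gate_email` function, the fields in `standard`, and the sample draft are all hypothetical. Note that it only checks the machine-checkable parts (required amounts and dates, banned phrases, length). Tone still needs a person.

```python
def gate_email(draft, standard):
    """Return a list of problems. An empty list means the draft may ship."""
    problems = []
    for phrase in standard["banned_phrases"]:
        if phrase.lower() in draft.lower():
            problems.append(f"banned phrase: {phrase}")
    for fact in standard["required_facts"]:
        if fact not in draft:
            problems.append(f"missing required fact: {fact}")
    if len(draft.split()) > standard["max_words"]:
        problems.append("too long")
    return problems


# Hypothetical written-down standard for one kind of outbound email
standard = {
    "banned_phrases": ["synergy", "circle back"],
    "required_facts": ["$4,200", "March 14"],
    "max_words": 120,
}
mistake_log = []  # failures land here so the standard can be tightened later

draft = "Hi Dana, the revised total is $4,200. Let's circle back next week."
problems = gate_email(draft, standard)
if problems:
    mistake_log.append({"draft": draft, "problems": problems})
print(problems)
```

The draft above fails twice: it uses a banned phrase and leaves out the date. In a vibe-coding setup it would already be in the client's inbox.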

You have a choice in front of you. Stay where most companies are, with the team using AI loosely, or cross over to the other side. The rest of this brief is what crossing over looks like.

VI. The Spec

A spec is just a written-down standard

The last new word: spec. Short for "specification." Sounds technical. It is not.

A spec is a written description of what you want, clear enough that somebody who has never met you can do the job and get it right the first time. You already write specs; you just call them by different names. The job sheet you hand a subcontractor is a spec. The brief you give a graphic designer is a spec. The instructions you give a new hire on day one are a spec. The bid sheet, the punch list, the change order: all specs.

In an agentic-engineering setup, the spec becomes the thing the agent reads to do the work. The human owns the spec. The agent owns the execution against it. The split is clean. You decide what good looks like and where the line is. The agent fills in the blanks underneath.

A spec is a document an agent can execute against. Most operating docs are not specs. They are notes.

Here is where most companies fall short. They have operating documents — but those documents are not specs. They are SOPs in a Drive folder nobody opens. Slide decks from last year's offsite. Tribal knowledge in the founder's head. None of that is something an agent can read and run. A spec is different. It names what you own, what the agent owns, what the rules are, what good looks like, and what failure looks like. It is written so an agent can read it cold and execute against it without supervision.
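One way to picture the difference between notes and a spec is as a form with five required sections. The sketch below is only an illustration: the `Spec` class, its field names, and the sample content are assumptions drawn from the five things named above, not a standard format.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    name: str
    human_owns: list = field(default_factory=list)
    agent_owns: list = field(default_factory=list)
    rules: list = field(default_factory=list)
    good_looks_like: list = field(default_factory=list)
    failure_looks_like: list = field(default_factory=list)

    def missing_sections(self):
        """Names of sections still empty. Anything listed means 'notes, not a spec'."""
        sections = [
            "human_owns", "agent_owns", "rules",
            "good_looks_like", "failure_looks_like",
        ]
        return [s for s in sections if not getattr(self, s)]


# Hypothetical example with one section left blank on purpose
bid_spec = Spec(
    name="Contractor bid to lender template",
    human_owns=["Approving the final package"],
    agent_owns=["Mapping bid line items into the template"],
    rules=["Never change a dollar amount"],
    good_looks_like=["Template total equals bid total"],
)
print(bid_spec.missing_sections())
```

Most operating docs would fail this check on the last two sections. They describe the task but never say what good or bad output looks like.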

If you want to know whether you are ready for agentic engineering, look at your operating documents and ask a single question: could a smart consultant who started today run my company off these without asking me anything? If the answer is no, you do not have specs. You have notes. The first job is to turn the notes into specs.

VII. In Production

What this actually looks like in a real business

None of this is theory. Three examples from the companies I am running right now, each one a small piece of agentic engineering in production. Read them as the concepts you just learned, but doing real work.

1

The bookkeeper that gets sharper every week.

Every Sunday morning, a small program reads the bank feed for the interior-design firm I am acquiring. It posts the entries it is sure about straight into QuickBooks — using last year's chart of accounts as the written-down standard. The ambiguous entries (a new vendor it has not seen before, a deposit with no memo, a refund that does not match an invoice) get bundled and sent to me as a single text message with six or eight questions. I answer the questions from wherever I am. The next Sunday's run reads my answers, posts the entries, and writes the new rules into the standard so it does not have to ask me again next time. Bookkeeping went from a four-hour Sunday to twelve minutes — and the standard sharpens every week.
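The loop in that example can be sketched in a few lines. This is a simplified illustration, not the program itself: it treats "sure" as "the vendor already has a rule," ignores the memo and refund cases, and never touches QuickBooks. The function names, vendor names, and account names are all hypothetical.

```python
def triage(entries, rules):
    """Split bank entries into ones a rule covers and ones that need a human answer."""
    to_post, questions = [], []
    for entry in entries:
        account = rules.get(entry["vendor"])
        if account is not None:
            to_post.append({**entry, "account": account})
        else:
            questions.append(entry)  # bundled into the weekly text message
    return to_post, questions


def learn(rules, answers):
    """Write the owner's answers back into the standard so nothing is asked twice."""
    updated = dict(rules)
    updated.update(answers)
    return updated


# Week one: one known vendor, one new vendor (made-up sample data)
rules = {"Paint Supplier": "Materials", "Payroll Provider": "Payroll"}
entries = [
    {"vendor": "Payroll Provider", "amount": "-5200.00"},
    {"vendor": "New Tile Vendor", "amount": "-840.00"},
]
to_post, questions = triage(entries, rules)

# The owner answers by text, and the next run starts with a sharper standard
rules = learn(rules, {"New Tile Vendor": "Materials"})
to_post_next, questions_next = triage(questions, rules)
print(len(to_post), len(questions), len(questions_next))
```

The second call to `triage` asks nothing, because the answer from week one became a rule. That is the "sharpens every week" part.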

2

The lender package that assembles itself.

A real estate deal at our holding company generates thirty-plus documents that a lender, title company, escrow agent, and insurer all want — each one in a different bundle, in a different order. We keep a small tracker inside the deal folder listing what each party still needs as rows. A program reads the open rows, pulls the right files from the deal's documents, stages them in a clean folder, and drafts the cover email in the right voice for the right counterparty. Eight minutes instead of four hours. Every new request becomes a new row, and every future deal inherits the work.
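A minimal sketch of the tracker idea, assuming each row names a party, a document, and a status. The column names, file paths, and the `stage_open_requests` helper are hypothetical, and the sketch only groups file paths. It does not copy files or draft the cover email.

```python
def stage_open_requests(tracker_rows, deal_files):
    """Group the files each party still needs and report requests with no matching file."""
    staged, missing = {}, []
    for row in tracker_rows:
        if row["status"] != "open":
            continue  # already sent, nothing to do
        path = deal_files.get(row["document"])
        if path is None:
            missing.append((row["party"], row["document"]))
        else:
            staged.setdefault(row["party"], []).append(path)
    return staged, missing


# Made-up tracker rows and deal documents
tracker = [
    {"party": "lender", "document": "appraisal", "status": "open"},
    {"party": "lender", "document": "rent roll", "status": "sent"},
    {"party": "title", "document": "survey", "status": "open"},
    {"party": "insurer", "document": "roof report", "status": "open"},
]
files = {
    "appraisal": "docs/appraisal.pdf",
    "survey": "docs/survey.pdf",
    "rent roll": "docs/rent_roll.pdf",
}
staged, missing = stage_open_requests(tracker, files)
print(staged)
print(missing)
```

The `missing` list matters as much as the staged files. It is the machine telling you what it could not find instead of guessing.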

3

The dashboard that reads the company.

One file is the single source of truth for what is happening across the business. Every program in the stack reads it — the morning briefing, the desktop dashboard, the email triage, the deal trackers. They all read from the same place, so they are all looking at the same picture. Each morning at 6:00 AM Central, a program writes the day's briefing into the same folder all the agents read. The briefing is not curated by someone. It is the company looking at itself.
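As a sketch of "every program reads the same file," here are two small readers working from one JSON state file. The file name, its fields, and both functions are assumptions made for illustration, and the sample deal is invented. The example writes the file into a temporary folder so it runs anywhere.

```python
import json
import tempfile
from pathlib import Path


def load_state(path):
    """Every program calls this, so every program sees the same picture."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def morning_briefing(state):
    lines = [f"Open deals: {len(state['deals'])}"]
    lines += [f"- {d['name']}: {d['next_step']}" for d in state["deals"]]
    return "\n".join(lines)


def dashboard_counts(state):
    return {"deals": len(state["deals"]), "unanswered_emails": state["unanswered_emails"]}


with tempfile.TemporaryDirectory() as folder:
    state_path = Path(folder) / "company_state.json"  # hypothetical file name
    state_path.write_text(json.dumps({
        "deals": [{"name": "Elm St duplex", "next_step": "send appraisal to lender"}],
        "unanswered_emails": 3,
    }), encoding="utf-8")
    state = load_state(state_path)
    briefing = morning_briefing(state)
    counts = dashboard_counts(state)

print(briefing)
print(counts)
```

Because both readers go through `load_state`, the briefing and the dashboard cannot disagree about how many deals are open.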

Notice the shape underneath all three. A written-down standard the agent reads. Work that is verifiable. A way for mistakes to feed back into the standard. The three concepts you just learned, running in the open. None of these are demos. They are what running my own day looks like.

VIII. Where You Stand

Where most companies stand — and where you have a choice

Most operators are sitting on the wrong side of this line and do not know it yet.

Their team is using AI. The CFO drafts a board memo with ChatGPT. Sales uses it for cold emails. A PM summarizes meetings with it. None of that is wrong. None of it compounds either. It is vibe coding for work that should be agentic engineering.

The work to cross over is not buying a bigger AI subscription. It is harder than that. Sort your workflows by what is verifiable. Kill the reports nobody reads. Turn your operating manual into a real spec. Name, in writing, what you own and what the agent owns. Stand up a way for mistakes to feed back into the standard. Put a verification gate on anything important before it ships.

The good news is you now have the vocabulary. Three generations of software. The context window. Verifiable work versus not. Vibe coding versus agentic engineering. The spec. Every owner you know is going to be hearing these words for the next year — you can already use them, and use them correctly.

Apps → Prompts
Documentation → The desk Claude reads
Dashboards → Rendered on demand
Vibe coding → Agentic engineering
SOPs nobody reads → Specs the agent runs

From here you have two real choices.

Option one — take this brief and start. The work is not magic. The frame is in your hands and the tools are off-the-shelf. Plenty of operators are going to do this themselves and pull ahead of their market. If you are technically curious and have the time, the path is open.

Option two — bring somebody in who has already done it. Somebody who lets you skip the learning curve and start with the practices, the specs, and the agent setups already running. That is the work I do at MPX. I build the standards, the verification gates, and the agent setups that move a company across this line, and I stay on long enough that it keeps working after I leave.

Either path works. What does not work is staying where most companies are.

This is what the next generation of software looks like for an operator.

Anything short of this and the next decade goes to somebody who built it first.