Why the Incumbents Can't Ship This

The most common question we get about autonomous software is also the best one: won't Salesforce / HubSpot / Instantly just do this themselves?

They will certainly try. Every incumbent has shipped or announced a copilot, and some of them are genuinely good. So let me be precise about the claim, because it is not "incumbents are slow" or "incumbents can't do AI." They aren't and they can.

The claim is narrower and harder: a copilot is the ceiling of what their architecture and their business model allow, and the distance from copilot to autonomous software is not a feature gap. It's a rewrite of the assumptions underneath the product. I'm the engineer on our founding team, so this is the technical half of the argument; my co-founder Thomas covers the business-model half in The End of the Seat. The two halves reinforce each other, which is exactly why the trap is so hard to escape.

01 Every layer assumes a human session

Take any mature SaaS product and look at what's actually load-bearing in its architecture. It's not the features. It's an assumption so old it's invisible: somebody is logged in.

The data model is human-paced. State machines advance when a user clicks. Records have status fields that a person is expected to move. Half-finished work lives in browser state, draft modals, and the operator's head — not in the database. There is no first-class representation of "a job in progress," because the job in progress was the human's session.
Validation lives in the UI. Twenty years of edge-case handling is encoded as form validation, confirmation dialogs, tooltips, and "are you sure?" modals. The API beneath is thinner and more permissive than the UI, because the UI was the real guardrail. An autonomous operator works the API path — the path with the least accumulated wisdom.
Errors resolve to a human. When something goes sideways, the system's deep recovery strategy is: show an error, let the user figure it out. That's a fine strategy when a user is there. Autonomy needs durable execution — workflows that survive restarts, retry with judgment, escalate with context, and reconcile partial failures — woven through every operation. You cannot sprinkle that on afterwards; it changes how every write is structured.
Permissions model people, not policies. Roles, seats, SSO — who may click what. Autonomy needs a different question answered in the schema itself: what is the software allowed to decide alone, what needs approval, and what are the spending limits? Stakes, budgets, and autonomy levels per action class. If guardrails aren't first-class data, "autonomous mode" is a liability with a toggle.
Audit explains users, not decisions. Their logs answer "who changed this field." An autonomous system must answer "why did you email this person, and what did it cost?" — every action traceable to a goal, a decision, and a ledger entry. We learned this the expensive way: an LLM that moves money or sends email without a cost ledger and a decision audit isn't a product, it's an incident report with a waiting period.
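
To make "durable execution" concrete, here is a minimal sketch of the checkpoint-and-resume pattern, not our production code. The `jobs` table, `open_store`, and `run_job` are hypothetical names chosen for illustration, and SQLite stands in for whatever store a real system would use.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the job store. Hypothetical schema: one row per job, holding the
    index of the next step to run and the accumulated state as JSON."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id TEXT PRIMARY KEY, next_step INTEGER NOT NULL, state TEXT NOT NULL)"
    )
    return db


def run_job(db, job_id, steps):
    """Run steps in order, committing a checkpoint after each one.

    If a step raises or the process dies, calling run_job again with the
    same job_id resumes at the first step that was never checkpointed.
    """
    row = db.execute(
        "SELECT next_step, state FROM jobs WHERE id = ?", (job_id,)
    ).fetchone()
    if row is None:
        next_step, state = 0, {}
        db.execute("INSERT INTO jobs VALUES (?, ?, ?)", (job_id, 0, "{}"))
        db.commit()
    else:
        next_step, state = row[0], json.loads(row[1])

    for index in range(next_step, len(steps)):
        # A failure here propagates; the last checkpoint stays untouched.
        state = steps[index](state)
        db.execute(
            "UPDATE jobs SET next_step = ?, state = ? WHERE id = ?",
            (index + 1, json.dumps(state), job_id),
        )
        db.commit()
    return state
```

With a `research` step followed by a `send` step that fails once, the first `run_job` call raises after checkpointing `research`; the second call skips `research` and runs only `send`. Note the limit of the sketch: a crash between a step's side effect and its checkpoint re-runs that step, so real steps need to be idempotent. That is the sense in which durability changes how every write is structured.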

None of these are exotic. Each is a known pattern. The problem is that in a mature codebase the human-session assumption isn't a component you can swap — it's load-bearing in all of them at once, and it's coupled to the thing enterprises pay for: predictability. Which brings us to the part that isn't technical at all.

02 The copilot is the local maximum

Given that architecture, what's the rational AI strategy? Exactly what they're all doing: put an assistant in the corner of the screen. A copilot fits perfectly because it changes nothing — it lives in UI-land, drafts text into the same forms, and leaves the human to click the same buttons. The session assumption survives; the seat gets a feature.

This is why I say copilots are the ceiling, not the path. The copilot pattern is what the architecture yields under pressure. Going further — letting the software run jobs end-to-end, overnight, unattended — means confronting every bullet in the list above simultaneously, in a codebase where every one of them has a decade of features sitting on top.

And here the technical trap interlocks with the economic one. The rewrite is merely expensive; the destination is the real deterrent. Their revenue is priced per seat, per human operator. Software that removes the operator shrinks the seat count that their own pricing multiplies by. Christensen wrote the book on this: it's not that incumbents don't see disruption coming, it's that their best customers and best margins vote, every quarter, against becoming the thing that's coming. (Thomas takes this apart properly in his essay.)

So the honest version of "won't they just do this?" is: they would have to re-architect the product and re-price the business at the same time, while the current product and the current pricing are still paying everyone's salaries. Companies have done this — it has a name, "betting the company" — and it's rare for a reason.

03 What building autonomy-first looks like

The inverse claim has to be earned too: what do you actually get by starting from zero with autonomy as the ground assumption? Concretely, in our stack, the differences are structural:

Jobs, not sessions, are the unit of work. Everything Otto does — research a market, source a play's accounts, run an outreach sequence — is a durable workflow with persisted state. A deploy, a crash, a rate-limited provider: the job resumes. No browser tab is ever the source of truth.
Guardrails are schema, not vibes. Budgets, daily pacing caps, per-action stakes, and autonomy levels are rows and columns. The operator brain proposes; a deterministic layer checks feasibility against the guardrails and either applies, queues for approval, or refuses. The LLM decides what; typed code decides how much is allowed.
Every euro and every decision has a ledger line. LLM calls, enrichment costs, sends — attributed to the product, the play, the decision that caused them. "Why did you do that, and what did it cost?" is a query, not a forensic investigation.
The human's controls are part of the data model. Approve/auto modes per decision class, hand-edits treated as ground truth, a kill switch that the system re-checks before acting — designed in, because the steering wheel is the product as much as the engine is.
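
As an illustration of "the LLM decides what; typed code decides how much is allowed," here is a minimal sketch of a deterministic guardrail check with a ledger and a kill switch. It is written under stated assumptions rather than lifted from our codebase: `Guardrail`, `Proposal`, `Ledger`, `decide`, and `submit` are hypothetical names, amounts are integer cents, and the ledger is assumed to cover a single day.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    APPLY = "apply"
    QUEUE_FOR_APPROVAL = "queue_for_approval"
    REFUSE = "refuse"


@dataclass(frozen=True)
class Guardrail:
    """One row per action class (hypothetical columns)."""
    action_class: str
    autonomy: str             # "auto" or "approve"
    daily_cap: int            # max applied actions of this class per day
    daily_budget_cents: int   # max spend for this class per day


@dataclass(frozen=True)
class Proposal:
    """What the LLM side proposes; it carries no authority of its own."""
    action_class: str
    cost_cents: int
    decision_id: str
    reason: str


@dataclass
class Ledger:
    """Applied proposals for the current day (assumption: one ledger per day)."""
    entries: list = field(default_factory=list)

    def count(self, action_class):
        return sum(1 for e in self.entries if e.action_class == action_class)

    def spent_cents(self, action_class):
        return sum(e.cost_cents for e in self.entries if e.action_class == action_class)


def decide(proposal, guardrails, ledger, kill_switch):
    """Pure check: apply, queue for approval, or refuse."""
    if kill_switch():  # re-read on every decision, never cached
        return Verdict.REFUSE
    rail = guardrails.get(proposal.action_class)
    if rail is None:  # unknown action classes are never allowed
        return Verdict.REFUSE
    if ledger.count(proposal.action_class) >= rail.daily_cap:
        return Verdict.REFUSE
    spent = ledger.spent_cents(proposal.action_class)
    if spent + proposal.cost_cents > rail.daily_budget_cents:
        return Verdict.REFUSE
    if rail.autonomy != "auto":
        return Verdict.QUEUE_FOR_APPROVAL
    return Verdict.APPLY


def submit(proposal, guardrails, ledger, kill_switch):
    """Decide, and write a ledger line only for actions actually applied."""
    verdict = decide(proposal, guardrails, ledger, kill_switch)
    if verdict is Verdict.APPLY:
        ledger.entries.append(proposal)
    return verdict
```

Because each ledger entry keeps its `decision_id`, `reason`, and `cost_cents`, answering "why did you do that, and what did it cost?" reduces to a filter over `ledger.entries`. Keeping `decide` free of side effects is a design choice for the sketch: the same check can be replayed during an audit without touching the ledger.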

Could an incumbent build all this? Of course — as a new product, on a new architecture, cannibalizing their own seats. At which point they're not shipping a feature. They're founding a competitor to themselves that happens to share a logo.

04 Becoming, not shipping

That's the asymmetry in one line: for us, autonomy is the foundation; for them, it's a renovation under a building that's open for business.

History rhymes here. The on-prem giants didn't lose to SaaS because they couldn't write web apps — they lost because multi-tenancy, subscriptions, and continuous deployment contradicted the license-and-maintenance machine that fed them. They bought their way into the new era a decade late, at conglomerate prices. The same shape repeats: when the new thing inverts a load-bearing assumption, incumbents don't ship it. They eventually acquire it.

Being autonomous software is something a company would have to become. We chose to start as one.

The proof, as always, is a URL away: outbound.ottosoftwares.com. Drop your website in and watch software with no operator do an operator's job.

— Benoit