What Comes After Vibe Coding

Your phone screen is a luminous rectangle containing a grid of rectangles, each one pretending the others don't exist.

Tap one rectangle and it fills the screen with the weather. Another shows your inbox.

Open your browser rectangle, type in "airbnb.com" and you get another rectangle.

None of these rectangles talk to each other. None of them know you just received an email that changes everything.

And yet, what are we vibe coding today?

More rectangles.

Faster than ever.


Since November, two trends point to where personal computing is heading next.

The first is that musicians, designers and other non-coders are now vibe coding apps with Claude Code. I wrote an article about that here.

The second is easy to miss, because it produces nothing visible at all.

When Peter Steinberger sent a voicemail to OpenClaw over WhatsApp on a whim, he didn't expect it to work. But the agent figured out which tools it needed, decoded the audio message, and sent a push notification back.

No app. No rectangle.

Just a continuous conversation with an AI model using its 'claws' to stitch files and APIs together.

The first trend makes rectangles easier and faster to build. The second makes them unnecessary.

OpenClaw gives us a glimpse at an app-free future where skills and tools can be composed together on a whim.

Which raises the questions: What role do apps have in the future? Will we even need apps when our phones become our personal agents? And how will dynamic rendering make us long for rectangles once more?

How We Got Here

The rectangle is not some fundamental unit of computing. It was a necessary abstraction that allowed us to divide up the work.

Word processors, spreadsheets and image editors all required teams of engineers to build. Packaging everything into rectangle-shaped apps that sat on top of Windows or Web APIs allowed companies to raise money and do one thing well.

Later on, iPhones brought rectangles to the masses. Each storing data in its own silo, showing you its own ads, and nudging you with its own notifications.

The rectangle became a unit of commerce, often the product of a VC-backed company hoping to get acquired.

But OpenClaw is pointing to a world where the best UI is no UI. Where many tasks happen in the background without any user intervention.

This poses an existential risk to apps and vibe coders alike.

Untangling Apps

If I asked you "What is an app?" and you said "an interface over data", you'd be right.

But this scientific description glosses over the fact that the user interface is the only way a company has to sway you emotionally.

The best apps deliberately fuse data with delightful experiences: Tinder's swipe, Uber's real-time map, a guitar synth that allows you to bend the strings on-screen.

Ask most people to think of "Uber" and they'll probably picture the app on their phone.

And yet behind the app is a complex web of legal agreements with cities and drivers, a ride matching engine, a reputation system, and an interface designed to extract maximum revenue from its users.

We humans may have an emotional connection with apps we've been using for years, but personal agents like OpenClaw aren't going to be tapping and swiping around the screens.

They want direct access to the data!

And this is where Big App gets nervous.

Take away the interface and a lot of their tricks to retain you and extract more money stop working.

An animated "Surge Price" countdown timer won't make your lobster's heart beat any faster.

The entire app economy is built on the assumption that a human is looking at a screen. Want to see who's behind the blurred-out square in a dating app? You'll need to subscribe to premium.

Remove the screen and the old business model falls apart.

What Stays, What Goes

The kind of apps you find in the App Store today are being squeezed from both ends.

Vibe coding shows unlimited demand for personalized interfaces. AI agents don't need them at all.

Anyone paying attention to both trends can sense everything is about to change, so let's start with what won't: the backend.

The backend Services - matching riders with drivers, processing payments, routing deliveries - aren't going anywhere.

These systems will remain reassuringly deterministic. Nobody wants a creative interpretation of a bank transfer.

But go up a layer, and consider the web or app interface that calls these APIs.

Until now, custom interfaces were like private chefs. Only millionaires could afford to hire a team to cook exactly what they wanted. Everyone else ate at McDonald's - the same menu, designed once, served to billions.

Minor variations (color themes, dark/light mode) became switches and sliders in a Settings screen. A screen that most non-technical people never go near out of fear. "If it ain't broke, don't fix it."

Deterministic mass-market interfaces were never what we wanted - they were simply all we could make before AI came along.

Interfaces From Descriptions

We already have an existence proof that dynamic interfaces are possible.

Give two different image generation models the same detailed prompt and they'll both produce something faithful to that description. Not pixel-identical. But recognizably the same scene.

ChatGPT's Image 1.5 model. Prompt: A dramatic, cinematic product-ad parody in the style of the Brawndo commercials from the movie Idiocracy. A glowing, oversized energy-drink-style can labeled "VIBE CODE" with electric neon green and purple lightning bolts radiating from it. The can is being held up triumphantly by a pair of hands emerging from a laptop screen. Instead of electrolytes, the can's tagline reads "IT'S WHAT BUILDERS CRAVE." The background is an explosion of floating app UI windows, deploy buttons, and database icons being absorbed into the can like a vortex. Hyper-saturated colors, lens flares, absurd corporate energy, mountain-dew-ad-meets-tech-startup aesthetic. Shot like a Super Bowl commercial still frame. Photorealistic with exaggerated commercial lighting and a subtle tongue-in-cheek tone. Purple colors.

The identical prompt given to Nano Banana

Dynamic interfaces could work the same way. Here's what a specification for one might look like:

"When the user's ride is two minutes away, render a full-width map showing the car's live position relative to the user. Use the service's brand colors for the vehicle marker. Animate the route as a flowing line, not a static path. When the car arrives, collapse the map to a quarter of the screen and surface the driver's name and vehicle details prominently."

Let's call that an Intent. No code, pixel coordinates or responsive breakpoints. You can see the interface in your head and so can a model.
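To make the idea concrete, here is a minimal sketch of how an Intent might be captured as structured data and serialized into a prompt for a rendering model. The `Intent` class and its field names are entirely hypothetical - no such schema exists today:

```python
from dataclasses import dataclass, field

@dataclass
class Intent:
    """A hypothetical, declarative UI intent: what to show, not how to draw it."""
    trigger: str       # natural-language condition for when this intent applies
    surface: str       # what the model should render
    constraints: list[str] = field(default_factory=list)  # hard requirements

ride_intent = Intent(
    trigger="ride is two minutes away",
    surface="full-width live map of the car relative to the user",
    constraints=[
        "use the service's brand colors for the vehicle marker",
        "animate the route as a flowing line, not a static path",
        "on arrival, collapse the map to a quarter screen and surface driver details",
    ],
)

# The "prompt" a rendering model might receive is just the intent, serialized.
prompt = f"When {ride_intent.trigger}: {ride_intent.surface}. " + " ".join(
    f"Constraint: {c}." for c in ride_intent.constraints
)
print(prompt)
```

Notice there are still no pixel coordinates or breakpoints anywhere: the model is free to adapt the rendering to the device, while the constraints pin down what must hold.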

Now imagine it springing to life on your screen in real time, adapted to the device, screen orientation and in your favorite colors and styles.

Someone still has to write these specifications and create surprise and delight. Designers aren't going anywhere.

Now I know what you're thinking. Interfaces aren't images. What about error states, cancellations, network failures?

But the models already understand all of this.

They've seen millions of apps. They know what to show when a ride gets canceled, the same way they know to add error handling when you vibe code today. Intents describe what's different, what matters. The model fills in the rest.

Veo 3.1 simulating Slack

Another objection is likely to be consistency. Don't we need our interfaces to look and behave the same way every time?

The answer is, it depends. For money transfers or signing a legal document onscreen, yup - deterministic interfaces are going to hang around for a while.

But the screens that show you the weather, nearby restaurants, or your calendar for the day? They need to be correct, but nobody ever wanted them to look the same on every device.

What It Will Feel Like

When we figure out how to tap into the immense hardware already inside our modern phones, dynamic rendering will bring everything alive.

You'll talk to the screen and it will adjust in real time. "Make it more green." "Bring back the map from yesterday."

Want something to stay in the same place? Flick it, squeeze it, lock it where you want it. Muscle memory is preserved.

Drop things on other things and the AI figures out your intention, or asks you if it's unsure. Drag your calendar to the map icon and suddenly a live route appears.

You notice your friend using their phone on the train and love their look and feel, so you bump your phones together, pull their styles across, and blend them with your own.

Their taste, your data.

Then Taylor Swift releases a style pack for $9 and your Uber now moves around on the real-time map in a glittery pink haze. Every new style just another track on the mixing desk. A starting point for further customization.

Take this a step further and it's not hard to imagine groups of people creating software experiences together.

Several people around a large screen. The AI recognizes each voice, already knows what each person cares about. One describes a character, another adjusts the lighting, a third pushes back on a narrative choice.

Software creation stops being a solo gig and becomes more like jamming in a band.

Nobody Wants Perfection

Dynamic rendering will make all day-to-day computing perfect. And that's the problem.

Imagine you've just moved into a new home. New furniture, your favorite colors, just the right amount of clutter. Everything arranged exactly the way you want it.

Like a partner who's "too nice", perfect gets boring quickly.

So you close the front door and walk outside. New places, new people, new experiences to talk about when you get home. Then the pendulum swings too far.

Soon you need a vacation from your vacation. Nobody wants to live in Disneyland forever.

Designed Experiences

If our home screens show us everything we want to see, then we'll need something to take us out of our comfort zone.

Think about the last time a game really made you think, or a movie changed your entire perspective on life. Someone made a deliberate choice to challenge you and make you uncomfortable.

Where your home screen adapts to you, a Designed Experience will force you to adapt to it.

Your favorite film director doesn't care about your color palette or favorite font. That's the point.

Some full-screen rectangles will remain stubbornly deterministic for years to come, just like today's AAA games, where the creator controls every pixel.

But most will be rendered in real time from rich specifications: character descriptions, narrative arcs, interaction rules, deep lore.

The rectangles that survive will be the ones with something worth saying.

JK Rowling mapping out the Harry Potter lore

The Code Nobody Reads

Getting more technical for a moment, you may be wondering what we'll be coding all this in.

Will it be TypeScript, where requestAnimationFrame() takes four tokens to output, or the Next.js app router, which requires Claude Code to call mkdir -p multiple times to nest routes in folders?

I think not.

Last year I spent a few hours creating a toy language called Protocol X to explore more token-efficient deterministic code generation.

It was a fun experiment, but I quickly realized I was barking up the wrong tree.

If you're building a high-stakes backend service (payments, medical), you need the ability to manually read the code in a language you understand. It's your reputation on the line after all.

For everything else, you need to be writing specifications - not deterministic code.

If the specifications are clear enough, and you've described what the software should do with sufficient precision, then the code can be regenerated on the fly.

The choice of language or framework becomes irrelevant.

New hardware comes out? Regenerate the code, optimized for the new architecture. New security vulnerability discovered? Regenerate with the fix and re-verify.

This is not a new idea. Cough cough formal verification. But it makes me more sure than ever that brittle deterministic code has no future in user interfaces.

Verifying Specifications

The biggest objection to all of this is also the most important one: you can only verify what you can specify.

True. If your specification is incomplete, the generated code might be perfectly spec-correct and still do the wrong thing.

Someone once asked me why the Ledger hardware crypto wallet couldn't just be installed inside the MacBook at the factory. The question is more profound than it seems.

Two devices from two companies, made in separate factories and purchased separately, are monumentally harder to compromise.

Likewise frontier models are trained on different data, have different quirks, and compete neck-and-neck to win benchmarks and discover vulnerabilities.

Anyone who's writing code today already runs it through competing frontier models to check for bugs and security exploits. The same will be true for Specs tomorrow.

Consent Theater

If the future of personal computing is personal agents and dynamic rendering, and I believe it is, there's one more thing that's missing: Constraints.

Uber isn't letting your OpenClaw book rides directly until it can satisfy its lawyers that you, the human, understood and agreed to the terms.

Confirming the destination, viewing the fare, accepting surge pricing, pressing confirm. Each step a tiny act in a legal ceremony.

An agent will have to perform these rituals on your behalf ("Yes, I confirm I told the user about the 40% price increase") and somehow prove to the backend that it has.

We know most of these rituals, like the EU cookie banner and the EULA you never read, are bullshit.

Taxes well spent

But some are not.

When you transfer ten thousand dollars, or sign to authorize a medical procedure, you very much want to be in the loop.

So constraints could look something like:

"Require biometric confirmation for payments over a thousand dollars." "The user must understand the exact amount and confirm it twice."

On payments specifically, it's no longer sufficient to simply ask "Are you the cardholder?" You want your agent to act on your behalf, even allowing it to buy concert tickets while you're asleep.

The problem is that consent and action are now separated in time.

Verifiable Intents

A few days ago, Mastercard and Google revealed a new standard: Verifiable Intent - an open spec that creates a cryptographic delegation chain.

You work with your agent to set rules (spending limits, merchant allowlists, budget caps) and sign them via passkey or biometric. Then, when your agent transacts, the backend can cryptographically verify it stayed within scope.

The idea being that each party sees only what it needs. The store gets proof the agent is authorized but not your full identity. The issuer gets the audit trail but not the merchant's pricing.
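To get a feel for what "cryptographically verify it stayed within scope" means, here is a heavily simplified sketch. The real Verifiable Intent spec is built on passkeys and public-key signatures; this stand-in uses a symmetric HMAC and invented rule names, just to show the sign-then-verify flow:

```python
import hmac
import hashlib
import json

SECRET = b"user-device-key"  # stand-in for a passkey; illustrative only

# Rules the user signed off on while setting up their agent.
rules = {"spend_limit_usd": 500, "merchant_allowlist": ["ticketco.example"]}

def sign(payload: dict) -> str:
    # Canonical serialization so signer and verifier hash identical bytes.
    blob = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(SECRET, blob, hashlib.sha256).hexdigest()

mandate = {"rules": rules, "signature": sign(rules)}

def verify_transaction(mandate: dict, merchant: str, amount_usd: float) -> bool:
    # The backend first checks the rules weren't tampered with,
    # then that the transaction stays inside the delegated scope.
    if not hmac.compare_digest(mandate["signature"], sign(mandate["rules"])):
        return False
    r = mandate["rules"]
    return merchant in r["merchant_allowlist"] and amount_usd <= r["spend_limit_usd"]

print(verify_transaction(mandate, "ticketco.example", 120.0))  # in scope
print(verify_transaction(mandate, "scalper.example", 120.0))   # wrong merchant
```

An agent buying concert tickets at 3am can present the mandate, and the backend can check scope without waking you up - consent and action separated in time, but still linked cryptographically.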

It's early days, but Big App needs to pay attention and start thinking about how to adapt their APIs for personal agents.

Apps that don't will simply be driven by computer use until a better alternative comes along.

The Money Problem

Being able to transact with money and being able to make money are two separate things.

Pay-per-use APIs and subscriptions will survive. But advertising is the real unknown.

I know this won't land well with my Claude-Max-subscribing readers, but advertising is the only way millions of people access YouTube and Gmail for free. The screen was where ads lived. No screen, no free tier.

The likely first attempt is ads inside the agent UI itself. This is almost certainly what OpenAI will do when they ship 'OpenAIClaw' later this year. But that's just rebuilding the rectangle inside the agent.

Every major frontier lab has committed to keeping model outputs free from advertising. But once the AI bubble finally bursts, the economic pressure to break that promise will be enormous.

Nobody knows how to replace the revenue screen-based advertising brings in. I don't know either. And there's something else I'm uncertain about.

The Last Open Platform

The open, permissionless, URL-addressable web is the last major computing platform not controlled by a corporation. Anyone can spin up a server, sell things online, and avoid the 30% Apple tax.

Openness is the reason the web, Bitcoin, and now OpenClaw, quickly spawned meetups all over the world.

If advertising dies and no replacement emerges, the free open web dies with it. The VC-backed frontier labs that can afford to operate without ads, at least for a while, become the new gatekeepers by default.

In a closed platform your agent can only discover and talk to services the company has approved. Exactly how OpenAI Apps work today.

Luckily, the early signs are encouraging. Sites are already publishing llms.txt files. Some expose Skill.md files directly.

The next step is something like a Spec.md - a single file describing the Services, Intents, and Constraints that agents need to interact with a backend and draw a dynamic interface in real time.
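Nobody has defined this format yet, so any sketch is pure speculation - but a Spec.md in the spirit of llms.txt might read something like:

```markdown
# Spec.md (hypothetical)

## Services
- POST /rides/request - deterministic ride-matching API

## Intents
- "When the ride is two minutes away, render a live map of the car
  relative to the user, in the service's brand colors."

## Constraints
- Require biometric confirmation for payments over $1,000.
```

One file, three sections: what the backend does, how its interfaces should feel, and where the human must stay in the loop.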

But if the economics don't work out, openness won't survive on goodwill alone.

We'll trade one set of rectangles for another set of walls.

After the Rectangle

So what comes after vibe coding?

If these diverging trends eventually connect back together, as I think they will, we'll be left with two things: Deterministic Services and Dynamic Interfaces. Together these form Experiences.

Tomorrow's vibe coder won't know what a Service, Intent or Constraint is, any more than they know what a React hook is today. They'll be speaking, pushing, dragging, and one day thinking new ideas into existence in real time.

The Experiences they make can be shared (for a price), forked, and forever customized.

New platforms will make everyone a 'vibe coder' in the way smartphones made everyone a photographer. And yet, you'll still want to hire a professional to record your wedding.

Designed Experiences will survive - the more edgy and opinionated the better. Everything else will be shown when, and only if, you need it - on a fluid screen that rearranges itself around what you're doing.

All this raises deeper questions.

Who owns your data when it is no longer scattered across apps? Can this run on our existing platforms or do we need a new operating system? And what happens when an intelligence that knows everything about you is also the one deciding what you see?

That's the next article.