From Demo Video to Real Automation
Eighteen months ago, Computer Use was still an Anthropic demo: an AI moves a mouse cursor, clicks buttons, types into fields. Impressive, but slow, error-prone and barely usable in production. Today, in spring 2026, browser agents have reached a maturity that makes real workflows automatable.
We now use them in several client projects – for data migrations, research tasks and integrations with systems that don't offer an API. Not every use case has suddenly become automatable. But the list of sensible deployments has grown a lot longer.
How the Technology Has Changed
The old generation of browser automation – Selenium, Puppeteer, Playwright – works through CSS selectors and DOM paths. As soon as a page's HTML changes, the scripts break. Every redesign of a third-party tool meant days of maintenance.
The new generation works through visual understanding. The agent sees the screen the way a human does – as pixels or via the accessibility tree – and decides on the basis of what it sees. "Click the save button" is an instruction it understands, even if the button moves tomorrow or looks different. That makes browser agents robust against UI changes that would shred classic scripts.
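To make the contrast concrete, here's a minimal sketch. The Playwright half uses the real Playwright Python API (assuming `playwright` is installed); `BrowserAgent` is a hypothetical stand-in for whichever agent SDK you actually use, since vendor interfaces differ.

```python
# Classic automation vs. agent-based automation, side by side.
from playwright.sync_api import sync_playwright


def save_record_with_selectors(url: str) -> None:
    # Every selector is a contract with the current DOM. A redesign
    # that renames .btn-save silently breaks this function.
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        page.click("#content form .toolbar button.btn-save")
        browser.close()


class BrowserAgent:
    """Hypothetical agent client; real SDKs expose something similar."""

    def run(self, task: str, start_url: str) -> None:
        raise NotImplementedError("wire this up to your agent provider")


def save_record_with_agent(url: str) -> None:
    # The instruction describes intent, not DOM structure, so a moved
    # or restyled save button does not break the workflow.
    BrowserAgent().run(task="Click the save button", start_url=url)
```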
Three further improvements have arrived alongside this:
Speed: The screenshot → model → action roundtrip has dropped from several seconds to under one second. That means workflows with dozens of steps can now run in minutes instead of hours.
Reliability: Recognition accuracy for UI elements sits above 95% with the leading models. Not perfect, but good enough for supervised workflows.
Cost: A typical browser workflow now costs between 5 cents and 1 euro – depending on length. That makes automation viable even for medium volumes.
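A quick back-of-envelope calculation shows how these three numbers play together. The per-step figures below are illustrative assumptions inside the ranges just mentioned, not vendor pricing:

```python
# Rough budget for a 25-step workflow at sub-second roundtrips.
STEPS = 25
SECONDS_PER_STEP = 0.8   # screenshot -> model -> action roundtrip
EUR_PER_STEP = 0.015     # assumed blended model cost per step

runtime_s = STEPS * SECONDS_PER_STEP
cost_eur = STEPS * EUR_PER_STEP
print(f"~{runtime_s:.0f} s runtime, ~{cost_eur:.2f} EUR per run")
# -> ~20 s runtime, ~0.38 EUR per run: within the 5-cent-to-1-euro range.
```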
Where Browser Agents Shine
Clear strengths emerge from our projects:
Legacy system integration: When an old ERP or CRM system has no API, the only options used to be paying a person to click through the system or maintaining a fragile Selenium script. Browser agents are the new, more robust alternative.
Data extraction from web pages: Research tasks that pull data together from various sites are a natural fit. Unlike classic web scraping, the agent doesn't need to be programmed for each page individually – it understands each one ad hoc.
Repetitive admin tasks: Filing applications in government portals, placing orders through supplier portals, uploading invoices into someone else's accounting system. Tasks where a human performs the same steps in a third-party UI every day – ideal candidates.
End-to-end tests: Instead of patching test scripts after every UI change, an agent can be told to "walk through the full checkout flow and report anything unusual." Test maintenance shifts from code to instructions.
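To make that last point concrete, here's what an instruction-driven test can look like. `run_agent` is a hypothetical wrapper around an agent SDK, and the report fields are assumptions; the point is that the instruction replaces what used to be selector-based test code.

```python
def run_agent(task: str, start_url: str) -> dict:
    """Hypothetical: runs the agent and returns its structured report."""
    raise NotImplementedError("connect to your agent provider")


def test_checkout_flow() -> None:
    report = run_agent(
        task=(
            "Add any in-stock product to the cart, complete checkout "
            "with the test credit card, and report every step plus "
            "anything unusual: layout glitches, errors, slow pages."
        ),
        start_url="https://staging.shop.example.com",
    )
    # Assert on the structured report, not on DOM selectors.
    assert report["reached_confirmation"], report
    assert not report["anomalies"], report["anomalies"]
```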
Where They Still Fall Short
Just as important: where browser agents still fail today.
High-frequency operations: When something needs to run a hundred times a minute, a browser agent is too slow and too expensive. API-based or classic script solutions remain the right choice here.
Security-critical actions: Banking, money transfers, signing contracts – anything with financial or legal consequences. Even a 1% error rate is unacceptable here.
Captcha and bot detection: Many sites detect automated use and block it. A browser agent is legitimate when it runs against your own systems and accounts. On third-party platforms it can violate the terms of service – a legal minefield.
Very long workflows: Tasks with more than 30–40 steps accumulate errors. If the agent clicks the wrong button at step 27, every step after that builds on the wrong state. Longer workflows should be broken into stages with intermediate validation.
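A minimal sketch of that staging pattern. `run_stage` and the checks are placeholders for your agent and your validation mechanism (API call, database query, second agent):

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Stage:
    task: str                  # scoped instruction for the agent
    check: Callable[[], bool]  # independent validation, not agent self-report


def run_stage(task: str) -> None:
    raise NotImplementedError("hand the task to your browser agent")


def run_pipeline(stages: list[Stage]) -> None:
    for i, stage in enumerate(stages, start=1):
        run_stage(stage.task)
        if not stage.check():
            # Stop at the first failed check instead of letting a wrong
            # click at step 27 poison everything that follows.
            raise RuntimeError(f"stage {i} failed validation: {stage.task!r}")
```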
The Architecture That Works
The production-ready browser agent setups we've built follow a similar pattern:
Separation of orchestrator and browser agent: An orchestrator system (often a workflow tool or a dedicated backend component) decides which workflows run. The browser agent gets clearly scoped tasks with a defined success criterion.
Sandboxed execution: The agent runs in an isolated browser environment – container, dedicated user account, restricted network access. That keeps the blast radius of any error contained.
Validation layer: After each workflow, a second mechanism validates the result – often an API call, a database check or a second agent – to confirm the desired outcome. The agent's own self-report isn't enough.
Human-in-the-loop for edge cases: When the agent is uncertain or hits something unexpected, it escalates to a human. Deliberately calibrating that escalation threshold matters more than the choice of model.
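Put together, the four elements form a loop like the following sketch. Every name in it is a placeholder for your own infrastructure, not a specific vendor API:

```python
from dataclasses import dataclass


@dataclass
class Task:
    instruction: str    # clearly scoped job for the agent
    success_probe: str  # e.g. an API endpoint or SQL query checked afterwards


def run_in_sandbox(task: Task) -> dict:
    """Run the agent in an isolated browser (container, throwaway
    account, restricted egress) and return its structured result."""
    raise NotImplementedError


def validate(task: Task, result: dict) -> bool:
    """Second mechanism (API call, DB check, second agent) confirming
    the outcome; the agent's self-report alone doesn't count."""
    raise NotImplementedError


def escalate_to_human(task: Task, result: dict) -> None:
    raise NotImplementedError


def orchestrate(task: Task, confidence_threshold: float = 0.8) -> None:
    result = run_in_sandbox(task)
    uncertain = result.get("confidence", 0.0) < confidence_threshold
    if uncertain or not validate(task, result):
        # The calibration of this threshold is the real tuning knob.
        escalate_to_human(task, result)
```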
The Legal and Ethical Pitfalls
Browser agents operate in a grey area. Three points should be settled up front:
Permission from the platform operator. When an agent runs on your own site or in your own account, no problem. On third-party platforms, ToS violations can become an issue. A quick look at the terms of service saves trouble later.
Data protection. Browser agents see everything on the screen – including data the agent doesn't actually need. Anyone exposing the system to patient, customer or employee data has to document and secure those data flows.
Liability for errors. When a browser agent triggers the wrong order, submits the wrong form or writes a mistake into a third-party system – who's liable? That belongs in the contract with the vendor and in your internal processes.
What This Means for Businesses
Browser agents are the missing piece for automation in companies with a heterogeneous system landscape. Anyone running customer or employee workflows that today get clicked through third-party UIs by hand should at least evaluate in 2026 what can be automated. The savings are real, and the implementation effort has become manageable.
At nh labs, we start projects like this with a quick mapping: which workflows are frequent enough, clear enough and tolerant enough for browser automation? That list becomes the first pilot – almost always with ROI inside three months.
Conclusion
Browser agents have arrived in 2026 where code agents stood in 2024: not a fit everywhere, but surprisingly good for the right tasks. Anyone gathering first-hand experience now is building an automation capability that will be strategically important over the next few years. Those who wait for the technology to be "fully mature" give away one or two years of optimisation potential – and hand their competitors a head start.