Agentability.io — Rohan Saraf

The question

AI agents are increasingly being asked to operate software — navigate interfaces, extract information, take actions, complete tasks. But most software was designed assuming the operator has eyes, a mouse, and a mental model built from decades of interacting with visual interfaces. Agents have none of these.

When an AI agent fails to complete a task in a software system, where does the failure originate? In the agent? Or in the software that was never designed to be agent-operable?

Agentability.io is a research project built to investigate that question and produce a framework for answering it.

Agent Factors Engineering

I coined the term Agent Factors Engineering (AFE) to name the discipline this research points toward. It is the agentic-era successor to Human Factors Engineering — the same structural question applied to a new class of operator.

HFE asked: what does the human need to understand and operate this system? AFE asks: what does the agent need to understand and operate this system? The questions are structurally identical. The answers are different — because agents parse rather than read, query rather than navigate, and fail from ambiguity and structural inconsistency rather than cognitive load.

AFE produces 8 principles for agent-era software design. The full framework is on the Frameworks page.

The audit pipeline

The live tool at agentability.io accepts a URL and returns an Agentability Score (0–100) across the 8 AFE principles. The pipeline:

1. Playwright crawl: headless Chromium renders the page and extracts DOM structure, ARIA attributes, and semantic markup.

2. Automated checks: rule-based evaluation of structural properties (labels, ARIA roles, status indicators, API endpoints).

3. LLM evaluation: Claude Sonnet assesses subjective properties (chunking quality, transparency of intent, handoff legibility).

4. Composite score: a weighted aggregate of the results, mapped to an Agentability Tier (Agent-Ready / Developing / Lagging / Agent-Blind).
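
To make the automated-checks step concrete, here is a minimal sketch of one rule-based check: finding form controls that have no machine-readable label. It is an illustration only, not the project's actual rule set. The class and function names (LabelCheck, unlabelled_controls) are hypothetical, and it runs on a raw HTML string using Python's standard library rather than on a Playwright-rendered DOM.

```python
from html.parser import HTMLParser


class LabelCheck(HTMLParser):
    """Collects form controls and the ids referenced by <label for="...">."""

    def __init__(self):
        super().__init__()
        self.controls = []          # attribute dicts for input/select/textarea
        self.labelled_ids = set()   # ids targeted by a <label for="...">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label" and attrs.get("for"):
            self.labelled_ids.add(attrs["for"])
        elif tag in ("input", "select", "textarea"):
            # Hidden inputs are not operable, so they need no label.
            if attrs.get("type") != "hidden":
                self.controls.append(attrs)


def unlabelled_controls(html: str) -> list[dict]:
    """Return the attributes of every control with no label an agent can parse.

    Assumption: a control counts as labelled if it has aria-label,
    aria-labelledby, or an id matched by a <label for="...">.
    Limitation: controls nested inside a <label> are not detected here.
    """
    parser = LabelCheck()
    parser.feed(html)
    parser.close()
    missing = []
    for attrs in parser.controls:
        has_aria = bool(attrs.get("aria-label") or attrs.get("aria-labelledby"))
        has_label = attrs.get("id") in parser.labelled_ids
        if not (has_aria or has_label):
            missing.append(attrs)
    return missing
```

A check like this returns a list of offending elements rather than a pass/fail flag, so the same output could feed both a score and a human-readable report. How the real pipeline turns such findings into per-principle scores is not specified here.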

The Agentability Index

Beyond the audit tool, agentability.io publishes the Agentability Index: a ranking of approximately 100 widely-used software products evaluated against the AFE rubric. The Index is the research artifact — it makes the state of the industry legible at a glance.

The finding so far: most production software scores in the Lagging tier (20–34). Very few products score in the Developing range. Agent-Ready software is rare. This is the baseline against which the discipline of AFE will be measured as the agentic era develops.

What this is not

Agentability is not a website audit tool. It is not a competitor to Lighthouse or axe. It is a design research project investigating what software becomes in a world where AI agents are primary operators, not assistants.

The audit tool is a means of generating data. The AFE framework is the contribution. The Agentability Index is the evidence that the framework surfaces real patterns across real software at scale.

Infrastructure

Self-hosted on a Hetzner CX23 server (2 vCPU, 4 GB RAM). n8n orchestrates the audit workflow. A Dockerised Playwright container handles crawling. Supabase stores audit results. The public gateway runs at n8n.icuboid.in/webhook/public-audit.

Running this on constrained infrastructure is intentional — it demonstrates that serious research infrastructure doesn't require enterprise cloud spend, and it forces me to understand every constraint in the system I'm studying.

The tiers

Agent-Ready: ≥ 45

Developing: 35–44

Lagging: 20–34

Agent-Blind: < 20
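
The step from per-principle scores to a tier can be sketched as a weighted mean followed by a threshold lookup. The thresholds below are the ones listed above. Everything else is an assumption made for illustration: the principle names (p1 to p8), the equal weights, the sample scores, and the choice to treat a fractional score as belonging to the lower tier until it reaches the next floor.

```python
# Tier floors from the list above, highest first.
TIER_FLOORS = [
    (45, "Agent-Ready"),
    (35, "Developing"),
    (20, "Lagging"),
    (0, "Agent-Blind"),
]


def composite_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of per-principle scores, each on a 0-100 scale."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def tier_for(score: float) -> str:
    """Map a 0-100 composite score to its tier name."""
    for floor, name in TIER_FLOORS:
        if score >= floor:
            return name
    return "Agent-Blind"  # only reached for negative input


# Hypothetical scores for eight placeholder principles, equally weighted.
scores = {f"p{i}": s for i, s in enumerate([30, 40, 20, 25, 35, 30, 20, 40], start=1)}
weights = {name: 1.0 for name in scores}
print(tier_for(composite_score(scores, weights)))  # Lagging
```

Keeping the weights in a separate mapping means the rubric can be re-weighted without touching the scoring code, and the tier floors stay in one place if the bands are ever revised.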

Live site

agentability.io