How AI shopping agents actually see your website

When Claude, ChatGPT, or Perplexity browses your store on a shopper's behalf, it doesn't see what you see. It reads the accessibility tree, structured data, screenshots, and the URL. Here's the mental model — with the parts that actually trip agents up on real e-commerce sites.

By Yann Borie, Founder, Serge | 22 May 2026 | 12 min read

TL;DR. AI shopping agents like Claude, ChatGPT, and Perplexity read four things on your website: the accessibility tree, a screenshot, the structured data in JSON-LD blocks, and the URL. They do not read your CSS or your visual design. A "beautifully designed" e-commerce site can score 34 out of 100 on agent product findability while a plain site scores 82, because agents read a different document from the one your designer painted. The fix is structural: real <button> elements, accessible names on variant selectors, schema.org Product data, and an unblocked robots.txt. Visibility (GEO) and click-time task completion are two different layers: llms.txt matters for the first, the DOM matters for the second. Most "AI readiness" advice today blurs them.

In this article:

The agent isn't reading your page — it's reading your accessibility tree
What the agent's working surface looks like
What it sees vs. what it cares about
The contrarian section: skip "AI SEO" for now
What to look at first, if you're going to look at one thing
Frequently asked questions

A Product Director at a Swiss retailer told me recently:

"I tried it. I asked Claude to find AA batteries on our store and add them to my cart. It opened our homepage, scrolled, clicked into the wrong category, gave up, and went to a competitor's site."

She wanted to know what Claude had actually seen on her store when it gave up. Not what a human sees — a human can spot a category nav at a glance. What was the agent's working surface? Where did it lose the thread?

This is the question we get most often from the Product Director / Front End Lead pair we work with. Once they've felt the failure firsthand, they need a mental model to bring back to their team. This post is that mental model — written for someone who's already responsible for the checkout, not for someone learning HTML for the first time.

The agent isn't reading your page. It's reading your accessibility tree.

When Claude Desktop, ChatGPT Operator, or a browser-use agent lands on your storefront, it does not take your fully-rendered visual page and try to parse the colors, layout, and design language. That's a tempting mental model — agents are big and capable, surely they "see the page" — but it's wrong, and it's the source of most agent-readiness misunderstandings.

What the agent actually reads, on every step, is:

The accessibility tree (the same tree a screen reader consumes — built from semantic HTML, ARIA roles, accessible names, and form labels)
A screenshot of the rendered viewport (yes, agents look at pixels — but as a secondary signal, used to confirm what the a11y tree already told them)
Structured data in <script type="application/ld+json"> blocks (schema.org Product, Offer, Breadcrumb)
The URL of the page they're on (and any og: meta tags they happened to fetch)

Notice what's not on that list: your CSS. Your fonts. Your hover states. Your Tailwind utility classes. Your design tokens. None of it reaches the agent's reasoning loop unless it changes the accessibility tree or the screenshot.
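To make the structured-data input concrete, here is a minimal sketch of pulling schema.org Product data out of a page's JSON-LD blocks using only the Python standard library. The sample page, the `JsonLdExtractor` class, and the `find_products` helper are illustrative assumptions, not how any particular agent is implemented. The sketch also ignores `@graph` wrappers and array-valued `@type`, which real pages sometimes use.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # CSS inside <style> also arrives here, but it is ignored
        # because we only buffer while inside a JSON-LD script.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed JSON-LD is skipped in this sketch


def find_products(html):
    """Return every top-level JSON-LD object whose @type is "Product"."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    products = []
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "Product":
                products.append(item)
    return products


# Hypothetical page: the CSS is present but never reaches the result.
SAMPLE_PAGE = """
<html><head>
<style>.price { color: red; }</style>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Laptop X-15",
 "offers": {"@type": "Offer", "price": "299", "priceCurrency": "CHF"}}
</script>
</head><body><h1>Laptop X-15</h1></body></html>
"""

for product in find_products(SAMPLE_PAGE):
    offer = product["offers"]
    print(product["name"], offer["price"], offer["priceCurrency"])
```

The styling in the sample never shows up in the extracted data, which is the point of the list above: name, price, and currency come from the JSON-LD block, not from how the page is painted.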

This is why a "beautifully designed" e-commerce site can score 34 out of 100 on agent product findability and a plain, ugly site can score 82. The agent is reading a different document than your designer was painting on.

What the agent's working surface looks like

Imagine your product detail page for a CHF 299 laptop. A human sees something like:

Big hero image. Headline: Laptop X-15. Price: CHF 299. A row of color swatches. A row of size buttons. An "Add to Cart" button. Reviews below.

When Claude visits that page, here's roughly what its working surface looks like (compressed for readability):

heading[level=1]: "Laptop X-15"
image: laptop-x-15-hero.jpg [no alt text]
text: "CHF 299"
generic: [color swatches container]
  button: [no accessible name]
  button: [no accessible name]
  button: [no accessible name]
generic: [size container]
  generic[role=button]: "S"
  generic[role=button]: "M"
  generic[role=button]: "L"
generic[role=button]: "Add to cart"

Notice the four problems an agent sees on this very typical page:

The hero image has no alt — the agent doesn't know whether it's a decorative banner or the actual product photo
The color swatches are buttons with no accessible name — the agent can see something is clickable but cannot tell what selecting each one does (which color is "burgundy" vs "navy" vs "silver"?)
The size selector is a <div role="button"> instead of a real <button> or <select> — semantically close, but a custom-built component that often doesn't handle keyboard or programmatic clicks reliably
The "Add to cart" CTA is also a <div> with role="button" — same problem

A human user clicks through this page without thinking. A typical agent reads this tree, checks the screenshot to confirm what the divs look like, makes its best guess, and clicks something. It frequently ends up adding the wrong variant, or cannot trigger "Add to cart" at all because the click handler is attached to a parent element and never receives the event.

The page renders fine. The page is "well-designed." The page does not work for the agent.
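The problems listed above can be approximated with a static pass over the HTML. The following is a rough sketch under stated assumptions: the sample markup is hypothetical, the `AgentSurfaceAudit` class and `audit_html` helper are made up for illustration, and the accessible-name logic only looks at `aria-label` and text content. A browser's real name computation also considers `aria-labelledby`, `title`, and nested image alt text.

```python
from html.parser import HTMLParser


class AgentSurfaceAudit(HTMLParser):
    """Flags missing alt attributes, unnamed buttons, and role=button on non-buttons."""

    def __init__(self):
        super().__init__()
        self.findings = []
        self._open = None   # (tag, attrs, is_fake) of the control being read
        self._text = []
        self._depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            # alt="" deliberately marks a decorative image; only a missing
            # attribute leaves the image's purpose ambiguous.
            if "alt" not in attrs:
                self.findings.append(f"img {attrs.get('src', '?')}: missing alt attribute")
            return
        if self._open is not None:
            # Inside a control: count nested same-name tags to find its end tag.
            if tag == self._open[0]:
                self._depth += 1
            return
        is_native = tag == "button"
        is_fake = not is_native and attrs.get("role") == "button"
        if is_native or is_fake:
            self._open = (tag, attrs, is_fake)
            self._text = []
            self._depth = 1

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open is None or tag != self._open[0]:
            return
        self._depth -= 1
        if self._depth:
            return
        open_tag, attrs, is_fake = self._open
        name = (attrs.get("aria-label") or "".join(self._text)).strip()
        if not name:
            self.findings.append(f"{open_tag}: no accessible name")
        if is_fake:
            self.findings.append(f'{open_tag} "{name}": role=button on a non-<button> element')
        self._open = None


def audit_html(html):
    audit = AgentSurfaceAudit()
    audit.feed(html)
    audit.close()
    return audit.findings


# Hypothetical product page fragment modelled on the example above.
SAMPLE_PDP = """
<h1>Laptop X-15</h1>
<img src="laptop-x-15-hero.jpg">
<div class="swatches">
  <button class="swatch burgundy"></button>
  <button class="swatch navy" aria-label="Navy"></button>
</div>
<div class="sizes"><div role="button">S</div></div>
<div role="button"><span>Add to cart</span></div>
"""

for finding in audit_html(SAMPLE_PDP):
    print(finding)
```

A static check like this cannot tell whether a click actually reaches its handler; that still needs a real browser. It does catch the naming and semantics problems before any agent is involved.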

What it sees vs. what it cares about

Once you accept that the agent's working surface is the a11y tree + screenshot + structured data, the next question is: which parts of that surface does the agent actually use to make decisions?

In our scanner work, we've identified five questions an agent has to answer in order on a product-find-and-add task. Each one consumes a specific input:

| The agent asks… | It uses… | What breaks it |
|---|---|---|
| Am I allowed on this site at all? | robots.txt, WAF response headers, Cloudflare/HUMAN bot scores | Aggressive bot protection that returns 403 or a CAPTCHA to a headless browser |
| Can I find the catalog? | sitemap.xml, top-nav semantics, internal links with descriptive anchor text | Mega-menus rendered client-side with no anchor text, JavaScript-only navigation |
| Can I find a specific product? | URL structure, schema.org Product, on-page heading hierarchy | URLs that look like /p/?id=4127 with no semantic slug, missing structured data |
| Can I parse the product page? | accessible names on buttons, structured data prices, semantic variant selectors | Roleless <div> add-to-cart, color swatches with no labels, prices in CSS pseudo-elements |
| Can I add to cart? | The "Add to cart" button being reachable, cart state visible after click | Cart state stored only in localStorage, "Add to cart" needing hover before it's clickable |

The pattern: structural properties of the DOM correlate strongly with agent task success. We can predict, without running an actual agent, which sites an agent will succeed on by looking at the structural signals above.
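The first question in the table, whether the agent is allowed on the site at all, is the easiest to check offline. Here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content and the shop URL are made up, and the user-agent tokens in `AGENT_TOKENS` are assumptions: check each vendor's published list for the tokens its crawlers and agents actually send. This covers robots.txt only; WAF rules and bot scoring have to be tested against the live site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one AI crawler outright
# and keeps everyone out of the checkout.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /checkout/
"""

# Assumed tokens for illustration; verify against vendor documentation.
AGENT_TOKENS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def allowed_agents(robots_txt, url, tokens=AGENT_TOKENS):
    """Map each user-agent token to whether robots.txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {token: parser.can_fetch(token, url) for token in tokens}


print(allowed_agents(SAMPLE_ROBOTS_TXT, "https://shop.example/products/laptop-x-15"))
```

A blanket `Disallow: /` for one token, as in the sample, fails the first question in the table before any of the DOM work matters.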

This is why we built a deterministic scanner — no LLM in the loop — that checks these properties and produces a score. Running a real agent on every URL would cost dollars per scan. Reading the DOM costs fractions of a cent.
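As an illustration of what "deterministic, no LLM in the loop" can mean in practice, here is a toy scoring sketch. The check names, the equal weights, and the `findability_score` function are all hypothetical and are not the Serge scanner's actual checks or formula. The sketch only shows that a set of pass/fail structural checks can be folded into a 0 to 100 score without running an agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    weight: int    # hypothetical weight, chosen for illustration only
    passed: bool


def findability_score(checks):
    """Weighted share of passed checks, scaled to 0-100."""
    total = sum(check.weight for check in checks)
    if total == 0:
        return 0
    earned = sum(check.weight for check in checks if check.passed)
    return round(100 * earned / total)


# One illustrative check per question in the table above.
checks = [
    Check("robots.txt allows agents", 20, True),
    Check("catalog reachable via plain links", 20, True),
    Check("schema.org Product present", 20, False),
    Check("variant selectors have accessible names", 20, False),
    Check("add-to-cart is a real <button>", 20, False),
]

print(findability_score(checks))
```

Because every input is a structural fact about the page, the same page always produces the same score, which is what makes this kind of check cheap to run at scale.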

The contrarian section: skip "AI SEO" for now

A lot of the writing on this topic right now is about getting your site listed in ChatGPT's answer, getting cited by Perplexity, getting picked up in Google's AI Overviews. That category has a name — Generative Engine Optimization, GEO — and there are a dozen tools selling against it (Athena, Profound, Scrunch, Peec, Otterly, Semrush AI Visibility, etc.).

Those tools are real and the problem is real. But it's not the problem we're describing in this post.

Visibility in an AI answer is upstream of the agent ever arriving on your site. Once the agent arrives, GEO doesn't help. The agent isn't reading ChatGPT's index of your site — it's reading the live DOM.

While we're being precise, here's the layer distinction that most "AI readiness" pitches blur. When a shopping agent is on your product page deciding whether to click "Add to cart," it does not refetch /llms.txt — it reads the accessibility tree of that page. Anthropic and Perplexity have both publicly stated they consult llms.txt during their upstream retrieval and indexing layer — the layer that decides which URL to send the agent to in the first place. That's real, and that's the GEO layer working. But it's a different layer than the one we measure. Publish an llms.txt for the upstream visibility play if you want; just don't ship it as your first task-completion fix, because the agent that's already on your page making click decisions isn't reading it.

This is the single most-confused point in AI-readiness marketing right now. A vendor pitching "fix your llms.txt" as the headline AI-readiness step is conflating two layers — the indexing layer (where llms.txt matters) and the task-completion layer (where it doesn't). Knowing which layer your buying decision is solving is the whole game.

The thing that determines whether the agent can buy from you, once it's arrived, is your DOM. Not your AI SEO posture. Not your structured data alone (necessary but not sufficient). Not your prompt engineering. The DOM.

This isn't a critique of the GEO crowd. They're solving a real problem on the visibility layer. It's a clarification: the user → agent → eshop journey has three legs, and if you only fix the upstream leg, the agent still gives up on the middle leg and the sale still goes to a competitor.

What to look at first, if you're going to look at one thing

If you've read this far and want one practical takeaway, here it is.

Open your top three product detail pages in Chrome and open DevTools. In the Elements panel, select your "Add to Cart" button. Then open the Accessibility pane, which sits next to Styles and Computed. If it is hidden, press Cmd+Shift+P on macOS or Ctrl+Shift+P on Windows and Linux, then run "Show Accessibility".

You should see, for that button:

Role: button (not generic, not text, not nothing)
Name: "Add to cart" or similar (an accessible name the agent can read)
Keyboard: the button should be reachable by Tab and activatable by Enter or Space

If any of those three is wrong on your most important CTA, you have a finding to ship to your engineering team before lunch. A <div onClick> that looks like a button to a human is not a button to an agent.
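The three checks can also be written down as a small function, which is handy if you want to run them over many templates. This is a static approximation with assumptions spelled out: `audit_cta` is a hypothetical helper that takes a tag name, an attribute dict, and the element's text. The accessible name is reduced to `aria-label` or text content. The keyboard check covers focusability only; whether Enter or Space activates a non-native element depends on its script and has to be verified in the browser.

```python
def _tabindex(attrs):
    """Return the tabindex as an int, or None if absent or not a number."""
    try:
        return int(attrs.get("tabindex"))
    except (TypeError, ValueError):
        return None


def audit_cta(tag, attrs, text=""):
    """Approximate the role / name / keyboard checks for a single element."""
    is_native = tag == "button"
    has_button_role = is_native or attrs.get("role") == "button"
    has_name = bool((attrs.get("aria-label") or text).strip())

    tabindex = _tabindex(attrs)
    if is_native:
        # Native buttons are in the tab order unless disabled or tabindex < 0.
        focusable = "disabled" not in attrs and (tabindex is None or tabindex >= 0)
    else:
        # Other elements are only reachable by Tab with an explicit tabindex >= 0.
        focusable = tabindex is not None and tabindex >= 0

    return {"role": has_button_role, "name": has_name, "focusable": focusable}


# A clickable div, a real button, and a div patched with ARIA.
print(audit_cta("div", {"onclick": "addToCart()"}, "Add to cart"))
print(audit_cta("button", {"type": "submit"}, "Add to cart"))
print(audit_cta("div", {"role": "button", "tabindex": "0"}, "Add to cart"))
```

The third case passes all three static checks but still relies on hand-written key handling, which is why a real <button> remains the safer fix.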

This is the easiest 5-minute audit you can run, and it's the one we'd suggest before any of the larger structural work.

What we built so you don't have to do this by hand

The Serge scanner runs this audit (and ~40 others) across every page it can reach on your domain in about 30 seconds. It produces:

A score from 0 to 100 for how easy it is for an agent to find a product and add it to a cart on your site
An example product it found (proof that the crawler actually traversed your catalog)
A list of specific findings — each one with a one-line fix — sorted by how much they affect agent task completion

The scanner is free and every result is a public URL you can forward — you sign in with Google in one click to run one. We built it because the conversation with the Swiss Product Director happened too many times and the "let me show you" demo needed to be self-serve.

Paste your domain at serge.ai. It'll show you what an agent sees on your store in 30 seconds.

Frequently asked questions

What do AI shopping agents see on a website?

AI shopping agents like Claude, ChatGPT Operator, and Perplexity read four things on every page: the accessibility tree (built from semantic HTML, ARIA roles, accessible names, and form labels), a screenshot of the rendered viewport, structured data in <script type="application/ld+json"> blocks, and the URL plus any og: meta tags. They do not read your CSS, your fonts, your hover states, or your design language unless those affect the accessibility tree or screenshot.

Do AI shopping agents read llms.txt?

It depends on the layer. Anthropic and Perplexity have publicly confirmed they consult /llms.txt during their upstream retrieval/indexing layer — the layer that decides which URL to send an agent to. But once the agent is on your product page deciding whether to click "Add to cart," it does not refetch /llms.txt — it reads the accessibility tree of that page. So llms.txt is useful for the upstream visibility play (GEO), but it's not load-bearing for the task-completion layer that decides whether the agent actually buys. Publishing one is harmless; pitching it as your first AI-readiness fix conflates two distinct layers.

Why do AI agents fail on well-designed e-commerce sites?

Visual design quality does not predict agent task success. Agents read the accessibility tree, not the painted pixels. A beautifully designed site that uses <div onClick> for add-to-cart, color swatches without accessible names, and roleless custom variant selectors will score poorly for agent product findability — while a plain, semantic HTML site with real <button> elements and schema.org Product data will score well.

What is the most common AI-readiness failure on e-commerce sites?

The most common failure we observe is a non-semantic add-to-cart button — typically a <div> with an onClick handler instead of a real <button> element. Agents reading the accessibility tree see only text where they need to see an interactive control, so they cannot click it. Variant selectors without accessible names (color swatches rendered as unlabeled divs) are the second-most-common failure.

How is "agent readiness" different from "AI SEO" or GEO?

GEO (Generative Engine Optimization) measures whether your site appears as a citation or recommendation when a user asks an LLM a question. Agent readiness measures whether an agent can complete a task on your site once it has arrived. They're two different layers of the same problem. GEO is upstream (does the agent know about you?); agent readiness is the middle leg (once the agent arrives, can it find your products and add them to a cart?). Tools like Athena, Profound, and Perplexity-listing-trackers cover the upstream layer. We cover the middle one.

About the author. Yann Borie is the founder of Serge (Superstellar LLC, Zug, Switzerland). He spends his time instrumenting AI shopping agents on real e-commerce sites and turning what they fail at into shippable findings.

Up next in this series: "Will AI agents hurt my e-commerce sales? Here's what we found" and "Simple steps to make your online store AI-ready". Want to see how your store scores today? Run a free scan.
