Compare · SimGym vs Serge

SimGym imagines shoppers. Serge runs the real agents.

Shopify's SimGym sends synthetic personas through your store to A/B test theme changes before you launch them. Serge runs a real Claude agent — the kind of software your customers already delegate purchases to — through your live storefront, and shows the exact step where a purchase fails.

Short version: SimGym is Shopify's AI-shopper research preview. Hundreds of invented human personas browse your store in a contained environment so you can compare theme variants before shipping. It explicitly stops before checkout, pricing, and discounts, and never attempts a purchase. Serge sits at the opposite end: one real agent, on your live store, running an actual buying task (find the product, choose the variant, add to cart) and reporting exactly where it fails.

Side by side

What each tool tests

| Dimension | SimGym | Serge |
| --- | --- | --- |
| What it runs | Synthetic human personas (invented shoppers) browsing your store in a contained environment to compare theme variants. | A real Claude agent running a live buying task on your production storefront: the same class of agent software customers delegate purchases to (Claude, ChatGPT, Operator). |
| Buying task | None. SimGym browses to A/B test themes and explicitly excludes checkout, pricing, and discounts. | Every run is a real buying task (find the product, choose the variant, add to cart), reporting the exact step the agent quit. It stops before payment. |
| Output | Hypotheses to A/B test. Shopify frames it as a hypothesis generator for design decisions, not a verdict. | A paste-ready fix per failure: the schema, ARIA, DOM, or robots.txt change to ship so the next run can get through. |
| Footprint on your store | Runs many synthetic personas per simulation: a swarm of invented sessions in your store's environment. | One real agent per journey test against your live store. A single measurement run, not a swarm. |
| Pricing | Free install plus per-simulation charges. A research preview, still evolving. | Free deterministic scan. Pro CHF 159/mo: journey tests, regression alerts, 12-month analytics retention. On Shopify: $29/mo. |

Facts about SimGym reflect its public research-preview positioning and may change — check Shopify's app listing for current scope and rates.

How they fit

Which question are you asking?

These tools answer different questions about different traffic. SimGym predicts how a design change might land with human shoppers before you ship it — a pre-launch theme lab. Useful when you're choosing between layouts.

Serge answers a question SimGym doesn't touch: when a real AI agent shops your live store for a customer, can it get through to the purchase? It runs a real agent — not personas — and stops at the real failure: the variant selector with no accessible name, the drawer cart the agent can't open, the checkout that assumes a human.

Mega menus, drawer carts, and pop-ups, the elements simulated shoppers are reported to go quiet on, aren't test noise for Serge. When a real agent fails there, that's the finding. If you want to know whether a real agent can buy, not whether an imagined shopper prefers your theme, that's the gap Serge fills.
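
To make one of those failures concrete, here is a rough illustration of what "a variant selector with no accessible name" can look like in markup. This is a minimal sketch, not how Serge works internally: the function name `unnamed_selects`, the sample HTML, and the naming heuristic are all assumptions for illustration, and the check is far simpler than the full accessible-name rules browsers apply.

```python
from html.parser import HTMLParser


class UnnamedSelectFinder(HTMLParser):
    """Collects <select> elements that have no obvious accessible name.

    Simplified heuristic (an assumption, not the full accessible-name
    computation): a select counts as named if it has a non-empty
    aria-label, aria-labelledby, or title attribute, if a
    <label for="..."> points at its id, or if it sits inside a <label>.
    """

    def __init__(self):
        super().__init__()
        self.label_targets = set()  # ids referenced by <label for="...">
        self.label_depth = 0        # > 0 while inside a <label> element
        self.candidates = []        # (id, name) of selects with no name so far

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "label":
            self.label_depth += 1
            if attrs.get("for"):
                self.label_targets.add(attrs["for"])
        elif tag == "select":
            named = any(
                (attrs.get(key) or "").strip()
                for key in ("aria-label", "aria-labelledby", "title")
            )
            if not named and self.label_depth == 0:
                self.candidates.append((attrs.get("id"), attrs.get("name")))

    def handle_endtag(self, tag):
        if tag == "label" and self.label_depth > 0:
            self.label_depth -= 1


def unnamed_selects(html):
    """Return the name attribute of each <select> lacking an accessible name."""
    finder = UnnamedSelectFinder()
    finder.feed(html)
    finder.close()
    # Resolve <label for> after parsing, since a label may follow its select.
    return [
        name
        for select_id, name in finder.candidates
        if select_id not in finder.label_targets
    ]


# Hypothetical product-form markup: the first selector has no label at all.
SAMPLE = """
<form>
  <select name="Size"><option>S</option><option>M</option></select>
  <label for="color">Color</label>
  <select id="color" name="Color"><option>Red</option></select>
</form>
"""

print(unnamed_selects(SAMPLE))  # ['Size']
```

In this sample, the "Size" selector is the kind of control an agent may not be able to identify, while the "Color" selector is named by its label. Adding a label or an `aria-label` to the first selector is the sort of ARIA-level fix the comparison table refers to.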

FAQ

SimGym vs Serge

Does SimGym test whether an AI agent can buy from my store?
No. SimGym simulates synthetic human personas to A/B test theme changes before launch, and it explicitly excludes checkout, pricing, and discounts. It's a hypothesis generator for design decisions. Whether a real agent can complete a purchase on your live store is the question Serge answers.
What's the difference between a synthetic shopper and a real agent?
SimGym's shoppers are invented personas running in a contained environment to predict human behaviour. Serge runs a real Claude agent against your production store, the same class of software (alongside ChatGPT and Operator) that your customers delegate purchases to. One predicts a human's reaction; the other measures whether the agent already shopping for your customers can finish the job.
Can I use both?
Yes. Use SimGym before a theme launch to compare design variants; use Serge to confirm a real agent can still find a product and add it to the cart on the live store after you ship. They cover different stages — pre-launch hypothesis vs live-store measurement.
Will running Serge flood my analytics the way a bot swarm does?
No. Serge runs one real agent per journey test against your store, not hundreds of synthetic sessions. Simulation tools that push many personas through your store can leave synthetic traffic in your analytics; a Serge run is a single real session and won't distort your numbers.

Test the real thing

See whether a real agent can buy from your store

Free deterministic scan in 30 seconds, one-click sign-in. See the score your store gets and the structural fixes that move it — then run a real Agent Journey Test to watch an agent try to buy.