
Public benchmark · launching June 2026

How is AI seeing your product?

AgentRank is the public, neutral benchmark for how AI coding agents — Claude Code, Cursor, Codex, ChatGPT — discover, choose, and correctly invoke your APIs, SaaS products, and MCP servers.

~/agentrank — stripe-payments — live · 4 agents
$ agentrank scan stripe-payments --tasks=50 --agents=4
streaming · plan-mode · v0.4
• claude-code · scanning…
• codex · scanning…
• cursor · scanning…
• windsurf · scanning…
    Stripe and Razorpay are trademarks of their respective owners. Not affiliated.

23 companies joined the waitlist this week

    The thesis

    The buyer of your software has changed.

    In 2026, AI coding agents make product-selection decisions on behalf of developers. When someone tells Claude “add payments,” the agent decides which provider, which endpoint, which version. The human never sees the alternatives.

    If your API gets picked, you don’t know why. If it gets passed over, you don’t know that either. The funnel that mattered for fifteen years — docs traffic, signups, dev evangelism — is being rerouted through an opaque selection layer your dashboards can’t see.

AgentRank is the eyes on that selection layer: the same questions for every product, scored 0–100, published publicly, and refreshed every time the agents update.

    See it work.

    Demo preview · representative scores from May 2026 single-agent runs. Full multi-agent leaderboards go live at launch.

agentrank.jazel.co/score/stripe

Stripe · Payments · tested May 2026
Overall: — / 100

AI visibility:      — (vs Razorpay 52)
Agent usage:        — (vs Razorpay 38)
Docs quality:       — (vs Razorpay 44)
Trust & authority:  — (vs Razorpay 30)

Stripe and Razorpay scores from May 14, 2026 — Claude Sonnet 4.6 single-agent run. Sub-scores and the category-leaderboard preview are illustrative until full multi-agent results publish at launch.

    Built for builders who ship to a new kind of user.

    API teams

    Know which agents pick you over your competitors — and which competitors they reach for when they don’t.

    Developer-experience leads

    See exactly where your docs confuse models — endpoint by endpoint, parameter by parameter.

    MCP server authors

    Get scored before you launch. Ship with a public AgentRank badge that says you tested.

    How AgentRank works.

1. Test against four frontier agents
   Claude Code, Cursor, Codex, ChatGPT — same SDK, same harness, same sandbox.

2. Run 50 standardized tasks per category
   Payments, auth, email, storage, search, database, CRM, analytics, comms, AI tooling.

3. Score across four dimensions
   Visibility, usage, docs quality, trust. Per-agent breakdown plus a single weighted total; a sketch of the roll-up follows this list.

4. Public score, private dashboard
   Free score for anyone. Paid plan unlocks per-task traces, weekly re-runs, alerts, and CI checks.
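
Concretely, step 3's roll-up can be pictured in a few lines of Python. This is a minimal sketch under stated assumptions: the dimension weights, the plain mean across agents, and every score value below are illustrative, not AgentRank's published methodology.

    # Illustrative roll-up of the four AgentRank dimensions into one 0-100
    # total. The weights and every score here are assumptions for the
    # sketch, not AgentRank's published methodology.

    # Hypothetical per-agent dimension scores (0-100 each).
    scores = {
        "claude-code": {"visibility": 71, "usage": 64, "docs": 58, "trust": 66},
        "codex":       {"visibility": 65, "usage": 59, "docs": 61, "trust": 62},
        "cursor":      {"visibility": 68, "usage": 55, "docs": 60, "trust": 64},
        "chatgpt":     {"visibility": 63, "usage": 52, "docs": 57, "trust": 60},
    }

    # Assumed dimension weights; they must sum to 1.0.
    WEIGHTS = {"visibility": 0.3, "usage": 0.3, "docs": 0.2, "trust": 0.2}

    def agent_total(dims):
        """Weighted 0-100 total for one agent."""
        return sum(WEIGHTS[d] * v for d, v in dims.items())

    # Per-agent breakdown plus the single weighted total (mean across agents).
    per_agent = {name: round(agent_total(d), 1) for name, d in scores.items()}
    overall = round(sum(per_agent.values()) / len(per_agent), 1)

    print(per_agent)  # four per-agent totals
    print(overall)    # one headline score out of 100

Changing the weights or the cross-agent aggregation changes the headline number, which is one reason the per-agent breakdown is published alongside the single total.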

    The pattern is here

    Backed by what we’re already watching.

10,000+ public MCP servers tracked (Anthropic, 2026)

$930M: AlphaSense's acquisition of Tegus, the AI-in-research thesis (TechCrunch, 2025)

$45M: Mintlify Series B for AI-native docs (a16z, 2026)

    Be early

    Be first to know your AgentRank.

    Beta partners get free Pro tier for 12 months at launch. We send one update a week. No spam, ever.

23 companies joined the waitlist this week

    What we’re building.

Phase 1 · June 2026

    Public scoring

    Payments, auth, email — first ten products in each, full per-agent breakdown.

Phase 2 · Q3 2026

    CI integration

GitHub Action that fails a build when your AgentRank score regresses on a release; a sketch of the check follows.
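
The gate reduces to a small check: fetch the latest score, compare it to a baseline pinned in the repo, and exit nonzero on regression so the workflow fails. A hedged Python sketch; the score endpoint, response shape, and environment-variable names here are assumptions, since the real Action is not yet published.

    # Minimal regression gate for CI: fetch the latest AgentRank score and
    # exit nonzero if it dropped below a pinned baseline, failing the build.
    # The endpoint URL, JSON shape, and env-var names are hypothetical.
    import json
    import os
    import sys
    import urllib.request

    PRODUCT = os.environ.get("AGENTRANK_PRODUCT", "stripe")       # product slug
    BASELINE = float(os.environ.get("AGENTRANK_BASELINE", "60"))  # pinned score

    url = f"https://agentrank.jazel.co/api/score/{PRODUCT}"       # assumed API
    with urllib.request.urlopen(url) as resp:
        score = float(json.load(resp)["total"])

    print(f"AgentRank for {PRODUCT}: {score} (baseline {BASELINE})")
    if score < BASELINE:
        sys.exit(1)  # nonzero exit fails the GitHub Actions step

A failing exit code is all GitHub Actions needs to mark the step, and therefore the build, as failed.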

Phase 3 · Q4 2026

    Enterprise

    Custom corpora, private products, SSO, dedicated support, on-prem option.