Here's a scenario happening right now in your sector: one of your prospects opens ChatGPT and types "which provider do you recommend in [your field] in Belgium?" ChatGPT responds with 3 names. Yours isn't one of them — but two of your competitors' are.
This prospect now has a mental shortlist you didn't influence. They may still discover you later — but you're starting with a structural disadvantage in their evaluation.
According to Seer Interactive data (2026), brands cited in AI responses get approximately 120% more organic clicks per impression compared to uncited competitors on the same query. AI visibility generates trust before the first contact.
This guide gives you the complete method to measure where you stand against your competitors across the 5 main AI engines.
Why AI benchmarking is different from SEO benchmarking
In SEO, benchmarking a competitor is relatively straightforward: you look at their positions on your target keywords, their domain authority, their estimated traffic volume. The data is stable and verifiable.
AI benchmarking is more complex for three reasons:
1. Responses vary. LLMs don't give the same answer every time: the phrasing of the question, the time of the query, and the conversation context all influence the results. A competitor may appear in 60% of responses on one query and 20% on another.
2. Each platform is a distinct market. A competitor can dominate Perplexity and be invisible on Claude. According to an analysis of 680 million citations, only 11% of domains are cited by both ChatGPT and Perplexity. Benchmarking on a single platform gives a partial view.
3. It's not just frequency — it's mention quality. Being cited first with a detailed description doesn't have the same value as being mentioned in passing at the end of a response. The "how" matters as much as the "how often".
Step 1 — Define your query set
The foundation of the benchmark is a set of 20 to 30 prompts representing exactly the questions your prospects ask AI engines. This set must cover three categories:
Sector discovery queries
These are queries where a prospect is looking for a provider without knowing you:
"Which [your type of service] do you recommend for a B2B SME in [country]?"
"What are the best providers in [your field]?"
"I'm looking for an expert in [your speciality] — any suggestions?"
Comparison queries
These are queries where a prospect is evaluating their options:
"Compare [your company] with [competitor 1] and [competitor 2]"
"What's the difference between [your service] and [competitor service]?"
"Which players are recognised for [your field] in [country]?"
Verification queries
These are queries where a prospect is checking information about you or your competitors:
"What is [competitor name]? What are their services?"
"Is [competitor name] a reference in [field]?"
Step 2 — Build the benchmark grid
For each query and each AI engine, create a grid with these columns:
| Dimension | What we measure |
|---|---|
| Presence | Is your brand cited? (yes/no) |
| Position | First cited, middle of list, last cited |
| Description | Short mention or detailed description with services/advantages |
| Link | Is your site linked? |
| Sentiment | Positive, neutral, with reservations |
| Competitors present | Which other players are cited in the same response |
Fill this grid for yourself AND for your 3 to 5 main competitors, on each AI engine tested.
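If you keep the grid in a script rather than a spreadsheet, one row per prompt, engine, and brand could look like the sketch below. The `GridRow` class, its field names, and the allowed values are assumptions chosen to mirror the dimensions of the grid, not a fixed standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Allowed values are assumptions that mirror the grid dimensions.
POSITIONS = {"first", "middle", "last"}
DESCRIPTIONS = {"short", "detailed"}
SENTIMENTS = {"positive", "neutral", "reservations"}


@dataclass
class GridRow:
    """One observation: one brand, in one response, on one engine."""

    prompt: str
    engine: str
    brand: str
    present: bool
    position: Optional[str] = None      # None when the brand is absent
    description: Optional[str] = None
    linked: bool = False
    sentiment: Optional[str] = None
    competitors_present: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject rows that contradict themselves or use unknown labels.
        if not self.present and self.position is not None:
            raise ValueError("an absent brand cannot have a position")
        if self.position is not None and self.position not in POSITIONS:
            raise ValueError(f"unknown position: {self.position!r}")
        if self.description is not None and self.description not in DESCRIPTIONS:
            raise ValueError(f"unknown description: {self.description!r}")
        if self.sentiment is not None and self.sentiment not in SENTIMENTS:
            raise ValueError(f"unknown sentiment: {self.sentiment!r}")


row = GridRow(
    prompt="What are the best providers in HR consulting?",
    engine="ChatGPT",
    brand="Firm A",
    present=True,
    position="middle",
    description="short",
    sentiment="neutral",
    competitors_present=["Firm B", "Firm C"],
)
```

Validating each row at creation time catches data-entry slips (such as a position recorded for an absent brand) before they distort the metrics.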
Step 3 — Execute the benchmark reproducibly
Reproducibility is key. For your benchmark to be comparable over time, define a fixed protocol:
Frequency: monthly for regular tracking, quarterly at a minimum.
Conditions: use the standard web interfaces (not the API), either logged out or with a dedicated account, to avoid personalisation. Test each prompt 2 to 3 times to compensate for response variability.
Platforms to cover: ChatGPT (web version, search enabled), Perplexity (Pro mode if available), Gemini (gemini.google.com), Copilot (copilot.microsoft.com), Claude (claude.ai, web search enabled).
Documentation: capture screenshots of each response. AI responses change — without documentation, you can't compare over time.
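Because each prompt is tested 2 to 3 times, the repeated runs need to be combined into one result per prompt and engine. The sketch below shows one possible way to do it; the `appearance_rates` and `is_present` names and the 50% threshold are assumptions for illustration, not part of the protocol itself.

```python
from collections import defaultdict


def appearance_rates(runs):
    """Share of repeated runs in which the brand appeared.

    runs: iterable of (prompt, engine, present) tuples, one per test.
    Returns a dict keyed by (prompt, engine).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for prompt, engine, present in runs:
        totals[(prompt, engine)] += 1
        hits[(prompt, engine)] += int(present)
    return {key: hits[key] / totals[key] for key in totals}


def is_present(rate, threshold=0.5):
    """Assumed rule: count the brand as present if it appears
    in at least `threshold` of the repeated runs."""
    return rate >= threshold


runs = [
    ("best HR consulting providers", "ChatGPT", True),
    ("best HR consulting providers", "ChatGPT", False),
    ("best HR consulting providers", "ChatGPT", True),
    ("best HR consulting providers", "Perplexity", False),
    ("best HR consulting providers", "Perplexity", False),
]
rates = appearance_rates(runs)
```

Whatever rule you choose, keep it fixed between benchmarks so that month-to-month changes reflect the engines and not the method.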
Step 4 — Calculate your benchmark metrics
Once your grid is filled, calculate these 5 metrics:
Presence rate
Number of prompts where you appear / total number of prompts tested, per platform.
Example: "We appear in 4 out of 10 prompts tested on ChatGPT = 40% presence rate"
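As a minimal sketch, the presence rate for one platform can be computed from a mapping of prompts to yes/no results. The `presence_rate` function and the prompt labels are hypothetical; the numbers reproduce the 4 out of 10 example above.

```python
def presence_rate(results):
    """results: dict mapping each tested prompt to True (cited) or False."""
    if not results:
        raise ValueError("no prompts tested")
    return sum(results.values()) / len(results)


# 10 prompts tested on one platform, brand cited in 4 of them.
chatgpt_results = {f"prompt {i}": i <= 4 for i in range(1, 11)}
rate = presence_rate(chatgpt_results)
print(f"{rate:.0%} presence rate")
```

Run the same function once per platform to obtain the per-platform figures.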
AI Share of Voice
Number of times you're cited / total citations on your prompt set, all competitors combined.
Example: "On our 10 ChatGPT prompts, there were 25 citations in total (us plus competitors). We appear 4 times = 16% AI Share of Voice"
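The same calculation can be sketched for all brands at once. The `share_of_voice` function is hypothetical, and the split of the 21 competitor citations between Firm B and Firm C is invented purely so the totals match the example (25 citations, 4 of them yours).

```python
from collections import Counter


def share_of_voice(citations):
    """citations: one brand name per citation observed across the prompt set.

    Returns each brand's share of all citations.
    """
    counts = Counter(citations)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no citations recorded")
    return {brand: n / total for brand, n in counts.items()}


# Illustrative data only: 25 citations in total, 4 of them for "You".
citations = ["You"] * 4 + ["Firm B"] * 12 + ["Firm C"] * 9
shares = share_of_voice(citations)
print(f"{shares['You']:.0%} AI Share of Voice")
```

Unlike the presence rate, this metric depends on how many competitors are cited, so compare it only across benchmarks that use the same prompt set.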
Position score
Assign 3 points for a first citation, 2 for a middle citation, 1 for a last citation, 0 for absence. Calculate your average score and that of each competitor.
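A short sketch of this scoring rule follows. The `POINTS` table encodes the weights given above, while the function name, the position labels, and the sample data are assumptions for illustration.

```python
# Points per position, as defined above; None stands for absence.
POINTS = {"first": 3, "middle": 2, "last": 1, None: 0}


def average_position_score(positions):
    """positions: one entry per tested prompt ("first", "middle", "last" or None)."""
    if not positions:
        raise ValueError("no prompts tested")
    return sum(POINTS[p] for p in positions) / len(positions)


# Illustrative data: four prompts for each brand.
scores = {
    "You": average_position_score(["middle", "last", None, None]),
    "Firm B": average_position_score(["first", "first", "middle", None]),
}
```

Absent prompts count as 0 rather than being skipped, so a brand that appears rarely but always first does not outscore one that appears consistently.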
Multi-platform coverage
The number of the 5 AI engines on which you appear at least once across your sector discovery prompts.
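As an illustration, coverage can be derived from the discovery-prompt observations with a set. The `platform_coverage` function and the sample observations are hypothetical; the engine names are the five covered by the protocol.

```python
ENGINES = ("ChatGPT", "Perplexity", "Gemini", "Copilot", "Claude")


def platform_coverage(observations):
    """observations: (engine, present) pairs from sector discovery prompts.

    Returns the sorted list of engines where the brand appeared at least once.
    """
    covered = {engine for engine, present in observations if present}
    return sorted(covered & set(ENGINES))


observations = [
    ("ChatGPT", True),
    ("ChatGPT", False),
    ("Perplexity", True),
    ("Gemini", False),
    ("Copilot", False),
    ("Claude", False),
]
covered = platform_coverage(observations)
print(f"{len(covered)}/{len(ENGINES)} engines covered: {covered}")
```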
Competitive gap
For each competitor that outranks you, note: on which prompts they appear and you don't, and with what description.
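The "they appear and you don't" list is a set difference. The sketch below assumes you have already collected, per brand, the set of prompts where it was cited; the `competitive_gap` function and the prompt labels are hypothetical.

```python
def competitive_gap(presence, you, competitor):
    """Prompts where the competitor is cited and you are not.

    presence: dict mapping brand name -> set of prompts where it was cited.
    """
    return sorted(presence[competitor] - presence[you])


presence = {
    "You": {"best providers", "expert suggestions"},
    "Firm B": {"best providers", "recommended for SME", "recognised players"},
}
gap = competitive_gap(presence, you="You", competitor="Firm B")
```

Each prompt in the resulting list is a candidate priority query; pair it with the description recorded in the grid to see how the competitor is presented there.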
Step 5 — Analyse the gaps and identify causes
Once metrics are calculated, the analysis is the most valuable part of the benchmark. Questions to ask:
"On which prompts does a competitor appear consistently but not me?" These are your priority queries — where you're losing prospects to a specific competitor.
"How is my competitor described vs how am I?" If a competitor is described as "recognised specialist in X" and you as "provider offering Y", the difference in perceived positioning is visible — and correctable.
"On which platform is my competitor strongest?" Each competitor generally has one or two platforms where they dominate. Identifying which allows you to understand their implicit strategy and the levers they use.
"What sources do LLMs cite when mentioning my competitor?" Sources cited with your competitor reveal their authority infrastructure — press articles, directories, third-party publications. This is your roadmap to catch up.
Concrete example: reading a benchmark
Imagine a Belgian HR consulting firm testing 10 prompts on ChatGPT and Perplexity.
ChatGPT results:
- Firm A (you): present on 3/10 prompts, always in 2nd or 3rd position, short description
- Firm B (competitor): present on 7/10 prompts, often in 1st position, detailed description mentioning their methodology
- Firm C (competitor): present on 5/10 prompts, variable position
What the benchmark reveals: Firm B dominates because ChatGPT clearly associates them with a specific methodology — they've built a strong entity with clear positioning. You appear less often and in less detail, suggesting a less defined or less documented entity in the sources ChatGPT consults.
Targeted catch-up plan:
- Clarify and repeat your distinctive methodology in your content
- Identify in which sources Firm B is mentioned (publications, directories, press)
- Work to appear there within the next 3 months
Available tools to automate the benchmark
The manual benchmark described above is sufficient to start and for teams with limited resources. For more regular and systematic monitoring, specialised tools exist: Otterly.ai, Profound, Peec AI, and Semrush Enterprise AIO allow automatic tracking of citations across multiple AI platforms with alerts and competitive benchmark reports.
These tools are particularly relevant when you have more than 5 competitors to track or want weekly rather than monthly tracking frequency.
Where to start?
Start with a manual benchmark on the 3 prompts most representative of your acquisition: those your prospects actually type to find a provider like you. Test them on ChatGPT and Perplexity for a first overview in under an hour.
Our free scoring tool gives you a structured assessment of your visibility on the 5 AI engines, with a score per engine — a good starting point for your baseline.
For a complete competitive benchmark — with analysis of your 3 main competitors on the 5 AI engines and identification of priority gaps — our AI Diagnostic delivers a structured report within 5 business days.
Sources: Seer Interactive data on AI citation impact on organic traffic (2026), analysis of 680 million AI citations (Profound, 2025-2026), AI search competitor analysis benchmark framework (Stackmatix, 2026), AI visibility measurement guide (Medium, April 2026).