AI Citation Tracking in PropTech: How 4 AI Engines Cite the Same Brands Differently — research by Gemma Smith at PropSaaS Growth

We ran 25 buyer-language prompts across ChatGPT, Claude, Perplexity, and Google AI Overviews, tracking which of 15 PropTech brands appeared in each response. The finding: AI search doesn't behave as a single market. Each engine has its own citation mechanics, and they don't agree. The same query produces materially different brand citations on each engine. ChatGPT cites 3-5 tracked brands per prompt. Claude and Google AI Overviews land at 2-3. Perplexity tightens to 1-2. If you're "optimizing for AI search" as a single surface, you're optimizing for nothing.

The full dataset, methodology, and the strategic implications for B2B SaaS AEO follow.

Methodology and gaps

What we did

  • 25 prompts spanning category queries ("best property management software for 500 multifamily units"), comparison queries ("Yardi vs AppFolio"), problem-led queries ("how do property managers handle make-ready turns at scale"), entity probes ("tell me about RealPage"), and long-tail commercial queries ("cheapest PMS with good tenant screening").
  • 4 AI engines: ChatGPT (with web search), Claude (with web_search tool), Perplexity, Google AI Overviews.
  • 15 tracked PropTech brands: Yardi, AppFolio, RealPage, Entrata, Buildium, DoorLoop, TurboTenant, Avail, Hemlane, Stessa, Property Meld, MaintainX, Lessen, Findigs, Azibo.
  • Capture: for each response, we recorded which tracked brands appeared in the answer text (not just as source URLs), the first brand cited, and any in-the-moment surprise.
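The capture rule above (brand named in the answer text, plus the first brand cited) can be approximated with a simple string match. The sketch below is one possible version, not the tooling used in the study; the function names are hypothetical, and it assumes that exact, case-sensitive brand spellings are good enough for a first pass.

```python
import re

# The 15 tracked brands listed above.
TRACKED_BRANDS = [
    "Yardi", "AppFolio", "RealPage", "Entrata", "Buildium", "DoorLoop",
    "TurboTenant", "Avail", "Hemlane", "Stessa", "Property Meld",
    "MaintainX", "Lessen", "Findigs", "Azibo",
]

def find_cited_brands(answer_text, brands=TRACKED_BRANDS):
    """Return tracked brands named in the answer text, in order of first mention."""
    hits = []
    for brand in brands:
        # Word boundaries stop "Avail" from matching inside "Available".
        # Matching is case-sensitive so the verb "lessen" is not counted as Lessen.
        match = re.search(r"\b" + re.escape(brand) + r"\b", answer_text)
        if match:
            hits.append((match.start(), brand))
    return [brand for _, brand in sorted(hits)]

def first_cited(answer_text, brands=TRACKED_BRANDS):
    """Return the first tracked brand mentioned, or None if none appear."""
    cited = find_cited_brands(answer_text, brands)
    return cited[0] if cited else None

answer = "Available now: Buildium and Property Meld, with Yardi for enterprise."
print(find_cited_brands(answer))
print(first_cited(answer))
```

Brand names that double as ordinary words (Avail, Lessen) can still produce false positives when they start a sentence, so a manual read of flagged responses remains worthwhile.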

Model versions, for the record. AI engines change fast, so here is exactly what ran in May 2026. Claude was claude-sonnet-4-6 via the Anthropic API with the web_search tool. That is the one pinned, fully precise engine in the study.

ChatGPT and Perplexity were run in their consumer interfaces on whatever each shipped as the default model that month. We did not pin a specific model version, so treat those as "the May 2026 consumer default" rather than a fixed iteration. Google does not disclose which Gemini version powers AI Overviews. If you are reading this more than a few months out, assume the engines have moved and treat the specific citation numbers as a dated snapshot, not a standing benchmark.

Cost-bound choices we made up front

This methodology was deliberately built around what a practitioner can do without paid subscriptions. If you're considering buying or building an AEO measurement stack, the constraints we hit are worth knowing.

  • Claude — automated via the Anthropic API. We already had an API key (used elsewhere in our stack), so the Claude data collection was scriptable. ~$0.50 in API spend for 26 prompts. 4 of the originally planned 30 prompts hit Sonnet 4.6's input-token rate limit when the web_search tool inflated context past the cap, so the final Claude dataset covers 26 prompts.
  • ChatGPT — captured manually in the desktop UI. The OpenAI API would have automated this, but we deliberately chose not to spend on OpenAI credits for this study. Browser automation against chatgpt.com is blocked, so all 25 ChatGPT prompts were run by hand in the consumer interface.
  • Perplexity — captured manually, capped at 10 prompts. Perplexity Pro ($20/month) would have raised the search limit, but we deliberately chose not to upgrade. The free tier capped us at 10 searches per session, so the Perplexity dataset is the smallest in the study.
  • Google AI Overviews — captured manually from the SERP. No public API exists at any price tier. SerpAPI and equivalent commercial scrapers would have automated the capture (with a per-query cost), but we ran it by hand to keep this study runnable on a $0 budget for anyone reproducing the methodology.

The cost-bound shape of this methodology is itself a finding. Running this study without commercial subscriptions is realistic for any practitioner, but it forces methodology choices that the commercial AEO tracking tools paper over. Most of the dedicated tools sit on top of the same constrained engine surfaces, just with the API costs and rate limits absorbed into the subscription fee. Worth knowing before you decide what to pay for.

The headline finding: per-engine variance

We expected some variance. We didn't expect this much.

Brand           Claude (26 prompts)   Google AI Overviews (25)   ChatGPT (25)   Perplexity (10)
Buildium        17                    13                         15             4
AppFolio        11                    10                         14             4
DoorLoop        6                     7                          12             1
Yardi           8                     3                          5              3
TurboTenant     7                     5                          8              2
Avail           6                     5                          5              1
Property Meld   4                     5                          6              2
RealPage        4                     2                          5              2
Stessa          3                     2                          5              1
Entrata         4                     3                          2              2
Azibo           2                     1                          4              1
Lessen          1                     2                          1              1
Hemlane         0                     1                          1              0
MaintainX       1                     0                          0              0
Findigs         0                     0                          0              0

Three patterns in the table that matter for AEO strategy.

ChatGPT cites the most brands per prompt. Average ~3.5 tracked brands per response, vs ~2.5 for Claude and Google AI Overviews, ~2 for Perplexity. Easiest engine to appear on. Noisiest engine to be the dominant answer.
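For readers who want to rederive the per-prompt averages, the sketch below divides each engine's column total in the citation table by the number of prompts run on that engine. It assumes each table cell counts the prompts in which a brand appeared, which is an assumption about how the counts were tallied. The results (about 3.3 for ChatGPT, 2.8 for Claude, 2.4 for Google AI Overviews and Perplexity) are close to, but not identical with, the rounded averages quoted above.

```python
# Citation counts transcribed from the table above, in column order:
# (Claude, Google AI Overviews, ChatGPT, Perplexity)
ENGINES = ["Claude", "Google AI Overviews", "ChatGPT", "Perplexity"]
PROMPTS_RUN = {"Claude": 26, "Google AI Overviews": 25, "ChatGPT": 25, "Perplexity": 10}

CITATIONS = {
    "Buildium": (17, 13, 15, 4),
    "AppFolio": (11, 10, 14, 4),
    "DoorLoop": (6, 7, 12, 1),
    "Yardi": (8, 3, 5, 3),
    "TurboTenant": (7, 5, 8, 2),
    "Avail": (6, 5, 5, 1),
    "Property Meld": (4, 5, 6, 2),
    "RealPage": (4, 2, 5, 2),
    "Stessa": (3, 2, 5, 1),
    "Entrata": (4, 3, 2, 2),
    "Azibo": (2, 1, 4, 1),
    "Lessen": (1, 2, 1, 1),
    "Hemlane": (0, 1, 1, 0),
    "MaintainX": (1, 0, 0, 0),
    "Findigs": (0, 0, 0, 0),
}

def average_brands_per_prompt():
    """Column total divided by prompts run, for each engine."""
    averages = {}
    for i, engine in enumerate(ENGINES):
        total = sum(counts[i] for counts in CITATIONS.values())
        averages[engine] = total / PROMPTS_RUN[engine]
    return averages

for engine, avg in average_brands_per_prompt().items():
    print(f"{engine}: {avg:.2f} tracked brands per prompt")
```

Normalizing by prompts run matters here because the engines were not sampled equally: Perplexity's 10 prompts make its raw counts look smaller than its per-prompt rate.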

Yardi is over-represented on Claude. 8 citations on Claude versus 3 on Google AI Overviews, a 2.7x difference. The likely cause is decades of training-data presence baked into the model weights, which doesn't match how the current web describes the category. If you compete with Yardi, its narrative dominance varies dramatically by engine. Optimizing your AEO based on "where Yardi shows up" will lead you to different priorities on different engines.

Findigs is uncited across all four engines. Zero. Despite being a real tenant-screening category player with paying customers. The same is nearly true for Hemlane, MaintainX, Azibo, Lessen. The bottom of the leaderboard is where the AEO opportunity sits.

Each AI engine is its own citation market. The brands losing on AI search have plenty of content. What's missing is entity clarity.

The duopoly and the DoorLoop surprise

Two patterns dominate the top of the citation table.

Buildium and AppFolio split the top two spots on every engine. Buildium leads on Claude (17), Google AI Overviews (13), and ChatGPT (15). AppFolio sits right behind on all three (11, 10, 14). On Perplexity they tie at 4 of 10 prompts. Combined, these two brands take roughly 37% of all tracked citations in the table (88 of 240).

This isn't surprising on its own. Buildium and AppFolio have spent a decade building the category. What is surprising is the gap.

DoorLoop is the AEO winner punching above its market share. 12 citations on ChatGPT (third place, ahead of Yardi). 7 on Google AI Overviews. 6 on Claude. DoorLoop runs close to AppFolio on ChatGPT (12 vs 14) and Google AI Overviews (7 vs 10), and totals 26 tracked citations across the four engines to AppFolio's 39.

The Ahrefs data shows why. DoorLoop ranks for 12,298 organic keywords vs AppFolio's 2,939, about 4.2x as many. 3,369 of those rank in the top three vs AppFolio's 684, about 4.9x as many high-position rankings. AppFolio's organic traffic is higher in raw volume (113k vs 81k monthly), but DoorLoop's coverage (how many distinct queries its content can answer) is more than four times larger. AI engines have substantially more DoorLoop content to draw from when answering open PropTech queries.

The content strategy compounds the effect. DoorLoop publishes head-to-head comparison content about its competitors on its own blog. "Buildium vs. AppFolio vs. DoorLoop: Which Should You Choose?" sits at the top of Google's results for that comparison query and gets cited by AI engines when buyers ask. DoorLoop owns the comparison search surface for competitors who haven't published their own version.

Plus the off-site entity signal. DoorLoop's G2 profile has 204 reviews at 4.8 average rating. Strong review velocity, strong category signal. AI engines treat G2 as a primary source for B2B SaaS category questions.

Notable: DoorLoop's homepage doesn't ship JSON-LD schema markup. The AEO win is coming from volume, comparison content, and entity strength, not from schema discipline. Schema is on the AEO list. So are content volume, comparison content, and entity authority. DoorLoop's pattern suggests the latter three matter most when schema isn't yet in place.

The strategic implication: AppFolio and Buildium are incumbent defaults, protected by brand awareness. DoorLoop is the proof that a smaller PropTech can compete on AI citation rate with content scale, comparison content, and entity discipline. None of those moves require enterprise budget.

The uncited brands and the AEO opportunity

The most actionable finding in this study is not at the top of the citation count. It's at the bottom.

Five tracked brands sit at or near zero across all four engines:

  • Findigs (tenant screening): zero citations on all four engines. Despite a real business with paying customers in the tenant-screening category.
  • MaintainX: 1 citation on Claude, 0 elsewhere. A category leader in industrial CMMS that hasn't surfaced in PropTech AI search at all.
  • Hemlane: 0-1 citations depending on engine. Strong content presence on Google search, near-invisible on AI.
  • Azibo: 1-4 citations. Some engine pickup but inconsistent. Almost entirely in direct comparison queries (Stessa vs Azibo) rather than open category queries.
  • Lessen: 1-2 citations, almost entirely tied to the Property Meld comparison query.

If you operate one of these businesses, this is your AEO opportunity. The structural moves to get cited are well-documented (direct-answer openings, FAQ blocks with FAQPage schema, named entity binding across sameAs links, off-site mentions in Reddit, G2, podcast transcripts). None of the bottom-five brands appear to be running this playbook at scale. The first one that does will pick up citation share fast, because AI engines need to fill the answer slot and currently default to whichever brand has the most surface-area-of-presence on the web.
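As one concrete piece of that playbook, FAQPage markup is a small JSON-LD object using the schema.org FAQPage, Question, and Answer types. The sketch below generates it from question and answer pairs. The product name and the Q&A text are hypothetical placeholders, and nothing here comes from any tracked brand's site.

```python
import json

def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

# Hypothetical content for an imaginary tenant-screening product.
pairs = [
    (
        "What is ExampleScreen?",
        "ExampleScreen is tenant-screening software for independent landlords.",
    ),
    (
        "How long does an ExampleScreen report take?",
        "Most reports are ready within minutes of the applicant's consent.",
    ),
]

markup = json.dumps(faq_jsonld(pairs), indent=2)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

The answers in the markup should match the visible FAQ text on the page word for word, so the direct-answer opening and the schema reinforce each other.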

The bigger insight is the gap between revenue rank and citation rank. In every other channel a marketer measures (search rankings, paid CPCs, brand survey awareness), market position roughly correlates with visibility. AI citation rate doesn't follow that rule. Functional businesses with real revenue can have near-zero AI visibility. The work to fix it is on-page structural plus off-site entity work, and it compounds within months. Tier-1 share of voice is reachable from near-zero faster than it is in traditional SEO.

Engine-by-engine personalities

The data shows each engine has a distinct citation personality. Worth knowing before you decide where to focus.

ChatGPT (with web search): the generous citer

Cites the most brands per prompt of any engine (~3.5 average). Introduces the most untracked brands (14+ in our dataset including Innago, Baselane, RentSpree, TenantCloud, LandlordStudio, Latchel, HappyCo, REI Hub, AppWork, DocuSign, Rocket Lawyer). Most likely to surface long lists with positioning ladders, give specific pricing claims, and bring in adjacent ecosystem brands when the prompt borders them.

If your goal is "appear at all," ChatGPT is the easiest engine. If your goal is "be the dominant answer," ChatGPT is the noisiest.

Claude (with web_search): the heritage-weighted citer

Cites ~2.5 brands per prompt. Notable Yardi over-weighting (2.7x more than Google AI Overviews). Tendency to lead with the established enterprise brand even on small-landlord prompts. Probably reflects training-data composition more than current-web reality. If your competitor is a heritage brand, expect Claude to over-represent them. If you are the heritage brand, Claude is your strongest engine.

Google AI Overviews: the most balanced citer

Cites ~2.5 brands per prompt with the most even distribution across the leaderboard. The closest engine to "what would actually be useful." Gets the framing right on category prompts (small landlord vs enterprise). Tends to surface negative narratives when they exist (RealPage antitrust is heavily covered here). Tightly tied to the web's current consensus.

Perplexity: the citation-only citer

Tightest brand list per response (~2 brands average on the 10 prompts that completed). Citation-native by design; every answer comes with explicit sources. Lowest tolerance for general-knowledge responses. If it can't cite, it qualifies more carefully. Hardest engine to land on, but the highest-trust answer surface when you do.

The strategic implication: there's no single engine to "optimize for." Each one rewards different content and entity behaviors. The good news is that the structural work overlaps. The work to land on Perplexity (clean citable claims, named-entity binding, off-site mentions in source domains Perplexity trusts) also lands on Google AI Overviews, also lands on Claude, also lands on ChatGPT. Optimize for the structural underwriting and you appear on all four. The variance is in how much.

Comparison queries collapse the variance

A pattern emerged that wasn't in our hypothesis going in.

When a prompt named two brands directly ("Yardi vs AppFolio," "Buildium vs AppFolio," "Property Meld vs Lessen"), all four AI engines gave near-identical answers. Same two brands cited. Same positioning ladder ("AppFolio for mid-market, Yardi for enterprise"). Same comparison rubric. Variance collapsed to almost zero.

When the prompt was open-ended ("best property management software for 500 multifamily units," "what tools do property managers use for maintenance dispatch"), variance exploded. Engines disagreed on which 3-5 brands to name, in what order, with what framing.
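One way to put a number on "variance collapsed" versus "variance exploded" is the Jaccard overlap between the brand sets each engine returns for the same prompt. This is a sketch of a metric a reader could apply, not the measure used in the study, and the two example prompts below are illustrative responses rather than rows from the dataset.

```python
from itertools import combinations

def jaccard(a, b):
    """Overlap between two brand sets: 1.0 means identical, 0.0 means disjoint."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def mean_pairwise_agreement(brands_by_engine):
    """Average Jaccard overlap across every pair of engines for one prompt."""
    pairs = list(combinations(brands_by_engine.values(), 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Illustrative responses only, not taken from the study's data.
comparison_prompt = {
    "ChatGPT": ["Yardi", "AppFolio"],
    "Claude": ["Yardi", "AppFolio"],
    "Google AI Overviews": ["AppFolio", "Yardi"],
    "Perplexity": ["Yardi", "AppFolio"],
}
open_prompt = {
    "ChatGPT": ["Buildium", "AppFolio", "DoorLoop", "Yardi"],
    "Claude": ["Yardi", "AppFolio", "Entrata"],
    "Google AI Overviews": ["Buildium", "AppFolio"],
    "Perplexity": ["Yardi", "RealPage"],
}

print(f"comparison prompt agreement: {mean_pairwise_agreement(comparison_prompt):.2f}")
print(f"open prompt agreement: {mean_pairwise_agreement(open_prompt):.2f}")
```

A score near 1.0 means the engines named the same brands; scores well below 0.5 indicate the kind of disagreement described for open category prompts. The metric ignores citation order, so it would need extending to capture "in what order, with what framing."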

This matters strategically.

If you're trying to win the head-to-head comparison query ("Brand X vs Brand Y"), your job is the same on every engine: be on both sides of comparison content, structure the comparison cleanly, get the schema right. The engines converge here, which means the work converges.

If you're trying to win the category-default question ("best X for Y"), the engines diverge sharply. You're now optimizing for four separate citation surfaces with four different selection mechanisms. The good news is that this is also where the AEO opportunity is biggest. Comparison queries are zero-sum (someone wins, someone loses). Category-default queries are open; you can pick up share without displacing a named competitor.

In practical AEO budgeting, this argues for prioritizing category-default content over head-to-head comparison content. Category content has more total upside and more engine-specific differentiation. Comparison content has lower variance but a smaller cap.

Inside one engagement: the 96-citation save

Working inside one engagement gave us the methodology insight that mattered most.

A few weeks ago we were preparing a 301-redirect plan for a construction-lending PropTech we work with. One URL on the list was a 2019 blog post carrying minimal Ahrefs traffic estimates and almost no backlink equity. Standard SEO logic said: redirect to the consolidated current piece, clean up the site, move on.

Before pulling the trigger, we ran the URL through AirOps's AEO citation tracking. The result: 96 active AI citations on the URL across two high-value lender prompts. Including the highest-intent commercial-lending prompt in the entire tracked set.

We reversed the redirect. Refreshed the page to match current product positioning, anchored a direct-answer block for the declining-citation prompt specifically, kept the URL live. The 96 citations stayed.

The process insight is the part worth taking.

Ahrefs and GSC can show a URL as functionally dead — low traffic, low link equity, redirect candidate — while AI citation tracking reveals it as a load-bearing asset on a separate signal entirely. AI citation rate doesn't map to traditional SEO metrics. A URL that looks redirectable on Ahrefs can be the most-cited URL you have on the prompts that actually convert.

The operational change at this engagement: before any blog redirect, we now run a 30-second citation check. It costs nothing and it's already caught 96 citations on a single URL we would have otherwise lost.
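The pre-redirect check can be scripted once citation data is available as a file. The sketch below cross-references a list of redirect candidates against a citation export and holds back any URL with active citations. The CSV layout (url, prompt, citations), the URLs, and the counts are all hypothetical; this is not the AirOps export format, which is not described in this study.

```python
import csv
import io

def normalize(url):
    """Trim whitespace and a trailing slash so equivalent URLs compare equal."""
    return url.strip().rstrip("/")

def load_cited_urls(citation_csv):
    """Map each URL to its total citation count from a tracker export."""
    counts = {}
    for row in csv.DictReader(citation_csv):
        key = normalize(row["url"])
        counts[key] = counts.get(key, 0) + int(row["citations"])
    return counts

def review_redirects(candidates, cited):
    """Split redirect candidates into hold-for-review and safe-to-redirect."""
    hold, safe = [], []
    for url in candidates:
        if cited.get(normalize(url), 0) > 0:
            hold.append(url)
        else:
            safe.append(url)
    return hold, safe

# Hypothetical export and candidate list, for illustration only.
EXPORT = """url,prompt,citations
https://www.example.com/blog/old-2019-post/,illustrative lender prompt A,3
https://www.example.com/blog/old-2019-post,illustrative lender prompt B,2
https://www.example.com/blog/current-guide,illustrative lender prompt A,1
"""
candidates = [
    "https://www.example.com/blog/old-2019-post",
    "https://www.example.com/blog/retired-announcement",
]

cited = load_cited_urls(io.StringIO(EXPORT))
hold, safe = review_redirects(candidates, cited)
print("Hold for review:", hold)
print("Safe to redirect:", safe)
```

Any URL in the hold list gets the refresh-and-keep treatment described above instead of a 301.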

If you're running content cleanups in 2026 based only on Ahrefs and GSC, you're making redirect decisions on incomplete data. The signal that matters for AI search isn't in either tool yet.

What this means for B2B SaaS AEO strategy

Five strategic implications come out of the data.

1. There is no single AI search market to optimize for

ChatGPT, Claude, Perplexity, and Google AI Overviews each cite differently for the same query. The structural work to land on all four overlaps significantly (direct-answer openings, FAQ schema, entity binding, off-site mentions in trusted source domains). The citation surface itself is four separate markets with four selection mechanisms. Optimizing for "AI search" as a single thing produces a worse strategy than picking one engine to win first.

2. Category-default queries are where AEO leverage lives

Comparison queries collapse engine variance. Every engine gives the same answer for "Yardi vs AppFolio" because the named brands constrain the response. Open category queries diverge wildly. If you're allocating AEO budget, weight category content over comparison content. Bigger total upside, more engine-specific differentiation, less zero-sum competition.

3. Buy AEO tools for measurement, not for visibility

The tracking tools all work, and they all cost money. Neither fact moves your citation rate on its own. A category of tools automates what we did by hand in this study. Otterly, Knowatoa, Peec, Hall, and AirOps all track AI citations across engines, typically $30 to $150+ per month. They genuinely save time; this study took hours of manually running prompts.

AirOps is the one I reach for inside engagements, and it goes a step further than pure tracking: it logs content changes against citation outcomes, so you can see which edits actually moved the needle. But every tool in the category does the same core job, which is measurement.

The work that lifts citation rate is structural: direct-answer openings, FAQ schema, entity binding, off-site mentions. That craft is the same whether you pay for a tracker or run a free spreadsheet. Buy the tool if the manual time is your bottleneck. The subscription measures the problem; it doesn't fix it.

4. The losing brands' real gap is entity clarity

The brands losing on AI search have plenty of content. The under-cited brands at the bottom of the table aren't there because they don't publish. They're there because AI engines don't know what they are, what category they sit in, what they compete with, or who their named experts are.

The fix isn't more content. The fix is the entity stack, and in practice that's three concrete tasks:

  1. Clean Organization, Person, and Product schema on the site.
  2. Point the schema's sameAs attributes at every authoritative external node where the brand already exists: Crunchbase, Wikidata, LinkedIn company and founder profiles, G2, Capterra, official social accounts. This is the part most teams skip. Those sameAs links are how you tell an AI engine "this brand is one real, identifiable entity, and here is corroborating proof across the sources you already trust."
  3. Named author bylines on every blog post, plus off-site mentions in podcast transcripts and Reddit threads.

Most of the under-cited brands could 10x their citation rate in 90 days without writing a single new article, by fixing entity signals on what they already have.
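The first two tasks in that list can be sketched as a single schema.org Organization object whose sameAs array points at the brand's external profiles. The brand name, description, and every URL below are placeholders, and the helper function is hypothetical. Real markup would use the brand's actual profile URLs and add Person and Product objects alongside.

```python
import json

def organization_jsonld(name, url, description, same_as):
    """Build a schema.org Organization object with de-duplicated sameAs links."""
    unique_links = list(dict.fromkeys(same_as))  # keeps order, drops repeats
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        "sameAs": unique_links,
    }

# Placeholder brand and placeholder profile URLs, for illustration only.
org = organization_jsonld(
    name="ExampleProp",
    url="https://www.example.com",
    description="Property management software for small multifamily operators.",
    same_as=[
        "https://www.crunchbase.com/organization/exampleprop",
        "https://www.wikidata.org/wiki/Q00000000",
        "https://www.linkedin.com/company/exampleprop",
        "https://www.g2.com/products/exampleprop",
        "https://www.linkedin.com/company/exampleprop",  # duplicate, removed
    ],
)

markup = json.dumps(org, indent=2)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

Keeping the name and description identical to what appears on the external profiles is part of the point: the sameAs links only corroborate the entity if the profiles describe the same thing.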

5. AI citation rate is a separate signal from every other organic metric

It doesn't correlate cleanly with organic traffic, backlink count, brand awareness, or paid spend. The 96-citation save above is the proof point. Until your team is measuring AI citation rate directly, you're making content and SEO decisions on partial information. The fix is a 30-minute weekly prompt-set check at minimum.

Limitations and open questions

Limitations worth being honest about.

This study covers one category (PropTech), 15 tracked brands, 25 prompts, and 4 AI engines, at one point in time. The findings generalize to B2B SaaS AEO patterns in our experience, but a 30-day rerun of the same methodology would shift specific numbers. The brand citation table is a snapshot, not a benchmark.

The ChatGPT dataset was captured manually because consumer chatgpt.com blocks browser automation. There's a small consistency risk in manual capture; a human researcher might mis-attribute a brand mention more easily than a string-match parser would. We capped the risk by using the same researcher for all ChatGPT runs and the same definition of "cited" (named in the answer text, not in a source URL).

The Perplexity dataset is the smallest at 10 prompts. The free tier capped us before we could complete the run. Perplexity Pro would have lifted the limit, but the methodology decision was to capture what a typical practitioner would see, including the friction.

We tracked 15 specific brands. The data shows clearly that engines name 30+ relevant brands across the same prompts. Innago, Baselane, RentSpree, TenantCloud, LandlordStudio, REI Hub, RentRedi, Latchel, HappyCo, Rocket Lawyer, DocuSign, and others all appeared without being on the tracked list. Brand discovery is iterative. Anyone running this methodology should treat the brand list as the first hypothesis, not the canonical universe.

One data point worth flagging on the tracked list: Azibo wound down operations in late 2024. Full disclosure, since it affects how you should read the Azibo numbers: I led content and SEO at Azibo in 2023-2024 (the full case study is here). We kept the brand in the dataset because AI engines still cite it in PropTech queries based on archived content (and on direct comparison prompts like "Stessa vs Azibo").

It's a small finding in its own right: AI citation memory outlives active brand operations. If your company rebrands, exits, or pivots, expect to keep showing up in AI answers for months or years afterward, citing content you no longer control. Worth knowing for any operator considering closure, acquisition, or category pivot.

The biggest open question we can't yet answer: how much do these citation patterns shift week over week? AI engines update their training data, retrieval mechanisms, and source weighting on opaque cadences. The Property Meld and Lessen partnership finding (two engines surfaced the news independently) suggests the engines are at least somewhat current, but we don't have longitudinal data to say how fast a brand's citation rate can swing. We'd run this study again in 90 days to find out.

Frequently asked questions

How do you measure AI citation rate for your brand?

Pick 20-30 prompts phrased the way your buyers actually ask AI assistants about your category. Run each through ChatGPT, Claude, Perplexity, and Google AI Overviews. For each response, record whether your brand appears in the answer text, what position it appears in, and which competitors are cited alongside. The simplest version is a Google Sheet and 30 minutes per week. Dedicated AEO tools (AirOps, Otterly, Peec, Knowatoa) automate this at scale, typically for $30 to $150+ per month. Both produce the same signal.
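If the sheet is exported as CSV, the weekly numbers can be summarized with a short script. The sketch below assumes a hypothetical layout with one row per response and a brands_cited column listing brands in order of appearance, separated by semicolons. The column names, the brand "ExampleProp", and the rows are illustrative, not a standard format or real results.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sheet export; one row per engine response.
SHEET_CSV = """date,engine,prompt,brands_cited
2026-05-04,ChatGPT,best PMS for small landlords,Buildium;ExampleProp;AppFolio
2026-05-04,Claude,best PMS for small landlords,Yardi;AppFolio
2026-05-04,Perplexity,best PMS for small landlords,Buildium
2026-05-04,ChatGPT,cheapest PMS with tenant screening,ExampleProp;TurboTenant
"""

def parse_brands(cell):
    """Split a semicolon-separated cell into a clean list of brand names."""
    return [b.strip() for b in cell.split(";") if b.strip()]

def citation_summary(rows, brand):
    """Per engine: responses logged, responses citing the brand, and positions."""
    summary = defaultdict(lambda: {"responses": 0, "cited": 0, "positions": []})
    for row in rows:
        stats = summary[row["engine"]]
        stats["responses"] += 1
        brands = parse_brands(row["brands_cited"])
        if brand in brands:
            stats["cited"] += 1
            stats["positions"].append(brands.index(brand) + 1)  # 1 = named first
    return dict(summary)

rows = list(csv.DictReader(io.StringIO(SHEET_CSV)))
for engine, stats in citation_summary(rows, "ExampleProp").items():
    rate = stats["cited"] / stats["responses"]
    print(f"{engine}: cited in {stats['cited']}/{stats['responses']} "
          f"({rate:.0%}), positions {stats['positions']}")
```

Because brands_cited also records every competitor named, the same rows can later answer "who appears alongside us" without re-running any prompts.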

Which AI engine should B2B SaaS optimize for first?

Pick the engine with the highest fit between its citation behavior and your strategic goal. If you want to appear across a wide net of buyer queries, ChatGPT is the easiest first surface (it's the most generous citer). If you want to displace an established competitor on entity-level questions, Claude is where the gap is biggest (it over-weights heritage brands in training data). If you want share of voice on neutral category queries, Google AI Overviews has the most balanced distribution. Structural work overlaps across engines, so winning on one usually lifts the others.

Is AI search killing organic search?

No. AI search is changing what organic search means. In a 13-month B2B agency analysis published in Search Engine Land, AI-referred visitors converted at roughly 18%, higher than any other measured traffic source. The buyer journey now includes AI assistants alongside Google search, but click-through behavior and conversion mechanics downstream are mostly the same. The structural SEO work that earned rankings five years ago still earns AI citations today, plus a few new requirements (direct-answer openings, FAQ schema, named entity binding, off-site mentions).

Can small brands compete with established players on AI citation rate?

Yes, and faster than they can compete on traditional SEO. This study shows DoorLoop, a smaller PropTech brand, landing AI citations at roughly the same rate as AppFolio across multiple engines. AI engines weight current content signals (direct-answer structure, FAQ schema, entity clarity, off-site mentions) more heavily than they weight raw market position. A category challenger willing to do the structural work can reach tier-1 share of voice in 6-12 months, significantly faster than the same brand could displace an incumbent in regular search.

What's the cheapest way to start tracking AI citations for my brand?

A free Google Sheet, 20-30 buyer-language prompts, 30 minutes a week. Run each prompt manually through ChatGPT (with web search enabled), Claude, Perplexity, and Google (look for the AI Overview at the top of results). Note whether your brand name appears in the answer text and what competitors are cited alongside. That's the methodology used in this study. You can run citation tracking with $0 spend before deciding whether to buy a tool.

Gemma Smith, Founder, PropSaaS Growth

SEO, AEO, and content strategy for PropTech, FinTech, and B2B SaaS companies. 10+ years in PropTech. Active engagements with vertical SaaS platforms.