
What is Speakable Schema?

Speakable schema is structured data that marks specific sections of a webpage as best suited to text-to-speech delivery by voice assistants and AI reading tools. It is implemented as a speakable property in a JSON-LD block, with CSS selectors or XPath expressions pointing to the relevant sections. It tells Google Assistant (and potentially other AI systems) which parts of your page are clear, concise answers suitable for reading aloud, which makes it a direct signal for AEO and voice search optimisation.

Fact-checked against 3 sources · Last updated 8 June 2026
Key Takeaways
  • Speakable schema is currently limited to news articles in Google's official implementation, but it may still signal voice-readiness to other AI systems.
  • Mark only genuinely speakable content — concise, standalone sentences that make sense read aloud without surrounding context.
  • Speakable sections should be 20–90 words — long enough to be informative, short enough for audio consumption.
  • This is an early-stage schema type — implementing it now positions content for future AI audio interfaces.
  • The content you mark as speakable should overlap with what you'd optimise for featured snippets — the formats align.

How Speakable Schema Works

Speakable schema uses the speakable property, whose value is a SpeakableSpecification object, in a JSON-LD block. You specify which page sections should be read aloud using CSS selectors (pointing to element IDs or class names) or XPath expressions.

When Google Assistant encounters a relevant query, it can retrieve the page and read aloud the marked speakable sections — providing an audio answer attributed to your site.

The markup structure: { "@context": "https://schema.org", "@type": "WebPage", "speakable": { "@type": "SpeakableSpecification", "cssSelector": [".article-headline", ".article-summary"] } }
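
As a minimal sketch of how the markup above could be generated from a template, the Python below builds the same JSON-LD and wraps it in a script tag. The function names build_speakable_jsonld and render_script_tag are hypothetical helpers introduced for illustration, not part of any library, and the selectors are the example values from this article.

```python
import json


def build_speakable_jsonld(css_selectors, page_type="WebPage"):
    """Return a JSON-LD dict that marks the given CSS selectors as speakable."""
    return {
        "@context": "https://schema.org",
        "@type": page_type,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": list(css_selectors),
        },
    }


def render_script_tag(jsonld):
    """Serialise the dict into a script tag that can be placed in the page."""
    body = json.dumps(jsonld, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Example selectors taken from this article; replace with your own template's.
markup = build_speakable_jsonld([".article-headline", ".article-summary"])
print(render_script_tag(markup))
```

Generating the block from one helper keeps the selectors in a single place, so a template change only needs updating once.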

Google's current implementation is focused on news publishers, but the underlying principle — identifying the most audio-appropriate content sections — applies broadly as voice and AI interfaces evolve.

What Content to Mark as Speakable

The best candidates for speakable markup: definition sentences (the first sentence of a definition, which directly explains the term), key takeaways (short, complete statements that stand alone), direct answers to common questions (the 40–60 word answer below an H2 question heading), and summary paragraphs.

Avoid marking: content that requires visual context (tables, charts), technical code examples, lists that don't make sense read linearly, and content with lots of parenthetical asides or complex sentence structures.

Think about the audio experience: 'Crawl budget is the number of pages Googlebot crawls on your site per day' reads well. 'As we discussed in the previous section, the following table shows...' does not.
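
The criteria above can be roughly approximated in code. The following is a heuristic sketch only: the cue list, the function name speakable_issues, and the 90-word ceiling (taken from the Key Takeaways) are assumptions for illustration, and reading the text aloud remains the real test. Only the upper word limit is enforced so that single definition sentences still pass.

```python
import re

# Illustrative, incomplete list of phrases that lean on visual or
# surrounding context. Expect false positives (e.g. "above average").
CONTEXT_CUES = re.compile(
    r"\b(as shown|see (the )?(table|chart|figure|image)|below|above|"
    r"previous section|following (table|chart|figure))\b",
    re.IGNORECASE,
)


def speakable_issues(text, max_words=90):
    """Return a list of reasons the text may not work when read aloud."""
    issues = []
    word_count = len(text.split())
    if word_count > max_words:
        issues.append(f"too long ({word_count} words)")
    if CONTEXT_CUES.search(text):
        issues.append("references visual or surrounding context")
    if "(" in text or ")" in text:
        issues.append("contains parenthetical asides")
    return issues


good = "Crawl budget is the number of pages Googlebot crawls on your site per day."
bad = "As we discussed in the previous section, the following table shows the results."

print(speakable_issues(good))  # []
print(speakable_issues(bad))   # ['references visual or surrounding context']
```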

Speakable Schema · AEO

A structured data specification (SpeakableSpecification) embedded in JSON-LD that signals to voice assistants and AI reading tools exactly which sections of a webpage are most appropriate for text-to-speech delivery — enabling attributed audio answers from your content.

SPEAKABLE SCHEMA: FROM PROPOSAL TO AI-ERA RELEVANCE
2017
Schema.org Proposal

The SpeakableSpecification type is formally proposed and added to the Schema.org vocabulary, establishing the technical foundation for identifying audio-appropriate content.

2018
Google Developer Beta Launch

Google introduces Speakable schema in beta for Google Assistant, initially restricting eligibility to news publishers in the United States to test real-world audio retrieval at scale.

2019
Google Search Central Documentation

Google publishes official Speakable structured data guidelines, confirming CSS selector and XPath implementation methods and defining content quality criteria for marked sections.

2023–2024
Renewed Relevance via Generative AI

The rise of AI-powered voice interfaces and large language model assistants increases the strategic value of Speakable markup as a direct signal for audio-optimised, machine-readable content selection.

✓ DO

Mark the first definitional sentence of a section — the one that directly answers 'what is this?'

Target concise standalone statements of 40–60 words that make complete sense without surrounding context

Use stable, specific CSS selectors like element IDs (#article-summary) to avoid unintended content being captured

Include your article headline in speakable markup to provide Google Assistant with clear attribution context

Test marked sections by reading them aloud yourself — if they sound natural and complete, they qualify

✗ DON'T

Mark content that references visuals: 'as shown in the chart below' is meaningless in audio

Include bulleted or numbered lists that rely on visual scanning rather than linear reading

Use overly broad CSS selectors like 'p' or '.content' that capture entire page bodies indiscriminately

Mark code blocks, technical syntax examples, or strings of URLs and file paths

Tag content with heavy parenthetical asides, em-dashes mid-clause, or complex nested sentence structures

REAL-WORLD EXAMPLE
News Publisher Speakable Implementation

A news site structures its article template so that the headline maps to '.article-headline' and the opening summary paragraph maps to '.article-summary'. Their JSON-LD block reads: { "@context": "https://schema.org", "@type": "WebPage", "speakable": { "@type": "SpeakableSpecification", "cssSelector": [".article-headline", ".article-summary"] } }. When a user asks Google Assistant 'What happened with [story topic]?', Assistant retrieves the page, reads the two marked sections aloud, and attributes the answer to the publisher — driving brand awareness through audio without a screen interaction.

CSS SELECTOR VS. XPATH: CHOOSING YOUR SPEAKABLE IMPLEMENTATION METHOD
CSS Selector | XPath Expression
Simpler syntax, familiar to front-end developers and SEOs | More powerful: can also target elements by text content or by their relationship to other nodes
Example: ".article-summary", "#page-intro" | Example: "/html/head/title", "//p[@class='summary']"
Best for modern, well-structured HTML with consistent class naming | Better suited to legacy CMS output or irregular DOM structures
The more commonly documented approach in published examples | Supported by Google but less commonly used in published implementations
Matches on tag, class, ID, attribute, and position among siblings, but cannot match on text content or select a parent from its child | Can traverse parent-child relationships in both directions and apply conditional node selection
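
To make the XPath column more concrete, here is a small sketch that resolves the two example expressions against a page fragment using Python's standard-library ElementTree, which supports a limited XPath subset. The fragment is hypothetical and deliberately well-formed XML; real-world HTML usually needs a proper HTML parser, so treat this as an illustration of how the expressions select nodes rather than as a production check.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """
<html>
  <head><title>Example story headline</title></head>
  <body>
    <p class="byline">By a staff writer</p>
    <p class="summary">A short summary of the story.</p>
  </body>
</html>
"""

root = ET.fromstring(PAGE.strip())

# ElementTree paths are relative to the root element, so the document-level
# expression /html/head/title becomes ./head/title here.
title = root.find("./head/title")

# Counterpart of //p[@class='summary']: any <p> whose class attribute matches.
summary = root.find(".//p[@class='summary']")

print(title.text)
print(summary.text)
```

The second expression skips the byline paragraph because the attribute predicate only matches the element whose class is exactly "summary".
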
PRE-PUBLISH SPEAKABLE SCHEMA AUDIT
JSON-LD block is placed in the <head> or immediately after the opening <body> tag
@type is set to 'WebPage' (or 'Article') and the speakable property uses 'SpeakableSpecification'
Each CSS selector or XPath expression resolves to exactly the intended element on the live page
Every marked section reads as a complete, self-contained statement when heard without surrounding content
No marked sections reference tables, images, charts, or other visual-only elements
Headline element is included in speakable selectors to provide attribution context for voice assistants
Validated using Google's Rich Results Test and Schema.org validator before deployment
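
Part of this audit can be automated before the manual checks. The sketch below covers only the structural items (the @type values and overly broad selectors); the function name audit_speakable and the BROAD_SELECTORS list are assumptions for illustration. It does not confirm that selectors resolve on the live page and is not a substitute for Google's Rich Results Test.

```python
import json

# Selectors this article warns against as overly broad (illustrative list).
BROAD_SELECTORS = {"p", "div", "body", ".content"}


def audit_speakable(jsonld_text):
    """Return a list of structural problems found in a Speakable JSON-LD string."""
    data = json.loads(jsonld_text)
    problems = []
    # Subtypes such as NewsArticle are not handled in this sketch.
    if data.get("@type") not in ("WebPage", "Article"):
        problems.append("@type should be WebPage or Article")
    spec = data.get("speakable")
    if not isinstance(spec, dict) or spec.get("@type") != "SpeakableSpecification":
        return problems + ["speakable must be a SpeakableSpecification"]
    selectors = spec.get("cssSelector", [])
    if isinstance(selectors, str):
        selectors = [selectors]
    # Accept either capitalisation of the XPath property name.
    if not selectors and not spec.get("xpath") and not spec.get("xPath"):
        problems.append("no cssSelector or xpath values given")
    for sel in selectors:
        if sel.strip() in BROAD_SELECTORS:
            problems.append(f"selector {sel!r} is too broad")
    return problems


example = json.dumps({
    "@context": "https://schema.org",
    "@type": "WebPage",
    "speakable": {
        "@type": "SpeakableSpecification",
        "cssSelector": [".article-headline", "p"],
    },
})
print(audit_speakable(example))  # ["selector 'p' is too broad"]
```
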

Frequently Asked Questions

Does Speakable schema improve search rankings?

Not directly. Speakable schema is an AEO signal targeted at voice and audio interfaces, not a traditional ranking factor. However, the content it points to (concise, standalone, directly answerable) overlaps significantly with featured snippet and People Also Ask (PAA) content. Optimising for speakable content quality tends to improve all answer engine placements.

Is Speakable schema worth implementing?

It's low-risk and potentially valuable as a future-proofing measure. AI reading tools, voice assistants, and text-to-speech interfaces are growing. Marking your best answerable content now costs little and may signal content structure to AI systems beyond Google Assistant. Implement it on your most query-relevant pages and validate with Google's Rich Results Test.

Sources & Further Reading
  • 1. Google: Speakable structured data documentation
  • 2. Schema.org: Speakable type
  • 3. Search Engine Land: Voice search schema