This document explains how structured data (schema.org markup) works on the Pulumi website.
Structured data helps search engines understand your content better, leading to:
- Rich search results (FAQs, events, articles)
- Better SEO and search visibility
- Enhanced AI citations (ChatGPT, Perplexity, Google AI Overviews)
- Improved content discoverability
The site automatically generates appropriate schema.org markup based on content type. You can also explicitly control which schema type to use.
| Schema Type | Schema.org Type | Use For | Example |
|---|---|---|---|
| `blog` | BlogPosting | Blog posts | /blog/my-post/ |
| `article` | TechArticle | Documentation, educational content | /docs/, /what-is/ |
| `faq` | FAQPage | FAQ pages with Q&A pairs | /docs/iac/faq/ |
| `howto` | HowTo | Step-by-step tutorials | /tutorials/ |
| `product` | SoftwareApplication | Product pages | /product/ |
| `event` | Event | Webinars, conferences, meetups | /events/ |
| `auto` | (various) | Intelligent auto-detection | Default behavior |
| `none` | (none) | No schema markup | Special cases |
By default (or when schema_type: auto), the system automatically determines the appropriate schema based on:
- type: blog → BlogPosting
- type: tutorials → HowTo
- type: webinars → Event
- type: docs → TechArticle
- type: what-is → FAQPage (if questions found) or TechArticle (if no questions)
- /faq in URL → FAQPage
- "faq" or "frequently asked" in title → FAQPage
- section: product → SoftwareApplication
- section: case-studies → TechArticle
If a page is detected as FAQ but has no Q&A content, it automatically falls back to Article schema to prevent invalid structured data.
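
The rules above can be sketched in Python to make the decision order easier to follow. This is only an illustration, not the site's implementation (which lives in Hugo partials): the `Page` class, the `has_questions` flag, the rule tables, and the precedence of FAQ checks over `type` and `section` rules are all assumptions, as is the use of TechArticle for the "Article" fallback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical stand-in for the page data the templates see."""
    url: str = ""
    title: str = ""
    type: str = ""
    section: str = ""
    has_questions: bool = False  # assumed: set when Q&A headers were found


# Page `type` values that map directly to a schema.
TYPE_RULES = {
    "blog": "BlogPosting",
    "tutorials": "HowTo",
    "webinars": "Event",
    "docs": "TechArticle",
}

# Page `section` values that map directly to a schema.
SECTION_RULES = {
    "product": "SoftwareApplication",
    "case-studies": "TechArticle",
}


def detect_schema(page: Page) -> Optional[str]:
    """Return a schema.org type for the page, or None if no rule matches."""
    title = page.title.lower()
    looks_like_faq = (
        page.type == "what-is"
        or "/faq" in page.url
        or "faq" in title
        or "frequently asked" in title
    )
    if looks_like_faq:
        # Fall back rather than emit an FAQPage with no questions.
        return "FAQPage" if page.has_questions else "TechArticle"
    if page.type in TYPE_RULES:
        return TYPE_RULES[page.type]
    return SECTION_RULES.get(page.section)
```

Checking the FAQ signals first is what lets a page such as /docs/iac/faq/ resolve to FAQPage even though its `type` is `docs`; the real templates may order the rules differently.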
You can override auto-detection by specifying schema_type in your page frontmatter:
```yaml
---
title: "My Page Title"
schema_type: faq
---
```

Use explicit declaration when:
- Auto-detection chooses the wrong schema type
- You want to be specific about SEO markup
- Content structure doesn't match typical patterns
- You need to prevent schema generation (`schema_type: none`)
Leave as auto when:
- Your content follows typical patterns
- Page type clearly indicates schema type
- You want the system to make intelligent decisions
```markdown
---
title: "Common Questions About Pulumi"
schema_type: faq
---

## What is Pulumi?

Pulumi is an infrastructure as code platform...

## How does Pulumi work?

Pulumi uses familiar programming languages...
```

```yaml
---
title: "Internal Planning Doc"
schema_type: none
---
```

```yaml
---
title: "Understanding Kubernetes"
type: docs          # Would normally get TechArticle
schema_type: howto  # Override to HowTo if it's step-by-step
---
```

Requirements:
- Must have H2 or H3 headers ending with `?`
- Each question must be followed by answer content
- Questions are extracted automatically from markdown
Valid Question Formats:
```markdown
## How do I install Pulumi?

Answer content here...

## What languages does Pulumi support?

Answer content here...
```

Invalid (won't be detected):

```markdown
## Installation Steps

This is not a question...

### Getting Started

Not a question either...
```

Google Guidelines:
- One answer per question
- Written by site (not user-submitted)
- Actually visible on the page
- Not for advertising purposes
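
To make the question-detection requirements concrete, here is a minimal Python sketch of how Q&A pairs could be pulled out of a markdown body. It is an approximation under stated assumptions, not the Hugo template logic the site uses: the function name `extract_faq`, the regular expressions, and the rule that an answer runs until the next heading of any level are all hypothetical.

```python
import re

# H2 or H3 heading whose text ends with a question mark.
QUESTION_RE = re.compile(r"^#{2,3}\s+(.+\?)\s*$")
# Any markdown heading; used to detect where an answer ends.
HEADING_RE = re.compile(r"^#{1,6}\s+")


def extract_faq(markdown: str) -> list:
    """Return [{"question": ..., "answer": ...}] for question-style headers."""
    pairs = []
    question = None
    answer_lines = []
    # A trailing sentinel heading flushes the final question.
    for line in markdown.splitlines() + ["# end"]:
        if HEADING_RE.match(line):
            answer = "\n".join(answer_lines).strip()
            # Questions without answer content are dropped.
            if question and answer:
                pairs.append({"question": question, "answer": answer})
            match = QUESTION_RE.match(line)
            question = match.group(1) if match else None
            answer_lines = []
        elif question:
            answer_lines.append(line)
    return pairs
```

Headers such as `## Installation Steps` produce no entry, and neither does a question with nothing beneath it. The sketch does not skip fenced code blocks, so a `#` comment line inside a code sample would be treated as a heading.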
Auto-populated from frontmatter:
- `main.title` → event name
- `main.sortable_date` → start date
- `main.duration` → duration
- `main.location` → location (physical or virtual)
- `main.presenters` → speakers/performers
Example:
```yaml
---
type: webinars
main:
  title: "Pulumi Workshop"
  sortable_date: 2025-03-15T10:00:00-07:00
  duration: "2 hours"
  location: "Seattle, WA"
  presenters:
    - name: "Jane Smith"
      role: "Developer Advocate"
---
```

Best for:
- Documentation pages
- Educational content
- Technical guides
- Case studies
Auto-populated from:
- `title` → headline
- `meta_desc` → description
- `PublishDate` → datePublished
- `Lastmod` → dateModified
Learn more about Article schema
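
As a rough illustration of the field mapping for Article schema, the following Python sketch builds a TechArticle entity from a dictionary of page values. The `build_article` function and the shape of the `page` dictionary are hypothetical; the real output comes from the Hugo partials and will likely contain more properties than the four shown here.

```python
# Page field -> schema.org property, as listed in this section.
FIELD_MAP = {
    "title": "headline",
    "meta_desc": "description",
    "PublishDate": "datePublished",
    "Lastmod": "dateModified",
}


def build_article(page: dict) -> dict:
    """Build a minimal TechArticle entity, skipping fields that are unset."""
    entity = {"@type": "TechArticle"}
    for source, target in FIELD_MAP.items():
        value = page.get(source)
        if value:
            entity[target] = value
    return entity
```

Omitting empty fields, rather than emitting empty strings, is a design choice of this sketch, not a documented behavior of the site.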
Automatically applied to:
- Pages with `type: blog` and `isPage: true`
Includes:
- Article metadata (title, dates, description)
- Author information
- Word count
- Images
Best for:
- Step-by-step tutorials
- Procedural guides
- How-to content
Auto-extracts:
- Steps from markdown headers and content
- Duration if specified
- Prerequisites
- Build your page locally or deploy to staging
- Visit Rich Results Test
- Enter your page URL
- Check for errors or warnings
- After deploying to production
- Go to Google Search Console
- Check "Enhancements" section for structured data issues
- Monitor for "Missing field" or other errors
- Run `make serve`
- Visit your page at http://localhost:1313/your-page/
- View page source
- Look for `<script type="application/ld+json">`
- Verify the JSON structure is valid and complete
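
The manual check of the page source can also be scripted. The sketch below uses only the Python standard library to collect every `application/ld+json` script block from an HTML string and parse it; the names `JsonLdCollector` and `find_json_ld` are hypothetical helpers, not part of the site's tooling. It confirms that the JSON is well-formed, not that it satisfies Google's requirements.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of each <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def find_json_ld(html: str) -> list:
    """Parse every JSON-LD block; json.loads raises ValueError on bad JSON."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    return [json.loads(block) for block in collector.blocks]
```

To use it against the local server, save the page source to a file (or fetch it with `urllib.request.urlopen`) and pass the decoded HTML to `find_json_ld`. An empty list means no structured data was emitted for the page.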
Problem: FAQPage schema without questions
Solution: Either:
- Add Q&A content (H2/H3 ending with `?`)
- Set `schema_type: article` to use Article schema instead
- The system now falls back to Article automatically if no questions are found
Problem: Page could be both Article and FAQ
Solution: Choose the primary purpose:
- If primarily Q&A → use `faq`
- If primarily educational → use `article`
- Google prefers one primary schema type per page
Causes:
- Page type doesn't match any auto-detection rules
- `schema_type: none` is set
- Content doesn't meet requirements (e.g., FAQ with no questions)
Solution:
- Check page frontmatter for `type` field
- Add explicit `schema_type` if needed
- Ensure content meets schema requirements
- Create `layouts/partials/schema/collectors/[name]-entity.html`
- Follow the pattern from existing collectors (see `blog-entity.html`)
- Add the new type to both the explicit and auto-detection logic in `main-entity.html`
- Update this documentation
- Add to archetype templates
```
layouts/partials/schema/
├── collectors/
│   ├── main-entity.html        # Routes to appropriate schema
│   ├── article-entity.html     # TechArticle schema
│   ├── blog-entity.html        # BlogPosting schema
│   ├── faq-entity.html         # FAQPage schema
│   ├── howto-entity.html       # HowTo schema
│   ├── product-entity.html     # SoftwareApplication schema
│   ├── event-entity.html       # Event schema
│   ├── video-entity.html       # VideoObject (supplementary)
│   └── breadcrumb-entity.html  # BreadcrumbList (supplementary)
└── graph-builder.html          # Assembles @graph structure
```
All schemas are assembled into a single JSON-LD @graph structure:
```json
{
  "@context": "https://schema.org",
  "@graph": [
    {"@type": "BreadcrumbList", ...},
    {"@type": "BlogPosting", ...},
    {"@type": "Organization", ...},
    {"@type": "WebPage", ...}
  ]
}
```

- Schema.org Documentation
- Google Search Central - Structured Data
- Google Rich Results Test
- Google Search Console
- SEO.md - SEO best practices including structured data
- BLOGGING.md - Blog post guidelines