Date: 2026-05-18
The site has 500+ pages: ~560 blog posts across 12 layouts, plus 22 zen PWA apps, tools, subscription pages, feeds, retired pages, and other scaffolding. Without explicit indexing control, Google sees everything, wasting crawl budget on pages that hurt SEO (thin, duplicate, or non-canonical content).
A Jekyll post_render hook plugin injects <meta name="robots"> into every rendered HTML page. The value comes from a priority chain, evaluated in order, so rules can be expressed as simply as possible:
1. robots: - per-page override, injected verbatim into content=""
2. noindex_patterns - glob patterns that always force noindex, regardless of layout
3. index_layouts / index_paths / index_patterns - any match → index
4. default: noindex - fallback for everything not matched

_config.yml under robots_meta: - the full block lives there so it's version-controlled and visible in one place.
robots_meta:
  default: noindex
  index_layouts: [post, bpost, bpostnoads, bookpost, ...]
  index_paths: [/]
  index_patterns: [/categories/**, /tags/**]
  noindex_patterns: [/zen*/, /zen*/**, /apps/**, /token/**, ...]
The site has more non-blog pages than blog pages by count. Opt-in indexing (list what should appear in Google) is safer than opt-out (list what shouldn’t). A new section added to the site is silently noindexed until explicitly added to index_layouts or index_patterns.
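The priority chain and the opt-in default can be sketched in a few lines of Ruby. This is a minimal illustration, not the plugin's actual source: the names `robots_content`, `glob_match?`, `ROBOTS_META`, and `FNM_FLAGS` are hypothetical, `index_paths` is assumed to be an exact URL match, and a match on an index rule is assumed to emit the plain value `index`. Entries elided with `...` in the config are left out.

```ruby
# Hypothetical sketch of the priority chain; not the plugin's actual source.
FNM_FLAGS = File::FNM_PATHNAME | File::FNM_EXTGLOB

# Mirrors the robots_meta block; entries elided with "..." are omitted here.
ROBOTS_META = {
  "default"          => "noindex",
  "index_layouts"    => %w[post bpost bpostnoads bookpost],
  "index_paths"      => ["/"],
  "index_patterns"   => ["/categories/**", "/tags/**"],
  "noindex_patterns" => ["/zen*/", "/zen*/**", "/apps/**", "/token/**"]
}.freeze

# True if any glob in the list matches the URL.
def glob_match?(patterns, url)
  Array(patterns).any? { |pattern| File.fnmatch?(pattern, url, FNM_FLAGS) }
end

# Returns the content="" value for one page, walking the chain top to bottom.
def robots_content(url:, layout: nil, override: nil, config: ROBOTS_META)
  # 1. Per-page front matter wins and is passed through verbatim.
  return override unless override.nil? || override.strip.empty?

  # 2. noindex_patterns force noindex regardless of layout.
  return "noindex" if glob_match?(config["noindex_patterns"], url)

  # 3. Any index rule that matches opts the page in (assumed value: "index").
  indexed = Array(config["index_layouts"]).include?(layout) ||
            Array(config["index_paths"]).include?(url) ||
            glob_match?(config["index_patterns"], url)
  return "index" if indexed

  # 4. Everything else falls back to the default.
  config.fetch("default", "noindex")
end

puts robots_content(url: "/2026/05/18/some-post/", layout: "post") # index
puts robots_content(url: "/zeneditor/", layout: "post")            # noindex
puts robots_content(url: "/new-section/page/", layout: "page")     # noindex
```

The last call shows the opt-in behaviour: a URL and layout that no rule mentions come out as noindex without any configuration change.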
Any page or post can override with front matter:
---
# Use one robots: line per page; the alternatives are commented out.
robots: noindex                        # keep this post out of Google
# robots: index, follow                # force index even if a pattern would noindex it
# robots: noindex, nofollow, noarchive # verbatim: any robots directive works
---
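- Indexing is controlled through index_layouts, noindex_patterns, or robots: on that page.
- Globs are matched with File::FNM_PATHNAME | File::FNM_EXTGLOB, so /zen*/ matches /zeneditor/, /zengen/, etc. without listing each slug.
- Pages that already contain name="robots" are skipped, so the plugin never clobbers a manually placed tag in a layout <head>.
- Hooks are registered on :pages, :posts, :documents, covering all Jekyll output types.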
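The skip-if-present rule and the hook coverage could look roughly like the following. It is a sketch under assumptions rather than the plugin itself: `inject_robots_meta` and `ROBOTS_META_RE` are hypothetical names, the tag is assumed to be inserted just before the first `</head>`, and the hook body uses a simplified stand-in (front matter value or `noindex`) instead of the full priority chain. The registration only runs when Jekyll is loaded, so the helper can be tried on its own.

```ruby
# Hypothetical sketch; helper names and the insertion point are assumptions.

# Detects an existing <meta name="robots" ...> tag, with either quote style.
ROBOTS_META_RE = /<meta\s[^>]*name\s*=\s*["']robots["']/i

# Returns html with a robots meta tag added, or html unchanged when a tag
# is already present or there is no </head> to insert before.
def inject_robots_meta(html, content)
  return html if html.nil? || html.empty?
  return html if html.match?(ROBOTS_META_RE) # never clobber a manual tag

  # content is inserted verbatim, as the front matter override requires.
  tag = %(<meta name="robots" content="#{content}">)
  html.sub(%r{</head>}i) { "#{tag}\n</head>" }
end

# Register on all three owners so pages, posts, and collection documents
# are covered. Skipped entirely when Jekyll is not loaded.
if defined?(Jekyll::Hooks)
  Jekyll::Hooks.register [:pages, :posts, :documents], :post_render do |doc|
    next unless doc.output_ext == ".html"

    # Simplified stand-in for the priority chain: override or default.
    content = doc.data["robots"] || "noindex"
    doc.output = inject_robots_meta(doc.output, content)
  end
end
```

Because the helper returns the HTML unchanged once a robots tag exists, calling it more than once on the same output is harmless.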