Validation & diagnostics
This is the engine companion to
How Keystone checks your book.
That page is the author's mental model; this one is the mechanism — how the
checks are implemented, and how a core forker adds their own. If you're only
writing a book, you want the author page.
One path for filter diagnostics
Every diagnostic from Keystone's Lua filters and handlers goes through one
library, .pandoc/filters/lib/errors.lua — no filter writes to stderr or exits
the process on its own. (Shell resolvers and Pandoc emit their own; see
strict mode.) That single owner is why
the Lua messages are uniform: the WARN:/ERROR: prefix, the element-context
suffix, and clean-fatal behavior are decided in one place.
Two functions, two severities:
- warn(msg): print, notify the strict-mode sink, and return. The caller continues with a fallback.
- fatal(msg): print, then exit the process cleanly. No Lua traceback into filter source reaches the author; fatal calls os.exit before Pandoc's own error handler can append one.
fatal is for author- and config-facing failures — a bad value, an undeclared
mark. A bare error() is reserved for internal-invariant bugs (a handler that
returned the wrong shape), where a traceback is the right tool. Handlers are
pure author-content surface: they always go through report, and never call
error() or write to io.stderr directly, so every message stays uniform and the
fatal path stays traceback-free.
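The division of labor between warn and fatal can be sketched in a few lines. This is a minimal illustration, not the contents of errors.lua: the write, exit, and notify_sink fields are hypothetical seams added so the sketch can be exercised without ending the process, and the exit status of 1 is an assumption.

```lua
-- Hypothetical sketch of a single-owner diagnostics module.
-- Only the names warn and fatal come from the text; the rest is illustrative.
local errors = {}

-- Injectable seams (assumptions) so the behavior can be observed in isolation.
errors.write = function(line) io.stderr:write(line, "\n") end
errors.exit = os.exit
errors.notify_sink = function(_line) end -- strict-mode hook; a no-op here

-- warn: print, notify the strict-mode sink, and return to the caller.
function errors.warn(msg)
  local line = "WARN: " .. msg
  errors.write(line)
  errors.notify_sink(line)
end

-- fatal: print, then exit. Exiting here, instead of calling error(),
-- is what keeps a Lua traceback from being appended to the message.
function errors.fatal(msg)
  errors.write("ERROR: " .. msg)
  errors.exit(1) -- exit status 1 is an assumption
end
```

Because both functions share one writer, the prefix and the exit behavior change in one place if they ever change at all.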
Element context: quoting the element, not a line
A diagnostic locates the offending element by quoting it, not by line number:
WARN: aside: unknown type 'todo' (in .aside "A callout whose type is not one of the d…")
That trailing (in …) suffix is built by kast.inspect.describe(el), which
snapshots the element's classes, identifier, and a leading-text snippet (capped,
cut on a UTF-8 boundary). The errors lib renders that as a CSS-like selector plus
the snippet.
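A rough sketch of that rendering step follows. The field names on the snapshot table (identifier, classes, snippet), the 40-byte cap, and the helper names utf8_cut and render_context are assumptions for illustration; the source only states that the snippet is capped and cut on a UTF-8 boundary.

```lua
-- Cut s to at most max_bytes without splitting a UTF-8 sequence.
-- Returns the (possibly shortened) string and whether it was shortened.
local function utf8_cut(s, max_bytes)
  if #s <= max_bytes then return s, false end
  local cut = max_bytes
  -- Back up while the byte just after the cut is a continuation byte
  -- (10xxxxxx), which would mean the cut falls inside a character.
  while cut > 0 do
    local b = s:byte(cut + 1)
    if b < 0x80 or b >= 0xC0 then break end
    cut = cut - 1
  end
  return s:sub(1, cut), true
end

-- Render a snapshot as a CSS-like selector plus a quoted snippet.
-- info is a hypothetical table: { identifier = "", classes = {...}, snippet = "" }
local function render_context(info)
  local selector = ""
  if info.identifier and info.identifier ~= "" then
    selector = selector .. "#" .. info.identifier
  end
  for _, class in ipairs(info.classes or {}) do
    selector = selector .. "." .. class
  end
  local snippet, truncated = utf8_cut(info.snippet or "", 40) -- cap is assumed
  if truncated then snippet = snippet .. "…" end
  return string.format('(in %s "%s")', selector, snippet)
end
```

Backing up over continuation bytes means a multi-byte character is either kept whole or dropped whole, so the quoted snippet never ends in a broken sequence.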
Why quote the element instead of pointing at file:line? Source positions reach
the document tree only through Pandoc's sourcepos extension, which the
commonmark/gfm readers support but the markdown reader — the one the whole
fenced-div and shortcut system rests on — does not. Quoting the element is 100%
correct with no reader switch, and it covers every handler. describe is also
the seam where source positions would attach: if the markdown reader ever gains
sourcepos, a pos field added here surfaces real file:line with the call
sites untouched. (The one place a real line number appears today is Pandoc's own
unclosed-div warning, which Keystone passes through untouched — see
Finding the offender.)
The context is bound lazily: the describe walk runs only if a diagnostic actually fires, so the happy path pays nothing.
-- in the dispatcher, per element
local report = errors.bind(function() return kast.inspect.describe(el) end)
-- a handler then calls report.warn("…") / report.fatal("…")
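One way such lazy binding could work is sketched here. It is an assumption-laden illustration rather than the real errors.bind: the emit parameter is a hypothetical stand-in for printing, the fatal branch does not exit, and caching the description after first use is a design guess.

```lua
-- Hypothetical sketch of lazy context binding.
-- describe_fn is only called when a diagnostic actually fires.
local function bind(describe_fn, emit)
  local context -- computed on first use, then reused (caching is an assumption)
  local function suffix()
    if context == nil then context = describe_fn() end
    return " " .. context
  end
  return {
    warn = function(msg) emit("WARN: " .. msg .. suffix()) end,
    -- A real fatal would also exit the process; omitted in this sketch.
    fatal = function(msg) emit("ERROR: " .. msg .. suffix()) end,
  }
end
```

An element that produces no diagnostic never triggers describe_fn, which is the "happy path pays nothing" property in miniature.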
The closed vocabulary
Keystone's class set is closed: there is no author CSS channel, so a class
that resolves to nothing is a typo, not an extension point. shortcuts.lua holds
the full vocabulary, so it owns the check — every div/span class must resolve
against one of:
- a handler class (bare or with the ks- prefix),
- a shortcut name (system or user), or
- a short Pandoc-native allowlist (smallcaps, underline, ul, mark, unlisted).
Any class matching none of them warns, naming the class. A close match from the
combined vocabulary — Levenshtein distance ≤ 2, ties broken lexicographically —
is offered as a "Did you mean?" hint. The name is canonicalized (its ks-
prefix stripped) before matching, so a broken private class steers to the public
name, which is the stable surface.
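The hint selection can be sketched as follows. The function names (levenshtein, canonical, suggest) and the flat list of public names are assumptions; the threshold, the tie-break, and the prefix stripping follow the description above. The distance is computed over bytes, which is adequate only if class names are ASCII.

```lua
-- Classic two-row Levenshtein distance over bytes.
local function levenshtein(a, b)
  local prev = {}
  for j = 0, #b do prev[j] = j end
  for i = 1, #a do
    local cur = { [0] = i }
    for j = 1, #b do
      local cost = (a:byte(i) == b:byte(j)) and 0 or 1
      cur[j] = math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
    end
    prev = cur
  end
  return prev[#b]
end

-- Strip a leading ks- so private names steer to the public vocabulary.
local function canonical(name)
  return (name:gsub("^ks%-", ""))
end

-- Return the closest vocabulary entry within distance 2, or nil.
-- Ties on distance are broken lexicographically.
local function suggest(class, vocabulary)
  local target = canonical(class)
  local best, best_dist
  for _, name in ipairs(vocabulary) do
    local d = levenshtein(target, name)
    if d <= 2 and (best == nil or d < best_dist
        or (d == best_dist and name < best)) then
      best, best_dist = name, d
    end
  end
  return best
end
```

For example, suggest("ks-asid", { "aside", "note" }) yields "aside", while a class with no entry within two edits yields nil and so gets no hint.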
Classless elements (an id-only ::: {#refs}) are skipped — citeproc fills those
with csl-* classes after the filters run, so flagging them would be wrong.
Required fields
Whether a field is required is declared in the shortcut interface, not in handler code:
- An interface entry may declare required: true. At expansion, a required field with no author value and no default is fatal: the field is the whole point of the construct (vspace.size, set.mark).
- Each handler declares the attributes it cannot run without on its returned table: required_attributes = { "size" }. This is the handler's contract.
- At load time, shortcuts.lua checks that every shortcut routing to a handler guarantees each required attribute, via a default or required: true. A shortcut that would let a required attribute through unset is a fatal misconfiguration, caught before any book builds. The guarantee extends transitively to shortcut bodies; only a bare ks-* handler used directly in the manuscript bypasses it (the documented escape hatch).
The split keeps handlers free of fatal presence checks — they assume the
shortcut layer supplied what's required and validate only value validity
(aside rejects an unknown type). The required-field guarantee lives in the
interface, where aside.type defaults to note; a handler keeps only a soft
guard for the bare-ks-* bypass path: aside warns "no type given" rather
than crashing.
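The load-time guarantee check might look roughly like this. The table shapes (shortcuts keyed by name with handler and interface fields, handlers keyed by class), the handler class names, and the message wording are all hypothetical. The sketch covers only the direct case, not the transitive extension to shortcut bodies.

```lua
-- Hypothetical shapes, for illustration only:
--   handlers[class]  = { required_attributes = { "size", ... } }
--   shortcuts[name]  = { handler = class, interface = { field = { default = ..., required = true } } }
-- Returns a sorted list of problems; an empty list means every
-- required attribute is guaranteed by a default or required: true.
local function check_guarantees(shortcuts, handlers)
  local problems = {}
  for name, shortcut in pairs(shortcuts) do
    local handler = handlers[shortcut.handler]
    local required = handler and handler.required_attributes or {}
    for _, attr in ipairs(required) do
      local field = shortcut.interface and shortcut.interface[attr]
      local guaranteed = field ~= nil
        and (field.default ~= nil or field.required == true)
      if not guaranteed then
        problems[#problems + 1] = string.format(
          "shortcut '%s' does not guarantee '%s' for handler '%s'",
          name, attr, shortcut.handler)
      end
    end
  end
  table.sort(problems) -- pairs() order is unspecified; sort for stable output
  return problems
end
```

A loader built on this would call the fatal path for a non-empty result, which is how a misconfiguration surfaces before any book builds.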
Strict mode
KEYSTONE_WARNINGS_AS_ERRORS (accepted truthy: true, 1, yes, on) turns
every warning fatal. Three worlds emit warnings, and each fails differently:
- Shell resolvers (diagnostics.sh) fail fast: under strict mode the first warn stops the build immediately.
- Pandoc runs with --fail-if-warnings, so its own reader warnings fail the pass they occur in.
- Lua filters are the one world that aggregates. They run as separate Lua states (the main pass, the EPUB pre-scan, a standalone font-path invocation), so an in-process counter can't see them all. Instead publish.sh exports a sink file path, each state's errors lib appends its warnings there, and publish.sh checks the sink after the build. That's report-all-then-fail: one run surfaces every filter warning at once rather than dying on the first.
Whichever world trips, artifacts are promoted on success only — the build
writes to a staging path and moves into artifacts/ only after a clean build
with an empty sink, so a strict failure never leaves a half-built file behind.
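The Lua side of the sink can be sketched as below. How the sink path reaches each Lua state is not specified here, so the sketch takes it as a parameter. The helper names are hypothetical, and read_sink is a Lua stand-in for the check that publish.sh performs in shell. Matching the truthy values exactly, without case folding, is also an assumption.

```lua
-- Accepted truthy spellings, as listed for KEYSTONE_WARNINGS_AS_ERRORS.
local TRUTHY = { ["true"] = true, ["1"] = true, ["yes"] = true, ["on"] = true }

-- value would typically come from os.getenv("KEYSTONE_WARNINGS_AS_ERRORS").
local function is_strict(value)
  return value ~= nil and TRUTHY[value] == true
end

-- Append one warning to the shared sink file. Append mode lets separate
-- Lua states contribute to the same file across the build.
local function append_to_sink(path, line)
  local f = assert(io.open(path, "a"))
  f:write(line, "\n")
  f:close()
end

-- Stand-in for the post-build check: read back every recorded warning.
local function read_sink(path)
  local lines = {}
  local f = io.open(path, "r")
  if not f then return lines end
  for line in f:lines() do lines[#lines + 1] = line end
  f:close()
  return lines
end
```

A file is the simplest state that outlives each Lua state, which is why the aggregation is pushed out of process rather than kept in a counter.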
EPUB diagnostics can print twice
EPUB builds run a pre-scan pass before the main build, and the filters run in both. A filter-level warning therefore prints once per pass. The sink de-duplicates before reporting, so strict mode lists each distinct warning once.
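De-duplication of that kind is a few lines. The sketch below assumes duplicates are byte-identical lines and keeps first-seen order; whether the real sink compares whole lines or something narrower is not stated.

```lua
-- Return the distinct lines of a list, preserving first-seen order.
local function distinct(lines)
  local seen, out = {}, {}
  for _, line in ipairs(lines) do
    if not seen[line] then
      seen[line] = true
      out[#out + 1] = line
    end
  end
  return out
end
```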
Adding a diagnostic to your handler
When you add a handler, it receives the
element and a bound report handle:
local function my_div(el, report)
  local width = el.attributes["width"]
  if width and not valid(width) then
    report.warn("mydiv: unknown width '" .. width .. "'")
    -- fall through to a safe default
  end
  -- …emit output…
end
- Use report.warn when you have a fallback and the book can still build; use report.fatal when proceeding would produce broken output.
- The element context is attached automatically, so don't repeat the class or a location in your message. State the problem and the offending value.
- Declare anything you can't run without in required_attributes on the returned table, rather than hand-rolling a presence check. The interface then guarantees it, and the message is consistent with every other required field.
Match the wording of the existing handlers: lower-case handler name, the bad
value quoted, and the accepted set named when it's small
(invalid cols '9' (must be an integer from 2 to 4)).
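Putting the wording rules and the required_attributes declaration together, a handler might be shaped like this. The handler name grid, the run key on the returned table, and the fallback of 2 columns are hypothetical. Only the message text and the required_attributes field follow the conventions described above.

```lua
-- Validate a cols attribute; returns the number, or nil plus a message
-- worded like the existing handlers (bad value quoted, accepted set named).
local function validate_cols(value)
  local n = type(value) == "string" and value:match("^%d+$") and tonumber(value)
  if not n or n < 2 or n > 4 then
    return nil, "invalid cols '" .. tostring(value)
      .. "' (must be an integer from 2 to 4)"
  end
  return n
end

-- Hypothetical handler: warns and falls back when the value is invalid.
local function grid(el, report)
  local cols, problem = validate_cols(el.attributes["cols"])
  if not cols then
    report.warn("grid: " .. problem)
    cols = 2 -- assumed safe default
  end
  return cols -- a real handler would emit output here
end

-- Presence of cols is declared, not checked by hand; the key name
-- "run" is an assumption about the returned table's shape.
local handler = { run = grid, required_attributes = { "cols" } }
```

The handler validates only the value; presence is left to the interface, which matches the split described under Required fields.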