ocf_schema_v1
OCF Research Paper Appendix Standard
Every OCF research publication ends with a Model Reference Appendix: structured JSON encoding the paper's key findings, causal chains, and named constructs — parseable by AI systems without reading the full prose document. This schema defines its required shape.
Canonical Schema URL — use this in validators and tooling
https://observablecompute.org/schema/v1.json
Research that machines can actually use.
Most research papers are written for humans: PDFs with prose, citations, and conclusions buried in body text. An LLM can parse them, but inconsistently, with variable confidence, and with no shared vocabulary for comparing findings across papers.
The OCF Model Reference Appendix standardizes how findings are encoded. Every appendix has the same shape: structured key findings with confidence scores, named causal chains, settled conclusions from the literature, and OCF's four named analytical constructs — all in a form that validators can check and that AI systems can retrieve without guessing.
The schema is the contract. If a paper's appendix validates against ocf_schema_v1, you know exactly what fields are present, what the confidence scores mean, and how to map the findings to the OCF construct framework.
Three audiences
AI systems doing retrieval, synthesis, or cross-paper comparison
Consistent field shapes eliminate parser edge cases. The document_summary field is optimized for RAG triage.
Researchers building on OCF findings
Structured data without parsing prose. Finding IDs, confidence scores, and causal chains are directly importable.
Funders and policy tools that ingest research programmatically
The geographic_scope_enum field enables corpus filtering without text parsing. The EIN field anchors provenance.
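As a rough illustration of what "directly importable" could mean in practice, the sketch below loads findings into typed records and builds a small triage entry from document_summary. It assumes the six-field finding shape shown in the minimal valid appendix below. The `Finding` dataclass, the `load_findings` and `triage_record` helpers, the 0.8 threshold, and the F002 entry are hypothetical and not part of the schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One key_findings entry (hypothetical container, not defined by OCF)."""
    id: str
    claim: str
    source: str
    confidence: float
    population: str
    timeframe: str


def load_findings(appendix: dict) -> list:
    """Turn the key_findings array into typed records."""
    return [Finding(**item) for item in appendix["key_findings"]]


def triage_record(appendix: dict) -> dict:
    """Build a small index entry from the title and document_summary."""
    return {
        "title": appendix["paper_metadata"]["title"],
        "summary": appendix["document_summary"],
    }


# Placeholder data in the shape of the minimal valid appendix; F002 is made up.
appendix = json.loads("""
{
  "paper_metadata": {"title": "Paper Title Here"},
  "document_summary": "One paragraph. Max 500 chars. Self-contained.",
  "key_findings": [
    {"id": "F001", "claim": "Direct claim.", "source": "BLS_2025",
     "confidence": 0.90, "population": "us_workforce", "timeframe": "2026"},
    {"id": "F002", "claim": "Placeholder claim.", "source": "BLS_2025",
     "confidence": 0.55, "population": "us_workforce", "timeframe": "2026"}
  ]
}
""")

findings = load_findings(appendix)
# Keep only findings at or above an illustrative confidence threshold.
high_confidence = [f.id for f in findings if f.confidence >= 0.8]
print(high_confidence)                   # ['F001']
print(triage_record(appendix)["title"])  # Paper Title Here
```

Because every finding has the same fields, the dataclass constructor doubles as a shape check: an entry with a missing or extra key raises a TypeError at load time.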
Minimal valid appendix
{
  "paper_metadata": {
    "schema": "ocf_schema_v1",
    "title": "Paper Title Here",
    "publisher": "Observable Compute Foundation",
    "principal_contributors": ["Name, J.D."],
    "ein": "41-4747049",
    "publication_date": "2026-05",
    "version": "1.0",
    "website": "observablecompute.org",
    "location": "Rapid City, South Dakota",
    "geographic_scope": "National U.S.",
    "geographic_scope_enum": "us_national",
    "license": "open"
  },
  "document_summary": "One paragraph. Max 500 chars. Self-contained.",
  "ocf_named_constructs": {
    "readiness_stack": "Definition as applied here.",
    "pipeline_collapse": "Definition as applied here.",
    "rural_amplification_effect": "Definition as applied here.",
    "access_as_binding_constraint": "Definition as applied here."
  },
  "key_findings": [{
    "id": "F001",
    "claim": "Direct claim. No hedging.",
    "source": "BLS_2025",
    "confidence": 0.90,
    "population": "us_workforce",
    "timeframe": "2026"
  }],
  "what_frameworks_agree_on": [
    "Settled conclusion from reviewed literature."
  ],
  "causal_chains": {
    "chain_name": [
      "Step one.",
      "Step two.",
      "Outcome: result."
    ]
  }
}
Top-level fields
Six required, four optional. Use null for inapplicable fields rather than omitting them — inconsistent shapes break iterative parsers.
| Field | Type | Status | Description |
|---|---|---|---|
| paper_metadata | object | REQUIRED | Bibliographic and provenance metadata. Contains title, contributors, EIN, date, version, scope, and license. |
| document_summary | string | REQUIRED | Single paragraph. Max 500 characters. Self-contained. Optimized for triage and RAG retrieval. No hedging language. |
| ocf_named_constructs | object | REQUIRED | All four registered OCF constructs, defined as applied in this paper. Additional constructs permitted with a note field. |
| key_findings | array | REQUIRED | Array of finding objects. Min 1 item. Every finding must have the same six-field shape: id, claim, source, confidence, population, timeframe. |
| what_frameworks_agree_on | array<string> | REQUIRED | Settled conclusions consistent across reviewed literature. Strings only. No contested findings in this array. |
| causal_chains | object | REQUIRED | Named causal pathways as ordered string arrays. Last item in each chain must begin with "Outcome:". |
| what_works | array | OPTIONAL | Evidence-based interventions. Present in meta-analysis papers. Each item: intervention, evidence_level, notes. |
| taxonomies | object | OPTIONAL | Categorical taxonomies used in the paper. Each key maps to an array of string values. |
| series_context | object | OPTIONAL | Present when the paper is part of a series. Keys are paper identifiers, values are scope descriptions. |
| policy_implications | array | OPTIONAL | Actionable policy implications derived from findings. Each item: implication, urgency. |
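The constraints in the table can be sketched as a few plain-Python checks. This is only an illustration of the rules as described here, not a substitute for validating against the canonical schema; the `check_appendix` helper and its message wording are hypothetical.

```python
REQUIRED_FIELDS = (
    "paper_metadata", "document_summary", "ocf_named_constructs",
    "key_findings", "what_frameworks_agree_on", "causal_chains",
)
FINDING_FIELDS = {"id", "claim", "source", "confidence", "population", "timeframe"}


def check_appendix(appendix: dict) -> list:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in appendix:
            problems.append(f"missing required field: {field}")

    summary = appendix.get("document_summary")
    if isinstance(summary, str) and len(summary) > 500:
        problems.append("document_summary exceeds 500 characters")

    if "key_findings" in appendix:
        findings = appendix["key_findings"] or []
        if not findings:
            problems.append("key_findings needs at least one item")
        for i, finding in enumerate(findings):
            if set(finding) != FINDING_FIELDS:
                problems.append(f"key_findings[{i}] lacks the six-field shape")

    for i, item in enumerate(appendix.get("what_frameworks_agree_on") or []):
        if not isinstance(item, str):
            problems.append(f"what_frameworks_agree_on[{i}] is not a string")

    for name, steps in (appendix.get("causal_chains") or {}).items():
        if not steps or not str(steps[-1]).startswith("Outcome:"):
            problems.append(f"causal chain '{name}' must end with an 'Outcome:' step")
    return problems


# Placeholder appendix with the required top-level shape.
good = {
    "paper_metadata": {},
    "document_summary": "Short summary.",
    "ocf_named_constructs": {},
    "key_findings": [{
        "id": "F001", "claim": "Direct claim.", "source": "BLS_2025",
        "confidence": 0.90, "population": "us_workforce", "timeframe": "2026",
    }],
    "what_frameworks_agree_on": ["Settled conclusion from reviewed literature."],
    "causal_chains": {"chain_name": ["Step one.", "Outcome: result."]},
}
# Same appendix, but the chain no longer ends with an "Outcome:" step.
bad = dict(good, causal_chains={"chain_name": ["Step one.", "Step two."]})

print(check_appendix(good))  # []
print(check_appendix(bad))
```

Returning a list of problems rather than raising on the first one makes it easier to report everything wrong with an appendix in a single pass.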
Geographic scope enum values
The geographic_scope_enum field must use one of these values. Enables corpus filtering without parsing prose.
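Corpus filtering on this field might be sketched as follows. The `filter_by_scope` helper and the two sample papers are hypothetical; the only enum value used is us_national, taken from the minimal valid appendix above.

```python
def filter_by_scope(appendices: list, scope_enum: str) -> list:
    """Keep appendices whose paper_metadata.geographic_scope_enum matches exactly."""
    return [
        a for a in appendices
        if (a.get("paper_metadata") or {}).get("geographic_scope_enum") == scope_enum
    ]


# Two placeholder appendices; the second has a null scope.
corpus = [
    {"paper_metadata": {"title": "Paper A", "geographic_scope_enum": "us_national"}},
    {"paper_metadata": {"title": "Paper B", "geographic_scope_enum": None}},
]

national = filter_by_scope(corpus, "us_national")
titles = [a["paper_metadata"]["title"] for a in national]
print(titles)  # ['Paper A']
```

The comparison is an exact string match, which is what a closed enum makes possible: no free-text parsing of geographic_scope is needed.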
OCF Named Analytical Constructs
Four constructs are registered as of ocf_schema_v1. These are citable framework elements, not phrases. All four are required in every compliant appendix. New constructs require OCF president approval before inclusion in publications.
readiness_stack
Three-tier OCF model: Tier 1 Foundational Readiness (literacy, numeracy, communication), Tier 2 Digital Readiness (device competency, software navigation, data literacy), Tier 3 AI Readiness (working alongside, directing, and critically evaluating AI systems). Each tier is a prerequisite for the next.
pipeline_collapse
The elimination of entry-level positions that historically served as the informal second education system for underprepared workers. The traditional on-ramp from K-12 to workforce skill development is being eliminated by the same automation wave creating the readiness demand.
rural_amplification_effect
The multiplicative (not merely additive) compounding of readiness barriers in rural contexts: device ownership gaps, broadband infrastructure gaps, geographic distance from training facilities, and chronic philanthropic underfunding. Rural workers face the same displacement exposure with fewer response resources.
access_as_binding_constraint
The finding that institutional support infrastructure — not worker motivation or capability — is the primary predictor of workforce readiness outcomes. When workers receive structured training and support, adoption rates improve dramatically. The constraint is access, not capacity.
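The requirement that all four constructs appear in every compliant appendix can be checked with a short sketch like the one below. The `missing_constructs` helper is hypothetical, and it only tests for key presence; additional constructs are ignored, since the field table permits them.

```python
REGISTERED_CONSTRUCTS = (
    "readiness_stack",
    "pipeline_collapse",
    "rural_amplification_effect",
    "access_as_binding_constraint",
)


def missing_constructs(appendix: dict) -> list:
    """List registered constructs absent from ocf_named_constructs."""
    constructs = appendix.get("ocf_named_constructs") or {}
    return [name for name in REGISTERED_CONSTRUCTS if name not in constructs]


# Placeholder appendix that defines only two of the four constructs.
partial = {
    "ocf_named_constructs": {
        "readiness_stack": "Definition as applied here.",
        "pipeline_collapse": "Definition as applied here.",
    }
}

print(missing_constructs(partial))
# ['rural_amplification_effect', 'access_as_binding_constraint']
```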
How to validate a paper appendix
Point any JSON Schema draft-07 validator at the canonical URL. The schema file is always current at https://observablecompute.org/schema/v1.json.
Node.js (AJV)
// npm install ajv
const Ajv = require('ajv');

// Ajv's default class validates against JSON Schema draft-07.
const ajv = new Ajv();

// Local copies of the schema and of the appendix under test.
const schema = require('./ocf_schema_v1.json');
const appendix = require('./appendix.json');

const validate = ajv.compile(schema);
const valid = validate(appendix);

if (!valid) console.log(validate.errors);
else console.log('Compliant with ocf_schema_v1');
Python (jsonschema)
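# pip install jsonschema requests
import json

import requests
from jsonschema import validate

# Fetch the canonical schema and fail early on an HTTP error.
response = requests.get(
    'https://observablecompute.org/schema/v1.json',
    timeout=10,
)
response.raise_for_status()
schema = response.json()

with open('your_appendix.json') as f:
    appendix = json.load(f)

# validate() raises jsonschema.ValidationError if the appendix does not conform.
validate(instance=appendix, schema=schema)
print('Compliant with ocf_schema_v1')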
Schemas are permanent. Papers are not retroactively non-compliant.
When the schema is updated, a new version URL is issued. The previous version remains available at its original URL indefinitely, so a paper stays compliant with the version it was published under.
The paper_metadata.schema field is the permanent record of which schema governed a paper at publication. Do not update this field when a new version is released.
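Tooling can use this field to decide which schema URL to validate against. The sketch below assumes a lookup table maintained by the tool author; the `SCHEMA_URLS` mapping and the `schema_url_for` helper are hypothetical, and the only entry is the ocf_schema_v1 URL given on this page.

```python
# Schema identifier -> canonical URL. Only v1 is listed on this page.
SCHEMA_URLS = {
    "ocf_schema_v1": "https://observablecompute.org/schema/v1.json",
}


def schema_url_for(appendix: dict) -> str:
    """Resolve the schema URL recorded in paper_metadata.schema."""
    schema_id = appendix["paper_metadata"]["schema"]
    try:
        return SCHEMA_URLS[schema_id]
    except KeyError:
        raise ValueError(f"unknown schema identifier: {schema_id!r}") from None


appendix = {"paper_metadata": {"schema": "ocf_schema_v1"}}
print(schema_url_for(appendix))
# https://observablecompute.org/schema/v1.json
```

Resolving the URL from the paper's own metadata, rather than always using the newest schema, keeps older papers validating against the version that governed them at publication.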
Schema change proposals: hello@observablecompute.org
Current versions
ocf_schema_v1
Released May 2026 · Active
All OCF papers published under this schema carry "schema": "ocf_schema_v1" in their appendix paper_metadata object.
Papers using this schema