JSON Schema draft-07 — Permanently hosted — Open license

ocf_schema_v1

OCF Research Paper Appendix Standard

Every OCF research publication ends with a Model Reference Appendix: structured JSON encoding the paper's key findings, causal chains, and named constructs — parseable by AI systems without reading the full prose document. This schema defines its required shape.

Canonical Schema URL — use this in validators and tooling

https://observablecompute.org/schema/v1.json

What the Schema Is

Research that machines can actually use.

Most research papers are written for humans: PDFs with prose, citations, and conclusions buried in body text. An LLM can parse them, but inconsistently, with variable confidence, and with no shared vocabulary for comparing findings across papers.

The OCF Model Reference Appendix standardizes how findings are encoded. Every appendix has the same shape: structured key findings with confidence scores, named causal chains, settled conclusions from the literature, and OCF's four named analytical constructs — all in a form that validators can check and that AI systems can retrieve without guessing.

The schema is the contract. If a paper's appendix validates against ocf_schema_v1, you know exactly what fields are present, what the confidence scores mean, and how to map the findings to the OCF construct framework.

Three audiences

AI systems doing retrieval, synthesis, or cross-paper comparison

Consistent field shapes eliminate parser edge cases. The document_summary field is optimized for RAG triage.

Researchers building on OCF findings

Structured data without parsing prose. Finding IDs, confidence scores, and causal chains are directly importable.

Funders and policy tools that ingest research programmatically

The geographic_scope_enum field enables corpus filtering without text parsing. The EIN field anchors provenance.

Minimal valid appendix

{
  "paper_metadata": {
    "schema": "ocf_schema_v1",
    "title": "Paper Title Here",
    "publisher": "Observable Compute Foundation",
    "principal_contributors": ["Name, J.D."],
    "ein": "41-4747049",
    "publication_date": "2026-05",
    "version": "1.0",
    "website": "observablecompute.org",
    "location": "Rapid City, South Dakota",
    "geographic_scope": "National U.S.",
    "geographic_scope_enum": "us_national",
    "license": "open"
  },
  "document_summary": "One paragraph. Max 500 chars. Self-contained.",
  "ocf_named_constructs": {
    "readiness_stack": "Definition as applied here.",
    "pipeline_collapse": "Definition as applied here.",
    "rural_amplification_effect": "Definition as applied here.",
    "access_as_binding_constraint": "Definition as applied here."
  },
  "key_findings": [{
    "id": "F001",
    "claim": "Direct claim. No hedging.",
    "source": "BLS_2025",
    "confidence": 0.90,
    "population": "us_workforce",
    "timeframe": "2026"
  }],
  "what_frameworks_agree_on": [
    "Settled conclusion from reviewed literature."
  ],
  "causal_chains": {
    "chain_name": [
      "Step one.",
      "Step two.",
      "Outcome: result."
    ]
  }
}
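
As an illustration of how a consumer might import findings from an appendix shaped like the one above, the following Python sketch flattens key_findings into rows and keeps those at or above a confidence threshold. The helper name findings_at_or_above, the 0.8 threshold, the trimmed appendix, and the second finding (F002, EXAMPLE_SOURCE) are assumptions made for the example; they are not part of the schema.

```python
import json

# Trimmed, hypothetical appendix text; a real appendix carries every required field.
APPENDIX_JSON = """
{
  "paper_metadata": {"schema": "ocf_schema_v1", "title": "Paper Title Here"},
  "key_findings": [
    {"id": "F001", "claim": "Direct claim. No hedging.", "source": "BLS_2025",
     "confidence": 0.90, "population": "us_workforce", "timeframe": "2026"},
    {"id": "F002", "claim": "A weaker claim.", "source": "EXAMPLE_SOURCE",
     "confidence": 0.55, "population": "us_workforce", "timeframe": "2026"}
  ]
}
"""


def findings_at_or_above(appendix: dict, min_confidence: float) -> list[dict]:
    """Return findings meeting the threshold, each tagged with the paper title."""
    title = appendix["paper_metadata"]["title"]
    return [
        {"paper": title, **finding}
        for finding in appendix["key_findings"]
        if finding["confidence"] >= min_confidence
    ]


appendix = json.loads(APPENDIX_JSON)
strong = findings_at_or_above(appendix, 0.8)
for row in strong:
    print(row["paper"], row["id"], row["confidence"], row["claim"])
```

Because every finding shares the same six fields, the same function can be mapped over many appendices and the rows concatenated without per-paper handling.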

Schema Reference

Top-level fields

Six required, four optional. Use null for inapplicable fields rather than omitting them — inconsistent shapes break iterative parsers.

| Field | Type | Status | Description |
| --- | --- | --- | --- |
| paper_metadata | object | REQUIRED | Bibliographic and provenance metadata. Contains title, contributors, EIN, date, version, scope, and license. |
| document_summary | string | REQUIRED | Single paragraph. Max 500 characters. Self-contained. Optimized for triage and RAG retrieval. No hedging language. |
| ocf_named_constructs | object | REQUIRED | All four registered OCF constructs, defined as applied in this paper. Additional constructs permitted with a note field. |
| key_findings | array | REQUIRED | Array of finding objects. Min 1 item. Every finding must have the same six-field shape: id, claim, source, confidence, population, timeframe. |
| what_frameworks_agree_on | array<string> | REQUIRED | Settled conclusions consistent across reviewed literature. Strings only. No contested findings in this array. |
| causal_chains | object | REQUIRED | Named causal pathways as ordered string arrays. Last item in each chain must begin with "Outcome:". |
| what_works | array | OPTIONAL | Evidence-based interventions. Present in meta-analysis papers. Each item: intervention, evidence_level, notes. |
| taxonomies | object | OPTIONAL | Categorical taxonomies used in the paper. Each key maps to an array of string values. |
| series_context | object | OPTIONAL | Present when the paper is part of a series. Keys are paper identifiers, values are scope descriptions. |
| policy_implications | array | OPTIONAL | Actionable policy implications derived from findings. Each item: implication, urgency. |
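
The rules in the table above can be sketched as a lightweight pre-check in Python, useful for quick feedback before running a full draft-07 validator. This is a minimal sketch and not a substitute for validating against the published schema; the function name check_shape, the wording of the messages, and the deliberately broken sample are assumptions made for illustration.

```python
REQUIRED_TOP_LEVEL = (
    "paper_metadata", "document_summary", "ocf_named_constructs",
    "key_findings", "what_frameworks_agree_on", "causal_chains",
)
FINDING_FIELDS = {"id", "claim", "source", "confidence", "population", "timeframe"}


def check_shape(appendix: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for field in REQUIRED_TOP_LEVEL:
        if field not in appendix:
            problems.append(f"missing required field: {field}")

    summary = appendix.get("document_summary")
    if isinstance(summary, str) and len(summary) > 500:
        problems.append("document_summary exceeds 500 characters")

    findings = appendix.get("key_findings") or []
    if "key_findings" in appendix and not findings:
        problems.append("key_findings needs at least one item")
    for i, finding in enumerate(findings):
        missing = sorted(FINDING_FIELDS - set(finding))
        if missing:
            problems.append(f"key_findings[{i}] is missing: {', '.join(missing)}")

    for name, steps in (appendix.get("causal_chains") or {}).items():
        if not steps or not steps[-1].startswith("Outcome:"):
            problems.append(f"causal chain '{name}' must end with an 'Outcome:' step")
    return problems


# Hypothetical, deliberately non-compliant appendix used to exercise the checks.
sample = {
    "paper_metadata": {"schema": "ocf_schema_v1"},
    "document_summary": "x" * 501,
    "ocf_named_constructs": {},
    "key_findings": [{"id": "F001", "claim": "Direct claim."}],
    "causal_chains": {"chain_name": ["Step one.", "Step two."]},
}
problems = check_shape(sample)
for problem in problems:
    print(problem)
```

The sketch only tests for presence and basic shape; type constraints and the contents of paper_metadata are left to the schema itself.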

Geographic scope enum values

The geographic_scope_enum field must use one of these values. Enables corpus filtering without parsing prose.

global us_national us_national_with_global_context us_midwest us_south_dakota us_rural us_rural_midwest other
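
One way a consumer might use these values for corpus filtering is sketched below in Python. The enum members mirror the eight values listed above; the class name GeographicScope, the helper functions, and the three-paper corpus are hypothetical and exist only for the example.

```python
from enum import Enum


class GeographicScope(Enum):
    """The eight geographic_scope_enum values listed above."""
    GLOBAL = "global"
    US_NATIONAL = "us_national"
    US_NATIONAL_WITH_GLOBAL_CONTEXT = "us_national_with_global_context"
    US_MIDWEST = "us_midwest"
    US_SOUTH_DAKOTA = "us_south_dakota"
    US_RURAL = "us_rural"
    US_RURAL_MIDWEST = "us_rural_midwest"
    OTHER = "other"


def scope_of(appendix: dict) -> GeographicScope:
    """Parse the enum field; raises ValueError for a value outside the list."""
    return GeographicScope(appendix["paper_metadata"]["geographic_scope_enum"])


def filter_by_scope(corpus: list[dict], wanted: set[GeographicScope]) -> list[dict]:
    """Keep only the appendices whose scope is one of the wanted values."""
    return [appendix for appendix in corpus if scope_of(appendix) in wanted]


# Hypothetical corpus; real appendices carry the full set of required fields.
corpus = [
    {"paper_metadata": {"title": "Paper A", "geographic_scope_enum": "us_national"}},
    {"paper_metadata": {"title": "Paper B", "geographic_scope_enum": "us_rural"}},
    {"paper_metadata": {"title": "Paper C", "geographic_scope_enum": "us_rural_midwest"}},
]
rural = filter_by_scope(
    corpus, {GeographicScope.US_RURAL, GeographicScope.US_RURAL_MIDWEST}
)
rural_titles = [appendix["paper_metadata"]["title"] for appendix in rural]
print(rural_titles)
```

Parsing into an enum rather than comparing raw strings means a misspelled scope fails loudly instead of silently dropping a paper from the filtered set.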

Registered Constructs

OCF Named Analytical Constructs

Four constructs are registered as of ocf_schema_v1. These are citable framework elements, not phrases. All four are required in every compliant appendix. New constructs require OCF president approval before inclusion in publications.

readiness_stack

Three-tier OCF model: Tier 1 Foundational Readiness (literacy, numeracy, communication), Tier 2 Digital Readiness (device competency, software navigation, data literacy), Tier 3 AI Readiness (working alongside, directing, and critically evaluating AI systems). Each tier is a prerequisite for the next.

pipeline_collapse

The elimination of entry-level positions that historically served as the informal second education system for underprepared workers. The traditional on-ramp from K-12 to workforce skill development is being eliminated by the same automation wave creating the readiness demand.

rural_amplification_effect

The multiplicative (not merely additive) compounding of readiness barriers in rural contexts: device ownership gaps, broadband infrastructure gaps, geographic distance from training facilities, and chronic philanthropic underfunding. Rural workers face the same displacement exposure with fewer response resources.

access_as_binding_constraint

The finding that institutional support infrastructure — not worker motivation or capability — is the primary predictor of workforce readiness outcomes. When workers receive structured training and support, adoption rates improve dramatically. The constraint is access, not capacity.
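
Since all four constructs are required in every compliant appendix, a tool might report which registered keys are missing from ocf_named_constructs and which keys are additional. The Python sketch below assumes the key names shown above; the function name construct_report and the extra key example_extra_construct are hypothetical.

```python
REGISTERED_CONSTRUCTS = frozenset({
    "readiness_stack",
    "pipeline_collapse",
    "rural_amplification_effect",
    "access_as_binding_constraint",
})


def construct_report(constructs: dict) -> dict:
    """Compare an ocf_named_constructs object against the registered keys."""
    present = set(constructs)
    return {
        "missing": sorted(REGISTERED_CONSTRUCTS - present),
        "additional": sorted(present - REGISTERED_CONSTRUCTS),
    }


# Hypothetical object: one registered construct absent, one unregistered key added.
report = construct_report({
    "readiness_stack": "Definition as applied here.",
    "pipeline_collapse": "Definition as applied here.",
    "access_as_binding_constraint": "Definition as applied here.",
    "example_extra_construct": "Definition as applied here.",
})
print(report)
```

Additional keys are reported rather than rejected, because the schema permits extra constructs under the conditions described in the Schema Reference.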

Validation

How to validate a paper appendix

Point any JSON Schema draft-07 validator at the canonical URL. The schema file is always current at https://observablecompute.org/schema/v1.json.

Node.js (AJV)

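// npm install ajv
const Ajv = require('ajv');

// The default Ajv class handles draft-07 schemas.
// allErrors reports every violation rather than stopping at the first.
const ajv = new Ajv({ allErrors: true });

// Local copy of https://observablecompute.org/schema/v1.json
const schema = require('./ocf_schema_v1.json');
const appendix = require('./appendix.json');

const validate = ajv.compile(schema);
const valid = validate(appendix);

if (!valid) {
  console.error(validate.errors);
  process.exitCode = 1; // signal failure to calling scripts
} else {
  console.log('Compliant with ocf_schema_v1');
}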

Python (jsonschema)

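# pip install jsonschema requests
import json

import requests
from jsonschema import validate

# Fetch the canonical schema; fail early on HTTP errors or a stalled connection.
response = requests.get(
    'https://observablecompute.org/schema/v1.json', timeout=10
)
response.raise_for_status()
schema = response.json()

with open('your_appendix.json', encoding='utf-8') as f:
    appendix = json.load(f)

# validate() raises jsonschema.ValidationError if the appendix is non-compliant.
validate(instance=appendix, schema=schema)
print('Compliant with ocf_schema_v1')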

Versioning Policy

Schemas are permanent. Papers are not retroactively non-compliant.

When the schema is updated, a new version URL is issued. The previous version remains available at its original URL indefinitely, so a paper published under an earlier version can still be validated against the schema that governed it.

The paper_metadata.schema field is the permanent record of which schema governed a paper at publication. Do not update this field when a new version is released.
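
A tool that validates papers from different releases might select the schema URL from this field instead of assuming the latest version. The Python sketch below assumes a lookup table maintained by the tool; only the ocf_schema_v1 entry comes from this page, and the names SCHEMA_URLS and schema_url_for are hypothetical.

```python
# Known schema identifiers mapped to their permanent URLs.
# Only ocf_schema_v1 is documented here; later versions would be added as released.
SCHEMA_URLS = {
    "ocf_schema_v1": "https://observablecompute.org/schema/v1.json",
}


def schema_url_for(appendix: dict) -> str:
    """Return the URL of the schema that governed the paper at publication."""
    declared = appendix["paper_metadata"]["schema"]
    try:
        return SCHEMA_URLS[declared]
    except KeyError:
        raise ValueError(f"unknown schema identifier: {declared!r}") from None


url = schema_url_for({"paper_metadata": {"schema": "ocf_schema_v1"}})
print(url)
```

Raising on an unknown identifier, rather than falling back to the newest schema, keeps a paper from being checked against rules it was never written for.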

Schema change proposals: hello@observablecompute.org

Current versions

ocf_schema_v1

Released May 2026 · Active

v1.json: https://observablecompute.org/schema/v1.json

All OCF papers published under this schema carry "schema": "ocf_schema_v1" in their appendix paper_metadata object.