Technology

Private by architecture

The Primer's privacy promises aren't policy; they're structure. The system is designed so that, by default, learner data never leaves the device and the entire experience works with the network cable cut.

Local-first, cloud-optional

By default the Primer runs air-gapped: language model, speech recognition, speech synthesis, knowledge base, and the child's learning record all live and execute on the device. Families who opt in to cloud inference for better conversation quality send only the conversation turns needed for each request; nothing is stored server-side, and the learner model never travels at all.

This is the inverse of the usual ed-tech architecture, where the child's data lives on the company's servers and the product stops working when the subscription does.
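One way to make "the learner model never travels" structural rather than procedural is to give the cloud request a type that has no place to put learner data. The sketch below is illustrative only: `LearnerModel`, `Turn`, `CloudRequest`, and `Session` are hypothetical names, not the Primer's actual types.

```rust
// Hypothetical types for illustration; not taken from the Primer's codebase.

/// Everything the device knows about the learner. Stays local.
#[derive(Debug, Clone, Default)]
struct LearnerModel {
    concepts_seen: Vec<String>,
    comprehension_scores: Vec<f32>,
}

#[derive(Debug, Clone, PartialEq)]
enum Role {
    Child,
    Primer,
}

#[derive(Debug, Clone, PartialEq)]
struct Turn {
    role: Role,
    text: String,
}

/// The only payload a cloud request carries: conversation turns.
/// There is deliberately no field that could hold a `LearnerModel`.
#[derive(Debug, PartialEq)]
struct CloudRequest {
    turns: Vec<Turn>,
}

struct Session {
    learner: LearnerModel,
    history: Vec<Turn>,
}

impl Session {
    /// Builds a request from the most recent `max_turns` turns only.
    /// `self.learner` is never read here.
    fn cloud_request(&self, max_turns: usize) -> CloudRequest {
        let start = self.history.len().saturating_sub(max_turns);
        CloudRequest {
            turns: self.history[start..].to_vec(),
        }
    }
}
```

With this shape, sending the learner record to a server would require changing a type definition, which is easy to spot in review.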

One engine, many brains

The pedagogical engine — the part that decides whether this moment calls for a guiding question, scaffolding, or a comprehension probe — is completely decoupled from the language model behind it. Swapping the model is a configuration choice, not a rewrite. The same engine runs against:

Cloud models

Anthropic's Claude or any OpenAI-compatible provider, for the highest conversation quality where connectivity and trust allow.

Local models

Open-weight models running in-process on a laptop or desktop — CPU or GPU — with no network at all.

Phone neural processors

A 4-billion-parameter model on a Snapdragon phone's Hexagon NPU: a full multi-turn conversation runs entirely on-device at ~25.7 tokens/s, first token in under a second, validated on hardware in June 2026.

Hybrid routing

An optional per-turn router keeps routine conversation local and escalates only complex turns to a cloud model — opt-in, never default.
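The decoupling and the opt-in router described above can be sketched with a single trait that every backend implements. This is a minimal illustration under stated assumptions: the trait `LanguageModel`, the `Router` type, the `StubModel` stand-in, and the word-count complexity heuristic are all hypothetical and not the Primer's real API or routing logic.

```rust
// Hypothetical sketch; names and the complexity heuristic are assumptions.

/// Any backend: a cloud API, an in-process open-weight model, or a phone NPU.
trait LanguageModel {
    fn name(&self) -> &str;
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Stand-in backend that returns a canned reply instead of calling a model.
struct StubModel {
    label: &'static str,
}

impl LanguageModel for StubModel {
    fn name(&self) -> &str {
        self.label
    }

    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("[{}] reply to: {}", self.label, prompt))
    }
}

/// Per-turn router. `cloud` stays `None` unless the family has opted in.
struct Router {
    local: Box<dyn LanguageModel>,
    cloud: Option<Box<dyn LanguageModel>>,
    /// Assumed heuristic: turns with more words than this count as complex.
    complexity_threshold: usize,
}

impl Router {
    /// The default configuration: local only.
    fn local_only(local: Box<dyn LanguageModel>) -> Self {
        Router { local, cloud: None, complexity_threshold: 40 }
    }

    /// Explicit opt-in to cloud escalation.
    fn with_cloud(mut self, cloud: Box<dyn LanguageModel>) -> Self {
        self.cloud = Some(cloud);
        self
    }

    /// Chooses a backend for one turn; falls back to local in every other case.
    fn pick(&self, turn: &str) -> &dyn LanguageModel {
        let complex = turn.split_whitespace().count() > self.complexity_threshold;
        match self.cloud.as_deref() {
            Some(cloud) if complex => cloud,
            _ => &*self.local,
        }
    }
}

/// The engine side sees only the trait, never a concrete backend.
fn respond(router: &Router, turn: &str) -> Result<String, String> {
    router.pick(turn).complete(turn)
}
```

Because `respond` depends only on the trait, swapping a backend means constructing a different `Box<dyn LanguageModel>`, which is the kind of change a configuration file can drive.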

What works today

This is running software, demonstrable end-to-end on a developer laptop and (text mode) on an Android phone:

Streaming Socratic conversation with session persistence, resume, and long-term memory that spans hours of dialogue without losing the thread.
A live learner model — an engagement classifier, a concept extractor, and a comprehension assessor run quietly behind every exchange, recording what the child encountered and how deeply they understood it.
Spaced-repetition vocabulary woven passively into conversation at expanding intervals — no drilling, no quizzes.
A curated children's knowledge base — hand-drafted passages plus licensed children's-encyclopedia layers in English and German, searched with hybrid lexical + semantic retrieval, with full source attribution.
A complete offline voice loop — voice activity detection, on-device transcription, generation, and natural speech synthesis, with strict turn-taking: the Primer never speaks over the child, and never lets itself be trained to interrupt.
A desktop app and CLI — including a developer-facing view of every pedagogical decision the engine makes, per turn.
Multilingual prompt packs — English and German production-ready, Hindi in preview; the architecture keeps every user-facing string out of the code so new languages are data, not development.
Safety plumbing — chain-of-thought from reasoning models is stripped before it can reach a child, and inference failures degrade to friendly, age-appropriate messages.
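The "expanding intervals" behind the spaced-repetition vocabulary item can be illustrated with a simple doubling schedule. This is a sketch under assumptions, not the Primer's actual scheduler: the one-day base interval, the doubling rule, the cap, and the names `VocabItem`, `interval_secs`, and `due_words` are all hypothetical.

```rust
// Hypothetical schedule for illustration; constants and names are assumptions.

const BASE_INTERVAL_SECS: u64 = 60 * 60 * 24; // assumed first gap: one day
const MAX_DOUBLINGS: u32 = 8; // assumed cap so gaps stop growing eventually

/// Gap before the next exposure: doubles with each prior exposure, up to the cap.
fn interval_secs(exposures: u32) -> u64 {
    BASE_INTERVAL_SECS << exposures.min(MAX_DOUBLINGS)
}

#[derive(Debug, Clone, PartialEq)]
struct VocabItem {
    word: String,
    /// How many times the word has been woven into conversation so far.
    exposures: u32,
    /// When the word next becomes due, in seconds on the caller's clock.
    due_at_secs: u64,
}

impl VocabItem {
    fn new(word: &str, now_secs: u64) -> Self {
        VocabItem {
            word: word.to_string(),
            exposures: 0,
            due_at_secs: now_secs + interval_secs(0),
        }
    }

    /// Call after the word has appeared naturally in a turn.
    fn record_exposure(&mut self, now_secs: u64) {
        self.exposures = self.exposures.saturating_add(1);
        self.due_at_secs = now_secs + interval_secs(self.exposures);
    }
}

/// Words the conversation could weave in right now, longest-overdue first.
fn due_words(items: &[VocabItem], now_secs: u64) -> Vec<&str> {
    let mut due: Vec<&VocabItem> = items
        .iter()
        .filter(|item| item.due_at_secs <= now_secs)
        .collect();
    due.sort_by_key(|item| item.due_at_secs);
    due.into_iter().map(|item| item.word.as_str()).collect()
}
```

A prompt builder could pass the result of `due_words` to the model as optional vocabulary to use if it fits the topic, which keeps the repetition passive rather than quiz-like.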
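The safety plumbing item can be sketched as a small filter between the model and the child. The example assumes the reasoning model wraps its chain-of-thought in `<think>...</think>` tags, a convention used by several open-weight reasoning models; the Primer's actual delimiter handling may differ, and the function names here are hypothetical. The fallback message is passed in as a parameter, on the assumption that it comes from a language pack rather than from code.

```rust
// Hypothetical sketch; the `<think>` tag convention is an assumption.

/// Removes every `<think>...</think>` span from model output.
/// An unclosed `<think>` drops the rest of the text, so a truncated
/// generation cannot leak partial reasoning.
fn strip_reasoning(raw: &str) -> String {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    let mut out = String::with_capacity(raw.len());
    let mut rest = raw;
    while let Some(start) = rest.find(OPEN) {
        // Keep the visible text before the reasoning span.
        out.push_str(&rest[..start]);
        match rest[start..].find(CLOSE) {
            // Skip past the closing tag and keep scanning.
            Some(end) => rest = &rest[start + end + CLOSE.len()..],
            // No closing tag: discard everything that follows.
            None => rest = "",
        }
    }
    out.push_str(rest);
    out.trim().to_string()
}

/// Turns any inference result into something safe to show a child:
/// cleaned text on success, the friendly fallback on error or empty output.
fn child_safe_reply(result: Result<String, String>, fallback: &str) -> String {
    match result {
        Ok(raw) => {
            let cleaned = strip_reasoning(&raw);
            if cleaned.is_empty() {
                fallback.to_string()
            } else {
                cleaned
            }
        }
        Err(_) => fallback.to_string(),
    }
}
```

Treating an unclosed tag as "drop the remainder" errs on the side of showing less: a reply cut short falls back to the friendly message instead of exposing raw reasoning.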

Validated platforms

macOS and Linux — full experience: desktop app, CLI, and voice loop (with native Apple speech engines on macOS).
Android (Snapdragon 8 Elite) — text conversation validated on-device May 2026; the neural-processor inference pipeline validated June 2026 at ~9.4 tokens/s, ~190 ms to first token, 57 °C peak — comfortably inside thermal limits.

The path to a dedicated, child-friendly device — the Primer as an object a child holds, not an app on a parent's laptop — runs through this phone-class silicon. See the roadmap.

Built to be inspected

The Primer is written in Rust — a single codebase from the pedagogy engine down to the device integration — and licensed under the AGPL. For a product whose users are children, "trust us" is not an acceptable privacy model; the only honest answer is source code anyone can read, build, and verify. The knowledge corpus ships with per-passage licensing and attribution, and every claim on this page corresponds to code you can run from the public repository.