Forge your own: why everyone should own their language models
Smedjan is public today: a pure-Rust engine that trains and runs language models from scratch — no Python, no PyTorch, no cloud. The code is the easy part to explain. The reason is the part I care about.
Whoever can build a language model holds a kind of power. Today that ability sits inside a few companies and a few clouds. We can call it efficient, and it is — but it is also a concentration. The tools that are starting to shape how people write, decide, and work are owned by a handful of firms, and what a handful of firms own, political power eventually reaches for. A capability that everyone leans on and almost no one can reproduce isn't a neutral fact. It is leverage, held by very few.
I don't think the fix is a bigger model from a better-behaved company. I think building one has to be possible for a single person, on hardware they already own — not to match the frontier, but to not need it.
One model that does everything is the wrong shape. It is the most expensive way to solve most problems. You don't run a power station to light one room. Most real work — sorting tickets, drafting a reply, reading a contract, tagging a log — is narrow, and a small model trained for that exact job does it better, faster, and on a machine you control. People specialize; it is how we get good at things. Models will go the same way, and I think we will come to prefer it: a sharp tool for the task over a vast one rented by the hour.
That is what Smedjan is for. The whole pipeline is one Rust binary: train a tokenizer, prepare data, pretrain, distill, fine-tune, align, quantize, export, and serve. It runs on Metal on Apple Silicon and on CUDA on NVIDIA, with the same checkpoint on both. The codebase is four small crates plus the GPU bindings. You can read every line, which means you can trust it, change it, and keep it running when someone else's API changes its terms or its mind.
The headline at the top of this site is written, live, by a 295K-parameter model compiled to WebAssembly and running in your browser, with no server deciding what it says. It is tiny on purpose. The point isn't that it is impressive — it is that the entire loop, from weights to sampled tokens, is yours and runs where you are.
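For readers who want to see what the last step of that loop looks like, here is a minimal sketch of turning a model's output scores (logits) into one sampled token. It is not Smedjan's code: the function names `softmax` and `sample_token`, the temperature parameter, and the choice to pass the random number `u` in from the caller are all assumptions made for illustration.

```rust
/// Turn raw logits into probabilities with a temperature-scaled softmax.
/// Hypothetical helper for illustration; assumes `logits` is non-empty
/// and `temperature` is positive.
fn softmax(logits: &[f32], temperature: f32) -> Vec<f32> {
    // Subtract the largest logit before exponentiating so exp() cannot overflow.
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits
        .iter()
        .map(|&l| ((l - max) / temperature).exp())
        .collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Pick a token id by walking the cumulative distribution until it passes `u`.
/// `u` is a uniform random number in [0, 1) supplied by the caller, which keeps
/// this sketch free of any random-number dependency.
fn sample_token(logits: &[f32], temperature: f32, u: f32) -> usize {
    let probs = softmax(logits, temperature);
    let mut cumulative = 0.0f32;
    for (id, p) in probs.iter().enumerate() {
        cumulative += *p;
        if u < cumulative {
            return id;
        }
    }
    // Rounding can leave the total just under 1.0; fall back to the last token.
    probs.len() - 1
}
```

A lower temperature sharpens the distribution toward the highest-scoring token, and a higher one flattens it. Generation is this step repeated: run the model, sample a token, append it, run again.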
I won't oversell it. Smedjan trains small, specialized models well; it will not train a frontier system on a laptop, and it isn't trying to. Safetensors import reads F32, BF16, and F16 tensors, and a HuggingFace config.json maps straight to a model. Still on the roadmap: bit-exact inference parity with HuggingFace, and backward-pass parity on CUDA for a few remaining kernels. That honesty is part of the point: a tool you own is one whose limits you can actually see.
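To show how little magic there is in reading those formats, here is a sketch of decoding BF16 data into F32. This is an illustration of the general technique, not Smedjan's importer: the names `bf16_to_f32` and `decode_bf16_le` are hypothetical, and the sketch assumes the tensor bytes are little-endian, as the safetensors format stores them.

```rust
/// Widen one bfloat16 value, given as its raw 16 bits, to f32.
/// BF16 is the top half of an IEEE 754 f32, so moving the bits into the
/// high 16 positions and zero-filling the rest is an exact conversion.
fn bf16_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}

/// Decode a little-endian BF16 byte buffer into f32 values.
/// Returns None if the buffer length is not a multiple of two bytes.
/// Hypothetical helper for illustration only.
fn decode_bf16_le(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(2)
            .map(|pair| bf16_to_f32(u16::from_le_bytes([pair[0], pair[1]])))
            .collect(),
    )
}
```

F16 takes more work than this, because its exponent field is narrower and has to be rebiased, and its subnormal values need separate handling. That case is left out of the sketch.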
Open source is the mechanism, not the slogan. Code in the open, weights you can hold, a pipeline one person can run: that is how the ability to build these things stays distributed instead of granted. If that reads as exciting rather than worrying, come help, or back the work. This power is more useful in many hands than in a few.
— Andrei
Watch the repository for releases, or come help forge it.