Research
Notes from the training lab.
Benchmark methodology, training postmortems, and hardware notes. No hype — working notes from the team.
2026-04-21 · 2 min read
Training a 14B LLM on a Tesla V100 in 2026
How we fine-tune Qwen3-14B on a single V100 16GB without renting H100s. Unsloth, LoRA, and the unglamorous bits nobody writes about.
2026-04-19 · 2 min read
The A-rate benchmark: how we grade crypto LLMs
Our 45-question crypto benchmark is graded on an A–F scale by two reviewers plus a rubric matcher. Here's exactly how, and why it's harder to game than multiple-choice.
2026-04-17 · 2 min read
DPO regressed our model. Here's what happened.
We ran DPO on top of a strong SFT model (87% A-rate) and the score dropped to 78%. Full postmortem with logs, not marketing.
2026-04-15 · 2 min read
Open vs closed models on crypto Q&A: what the numbers actually say
Sovereign v2 (14B) beats GPT-4o and Claude on our 45-question crypto benchmark — but the interesting stories are in *where* and *why*.