Dataset Library
Reasoning traces for distilling frontier models
Curated datasets built by querying Claude, GPT, Gemini, and other frontier models with diverse coding, math, and reasoning prompts. Designed for training small open models that still think clearly.
What's included
Each dataset includes detailed reasoning traces, carefully filtered conversations, and metadata ready for fine-tuning. Listings are synced hourly from Hugging Face.
claude-4.5-opus-high-reasoning-250x
Distilled from Claude Opus 4.5
gemini-3-pro-preview-high-reasoning-250x
Distilled from Gemini 3 Pro
gpt-5.2-high-reasoning-250x
gemini-3-pro-preview-high-reasoning-1000x
Distilled from Gemini 3 Pro
claude-sonnet-4.5-high-reasoning-250x
Distilled from Claude Sonnet 4.5
Pony-Alpha-15k
convo-v1
Step-3.5-Flash-2600x
gpt-5.1-codex-max-1000x
Distilled from GPT-5.1
claude-haiku-4.5-high-reasoning-1700x
deepseek-v3.2-speciale-OpenCodeReasoning-3k
Distilled from DeepSeek v3.2 Speciale
MiniMax-M2.1-Code-SFT
deepseek-v3.2-speciale-openr1-math-3k
Distilled from DeepSeek v3.2 Speciale
gemini-3-flash-preview
deepseek-v3.2-speciale-1000x
Distilled from DeepSeek v3.2 Speciale
glm-4.7-2000x
claude-haiku-4.5-1700x
MiniMax-M2.1-8800x
gemini-2.5-flash-11000x
Distilled from Gemini 2.5 Flash
gpt-5.1-high-reasoning-1000x
Distilled from GPT-5.1
kimi-k2-thinking-1000x
Distilled from Kimi K2
sherlock-thinking-alpha-11000x
minimax-m2.1-1000x
gpt-5-codex-1000x
Distilled from GPT-5 Codex
glm-4.7-350x
gemini-3-flash-preview-1000x
Gemini-3-Flash-Preview-VIBE
gpt-5-codex-250x
Distilled from GPT-5 Codex
grok-code-fast-1-1000x
Distilled from Grok
brainstorm-v3.1-grok-4-fast-200x
Distilled from Grok
mistral-small-creative-500x
Distilled from Mistral
Aurora-Alpha-15.5k
gemini-2.5-flash-lite-2509-preview-1000x
Distilled from Gemini 2.5 Flash
polaris-alpha-1000x
MiMo-V2-Flash-2300x
glm-4.6-250x
Distilled from GLM 4.6
open-moderator-v1
kimi-k2-thinking-250x
Distilled from Kimi K2
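Many of the listing names above end in a token such as 250x, 1000x, 3k, or 15.5k. The page does not say what these mean; the sketch below assumes they give an approximate example count, with "x" as a plain multiplier and "k" as thousands. The helper name approx_size is hypothetical and is not part of any published tooling for these datasets.

```python
import re
from typing import Optional

# Assumption: a trailing "-<number>x" or "-<number>k" token is an approximate
# example count ("x" = units, "k" = thousands). The listing does not confirm this.
_SIZE_RE = re.compile(r"-(\d+(?:\.\d+)?)(x|k)$", re.IGNORECASE)


def approx_size(name: str) -> Optional[int]:
    """Return the assumed example count encoded in a listing name, or None."""
    match = _SIZE_RE.search(name)
    if match is None:
        return None  # e.g. "convo-v1" carries no size token
    value = float(match.group(1))
    unit = match.group(2).lower()
    return round(value * (1000 if unit == "k" else 1))


if __name__ == "__main__":
    for listing in ("claude-4.5-opus-high-reasoning-250x",
                    "Aurora-Alpha-15.5k",
                    "convo-v1"):
        print(listing, approx_size(listing))
```

Names without such a token (for example convo-v1 or gemini-3-flash-preview) return None, so a caller can tell "size unknown" apart from a parsed number.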