Data Recipes for Reasoning Models
The OpenThoughts project creates open-source datasets for reasoning models, achieving state-of-the-art results with OpenThinker3-7B, trained on 1.2M examples, available at openthoughts.ai.
https://arxiv.org/abs/2506.04178
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
2405 episodes
All episodes
[QA] A Systematic Analysis of Hybrid Linear Attention 7:55
A Systematic Analysis of Hybrid Linear Attention 15:40
[QA] Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs 8:31
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs 15:32
[QA] Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving 8:09
Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving 21:33
[QA] Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful 7:03
Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation Is Wasteful 18:57
[QA] The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation 7:35
The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation 23:36
[QA] Cascade: Token-Sharded Private LLM Inference 7:04
Cascade: Token-Sharded Private LLM Inference 35:03
[QA] Real-TabPFN: Improving Tabular Foundation Models via Continued Pre-training With Real-World Data 7:28
Real-TabPFN: Improving Tabular Foundation Models via Continued Pre-training With Real-World Data 10:15
[QA] Strategic Intelligence in Large Language Models Evidence from evolutionary Game Theory. 7:21
Strategic Intelligence in Large Language Models Evidence from evolutionary Game Theory. 34:06
[QA] Fast and Simplex: 2-Simplicial Attention in Triton 7:28
Fast and Simplex: 2-Simplicial Attention in Triton 17:55
[QA] Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning 7:21
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning 15:33
[QA] DABstep: Data Agent Benchmark for Multi-step Reasoning 7:54
DABstep: Data Agent Benchmark for Multi-step Reasoning 16:50
[QA] Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling? 8:16
Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling? 16:52