[QA] Pre-training Large Memory Language Models with Internal and External Knowledge
We introduce Large Memory Language Models (LMLMs) that store factual knowledge externally, enabling targeted lookups and improving verifiability, while maintaining competitive performance on standard benchmarks.
https://arxiv.org/abs/2505.15962
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
2381 episodes
All episodes
1 [QA] Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning 7:21

1 Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning 15:33

1 [QA] DABstep: Data Agent Benchmark for Multi-step Reasoning 7:54

1 DABstep: Data Agent Benchmark for Multi-step Reasoning 16:50

1 [QA] Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling? 8:16

1 Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling? 16:52

1 [QA] LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs 8:19

1 LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs 14:25

1 [QA] Performance Prediction for Large Systems via Text-to-Text Regression 8:40

1 Performance Prediction for Large Systems via Text-to-Text Regression 20:32

1 [QA] From Memories to Maps: Mechanisms of In-Context Reinforcement Learning in Transformers 7:47

1 From Memories to Maps: Mechanisms of In-Context Reinforcement Learning in Transformers 20:44

1 [QA] OmniGen2: Exploration to Advanced Multimodal Generation 7:44

1 OmniGen2: Exploration to Advanced Multimodal Generation 32:16

1 [QA] OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling 7:28

1 OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling 25:52

1 [QA] Potemkin Understanding in Large Language Models 8:04

1 Potemkin Understanding in Large Language Models 17:20

1 [QA] Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test 7:49

1 Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test 18:35

1 [QA] MMSearch-R1: Incentivizing LMMs to Search 8:11


1 [QA] Thought Anchors: Which LLM Reasoning Steps Matter? 7:51

1 Thought Anchors: Which LLM Reasoning Steps Matter? 15:41

1 [QA] Scaling Speculative Decoding with LOOKAHEAD REASONING 8:06

1 Scaling Speculative Decoding with LOOKAHEAD REASONING 22:49

1 [QA] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations 7:55

1 Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations 16:59

1 [QA] Watermarking Autoregressive Image Generation 7:39

1 Watermarking Autoregressive Image Generation 27:33

1 [QA] Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights 6:43

1 Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights 11:26

1 [QA] Flat Channels to Infinity in Neural Loss Landscapes 7:16

1 Flat Channels to Infinity in Neural Loss Landscapes 15:03

1 [QA] Approximating Language Model Training Data from Weights 7:34

1 Approximating Language Model Training Data from Weights 21:37

1 [QA] GenRecal: Generation after Recalibration from Large to Small Vision-Language Models 7:40

1 GenRecal: Generation after Recalibration from Large to Small Vision-Language Models 17:19

1 [QA] ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs 8:30

1 ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs 12:10

1 [QA] Sampling from Your Language Model One Byte at a Time 7:05

1 Sampling from Your Language Model One Byte at a Time 13:35

1 [QA] Don't throw the baby out with the bathwater: How and why deep learning for ARC 7:44

1 Don't throw the baby out with the bathwater: How and why deep learning for ARC 32:30

1 [QA] What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers 7:18