Accelerating Diffusion LLMs via Adaptive Parallel Decoding
The paper introduces adaptive parallel decoding (APD), which speeds up diffusion large language models by dynamically adjusting token sampling, improving throughput while maintaining quality relative to autoregressive models.
https://arxiv.org/abs/2506.00413
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
2381 episodes
All episodes
[QA] Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning 7:21
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning 15:33
[QA] DABstep: Data Agent Benchmark for Multi-step Reasoning 7:54
DABstep: Data Agent Benchmark for Multi-step Reasoning 16:50
[QA] Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling? 8:16
Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling? 16:52
[QA] LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs 8:19
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs 14:25
[QA] Performance Prediction for Large Systems via Text-to-Text Regression 8:40
Performance Prediction for Large Systems via Text-to-Text Regression 20:32
[QA] From Memories to Maps: Mechanisms of In-Context Reinforcement Learning in Transformers 7:47
From Memories to Maps: Mechanisms of In-Context Reinforcement Learning in Transformers 20:44
[QA] OmniGen2: Exploration to Advanced Multimodal Generation 7:44
OmniGen2: Exploration to Advanced Multimodal Generation 32:16
[QA] OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling 7:28
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling 25:52
[QA] Potemkin Understanding in Large Language Models 8:04
Potemkin Understanding in Large Language Models 17:20
[QA] Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test 7:49
Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test 18:35
[QA] MMSearch-R1: Incentivizing LMMs to Search 8:11
[QA] Thought Anchors: Which LLM Reasoning Steps Matter? 7:51
Thought Anchors: Which LLM Reasoning Steps Matter? 15:41
[QA] Scaling Speculative Decoding with LOOKAHEAD REASONING 8:06
Scaling Speculative Decoding with LOOKAHEAD REASONING 22:49
[QA] Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations 7:55
Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations 16:59
[QA] Watermarking Autoregressive Image Generation 7:39
Watermarking Autoregressive Image Generation 27:33
[QA] Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights 6:43
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights 11:26
[QA] Flat Channels to Infinity in Neural Loss Landscapes 7:16
Flat Channels to Infinity in Neural Loss Landscapes 15:03
[QA] Approximating Language Model Training Data from Weights 7:34
Approximating Language Model Training Data from Weights 21:37
[QA] GenRecal: Generation after Recalibration from Large to Small Vision-Language Models 7:40
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models 17:19
[QA] ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs 8:30
ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs 12:10
[QA] Sampling from Your Language Model One Byte at a Time 7:05
Sampling from Your Language Model One Byte at a Time 13:35
[QA] Don't throw the baby out with the bathwater: How and why deep learning for ARC 7:44
Don't throw the baby out with the bathwater: How and why deep learning for ARC 32:30
[QA] What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers 7:18
What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers 19:43
[QA] MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention 8:28
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention 25:05
[QA] Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation 8:10
Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation 16:59
[QA] TreeRL: LLM Reinforcement Learning with On-Policy Tree Search 7:17
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search 19:00
[QA] Solving Inequality Proofs with Large Language Models 8:20
Solving Inequality Proofs with Large Language Models 23:49
[QA] Reinforcement Learning Teachers of Test Time Scaling 7:54
Reinforcement Learning Teachers of Test Time Scaling 22:37
[QA] Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers 7:05
Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers 18:28
[QA] Spurious Rewards: Rethinking Training Signals in RLVR 7:41
Spurious Rewards: Rethinking Training Signals in RLVR 30:13
[QA] Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation 8:08
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation 24:30
[QA] Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening 7:45
Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening 16:56
[QA] Why Gradients Rapidly Increase Near the End of Training 7:00
Why Gradients Rapidly Increase Near the End of Training 11:24
[QA] GEM: Empowering LLM for both Embedding Generation and Language Understanding 7:41
GEM: Empowering LLM for both Embedding Generation and Language Understanding 20:38
[QA] HYPERSTEER: Activation Steering at Scale with Hypernetworks 7:49
HYPERSTEER: Activation Steering at Scale with Hypernetworks 9:15
[QA] Accelerating Diffusion LLMs via Adaptive Parallel Decoding 8:08
Accelerating Diffusion LLMs via Adaptive Parallel Decoding 21:09
[QA] Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning 7:34
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning 16:44
[QA] Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning 8:08
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning 23:02