Best URP Podcasts (2025)

1
40 - Jason Gross on Compact Proofs and Interpretability 2:36:05

2M ago2:36:05

2:36:05

How do we figure out whether interpretability is doing its job? One way is to see if it helps us prove things about models that we care about knowing. In this episode, I speak with Jason Gross about his agenda to benchmark interpretability in this way, and his exploration of the intersection of proofs and modern machine learning. Patreon: https://w…

1
38.8 - David Duvenaud on Sabotage Evaluations and the Post-AGI Future 20:42

3M ago20:42

20:42

In this episode, I chat with David Duvenaud about two topics he's been thinking about: firstly, a paper he wrote about evaluating whether or not frontier models can sabotage human decision-making or monitoring of the same models; and secondly, the difficult situation humans find themselves in in a post-AGI future, even if AI is aligned with human i…

1
38.7 - Anthony Aguirre on the Future of Life Institute 22:39

4M ago22:39

22:39

The Future of Life Institute is one of the oldest and most prominant organizations in the AI existential safety space, working on such topics as the AI pause open letter and how the EU AI Act can be improved. Metaculus is one of the premier forecasting sites on the internet. Behind both of them lie one man: Anthony Aguirre, who I talk with in this …

1
38.6 - Joel Lehman on Positive Visions of AI 15:28

4M ago15:28

15:28

Typically this podcast talks about how to avert destruction from AI. But what would it take to ensure AI promotes human flourishing as well as it can? Is alignment to individuals enough, and if not, where do we go form here? In this episode, I talk with Joel Lehman about these questions. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko…

1
38.5 - Adrià Garriga-Alonso on Detecting AI Scheming 27:41

4M ago27:41

27:41

Suppose we're worried about AIs engaging in long-term plans that they don't tell us about. If we were to peek inside their brains, what should we look for to check whether this was happening? In this episode Adrià Garriga-Alonso talks about his work trying to answer this question. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com…

1
38.4 - Shakeel Hashim on AI Journalism 24:14

5M ago24:14

24:14

AI researchers often complain about the poor coverage of their work in the news media. But why is this happening, and how can it be fixed? In this episode, I speak with Shakeel Hashim about the resource constraints facing AI journalism, the disconnect between journalists' and AI researchers' views on transformative AI, and efforts to improve the st…

1
38.3 - Erik Jenner on Learned Look-Ahead 23:46

6M ago23:46

23:46

Lots of people in the AI safety space worry about models being able to make deliberate, multi-step plans. But can we already see this in existing neural nets? In this episode, I talk with Erik Jenner about his work looking at internal look-ahead within chess-playing neural networks. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.c…

1
39 - Evan Hubinger on Model Organisms of Misalignment 1:45:47

6M ago1:45:47

1:45:47

The 'model organisms of misalignment' line of research creates AI models that exhibit various types of misalignment, and studies them to try to understand how the misalignment occurs and whether it can be somehow removed. In this episode, Evan Hubinger talks about two papers he's worked on at Anthropic under this agenda: "Sleeper Agents" and "Sycop…

1
38.2 - Jesse Hoogland on Singular Learning Theory 18:18

6M ago18:18

18:18

You may have heard of singular learning theory, and its "local learning coefficient", or LLC - but have you heard of the refined LLC? In this episode, I chat with Jesse Hoogland about his work on SLT, and using the refined LLC to find a new circuit in language models. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast T…

1
38.1 - Alan Chan on Agent Infrastructure 24:48

6M ago24:48

24:48

Road lines, street lights, and licence plates are examples of infrastructure used to ensure that roads operate smoothly. In this episode, Alan Chan talks about using similar interventions to help avoid bad outcomes from the deployment of AI agents. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https…

1
38.0 - Zhijing Jin on LLMs, Causality, and Multi-Agent Systems 22:42

7M ago22:42

22:42

Do language models understand the causal structure of the world, or do they merely note correlations? And what happens when you build a big AI society out of them? In this brief episode, recorded at the Bay Area Alignment Workshop, I chat with Zhijing Jin about her research on these questions. Patreon: https://www.patreon.com/axrpodcast Ko-fi: http…

1
37 - Jaime Sevilla on AI Forecasting 1:44:25

8M ago1:44:25

1:44:25

Epoch AI is the premier organization that tracks the trajectory of AI - how much compute is used, the role of algorithmic improvements, the growth in data used, and when the above trends might hit an end. In this episode, I speak with the director of Epoch AI, Jaime Sevilla, about how compute, data, and algorithmic improvements are impacting AI, an…

1
36 - Adam Shai and Paul Riechers on Computational Mechanics 1:48:27

8M ago1:48:27

1:48:27

Sometimes, people talk about transformers as having "world models" as a result of being trained to predict text data on the internet. But what does this even mean? In this episode, I talk with Adam Shai and Paul Riechers about their work applying computational mechanics, a sub-field of physics studying how to predict random processes, to neural net…

1
New Patreon tiers + MATS applications 5:32

8M ago5:32

5:32

Patreon: https://www.patreon.com/axrpodcast MATS: https://www.matsprogram.org Note: I'm employed by MATS, but they're not paying me to make this video.

1
35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization 2:17:24

9M ago2:17:24

2:17:24

How do we figure out what large language models believe? In fact, do they even have beliefs? Do those beliefs have locations, and if so, can we edit those locations to change the beliefs? Also, how are we going to get AI to perform tasks so hard that we can't figure out if they succeeded at them? In this episode, I chat with Peter Hase about his re…

1
34 - AI Evaluations with Beth Barnes 2:14:02

10M ago2:14:02

2:14:02

How can we figure out if AIs are capable enough to pose a threat to humans? When should we make a big effort to mitigate risks of catastrophic AI misbehaviour? In this episode, I chat with Beth Barnes, founder of and head of research at METR, about these questions and more. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast The transcript:…

1
33 - RLHF Problems with Scott Emmons 1:41:24

12M ago1:41:24

1:41:24

Reinforcement Learning from Human Feedback, or RLHF, is one of the main ways that makers of large language models make them 'aligned'. But people have long noted that there are difficulties with this approach when the models are smarter than the humans providing feedback. In this episode, I talk with Scott Emmons about his work categorizing the pro…

1
32 - Understanding Agency with Jan Kulveit 2:22:29

12M ago2:22:29

2:22:29

What's the difference between a large language model and the human brain? And what's wrong with our theories of agency? In this episode, I chat about these questions with Jan Kulveit, who leads the Alignment of Complex Systems research group. Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast The transcript: axrp.net/episode/2024/05/30/epi…

1
31 - Singular Learning Theory with Daniel Murfet 2:32:07

1y ago2:32:07

2:32:07

What's going on with deep learning? What sorts of models get learned, and what are the learning dynamics? Singular learning theory is a theory of Bayesian statistics broad enough in scope to encompass deep neural networks that may help answer these questions. In this episode, I speak with Daniel Murfet about this research program and what it tells …

1
30 - AI Security with Jeffrey Ladish 2:15:44

1y ago2:15:44

2:15:44

Top labs use various forms of "safety training" on models before their release to make sure they don't do nasty stuff - but how robust is that? How can we ensure that the weights of powerful AIs don't get leaked or stolen? And what can AI even do these days? In this episode, I speak with Jeffrey Ladish about security and AI. Patreon: patreon.com/ax…

1
29 - Science of Deep Learning with Vikrant Varma 2:13:46

1y ago2:13:46

2:13:46

In 2022, it was announced that a fairly simple method can be used to extract the true beliefs of a language model on any given topic, without having to actually understand the topic at hand. Earlier, in 2021, it was announced that neural networks sometimes 'grok': that is, when training them on certain tasks, they initially memorize their training …

1
28 - Suing Labs for AI Risk with Gabriel Weil 1:57:30

1y ago1:57:30

1:57:30

How should the law govern AI? Those concerned about existential risks often push either for bans or for regulations meant to ensure that AI is developed safely - but another approach is possible. In this episode, Gabriel Weil talks about his proposal to modify tort law to enable people to sue AI companies for disasters that are "nearly catastrophic…

1
27 - AI Control with Buck Shlegeris and Ryan Greenblatt 2:56:05

1y ago2:56:05

2:56:05

A lot of work to prevent AI existential risk takes the form of ensuring that AIs don't want to cause harm or take over the world---or in other words, ensuring that they're aligned. In this episode, I talk with Buck Shlegeris and Ryan Greenblatt about a different approach, called "AI control": ensuring that AI systems couldn't take over the world, e…

1
26 - AI Governance with Elizabeth Seger 1:57:13

1+ y ago1:57:13

1:57:13

The events of this year have highlighted important questions about the governance of artificial intelligence. For instance, what does it mean to democratize AI? And how should we balance benefits and dangers of open-sourcing powerful AI systems such as large language models? In this episode, I speak with Elizabeth Seger about her research on these …

1
25 - Cooperative AI with Caspar Oesterheld 3:02:09

1+ y ago3:02:09

3:02:09

Imagine a world where there are many powerful AI systems, working at cross purposes. You could suppose that different governments use AIs to manage their militaries, or simply that many powerful AIs have their own wills. At any rate, it seems valuable for them to be able to cooperatively work together and minimize pointless conflict. How do we ensu…

1
24 - Superalignment with Jan Leike 2:08:29

2y ago2:08:29

2:08:29

Recently, OpenAI made a splash by announcing a new "Superalignment" team. Lead by Jan Leike and Ilya Sutskever, the team would consist of top researchers, attempting to solve alignment for superintelligent AIs in four years by figuring out how to build a trustworthy human-level AI alignment researcher, and then using it to solve the rest of the pro…

1
23 - Mechanistic Anomaly Detection with Mark Xu 2:05:52

2y ago2:05:52

2:05:52

Is there some way we can detect bad behaviour in our AI system without having to know exactly what it looks like? In this episode, I speak with Mark Xu about mechanistic anomaly detection: a research direction based on the idea of detecting strange things happening in neural networks, in the hope that that will alert us of potential treacherous tur…

1
Survey, store closing, Patreon 4:26

2y ago4:26

4:26

Very brief survey: bit.ly/axrpsurvey2023 Store is closing in a week! Link: store.axrp.net/ Patreon: patreon.com/axrpodcast Ko-fi: ko-fi.com/axrpodcast

1
22 - Shard Theory with Quintin Pope 3:28:21

2y ago3:28:21

3:28:21

What can we learn about advanced deep learning systems by understanding how humans learn and form values over their lifetimes? Will superhuman AI look like ruthless coherent utility optimization, or more like a mishmash of contextually activated desires? This episode's guest, Quintin Pope, has been thinking about these questions as a leading resear…

1
21 - Interpretability for Engineers with Stephen Casper 1:56:02

2y ago1:56:02

1:56:02

Lots of people in the field of machine learning study 'interpretability', developing tools that they say give us useful information about neural networks. But how do we know if meaningful progress is actually being made? What should we want out of these tools? In this episode, I speak to Stephen Casper about these questions, as well as about a benc…

1
20 - 'Reform' AI Alignment with Scott Aaronson 2:27:35

2y ago2:27:35

2:27:35

How should we scientifically think about the impact of AI on human civilization, and whether or not it will doom us all? In this episode, I speak with Scott Aaronson about his views on how to make progress in AI alignment, as well as his work on watermarking the output of language models, and how he moved from a background in quantum complexity the…

1
Lions, the other teams, and Bears, Oh My! 42:10

2y ago42:10

42:10

Danache and Jerry rank their favorite off season moves so far. The Lamar Jackson Saga continues, AFC North needs and all the latest around the league.By Danache & Jeremy

1
URP 2023 03 22 41:45

2y ago41:45

41:45

URP 2023 03 22By Danache & Jeremy

1
Free Agency Frenzy 42:00

2y ago42:00

42:00

Are you ready for some football? With the Jets on the market for a quarterback, it looks like Lamar Jackson's ball game! Dan Snyder, the Colts, the Titans, the Texans, the Bucks, and even the Patriots all need a quarterback. Who will be the one to take the lead? Tune in to find out!By Danache & Jeremy

1
03-15-23 40:31

2y ago40:31

40:31

03-15-23By Danache & Jeremy

1
MONEY, MONEY, MONEY w. Special Guest Lawrence "Bam Bam" 1:04:06

2y ago1:04:06

1:04:06

Tune is as the team discuss NFL combine, franchise tags, Derek Carr, Lamar Jackson, Aaron Rodgers, and break down what each NFL team needs, starting with the AFC East.By Danache & Jeremy

1
TAG, YOU'RE IT!!!! 54:37

2y ago54:37

54:37

This week Danache and Jerry talk about the lack of black coaches in the NFL, the NFL's unwillingness to hire Eric Bieniemy as a head coach after years of showing he's a capable OC, the Giants look like their heading in a mess of their own. Daniel Jones? Saquon? or both? and more..tune in.By Danache & Jeremy

1
Super Bowl LVII: Legends in the making 43:34

2+ y ago43:34

43:34

Danache and Jerry dive into Super Bowl LVII, the history and importance of the black quarterback, what a win or loss could mean for either Patrick Mahomes and Jalen Hurts. Are these two on the path to becoming one of the greatest to ever play the game? Andy Reid's legacy as a coach. Updates on some coaching moves around the league, LamarJackson and…

1
Store, Patreon, Video 2:39

2+ y ago2:39

2:39

Store: https://store.axrp.net/ Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast Video: https://www.youtube.com/watch?v=kmPFjpEibu0

1
19 - Mechanistic Interpretability with Neel Nanda 3:52:47

2+ y ago3:52:47

3:52:47

How good are we at understanding the internal computation of advanced machine learning models, and do we have a hope at getting better? In this episode, Neel Nanda talks about the sub-field of mechanistic interpretability research, as well as papers he's contributed to that explore the basics of transformer circuits, induction heads, and grokking. …

1
He’s good, but top 5?? 52:01

2+ y ago52:01

52:01

Danache is joined by the Jospeh brothers and they break down what happened in the division championship games, the 49ers and their QB nightmares, coaching decisions that just make sense, Super Bowl preview and much more. And of course a heartfelt farewell to BradyBy Danache & Jeremy

1
Buffalo Blues 48:22

2+ y ago48:22

48:22

Danache and Jerry are joined by What Just Happened Sports founder Melissa Anthony. Tune in as they discuss, her journey, the importance of equipping and empowering women with sports knowledge, preview of this weekends conference championship games, and answering the question of what’s wrong with the Buffalo Bills?…

1
Super Wild Card Weekend 50:54

2+ y ago50:54

50:54

Danache and Jerry talk super wild card weekend, Aaron Rodgers, does he stay, does he go? Lovie Smith and those good ol' Texans. Houston has a problem and it wasn't Lovie Smith. Game picks and more.By Danache & Jeremy

1
Football is Family 55:53

2+ y ago55:53

55:53

Danache is back with special guests as the team discusses Buffalo Bills Safety Damar Hamlin, his road to recovery, the NFL community's next steps and more.By Danache & Jeremy

1
The Grass is Greener Where You Water It 56:58

2+ y ago56:58

56:58

Only thing better than 2 is 3. Jon, Jerry, & Ed talk Deion Sanders and the weight of the move. Are the Cowboys an OBJ away from best team in the league? The Bucs are all but guaranteed to make the playoffs but what's the point? The Bengals could be the most slept on team in the league and the 49ers might just be alright with Mr Irrelevant…

1
To Bench or Not To Bench 55:21

2+ y ago55:21

55:21

The clock is ticking in the NFL & strength of schedule MATTERS. Jonathan & Jerry talk about the great differentiator between contenders and pretenders. You're QB can be YOUR guy, but not THE guy. The decisions that just have to be made, and welcome the fresh faces of the Coach of the Year race.By Danache & Jeremy

1
Turkey Day!!!! 41:24

2+ y ago41:24

41:24

Thanksgiving is here and we preview the best Thanksgiving line up we may have possibly seen in years. Zach Wilson benched..what are the JETS going to do? We might have an idea. Tune in.By Danache & Jeremy

1
Big Kirko 48:52

2+ y ago48:52

48:52

The NFL season is starting to turn up and the next few weeks can change everything. The AFC and NFC East have turned to probably the best divisons in football. Kirk Cousins isn't looking like a joke anymore (well 1 pm kirk that is). Tune is as we cover whats been going on around the league.By Danache & Jeremy

1
$150 Million Dollar Ship 58:36

2+ y ago58:36

58:36

Jon & Ed break down why Jeff Saturday makes sense and why it absolutely doesn't. Why the AFC East is the best division in the league. The biggest disappointments of the league and predictions for Week 10!By Danache & Jeremy