Content provided by Philip - Host of AI Explained YT. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Philip - Host of AI Explained YT or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ppacc.player.fm/legal.

Claude 3.7 is More Significant than its Name Implies (ft DeepSeek R2 + GPT 4.5 coming soon)

27:39
 

Claude 3.7 is here, hot on the heels of Grok 3 and a host of other developments, but how good is it really? And what does it say about the next few months in AI? I’ve read the papers, played with the model for hours, and benchmarked it on SimpleBench. Things aren’t slowing down. Plus the latest in humanoid robots, led by Helix, with Protoclone as the one that freaked me out. And reports of GPT 4.5 and DeepSeek R2.

GraySwan Competition! https://app.grayswan.ai/arena/challenge/agent-red-teaming

https://x.com/GraySwanAI/status/1894084923260043282

Chapters:

00:00 - Introduction

01:25 - Claude 3.7 New Stats/Demos

05:22 - 128k Output

06:13 - Pokemon

06:58 - Just a tool?

09:54 - DeepSeek R2

10:20 - Claude 3.7 System Card/Paper Highlights

17:18 - Simple Record Score/Competition

20:37 - Grok 3 + Redteaming prizes

22:26 - Google Co-scientist

24:02 - Humanoid Robot Developments

3.7 Release Notes: https://www.anthropic.com/news/claude-3-7-sonnet

vs o3 and Grok 3: https://x.com/12exyz/status/1891723056931827959

Extended Thinking: https://www.anthropic.com/research/visible-extended-thinking?s=09

System Prompt: https://docs.anthropic.com/en/release-notes/system-prompts#feb-24th-2025

System Card: https://assets.anthropic.com/m/785e231869ea8b3b/original/claude-3-7-sonnet-system-card.pdf

Unfaithful CoT: https://arxiv.org/pdf/2305.04388

Original Constitution: https://www.anthropic.com/news/claudes-constitution

Responsible Scaling Policy: https://assets.anthropic.com/m/24a47b00f10301cd/original/Anthropic-Responsible-Scaling-Policy-2024-10-15.pdf

Amodei and Hassabis: https://www.youtube.com/watch?v=4poqjZlM8Lo

SimpleBench: https://simple-bench.com/

400M Weekly Users: https://x.com/bradlightcap/status/1892579908179882057

Grok 3 Jailbroken: https://x.com/LinusEkenstam/status/1893832876581380280

Google Co-Scientist: https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/

But Hassabis Says Years Away: https://www.youtube.com/watch?v=yr0GiSgUvPU&t=156s

DeepSeek R2 Reuters: https://www.reuters.com/technology/artificial-intelligence/deepseek-rushes-launch-new-ai-model-china-goes-all-2025-02-25/

Protoclone: https://www.reddit.com/r/interestingasfuck/comments/1it9rpp/protoclone_the_worlds_first_bipedal/

Helix: https://www.figure.ai/news/helix

TechTrance: https://www.youtube.com/@TheTechTrance/videos

GPT 4.5 Soon:

24 episodes

