Building Open-Source LLMs with Philosophy | Anastasia Stasenko | Open||Source||Data podcast



4d ago 57:45


Join Charna Parkey as she welcomes Anastasia Stasenko, CEO and co-founder of pleias, to trace her journey from philosophy to building open-source, energy-efficient LLMs. Discover how pleias trains its models exclusively on open data, aiming to set a precedent for ethical and socially acceptable AI. Learn about the challenges and opportunities in creating multilingual models and contributing back to the open-source community.

TIMESTAMPS

[00:00:00] Introducing Anastasia and pleias

[00:02:00] From Philosophy to AI

[00:06:00] The Problem of Generic Models

[00:10:00] Open Weights vs. Open Source vs. Open Science

[00:14:00] Why Open Data Matters

[00:18:00] High-Quality, Specialized Models

[00:22:00] Multilingual Challenges

[00:26:00] Global Inclusion Requires Small Models

[00:30:00] Using and Contributing to Wikidata

[00:38:00] The Future: Specialized Models

[00:48:00] Advice for Newcomers

[00:54:00] Cultural Sensitivity and Data Representation

[00:50:00] Leo’s Takeaways

[00:52:00] Charna on Ethical, Verifiable AI

[00:54:00] Representation vs. Exclusion

[00:56:00] Letting People Be More Human

[00:57:30] Applied, Transformative AI

QUOTES

Charna:
"If you didn’t make it represented in the data, then we’re leaving another culture behind... So which one are you wanting to do, misrepresent them or just completely leave them behind from this technical revolution?"

Anastasia:
"The real issue now is that the lack of diversity in the current AI labs leads to the situation where all LLMs look alike."

Anastasia:
"Being able to design, to find, and also to create the appropriate data mix for large language models is something that we shouldn't really forget about when we talk about the success of what large language models are."

