
Ep 7: Co-Creator of Databricks Dolly Mike Conover on Open-Source LLMs

2023/5/11

Unsupervised Learning


Patrick and Jacob sit down with Mike Conover, Staff Software Engineer at Databricks and Co-Creator of Databricks Dolly, the world’s first truly open instruction-tuned LLM, to discuss the magic behind Dolly, Alpaca, and other instruction-tuned LLMs, the unreasonable effectiveness of fine-tuning, how the team got all Databricks employees to help curate the Dolly dataset (hint: Google Forms), and more.

 

(0:00) - Intro

(5:54) - The birth of Dolly

(12:03) - Data curation at Databricks

(15:34) - Advice for building LLMs

(24:10) - The future of instruction-tuning datasets

(30:43) - UI innovation

(38:16) - The future of machine learning infrastructure

(42:05) - How SkipFlag would be different with the tools we have today

(47:01) - What Mike has learned since Dolly

 

With your co-hosts:

@jasoncwarner

  • Former CTO GitHub, VP Eng Heroku & Canonical

@ericabrescia

  • Former COO GitHub, Founder Bitnami (acq’d by VMware)

@patrickachase

  • Partner at Redpoint, Former ML Engineer LinkedIn

@jacobeffron

  • Partner at Redpoint, Former PM Flatiron Health