Date: Sun, 04 Dec 2022 19:00:00 -0500
<div class="wp-block-jetpack-markdown"><h2>Preamble</h2> <p>This is a <a href="https://www.themachinelearningpodcast.com/predibase-declarative-machine-learning-episode-4/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">cross-over episode</a> from our new show <a href="https://www.themachinelearningpodcast.com?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">The Machine Learning Podcast</a>, the show about going from idea to production with machine learning.</p> <h2>Summary</h2> <p>Deep learning is a revolutionary category of machine learning that accelerates our ability to build powerful inference models. Along with that power comes a great deal of complexity in determining what neural architectures are best suited to a given task, engineering features, scaling computation, etc. Predibase is building on the successes of the Ludwig framework for declarative deep learning and Horovod for horizontally distributing model training. In this episode, Travis Addair, CTO and co-founder of Predibase, explains how they are reducing the burden of model development even further with their managed service for declarative and low-code ML, and how they are integrating with the growing ecosystem of solutions for the full ML lifecycle.</p> <h2>Announcements</h2> <ul> <li>Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great!</li> <li>When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle-tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. 
And now you can launch a managed MySQL, Postgres, or Mongo database cluster in minutes to keep your critical data safe with automated backups and failover. Go to <a href="https://www.pythonpodcast.com/linode?utm_source=rss&utm_medium=rss">pythonpodcast.com/linode</a> and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!</li> <li>Your host is Tobias Macey and today I’m interviewing Travis Addair about Predibase, a low-code platform for building ML models in a declarative format</li> </ul> <h2>Interview</h2> <ul> <li>Introduction</li> <li>How did you get involved in machine learning?</li> <li>Can you describe what Predibase is and the story behind it?</li> <li>Who is your target audience and how does that focus influence your user experience and feature development priorities?</li> <li>How would you describe the semantic differences between your chosen terminology of "declarative ML" and the "autoML" nomenclature that many projects and products have adopted? <ul> <li>Another platform that launched recently with a promise of "declarative ML" is Continual. How would you characterize your relative strengths?</li> </ul> </li> <li>Can you describe how the Predibase platform is implemented? <ul> <li>How have the design and goals of the product changed as you worked through the initial implementation and started working with early customers?</li> <li>The operational aspects of the ML lifecycle are still fairly nascent. How have you thought about the boundaries for your product to avoid getting drawn into scope creep while providing a happy path to delivery?</li> </ul> </li> <li>Ludwig is a core element of your platform. What are the other capabilities that you are layering around and on top of it to build a differentiated product?</li> <li>In addition to the existing interfaces for Ludwig you created a new language in the form of PQL. What was the motivation for that decision? 
<ul> <li>How did you approach the semantic and syntactic design of the dialect?</li> <li>What is your vision for PQL in the space of "declarative ML" that you are working to define?</li> </ul> </li> <li>Can you describe the available workflows for an individual or team that is using Predibase for prototyping and validating an ML model? <ul> <li>Once a model has been deemed satisfactory, what is the path to production?</li> </ul> </li> <li>How are you approaching governance and sustainability of Ludwig and Horovod while balancing your reliance on them in Predibase?</li> <li>What are some of the notable investments/improvements that you have made in Ludwig during your work of building Predibase?</li> <li>What are the most interesting, innovative, or unexpected ways that you have seen Predibase used?</li> <li>What are the most interesting, unexpected, or challenging lessons that you have learned while working on Predibase?</li> <li>When is Predibase the wrong choice?</li> <li>What do you have planned for the future of Predibase?</li> </ul> <h2>Contact Info</h2> <ul> <li><a href="https://www.linkedin.com/in/travisaddair/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">LinkedIn</a></li> <li><a href="https://github.com/tgaddair?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">tgaddair</a> on GitHub</li> <li><a href="https://twitter.com/travisaddair?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">@travisaddair</a> on Twitter</li> </ul> <h2>Parting Question</h2> <ul> <li>From your perspective, what is the biggest barrier to adoption of machine learning today?</li> </ul> <h2>Closing Announcements</h2> <ul> <li>Thank you for listening! Don’t forget to check out our other shows. The <a href="https://www.dataengineeringpodcast.com?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Data Engineering Podcast</a> covers the latest on modern data management. 
The <a href="https://www.themachinelearningpodcast.com?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Machine Learning Podcast</a> helps you go from idea to production with machine learning.</li> <li>Visit the <a href="https://www.pythonpodcast.com?utm_source=rss&utm_medium=rss">site</a> to subscribe to the show, sign up for the mailing list, and read the show notes.</li> <li>If you’ve learned something or tried out a project from the show, then tell us about it! Email <a href="mailto:hosts@podcastinit.com">hosts@podcastinit.com</a> with your story.</li> <li>To help other people find the show, please leave a review on <a href="https://itunes.apple.com/us/podcast/podcast.-init/id981834425?mt=2&uo=6&at=&ct=&utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">iTunes</a> and tell your friends and co-workers.</li> </ul> <h2>Links</h2> <ul> <li><a href="https://predibase.com/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Predibase</a></li> <li><a href="https://horovod.ai/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Horovod</a></li> <li><a href="https://ludwig-ai.github.io/ludwig-docs/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Ludwig</a> <ul> <li><a href="https://www.pythonpodcast.com/ludwig-horovod-distributed-declarative-deep-learning-episode-341/?utm_source=rss&utm_medium=rss">Podcast.__init__ Episode</a></li> </ul> </li> <li><a href="https://en.wikipedia.org/wiki/Support-vector_machine?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Support Vector Machine</a></li> <li><a href="https://hadoop.apache.org/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Hadoop</a></li> <li><a href="https://www.tensorflow.org/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">TensorFlow</a></li> <li><a href="https://eng.uber.com/michelangelo-machine-learning-platform/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Uber Michelangelo</a></li> <li><a 
href="https://en.wikipedia.org/wiki/Automated_machine_learning?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">AutoML</a></li> <li><a href="https://spark.apache.org/mllib/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Spark MLlib</a></li> <li><a href="https://en.wikipedia.org/wiki/Deep_learning?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Deep Learning</a></li> <li><a href="https://pytorch.org/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">PyTorch</a></li> <li><a href="https://continual.ai/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Continual</a> <ul> <li><a href="https://www.dataengineeringpodcast.com/continual-declarative-machine-learning-episode-222/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Data Engineering Podcast Episode</a></li> </ul> </li> <li><a href="https://machinelearning.apple.com/research/overton?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Overton</a></li> <li><a href="https://kubernetes.io/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Kubernetes</a></li> <li><a href="https://docs.ray.io/en/latest/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Ray</a></li> <li><a href="https://developer.nvidia.com/nvidia-triton-inference-server?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Nvidia Triton</a></li> <li><a href="https://github.com/whylabs/whylogs?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">whylogs</a> <ul> <li><a href="https://www.dataengineeringpodcast.com/whylogs-data-logging-data-observability-episode-283/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Data Engineering Podcast Episode</a></li> </ul> </li> <li><a href="https://wandb.ai/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Weights and Biases</a></li> <li><a href="https://mlflow.org/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">MLflow</a></li> <li><a 
href="https://www.comet.com/site/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Comet</a></li> <li><a href="https://en.wikipedia.org/wiki/Confusion_matrix?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Confusion Matrices</a></li> <li><a href="https://www.getdbt.com/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">dbt</a> <ul> <li><a href="https://www.dataengineeringpodcast.com/dbt-data-analytics-episode-81/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Data Engineering Podcast Episode</a></li> </ul> </li> <li><a href="https://pytorch.org/docs/stable/jit.html?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">TorchScript</a></li> <li><a href="https://en.wikipedia.org/wiki/Self-supervised_learning?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Self-supervised Learning</a></li> </ul> <p>The intro and outro music is from <a href="https://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Tales_Of_A_Dead_Fish/Hitmans_Lovesong/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">Hitman’s Lovesong feat. Paola Graziano</a> by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">The Freak Fandango Orchestra</a>/<a href="https://creativecommons.org/licenses/by-sa/3.0/?utm_source=rss&utm_medium=rss" rel="noopener" target="_blank">CC BY-SA 3.0</a></p> </div>