Data Science with Juliet Hougland and Michelle Casbon

Google Cloud Platform Podcast

Episode | Podcast

Date: Wed, 06 Jun 2018 00:00:00 +0000

<p><a href="https://twitter.com/j_houg">Juliet Hougland</a> and <a href="https://twitter.com/texasmichelle">Michelle Casbon</a> are on the podcast this week to talk about data science with <a href="https://twitter.com/nyghtowl">Melanie</a> and <a href="https://twitter.com/Neurotic">Mark</a>. We had a great discussion about methodology, applications, tools, pipelines, challenges and resources. Juliet shared insights into the unique data science ownership workflow at Stitch Fix, from idea to deployment, and Michelle dove into how Kubeflow helps drive reliability in model development and deployment.</p> <h5 id="juliet-hougland">Juliet Hougland</h5> <p><a href="https://twitter.com/j_houg">Juliet Hougland</a> leads the Workflow, Environment, and Execution team at <a href="https://www.stitchfix.com/">Stitch Fix</a>. She is a data scientist and engineer with expertise in computational mathematics and years of hands-on machine learning and big data experience. She has built and deployed production ML models, advised Fortune 500 companies on infrastructure and worked on a variety of open source projects (Apache Spark, Scalding, and Kiji) at the intersection of big data and machine learning.</p> <h5 id="michelle-casbon">Michelle Casbon</h5> <p><a href="https://twitter.com/texasmichelle">Michelle Casbon</a> is a Senior Engineer on the Google Cloud Platform Developer Relations team, where she focuses on open source contributions and community engagement for machine learning and big data tools. Prior to joining Google, she was at several San Francisco-based startups as a Senior Engineer and Director of Data Science. Within these roles, she built and shipped machine learning products on distributed platforms using both AWS and GCP. 
Michelle’s development experience spans more than a decade and has primarily focused on multilingual natural language processing, system architecture and integration, and continuous delivery pipelines for machine learning applications. She especially loves working with open source projects and is an active contributor to Kubeflow. Michelle holds a master’s degree from the University of Cambridge.</p> <h5 id="cool-things-of-the-week">Cool things of the week</h5> <ul> <li>Sandeep Dinesh: Kubernetes Best Practices <a href="https://www.youtube.com/watch?v=ajbC1yTW2x">YouTube</a></li> <li>CNCF TOC voted to accept Helm as an incubation-level hosted project <a href="https://www.cncf.io/blog/2018/06/01/cncf-to-host-helm/">site</a></li> <li>Android P in Beta <a href="https://blog.google/products/android/android-p/">blog</a></li> <li>Agones 0.2.0 <a href="https://agones.dev">site</a></li> <li>Securing cloud-connected devices with Cloud IoT and Microchip <a href="https://cloud.google.com/blog/big-data/2018/05/securing-cloud-connected-devices-with-cloud-iot-and-microchip">blog</a></li> </ul> <h5 id="interview">Interview</h5> <ul> <li>flotilla-os <a href="https://github.com/stitchfix/flotilla-os">repo</a></li> <li>Kubeflow <a href="https://github.com/kubeflow/kubeflow">repo</a></li> <li>Cloud Dataproc <a href="https://cloud.google.com/dataproc/">site</a> &amp; <a href="https://cloud.google.com/dataproc/docs/">docs</a></li> <li>Spark <a href="http://spark.apache.org/">site</a> &amp; <a href="https://spark.apache.org/community.html">community site</a></li> <li>scikit-learn <a href="http://scikit-learn.org/stable/index.html">site</a></li> <li>xgboost <a href="https://github.com/dmlc/xgboost">repo</a></li> <li>PyTorch <a href="https://pytorch.org/">site</a></li> <li>TensorFlow <a href="https://www.tensorflow.org/">site</a> and <a href="https://github.com/tensorflow">github</a></li> <li>Kubernetes <a href="https://kubernetes.io">site</a> and <a 
href="https://github.com/kubernetes/kubernetes">github</a></li> <li>Introducing ultramem Google Compute Engine machine types <a href="https://cloudplatform.googleblog.com/2018/05/Introducing-ultramem-Google-Compute-Engine-machine-types.html">blog</a></li> <li>#114 Machine Learning Bias and Fairness with Timnit Gebru and Margaret Mitchell <a href="https://www.gcppodcast.com/post/episode-114-machine-learning-bias-and-fairness-with-timnit-gebru-and-margaret-mitchell/">podcast</a></li> <li>Machine Learning Flashcards <a href="https://machinelearningflashcards.com/">site</a></li> <li>Open Source Data Science Masters <a href="http://datasciencemasters.org/">site</a></li> <li>DockerCon SF <a href="https://2018.dockercon.com/">site</a></li> </ul> <h5 id="question-of-the-week">Question of the week</h5> <p>If I have written a gRPC service, but I’m using a language or platform that isn’t supported, is there any way I can access it as REST?</p> <ul> <li><a href="https://github.com/grpc-ecosystem/grpc-gateway">grpc-gateway</a></li> <li><a href="https://www.envoyproxy.io/">Envoy proxy</a></li> <li><a href="https://cloud.google.com/endpoints/docs/grpc/transcoding">Transcoding</a></li> </ul> <h5 id="where-can-you-find-us-next">Where can you find us next?</h5> <p>Mark is speaking at the <a href="https://www.meetup.com/San-Francisco-Kubernetes-Meetup/events/251242006">San Francisco Kubernetes Meetup: Scaling Game Servers and the Conduit Service Mesh</a> on June 14th.</p> <p>Melanie is speaking at a joint <a href="http://wimlds.org">WiMLDS</a> and <a href="http://www.pyladies.com">PyLadies</a> event, “Paths to Data Science”, on June 26th and at Stanford AI4ALL on June 28th.</p>