Monitoring, Metrics and M3, with Martin Mao and Rob Skillington

Kubernetes Podcast from Google

Date: Tue, 17 Dec 2019 20:23:45 +0000

<p><a href="https://twitter.com/martin_c_mao">Martin Mao</a> and <a href="https://twitter.com/roskilli">Rob Skillington</a> are the co-founders of Chronosphere, where they serve as CEO and CTO respectively. They both worked on the monitoring team at Uber, where they created M3: a metrics platform with an open source time-series database built for scale. They join <a href="https://kubernetespodcast.com/about">Craig and Adam</a> to talk about monitoring, metrics and M3 on the last episode of 2019.</p> <p>Do you have something cool to share? Some questions? Let us know:</p> <ul> <li>web: <a href="https://kubernetespodcast.com">kubernetespodcast.com</a></li> <li>mail: <a href="mailto:kubernetespodcast@google.com">kubernetespodcast@google.com</a></li> <li>twitter: <a href="https://twitter.com/kubernetespod">@kubernetespod</a></li> </ul> <h3 id="chatter-of-the-week">Chatter of the week</h3> <ul> <li><a href="https://www.reddit.com/r/delta/comments/ebidaz/test_text_message/"> Test message from Delta Airlines</a></li> </ul> <h3 id="news-of-the-week">News of the week</h3> <ul> <li><a href="https://kubernetes.io/blog/2019/12/09/kubernetes-1-17-feature-csi-migration-beta/"> CSI migration</a> and <a href="https://kubernetes.io/blog/2019/12/09/kubernetes-1-17-feature-cis-volume-snapshot-beta/"> CSI volume snapshots</a></li> <li><a href="https://docs.microsoft.com/azure/aks/private-clusters">AKS Private Clusters</a> in preview</li> <li><a href="https://cloud.google.com/kubernetes-engine/docs/concepts/maintenance-windows-and-exclusions"> GKE maintenance windows and exclusions</a> are GA</li> <li>Google Cloud E2 VMs: <a href="https://cloud.google.com/blog/products/compute/google-compute-engine-gets-new-e2-vm-machine-types"> introduction</a> and <a href="https://cloud.google.com/blog/products/compute/understanding-dynamic-resource-management-in-e2-vms"> understanding dynamic resource management</a></li> <li><a href="https://cloud.google.com/blog/products/serverless/new-features-in-cloud-run-for-anthos-ga"> 
New features in Cloud Run for Anthos</a></li> <li><a href="https://cloud.google.com/blog/products/containers-kubernetes/best-practices-for-performing-forensics-on-containers"> Best practices for performing forensics on containers</a></li> <li><a href="https://0x65.dev/blog/2019-12-14/the-architecture-of-a-large-scale-web-search-engine-circa-2019.html"> Infrastructure at Cliqz</a>, and <a href="https://0x65.dev/blog/2019-12-15/hydra-kubernetes-based-dataset-pubsub-and-volume-management-system.html"> introducing Hydra</a></li> <li><a href="https://groups.google.com/forum/#!topic/envoy-announce/BjgUTDTKAu8"> Envoy CVEs</a> <ul> <li><a href="https://istio.io/news/security/istio-security-2019-007/">Istio security bulletin</a></li> </ul> </li> <li><a href="https://thenewstack.io/the-top-3-service-mesh-developments-in-2020/"> The Top 3 Service Mesh Developments in 2020</a> by Zack Jory</li> <li><a href="https://www.youtube.com/watch?v=6zDrLvpfCK4">Istio Service Mesh Explained in 5 Minutes</a> by Ram Vennam</li> <li><a href="https://blog.getambassador.io/introducing-the-ambassador-edge-stack-1-0-with-automatic-https-and-the-edge-policy-console-47ed8e8c129a"> Ambassador Edge Stack</a></li> <li><a href="https://medium.com/solo-io/introducing-the-webassembly-hub-a-service-for-building-deploying-sharing-and-discovering-wasm-d461719383ca"> Solo.io WebAssembly Hub</a> <ul> <li><a href="https://kubernetespodcast.com/episode/055-solo-io/">Episode 55, with Idit Levine</a></li> </ul> </li> <li><a href="https://banzaicloud.com/blog/kafka-envoy-protocol-filter/">Kafka Envoy Protocol Filter</a></li> <li><a href="https://www.talos-systems.com/blog/2019/12/talos-0.3-beta-release/"> Talos 0.3 beta</a></li> <li><a href="https://www.cncf.io/blog/2019/12/10/autotikv-tikv-tuning-made-easy-by-ai-and-machine-learning/"> AutoTiKV tuning</a></li> <li><a href="https://blog.openpolicyagent.org/kubecon-us-2019-recap-3e60c70d633a"> Open Policy Agent’s KubeCon recap</a> <ul> <li><a 
href="https://kubernetespodcast.com/episode/042-policy-and-config-management/"> Episode 42, with John Murray</a></li> </ul> </li> <li><a href="https://alexbrand.dev/post/first-look-at-antrea-a-cni-plugin-based-on-open-vswitch/"> A first look at Antrea</a> from Alex Brand</li> <li><a href="https://medium.com/@augmentable/looking-at-kubernetes-2k-todo-comments-b2db42dc7fdb"> TODO: read this article</a> by Patrick DeVivo</li> <li><a href="https://www.cncf.io/blog/2019/12/13/does-testing-kubernetes-conformance-leave-you-in-the-dark-get-progress-updates-as-tests-run/"> Does Testing Kubernetes Conformance Leave You in the Dark? Get Progress Updates as Tests Run</a> by John Schnake</li> <li><a href="https://www.cncf.io/blog/2019/12/12/demystifying-kubernetes-as-a-service-how-does-alibaba-cloud-manage-10000s-of-kubernetes-clusters/"> Demystifying Kubernetes as a Service – How Alibaba Cloud Manages 10,000s of Kubernetes Clusters</a></li> <li><a href="https://www.cncf.io/blog/2019/12/11/how-jaeger-helped-grafana-labs-improve-query-performance-and-root-out-tough-bugs/"> How Jaeger Helped Grafana Labs Improve Query Performance and Root Out Tough Bugs</a></li> <li><a href="https://www.quora.com/q/quoraengineering/Adopting-Kubernetes-at-Quora"> Adopting Kubernetes at Quora</a> by Taylor Barrella</li> <li><a href="https://www.cncf.io/announcement/2019/12/12/cloud-native-computing-foundation-announces-schedule-for-kubernetes-forums-in-bengaluru-and-delhi-india/"> CNCF announces schedule for Bengaluru/Delhi Forums</a></li> </ul> <h3 id="links-from-the-interview">Links from the interview</h3> <ul> <li><a href="https://www.m3db.io/">M3 website</a></li> <li><a href="https://eng.uber.com/m3/">M3: Uber’s Open Source, Large-scale Metrics Platform for Prometheus</a></li> <li>Before: <a href="https://en.wikipedia.org/wiki/Graphite_(software)">Graphite</a> and its <a href="https://graphite.readthedocs.io/en/latest/whisper.html">Whisper</a> database</li> <li><a 
href="https://prometheus.io/">Prometheus</a> <ul> <li><a href="https://prometheus.io/docs/introduction/faq/#why-do-you-pull-rather-than-push"> Why pull rather than push?</a></li> <li><a href="https://github.com/prometheus/alertmanager">Alertmanager</a></li> <li><a href="https://prometheus.io/docs/prometheus/latest/querying/basics/">PromQL</a></li> </ul> </li> <li><a href="https://oss.oetiker.ch/rrdtool/">RRDtool</a></li> <li><a href="https://github.com/m3db/m3">M3 on GitHub</a>: open source from the start</li> <li><a href="https://chronosphere.io">Chronosphere</a></li> <li>Rob’s 2019 KubeCon talks: <ul> <li>EU: <a href="https://www.youtube.com/watch?v=EFutyuIpFXQ">M3 and Prometheus, Monitoring at Planet Scale for Everyone</a></li> <li>NA: <a href="https://www.youtube.com/watch?v=TzNZIEvhAdA">Deep Linking Metrics and Traces with OpenTelemetry, OpenMetrics and M3</a></li> </ul> </li> <li>Twitter: <ul> <li><a href="https://twitter.com/roskilli">Rob Skillington</a></li> <li><a href="https://twitter.com/martin_c_mao">Martin Mao</a></li> <li><a href="https://twitter.com/m3db_io">M3</a></li> <li><a href="https://twitter.com/chronosphereio">Chronosphere</a></li> </ul> </li> </ul>