Scalability, with Wojciech Tyczynski

Kubernetes Podcast from Google

Episode | Podcast

Date: Tue, 07 Jul 2020 15:02:28 +0000

<p>Before Kubernetes was launched, it could have at most 25 nodes in a cluster. At 1.0, the target was 100. Meanwhile, Borg, Omega and Mesos were all running away at 10,000. What did it take to get Kubernetes to this number, and above? SIG Scalability and GKE Tech Lead Wojciech Tyczynski tells us.</p> <p>Do you have something cool to share? Some questions? Let us know:</p> <ul> <li>web: <a href="https://kubernetespodcast.com">kubernetespodcast.com</a></li> <li>mail: <a href="mailto:kubernetespodcast@google.com">kubernetespodcast@google.com</a></li> <li>twitter: <a href="https://twitter.com/kubernetespod">@kubernetespod</a></li> </ul> <h3 id="chatter-of-the-week">Chatter of the week</h3> <ul> <li>Follow-up: <ul> <li>Chairs, from <a href="https://kubernetespodcast.com/episode/107-cncf-under-new-management/"> Episode 107</a></li> <li><a href="https://twitter.com/craigbox/status/1280474600544100352">Christmas trees</a>, from <a href="https://kubernetespodcast.com/episode/104-ingress-and-service-apis/"> Episode 104</a></li> <li>Kids music <ul> <li><a href="https://www.youtube.com/watch?v=MtN1YnoL46Q">The duck song</a></li> <li><a href="https://www.reddit.com/r/Jokes/comments/4mrgpm/a_duck_walks_into_a_bar_and_asks_got_any_bread/"> The duck joke</a></li> <li><a href="https://www.youtube.com/user/schmoyoho">Autotune the News</a></li> <li><a href="https://www.youtube.com/watch?v=eHP_gMCRZqQ">The duck song goes viral on TikTok</a></li> <li><a href="https://www.youtube.com/watch?v=LW_6cy_VhHI">Walmart Yodeling Kid</a></li> </ul> </li> </ul> </li> </ul> <h3 id="news-of-the-week">News of the week</h3> <ul> <li><a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/attend/virtual-event-update/"> KubeCon US goes virtual</a></li> <li><a href="https://promcon.io/2020-online/schedule/">PromCon schedule</a></li> <li><a href="https://aws.amazon.com/blogs/aws/aws-app2container-a-new-containerizing-tool-for-java-and-asp-net-applications/"> AWS 
App2Container</a> <ul> <li><a href="https://kubernetespodcast.com/episode/048-anthos-migrate/">Episode 48, with Issy Ben-Shaul</a></li> </ul> </li> <li><a href="https://cloud.google.com/kubernetes-engine/docs/release-notes#july_2_2020"> GKE brings Node Local DNS cache to GA</a> <ul> <li><a href="https://kubernetespodcast.com/episode/106-coredns/">Episode 106, with John Belamaric</a></li> </ul> </li> <li><a href="https://cloud.google.com/sdk/gcloud/reference/alpha/container/node-pools/create#--system-config-from-file"> Update kernel and Kubelet config on GKE nodes</a></li> <li><a href="https://github.com/Azure/AKS/releases/tag/2020-06-29">AKS brings 1.17 to GA</a>; adds <a href="https://aka.ms/aks/containerd">containerd</a> and <a href="https://aka.ms/aks/ppg">proximity placement group</a> support</li> <li><a href="https://diamanti.com/spektra-3-0-simplifies-hybrid-cloud-for-kubernetes-apps/"> Diamanti Spektra 3.0</a></li> <li><a href="https://github.com/kubernetes/community/tree/master/wg-naming">Kubernetes WG Naming</a></li> <li><a href="https://www.cncf.io/blog/2020/07/01/introducing-cloud-native-community-groups/"> Introducing Cloud Native Community Groups</a></li> <li><a href="https://www.cncf.io/blog/2020/07/06/announcing-the-updated-cncf-storage-landscape-whitepaper/"> Updated CNCF Storage whitepaper</a></li> <li><a href="https://www.presslabs.com/blog/presslabs-is-the-first-managed-wordpress-hosting-platform-running-on-kubernetes/"> Presslabs moves to Kubernetes</a> <ul> <li><a href="https://github.com/presslabs/stack/">Presslabs Stack</a> and <a href="https://github.com/presslabs/wordpress-operator">WordPress Operator</a></li> </ul> </li> </ul> <h3 id="links-from-the-interview">Links from the interview</h3> <ul> <li><a href="https://research.google/pubs/pub41684/">Omega</a> <ul> <li><a href="https://kubernetespodcast.com/episode/043-borg-omega-kubernetes-beyond/"> Episode 43, with Brian Grant</a></li> </ul> </li> <li><a 
href="https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md#how-we-define-scalability"> Defining scalability</a></li> <li><a href="https://kubernetes.io/blog/2015/09/kubernetes-performance-measurements-and/?m=1"> Original SLOs</a> <ul> <li>API-responsiveness: 99% of all our API calls return in less than 1 second</li> <li>Pod startup time: 99% of pods (with pre-pulled images) start within 5 seconds</li> </ul> </li> <li><a href="https://docs.google.com/document/d/15rD6XBtKyvXXifkRAsAVFBqEGApQxDRWM3H1bZSBsKQ/edit"> Target SLO doc</a> - 25 nodes</li> <li><a href="https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43438.pdf"> Borg</a> - ~10,000 nodes</li> <li><a href="https://kubernetes.io/blog/2015/09/kubernetes-performance-measurements-and/"> Sep 2015, Kubernetes 1.0</a> - 100 nodes <ul> <li><a href="https://www.nextplatform.com/2015/09/15/kubernetes-has-a-ways-to-go-to-scale-like-google-mesos/"> “Kubernetes Has A Ways To Go To Scale Like Google, Mesos”</a> by Timothy Prickett Morgan</li> </ul> </li> <li><a href="https://kubernetes.io/blog/2016/03/1000-nodes-and-beyond-updates-to-kubernetes-performance-and-scalability-in-12/"> March 2016, Kubernetes 1.2</a> - 1,000 nodes</li> <li><a href="https://kubernetes.io/blog/2016/07/update-on-kubernetes-for-windows-server-containers/"> July 2016, Kubernetes 1.3</a> - 2,000 nodes <ul> <li>Work by Clayton Coleman, guest of <a href="https://kubernetespodcast.com/episode/085-openshift-and-kubernetes/"> Episode 85</a></li> </ul> </li> <li><a href="https://kubernetes.io/blog/2017/03/scalability-updates-in-kubernetes-1-6/"> March 2017, Kubernetes 1.6</a> - 5,000 nodes</li> <li><a href="https://www.cncf.io/blog/2019/05/09/performance-optimization-of-etcd-in-web-scale-data-scenario/"> etcd v3 improvements for web scale</a></li> <li><a href="https://github.com/kubernetes/community/blob/master/sig-scalability/configs-and-limits/thresholds.md"> Scalability Envelope</a></li> <li><a 
href="https://kubernetes.io/docs/setup/best-practices/cluster-large/">Today’s scalability numbers</a></li> <li><a href="https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/"> EndpointSlices</a> <ul> <li><a href="https://kubernetespodcast.com/episode/104-ingress-and-service-apis/"> Episode 104, with Bowei Du</a></li> </ul> </li> <li><a href="https://kubernetes.io/blog/2017/02/inside-jd-com-shift-to-kubernetes-from-openstack/"> JD.com’s 10,000 node clusters</a></li> <li><a href="https://www.alibabacloud.com/blog/how-does-alibaba-ensure-the-performance-of-system-components-in-a-10000-node-kubernetes-cluster_595469"> Alibaba’s 10,000 node clusters</a> <ul> <li><a href="https://kubernetespodcast.com/episode/095-etcd/">Episode 95, with Xiang Li</a></li> </ul> </li> <li><a href="https://cloud.google.com/blog/products/containers-kubernetes/google-kubernetes-engine-clusters-can-have-up-to-15000-nodes"> Google’s 15,000 node GKE clusters</a></li> <li><a href="https://cloud.withgoogle.com/next/sf/sessions?session=APP310&amp;gate=true#application-modernization"> Twitter session at the upcoming Google Cloud Next</a> by Reza Motamedi and Maciek Różacki</li> <li><a href="https://github.com/kubernetes-sigs/poseidon">Poseidon</a> and <a href="http://firmament.io/">Firmament</a></li> <li>Wojciech Tyczynski: <ul> <li><a href="https://github.com/wojtek-t">GitHub</a></li> <li><a href="https://pl.linkedin.com/in/wojciech-tyczy%C5%84ski-69207417">LinkedIn</a></li> </ul> </li> </ul>