Build Your Own Knowledge Graph With Zincbase

The Python Podcast.__init__

Episode | Podcast

Date: Mon, 05 Aug 2019 17:00:00 -0400

<div class="wp-block-jetpack-markdown"><h3>Summary</h3> <p>Computers are excellent at following detailed instructions, but they have no capacity for understanding the information that they work with. Knowledge graphs are a way to approximate that capability by building connections between elements of data, which lets us discover previously unknown relationships among disparate information sources. In our day-to-day work we encounter many instances of knowledge graphs, but building them has long been a difficult endeavor. In order to make this technology more accessible, Tom Grek built Zincbase. In this episode he explains his motivations for starting the project, how he uses it in his daily work, and how you can use it to create your own knowledge engine and begin discovering new insights of your own.</p> <h3>Announcements</h3> <ul> <li>Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.</li> <li>When you&#8217;re ready to launch your next app or want to try a project you hear about on the show, you&#8217;ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API, you&#8217;ve got everything you need to scale up. And for your tasks that need fast computation, such as training machine learning models, they just launched dedicated CPU instances. Go to <a href="https://www.pythonpodcast.com/linode?utm_source=rss&amp;utm_medium=rss">pythonpodcast.com/linode</a> to get a $20 credit and launch a new server in under a minute. And don&#8217;t forget to thank them for their continued support of this show!</li> <li>And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. 
Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. With such an intuitive tool it&#8217;s easy to make sure that everyone in the business is on the same page. Podcast.__init__ listeners get 2 months free on any plan by going to <a href="https://www.pythonpodcast.com/clubhouse?utm_source=rss&amp;utm_medium=rss">pythonpodcast.com/clubhouse</a> today and signing up for a trial.</li> <li>You listen to this show to learn and stay up to date with the ways that Python is being used, including the latest in machine learning and data analysis. For even more opportunities to meet, listen, and learn from your peers you don&#8217;t want to miss out on this year&#8217;s conference season. We have partnered with organizations such as O&#8217;Reilly Media, Dataversity, and the Open Data Science Conference. Coming up this fall are the combined events of Graphorum and the Data Architecture Summit. The agendas have been announced and super early bird registration for up to $300 off is available until July 26th, with early bird pricing for up to $200 off through August 30th. Use the code BNLLC to get an additional 10% off any pass when you register. Go to <a href="https://www.pythonpodcast.com/conferences?utm_source=rss&amp;utm_medium=rss">pythonpodcast.com/conferences</a> to learn more and take advantage of our partner discounts when you register.</li> <li>Visit the <a href="https://www.pythonpodcast.com?utm_source=rss&amp;utm_medium=rss">site</a> to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. 
You can reach me on Twitter at <a href="https://twitter.com/Podcast__init__?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">@Podcast__init__</a> or email <a href="mailto:hosts@podcastinit.com">hosts@podcastinit.com</a>.</li> <li>To help other people find the show, please leave a review on <a href="https://itunes.apple.com/us/podcast/podcast.-init/id981834425?mt=2&amp;uo=6&amp;at=&amp;ct=&amp;utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">iTunes</a> and tell your friends and co-workers.</li> <li>Join the community in the new Zulip chat workspace at <a href="https://www.pythonpodcast.com/chat?utm_source=rss&amp;utm_medium=rss">pythonpodcast.com/chat</a>.</li> <li>Your host as usual is Tobias Macey, and today I&#8217;m interviewing Tom Grek about knowledge graphs, when they&#8217;re useful, and his project Zincbase that makes them easier to build.</li> </ul> <h3>Interview</h3> <ul> <li>Introductions</li> <li>How did you get introduced to Python?</li> <li>Can you start by explaining what a knowledge graph is and some of the ways that they are used? <ul> <li>How did you first get involved in the space of knowledge graphs?</li> </ul> </li> <li>You have built the Zincbase project for building and querying knowledge graphs. What was your motivation for creating this project and what are some of the other tools that are available to perform similar tasks?</li> <li>Can you describe how Zincbase is implemented and some of the ways that it has evolved since you first began working on it? 
<ul> <li>What are some of the assumptions that you had at the outset of the project which have been challenged or updated in the process of working on and with it?</li> </ul> </li> <li>What are some of the common challenges when building or using knowledge graphs?</li> <li>How has the domain of knowledge graphs changed in recent years as new approaches to entity resolution and data processing have been introduced?</li> <li>Can you talk through a use case and workflow for using Zincbase to design and populate a knowledge graph?</li> <li>What are some of the ways that you are using Zincbase in your own projects?</li> <li>What have you found to be the most challenging/interesting/unexpected lessons that you have learned in the process of building and maintaining Zincbase?</li> <li>What do you have planned for the future of the project?</li> </ul> <h3>Keep In Touch</h3> <ul> <li><a href="https://github.com/tomgrek?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">tomgrek</a> on GitHub</li> <li><a href="https://tomgrek.com?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Website</a></li> <li><a href="https://twitter.com/tomgrek?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">@tomgrek</a> on Twitter</li> <li><a href="https://www.linkedin.com/in/tomgrek/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">LinkedIn</a></li> </ul> <h3>Picks</h3> <ul> <li>Tobias <ul> <li><a href="https://www.livewellbakeoften.com/blueberry-banana-baked-oatmeal/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Banana Blueberry Oat Bars</a></li> </ul> </li> <li>Tom <ul> <li><a href="https://www.cayennediane.com/sweet-spicy-pickled-habaneros/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Pickled Habañero</a></li> </ul> </li> </ul> <h3>Links</h3> <ul> <li><a href="https://github.com/tomgrek/zincbase?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Zincbase</a></li> <li><a 
href="https://en.wikipedia.org/wiki/Commodore_64?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Commodore 64</a></li> <li><a href="https://en.wikipedia.org/wiki/Electrical_engineering?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Electronic Engineering</a></li> <li><a href="https://en.wikipedia.org/wiki/Artificial_intelligence?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Artificial Intelligence</a></li> <li><a href="https://primer.ai?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Primer.ai</a></li> <li><a href="https://en.wikipedia.org/wiki/Artificial_general_intelligence?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Artificial General Intelligence</a></li> <li><a href="https://www.mathworks.com/products/matlab.html?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Matlab</a></li> <li><a href="https://ipython.org?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">IPython</a></li> <li><a href="https://numpy.org?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">NumPy</a></li> <li><a href="https://products.office.com/en-us/excel?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Excel</a></li> <li><a href="https://jupyter.org?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Jupyter</a></li> <li><a href="https://pandas.pydata.org?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Pandas</a></li> <li><a href="https://en.wikipedia.org/wiki/Knowledge_Graph?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Knowledge Graph</a> <ul> <li><a href="https://www.dataengineeringpodcast.com/enigma-with-chris-groskopf-episode-50/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Data Engineering Podcast Episode About Enigma Knowledge Graph</a></li> </ul> </li> <li><a href="https://www.imdb.com/title/tt0133093/?utm_source=rss&amp;utm_medium=rss" rel="noopener" 
target="_blank">The Matrix</a></li> <li><a href="https://en.wikipedia.org/wiki/Keanu_Reeves?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Keanu Reeves</a></li> <li><a href="https://en.wikipedia.org/wiki/Ontology_(information_science)?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Ontology</a></li> <li><a href="https://en.wikipedia.org/wiki/Semantic_Web?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Semantic Web</a></li> <li><a href="https://en.wikipedia.org/wiki/Word2vec?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Word2Vec</a></li> <li><a href="https://en.wikipedia.org/wiki/SPARQL?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">SPARQL</a></li> <li><a href="https://neo4j.com?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Neo4j</a></li> <li><a href="https://en.wikipedia.org/wiki/Graph_database?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Graph Database</a> <ul> <li><a href="https://www.dataengineeringpodcast.com/dgraph-with-manish-jain-episode-44/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Data Engineering Podcast Episode About DGraph</a></li> </ul> </li> <li><a href="https://aws.amazon.com/neptune/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">AWS Neptune</a></li> <li><a href="https://www.postgresql.org?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">PostgreSQL</a> <ul> <li><a href="https://www.dataengineeringpodcast.com/postgresql-with-jonathan-katz-episode-42/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Data Engineering Podcast Episode</a></li> </ul> </li> <li><a href="https://dask.readthedocs.io/en/latest/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Dask</a> <ul> <li><a href="https://www.dataengineeringpodcast.com/episode-2-dask-with-matthew-rocklin/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Data 
Engineering Podcast Episode</a></li> </ul> </li> <li><a href="https://en.wikipedia.org/wiki/BBC_Micro?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">BBC Micro</a></li> <li><a href="https://en.wikipedia.org/wiki/BASIC?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">BASIC</a></li> <li><a href="https://en.wikipedia.org/wiki/Prolog?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Prolog</a></li> <li><a href="https://en.wikipedia.org/wiki/Natural_language_processing?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">NLP</a></li> <li><a href="https://allennlp.org/elmo?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">ELMO</a></li> <li><a href="https://arxiv.org/abs/1810.04805?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">BERT</a></li> <li><a href="https://openai.com/blog/better-language-models/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">GPT-2</a></li> <li><a href="https://en.wikipedia.org/wiki/Winograd_Schema_Challenge?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Winograd Schema Challenge</a></li> <li><a href="https://torchbiggraph.readthedocs.io/en/latest/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">PyTorch BigGraph</a></li> <li><a href="https://docs.ampligraph.org/en/latest/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Ampligraph</a></li> <li><a href="https://spacy.io?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">SpaCy</a> <ul> <li><a href="https://www.pythonpodcast.com/episode-87-spacy-with-matthew-honnibal/?utm_source=rss&amp;utm_medium=rss">Podcast.__init__ Episode</a></li> </ul> </li> <li><a href="https://en.wikipedia.org/wiki/AI_winter?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">AI Winter</a></li> <li><a href="https://pytorch.org?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">PyTorch</a> <ul> <li><a 
href="https://www.pythonpodcast.com/pytorch-deep-learning-epsiode-202/?utm_source=rss&amp;utm_medium=rss">Podcast Episode</a></li> </ul> </li> <li><a href="https://scikit-learn.org/stable/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">scikit-learn</a></li> <li><a href="http://networkx.github.io/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">NetworkX</a></li> <li><a href="https://www.scipy.org?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">SciPy</a></li> <li><a href="https://circleci.com?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">CircleCI</a></li> <li><a href="https://readthedocs.org?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Read The Docs</a> <ul> <li><a href="https://www.pythonpodcast.com/episode-36-eric-holscher-on-documentation-and-read-the-docs/?utm_source=rss&amp;utm_medium=rss">Podcast Episode</a></li> </ul> </li> <li><a href="https://www.gutenberg.org?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Project Gutenberg</a></li> <li><a href="https://allennlp.org?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Allen NLP</a></li> <li><a href="https://docs.python.org/3/library/doctest.html?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Doctest</a></li> <li><a href="https://en.wikipedia.org/wiki/Reinforcement_learning?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Reinforcement Learning</a></li> <li><a href="https://en.wikipedia.org/wiki/Metacognition?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">Metacognition</a></li> </ul> <p>The intro and outro music is from Requiem for a Fish by <a href="http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/?utm_source=rss&amp;utm_medium=rss" rel="noopener" target="_blank">The Freak Fandango Orchestra</a> / <a href="http://creativecommons.org/licenses/by-sa/3.0/?utm_source=rss&amp;utm_medium=rss" rel="noopener" 
target="_blank">CC BY-SA</a></p> </div>