About this talk
This tutorial provides an introduction to building knowledge graphs by using open source libraries in Python. We’ll introduce the key concepts and abstractions, discuss engineering trade-offs, and emphasize hands-on coding exercises.
The coding exercises follow a progressive series of examples about managing the content for a website, which illustrate how to integrate:
- rdflib - RDF triples, SPARQL queries, serialization
- pyarrow (Apache Arrow) - Parquet serialization of RDF graphs
- networkx - graph algorithms
- pyvis - interactive visualization
- gensim - embeddings, used to clean up annotations
- pslpython - probabilistic soft logic, to apply rules for graph-based inference, link prediction, testing data quality of annotations, etc.
We'll also use pandas, numpy, matplotlib, pylev, and other libraries that help with building and analyzing KGs in open source Python.
We will work in Jupyter notebooks, available from a public repository on GitHub, which can be run locally. Semantic technologies used within these examples include OWL, FOAF, XSD (for literals), and some SKOS, which are represented in Turtle and JSON-LD formats.
Participants are encouraged to ask questions during the lectures, exercises, and breaks.
Prerequisites:
- Some coding experience in Python (you can read a 20-line program)
- Interest in use cases that require knowledge graph representation
Preparation before class:
- git clone https://github.com/DerwenAI/kglab.git
- fill out the online survey https://forms.gle/uB9p7XBjWutR2fHd7
- join our Slack channel for the class https://knowledgeconnexions.slack.com/archives/C01F95PAL31
Target audience:
- Python developers who need to work with KGs
- Data Scientists and Machine Learning Engineers
- Technical Leaders who want hands-on KG implementation experience
Key takeaways:
- Hands-on experience with popular open source libraries in Python for building KGs
- Coding examples that can be used as starting points for your own KG projects
- Understanding trade-offs for different approaches to building KGs