The staggering increase in private data stored by corporations is driving numerous efforts aimed at utilizing the data for business purposes.
One compelling approach is to merge private data with the plethora of available public sources into what appears to be a single unit that is searchable and regularly updated.
In this talk, we demonstrate how we achieved this vision using a scalable, asynchronous stream-processing architecture that handles terabytes of data and keeps the data relevant through regular updates.
Eran Avidan offers an overview of a novel architecture based on Kafka Streams, Kubernetes, and Neo4j that makes it easy to transform any piece of information into a knowledge graph structure while keeping it fresh over time.
The solution is based on a series of distributed asynchronous steps that ‘listen’ to changes in private and public data sources, including sales information, marketing activity, social media, and commercial websites; extract knowledge; structure that knowledge into a multitude of appropriate graph formations; and insert it into a large and growing graph database.
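The chain of asynchronous steps described above can be illustrated with a minimal sketch. It is not the production system: plain Python asyncio queues stand in for Kafka topics, an in-memory `Graph` class stands in for the graph database, and the event fields (`company`, `kind`, `subject`, `ts`) and all function names are hypothetical. The structuring step is folded into `extract` for brevity.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Graph:
    """In-memory stand-in for a graph database (hypothetical, not Neo4j)."""
    nodes: dict = field(default_factory=dict)  # name -> properties
    edges: set = field(default_factory=set)    # (source, relation, target)

    def merge(self, source, relation, target, seen_at):
        # Upsert: re-inserting a known fact only refreshes the timestamps.
        for name in (source, target):
            self.nodes.setdefault(name, {})["last_seen"] = seen_at
        self.edges.add((source, relation, target))


DONE = object()  # sentinel that tells the next step to shut down


async def listen(events, out):
    """Step 1: forward change events from a source (here, a plain list)."""
    for event in events:
        await out.put(event)
    await out.put(DONE)


async def extract(inp, out):
    """Step 2: turn a raw event into (source, relation, target, seen_at)."""
    while (event := await inp.get()) is not DONE:
        fact = (event["company"], event["kind"], event["subject"], event["ts"])
        await out.put(fact)
    await out.put(DONE)


async def insert(inp, graph):
    """Step 3: write each extracted fact into the graph."""
    while (fact := await inp.get()) is not DONE:
        graph.merge(*fact)


async def run_pipeline(events):
    # Each queue decouples two steps, so the steps run concurrently.
    raw, facts = asyncio.Queue(), asyncio.Queue()
    graph = Graph()
    await asyncio.gather(
        listen(events, raw),
        extract(raw, facts),
        insert(facts, graph),
    )
    return graph


def build_graph(events):
    """Synchronous entry point for the sketch."""
    return asyncio.run(run_pipeline(events))
```

Because `merge` behaves as an upsert, replaying an event the graph has already seen adds no duplicate edge and only updates `last_seen`, which is one simple way to keep entities fresh as sources change.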
This solution serves as a knowledge base for new AI models used by Intel’s Sales and Marketing Group to detect otherwise hard-to-find links between potential clients, which in turn directly helps Intel better serve its customers.
The Sales AI knowledge graph currently holds hundreds of millions of connected entities, with thousands more fetched, enriched, and connected to the graph every hour.
The approach is highly generalizable and can be applied to a broad range of settings that could benefit from integrating large private and public datasets into a rich knowledge graph.