GraphConnect 2020 has ended

Technical Case Study
Tuesday, April 21

11:00am EDT

Leveraging Knowledge Graphs for Environmental Challenges
Never has understanding the environment been more crucial. The challenge: environmental data are fragmented across many different systems and formats, or lost in files. Menome Technologies shows how it uses multi-agent system design and Neo4j to gather ALL the data into environmental knowledge graphs.

Imagine what we could know about the environment if we had ALL of the data, instead of just some of the data.

Solving environmental challenges requires combining the knowledge of experts and input from community stakeholders with historical and field data. There are significant challenges, though, in getting all the available data into a usable state.

Data are siloed across many different enterprise and environmental data management systems, many of which use highly specialized applications, use custom data formats, and have been developed with legacy technology.

This problem is exacerbated by the fact that the environmental sector produces much of its insight in large text-based environmental reports. Identifying and hand-extracting key data from these reports traditionally consumes hours of data scientists' valuable time.

These historical data often must also be combined with large volumes of data derived from field monitoring programs. Monitoring data range from IoT-style devices for monitoring things such as stream flow or downhole water quality, to streams of video or acoustic data used to identify wildlife.

Many of these data are collected in places where intermittent connectivity does not allow large volumes to be transmitted, and reviewing them can require large amounts of scientists' time, for example scanning footage just to find key frames.

Menome Technologies has developed the Menome Insight platform to address the challenges associated with collecting, integrating and deriving insight from environmental data.

By using a highly containerized, microservice-based, multi-agent, message-based architecture, Menome has created a set of Knowledge Agents designed to atomize data from any source into a set of streams that are continuously refined into an environmental Knowledge Graph of all available data sources.

Mike will provide an overview of the Menome approach using examples derived from projects Menome has worked on.


Mike Morley

Mike developed his first knowledge system in 1986 and has been developing software designed to augment the abilities of people, organizations and industries ever since. After earning a degree in Geological Engineering, Mike has spent the past 25 years focusing on disrupting to Environmental... Read More →

Tuesday April 21, 2020 11:00am - 11:40am EDT
Room 4

11:45am EDT

Big Pharma Problems. Big Graphs: Creating the Merck Manufacturing Mesh
The year: 1989. A young(er) Tim Berners-Lee, working at the research-focused CERN, submits a modest proposal to solve the organization's growing knowledge and information management problem. His memo, titled "Information Management: A Proposal", focuses on creating a non-hierarchical web to connect CERN's heterogeneous systems. With management's only comment, "Vague but exciting...", scrawled across the top of the page, this would serve as the basis for the creation of the World Wide Web.

Fast-forward 30 years. A group of engineers and scientists, working at the research-focused pharmaceutical company Merck, grows tired of typing the phrase "LEFT OUTER JOIN" in order to link and contextualize the company's many data sources. Consequently, they set out to create a unified manufacturing information model. With attempts at relational data models failing, the team turns to graph databases to better capture the complex nature of their heterogeneous, non-hierarchical data systems and their inconsistent schemas. With nothing written across any memos (because it's 2019 and we use email now), their efforts serve as the origins of the Merck Manufacturing Mesh. This mesh brings together a wide array of domains, including MES, LIMS, and ERP data, to create a more intuitive data model free of the need for primary keys. Leveraging the efficiency of index-free adjacency, the Mesh will help engineers and scientists quickly answer questions regarding batch genealogy, material impact assessments, end-to-end product lead times, and much more.

Marcus Adams

Associate Director, Digital Proactive Process Analytics, Merck & Co.
Marcus Adams earned his BEng and MS in Chemical Engineering from the University of Delaware and Villanova University, respectively. His more than a decade of experience at Merck spans the bio-pharmaceutical spectrum and includes experience in pre-clinical PK/PD modeling, product commercialization... Read More →

Tuesday April 21, 2020 11:45am - 12:25pm EDT
Room 2

11:45am EDT

From Impossible to Done: Transforming WestJet's Flight Schedule Data Publishing
We replaced a year-long manual (pen, paper, spreadsheet and email) process with three hours of compute in our flight schedule graph model. The result is an incomparable increase in efficiency, accuracy, and work capacity. The Neo4j-based solution changed the game entirely.    

At WestJet Airlines, we prioritize Guest experience. On our website, the first step of our booking process is optimized by representing only reachable destinations from the selected origin. Without such an optimization, our guests would have only a one in four chance of choosing an origin-destination combination that is purchasable.

Identifying routable pairs of origins and destinations was simple to do when our flight network was small. But with a growing network of bases, seasonal destinations, additional code share partners, and ever-shifting commercial flight schedules, maintaining this data becomes a difficult problem. Historically, dedicated employees identified obvious connection changes in a tedious process involving pen and paper. Connection changes were communicated through a change-set spreadsheet via email to be applied to master XML files.

In this talk we'll describe how we used Neo4j to model our flight network from the commercial schedule and requisite minimum connection time rules. The result is a weekly process that ingests a stream of SSIM format schedule data to build a graph model from which we harvest our required data artifacts.
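
As a rough illustration of the idea, and not WestJet's actual model, the sketch below treats flights as edges and collects the destinations reachable from an origin either nonstop or with one connection that satisfies a minimum connection time. The flight list, the airport pairs, and the 45-minute rule are invented for the example.

```python
from collections import defaultdict

# Hypothetical flights: (origin, destination, departure_min, arrival_min),
# with times given as minutes after midnight on the same day.
FLIGHTS = [
    ("YYC", "YVR", 480, 570),
    ("YVR", "HNL", 630, 1010),
    ("YVR", "NRT", 590, 1200),  # departs only 20 min after the YYC arrival
    ("YYC", "YYZ", 500, 730),
]

MIN_CONNECTION_MIN = 45  # assumed minimum connection time rule


def reachable_destinations(origin, flights, min_connect=MIN_CONNECTION_MIN):
    """Return destinations reachable from origin nonstop or with one connection."""
    by_origin = defaultdict(list)
    for flight in flights:
        by_origin[flight[0]].append(flight)

    reachable = set()
    for _, hub, _, first_arrival in by_origin[origin]:
        reachable.add(hub)  # nonstop destination
        for _, dest, second_departure, _ in by_origin[hub]:
            # A connection counts only if the layover meets the minimum.
            if dest != origin and second_departure - first_arrival >= min_connect:
                reachable.add(dest)
    return reachable
```

With this sample data, `reachable_destinations("YYC", FLIGHTS)` includes HNL but not NRT, because the NRT departure leaves too little time to connect. A graph database generalizes the same check to longer itineraries and per-airport rules.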


Dave Pirie

Mark Miller

Software Engineer, WestJet

Tuesday April 21, 2020 11:45am - 12:25pm EDT
Room 3

11:45am EDT

How we're using GRANDstack for Data Utopia in MEP design
The construction industry is in the midst of radical change. A significant driving force is the development of sustainable designs that meet the needs of building occupants.

This talk will showcase how graph data can provide rich insights and a common source of data throughout the design process.


Will Reynolds

Software Engineer, Hoare Lea
I've been developing software for the construction industry for around 15 years, starting with tools and utilities for AutoCAD and Autodesk Revit. I'm now a firm graph data enthusiast set on revolutionising the building services industry with the awesome power of the graph!

Tuesday April 21, 2020 11:45am - 12:25pm EDT
Room 4

2:35pm EDT

Using Graph Technology to Map Fandom
For years fan analytics have been focused on metrics over time - Streams today, new followers this month, likes on a post… But not every fan who follows or likes a post is created equal. Using graph technology we are helping artists identify, communicate with, and ultimately monetize their superfans. Graph technology has been instrumental for us in measuring the relationship between artist and fan. Did the fan use the song in their YouTube video that has 100K views? Did the fan post from the show and get 5 friends to come? Did they tweet the artist’s Spotify link and get 100 retweets? These are all metrics that we are able to sift through using graph algorithms to determine who are those top 20% of fans that are driving the majority of an artist’s revenue.

Sajan Sanghvi

CTO, Laylo
Saj currently serves as the CTO of Laylo, a platform aimed to be Salesforce for Artists. The platform allows artists to identify, communicate with, and ultimately monetize their superfans. Saj also serves as a band member of 'No Suits', totalling over 10M streams online, and represented... Read More →

Tuesday April 21, 2020 2:35pm - 2:50pm EDT
Room 6

3:30pm EDT

The Graph Database Workshop That Changed An Industry
In 2019, a group of eager engineers gathered in a room to learn about Neo4j, GraphQL and Apollo. Little did they know, this workshop would not only change their lives but shake an entire industry. This is the true story of how graphs connected people around the world and catalyzed real change.

Polley Wong

Polley Wong is a serial entrepreneur. She’s the CEO of design technology and research company VIUSPACE (pronounced View-space) and co-founded the interior and architecture design firm We Create Group with interior designer Briana Earl. She’s a strong advocate for gender equality... Read More →

Tuesday April 21, 2020 3:30pm - 3:45pm EDT
Room 6

3:30pm EDT

99.9999% (seriously, that many 9's) uptime at Adobe: How we got there with Neo4j
Did you ever think you could set up your causal cluster to be self-healing and auto-recoverable in the cloud? Would you like to know that your backups will restore without error and that your data is consistent every day? Come to our talk to learn more about running a stress-free causal Neo4j cluster.

We will discuss:
  • Restore testing / data consistency check
    • Confirming the backup was a success by executing a successful restore
    • Executing a consistency check on the restore
  • Automated backups
    • Installed and scheduled to run on each node
    • Uses an etcd cluster as a locking mechanism to ensure the backup runs on only one node in the cluster at a time
  • Autoscaling groups
    • Set to always have at least one server running
    • Extra ASG configured to help facilitate rolling upgrades
  • CoreOS
    • Cloud Ignition
      • systemd units
      • Used to get secure keys and environment variables from S3 bucket.
      • Setup scripts
  • Docker implementation
    • Use AWS ECR to store custom Neo4j docker image
  • Ansible
    • Config management tool we used to configure all infrastructure.
  • ELB endpoints
    • Uses native Neo4j calls to properly forward requests to a Leader or Follower
  • ENI for persistent IP
    • Known private IPs for the cluster allow the use of a pre-ordained config file.
  • How to select the right instance types
    • Hardware considerations
    • Memory is at least 2x database size - allows for growth - as well as some more for the OS.
  • Gotchas
    • RAFT leader election issue with ephemeral ports
      • Add the correct port range to Security Group to allow for RAFT protocol
    • ENI CoreOS routing
      • Routing table rules needed for the configured private IPs
    • Unique constraints
      • What happens when you don’t add a unique constraint when adding a new Node Type that has an Id
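
The instance sizing rule of thumb in the outline above (memory of at least 2x the database size, plus some more for the OS) can be sketched as a small calculation. This is only an illustration: the 4 GB OS reserve and the instance catalog are assumptions made up for the example, not Adobe's figures or real cloud instance types.

```python
# Hypothetical instance catalog: name -> memory in GB (not real instance types).
INSTANCE_MEMORY_GB = {"small": 16, "medium": 64, "large": 256}


def required_memory_gb(db_size_gb, os_reserve_gb=4, growth_factor=2):
    """Apply the rule of thumb: growth_factor x database size, plus OS headroom."""
    return db_size_gb * growth_factor + os_reserve_gb


def smallest_fitting_instance(db_size_gb, catalog=INSTANCE_MEMORY_GB):
    """Return the smallest instance whose memory covers the requirement, or None."""
    needed = required_memory_gb(db_size_gb)
    candidates = [(mem, name) for name, mem in catalog.items() if mem >= needed]
    return min(candidates)[1] if candidates else None
```

For a 25 GB database the requirement under these assumptions is 54 GB, so the "medium" entry is the smallest that fits; a `None` result signals that no instance in the catalog is large enough.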


Manuel Toledo

Mgr, Software and DevOps Development, Adobe
Manny Toledo is a leader in Cloud Platforms at Adobe. He has in-depth knowledge of multiple cloud environments and their supporting technology. Over the years, he's become adept at shifting large-scale web applications to new platforms, taking advantage of the latest technology, without... Read More →

Gabe Tucker

Software Engineer, Adobe
I have been working with data technologies in operations, administration and engineering for over 15 years across multiple technologies and industries. I consistently advocate for the importance of accuracy and integrity of data.

Tuesday April 21, 2020 3:30pm - 4:10pm EDT
Room 2

3:30pm EDT

From Zero to Knowledge Graph at BCBS
It started as a straightforward request:

"How can we view a timeline of each interaction we have with our members?"

Along the way, this straightforward ask grew into the “Member Intervention Hub”: an analytics engine to enable our business teams to identify key members and intervention opportunities to improve health and reduce costs.

In between, our small team had a crash course in Neo4j, Cypher, and the right and wrong ways to build a knowledge graph for health insurance.

This talk covers some of the key things we learned while building our Member Intervention Hub:

1. Getting business / executive buy-in on a graph solution

2. Iterating on a data model

3. How far you can get with Community Edition

James Colvin

Tuesday April 21, 2020 3:30pm - 4:10pm EDT
Room 4

4:15pm EDT

Open Source Knowledge Graph of News
Journalism is in crisis. Newspaper revenues have been falling while fake news is on the rise. To alleviate these problems, we propose a knowledge graph of entities and relationships from publicly available news sources, using an open dataset of over 23,000 Vox articles as a proof-of-concept. We believe that this work will significantly enable both journalists and consumers of news to keep track of large volumes of information in a non-siloed manner. News organizations will have access to a comprehensive content catalog that allows for collaboration between and within other organizations, as well as producing investigative journalism pieces at a lower cost. The public will benefit from having a platform that is widely available, easily searchable, comprehensive over time and space, and curated by trustworthy sources. Because we also believe that our initiative is of great benefit to the public, we are both open sourcing our efforts and actively encouraging others to collaborate with us and each other to bring this project forward.

Hanhan J. Li

I’m Hanhan, the co-founder of PressDB, an exploratory writing engine that empowers people to research and write better and faster. I graduated with a dual master’s degree in Journalism and Statistics from Columbia University in 2018. I have been fascinated by the graph world... Read More →

Chris Rusnak

Chris Rusnak has 8 years of work experience as a data scientist in the information services and consulting industries. He obtained an M.S. degree in Data Science from the Data Science Institute at Columbia University in 2017 and a B.A. degree in Math and Biological Sciences as sum... Read More →

Tuesday April 21, 2020 4:15pm - 4:30pm EDT
Room 6

4:15pm EDT

Graph-based AIOps at eBay
Throughout eBay's 25-year journey of developing and managing large-scale software, data, and system architecture, ensuring quality, reliability, and security has always been critical, along with a host of other fundamentals expected of the business's products.

Our AIOps roadmap aims to address the following key challenges:

* "Blindness": limited observability on architectural knowledge or issues

* "Ignorance": Lack of measurability for service architecture, or technical debts

* "Primitiveness": Missing diagnostic, engineering and run-time automation

Graph techniques and algorithms are a critical part of our roadmap: to build and evolve a sustainable eBay service architecture by providing automated architectural visibility, assessment, and governance of our service ecosystem.

In this talk, we will share our blueprints, thoughts, and progress so far (e.g., [Realtime Graph-based Root Cause Analysis for Cloud-Native Distributed Data Platform](http://www.vldb.org/pvldb/vol12/p1942-wang.pdf), graph-based dependency systems). Our goal is to share the motivation, concept, design, and value of modeling complicated and evolving infrastructure, together with key knowledge (which is generated from various distributed sources, e.g. ML models), as a graph.


Hanzhang Wang

Applied Researcher, eBay
Hanzhang is an applied researcher at eBay. After earning his Ph.D. from the University of Michigan, he is leading eBay's intelligent infrastructure research - to build and evolve sustainable eBay microservice architecture by providing best of breed automated architectural visibility... Read More →

Tuesday April 21, 2020 4:15pm - 4:55pm EDT
Room 1

4:40pm EDT

Resolving Locations in Text with NLP and Neo4j
Humans are very good at inferring the location of events described in text, even when no places are mentioned by name. Can computers do something similar? We demonstrate a solution, using a combination of natural language processing and graph techniques applied to a corpus of online news stories.

Effective news monitoring, for competitive intelligence and other purposes, often requires an understanding of the location at which the events mentioned in an article occur. This information is sometimes not given explicitly in the text, but can be inferred using a combination of natural language processing and graph techniques. This talk describes a Neo4j-based system for performing this task. We also discuss some lessons learned along the way, and a graph-based approach to enhancing the traditional process of named entity recognition.

Stephen Hall

Software Engineer, Predix Communications
After receiving a PhD in Electronic and Computer Engineering from the University of Wollongong, Australia, I held various academic and industry positions in telecommunications and computing. In 2000, I co-founded Predix Communications in Cape Town, South Africa, and the company developed... Read More →

Tuesday April 21, 2020 4:40pm - 4:55pm EDT
Room 6
Wednesday, April 22

11:20am EDT

Enhancing conversational AI with a contextual graph model
A contextual graph model feeds AI suggestions to refine a retail catalog display in a person to person conversation. The graph indexes a conversational sequence on genres, related sub-genres and attributes to yield suggestions that are meaningful and relevant through the conversation flow.

Kofi Dadzie

Co-Founder, Executive Vice-Chair, Rancard
Kofi is Co-Founder and Executive Vice Chair of Rancard. He has led the company in its evolution to conversational discovery with AI & social recommendations, following success in mobile content distribution technology with developers and brands including Google, BBC, VOA, MTV, ESPN... Read More →

Wednesday April 22, 2020 11:20am - 11:35am EDT
Room 6

1:30pm EDT

Knowledge Graphs for AI-Powered Shopping Assistants
The world around us is a network of connected concepts. Concepts can belong to different domains, but they are not isolated; rather, they maintain connections with each other. Graph databases store these connections alongside the data, in contrast to traditional databases, where relations are resolved by expensive operations at query time. In recent years, conversational systems have been trending because they provide an effortless shopping experience for customers. One of the challenges in building a seamless dialogue with the user is understanding the products in the catalog. We address this by bringing the benefits of graph databases to conversational systems for e-commerce, creating a knowledge graph representation of catalog data. This graph representation can be applied to many domains, including grocery, movies, furniture, fashion, etc. Moreover, it can be beneficial in several facets within each domain, such as improving the training data for NLU models, answering product-related questions, or narrowing down the results for search refinement queries.

To enable these, we first link retail concepts to catalog products, each with a specific brand, size, color, type, etc. In our application, this linkage between the two entities was initially missing, so we use other hierarchical relations and a simple but powerful semi-supervised learning algorithm to create it. To enhance our algorithm, we use the user logs, which carry insightful information about these entities. We load the output of our algorithm into Neo4j, a fast, visual graph database supporting multi-hop queries and node properties. In particular, to further benefit from Neo4j, we use its Cypher query language to answer knowledge-based questions by transforming natural-language queries into Cypher queries.


Ghodrat Aalipour

Senior Data Scientist, Walmart Labs
I joined Walmart Labs in September 2018 and since then, I have been working on NLU systems for e-commerce. Prior to that I was a lecturer in the School of Mathematical Sciences at RIT and a visiting faculty at the University of Colorado Denver. I have a P... Read More →

Wednesday April 22, 2020 1:30pm - 1:45pm EDT
Room 6

1:30pm EDT

Bank of Montreal: Predictive Risk Graph for Financial Institutions
Commercial banks hand out loans based on risk assessment at a point in time. When a company's risk degrades, banks must act quickly to limit their exposure.
We built a graph that combines external news and internal data to alert analysts when a customer's rating might degrade because of negative news.

With thousands of commercial banking clients, it is difficult for risk analysts to know what the bank's exposure is to certain customers. Often they are unaware that an entity is a customer or has a commercial relationship with one.

The graph we built combines relational internal bank data to show how a loan is structured, unstructured documentation to identify connected entities, and external media (Twitter, Reuters, etc.) with sentiment analysis.

Whenever we detect a news article with negative sentiment and identify a path to one of our customers, we issue an alert to the risk team, who can then take appropriate action. In this talk I'll highlight the key technologies we used to make this possible: NLP, Neo4j plugins, and how we built an end-to-end application using Neo4j as the engine.
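
The alerting flow described above can be pictured with a minimal sketch: for each article whose sentiment falls below a threshold, search the relationship graph for a path from the mentioned entity to a customer. The entity names, the sentiment scores, the -0.5 threshold, and the three-hop limit are all hypothetical, and a plain breadth-first search stands in here for whatever path queries the team actually runs in Neo4j.

```python
from collections import deque

# Hypothetical relationship graph: entity -> directly related entities.
RELATIONSHIPS = {
    "Acme Mining": ["Acme Holdings"],
    "Acme Holdings": ["Acme Mining", "Northern Rail"],
    "Northern Rail": ["Acme Holdings"],
    "Unrelated Co": [],
}
CUSTOMERS = {"Northern Rail"}


def path_to_customer(entity, relationships, customers, max_hops=3):
    """Breadth-first search for the shortest path from entity to any customer."""
    queue = deque([[entity]])
    seen = {entity}
    while queue:
        path = queue.popleft()
        if path[-1] in customers:
            return path
        if len(path) - 1 >= max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in relationships.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


def alerts_for(articles, relationships, customers, threshold=-0.5):
    """Return (headline, path) for negative articles linked to a customer."""
    alerts = []
    for headline, entity, sentiment in articles:
        if sentiment <= threshold:
            path = path_to_customer(entity, relationships, customers)
            if path:
                alerts.append((headline, path))
    return alerts
```

Returning the path rather than a yes/no flag gives an analyst the chain of relationships that explains why an article about one company is relevant to a customer.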

Wednesday April 22, 2020 1:30pm - 2:10pm EDT
Room 3

1:50pm EDT

The Connected Habitat of Impact
The Nature Conservancy

The Nature Conservancy has a global team of 400+ scientists who are helping drive our understanding of how nature’s resources like water, land and air interact with society, and vice versa. When we pick our actions and efforts to conserve a piece of land or a patch of the ocean, the ecological impacts are usually scientifically clear and well stated. However, more often than not, preserving nature’s resources can impact more than just the environment - take for instance the task of protecting land. Land provides food, food feeds people, industries and infrastructure and more. Even the interconnectedness of land and water makes nature a complex entity of change! We need a better way to understand and drive decisions for conservation impact!

Helping nature do nature faster

While nature always works, it takes time. For instance, we know reforestation helps restore our water systems in a sustainable way! However, it can be years before we are able to see the proof of the impact. We need technology and tools to speed up our understanding and confidence in our strategies. We need to engage more regional and relevant stakeholders that can push the success as well as de-risk the execution of such strategies.

Telling stories & making a case for impact

With the power of graphs, we can better understand the connectedness of people and nature’s resources. By leaning on patterns for “detecting risk” & “recommending actions”, we can kickstart some innovative and highly relevant stories of impact!


Niraj Swami

Niraj Swami is an avid technologist & innovator with deep interests in Artificial Intelligence, applied knowledge graphs, cognitive technologies and behavioral economics. Niraj has led Applied AI and innovation initiatives for the learning, business productivity and healthcare spaces... Read More →

Wednesday April 22, 2020 1:50pm - 2:05pm EDT
Room 6

2:15pm EDT

Ending the Licit Opioid Crisis with Neo4j and Artificial Intelligence
Leveraging Neo4j, machine learning, and our Analytics Driven Targeting methodology our team was able to identify pharmacies and prescribers diverting opioids and other controlled substances. Ultimately this has led to restrictions against numerous medical professionals in the U.S.  

Our team has supported actions against numerous medical professionals in the U.S. Using Neo4j, machine learning, our Analytics Driven Targeting methodology, and bespoke diversion-specific risk factors we analyzed millions of prescription records and identified numerous prescribers, patients, and pharmacies diverting opioids and other controlled substances. Neo4j proved to be critical in our analysis because the relationships between entities proved to be some of our most valuable features in our predictive models. Additionally, Neo4j's graph engine, algorithms, and built-in scalability enabled us to analyze the massive amount of data rapidly.

Wednesday April 22, 2020 2:15pm - 2:55pm EDT
Room 3

2:15pm EDT

Relevant Search with Graphs
Text-based similarity scoring has had its time; nowadays all that matters is context. And what is best suited to make sense of context? A graph, of course!
Learn how to use graphs to improve your search engine and your users' experience. This session will go through:

- Graph-Based search boosting

- Synonyms are graphs too

- Graph Algorithms useful for search

All will be demoed on the Neo4j community forum data.

Christophe Willemsen

CTO, GraphAware
Christophe Willemsen is CTO at GraphAware, the world's #1 Neo4j consultancy. He is a Neo4j expert, consultant and trainer, having implemented graphs in various industries all over the globe. He now focuses on the technical development of Hume, GraphAware's Graph-Powered Insights Engine... Read More →

Wednesday April 22, 2020 2:15pm - 2:55pm EDT
Room 4

2:15pm EDT

The shortest path to digitizing worldwide multi-modal transport planning at the Port of Rotterdam
Learn how Europe's largest and most innovative port, the Port of Rotterdam, is leveraging Neo4j graph technology to run its multi-modal schedule route optimization engine, the core driver of Navigate, our door-to-door route optimization application for container shippers.

We believe that transparency is key to optimizing the logistics industry and to reducing the CO2 emissions caused by suboptimal routing. Our application Navigate therefore provides container shippers and forwarders with a clear and neutral comparison of door-to-door options to ship their container from A to B, combining schedule information from deepsea, shortsea, train, barge and truck operators. This enables them to optimize their route on duration, arrival time, emissions and number of transfers.

Thanks to the combination of an urgent problem, proven technology, business experts and a development team, the engine went from a proof of concept in our innovation lab to a minimum viable product and into production within months. We started up fast with Cypher queries in Neo4j Desktop and scaled up by using graph procedures in our cloud environment. The engine also serves as an enabler for machine learning models that generate estimated times of arrival for vessels coming to our and other ports.

The talk covers our process from an initial idea to bringing it into production as a SaaS solution, now used by ports on multiple continents. It also covers the benefits we experienced from having full control from data retrieval to application, as well as technical insights such as custom graph procedures written in Kotlin.
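
To give a feel for schedule-based routing of this kind, here is a minimal sketch of an earliest-arrival search over scheduled legs. It is not the Port of Rotterdam's engine: the legs, place names, and hour values are invented, and the search optimizes arrival time only, whereas the abstract also mentions duration, emissions and number of transfers.

```python
import heapq

# Hypothetical scheduled legs: (origin, destination, mode, depart_hour, arrive_hour)
LEGS = [
    ("Shanghai", "Rotterdam", "deepsea", 0, 600),
    ("Rotterdam", "Duisburg", "barge", 610, 634),
    ("Rotterdam", "Duisburg", "train", 604, 612),
    ("Rotterdam", "Duisburg", "train", 598, 606),  # departs before the vessel arrives
    ("Duisburg", "Munich", "truck", 615, 624),
]


def earliest_arrival(origin, destination, legs, start_hour=0):
    """Earliest-arrival search over scheduled legs (a Dijkstra variant)."""
    best = {origin: start_hour}
    heap = [(start_hour, origin, [])]
    while heap:
        time, place, route = heapq.heappop(heap)
        if place == destination:
            return time, route
        if time > best.get(place, float("inf")):
            continue  # stale queue entry
        for leg_origin, leg_dest, mode, depart, arrive in legs:
            # A leg is usable only if it departs at or after our arrival at its origin.
            if leg_origin == place and depart >= time:
                if arrive < best.get(leg_dest, float("inf")):
                    best[leg_dest] = arrive
                    heapq.heappush(heap, (arrive, leg_dest, route + [mode]))
    return None
```

With the sample legs, the search picks the later train rather than the barge for the Rotterdam to Duisburg leg and skips the train that leaves before the vessel arrives. Other objectives could be handled by changing the priority key, at the cost of a more involved search.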

Riccardo Lippolis

Software Engineer, Port of Rotterdam
An inquiring and experienced Java/Kotlin Software Engineer with a passion for solving complex problems. He works for JDriven (currently at the Port of Rotterdam), where he shares his passion and drive with other enthusiasts. He has spoken at several international conferences, including... Read More →

Jorrit van der Ven

Port of Rotterdam
Jorrit is a Kotlin/Java developer working at JDriven in the Netherlands. He loves to learn about new technologies and to share his knowledge with others. In his spare time he likes to make his house a bit smarter using wires, chips and a soldering iron. He has spoken at several international... Read More →

Kevin Kruijthoff

Product Lead, Port of Rotterdam
A data scientist turned Product lead at Port of Rotterdam, responsible for the team realizing the Pathfinder engine as well as multiple applications. Enthusiastic about exploring the potential beneficial use of data analysis, modeling and simulation, and applying them in complex problems... Read More →

Wednesday April 22, 2020 2:15pm - 2:55pm EDT
Room 1

2:35pm EDT

Exploring NASA Open data
How are NASA Open Data datasets related? Which datasets are related and which aren’t? How easy is it to search for specific data in NASA? Let’s go through the work of putting all this information in a Neo4j database to get the most out of one of the biggest open data sources ever.


Wednesday April 22, 2020 2:35pm - 2:50pm EDT
Room 6

3:30pm EDT

First Line of Defense: How the Danish Business Registry fights fraud with Machine Learning and Knowledge Graphs
Regulation of fraudulent businesses is necessary in order to mitigate their negative impact on society. However, it would be best if these actors were never given the means to commit fraud. The Danish Business Authority uses ML and knowledge graphs to prevent the registration of fraudulent businesses.


Marius Hartmann

Team Lead, ML Lab, Danish Business Authority
Marius Hartmann is the team leader of the ML Lab at the Danish Business Authority and responsible for designing their ML data platform. Based on open source components, the synergy between machine learning and graph analysis enables near real time interception of fraudulent behaviour... Read More →

Wednesday April 22, 2020 3:30pm - 4:10pm EDT
Room 3