GraphConnect 2020 has ended

Graph Data Science
Tuesday, April 21

11:45am EDT

The Graph Data Science Journey: From Analytics to AI
When do you use graphs for machine learning, what domains can they be used in, and how do you get started? Real-world examples and use cases show the steps from getting started with a knowledge graph through to graph-native learning. Graphs, or information about the relationships, connections, and topology of data points, are transforming machine learning. We'll walk through real-world examples of how to transform your tabular data into a graph and how to get started with graph AI.

This talk will provide an overview of how to incorporate graph-based features into traditional machine learning pipelines and create graph embeddings that better describe your graph topology, and it will preview approaches for graph-native learning using graph neural networks. We'll talk about relevant, real-world case studies in financial crime detection, recommendations, and drug discovery.
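As a rough illustration of what "graph-based features" can mean in practice, the sketch below derives two simple per-node features (degree and triangle count) from an edge list and appends them to tabular rows. The account names, labels, and transfer edges are hypothetical, and the feature choice is an assumption made for illustration, not something taken from the talk.

```python
from collections import defaultdict
from itertools import combinations


def graph_features(edges):
    """Return {node: {"degree": int, "triangles": int}} for an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)

    features = {}
    for node, nbrs in neighbors.items():
        # A triangle through `node` is a pair of its neighbors that are also linked.
        triangles = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        features[node] = {"degree": len(nbrs), "triangles": triangles}
    return features


# Hypothetical tabular rows (account, is_flagged) and a hypothetical transfer edge list.
rows = [("a", 0), ("b", 0), ("c", 1), ("d", 0)]
transfers = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

feats = graph_features(transfers)

# Append the graph columns to each tabular row to form a wider feature matrix.
enriched = [
    (name, label, feats[name]["degree"], feats[name]["triangles"])
    for name, label in rows
]
```

Each row in `enriched` keeps its original columns and gains two topology-derived ones, which a conventional classifier could then consume like any other numeric feature.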

This talk is intended to introduce the concept of graph-based AI to beginners, as well as help practitioners understand new techniques and applications. Key takeaways: how graph data can improve machine learning, when graphs are relevant to data science applications, what graph-native learning is, and how to get started.

Amy Hodler

Director, Graph Analytics & AI Programs, Neo4j
Amy is a network science devotee, AI and Graph Analytics Program Manager at Neo4j, and a co-author of the O'Reilly book "Graph Algorithms: Practical Examples in Apache Spark and Neo4j." She promotes the use of graph analytics to reveal structures within real-world networks and...

Tuesday April 21, 2020 11:45am - 12:25pm EDT
Room 1

1:30pm EDT

Identity Graph at Scale - Transforming Billions of Page views to Unique Identity Profiles in Publishing
Leveraging Neo4j, Meredith has been able to create an enterprise-wide Identity Graph with 12.3 billion nodes and 18.9 billion relationships that represents its digital presence over the last 18 months across 30+ brands. Neo4j's Graph Algorithms enable unique digital profiles for premium ad sales. Digital marketing and custom audience advertising are a focus of any media publisher. Meredith Corporation is leading the field of first-party data retargeting, using Neo4j for an in-house Identity Graph containing every digital cookie seen across 30+ brands and applications, including People, Entertainment Weekly, Real Simple, Eating Well, and All Recipes. With 100 million unduplicated consumers, the Meredith Database is the largest U.S. consumer database of any media company, and includes 7 in 10 women and 8 in 10 homeowners. The Identity Graph solution has leveraged tens of billions of page views across multiple data streams to create a graph comprising 12.2 billion cookies and 18.9 billion relationships between them, identifying recurring individuals across domains and devices to improve custom audience advertising and profiling.
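The abstract does not say which algorithm links cookies into profiles. One common approach to this kind of identity resolution is connected components, sketched below with a small union-find. The identifiers and links are hypothetical, and this is an assumption about the general technique rather than a description of Meredith's pipeline.

```python
def resolve_identities(links):
    """Group identifiers joined by any chain of links (connected components via union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    # Sort for a stable, readable result.
    return sorted(sorted(group) for group in groups.values())


# Hypothetical observations: two cookies share a login; a third is tied to a device only.
links = [
    ("cookie:1", "login:amy"),
    ("cookie:2", "login:amy"),
    ("cookie:3", "device:x"),
]
profiles = resolve_identities(links)
```

Here `cookie:1` and `cookie:2` end up in one profile because both connect to the same login, while `cookie:3` stays in a separate profile.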


Ben Squire

Data Scientist, Meredith Corporation
Benjamin Squire is a Senior Data Scientist at Meredith Corporation. He has developed several successful POCs in Machine Learning, Custom Audience Advertising, and Identity + Profiling. His interest is in using new technologies and automation to bring the best content to the right...

Tuesday April 21, 2020 1:30pm - 2:10pm EDT
Room 1

2:15pm EDT

New: Neo4j's Graph Data Science Library

Alicia Frame

Senior Data Scientist, Neo4j
Alicia Frame is the lead data scientist at Neo4j. She works as part of the product management team to determine the product roadmap for Neo4j's graph algorithms library, and the strategy to grow Neo4j into a dominant analytics platform. In her role, she works closely with early adopters...

Tuesday April 21, 2020 2:15pm - 2:55pm EDT
Room 2

3:30pm EDT

Empowering the Business with Graph Analytics
Lockheed Martin Aeronautics has integrated graph technology into its technology landscape, empowering end users to build and visually explore their data models more effectively than with traditional methods. During our graph implementation journey, we developed a self-service operating and support model to enable our users. Equipped with Kettle, Neo4j, and Linkurious, business users have been able to develop their own graph models to answer business questions, leveraging developer-consultants for complex solutions. This approach has proven successful in creating a graph community within Lockheed Martin Aeronautics.


Robert Tung

Staff Software Engineer, Lockheed Martin Aeronautics

Shawn Akberali

Senior Software Engineer, Lockheed Martin Aeronautics

Caroline Nelson

Senior Data Analyst, Lockheed Martin Aeronautics

Tuesday April 21, 2020 3:30pm - 4:10pm EDT
Room 1

4:15pm EDT

Graph Analytics in Healthcare: Healthfirst's Journey from Idea to Production
At Healthfirst, we're on a journey to use graph to improve the health of our members. We began with a small pilot and scaled up to a knowledge graph (KG), allowing us to analyze connections between members, claims, and providers. We're using Neo4j for fraud detection, reducing costs, and improving care patterns.

We'll start the talk with an overview of Healthfirst and our business model. We'll discuss the business problems outlined to us by stakeholders, and how we helped them frame their questions in a way that could be answered with graph analytics.

We'll review our very first use case (fraud detection), which started with a Neo4j trial run off the side of our desks. The results were powerful and gained us the support to scale up to a Neo4j server and several licenses, expand the fraud detection use case, and launch a new use case for out-of-network utilization analysis.

We will focus the talk on how we collaborated with business partners to execute the use cases and how the results are being used to improve quality of care for our members, but will also touch on the graph schema, data sources and ingestion patterns, and graph algorithms used for the analyses.

Tuesday April 21, 2020 4:15pm - 4:55pm EDT
Room 3
Wednesday, April 22

1:30pm EDT

Interval Trees for Genomic Feature Retrieval in Neo4j
When you grow corn, yield is paramount. How much a corn seed will yield is encoded in its genome. If you visualized that genome in ASCII characters, you would see a seemingly random string of 25 million A, C, G, and T characters. Somewhere in that string you could find the interesting portions that govern how much corn that seed will yield, as well as others that provide useful properties to the corn plant, making it resilient to different pressures.

At Bayer Crop Science, we track these interesting portions of the corn genome as numeric intervals: start and stop indices within the string of 25 million characters. Our goal is to find relationships between intervals of interest to determine how to breed plants that produce the greatest yield and are resilient to different conditions around the world.

Our team, the developers of an internal genomics software stack within Bayer Crop Science, has been challenged to provide an API to our internal customers for efficiently finding these related intervals. In exploring solutions, we came upon the interval tree data structure and implemented a means for storing and querying interval trees in Neo4j.

In this session, we will discuss and demonstrate our approach, as well as a set of Neo4j stored procedures created by Bayer Crop Science to effectively manage, retrieve, and search interval tree data structures at large scale.
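For readers unfamiliar with the data structure, the following is a minimal in-memory sketch of an interval tree: a binary search tree keyed on interval start, where each node also stores the largest end index in its subtree so that whole subtrees can be skipped during an overlap query. It is not Bayer's Neo4j implementation; the feature labels and coordinates are hypothetical, and the tree is left unbalanced for brevity.

```python
class Node:
    def __init__(self, start, end, label):
        self.start, self.end, self.label = start, end, label
        self.max_end = end  # largest end index anywhere in this subtree
        self.left = None
        self.right = None


def insert(node, start, end, label):
    """Insert the closed interval [start, end]; return the subtree root."""
    if node is None:
        return Node(start, end, label)
    if start < node.start:
        node.left = insert(node.left, start, end, label)
    else:
        node.right = insert(node.right, start, end, label)
    node.max_end = max(node.max_end, end)
    return node


def overlapping(node, lo, hi, out=None):
    """Collect labels of all stored intervals that overlap [lo, hi], ordered by start."""
    if out is None:
        out = []
    if node is None or node.max_end < lo:
        return out  # nothing in this subtree reaches the query range
    overlapping(node.left, lo, hi, out)
    if node.start <= hi and lo <= node.end:
        out.append(node.label)
    if node.start <= hi:
        # The right subtree only holds starts >= node.start, so skip it otherwise.
        overlapping(node.right, lo, hi, out)
    return out


# Hypothetical genomic features as (start, end, label).
root = None
for start, end, label in [
    (100, 200, "geneA"),
    (150, 400, "geneB"),
    (500, 650, "markerC"),
    (50, 80, "markerD"),
]:
    root = insert(root, start, end, label)

hits = overlapping(root, 180, 520)
```

A production version would typically keep the tree balanced so that queries stay logarithmic in the number of stored intervals plus the number of hits.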

Jason Clark

Lead Data Engineer, Bayer
Jason Clark is a Lead Data Engineer in Bayer's Crop Science business unit with a depth of experience in delivering fit-for-purpose data and software solutions for genetics and genomics datasets using software design principles. As a founding member of Crop Science's Product360 data...

Wednesday April 22, 2020 1:30pm - 2:10pm EDT
Room 4

2:15pm EDT

Worst (And Best) Practices for Implementing Graph Data Science

Sören Reichardt

Graph Analytics Engineer, Neo4j

Martin Junghanns

Graph Analytics Engineer, Neo4j

Wednesday April 22, 2020 2:15pm - 2:55pm EDT
Room 2

3:30pm EDT

Molecules are Graphs! Lowering the Costs of Drug Discovery with Neo4j
Molecules are graphs! When you change part of this graph (swap one part out for another, add something in here, remove a little there), you change how that molecule behaves and interacts with your body. This talk models alterations of molecular graphs as a network and applies it to drug discovery!

**Molecules are graphs!** The nodes are atoms and the edges are bonds. What happens when we take these graphs and their properties and put them in a graph database?

**Drug discovery is expensive**. The cost of producing a new drug is estimated at $2.6 Bn, with a significant chunk of that cost coming from research and development of the drug molecule. Reducing the cost of developing new therapeutics is key to helping patients. If researchers can make smarter decisions earlier in R&D, the timelines and cost of bringing new drugs to patients will be reduced.

**Matched molecular pair** analysis (MMPA) is a method used in chemoinformatics that compares the properties of two molecules that differ only by a single chemical transformation (or graph alteration). An example would be two molecules that differ only by the substitution of a hydrogen atom for a fluorine atom. These two molecules, whilst almost identical, could have vastly different chemical properties. Such **pairs** of compounds are known as **matched molecular pairs** (MMP), and any change in the properties of these molecules can be modelled on an edge linking them together.
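To make the "property change on an edge" idea concrete, here is a small sketch that stores each matched pair as an edge carrying its transformation and the resulting property difference, then averages that difference per transformation. The compound names, transformation labels, and potency values are all hypothetical placeholders, not real assay data, and the aggregation is only one simple way such edges might be summarized.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical compounds with an illustrative assay value each (not real data).
potency = {"cpd1": 6.1, "cpd2": 7.0, "cpd3": 5.4, "cpd4": 6.0}

# Each matched pair differs by exactly one transformation (labels are placeholders).
matched_pairs = [
    ("cpd1", "cpd2", "H>>F"),
    ("cpd3", "cpd4", "H>>F"),
    ("cpd1", "cpd3", "H>>CH3"),
]


def build_mmp_edges(values, pairs):
    """Turn each matched pair into an edge holding the property change."""
    return [
        {
            "source": a,
            "target": b,
            "transformation": t,
            "delta": round(values[b] - values[a], 3),
        }
        for a, b, t in pairs
    ]


def mean_delta_by_transformation(edges):
    """Average the property change observed for each transformation."""
    grouped = defaultdict(list)
    for edge in edges:
        grouped[edge["transformation"]].append(edge["delta"])
    return {t: mean(deltas) for t, deltas in grouped.items()}


edges = build_mmp_edges(potency, matched_pairs)
summary = mean_delta_by_transformation(edges)
```

In a graph database, each dictionary in `edges` would correspond to a relationship between two molecule nodes, with `transformation` and `delta` stored as relationship properties.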

This talk will explore how combining biological assay data with these chemical transformations in a **matched molecular pair knowledge graph (MMPKG)** using Neo4j allows for powerful exploration of chemical data. Through the use of Neo4j Browser, Cypher, and the graph algorithms library, new insights can be gathered to help answer the question on most medicinal chemists' lips: _"which molecule should I make next?"_ The talk will also show exactly how the MMPKG can be applied to real drug discovery problems to help drive this decision-making process.

Matthew Sellwood

Product Manager, IQVIA
Matthew is currently a Product Manager at IQVIA, and has worked across the life sciences and healthcare industry. He earned his Masters of Chemistry and his PhD at the University of Sheffield in the UK. In his thesis work he researched the discovery of novel therapeutics for ALS (Lou...

Wednesday April 22, 2020 3:30pm - 4:10pm EDT
Room 4

3:30pm EDT

Which Comes First, The Data Model or the Algorithm?
The intelligence community has applied link analysis to everything from modeling call records to financial transactions. But what happens when you apply the same techniques to the technical artifacts of cyberattacks? How do you avoid overthinking your data model when modeling such complex data?    

Cybersecurity may be the ideal domain for graph analysis as the relationships between technical attributes are often more critical than the discrete values. For example, an attribute's maliciousness often depends on the surrounding context. This can include the presence or absence of other attributes, the behaviors that those attributes exhibited, and the similarity of that behavior with other attack vectors. Graphs and contextual link analysis are very effective mechanisms for identifying potentially malicious activity.

However, before performing any type of analysis, you need to create the data model! While many graph data model examples are reasonably straightforward, the modeling of cybersecurity data can become quite complex. You would ideally model the attributes of any real-world artifact (e.g., an email or file), the occasions on which those attributes were seen together, the behavior that those attributes exhibited when they were observed, and the source of your knowledge about those relationships. But how much knowledge do you really need to encode in the graph? When should you rely on path traversals rather than leveraging more advanced graph algorithms? Do you need to create a hypergraph in order to capture the source of relationships? What are the performance implications? How do you expire data from the graph? And finally, how do you make some decisions and actually build something?

Liz Maida

CEO, Uplevel Security (McAfee)
Liz Maida is the Founder and CEO of Uplevel Security (recently acquired by McAfee). She was previously a Senior Director at Akamai Technologies and served in multiple executive roles focused on technology strategy and new product development. She played a lead role in Akamai’s initial...

Wednesday April 22, 2020 3:30pm - 4:10pm EDT
Room 2

4:15pm EDT

Graph Analytics in Anti-Money Laundering at Manulife
Money launderers use complicated schemes to wash dirty money, and financial institutions need to fight back with advanced techniques. Using graph analytics, it becomes feasible to connect the dots in very complicated schemes that traditional methods cannot handle. Nowadays, money laundering involves leveraging different types of financial instruments in increasingly complicated schemes. Preventing money laundering and terrorist financing has become a high priority for financial institutions.

Traditional methods of monitoring for AML typically involve static, rule-based alerts built from previous experience. The biggest challenge faced by the traditional approach is that money laundering schemes are continuously changing in ways that are difficult to detect. With the power of graph analytics, it becomes feasible to connect the dots in complicated schemes that traditional methods cannot handle. In this talk, we will share our experience using advanced rules that traverse hidden patterns in the data, creating graph-based features for machine learning, and finding similar patterns using graph embeddings.


Lin Gao

Data Scientist, Manulife

Wednesday April 22, 2020 4:15pm - 4:55pm EDT
Room 3