Hello, my name is Kareem Wallach. I am the community manager at Neo4j, and over the three-plus years that I've been working at Neo, I've realized there are still a lot of software developers and data scientists who do not understand what a graph database is. They kind of think they know, they might have heard of it, but they really don't understand it. So I just want to take a couple of minutes and explain it to you, because I think it's vitally important that anyone who works with data in any kind of capacity should understand what it actually
is, because it really is a game changer. So I'm going to share my screen really quick, and I'm going to try to make this as quick and easy as possible. All right, sharing my screen, working on it, I'm getting there, hold on. Okay, all right, so I'm sharing my screen. So this is an example of the Neo4j Browser right here. One thing that I think is really important to note, very important to understand, is that Neo4j is a database, okay? It's not a visualization that
sits on top of another database. It is an ACID-compliant transactional database, so that's something that's very important. Technically it's in the NoSQL category, but it's very, very different from most normalized relational databases, in the sense that most relational databases store data in the shape of tables and joins. Neo stores the data in the shape of a graph, and when I say graph, I do not mean a chart. I mean a graph-theory graph, like a network. So the data model inside of Neo4j is like this: you have your nodes here. Your nodes are your nouns,
that's person, place, thing, location, right? These are your nodes here. And these nodes could also have properties, which could be, like, labeled right here. So those are your properties. And then you could have relationships between those nodes, and within Neo4j, relationships are actually first-class citizens, meaning that they're just as important as the nodes themselves. Like, how are things connected, right? So just like you could have different types of nodes, you can also have different types of relationships. You can put properties on those relationships.
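As a rough illustration of that data model, and not Neo4j's actual storage format or API, nodes and relationships can be sketched as plain Python objects. The class names, the `LIVES_IN` relationship type, and all sample values are made up for this example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """A 'noun' in the graph: a person, place, thing, and so on."""
    labels: set[str]
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed connection that carries its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict[str, Any] = field(default_factory=dict)


# Hypothetical sample data, for illustration only.
person = Node({"Person"}, {"name": "Alice"})
place = Node({"Place"}, {"name": "Springfield"})

# The relationship is a first-class object: it has a type, a direction,
# and properties of its own (here a made-up weight and start year).
lives_in = Relationship(person, "LIVES_IN", place, {"weight": 0.8, "since": 2015})
```

The point of the sketch is only that the relationship is stored as its own object with its own properties, rather than being implied by matching key columns in two tables.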
You could put values, so they could be weighted. You can have geospatial information; you could have date and time. So basically, whenever you have data that's complexly connected and you want to understand how these things are connected to each other, or maybe you want to find a shortest path, right, what's the shortest path from here to here, or maybe you're looking for patterns in your data, or maybe you're looking for something that's a combination of patterns, right? If you're doing fraud detection, it might not be one transaction that sets off
a flag, but a transaction together with these other patterns in behavior that might be raising the flag. Or if you're doing any kind of graph-algorithm type of analysis, whether it's community detection or betweenness centrality or PageRank, things like that, network-style queries. Like, is the shape of your data a network or is it a table? So those are kind of the big differences. There are a few things that I think are really important to note about understanding Neo4j and what makes it different. So for
one thing, Neo4j is a native graph database, and what that means is the underlying architecture of how the data is actually stored is not built on top of tables, okay? Everything is built to support this type of data model, this highly connected data model. So when you're doing a query with most relational databases, you're indexing, and then you make another hop and you're indexing there, and then there's another join. All these joins are always indexing, and that's very computationally expensive when you're doing a lot of
hops. With Neo4j, when you do a query, you index to find your initial starting point, and then from there you're basically just chasing memory pointers, which the computer happens to be pretty good at. So the benefit of not having to index every time you make a hop is pretty powerful. The traversal time between doing one hop or twelve hops can be pretty consistent, which is a pretty powerful thing when you're hopping through very, very highly connected data. So that's one thing that's very important to understand. And then the other thing is
also the query language. So you're probably used to SQL, because that's a pretty standard query language. The problem is that when you're working with graph databases and graph types of problems, SQL isn't going to cut it, because SQL is not built for highly connected data. So Neo4j actually developed a language. It's an open language, and a lot of other companies are using it. It's called Cypher, and Cypher is basically SQL for graphs. Where SQL is kind of like 'give me this,' with Cypher you could be a little
bit more ambiguous. It's based on pattern matching, for more network-y kinds of queries, which is really powerful. So Cypher is declarative, and it's also based on ASCII art, which makes it really nice to look at, because it looks like what it actually represents. So just a high-level overview of it: your nodes here, right, in yellow and blue here, are represented by parentheses, see? And then these relationships, the directional relationships between those nodes, are an arrow, literally an arrow, with brackets. It's almost funny to
look at it; you're like, oh yeah, that makes sense. So here's another node, right? So it's like we're looking for a company that develops a game, but there's also another relationship here on this side, where a company also publishes the game, where the company is Electronic Arts. Wow, right? That was crazy. So I really like this blog post, because I think it kind of shows the power of the data model in general, like being able to do a lot of these hops, but I
think it also shows the power of having a query language that can help you look into networks and graphs and patterns and pathways. So I'll show you another example. This one I also think is pretty powerful. So here is an example of a video game recommendation, right? So you have here a video game in the yellow node, that's Fallout 3, and here, whoops, too far, yeah, you have Borderlands 2, right? And the blue nodes might be, I don't know, consoles that the game is played on or whatever. In the green
you have different themes of the game: is it zombies and pirates and pain and war, whatever. And remember, too, with these relationships, because they're first-class citizens, you could also have values on them, so they could be weighted. You could have a lot of pirates and a little bit of zombie or whatever. So that is also something that you could take into account, which can be pretty powerful when you're trying to make queries based on weights. And then in red you can have how the game is played: is it played multiplayer,
or are you playing one player? Is it first-person? Is it with a mouse or a joystick or a keyboard? Like, how is the game actually played? Now, if you have a user that likes both of these games here, and you want to say, okay, I want to understand who my user is, or maybe find out what these two things have in common, so I could find another game that has the most in common with these two games, right? Just general characteristics, like what do they have in common. If you were to do this
in SQL, it would be a very extensive query because of all the different types of nodes that you have and the different types of relationships. With Cypher it's actually very, very straightforward. You might actually laugh at how amazing it is. So this is an example of the Cypher query for this. I know, it's crazy: three lines. So you have here your node in parentheses, and then here is where your relationship would be. In this case the relationship is undefined: any relationship, any direction. You could put a star six in there
or something if you want to look six hops out or whatever. There are all kinds of different things you could do in Cypher. But you're looking for characteristics and the relationships between these two games. Damn, right? So this is, I think, just a really powerful example of something you could do with not just the data model, like being able to store the data and query it very quickly, but also the ability to use Cypher to help you find the things that are, you know, highly connected or distantly connected, or, you
know, those graph-related kinds of problems. So I know you're probably already thinking, oh, where can I use this? Because this does sound kind of interesting. I will tell you, there are a lot of really cool use cases for it. The very standard ones: recommendations is a big one, fraud detection, network and IT management. Those are the really big ones that are frequently used. A lot of NLP-related stuff, too. Even if you think about linguistics, like how we speak, right, that is all a graph. There's this
thing that happens that we call the graph epiphany. Basically, when you start seeing in graphs, you see graphs everywhere. You can't get rid of it, because everything is dependent on something else; it's all these intertwined connections of things. But there are a lot of really, really cool use cases. Actually, probably one of my favorite things about my job is hearing about all the interesting ways people use graphs. In this case, this one is Hetionet. If you go to het.io, they have their homepage
here. So, het.io: this was created by Daniel Himmelstein, who's a postdoc researcher at the University of Pennsylvania. But you can play around with this stuff. He's got the ability for you to explore: there's a Neo4j Browser thing, and there are guides that kind of walk you through and tell you what you can do with it. We also have Neo4j Sandbox. I'd say it's probably one of the best places to start, just to kind of get you thinking in graphs. So if you go to neo4j.com/developer,
there's an online sandbox thing here. You don't have to download anything. There are pre-existing data sets; you can just jump in, follow the guide, and start playing around. And then once you're ready, you can kind of dig in here a little bit. There's an intro to graph databases YouTube series, and we have all kinds of other things: you have GraphAcademy, or you can go with a self-paced tutorial and stuff. So yeah, hopefully you're going to thank me for this and not hate me for getting you addicted to graphs. I will
also make sure I mention, because this does happen to people who are in the graph epiphany: they try to put graphs everywhere. They don't belong everywhere. They are everywhere, but they don't belong everywhere; they're for highly connected data problems. But that said, once you're addicted to graphs and you have these amazing, awesome use cases that you want to share with the rest of the world, then you can come to me, and we could do something with the community. So yeah, hopefully you enjoyed the session, and hopefully this was helpful. See you soon.
Bye.
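For readers who want to try the game recommendation idea from the session, here is a minimal sketch in plain Python. It is not the Cypher query shown on screen and not how Neo4j is implemented. It only illustrates the two ideas: each node keeps direct references to its neighbors, so a hop is a lookup from the node itself rather than a join, and the question "what do these two games have in common?" is a pattern over shared neighbors. Apart from Fallout 3 and Borderlands 2, every node name and every edge below is an assumption made up for illustration.

```python
from collections import Counter

# Hypothetical edges: (game, characteristic). Which game has which
# characteristic is invented for this example.
edges = [
    ("Fallout 3", "theme:war"),
    ("Fallout 3", "mode:first-person"),
    ("Fallout 3", "console:A"),
    ("Borderlands 2", "theme:war"),
    ("Borderlands 2", "mode:first-person"),
    ("Borderlands 2", "mode:multiplayer"),
    ("Game X", "theme:war"),
    ("Game X", "mode:first-person"),
    ("Game Y", "theme:war"),
    ("Game Y", "theme:pirates"),
]

# Adjacency map: each node points directly at its neighbors,
# ignoring relationship type and direction.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def shared_neighbors(g, a, b):
    """Characteristics connected to both games (one hop from each)."""
    return g[a] & g[b]


def recommend(g, a, b):
    """Rank other games by how many of the shared characteristics they have."""
    scores = Counter()
    for trait in shared_neighbors(g, a, b):
        for game in g[trait]:
            if game not in (a, b):
                scores[game] += 1
    return scores.most_common()


common = shared_neighbors(graph, "Fallout 3", "Borderlands 2")
ranking = recommend(graph, "Fallout 3", "Borderlands 2")
```

Here `common` holds the two characteristics both games share, and `ranking` puts the hypothetical "Game X" ahead of "Game Y" because it matches both of them. Relationship weights, which the session mentions, are left out to keep the sketch short.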