During this year’s AWS Summit TLV, I attended the GraphDB presentation on the innovation track. What made this session stand out was that the track was structured as an application creation tutorial.
The first two parts covered the basics of serverless applications, user management, and data collection from external sources. The third part was about connecting the users and the external data to provide meaningful insights into their relationships (mutual friends, same workplaces, etc.). For this purpose, they chose AWS Neptune, a graph database that was announced last year at re:Invent 2017 and is currently still in preview.
Is AWS Neptune right for me?
When considering AWS Neptune, first make sure a graph database is the best storage structure for your dataset. A graph DB works best for highly connected datasets, where many of the data points connect to many others through multiple kinds of relations. The easiest way to determine whether a graph DB is a better option than a relational DB is to try to model the dataset and its connections using relational DB schemas. If you end up with many tables representing the different objects and even more tables representing the connections between those objects, then a graph database may be the best option for representing this dataset. It may also help you discover new connections you would never have seen using a relational DB.
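To make the "highly connected" idea concrete, here is a minimal in-memory sketch in plain Python. It is not Neptune code and was not part of the session; the names (`EDGES`, `build_graph`, `shared`) and the toy data are made up for illustration. It stores every relationship as an edge and answers "mutual friends" and "same workplace" with the same one-hop lookup, whereas a relational model would typically need a separate join table per relationship type.

```python
from collections import defaultdict

# Hypothetical toy dataset: each edge is (source, relation, target).
# Edges are treated as directed in this sketch.
EDGES = [
    ("alice", "friend", "carol"),
    ("bob", "friend", "carol"),
    ("alice", "friend", "dave"),
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
]


def build_graph(edges):
    """Index edges by (vertex, relation) so each hop is a dictionary lookup."""
    graph = defaultdict(set)
    for source, relation, target in edges:
        graph[(source, relation)].add(target)
    return graph


def shared(graph, a, b, relation):
    """Return the vertices that both a and b reach through the given relation."""
    return graph.get((a, relation), set()) & graph.get((b, relation), set())


graph = build_graph(EDGES)
print(shared(graph, "alice", "bob", "friend"))    # mutual friends
print(shared(graph, "alice", "bob", "works_at"))  # same workplace
```

Adding a new kind of relationship here only means adding edges with a new label; the lookup code does not change, which is the property that makes graph models attractive for this kind of data.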
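Graph databases come in two technologies for storing graph data, each with its own query language:
- Property Graph (queried with Gremlin).
- RDF, the Resource Description Framework (queried with SPARQL).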
Most graph databases today support only one of these storage technologies. Even when a product supports both query languages, it does so by translating one language into the one that is native to its storage engine, incurring a performance penalty.
Neptune is optimised for both query languages, which allows you to choose the one you prefer, and more importantly, it allows you to switch query languages without choosing a different graph database product or suffering performance degradation.
Neptune boasts millisecond-level response on billions of connections, and as a cloud service, it’s both automated and highly scalable.
There are two ways to insert data into Neptune. From code, you can connect to the DB and run multiple addVertex and addEdge commands. Alternatively, you can use the Loader to load data from S3; both Gremlin and RDF data formats are supported.
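As a rough sketch of the first option, the snippet below builds the Gremlin statements for a few vertices and edges as plain strings. It uses the Gremlin traversal steps `addV` and `addE`, which is my assumption about what the insert commands look like in traversal syntax. The helper names (`quote`, `add_vertex`, `add_edge`) and the sample data are hypothetical, and every property value is written as a string to keep the example short. Actually sending the statements to a cluster is left out.

```python
def quote(value):
    """Render a value as a single-quoted Gremlin string literal."""
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def add_vertex(vertex_id, label, **props):
    """Build an addV statement with a caller-chosen id and string properties."""
    steps = [f"g.addV({quote(label)})", f".property(id, {quote(vertex_id)})"]
    steps += [f".property({quote(k)}, {quote(v)})" for k, v in props.items()]
    return "".join(steps)


def add_edge(from_id, label, to_id):
    """Build an addE statement connecting two existing vertices by id."""
    return f"g.V({quote(from_id)}).addE({quote(label)}).to(__.V({quote(to_id)}))"


# Hypothetical sample data: two people who share a friend.
statements = [
    add_vertex("alice", "person", name="Alice"),
    add_vertex("bob", "person", name="Bob"),
    add_vertex("carol", "person", name="Carol"),
    add_edge("alice", "friend", "carol"),
    add_edge("bob", "friend", "carol"),
]

for statement in statements:
    print(statement)
```

For anything beyond a handful of records, running one statement per vertex or edge gets slow, which is where the Loader and S3 become the more practical route.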
You can easily query Neptune from the AWS Console, using the Cloud9 IDE.
Gremlin query using AWS Neptune
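I don't have the exact query from the session, so the following is a sketch of what a "mutual friends" Gremlin query could look like, wrapped in Python so it can be posted to Neptune's HTTP Gremlin endpoint. The endpoint URL, the `friend` edge label, the `name` property, and the function names are all assumptions for illustration; only the request is built here, and `run` is never called in the example.

```python
import json
import urllib.request

# Hypothetical endpoint; replace host, port, and scheme with your cluster's values.
NEPTUNE_URL = "https://your-neptune-endpoint:8182/gremlin"


def mutual_friends_query(person_a, person_b):
    """Build a Gremlin query for friends of person_a who are also friends of person_b."""
    for vertex_id in (person_a, person_b):
        # This sketch embeds ids directly in the query, so reject quote characters.
        if "'" in vertex_id or "\\" in vertex_id:
            raise ValueError(f"unsupported character in id: {vertex_id!r}")
    return (
        f"g.V('{person_a}').out('friend')"
        f".where(__.in('friend').hasId('{person_b}'))"
        ".values('name')"
    )


def build_request(query, url=NEPTUNE_URL):
    """Wrap a Gremlin query string in a JSON POST request."""
    body = json.dumps({"gremlin": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(query, url=NEPTUNE_URL):
    """Send the query and return the decoded JSON response (needs network access)."""
    with urllib.request.urlopen(build_request(query, url)) as response:
        return json.load(response)


query = mutual_friends_query("alice", "bob")
print(query)
```

The query walks from one person along outgoing `friend` edges and keeps only the neighbours that the second person also points to, which is the kind of relationship question the third part of the tutorial was built around.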
The presentation struck the perfect balance of information: there was a thorough overview of Neptune and a good mix of code, which kept things interesting. All in all, AWS Neptune looks very promising, but I believe that most traditional applications will continue to work with relational DBs. However, I think we'll soon have real-world applications using Neptune at scale. We haven't seen them yet, but it's just a matter of time.