NEO4J – Shortest Paths in the Cineasts Database

August 19, 2013

As I pointed out in my last post, Counting Many Paths Between Nodes In NEO4J
https://rodgersnotes.wordpress.com/2013/08/16/counting-many-paths-between-nodes-in-neo4j/

the number of paths between nodes goes up exponentially as the graph becomes more complex and interconnected. However, those queries did not use the built-in shortestPath function.

Last winter, I watched the video Cypher for SQL Professionals, with Andres Taylor. In the video, there were a number of queries against the Cineasts database that did use the shortestPath function.

Bacon Lucy:

One of the queries in the video was Bacon Lucy: how many nodes of actors and movies separate the performers Kevin Bacon and Lucy Liu?   Read the rest of this entry »
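To make the question concrete, here is a minimal Python sketch of what a shortest-path search does, using a breadth-first search. It is not the Cypher query from the video, and the graph is made up: "Actor A", "Movie 1" and the other names are placeholders, not Cineasts data.

```python
from collections import deque

# Made-up undirected graph: performers and movies are both nodes,
# and an edge means "acted in". None of these names come from Cineasts.
graph = {
    "Actor A": ["Movie 1"],
    "Movie 1": ["Actor A", "Actor B"],
    "Actor B": ["Movie 1", "Movie 2", "Movie 3"],
    "Movie 2": ["Actor B", "Actor C"],
    "Movie 3": ["Actor B", "Actor C"],
    "Actor C": ["Movie 2", "Movie 3"],
}


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal as a list of nodes, or None."""
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


path = shortest_path(graph, "Actor A", "Actor C")
print(path)
print(len(path) - 2, "nodes separate the two performers")
```

Breadth-first search stops at the first route it finds, so the second, equally short route through "Movie 3" is never listed. That is the difference between asking for a shortest path and counting every path.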


Counting Many Paths Between Nodes In NEO4J

August 16, 2013

A parent node in a graph database can have many child nodes.

In addition to the many children, there can be many distinct and different paths to each single child node. An example of multiple paths to the same destination is to look at a subsection of Manhattan.

Small Part of Manhattan

How many ways are there to get from the intersection of West 23rd Street and Tenth Avenue to East 34th Street and Park Avenue South? You could make only one turn: go north, then turn east. Or go east, then turn north. Or you could make many turns, zigzagging your way through the streets on the grid. And those would just be the shortest paths. You could also take the long way, zigzag all the way to the south of Manhattan, and take a sightseeing tour back north.

By contrast, if there is only one country road, there is often only one way to get from point A to point B.
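To put a rough number on the Manhattan example, here is a small Python sketch that counts only the shortest routes on an idealized rectangular grid. The grid size is an assumption: 11 blocks between 23rd and 34th Streets and 7 avenue blocks between Tenth and Park, ignoring Broadway, one-way streets and any other irregularity. The function name is made up for illustration.

```python
from math import comb


def count_shortest_paths(blocks_north, blocks_across):
    """Count shortest routes on a rectangular grid by dynamic programming."""
    # ways[i][j] = number of shortest routes to the corner that is
    # i blocks north and j blocks across from the starting corner.
    ways = [[1] * (blocks_across + 1) for _ in range(blocks_north + 1)]
    for i in range(1, blocks_north + 1):
        for j in range(1, blocks_across + 1):
            # The last step into a corner comes from the south or from the side.
            ways[i][j] = ways[i - 1][j] + ways[i][j - 1]
    return ways[blocks_north][blocks_across]


manhattan = count_shortest_paths(11, 7)      # assumed 11 x 7 grid
country_road = count_shortest_paths(11, 0)   # a single road, no cross streets
print(manhattan, country_road)
print(manhattan == comb(11 + 7, 7))          # cross-check with the closed form
```

Under these assumptions the grid has 31824 shortest routes, against exactly one for the country road, and that is before counting any of the longer detours.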

The more nodes and connections there are, the more possible paths there are between nodes. As I pointed out in my last post, https://rodgersnotes.wordpress.com/2013/08/12/using-neo4j-to-find-all-parents-child-paths/ the number of paths in a graph can be many multiples of the number of nodes or relationships.

So, given a parent node, finding the distinct set of children can give odd results. Here are some pitfalls to be aware of.

———–

Listing All Child Generations of SYS.STANDARD:

start p = node:node_auto_index ( object_id = '1219' )
return p

Read the rest of this entry »


Using NEO4J To Find Paths To All Parent Or Child Objects

August 12, 2013

If you don’t know already, there are many layers upon layers of objects in Oracle. As I’ve written before, a typical scenario could be:

TYPE
  TABLE
    VIEW
      FUNCTION
        PROCEDURE

When you make your own Oracle schemas, you also create many layers of objects, on top of other objects.

One classic problem I had as a DBA in, shall we say, “fast paced” environments, was that all the developers were making Oracle objects in a trial and error manner. This especially occurred when their main skill set was not Oracle, but say, Java. In these environments, it would not be unusual for a dozen objects to suddenly need to be moved into the test or production environment.

If you did not keep track of the objects, it was difficult to figure out the exact order of operations to create them. If they were not created in the correct order, they would not compile or be created successfully.
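The ordering problem can be sketched in a few lines of Python as a topological sort. This is not how my SQL scripts worked; it only illustrates the idea of creating each object after everything it depends on. The object names are hypothetical, and graphlib needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical objects; each key maps to the set of objects it depends on.
depends_on = {
    "ADDRESS_TYPE": set(),              # TYPE
    "CUSTOMERS": {"ADDRESS_TYPE"},      # TABLE built on the type
    "CUSTOMERS_V": {"CUSTOMERS"},       # VIEW built on the table
    "GET_CUSTOMER": {"CUSTOMERS_V"},    # FUNCTION reading the view
    "PRINT_CUSTOMER": {"GET_CUSTOMER"}, # PROCEDURE calling the function
}

# static_order() yields every object after all of its dependencies.
creation_order = list(TopologicalSorter(depends_on).static_order())
print(creation_order)
```

The hard part in practice is not the sort but building a complete and correct depends_on map in the first place, which is exactly what gets lost when objects are created by trial and error.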

SQL and RDBMS Do Not Represent Trees Well:

It’s partially because of this classic problem that I did so much work creating SQL scripts to determine the order of operations. As I mentioned in my other posts,

https://rodgersnotes.wordpress.com/2012/01/05/finding-all-generations-of-an-objects-children/
https://rodgersnotes.wordpress.com/2011/12/27/scripts-to-find-object-dependencies/
https://rodgersnotes.wordpress.com/2011/12/29/the-parents-and-the-order-of-operations/

the scripts were never perfect. Much of the reason is that RDBMS objects are actually created in a tree structure, whereas SQL returns information in rows and columns, a structure that is clearly not a tree.

It was not unusual to see the same object returned multiple times in the result set. An example might be an error log procedure that was used by every other procedure. If you didn’t know this, what would be the correct order of operations to create this object and the others?
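A small Python sketch shows why the same object keeps coming back. It walks every dependency path from one procedure and records where each path ends, which is roughly what a row-per-path result set looks like. The object names are hypothetical, and the walk assumes the dependencies contain no cycles.

```python
from collections import Counter

# Hypothetical dependencies: each object maps to the objects it uses directly.
depends_on = {
    "PROC_C": ["PROC_B", "ERROR_LOG"],
    "PROC_B": ["PROC_A", "ERROR_LOG"],
    "PROC_A": ["ERROR_LOG"],
    "ERROR_LOG": [],
}


def all_paths(graph, start):
    """Yield every path (a tuple of names) from start to one of its parents.

    Assumes the graph is acyclic; a cycle would make this loop forever.
    """
    stack = [(start,)]
    while stack:
        path = stack.pop()
        for parent in graph[path[-1]]:
            new_path = path + (parent,)
            yield new_path
            stack.append(new_path)


paths = list(all_paths(depends_on, "PROC_C"))
rows_per_object = Counter(path[-1] for path in paths)
print(len(paths), "rows,", len(rows_per_object), "distinct objects")
print(rows_per_object["ERROR_LOG"], "rows end at ERROR_LOG")
```

With only four objects, the error log already shows up on three of the five rows, once for every route that reaches it.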

For more on this classic problem, see this great slide presentation by
Lorenzo Alberton: Trees In The Database,
http://www.slideshare.net/quipo/trees-in-the-database-advanced-data-structures

It covers a number of attempts to represent trees in RDBMS/SQL. However, even with 128 slides, there is no clear or simple solution.

Using Cypher To Find All The Child Paths Of An Object:

After loading all DBA_OBJECTS into the NEO4J graph database, finding all the parents or children becomes pretty easy. First, start with an object.

start p = node:node_auto_index ( object_id = '3192' )
return p

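+--------------------------------------------------------------------------------------+
| p                                                                                    |
+--------------------------------------------------------------------------------------+
| Node[77758]{owner:"SYS",object_name:"DBA_OBJECTS",object_type:"VIEW",object_id:3192} |
+--------------------------------------------------------------------------------------+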

Then run a 6 line Cypher query.   Read the rest of this entry »


Recommendation Engines: RDBMS and SQL, Versus Graph Database

May 30, 2013

For years, I’ve worked with Oracle, doing complex SQL. Recently I’ve been looking at the graph database, NEO4J.

Last night I was watching a NEO4J webinar about Graphs for Gaming.
It shows some interesting Cypher (NEO4J query language) queries for recommendation engines for a gaming company. The point being that certain tasks are much easier in Graph DB/Cypher than in RDBMS/SQL.

The more complex query was to take an individual gamer, Rik, and find other users/gamers, who:
– had worked at one of the same companies as Rik
– spoke one of the same languages as Rik
– had NOT gamed with Rik yet
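The three criteria above can be sketched in Python as set operations. This is not the Cypher query from the webinar, only an illustration of the logic. All of the data is made up except the name Rik, and recommend is a hypothetical helper.

```python
# Hypothetical data; only the name "Rik" comes from the webinar example.
companies = {
    "Rik": {"Acme"},
    "Ann": {"Acme", "Globex"},
    "Bob": {"Acme"},
    "Cy": {"Globex"},
    "Dee": {"Acme"},
}
languages = {
    "Rik": {"Dutch", "English"},
    "Ann": {"English"},
    "Bob": {"French"},
    "Cy": {"English"},
    "Dee": {"Dutch"},
}
gamed_with = {"Rik": {"Dee"}}  # Rik has already played with Dee


def recommend(gamer):
    """Return gamers who share a company and a language but have not played with gamer."""
    already_played = gamed_with.get(gamer, set())
    return sorted(
        other
        for other in companies
        if other != gamer
        and companies[other] & companies[gamer]   # worked at a common company
        and languages[other] & languages[gamer]   # speak a common language
        and other not in already_played           # have NOT gamed together yet
    )


print(recommend("Rik"))
```

Bob fails the language test, Cy fails the company test, and Dee has already played with Rik, so only Ann is left.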

The 12 line Cypher query was:   Read the rest of this entry »