NEO4J – Shortest Paths in the Cineasts Database

August 19, 2013

As I pointed out in my last post, Counting Many Paths Between Nodes In NEO4J
http://rodgersnotes.wordpress.com/2013/08/16/counting-many-paths-between-nodes-in-neo4j/

the more complex and interconnected the graph, the number of paths between nodes goes up exponentially. However, those queries did not use the built in functionality, ShortestPath.

Last winter, I watched the video, Cypher for SQL Professionals, with Andres Taylor. In the video, there were a number of queries to the Cineasts database, that did  use the ShortestPath functionality.

Bacon Lucy:

One of the queries in the video was Bacon Lucy. How many nodes of actors and movies separated the performers, Kevin Bacon, and Lucy Liu?   Read the rest of this entry »


Counting Many Paths Between Nodes In NEO4J

August 16, 2013

A parent node in a graph database can have many child nodes.

In addition to the many children, there can be many distinct and different paths to each single child node. An example of multiple paths to the same destination is to look at a subsection of Manhattan.

Small Part of Manhattan

How many ways are there to get from the intersection of West 23rd Street and Tenth Avene, to East 34th Street and Park Avenue South? You could do only one turn: go north, and turn west. Or go west and turn north. Or you could do many turns, zig-zagging your way through the streets on the grid. And those would just be the shortest paths. You could also take the long way, zig zag all the way to the south of Manhattan, and take a site seeing tour back north.

By contrast, if there is only one country road, there is often only one way to get from point A to point B.

The more nodes and connections there are, the more the possible paths there are between nodes. As I pointed out in my last post, https://rodgersnotes.wordpress.com/2013/08/12/using-neo4j-to-find-all-parents-child-paths/ the number of paths in a graph, can be many multiples more than the number of nodes, or relationships.

So, given a parent node, finding the distinct set of children can give odd results. Here are some pitfalls to be aware of.

———–

Listing All Child Generations of SYS.STANDARD:

start p = node:node_auto_index ( object_id = ’1219′  )
return p     Read the rest of this entry »


Using NEO4J To Find Paths To All Parent Or Child Objects

August 12, 2013

If you don’t know already, there are many layers, upon layers of objects in Oracle. As I’ve wrote before a typical scenario could be:

TYPE
  TABLE
    VIEW
      FUNCTION
        PROCEDURE

When you make your own Oracle schemas, you also create many layers of objects, on top of other objects.

One classic problem I had as a DBA in, shall we say, “fast paced” environments, was that all the developers were making Oracle objects in a trial and error manner. This especially occured when their main skill set was not Oracle, but say, java. In these environments, it would not be unusual for a dozen objects to suddenly need to be moved into the test or production environment.

If you did not keep track of the objects, it would be a difficult time to figure out the exact order of operations to create them. If they were not done in the correct order, they would not compile or be created successfully.

SQL and RDBMS Do Not Represent Trees Well:

It’s partially because of this classic problem, that I did so much work creating SQL scripts to determine the order of operations. As I mentioned in my other posts,

https://rodgersnotes.wordpress.com/2012/01/05/finding-all-generations-of-an-objects-children/
https://rodgersnotes.wordpress.com/2011/12/27/scripts-to-find-object-dependencies/
http://rodgersnotes.wordpress.com/2011/12/29/the-parents-and-the-order-of-operations/

the scripts were never perfect. Much of the reason being that, the RDBMS objects are actually created in a tree structure. However, SQL returns information in rows and columns, a structure that is clearly not a tree.

It was not unusual to see the same object returned multiple times in the result set. An example might be an error log procedure that was used by every other procedure. If you didn’t know this, what would be the correct order of operations to create this object and the others?

For more on this classic problem, see this great slide presentation by
Lorenzo Alberton: Trees In The Database,
http://www.slideshare.net/quipo/trees-in-the-database-advanced-data-structures

It covers a number of attempts to represent trees in RDBMS/SQL. However, even with 128 slides, there is no clear or simple solution.

Using Cypher To Find All The Child Paths Of An Object:

After loading all DBA_OBJECTS into the NEO4J graph database, finding all the parents or children becomes pretty easy. First, start with an object.

start p = node:node_auto_index ( object_id = ’3192′  )
return p

+————————————————————————————–+
| p                                                                                    |
+————————————————————————————–+
| Node[77758]{owner:”SYS”,object_name:”DBA_OBJECTS”,object_type:”VIEW”,object_id:3192} |
+————————————————————————————–+

Then run a 6 line Cypher query.   Read the rest of this entry »


Oracle Scripts To Recompile Invalid Objects

August 11, 2013

One of the classic problems a DBA can have is invalid objects. If a view becomes invalid, any other objects using that view, such as a procedure, also become invalid. Then when the procedure is called, it will not run.

Some scripts that have been useful to me are:

invalid.sql:

Read the rest of this entry »


Visualizing Almost Fifty Thousand DBA_OBJECTS In A Graph

August 6, 2013

What do almost fifty thousand Oracle objects look like?

Continuing my exploration of graph databases, I loaded every object from DBA_OBJECTS into a NEO4J graph and visualized it with Gephi.

That included all the objects from Oracle 11.2 DBA_OBJECTS, with the exception of the Java objects: Java Source, Java Class, Java Data. It also included some other schemas I’ve loaded into the database, such as Perfstat, SH, SCOTT, BI, etc. Altogether, 48,690 objects, and 61,710 relationships were inserted into NEO4J and then imported into Gephi.


Good Data Source
:

As I did this, I thought that DBA_OBJECTS makes a rather good dataset to experiment with. It’s freely available to any DBA. The data is not sensitive. There is lots of data:  tens of thousands of rows. The relationships between the objects are listed in DBA_DEPENDENCIES. Most all the data points represent a connected tree structure.  This is exactly what NEO4J and Gephi work well with.


OpenOrd Layout
:

Gephi has a number of layouts to work with. I used the OpenOrd layout in Gephi to visualize all the data. OpenOrd completed the layout quickly, in a few minutes. By contast, the Fruchterman Reingold layout did not complete even after I let it run all night. Perhaps Fruchterman Reingold is only good for small data sets.

I partitioned according to the Object_Type (Synonym, View, Index, Table), using the default colors Gephi provided. Then set the edge color to red.  See what the results looked like.


The Big Picture
:

OpenOrd showed stelliums of objects clustered together.

DBA_OBJECTS Visualized In OpenOrd Layout, wide

DBA_OBJECTS Visualized In OpenOrd Layout, wide

They remind me of constellations in the night sky.

Read the rest of this entry »


DBA_OBJECTS Tree – Modelled As A Graph in NEO4J, Visualized With Gephi

July 31, 2013

The structure for Oracle objects is a tree structure. Not a table structure in rows and columns.

As I wrote in my posts on the parents and children of database objects, a tree structure of Oracle object might be:

TYPE
   TABLE
      VIEW
         FUNCTION
            PROCEDURE

In another post, there is a query I wrote to show object dependencies. DBA_DEPENDENCIES will show the object’s parents. Or the children. But only one level up, or down.

Trying to see the tree structure using SQL queries is problematic. The reason being, the output from SQL is not a tree at all. And, as I wrote before, the very same object can be seen multiple times in the output.

NEO4J And Gephi:

Recently, I’ve been using the graph database, NEO4J. It is perfectly suited to create tree structures and store them in a database.

In Oracle 11.2, I snagged the data from DBA_OBJECTS and DBA_DEPENDENCIES, created Cypher commands to insert the nodes into NEO4J, and created the relationships/vertexes between the nodes. Then I used Gephi to visualize the graph with different layouts. See some different output.

All the parents and children of the View, SYS.DBA_OBJECTS, and the Synonym, PUBLIC.DBA_OBJECTS.

Yifan Hu Layout of DBA_OBJECTS

Yifan Hu Layout of DBA_OBJECTS

Using the Yifan Hu layout radiates all the nodes.  Read the rest of this entry »


Installing Gephi On 64 bit Machines

July 8, 2013

The instructions for installing Gephi can be found at:

https://gephi.org/users/install/

They seem simple enough. But I found out that on a 64 bit server, it’s really important that you have a full 64/64/64 bit combination. That represents the operating system, Java, and Gephi. Here are some hints to get it right.


Java
:

Be sure to download the right java file.   Read the rest of this entry »


%d bloggers like this: