Installing NEO4J on Windows 7

November 7, 2014

Installing and running NEO4J on Windows is a little different from installing it on Linux.

NEO4J provides two options for installing on Windows. One is a dumbed down version, bundled in an .EXE file, that includes the necessary Java. The other Windows version comes in a .ZIP file and more closely resembles a Linux installation.

With NEO4J on Linux, a correct version of the Java JDK/SDK is assumed to be installed, but not so on Windows. Java JDK/SDK installations do have a certain learning curve, and some required reading, which is not suitable for some users.

And then there is Java licensing. Apparently, it’s ok to bundle Java with an executable file. But it’s not ok for NEO4J to redistribute the correct Java JDK/SDK on its download page, and allow users to download it.

If you are serious about NEO4J, you’ll probably want to use the full blown Windows .ZIP version. For one thing, it follows the standard paradigm and navigation. You’ll get the full functionality. And, it’s not that difficult to use.


Presentation: Graph Databases – Overview and Applications

June 6, 2014

In April 2014, I gave a presentation at my Alma Mater, the University of Winnipeg: Graph Databases – Overview and Applications

It was presented to the faculty and students of the Applied Computer Science Master’s program.

Most had not seen graph databases before. However, I expect that some of them will be using graphs in the near future.  🙂

A PDF of the presentation can be found here:



NEO4J – Shortest Paths in the Cineasts Database

August 19, 2013

As I pointed out in my last post, Counting Many Paths Between Nodes In NEO4J
https://rodgersnotes.wordpress.com/2013/08/16/counting-many-paths-between-nodes-in-neo4j/

as the graph becomes more complex and interconnected, the number of paths between nodes goes up exponentially. However, those queries did not use the built-in functionality, ShortestPath.

Last winter, I watched the video, Cypher for SQL Professionals, with Andres Taylor. In the video, there were a number of queries to the Cineasts database that did use the ShortestPath functionality.

Bacon Lucy:

One of the queries in the video was Bacon Lucy. How many nodes of actors and movies separate the performers Kevin Bacon and Lucy Liu?
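The idea behind a shortest path question like this one can be sketched in plain Python with a breadth-first search. This is only an illustration of the concept, not how NEO4J implements ShortestPath. The tiny actor/movie graph and the `shortest_path` helper are invented for the example and are not Cineasts data.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal as a list of nodes, or None."""
    previous = {start: None}  # remembers how each node was first reached
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk backwards from the goal to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical graph: actors connect to movies, movies connect to actors.
graph = {
    "Actor A": ["Movie 1"],
    "Movie 1": ["Actor A", "Actor B"],
    "Actor B": ["Movie 1", "Movie 2"],
    "Movie 2": ["Actor B", "Actor C"],
    "Actor C": ["Movie 2"],
}

path = shortest_path(graph, "Actor A", "Actor C")
print(path)           # ['Actor A', 'Movie 1', 'Actor B', 'Movie 2', 'Actor C']
print(len(path) - 2)  # 3 nodes separate the two actors
```

Breadth-first search visits nodes in order of distance from the start, so the first time it reaches the goal it has found a path with the fewest hops.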


Counting Many Paths Between Nodes In NEO4J

August 16, 2013

A parent node in a graph database can have many child nodes.

In addition to the many children, there can be many distinct paths to each single child node. A subsection of Manhattan gives an example of multiple paths to the same destination.

Small Part of Manhattan

How many ways are there to get from the intersection of West 23rd Street and Tenth Avenue, to East 34th Street and Park Avenue South? You could make only one turn: go north, and turn east. Or go east and turn north. Or you could make many turns, zig-zagging your way through the streets on the grid. And those would just be the shortest paths. You could also take the long way, zig-zag all the way to the south of Manhattan, and take a sightseeing tour back north.

By contrast, if there is only one country road, there is often only one way to get from point A to point B.
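To put a rough number on the grid example, here is a minimal Python sketch. It assumes an idealized rectangular grid where a shortest route only ever moves north or east. The block counts (11 north, 7 east) are illustrative assumptions, not measured from a map, and `shortest_grid_paths` is a name made up for this example.

```python
from math import comb


def shortest_grid_paths(blocks_north, blocks_east):
    """Count the shortest routes on a rectangular grid.

    Every shortest route is some ordering of the north moves and the
    east moves, so the count is a binomial coefficient.
    """
    return comb(blocks_north + blocks_east, blocks_east)


print(shortest_grid_paths(11, 7))  # 31824 shortest routes on the assumed grid
print(shortest_grid_paths(11, 0))  # 1, the single country road
```

Even on this small assumed grid, there are tens of thousands of shortest routes, before counting any of the longer ones.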

The more nodes and connections there are, the more possible paths there are between nodes. As I pointed out in my last post, https://rodgersnotes.wordpress.com/2013/08/12/using-neo4j-to-find-all-parents-child-paths/ the number of paths in a graph can be many multiples of the number of nodes or relationships.

So, given a parent node, finding the distinct set of children can give odd results. Here are some pitfalls to be aware of.
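The pitfall can be seen in a small Python sketch. The graph below is invented for illustration (it is not the Oracle data), it is assumed to have no cycles, and `all_paths` is a hypothetical helper rather than anything NEO4J provides.

```python
def all_paths(children, node):
    """Yield every downward path that starts at node, as a tuple of names."""
    for child in children.get(node, ()):
        yield (node, child)
        for rest in all_paths(children, child):
            yield (node,) + rest


# Hypothetical graph: C can be reached through A or through B.
children = {
    "P": ["A", "B"],
    "A": ["C"],
    "B": ["C"],
    "C": ["D"],
}

paths = list(all_paths(children, "P"))
end_nodes = [path[-1] for path in paths]
distinct_children = sorted(set(end_nodes))

print(len(paths))         # 6 paths
print(end_nodes)          # ['A', 'C', 'D', 'B', 'C', 'D']
print(distinct_children)  # ['A', 'B', 'C', 'D']
```

Six paths come back for only four distinct descendants, because C and D are each reached twice. Counting path results is not the same as counting children.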

———–

Listing All Child Generations of SYS.STANDARD:

start p = node:node_auto_index ( object_id = '1219' )
return p


Using NEO4J To Find Paths To All Parent Or Child Objects

August 12, 2013

If you don’t know already, there are layers upon layers of objects in Oracle. As I’ve written before, a typical scenario could be:

TYPE
  TABLE
    VIEW
      FUNCTION
        PROCEDURE

When you make your own Oracle schemas, you also create many layers of objects, on top of other objects.

One classic problem I had as a DBA in, shall we say, “fast paced” environments, was that all the developers were making Oracle objects in a trial and error manner. This especially occurred when their main skill set was not Oracle, but say, Java. In these environments, it would not be unusual for a dozen objects to suddenly need to be moved into the test or production environment.

If you did not keep track of the objects, it was difficult to figure out the exact order of operations to create them. If they were not created in the correct order, they would not compile or be created successfully.

SQL and RDBMS Do Not Represent Trees Well:

It’s partially because of this classic problem, that I did so much work creating SQL scripts to determine the order of operations. As I mentioned in my other posts,

https://rodgersnotes.wordpress.com/2012/01/05/finding-all-generations-of-an-objects-children/
https://rodgersnotes.wordpress.com/2011/12/27/scripts-to-find-object-dependencies/
https://rodgersnotes.wordpress.com/2011/12/29/the-parents-and-the-order-of-operations/

the scripts were never perfect. Much of the reason is that the RDBMS objects are actually created in a tree structure. However, SQL returns information in rows and columns, a structure that is clearly not a tree.

It was not unusual to see the same object returned multiple times in the result set. An example might be an error log procedure that was used by every other procedure. If you didn’t know this, what would be the correct order of operations to create this object and the others?
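One standard way to answer that question, once the dependencies are known, is a topological sort. The sketch below is not the approach taken in the SQL scripts mentioned above. It uses Python's standard `graphlib` module (Python 3.9 or later), and every object name in it is hypothetical.

```python
from graphlib import TopologicalSorter


def creation_order(depends_on):
    """Return the objects in an order where every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical objects: each key depends on the objects in its set.
# The error log procedure is used by more than one other object.
depends_on = {
    "EMP_TABLE": {"ADDRESS_TYPE"},
    "EMP_VIEW": {"EMP_TABLE"},
    "GET_EMP_FUNCTION": {"EMP_VIEW", "LOG_ERROR_PROCEDURE"},
    "LOAD_EMP_PROCEDURE": {"GET_EMP_FUNCTION", "LOG_ERROR_PROCEDURE"},
}

for name in creation_order(depends_on):
    print(name)
```

Each object appears exactly once, after everything it depends on. Several valid orders can exist, so the exact sequence printed is not guaranteed, only that dependencies come first.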

For more on this classic problem, see this great slide presentation by
Lorenzo Alberton: Trees In The Database,
http://www.slideshare.net/quipo/trees-in-the-database-advanced-data-structures

It covers a number of attempts to represent trees in RDBMS/SQL. However, even with 128 slides, there is no clear or simple solution.

Using Cypher To Find All The Child Paths Of An Object:

After loading all DBA_OBJECTS into the NEO4J graph database, finding all the parents or children becomes pretty easy. First, start with an object.

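start p = node:node_auto_index ( object_id = '3192' )
return p

+--------------------------------------------------------------------------------------+
| p                                                                                    |
+--------------------------------------------------------------------------------------+
| Node[77758]{owner:"SYS",object_name:"DBA_OBJECTS",object_type:"VIEW",object_id:3192} |
+--------------------------------------------------------------------------------------+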

Then run a 6 line Cypher query.


Visualizing Almost Fifty Thousand DBA_OBJECTS In A Graph

August 6, 2013

What do almost fifty thousand Oracle objects look like?

Continuing my exploration of graph databases, I loaded every object from DBA_OBJECTS into a NEO4J graph and visualized it with Gephi.

That included all the objects from Oracle 11.2 DBA_OBJECTS, with the exception of the Java objects: Java Source, Java Class, Java Data. It also included some other schemas I’ve loaded into the database, such as Perfstat, SH, SCOTT, BI, etc. Altogether, 48,690 objects, and 61,710 relationships were inserted into NEO4J and then imported into Gephi.


Good Data Source:

As I did this, I thought that DBA_OBJECTS makes a rather good dataset to experiment with. It’s freely available to any DBA. The data is not sensitive. There is lots of data: tens of thousands of rows. The relationships between the objects are listed in DBA_DEPENDENCIES. Almost all the data points represent a connected tree structure. This is exactly what NEO4J and Gephi work well with.


OpenOrd Layout:

Gephi has a number of layouts to work with. I used the OpenOrd layout in Gephi to visualize all the data. OpenOrd completed the layout quickly, in a few minutes. By contrast, the Fruchterman Reingold layout did not complete even after I let it run all night. Perhaps Fruchterman Reingold is only good for small data sets.

I partitioned according to the Object_Type (Synonym, View, Index, Table), using the default colors Gephi provided. Then I set the edge color to red. See what the results looked like.


The Big Picture:

OpenOrd showed stelliums of objects clustered together.

DBA_OBJECTS Visualized In OpenOrd Layout, wide

They remind me of constellations in the night sky.



DBA_OBJECTS Tree – Modelled As A Graph in NEO4J, Visualized With Gephi

July 31, 2013

The structure for Oracle objects is a tree structure. Not a table structure in rows and columns.

As I wrote in my posts on the parents and children of database objects, a tree structure of Oracle objects might be:

TYPE
   TABLE
      VIEW
         FUNCTION
            PROCEDURE

In another post, there is a query I wrote to show object dependencies. DBA_DEPENDENCIES will show the object’s parents. Or the children. But only one level up, or down.

Trying to see the tree structure using SQL queries is problematic. The reason is that the output from SQL is not a tree at all. And, as I wrote before, the very same object can be seen multiple times in the output.
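To illustrate going beyond one level, here is a minimal Python sketch that walks one-level dependency rows generation by generation and lists each object only once. The rows are hypothetical `(name, referenced_name)` pairs standing in for what a query against DBA_DEPENDENCIES might return, and `generations` is a helper made up for this example.

```python
from collections import defaultdict


def generations(rows, root):
    """Group every descendant of root by generation, listing each object once."""
    children = defaultdict(list)
    for name, referenced_name in rows:
        # name depends on referenced_name, so name is the child.
        children[referenced_name].append(name)

    seen = {root}
    current = [root]
    levels = []
    while current:
        next_level = []
        for parent in current:
            for child in children[parent]:
                if child not in seen:  # skip objects already listed
                    seen.add(child)
                    next_level.append(child)
        if next_level:
            levels.append(next_level)
        current = next_level
    return levels


# Hypothetical one-level rows: (name, referenced_name).
rows = [
    ("EMP_TABLE", "ADDRESS_TYPE"),
    ("EMP_VIEW", "EMP_TABLE"),
    ("GET_EMP", "EMP_VIEW"),
    ("LOAD_EMP", "GET_EMP"),
    ("LOAD_EMP", "EMP_TABLE"),
]

for depth, names in enumerate(generations(rows, "ADDRESS_TYPE"), start=1):
    print(depth, names)
# 1 ['EMP_TABLE']
# 2 ['EMP_VIEW', 'LOAD_EMP']
# 3 ['GET_EMP']
```

LOAD_EMP depends on both EMP_TABLE and GET_EMP, but the `seen` set keeps it from being listed twice. It is reported at the first generation where it is reached.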

NEO4J And Gephi:

Recently, I’ve been using the graph database, NEO4J. It is perfectly suited to create tree structures and store them in a database.

In Oracle 11.2, I snagged the data from DBA_OBJECTS and DBA_DEPENDENCIES, created Cypher commands to insert the nodes into NEO4J, and created the relationships (edges) between the nodes. Then I used Gephi to visualize the graph with different layouts. See some different output.

All the parents and children of the View, SYS.DBA_OBJECTS, and the Synonym, PUBLIC.DBA_OBJECTS.

Yifan Hu Layout of DBA_OBJECTS

Using the Yifan Hu layout radiates all the nodes.


Recommendation Engines: RDBMS and SQL, Versus Graph Database

May 30, 2013

For years, I’ve worked with Oracle, doing complex SQL. Recently I’ve been looking at the graph database, NEO4J.

Last night I was watching a NEO4J webinar about Graphs for Gaming.
It presents some interesting Cypher (NEO4J query language) queries for recommendation engines for a gaming company. The point is that certain tasks are much easier in Graph DB/Cypher than in RDBMS/SQL.

The more complex query was to take an individual gamer, Rik, and find other users/gamers, who:
– had worked at one of the same companies as Rik
– spoke one of the same languages as Rik
– had NOT gamed with Rik yet

The 12 line Cypher query was:


NEO4J: Finding Object Information Using the Web Interface

February 4, 2013

NEO4J has provided a very functional web interface to find information on objects.  If you run the following query,

START n = node(*)
WHERE has (n.name)
and (n.name="Lucy Liu")
return n

you will get one object back, the node for Lucy Liu. With the web interface, you can then click on the node and see much of the information about it. Right click, open in new tab.

Node 1000 Detail Lucy Liu



Getting Started With NEO4J: For Database Professionals

February 3, 2013

Why Use the NEO4J Graph Database?

About 10 years ago, I went to a BIO-IT conference, and looked at a spiralling 3D model of a very large molecule made up of hundreds of atoms. I thought, “that does not look like the rows and columns in a relational database”.