Importing Forests into Neo4j

Posted by Michael Hunger on Apr 10, 2014 in cypher, neo4j

Sometimes you don’t see the forest for the trees. But if you do, you probably use a graph database.

Giant Tree

Trees are one of the simplest graph data structures, a special case of directed acyclic graphs (DAGs).

For our example we use a time-tree that we want to import into the database.

Data Volume

A quick Soulver script (thanks Mark) later, we know how many nodes and relationships (nodes - 1) we will have to
import to represent a full year down to the second level.

1 year = 12 months = 365 days = 8,760 hours = 525,600 minutes = 31,536,000 seconds

So we have to import about 32M nodes and 32M relationships. Sounds good enough.
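The same arithmetic as a small Python sketch, in case you want to check the totals yourself. It assumes one node per time unit at every level (year, month, day, hour, minute, second) and one relationship from each node to its parent. The names LEVELS and time_tree_size are made up for this illustration.

```python
# Number of nodes at each level of the time tree, taken from the figures above.
LEVELS = [
    ("year", 1),
    ("month", 12),
    ("day", 365),                    # non-leap year
    ("hour", 365 * 24),
    ("minute", 365 * 24 * 60),
    ("second", 365 * 24 * 60 * 60),
]


def time_tree_size(levels=LEVELS):
    """Return (nodes, relationships) for a tree with one node per time unit."""
    nodes = sum(count for _, count in levels)
    # Every node except the root has exactly one relationship to its parent.
    return nodes, nodes - 1


nodes, rels = time_tree_size()
print(f"{nodes:,} nodes, {rels:,} relationships")
# prints "32,070,738 nodes, 32,070,737 relationships"
```

The seconds level dominates everything else, which is why the total lands just above 32M.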

Read more…


Sampling A Neo4j Database

Posted by Michael Hunger on Mar 25, 2014 in cypher, neo4j

After reading the interesting blog post of my colleague Rik van Bruggen on “Media, Politics and Graphs” I thought it would be really cool to render it as a GraphGist, especially as he had already shared all the queries as a GitHub Gist.


Unfortunately the dataset was a bit large for a sensible GraphGist representation, so I thought about means of extracting a smaller sample of his raw data that he made available (see his blog post for the link).

Read more…


Quickly create a 100k Neo4j graph data model with Cypher only

Posted by Michael Hunger on Mar 21, 2014 in cypher, neo4j

We want to run some test queries on an existing graph model but have no sample data at hand and also no input files (CSV, GraphML) that would provide it.

Why not quickly create it on our own, using just Cypher? First I thought about using Cypher to generate CSV files and loading them back, but it is much easier than that.

The domain is simple, (:User)-[:OWN]->(:Product), but good enough for collaborative filtering or demographic analysis.

Read more…


Full-Text-Indexing (FTS) in Neo4j 2.0

Posted by Michael Hunger on Mar 17, 2014 in neo4j

With Neo4j 2.0 we got automatic schema indexes based on labels and properties for exact lookups of nodes on property values.

Fulltext and other indexes (spatial, range) are on the roadmap but not addressed yet.

For fulltext indexes you still have to use legacy indexes.

As you probably don’t want to add nodes to an index manually, the existing “auto-index” mechanism should be a good fit.

Read more…


Random Thoughts (Ordering) in Cypher

Posted by Michael Hunger on Mar 5, 2014 in cypher

Random Sort

There was a question on the Neo4j Google Group about returning results in a random order in Cypher. So I thought explaining it in a blog post (this one) and an interactive GraphGist would be better than just answering the email.

Read more…


Cleaning Out Your Graph

Posted by Michael Hunger on Feb 28, 2014 in cypher

If you want to delete lots of data from a Neo4j database with Cypher

Just stop the server and delete the directory and start again

This is the fastest way and leaves no leftovers: just delete db/data/graph.db and you’re done.

Cypher Statement before 2.1

An “Unknown Error” or an OutOfMemoryException is a symptom that your transaction has grown too big and consumes too much memory.

That is unrelated to your config; you just have to keep the transaction size in check.

If you want to delete elements in a batched way use something like this:

LIMIT 10000

Run it until the result stays 0. This query finds at most 10000 nodes, then finds all their relationships, and then deletes both. But how would you do it in Neo4j 2.1?
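For the pre-2.1 case, the "run until the result stays 0" loop could look roughly like the Python sketch below. Only the LIMIT 10000 fragment of the statement is visible in this excerpt, so the Cypher text here is an assumed reconstruction from the description, not the original. run_query is a hypothetical callable that executes one statement in its own transaction and returns the count; make_fake_runner only simulates a database so the loop can be tried without a server.

```python
def delete_in_batches(run_query, batch_size=10000, max_rounds=100000):
    """Repeat a batched delete until the statement reports 0 deleted rows.

    run_query is a hypothetical callable: it takes a Cypher statement,
    runs it in its own transaction and returns the integer from count(*).
    """
    # Assumed statement: take at most batch_size nodes, find their
    # relationships, delete both, and report how many rows were processed.
    statement = (
        f"MATCH (n) WITH n LIMIT {batch_size} "
        "OPTIONAL MATCH (n)-[r]-() "
        "DELETE n, r RETURN count(*)"
    )
    total = 0
    for _ in range(max_rounds):  # safety cap so a broken runner cannot loop forever
        deleted = run_query(statement)
        if deleted == 0:
            break
        total += deleted
    return total


def make_fake_runner(row_count, batch_size):
    """Simulate a database holding row_count rows (for illustration only)."""
    remaining = [row_count]

    def run(statement):
        batch = min(batch_size, remaining[0])
        remaining[0] -= batch
        return batch

    return run


print(delete_in_batches(make_fake_runner(25000, 10000)))  # prints 25000
```

Each call is its own transaction, which is what keeps the memory use bounded.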

Read more…


State of the Cypher UNION

Posted by Michael Hunger on Feb 27, 2014 in cypher, neo4j

Neo4j 2.0 introduced the UNION (ALL) clause which can join the results of 2 or more complete statements into a single result. Each of the statements is fully formed, it can contain result projection, pagination and

The statements joined in a UNION need to have the same number and names of columns. By default, UNION returns the distinct set of results.
Using UNION ALL will return the full results (and will be faster and less memory intensive).
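To make the difference concrete, here is a small Python sketch of the two behaviours on in-memory result sets. It only models the semantics described above and is not how Neo4j implements them. Each result is a hypothetical (columns, rows) pair, and keeping first-seen row order in union is a choice made for this sketch.

```python
def union_all(*results):
    """Concatenate result sets that share the same column names."""
    columns = results[0][0]
    rows = []
    for cols, data in results:
        if cols != columns:
            raise ValueError(f"column mismatch: {cols} vs {columns}")
        rows.extend(data)
    return columns, rows


def union(*results):
    """Like union_all, but keep only the distinct rows."""
    columns, rows = union_all(*results)
    seen = set()  # the extra bookkeeping that UNION ALL does not need
    distinct = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            distinct.append(row)
    return columns, distinct


users = (("name",), [("Alice",), ("Bob",)])
admins = (("name",), [("Bob",), ("Carol",)])

print(union_all(users, admins)[1])  # four rows, Bob appears twice
print(union(users, admins)[1])      # three rows, Bob appears once
```

The seen set is the reason the distinct variant costs more time and memory than simply streaming all rows through.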

Read more…


Your next Neo4j App is just a few Lines of Code away

Posted by Michael Hunger on Feb 18, 2014 in code, neo4j

The transactional HTTP endpoint that was added to Neo4j 2.0 is really easy to use.

You can stream batches of Cypher statements with their parameters to the server and receive the answers in a streaming fashion, too.

The basic usage, one transaction per batched request, takes just 3 lines of JavaScript.
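As a rough idea of what such a request looks like, here is a Python sketch (the post itself uses JavaScript) that builds a single-request transaction for the /db/data/transaction/commit endpoint. The host, port, the helper name build_commit_request and the sample statement are assumptions for illustration; the sketch only builds the request, and the lines that would send it are left commented out.

```python
import json
import urllib.request


def build_commit_request(statements, base_url="http://localhost:7474"):
    """Build a POST request that runs all statements in one transaction.

    statements is a list of (cypher_text, parameters) pairs.
    base_url is an assumed default for a local server.
    """
    payload = {
        "statements": [
            {"statement": text, "parameters": params}
            for text, params in statements
        ]
    }
    return urllib.request.Request(
        base_url + "/db/data/transaction/commit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


request = build_commit_request([
    # {name} is the Cypher parameter placeholder syntax of Neo4j 2.0.
    ("CREATE (u:User {name: {name}}) RETURN id(u)", {"name": "Alice"}),
])

# To actually send it to a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Several statements can go into the same list, and they are all committed together when the request succeeds.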

Read more…


On Using Neo4j-Shell with Basic-Auth over Http on a Remote Server

Posted by Michael Hunger on Feb 18, 2014 in code, neo4j

Accessing your beloved Neo4j-Shell via RMI works ok on localhost or in your intranet. But over the internet you don’t really want to expose RMI ports.

So most installations of Neo4j, e.g. on EC2, use basic auth as the simplest security measure.
Also, Neo4j hosting providers like GrapheneDB.com or GraphHost.com offer basic auth by default to “secure” access to your Neo4j instance, and hopefully SSL soon as well.

How do you access the Neo4j Shell on the server from your usual terminal command line?
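Whatever client ends up talking to the server, the basic auth part itself is simple: the user name and password are joined with a colon, base64-encoded and sent in an Authorization header. A minimal Python sketch of just that step, with a hypothetical helper name and made-up credentials:

```python
import base64


def basic_auth_header(user, password):
    """Return the HTTP header for basic auth with the given credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


# Made-up credentials, for illustration only.
print(basic_auth_header("neo4j", "secret"))
```

Base64 is an encoding, not encryption, which is why the quotes around “secure” are deserved as long as SSL is missing.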

Read more…


Exploring an Unknown Neo4j Database

Posted by Michael Hunger on Feb 17, 2014 in code, neo4j

Sometimes when you work with the Neo4j community you get handed a database that you don’t know anything about.
Then it is handy to get an idea of what’s in there: which kinds of node labels are used, which relationship types connect these nodes, and which properties are floating around.

Usually all those answers are just a Cypher statement away:

Labels and their occurrence:

MATCH (n) RETURN labels(n)[0],count(*);

What is connected and how (stolen from Neo4j-Browser):

MATCH (a)-[r]->(b)
RETURN labels(a)[0] AS This, type(r) AS To, labels(b)[0] AS That, count(*) AS Count
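If it helps to see what these two aggregations compute, here is the same counting done in Python over a tiny in-memory stand-in for a database. The data, the variable names and the first_label helper are made up for this sketch. Note that, like labels(n)[0], it only looks at the first label of each node.

```python
from collections import Counter

# Hypothetical sample data: node id -> labels, and (start, type, end) triples.
nodes = {
    1: ["User"],
    2: ["User"],
    3: ["Product"],
}
rels = [(1, "OWN", 3), (2, "OWN", 3)]


def first_label(labels):
    """First label of a node; None for unlabeled nodes (a choice of this sketch)."""
    return labels[0] if labels else None


# Labels and their occurrence.
label_counts = Counter(first_label(labels) for labels in nodes.values())

# What is connected and how: (This, To, That) -> Count.
pattern_counts = Counter(
    (first_label(nodes[a]), rel_type, first_label(nodes[b]))
    for a, rel_type, b in rels
)

print(label_counts)
print(pattern_counts)
```

Both counters touch every node or every relationship once, which hints at why this gets slow on a big store.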

A Problem: Large Database

Read more…

Copyright © 2007-2014 Better Software Development All rights reserved.