ted.neward@newardassociates.com | Blog: http://blogs.newardassociates.com | Github: tedneward | LinkedIn: tedneward
What is Neo4J?
Why and when should you consider using it?
How do you use it?
"... an abstract, self-contained, logical definition of the objects, operators, and so forth, that together constitute the abstract machine with which users interact."
"The Third Manifesto", p11 (Date, Darwen, 1994)
A particular domain schema or collection of entity types
A particular "shape" to domain data models
Influences storage and relationships between entities
Influences performance of reads, writes, etc
Structures: how is the data formatted ("shaped")?
Constraints: what rules are enforced on the data (if any)?
Operations: what can we do with the data (retrieve, manipulate, etc)?
Relational: relations, tuples, and relvars
If you don't know what these are, you don't know your relational theory
strongly-typed, enforced by the database
Objects
capturing the object graphs that appear in O-O systems
strongly-typed, defined by an O-O language (not external schema)
Key-value pairs
CRUD based solely on the primary key; no joins
weakly- or untyped
Documents
collections of named fields holding data (or more collections)
weakly- or untyped
Graphs
capturing not just graph structures, but the "arcs" between nodes
graph-based query API/language
Columns
think tables, then turn your head 90 degrees
we group by columns, not by rows
Hierarchical
single-rooted strictly "one-way" acyclic graphs
generally, these are XML stores
"Network"
predecessor type to RDBMSs, finding some interest again
collection of values with "pointers" to related data
users manually "follow" the pointers in code
Hybrids of all the above ("multi-model")
any attempt to use one model from a different model is problematic
fundamental assumptions of one not present in the other, and vice versa
called an impedance mismatch
significant loss of functionality
attempts to mitigate through tooling
Object-Relational impedance mismatch is only the most obvious
which we attempt to solve through tooling (O/R-Ms)
Object-Hierarchical, Object-Document, Object-Graph, ...
the "object" on the left-hand side is because of our choice of languages
sort of begs the question: what if we had different languages?
four fundamental atoms:
nodes/vertices
labels (group nodes into sets)
relationships/arcs (between the nodes)
properties (name/value pairs on nodes and relationships)
how we use those depends on the problem
nodes are often "nouns"
arcs often connect nodes as modifiers or qualifiers
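As a rough illustration of how the four atoms fit together, here is a minimal sketch in Cypher (Neo4J's query language, covered later). The City label, the LIVES_IN relationship type, and the property values are invented for the example, not part of any real dataset.

```cypher
// (variable:Label {property: value}) describes a node
// -[:TYPE {property: value}]-> describes a directed relationship (arc)
CREATE (fred:Person {name: 'Fred'})-[:LIVES_IN {since: 1960}]->(bedrock:City {name: 'Bedrock'})
RETURN fred, bedrock
```

Here the nodes are the "nouns" (a Person, a City), and the arc qualifies how they relate; the since property lives on the relationship itself rather than on either node.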
flexible
like document- or hierarchical data models
allows for easy(er) refactoring
data model is "whiteboard friendly"
data model matches the whiteboard exactly
captures data about relationships
in other models, this would need to be modeled
data can be associated with the relationship
cyclic relationships trivial to see, model
nodes can have any number of arc connections
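A small sketch of the last few points, written in Cypher (introduced later in this deck). The names and the since values are made up for illustration. It builds a three-node cycle and then filters on data stored on the relationships themselves.

```cypher
// Three people who know each other in a cycle: Ann -> Bob -> Cy -> Ann
CREATE (a:Person {name: 'Ann'}), (b:Person {name: 'Bob'}), (c:Person {name: 'Cy'}),
       (a)-[:KNOWS {since: 2001}]->(b),
       (b)-[:KNOWS {since: 2005}]->(c),
       (c)-[:KNOWS {since: 2010}]->(a);

// Filter on a property of the relationship, not of the nodes
MATCH (x:Person)-[r:KNOWS]->(y:Person)
WHERE r.since < 2006
RETURN x.name, y.name, r.since;
```

Depending on the client, the two statements may need to be run one at a time. In a relational model, the since value would typically require a separate join table; here it is just a property on the arc.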
concerns over scale
concerns over query performance
nonstandard query language
lack of schema enforcement (beyond nodes/arcs/etc)
Neo4J (https://neo4j.com)
TitanDB (http://thinkaurelius.github.io/titan/)
flexible data storage engines
Infogrid (http://www.infogrid.org)
Neo4J: graph-oriented
data model is basically "just" nodes and arcs
graph API (Visitor) for navigation/query
Java implementation
http://www.neo4j.org
Upshot: perfect for graph analysis and storage
Download: https://neo4j.com/download-neo4j-now/
Package managers:
macOS: brew install neo4j
Linux:
Windows:
neo4j-admin
CLI tool
restricted subset available through neo4j
neo4j / neo4j-admin server commands:
status: return server status
console: start the server in the foreground
start: start the server in the background
stop: stop the background server
restart: restart the background server
http://localhost:7474/browser/ brings up GUI dashboard/UI
this is the Editor, the primary (developer) interface
help topics
connection to Neo4J databases
execute Editor commands (colon-prefixed)
execute Cypher (query language) commands
each command creates a "result frame"
system: administrative upkeep and management
neo4j: default database, empty at start
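To see these databases from the Editor, something like the following should work, assuming a Neo4J release with multi-database support (4.0 or later); on some versions the command has to be run while connected to the system database.

```cypher
// Lists the databases known to the server; on a fresh install
// this is expected to include both system and neo4j
SHOW DATABASES
```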
Cypher is Neo4J's query language, with its own syntax
vaguely SQL-ish, but well-suited to graph needs
make sure to use Editor help features when stuck!
CREATE ({})
creates an empty node
Breakdown:
CREATE: command to execute
parentheses define the node in question
curly braces define the properties for the node
CREATE (:Person {name: 'Fred', from: 'Bedrock', age: 30})
creates a single node with some properties
Breakdown:
Person: node label
name, from, age: property names
'Fred', 'Bedrock', 30: property values
labels are not mandatory
offer a means of categorization or "type"
often an entity/noun descriptor
akin to "table name" in RDBMS
used in queries as a filter
nodes can have multiple labels (colon-separated)
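A short sketch of multiple labels and label-based filtering. The Employee label is hypothetical, chosen only to show a second label on the same node.

```cypher
// One node carrying two labels, colon-separated
CREATE (:Person:Employee {name: 'Fred'});

// Filter by either label; labels() returns the full list of labels on the node
MATCH (e:Employee)
RETURN e.name, labels(e);
```

Depending on the client, the two statements may need to be run one at a time.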
Create multiple nodes at once
CREATE (fred:Person { name: 'Fred' }), (wilma:Person { name: 'Wilma' }), (barney:Person { name: 'Barney' }), (betty:Person { name: 'Betty' })
CREATE (ee:Person { name: 'Emil', from: 'Sweden' }), (js:Person { name: 'Johan', from: 'Sweden', learn: 'surfing' }), (ir:Person { name: 'Ian', from: 'England', title: 'author' }), (rvb:Person { name: 'Rik', from: 'Belgium', pet: 'Orval' }), (ally:Person { name: 'Allison', from: 'California', hobby: 'surfing' }), (ee)-[:KNOWS {since: 2001}]->(js),(ee)-[:KNOWS {rating: 5}]->(ir), (js)-[:KNOWS]->(ir),(js)-[:KNOWS]->(rvb), (ir)-[:KNOWS]->(js),(ir)-[:KNOWS]->(ally), (rvb)-[:KNOWS]->(ally)
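Once the graph above exists, the KNOWS relationships can be traversed with pattern matching. The following is a sketch against that sample data; the variable names (k, friend, fof) are arbitrary.

```cypher
// Who does Emil know directly, and what is recorded on each relationship?
// Properties missing on a given relationship come back as null.
MATCH (ee:Person {name: 'Emil'})-[k:KNOWS]->(friend:Person)
RETURN friend.name, k.since, k.rating;

// Friends of friends: follow two KNOWS hops, excluding Emil himself
MATCH (ee:Person {name: 'Emil'})-[:KNOWS]->(:Person)-[:KNOWS]->(fof:Person)
WHERE fof <> ee
RETURN DISTINCT fof.name;
```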
MATCH (fred:Person) WHERE fred.name = 'Fred' RETURN fred
Analysis:
MATCH: command to execute
fred: node variable binding
(fred:Person): node pattern, with label Person
WHERE: predicate clause
RETURN: which particular results to yield
returns all Person-labeled nodes with a name attribute of 'Fred'
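For simple equality checks, the predicate can also be written inline as a property map in the node pattern. This sketch should be equivalent to the WHERE form above, and also shows returning individual properties instead of the whole node.

```cypher
// Same filter, expressed inside the node pattern instead of a WHERE clause
MATCH (fred:Person {name: 'Fred'})
RETURN fred.name, fred.age
```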
MATCH (all) RETURN all
returns all nodes in the database
MATCH (br:Person) WHERE br.from = 'Bedrock' RETURN br
returns all Persons from Bedrock (the sample nodes store this in the from property)
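Queries can also shape their results. A minimal sketch, assuming the sample Person nodes created earlier (which carry a from property):

```cypher
// People from Sweden, names only, sorted alphabetically
MATCH (p:Person)
WHERE p.from = 'Sweden'
RETURN p.name
ORDER BY p.name;

// How many nodes are in the database in total?
MATCH (n)
RETURN count(n);
```

Returning a count or a LIMIT-ed subset is usually kinder to the Editor than returning every node in a large graph.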
use MATCH to identify nodes
use CREATE and a "relationship arrow" to create the relationship
Fred is married to Wilma (and vice versa)
MATCH (fred:Person), (wilma:Person) WHERE fred.name = 'Fred' AND wilma.name = 'Wilma' CREATE (fred)-[:MARRIED {since:1959}]->(wilma), (wilma)-[:MARRIED {since:1959}]->(fred)
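Reading the relationship back is a matter of matching the same pattern. A sketch, assuming the Fred and Wilma nodes and MARRIED relationships created above:

```cypher
// Follow the MARRIED arc from Fred and return data from both node and relationship
MATCH (fred:Person {name: 'Fred'})-[m:MARRIED]->(spouse:Person)
RETURN spouse.name, m.since
```

CREATE requires a direction on every relationship, but MATCH can ignore it by dropping the arrowhead, as in (fred)-[m:MARRIED]-(spouse). Whether to store one arc or two for a symmetric relationship is a modeling choice worth prototyping.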
This is obviously a very high-level view
further investigation is necessary!
prototypes are necessary!
allowing yourself time to fail with it is necessary!
Architect, Engineering Manager/Leader, "force multiplier"
http://www.newardassociates.com
http://blogs.newardassociates.com
Sr Distinguished Engineer, Capital One
Educative (http://educative.io) Author
Performance Management for Engineering Managers
Books
Developer Relations Activity Patterns (w/Woodruff, et al; Apress, forthcoming)
Professional F# 2.0 (w/Erickson, et al; Wrox, 2010)
Effective Enterprise Java (Addison-Wesley, 2004)
SSCLI Essentials (w/Stutz, et al; O'Reilly, 2003)
Server-Based Java Programming (Manning, 2000)