Datomic Pull

Pull is a declarative way to make hierarchical (and possibly nested) selections of information about entities. Pull applies a pattern to a collection of entities, building a map for each entity. Pull is available as a standalone API and as a find specification in query.

Patterns support forward and reverse attribute navigation, wildcarding, nesting, recursion, defaults, and limits on the results returned. Entities can be passed to pull by any kind of entity identifier: entity ids, idents, or lookup refs.

Examples

Example Data

The examples in this document use the mbrainz 1968-1973 sample set. Download and untar this file:

wget http://s3.amazonaws.com/mbrainz/datomic-mbrainz-1968-1973-backup-2014-10-15.tar -O mbrainz.tar
tar -xvf mbrainz.tar
bin/datomic restore-db file:///path/to/mbrainz-backup datomic:dev://localhost:4334/mbrainz-1968-1973

Example Code

You can follow the examples below in Java or Clojure code.

Example Entities

The examples below pull the following example entities:

Entity                 Lookup Ref                                                   Entity Id
ledZeppelin            [:artist/gid #uuid "678d88b2-87b0-403b-b63d-5da7465aecc3"]   17592186050305
mccartney              [:artist/gid #uuid "ba550d0e-adac-4864-b88b-407cab5e76af"]   17592186046385
darkSideOfTheMoon      [:release/gid #uuid "24824319-9bb8-3d1e-a2c5-b8b864dafd1b"]  17592186121276
dylanHarrisonSessions  [:release/gid #uuid "67bbc160-ac45-4caf-baae-a7e9f5180429"]  17592186063798
dylanHarrisonCD        N/A                                                          17592186063799
concertForBanglaDesh   [:release/gid #uuid "f3bdff34-9a85-4adc-a014-922eef9cdaa5"]  17592186072003
ghostRiders            N/A                                                          17592186063810

Feel free to explore your own musical interests.

API

The pull API takes three arguments: a database, a pattern, and an entity identifier. It returns a map of data about the entity, as specified by the pattern. In object-oriented languages, the database argument will be the method target. In Java:

db.pull("[*]", ledZeppelin);

In Clojure:

(pull db '[*] led-zeppelin)

The pullMany API is similar, except that it takes a collection of entity identifiers, and returns a collection of maps. For example, in Clojure:

(pull-many db '[*] [led-zeppelin jimi-hendrix janis-joplin])

Pattern Grammar

''  literal
""  string
[]  list or vector
{}  map {k1 v1 ...}
()  grouping
|   choice
+   one or more

pattern            = [attr-spec+]
attr-spec          = attr-name | wildcard | map-spec | attr-expr
attr-name          = an edn keyword that names an attr
wildcard           = "*" or '*'
map-spec           = { ((attr-name | limit-expr) (pattern | recursion-limit))+ }
attr-expr          = limit-expr | default-expr
limit-expr         = [("limit" | 'limit') attr-name (positive-number | nil)]
default-expr       = [("default" | 'default') attr-name any-value]
recursion-limit    = positive-number | '...'

Terminals such as "limit" can be strings, but where languages have a symbol type you should prefer the idiomatic symbolic type, e.g. (limit :friends 100) in Clojure instead of ("limit" "friends" 100).

The grammar types above are each described in more detail below.

The patterns are written in the Extensible Data Notation (edn), which is programming language neutral. In programs, you can create patterns programmatically out of your basic language data types, e.g. Java Strings, Lists, and Maps. Alternatively, you can pass the pattern argument as a serialized edn string.

The results below are also written with edn, and they use an ellipsis where large results have been elided for brevity.
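As a rough illustration of building a pattern from plain Java collections, the sketch below assembles the pattern [:track/name {:track/artists [:db/id :artist/name]}] from a List and a Map and renders it as an edn string. PatternEdn and toEdn are hypothetical helpers written for this example, not part of the Datomic API, and the sketch assumes that strings are already valid edn tokens such as ":artist/name".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the Datomic API: renders nested Java
// Lists and Maps as an edn pattern string.
final class PatternEdn {

    // Strings are emitted verbatim, so callers pass edn tokens such as
    // ":artist/name" or "*". Lists become vectors and Maps become edn maps.
    static String toEdn(Object node) {
        if (node instanceof List<?>) {
            StringJoiner items = new StringJoiner(" ", "[", "]");
            for (Object item : (List<?>) node) {
                items.add(toEdn(item));
            }
            return items.toString();
        }
        if (node instanceof Map<?, ?>) {
            StringJoiner entries = new StringJoiner(", ", "{", "}");
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                entries.add(toEdn(e.getKey()) + " " + toEdn(e.getValue()));
            }
            return entries.toString();
        }
        // Numbers (for example a recursion limit) fall through to here.
        return String.valueOf(node);
    }

    // Builds [:track/name {:track/artists [:db/id :artist/name]}].
    static List<Object> trackWithArtists() {
        Map<String, Object> artists = new LinkedHashMap<>();
        artists.put(":track/artists", List.of(":db/id", ":artist/name"));
        List<Object> pattern = new ArrayList<>();
        pattern.add(":track/name");
        pattern.add(artists);
        return pattern;
    }

    public static void main(String[] args) {
        // prints [:track/name {:track/artists [:db/id :artist/name]}]
        System.out.println(toEdn(trackWithArtists()));
    }
}
```

The resulting string has the same shape as the "[*]" argument in the Java call shown earlier, so it could be passed to db.pull in its place.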

Pattern

A pattern is a list of Attribute Specifications.

Attribute Specifications

An attribute spec specifies an attribute to be returned, and (optionally) how that attribute should be returned. Attribute specs can be attribute names, wildcards, map specs, or attribute expressions. Each is described below.

Attribute Names

The simplest attribute spec is just an attribute name. For example, the following pattern uses two attribute names to return an :artist/name and :artist/gid, pulling on ledZeppelin:

;; pattern
[:artist/name :artist/gid]

;; result
{:artist/gid #uuid "678d88b2-87b0-403b-b63d-5da7465aecc3", :artist/name "Led Zeppelin"}

Reverse Lookup

An underscore prefix (_) on the local name component of an attribute name causes the attribute to be navigated in reverse.

For example, you can find British artists by pulling :country/GB and navigating :artist/country in reverse:

;; pattern
[:artist/_country]

;; result
{:artist/_country [{:db/id 17592186045751} {:db/id 17592186045755} ...]}

Component Defaults

If an attribute is a reference type, pull will return a map for the referenced value. If the attribute is a component attribute, the map will contain all attributes of the related entity as well. For example, :medium/tracks is a component attribute, so pulling :release/media will also pull related tracks. The example below pulls darkSideOfTheMoon.

;; pattern
[:release/media]

;; result
  {:release/media
   [{:db/id 17592186121277,
     :medium/format {:db/id 17592186045741},
     :medium/position 1,
     :medium/trackCount 10,
     :medium/tracks
     [{:db/id 17592186121278,
       :track/duration 68346,
       :track/name "Speak to Me",
       :track/position 1,
       :track/artists [{:db/id 17592186046909}]}
      {:db/id 17592186121279,
       :track/duration 168720,
       :track/name "Breathe",
       :track/position 2,
       :track/artists [{:db/id 17592186046909}]}
      {:db/id 17592186121280,
       :track/duration 230600,
       :track/name "On the Run",
       :track/position 3,
       :track/artists [{:db/id 17592186046909}]}
      ...]}]}

Non-component Defaults

If a reference attribute is not a component attribute, the default is to pull only the :db/id of each referenced entity. For example, pulling :artist/_country of :country/GB returns only the entity ids for the artists from Great Britain:

;; pattern
[:artist/_country]

;; result
{:artist/_country [{:db/id 17592186045751} {:db/id 17592186045755} ...]}

Multiple Results

If an attribute spec might lead to more than one value, the pull result will be a list of the values found. These cases include:

  • Forward cardinality-many references
  • Reverse references for non-component attributes

The :artist/_country examples above demonstrate this.

Since component attributes should always point back to a single owner, reverse component references will return a single value, not a list. For example, a medium is a component of a release, so navigating backwards via :release/_media returns a single map representing the release. The example below pulls dylanHarrisonCD:

;; pattern
[:release/_media] 

;; result
{:release/_media {:db/id 17592186063798}}

Map Specifications

You can explicitly specify the handling of referenced entities by using a map instead of just an attribute name. The keys in the map are keywords that name reference attributes, and the values are subpatterns for those references. In the example below, the :track/name attribute spec is a keyword, and is handled generically as described above. The :track/artists attribute appears in a map spec, causing the :db/id and :artist/name to be sub-pulled for each artist on the track ghostRiders.

;; pattern
[:track/name {:track/artists [:db/id :artist/name]}]

;; result
{:track/artists [{:db/id 17592186048186, :artist/name "Bob Dylan"}
                 {:db/id 17592186049854, :artist/name "George Harrison"}],
 :track/name "Ghost Riders in the Sky"}

Nesting

Map specs can nest arbitrarily. The pattern below pulls concertForBanglaDesh's media's tracks' titles and artists' names:

;; pattern
[{:release/media
  [{:medium/tracks
    [:track/name {:track/artists [:artist/name]}]}]}]

;; result
{:release/media
 [{:medium/tracks
   [{:track/artists
     [{:artist/name "Ravi Shankar"} {:artist/name "George Harrison"}],
     :track/name "George Harrison / Ravi Shankar Introduction"}
    {:track/artists [{:artist/name "Ravi Shankar"}],
     :track/name "Bangla Dhun"}]}
  {:medium/tracks
   [{:track/artists [{:artist/name "George Harrison"}],
     :track/name "Wah-Wah"}
    {:track/artists [{:artist/name "George Harrison"}],
     :track/name "My Sweet Lord"}
    {:track/artists [{:artist/name "George Harrison"}],
     :track/name "Awaiting on You All"}
    {:track/artists [{:artist/name "Billy Preston"}],
     :track/name "That's the Way God Planned It"}]}
  ...]}

Wildcard Specifications

The wildcard specification * pulls all attributes of an entity, and recursively pulls any component attributes. The example below pulls concertForBanglaDesh:

;; pattern
[*]

;; result
{:release/name "The Concert for Bangla Desh",
 :release/artists [{:db/id 17592186049854}],
 :release/country {:db/id 17592186045504},
 :release/gid #uuid "f3bdff34-9a85-4adc-a014-922eef9cdaa5",
 :release/day 20,
 :release/status "Official",
 :release/month 12,
 :release/artistCredit "George Harrison",
 :db/id 17592186072003,
 :release/year 1971,
 :release/media
 [{:db/id 17592186072004,
   :medium/format {:db/id 17592186045741},
   :medium/position 1,
   :medium/trackCount 2,
   :medium/tracks
   [{:db/id 17592186072005,
     :track/duration 376000,
     :track/name "George Harrison / Ravi Shankar Introduction",
     :track/position 1,
     :track/artists [{:db/id 17592186048829} {:db/id 17592186049854}]}
    {:db/id 17592186072006,
     :track/duration 979000,
     :track/name "Bangla Dhun",
     :track/position 2,
     :track/artists [{:db/id 17592186048829}]}]}
  ...
  ]}

A map specification can be used in conjunction with the wildcard to provide subpatterns for specific attributes. In the example below, the wildcard pulls all attributes of the ghostRiders track, and an explicit map overrides the handling of :track/artists to pull :artist/name.

;; pattern
["*" {:track/artists [:artist/name]}]

;; result
{:db/id 17592186063810,
 :track/duration 218506,
 :track/name "Ghost Riders in the Sky",
 :track/position 11,
 :track/artists
 [{:artist/name "Bob Dylan"} {:artist/name "George Harrison"}]}

Attribute Expressions

Attribute specifications can be wrapped in expressions to control the attribute's default or limit. An attribute expression is a list containing a symbol, the attribute name, and expression-specific additional arguments.

Default Expressions

A default expression specifies a value to use if an attribute is not present for an entity. The following pattern reports a zero :artist/endYear for Paul McCartney (mccartney), who is still active:

;; pattern
[:artist/name (default :artist/endYear 0)] 

;; result
{:artist/endYear 0, :artist/name "Paul McCartney"}

The default need not be of the same type as the attribute's values:

;; pattern
[:artist/name (default :artist/endYear "N/A")]

;; result
{:artist/endYear "N/A", :artist/name "Paul McCartney"}

Missing Attributes

In the absence of a default, attribute specifications that do not match an entity are omitted from that entity's result map:

;; pattern
[:artist/name :died-in-1966?]

;; result
{:artist/name "Paul McCartney"}
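The behavior of default expressions and missing attributes can be modeled with a few lines of plain Java. The sketch below is a toy model only: it does not call Datomic, the DefaultsSketch and Spec names are made up for this example, and the entity is a hand-written stand-in for mccartney that carries nothing but a name.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; it does not call Datomic.
final class DefaultsSketch {

    // One attribute spec: an attribute name plus an optional default value.
    static final class Spec {
        final String attr;
        final boolean hasDefault;
        final Object defaultValue;

        private Spec(String attr, boolean hasDefault, Object defaultValue) {
            this.attr = attr;
            this.hasDefault = hasDefault;
            this.defaultValue = defaultValue;
        }

        // Models a bare attribute name such as :artist/name.
        static Spec attr(String attr) {
            return new Spec(attr, false, null);
        }

        // Models (default attr value).
        static Spec withDefault(String attr, Object value) {
            return new Spec(attr, true, value);
        }
    }

    // Copies each requested attribute the entity has. A missing attribute
    // gets its default if the spec supplies one, and is omitted otherwise.
    // The nil result for a completely empty match is not modeled here.
    static Map<String, Object> pull(Map<String, Object> entity, List<Spec> pattern) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Spec spec : pattern) {
            if (entity.containsKey(spec.attr)) {
                result.put(spec.attr, entity.get(spec.attr));
            } else if (spec.hasDefault) {
                result.put(spec.attr, spec.defaultValue);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> mccartney = Map.of(":artist/name", "Paul McCartney");

        // With a default, the missing :artist/endYear is reported as 0.
        System.out.println(pull(mccartney,
                List.of(Spec.attr(":artist/name"), Spec.withDefault(":artist/endYear", 0))));

        // Without a default, the missing attribute is left out of the map.
        System.out.println(pull(mccartney,
                List.of(Spec.attr(":artist/name"), Spec.attr(":died-in-1966?"))));
    }
}
```

Because the default is stored as a plain Object, the model also reflects that a default need not match the attribute's value type.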

Limit Expressions

A limit expression controls how many values will be returned for a cardinality-many attribute. To return only 10 of ledZeppelin's tracks:

;; pattern
[:artist/name (limit :track/_artists 10)]

;; result
{:artist/name "Led Zeppelin",
 :track/_artists
 [{:db/id 17592186057344}
  {:db/id 17592186057345}
  {:db/id 17592186057346}
  {:db/id 17592186057347}
  {:db/id 17592186057348}
  {:db/id 17592186057349}
  {:db/id 17592186057350}
  {:db/id 17592186057351}
  {:db/id 17592186057352}
  {:db/id 17592186057355}]}

Limit expressions can be the keys in map specifications, so you can get a limited set of ledZeppelin's track names with:

;; pattern
[{(limit :track/_artists 10) [:track/name]}]

;; result
{:track/_artists
 [{:track/name "Whole Lotta Love"}
  {:track/name "What Is and What Should Never Be"}
  {:track/name "The Lemon Song"}
  {:track/name "Thank You"}
  {:track/name "Heartbreaker"}
  {:track/name "Living Loving Maid (She's Just a Woman)"}
  {:track/name "Ramble On"}
  {:track/name "Moby Dick"}
  {:track/name "Bring It on Home"}
  {:track/name "Whole Lotta Love"}]}

A nil limit causes all values to be returned, and should be used with caution:

;; pattern
[:artist/name (limit :track/_artists nil)]

;; result
{:artist/name "Led Zeppelin",
 :track/_artists
 [{:db/id 17592186057344}
  {:db/id 17592186057345}
  {:db/id 17592186057346}
  {:db/id 17592186057347}
  {:db/id 17592186057348}
  {:db/id 17592186057349}
  {:db/id 17592186057350}
  {:db/id 17592186057351}
  {:db/id 17592186057352}
  {:db/id 17592186057355}
  {:db/id 17592186057356}
  {:db/id 17592186057357}
  {:db/id 17592186057358}
  {:db/id 17592186057359}
  {:db/id 17592186057360}
  {:db/id 17592186057361}
  {:db/id 17592186057362}
  {:db/id 17592186057363}
  {:db/id 17592186057366}
  {:db/id 17592186057367}
  ...]} ;; lots more

In the absence of an explicit limit, Datomic will return the first 1000 values for a cardinality-many attribute.
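The three limit cases (an explicit positive limit, a nil limit, and no limit expression at all) can be summarized in a small toy model. The sketch below does not call Datomic; LimitSketch and its method names are hypothetical, and the only number taken from the text is the default of 1000.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for illustration only; it does not call Datomic.
final class LimitSketch {

    // Default cap stated in the text for cardinality-many attributes.
    static final int DEFAULT_LIMIT = 1000;

    // Models (limit attr n): a positive n keeps at most the first n values.
    static <T> List<T> limited(List<T> values, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        return values.subList(0, Math.min(n, values.size()));
    }

    // Models (limit attr nil): every value is returned.
    static <T> List<T> unlimited(List<T> values) {
        return values;
    }

    // Models an attribute spec with no limit expression.
    static <T> List<T> defaultLimited(List<T> values) {
        return limited(values, DEFAULT_LIMIT);
    }

    public static void main(String[] args) {
        // 1500 made-up values standing in for a large cardinality-many attribute.
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 1500; i++) {
            values.add(i);
        }
        System.out.println(limited(values, 10).size());     // prints 10
        System.out.println(defaultLimited(values).size());  // prints 1000
        System.out.println(unlimited(values).size());       // prints 1500
    }
}
```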

Recursive Specifications

A map specification can govern recursion. If the value in a map specification is a positive number N rather than a subpattern, then the pattern containing that specification will be applied recursively up to N times. For example, the following (non-mbrainz) pattern will find the first and last names of friends-of-friends up to six degrees of separation from the original entity.

[:person/firstName :person/lastName {:person/friends 6}]

The ellipsis symbol (...) will follow recursive references to arbitrary depth, and should be used with caution. The following specification will find all reachable friends, which might be most of the friends in the entire database.

[:person/firstName :person/lastName {:person/friends ...}]

If a recursive pull encounters an entity that it has already seen, it will not apply the pattern again, instead returning only the :db/id of that entity. Recursive pull is therefore safe in the presence of cycles.
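To make the depth limit and the cycle handling concrete, here is a toy traversal over an in-memory graph. It is a sketch under stated assumptions, not Datomic's implementation: the RecursionSketch and Person names, the ids, and the three-person graph are all made up; a single shared "seen" set is assumed; and omitting :person/friends once the depth runs out is a choice of the sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model for illustration only; the graph, ids, and names are made up.
final class RecursionSketch {

    // Stands in for the unbounded '...' recursion limit.
    static final int UNBOUNDED = -1;

    static final class Person {
        final long id;
        final String firstName;
        final List<Long> friends;

        Person(long id, String firstName, List<Long> friends) {
            this.id = id;
            this.firstName = firstName;
            this.friends = friends;
        }
    }

    // Mimics [:person/firstName {:person/friends depth}] over the graph.
    static Map<String, Object> pull(Map<Long, Person> db, long id, int depth) {
        return pull(db, id, depth, new HashSet<>());
    }

    private static Map<String, Object> pull(Map<Long, Person> db, long id,
                                            int depth, Set<Long> seen) {
        Map<String, Object> result = new LinkedHashMap<>();
        if (!seen.add(id)) {
            // Already visited: return only the id, which is what breaks cycles.
            result.put(":db/id", id);
            return result;
        }
        Person p = db.get(id);
        result.put(":person/firstName", p.firstName);
        if (depth != 0 && !p.friends.isEmpty()) {
            // A bounded depth shrinks by one per level; UNBOUNDED stays as is.
            int next = depth < 0 ? depth : depth - 1;
            List<Map<String, Object>> friends = new ArrayList<>();
            for (long friendId : p.friends) {
                friends.add(pull(db, friendId, next, seen));
            }
            result.put(":person/friends", friends);
        }
        return result;
    }

    public static void main(String[] args) {
        // Ann -> Bob -> Cy -> Ann forms a cycle.
        Map<Long, Person> db = Map.of(
                1L, new Person(1L, "Ann", List.of(2L)),
                2L, new Person(2L, "Bob", List.of(3L)),
                3L, new Person(3L, "Cy", List.of(1L)));

        System.out.println(pull(db, 1L, 1));          // one level of friends
        System.out.println(pull(db, 1L, UNBOUNDED));  // stops at the cycle
    }
}
```

In the unbounded call the traversal ends when it reaches Ann a second time, and that innermost entry carries only :db/id.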

Empty Results

If there is no match between a pattern and an entity, then pull will return nil (not an empty map):

;; pattern
[:penguins]

;; entity
led-zeppelin

;; result
nil

Non-matching results will be removed entirely from collections. Even though ghost-riders has artists, none of those artists have :penguins:

;; pattern
[{:track/artists [:penguins]}]

;; entity
ghost-riders

;; result
nil
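The pruning rule above can be mimicked in plain Java, with null standing in for nil. This is a toy model only: it does not call Datomic, EmptyResultSketch and its methods are hypothetical, and the two artist maps are hand-written stand-ins for the artists on ghostRiders.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; it does not call Datomic.
final class EmptyResultSketch {

    // Selects the listed attributes; returns null when none is present.
    static Map<String, Object> select(Map<String, Object> entity, List<String> attrs) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (String attr : attrs) {
            if (entity.containsKey(attr)) {
                result.put(attr, entity.get(attr));
            }
        }
        return result.isEmpty() ? null : result;
    }

    // Applies a subpattern to each referenced entity, drops the ones that do
    // not match, and returns null if nothing is left.
    static Map<String, Object> pullRefs(String refAttr,
                                        List<Map<String, Object>> referenced,
                                        List<String> subAttrs) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> child : referenced) {
            Map<String, Object> childResult = select(child, subAttrs);
            if (childResult != null) {
                kept.add(childResult);
            }
        }
        if (kept.isEmpty()) {
            return null;
        }
        Map<String, Object> result = new LinkedHashMap<>();
        result.put(refAttr, kept);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> dylan = Map.of(":artist/name", "Bob Dylan");
        Map<String, Object> harrison = Map.of(":artist/name", "George Harrison");
        List<Map<String, Object>> artists = List.of(dylan, harrison);

        // No artist has :penguins, so the whole result collapses to null.
        System.out.println(pullRefs(":track/artists", artists, List.of(":penguins")));

        // A matching subpattern keeps both artists.
        System.out.println(pullRefs(":track/artists", artists, List.of(":artist/name")));
    }
}
```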

Pull API vs. Entity API

The Pull API has two important advantages over the Entity API:

Pull uses a declarative, data-driven spec, whereas Entity encourages building results via code. Data-driven specs are easier to build, compose, transmit and store. Pull patterns are smaller than entity code that does the same job, and can be easier to understand and maintain.

Pull API results match standard collection interfaces (e.g. Java maps) in programming languages, whereas Entity results do not. This eliminates the need for an additional allocation/transformation step per entity.