Query Reference

This topic documents the data format for Datomic datalog queries and rules.

Notes on the Examples

  • The ellipsis ... is used in query results to show that a large result set has been elided for brevity.
  • Entity ids vary from one database to another, so do not expect to see the same entity ids as shown here.

Query Grammar

Syntax Used In Grammar

''  literal
""  string
[]  list or vector
{}  map {k1 v1 ...}
()  grouping
|   choice
?   zero or one
+   one or more

Query Arg Grammar

query                      = [find-spec with-clause? inputs? where-clauses?]
find-spec                  = ':find' find-rel
find-rel                   = find-elem+
find-elem                  = (variable | pull-expr | aggregate)
pull-expr                  = ['pull' variable pattern]
pattern                    = (input-name | pattern-data-literal)
aggregate                  = [aggregate-fn-name fn-arg+]
fn-arg                     = (variable | constant | src-var)
with-clause                = ':with' variable+
where-clauses              = ':where' clause+
inputs                     = ':in' (src-var | variable | pattern-var | rules-var)+
src-var                    = symbol starting with "$"
variable                   = symbol starting with "?"
rules-var                  = the symbol "%"
plain-symbol               = symbol that does not begin with "$" or "?"
pattern-var                = plain-symbol
and-clause                 = [ 'and' clause+ ]
expression-clause          = (data-pattern | pred-expr | fn-expr | rule-expr)
rule-expr                  = [ src-var? rule-name (variable | constant | '_')+]
not-clause                 = [ src-var? 'not' clause+ ]
not-join-clause            = [ src-var? 'not-join' [variable+] clause+ ]
or-clause                  = [ src-var? 'or' (clause | and-clause)+]
or-join-clause             = [ src-var? 'or-join' [variable+] (clause | and-clause)+ ]
rule-vars                  = [variable+ | ([variable+] variable*)]
clause                     = (not-clause | not-join-clause | or-clause | or-join-clause | expression-clause)
data-pattern               = [ src-var? (variable | constant | '_')+ ]
constant                   = any non-variable data literal
pred-expr                  = [ [pred fn-arg+] ]
fn-expr                    = [ [fn fn-arg+] binding]
binding                    = (bind-scalar | bind-tuple | bind-coll | bind-rel)
bind-scalar                = variable
bind-tuple                 = [ (variable | '_')+]
bind-coll                  = [variable '...']
bind-rel                   = [ [(variable | '_')+] ]

See the pull pattern grammar for the description of the pattern-data-literal rule.

Query Rule Grammar

Note that the rule grammar reuses some terms from the query grammar above.

rule                       = [ [rule-head clause+]+ ]
rule-head                  = [rule-name rule-vars]
rule-name                  = plain-symbol

Queries

query                      = [find-spec with-clause? inputs? where-clauses?]

A query consists of:

  • a find-spec that specifies variables and aggregates to return
  • an optional with-clause to control how duplicate find values are handled
  • an optional inputs clause that names the databases, data, and rules available to the query engine
  • optional where-clauses that constrain and transform data

At least one of inputs or where-clauses must be specified.

Query Example

This query limits datoms to :artist/name "The Beatles", and returns the entity ids for such results:

;; query
[:find ?e
 :where [?e :artist/name "The Beatles"]]

;; args
[db]

;; results
[[26757714973567138]]

Find Specs

find-spec                  = ':find' find-rel
find-rel                   = find-elem+
find-elem                  = (variable | pull-expr | aggregate)

A find-spec is the literal :find followed by one or more find-elems, which can be

  • a variable, whose bound value is returned directly
  • a pull-expr that hierarchically selects data about an entity variable
  • an aggregate that summarizes all values of a variable

The order of find-elems determines the order variables appear in a result tuple.

Variables

variable                   = symbol starting with "?"

A variable is a symbol that begins with ?. In a find-spec, variables control which values are returned, and in what order those values appear in the result tuple.

Find Variables Example

The query below specifies that the result tuples should contain the track name and duration. Note that the ?track and ?e variables are used in the query but are not returned.

;; query
[:find ?name ?duration
 :where [?e :artist/name "The Beatles"]
        [?track :track/artists ?e]
        [?track :track/name ?name]
        [?track :track/duration ?duration]]

;; args
[db]

;; results
[["Here Comes the Sun" 186000]
 ["Come Together" 257000]
 ["Hey Jude" 428000]
 ...]

Pull Expressions

pull-expr                  = ['pull' variable pattern]
pattern                    = (input-name | pattern-data-literal)

A pull expression returns information about a variable as specified by a pattern. Pull expressions are fully described in the Pull reference.

Finding Pull Expression Example

Rather than returning just a variable, this query uses a pull expression to return specific details about The Beatles:

;; query
[:find (pull ?e [:artist/startYear :artist/endYear])
 :where [?e :artist/name "The Beatles"]]

;; args
[db]

;; results
[[#:artist{:startYear 1957, :endYear 1970}]]

Aggregates

aggregate                  = [aggregate-fn-name fn-arg+]
fn-arg                     = (variable | constant | src-var)

An aggregate function appears in the find clause and transforms a result. Aggregate functions can take variables, constants, or src-vars as arguments.

Aggregates appear as lists in a find-spec. Query variables not in aggregate expressions will group the results and appear intact in the result.

Example Aggregate

This query binds ?a ?b ?c ?d, then groups by ?a and ?c, and produces a result for each aggregate expression for each group, yielding 5-tuples.

[:find ?a (min ?b) (max ?b) ?c (sample 12 ?d)
 :where ...]
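
As a rough mental model of this grouping (a sketch in plain Clojure, not Datomic's implementation), the rows below stand in for hypothetical [?a ?b] bindings produced by the :where clauses. The names rows and min-max-by-group are illustrative only and are not part of any Datomic API.

```clojure
;; Hypothetical rows, as if :where had bound [?a ?b].
(def rows [[:x 3] [:x 7] [:y 5] [:y 1] [:y 9]])

;; Model of [:find ?a (min ?b) (max ?b)]:
;; ?a is not inside an aggregate, so it forms the groups;
;; each aggregate is then computed once per group.
(defn min-max-by-group [rows]
  (set (for [[a group] (group-by first rows)
             :let [bs (map second group)]]
         [a (apply min bs) (apply max bs)])))

(min-max-by-group rows)
;; => #{[:x 3 7] [:y 1 9]}
```

Each result tuple keeps the grouping value intact and carries one value per aggregate expression, in find-spec order.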

Aggregates Returning a Single Value

The aggregation functions that return a single value are listed below, and all behave as their names suggest.

  • min and max
    The following query finds the smallest and largest track lengths:
    ;; query 
    [:find (min ?dur) (max ?dur)
     :where [_ :track/duration ?dur]]
    
    ;; inputs
    db
    
    ;; result
    [[3000 3894000]]
    

    The min and max aggregation functions support all database types (via comparators), not just numbers.

  • sum
    The following query uses sum to find the total number of tracks on all media in the database.
    ;; query
    [:find (sum ?count) 
     :with ?medium
     :where [?medium :medium/trackCount ?count]]
    
    ;; inputs
    db
    
    ;; result
    [[100759]]
    
  • count and count-distinct
    More than one artist can have the same name. The following query uses count to report the total number of artist names, and count-distinct to report the total number of unique artist names.
    ;; query
    [:find (count ?name) (count-distinct ?name)
     :with ?artist
     :where [?artist :artist/name ?name]]
    
    ;; inputs
    db
    
    ;; result
    [[4601 4588]]
    

    Note the use of a with-clause so that equal names do not coalesce.

  • Statistics: median, avg, variance, and stddev
    Are musicians becoming more verbose when naming songs? The following query reports the median, avg, and stddev of song title lengths (in characters), and includes year in the find set to break out the results by year.
    ;; query
    [:find ?year (median ?namelen) (avg ?namelen) (stddev ?namelen)
     :with ?track
     :where [?track :track/name ?name]
            [(count ?name) ?namelen]
            [?medium :medium/tracks ?track]
            [?release :release/media ?medium]
            [?release :release/year ?year]]
    
    ;; inputs 
    db
    
    ;; result
    [[1968 16 18.92181098534824 12.898760656290333] 
     [1969 16 18.147895557287608 11.263945894977244] 
     [1970 15 18.007481296758105 12.076103750401026] 
     [1971 15 18.203682039283294 13.715552693168124] 
     [1972 15 17.907170949841063 11.712941060399375] 
     [1973 16 18.19300100438759 12.656827911058622]]
    

Aggregates Returning Collections

Where n is specified, fewer than n items may be returned if not enough items are available.

  • distinct
    The distinct aggregate returns the set of distinct values in the collection.
    ;; query
    [:find (distinct ?sortName)
     :with ?artist
     :where [?artist :artist/name "Fire"]
            [?artist :artist/sortName ?sortName]]
    
    ;; inputs
    db
    
    ;; result
    [[#{"Fire"}]]
    
  • min n and max n
    The min n and max n aggregates return up to n least/greatest items. The following query returns the five shortest and five longest track lengths in the database.
    ;; query
    [:find (min 5 ?millis) (max 5 ?millis)
     :where [?track :track/duration ?millis]]
    
    ;; inputs 
    db
    
    ;; result
    [[[3000 4000 5000 6000 7000] 
      [3894000 3407000 2928000 2802000 2775000]]]
    
  • rand n and sample n
    The rand n aggregate selects exactly n items with potential for duplicates, and the sample n aggregate returns up to n distinct items.

    The following query returns two random and two sampled artist names.

    ;; query
    [:find (rand 2 ?name) (sample 2 ?name)
     :where [_ :artist/name ?name]]
    
    ;; inputs
    db
    
    ;; result
    [[("Four Tops" "Ethel McCoy") 
     ["Gábor Szabó" "Zapata"]]]
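
The difference between rand n and sample n can be modeled in plain Clojure. This is only a sketch of the cardinality and duplication behavior described above; rand-n, sample-n, and artist-names are hypothetical names, and nothing here reflects how Datomic actually selects items.

```clojure
;; Model of (rand n ?x): exactly n picks, with replacement,
;; so the same item can appear more than once.
(defn rand-n [n coll]
  (let [v (vec coll)]
    (vec (repeatedly n #(rand-nth v)))))

;; Model of (sample n ?x): up to n distinct items.
(defn sample-n [n coll]
  (vec (take n (shuffle (vec (distinct coll))))))

;; Hypothetical values with one duplicate.
(def artist-names ["Queen" "Zapata" "Queen"])

(count (rand-n 5 artist-names))   ;; => 5, although only 2 distinct names exist
(count (sample-n 5 artist-names)) ;; => 2, fewer than n because items ran out
```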
    

Inputs

inputs                     = ':in' (src-var | variable | pattern-var | rules-var)+

The inputs clause names and orders the inputs to a query. Inputs can be

  • a database name, i.e. a symbol starting with $
  • a variable, i.e. a symbol starting with ?
  • a pattern var, i.e. a plain symbol
  • the rules var, i.e. the symbol %

A query has as many inputs as it has :args values, and the inputs bind the :args values for use inside the query.

Inputs Example

The query below takes the artist name as an input, so that this parameterized query can be re-used with different artist names.

Inside the query, $ is bound to db, and ?name is bound to "The Beatles".

;; query
'[:find (pull ?e [:artist/startYear :artist/endYear])
  :in $ ?name
  :where [?e :artist/name ?name]]

;; args
[db "The Beatles"]

;; results
[[#:artist{:startYear 1957, :endYear 1970}]]
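
The positional pairing of inputs and args can be sketched in plain Clojure. The helper in-names is hypothetical (it is not part of Datomic), and :db-value is a placeholder standing in for a real database value.

```clojure
;; The query from the example above, as plain data.
(def query
  '[:find (pull ?e [:artist/startYear :artist/endYear])
    :in $ ?name
    :where [?e :artist/name ?name]])

;; Hypothetical helper: the input names listed after :in, in order.
(defn in-names [query]
  (->> query
       (drop-while #(not= :in %))
       rest
       (take-while (complement keyword?))))

;; :db-value is a placeholder for a real database value.
(def args [:db-value "The Beatles"])

;; Inputs and args pair up by position.
(zipmap (in-names query) args)
;; => {$ :db-value, ?name "The Beatles"}
```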

Default Inputs

Most queries operate against a single database. So as a convenience, the inputs clause can be elided, and will default to a single database whose name is the dollar sign $.

For example, the following three queries name the database in equivalent ways. Note that the third query also drops the ?age input, leaving ?age unbound:

;; use $data to name the database
[:find ?e 
 :in $data ?age 
 :where [$data ?e :age ?age]]

;; use the shorter name $, which can be omitted from where clauses
[:find ?e 
 :in $ ?age 
 :where [?e :age ?age]]

;; with only $ as input, the :in clause can be dropped
[:find ?e 
 :where [?e :age ?age]]

Pattern Inputs

An input can be a pattern var, specifying a pattern to be used in pull expressions in the find clause.

The query below binds pattern to the artist's start year and end year.

;; query
'[:find (pull ?e pattern)
  :in $ ?name pattern
  :where [?e :artist/name ?name]]

;; args
[db "The Beatles" [:artist/startYear :artist/endYear]]

;; results
[[#:artist{:startYear 1957, :endYear 1970}]]

Binding Forms

A binding form tells how to map data onto variables. A variable name like ?artist-name is the simplest kind of binding, assigning its value directly to the variable. Other forms support destructuring the data into a tuple, a collection, or a relation:

Binding Form     Binds
?a               scalar
[?a ?b]          tuple
[?a ...]         collection
[[?a ?b]]        relation

Tuple Binding

bind-tuple                 = [ (variable | '_')+]

A tuple binding binds a set of variables to a single value each, passed in as a collection. The query below binds both artist name and release name to find the entity ids for releases of John Lennon's Mind Games:

;; query
[:find ?release
 :in $ [?artist-name ?release-name]
 :where
 [?artist :artist/name ?artist-name]
 [?release :release/artists ?artist]
 [?release :release/name ?release-name]]

;; args
[db ["John Lennon" "Mind Games"]]

;; result
#{[17592186157686]
  [17592186157672]
  [17592186157690]
  [17592186157658]}

Collection Binding

bind-coll                  = [variable '...']

A collection binding binds a single variable to multiple values passed in as a collection. This can be used to ask "or" questions like "What releases are associated with either Paul McCartney or George Harrison?"

;; query
[:find ?release-name
 :in $ [?artist-name ...]
 :where [?artist :artist/name ?artist-name]
        [?release :release/artists ?artist]
        [?release :release/name ?release-name]]

;; args
[db ["Paul McCartney" "George Harrison"]]

;; result
#{["My Sweet Lord"]
  ["Electronic Sound"]
  ["Give Me Love (Give Me Peace on Earth)"]
  ["All Things Must Pass"]
  ...}

Relation Binding

bind-rel                   = [ [(variable | '_')+] ]

A relation binding is fully general, binding multiple variables positionally to a relation (collection of tuples) passed in. This can be used to ask "or" questions involving multiple variables. For example, what releases are associated with either John Lennon's Mind Games or Paul McCartney's Ram?

;; query
[:find ?release
 :in $ [[?artist-name ?release-name]]
 :where [?artist :artist/name ?artist-name]
        [?release :release/artists ?artist]
        [?release :release/name ?release-name]]

;; args
[db [["John Lennon" "Mind Games"]
      ["Paul McCartney" "Ram"]]]

;; result
#{[17592186157686]
  [17592186157672]
  [17592186157690]
  [17592186157658]
  [17592186063566]}
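
One way to picture the four binding forms is by the rows of variable bindings each one yields. The functions below are a plain Clojure model with hypothetical names (bind-scalar, bind-tuple, bind-coll, bind-rel); they are not Datomic functions.

```clojure
;; Each function returns rows, where a row maps variables to values.
(defn bind-scalar [v x]         [{v x}])                   ; ?a
(defn bind-tuple  [vars xs]     [(zipmap vars xs)])        ; [?a ?b]
(defn bind-coll   [v xs]        (mapv (fn [x] {v x}) xs))  ; [?a ...]
(defn bind-rel    [vars tuples] (mapv #(zipmap vars %) tuples)) ; [[?a ?b]]

(bind-tuple '[?artist-name ?release-name] ["John Lennon" "Mind Games"])
;; => [{?artist-name "John Lennon", ?release-name "Mind Games"}]

(bind-coll '?artist-name ["Paul McCartney" "George Harrison"])
;; => [{?artist-name "Paul McCartney"} {?artist-name "George Harrison"}]

(count (bind-rel '[?artist-name ?release-name]
                 [["John Lennon" "Mind Games"] ["Paul McCartney" "Ram"]]))
;; => 2
```

Scalar and tuple bindings yield a single row, while collection and relation bindings yield one row per element, which is why the latter two can express "or" questions.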

Where Clauses

where-clauses              = ':where' clause+
clause                     = (not-clause | not-join-clause | or-clause | or-join-clause | expression-clause)
expression-clause          = (data-pattern | pred-expr | fn-expr | rule-expr)

A where clause limits the results returned. The most common kind of where clause is a data pattern that is matched against datoms in the database, but there are many other kinds of clauses to support negation, disjunction, predicates, and functions.

Data Patterns

data-pattern               = [ src-var? (variable | constant | '_')+ ]

A data pattern is a tuple that begins with an optional src-var which binds to a relation. The src-var is followed by one or more elements that match the tuples of that relation in order. The relation is almost always a Datomic database, so the components are E, A, V, Tx, and Op. The elements of a data pattern can be

  • variables, which unify and bind to values
  • constants, which limit results to tuples that match the constant
  • the blank _ which matches anything

The example below has a single data pattern which operates as follows:

  • $mbrainz binds to the db argument
  • the constant :artist/name limits results to datoms with that value in their Attribute (A) position
  • the constant "The Beatles" limits results to datoms with that value in their Value (V) position
  • the variables ?e, ?tx, and ?op bind to those positions in the matching datoms, if any

;; query
[:find ?e ?tx ?op
 :in $mbrainz
 :where [$mbrainz ?e :artist/name "The Beatles" ?tx ?op]]

;; args
[db]

;; results
[[26757714973567138 13194139533421 true]]

Blanks

Sometimes you don't care about certain elements of the tuples in a query, but you must put something in the clause in order to get to the positions that you do care about. The underscore symbol (_) is a blank placeholder, and matches anything without binding or unifying.

For example, if you wanted a random artist name, you would need a data pattern that talks about A and V, but you would not care about the E component which precedes them. The following query uses the blank in the E position:

;; query
[:find (sample 1 ?name)
 :where [_ :artist/name ?name]]

;; args
[db]

;; results
[[["Aerosmith"]]]

Do not use a dummy variable instead of the blank. This will make the query engine do extra work by tracking binding and unification for a variable that you never intend to use. It will also make human readers do extra work, puzzling out that the dummy variable is intentionally not used.

Implicit Blanks

In data patterns, you should elide any trailing components you don't care about, rather than explicitly padding with blanks. The previous example already demonstrates this, by omitting the Tx and Op components from the pattern:

;; unnecessary trailing blanks
[_ :artist/name ?name _ _] 

;; better
[_ :artist/name ?name] 

Predicates

pred-expr                  = [ [pred fn-arg+] ]

A predicate is an arbitrary Java or Clojure function. Predicates must be pure functions, i.e. they must be free of side effects and always return the same thing given the same arguments.

Predicates are invoked against variables that are already bound, to further constrain the result set. If the predicate returns false or nil for a set of variable bindings, that set is removed.

Predicate Example

The query below uses the built-in predicates <= and < to limit the results to artists whose name sorts greater than or equal to "Q" and less than "R", i.e. the artists whose name begins with "Q":

;; query
[:find ?name
 :where
 [_ :artist/name ?name]
 [(<= "Q" ?name)]
 [(< ?name "R")]]

;; args
[db]

;; result
[["Quiet World"]
 ["Queen"]
 ["Quintessence"]
 ...]

You can use any pure function from the clojure.core namespace as a predicate.

Range Predicates

The predicates =, !=, <=, <, >, and >= are special, in that they take direct advantage of Datomic's AVET index. This makes them much more efficient than equivalent formulations using ordinary predicates. For example, the "artists whose name starts with 'Q'" query shown above is much more efficient than an equivalent version using clojure.string/starts-with?:

;; fast -- uses AVET index
[(<= "Q" ?name)]
[(< ?name "R")]

;; slower -- must consider every value of ?name
[(clojure.string/starts-with? ?name "Q")]

Unlike their Clojure equivalents, the Datomic range predicates require exactly two arguments.

The section Built-in Predicates and Functions lists all built-in predicates.
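
The claim that the two range predicates select the same names as a starts-with? check can be verified in plain Clojure. This sketch assumes Java's String ordering (via compare), since Clojure's own < and <= do not accept strings; the sample names and the helper in-q-range? are hypothetical.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample of artist names.
(def names ["Queen" "Quiet World" "Pink Floyd" "Radiohead" "Q" "quantum"])

;; Model of [(<= "Q" ?name)] together with [(< ?name "R")].
(defn in-q-range? [name]
  (and (<= (compare "Q" name) 0)     ; "Q" <= name
       (neg? (compare name "R"))))   ; name < "R"

(filter in-q-range? names)
;; => ("Queen" "Quiet World" "Q")

;; Same selection as the slower prefix check.
(= (filter in-q-range? names)
   (filter #(str/starts-with? % "Q") names))
;; => true
```

The comparison is case-sensitive, so "quantum" falls outside the range just as it fails the prefix check.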

Functions

fn-expr                    = [ [fn fn-arg+] binding]

Queries can call arbitrary Java or Clojure functions. Such functions must be pure functions, i.e. they must be free of side effects and always return the same thing given the same arguments.

Functions are invoked against variables that are already bound, and their results are interpreted via binding forms to bind additional variables.

Function Example

The example below calls the division function quot to convert track lengths from milliseconds to minutes:

;; query
[:find ?track-name ?minutes
 :in $ ?artist-name
 :where
 [?artist :artist/name ?artist-name]
 [?track :track/artists ?artist]
 [?track :track/duration ?millis]
 [(quot ?millis 60000) ?minutes]
 [?track :track/name ?track-name]]

;; args
[db "John Lennon"]

;; result
#{["Crippled Inside" 3] 
  ["Working Class Hero" 3] 
  ["Sisters, O Sisters" 3] 
  ["Only People" 3] 
  ...}

The section Built-in Predicates and Functions lists all built-in functions.

Built-in Predicates and Functions

Datomic provides the following built-in expression functions and predicates:

  • Two-argument comparison predicates: =, !=, <, <=, >, and >=.
  • Two-argument mathematical operators: +, -, *, and /.
    Note: Datomic's / operator is similar to Clojure's / (http://clojuredocs.org/clojure.core/_fs) in terms of promotion and contagion (http://clojure.org/data_structures#Data%20Structures-Numbers), with a notable exception: Datomic's / operator does not return a clojure.lang.Ratio to callers. Instead, it returns a quotient as per quot (https://clojuredocs.org/clojure.core/quot).
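
For reference, this is how the two Clojure functions mentioned in the note behave on integers in plain Clojure. The snippet shows only Clojure's own / and quot; it does not call Datomic.

```clojure
;; Clojure's / on integers that do not divide evenly yields a Ratio.
(/ 7 2)             ;; => 7/2
(ratio? (/ 7 2))    ;; => true

;; quot truncates toward zero and returns an integer.
(quot 7 2)          ;; => 3
(quot -7 2)         ;; => -3

;; A 257000 ms track is 4 whole minutes.
(quot 257000 60000) ;; => 4
```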

get-else

[(get-else src-var ent attr default) ?val-or-default]

The get-else function takes a database, an entity identifier, a cardinality-one attribute, and a default value. It returns that entity's value for the attribute, or the default value if the entity does not have a value.

The query below reports "N/A" whenever an artist's startYear is not in the database:

;; query
[:find ?artist-name ?year
 :in $ [?artist-name ...]
 :where [?artist :artist/name ?artist-name]
        [(get-else $ ?artist :artist/startYear "N/A") ?year]]

;; inputs
db, ["Crosby, Stills & Nash" "Crosby & Nash"]

;; result
#{["Crosby, Stills & Nash" 1968] 
  ["Crosby & Nash" "N/A"]}

get-some

[(get-some src-var ent attr+) [?attr ?val]]

The get-some function takes a database, an entity identifier, and one or more cardinality-one attributes, returning a tuple of the attribute id and value for the first attribute possessed by the entity.

The query below tries to find a :country/name for an entity, and then falls back to :artist/name:

;; query
[:find [?e ?attr ?name]
 :in $ ?e
 :where [(get-some $ ?e :country/name :artist/name) [?attr ?name]]]

;; inputs
db, :country/US

;; result
[:country/US 84 "United States"]
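
The lookup behavior of get-else and get-some can be approximated in plain Clojure by treating an entity as a map from attribute to value. This is an analogy only: get-else* and get-some* are hypothetical names, the entities are made-up maps, and the model returns attribute keywords where Datomic returns attribute ids.

```clojure
;; Hypothetical entities as maps of attribute -> value.
(def csn {:artist/name "Crosby, Stills & Nash" :artist/startYear 1968})
(def cn  {:artist/name "Crosby & Nash"})

;; Model of get-else: the value, or the default when the attribute is absent.
(defn get-else* [ent attr default]
  (get ent attr default))

;; Model of get-some: [attr value] for the first attribute the entity has.
(defn get-some* [ent & attrs]
  (some (fn [a] (when (contains? ent a) [a (get ent a)])) attrs))

(get-else* csn :artist/startYear "N/A")   ;; => 1968
(get-else* cn  :artist/startYear "N/A")   ;; => "N/A"
(get-some* cn :country/name :artist/name) ;; => [:artist/name "Crosby & Nash"]
```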

ground

[(ground const) binding]

The ground function takes a single argument, which must be a constant, and returns that same argument. Programs that know information at query time should prefer ground over e.g. identity, as the former can be used inside the query engine to enable optimizations.

[(ground [:a :e :i :o :u]) [?vowel ...]]

missing?

[(missing? src-var ent attr)]

The missing? predicate takes a database, an entity identifier, and an attribute, and returns true if the entity has no value for the attribute in the database.

The following query finds all artists whose start year is not recorded in the database.

;; query
[:find ?name
 :where [?artist :artist/name ?name]
        [(missing? $ ?artist :artist/startYear)]]

;; inputs
db

;; result
#{["Sigmund Snopek III"] ["De Labanda's"] ["Baby Whale"] ...}

tx-ids

[(tx-ids ?log ?start ?end) [?tx ...]]

Given a database log, start, and end, tx-ids returns a collection of transaction ids. Start and end can be specified as database t, transaction id, or instant in time, and can be nil.

The following query finds transactions from time t 1000 through 1050:

;; query
[:find [?tx ...]
 :in ?log
 :where [(tx-ids ?log 1000 1050) [?tx ...]]]

;; inputs
log

;; result
[13194139534340 13194139534312 13194139534313 13194139534314]

tx-ids is often used in conjunction with tx-data, to first locate transactions and then the data within those transactions.

tx-data

[(tx-data ?log ?tx) [[?e ?a ?v _ ?op]]]

Given a log and a transaction id, tx-data returns a collection of the datoms added by a transaction. You should not bind the transaction position of the result, as the transaction is already bound on input.

The following query finds the entities referenced by a transaction id:

;; query
[:find [?e ...]
 :in ?log ?tx
 :where [(tx-data ?log ?tx) [[?e]]]]

;; inputs
log, 13194139534312

;; result
[13194139534312 63 0 64 65 66 67 68 69 70 71 ...]

Calling Java Methods

Java methods can be used as query expression functions and predicates, and should be type hinted for performance. Java code used in this way must be on the Java process classpath.

Java methods should only be used when there is not an equivalent function in clojure.core.

The sections below show how to call both static methods and instance methods.

Calling Static Methods

Java static methods can be called with the (ClassName/methodName …) form. For example, the following code calls System.getProperties, binding property names to ?k and property values to ?v.

;; query
[:find ?k ?v
 :where [(System/getProperties) [[?k ?v]]]]

;; no args

;; result
#{["java.vendor.url.bug" "http://bugreport.sun.com/bugreport/"] 
  ["sun.cpu.isalist" ""] 
  ["sun.jnu.encoding" "UTF-8"]
  ...}

Calling Instance Methods

Java instance methods can be called with the (.methodName obj …) form. For example, the following code finds artists whose name contains "woo":

;; query
[:find ?name
 :where [_ :artist/name ?name]
        [(.contains ^String ?name "woo")]]

;; args
db

;; result
[["Lee Hazlewood"] ["Cottonwood"] ["Chris Harwood"] ["Mirkwood"]
 ["Under Milkwood"] ["Dorothy Norwood"] ["Fleetwood Mac"]]

Note the ^String type hint on ?name. Type hints outside java.lang will need to be fully qualified, and complex method signatures may require more than one hint to be unambiguous.

Calling Clojure Functions

Clojure functions can be used as query expression functions and predicates. Clojure code used in this way must be on the Clojure process classpath. The example below uses subs as an expression function to extract prefixes of words:

;; query
'[:find [?prefix ...]
  :in [?word ...]
  :where [(subs ?word 0 5) ?prefix]]

;; inputs
["hello" "antidisestablishmentarianism"]

;; result
["hello" "antid"]

Function names outside clojure.core need to be fully qualified, and their namespaces must be loaded before use in query.

Not Clauses

not-clause                 = [ src-var? 'not' clause+ ]

With not clauses, you can express that one or more logic variables inside a query must not satisfy all of a set of predicates. A not clause removes already-bound tuples that satisfy its clauses. Unless you specify an explicit src-var, not clauses will target a source named $.

Not Example

The following query uses a not clause to find the count of all artists who are not Canadian:

;; query
[:find (count ?eid)
 :where [?eid :artist/name]
 (not [?eid :artist/country :country/CA])]

;; args
[db]

;; result
[[4538]]

How Not Clauses Work

One can understand not clauses as if they turn into subqueries where all of the variables and sources unified by the negation are propagated to the subquery. The results of the subquery are removed from the enclosing query via set difference. Note that, because they are implemented using set logic, not clauses can be much more efficient than building your own expression predicate that executes a query, as expression predicates are run on each tuple in turn.
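
The set-difference reading can be illustrated in plain Clojure. The entity keywords and set names below are hypothetical stand-ins for the tuples matched by the enclosing query and by the negated subquery; this is a model, not Datomic's implementation.

```clojure
(require '[clojure.set :as set])

;; Hypothetical entities matched by the enclosing clause [?eid :artist/name].
(def artists #{:a1 :a2 :a3 :a4})

;; Hypothetical entities matched by the negated clause
;; [?eid :artist/country :country/CA].
(def canadian #{:a2 :a4})

;; The not clause removes the subquery's results from the enclosing results.
(set/difference artists canadian)
;; => #{:a1 :a3}
```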

Insufficient Binding for a Not Clause

All variables used in a not clause will unify with the surrounding query. This includes both the arguments to nested expression clauses as well as any bindings made by nested function expressions. Datomic will attempt to push the not clause down until all necessary variables are bound, and will throw an ::anom/incorrect anomaly if that is not possible.

The query below demonstrates the problem. It attempts to remove eids whose :artist/country is :country/CA, without ever finding a set of eids to begin with:

;; query 
[:find (count ?eid)
 :where (not [?eid :artist/country :country/CA])]

;; args
db

;; result
;; an ::anom/incorrect anomaly

Not-join Clauses

not-join-clause            = [ src-var? 'not-join' [variable+] clause+ ]

A not-join clause works exactly like a not clause, but also allows you to specify which variables should unify with the surrounding clause; only this list of variables needs binding before the clause can run.

The variable list ([variable+] in the grammar) specifies which variables should unify.

Not-join Example

In this next query, which returns the number of artists who didn't release an album in 1970, ?artist is in the variable list and must unify with the surrounding query. ?release is used only inside the not-join clause and will not unify.

;; query
[:find (count ?artist)
 :where [?artist :artist/name]
 (not-join [?artist]
           [?release :release/artists ?artist]
           [?release :release/year 1970])]
;; args
[db]

;; result
[[3263]]

Multiple Clauses In not Or not-join

When more than one clause is supplied to not or not-join, you should read the clauses as if they are connected by "and", just as they are in :where.

The following query counts the number of releases named "Live at Carnegie Hall" that were not by Bill Withers.

;; query
[:find (count ?r)
 :where
 [?r :release/name "Live at Carnegie Hall"]
 (not-join [?r]
           [?r :release/artists ?a]
           [?a :artist/name "Bill Withers"])]

;; args
[db]

;; result
[[2]]

Or Clauses

or-clause                  = [ src-var? 'or' (clause | and-clause)+]

With or clauses, you can express that one or more logic variables inside a query satisfy at least one of a set of predicates. An or clause constrains the result to tuples that satisfy at least one of its clauses or and-clauses.

Or Clause Example

The following query uses an or clause to find the count of all vinyl media by listing the complete set of media formats that make up vinyl in the or clause:

;; query
[:find (count ?medium)
 :where (or [?medium :medium/format :medium.format/vinyl7]
            [?medium :medium/format :medium.format/vinyl10]
            [?medium :medium/format :medium.format/vinyl12]
            [?medium :medium/format :medium.format/vinyl])]

;; args
[db]

;; result
[[74751]]

Or Clause Variables

All clauses used in an or clause must use the same set of variables, which will unify with the surrounding query. This includes both the arguments to nested expression clauses as well as any bindings made by nested function expressions. Datomic will attempt to push the or clause down until all necessary variables are bound, and will throw an exception if that is not possible.

How Or Clauses Work

One can imagine that an or clause turns into an invocation of an anonymous rule whose predicates comprise the clauses of the or. As with rules, src-vars are not currently supported within the clauses of or, but are supported on the or clause as a whole at top level.

And Clause

and-clause                 = [ 'and' clause+ ]

Inside an or clause, you may use an and clause to specify conjunction. The and clause is not available (or needed) outside of an or clause, since conjunction is the default in other clauses.

And Clause Example

The following query uses an and clause inside the or clause to find the number of artists who are either groups or females:

;; query
[:find (count ?artist)
 :where (or [?artist :artist/type :artist.type/group]
            (and [?artist :artist/type :artist.type/person]
                 [?artist :artist/gender :artist.gender/female]))]

;; args
[db]

;; result
[[2323]]
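
In set terms, the query above can be read as a union whose second branch is an intersection. The sketch below models that reading in plain Clojure; the entity keywords and set names are hypothetical, and the model says nothing about how Datomic evaluates or clauses.

```clojure
(require '[clojure.set :as set])

;; Hypothetical artists by type and gender.
(def groups  #{:beatles :queen})
(def persons #{:lennon :joplin :bowie})
(def females #{:joplin})

;; (or <group> (and <person> <female>))
(def matches
  (set/union groups
             (set/intersection persons females)))

matches         ;; => #{:beatles :queen :joplin}
(count matches) ;; => 3
```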

Or-join Clause

or-join-clause             = [ src-var? 'or-join' [variable+] (clause | and-clause)+ ]

An or-join clause is similar to an or clause, but it allows you to specify which variables should unify with the surrounding clause; only this list of variables needs binding before the clause can run. The variable list specifies which variables should unify.

Or-join Example

In this query, which returns the number of releases that are either by Canadian artists or released in 1970, ?artist is only used inside the or clause and doesn't need to unify with the outer clause. or-join is used to specify that only ?release needs unifying.

;; query
[:find (count ?release)
 :where [?release :release/name]
 (or-join [?release]
          (and [?release :release/artists ?artist]
               [?artist :artist/country :country/CA])
          [?release :release/year 1970])]

;; args
[db]

;; result
[[2124]]

With Clauses

A with-clause considers additional variables not named in the find-spec when forming the basis set for a query result. The with variables are then removed, leaving a bag (not a set!) of values to be consumed by the find-spec. This is particularly useful when finding aggregates.

Example with-clause

Consider the following incorrect query, which attempts to return the total number of heads possessed by a set of mythological monsters:

;; incorrect query
[:find (sum ?heads)
 :in [[_ ?heads]]]

;; inputs
[[["Cerberus" 3]
  ["Medusa" 1]
  ["Cyclops" 1]
  ["Chimera" 1]]]

;; result
[[4]]

The monsters clearly have six total heads, but set logic coalesces Medusa, the Cyclops, and the Chimera together, since each has the same number of heads.

A with-clause corrects this query:

;; fixed query
[:find (sum ?heads)
 :with ?monster
 :in [[?monster ?heads]]]

;; inputs
[[["Cerberus" 3]
  ["Medusa" 1]
  ["Cyclops" 1]
  ["Chimera" 1]]]

;; result
[[6]]
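
The difference between the two results can be reproduced outside Datomic. The following is a minimal sketch in plain Clojure that only mimics the set and bag behavior described above; it is not how Datomic evaluates queries, and the var names (monsters, sum-without-with, sum-with) are made up for illustration.

```clojure
;; Plain Clojure, no Datomic required. Illustrative only.
(def monsters
  [["Cerberus" 3]
   ["Medusa" 1]
   ["Cyclops" 1]
   ["Chimera" 1]])

;; Without :with, only ?heads is considered, so equal values
;; collapse into a single element of the set.
(def heads-as-set (set (map second monsters)))
(def sum-without-with (reduce + heads-as-set))

;; With :with ?monster, the basis is the set of [?monster ?heads]
;; tuples. All four tuples are distinct, so none are lost.
(def basis (set monsters))

;; Removing ?monster leaves a bag of heads, duplicates included.
(def heads-as-bag (map second basis))
(def sum-with (reduce + heads-as-bag))

(println sum-without-with) ; prints 4
(println sum-with)         ; prints 6
```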

Rules

Datomic datalog allows you to package up sets of :where clauses into named rules. These rules make query logic reusable, and also composable, meaning that you can bind portions of a query's logic at query time.

Defining a Rule

rule                       = [ [rule-head clause+]+ ]
rule-head                  = [rule-name rule-vars]
rule-name                  = plain-symbol
rule-vars                  = [variable+ | ([variable+] variable*)]

As with transactions and queries, rules are described using data structures. A rule is a list of lists. The first list in the rule is the rule-head. It names the rule and specifies its rule-vars. The rest of the lists are clauses that make up the body of the rule.

In the example below, the rule-head names the rule track-info and declares its three variables, and the three clauses of the rule body join artists to name and duration information about tracks:

[(track-info ?artist ?name ?duration)
 [?track :track/artists ?artist]
 [?track :track/name ?name]
 [?track :track/duration ?duration]]

Using a Rule

inputs                     = ':in' (src-var | variable | pattern-var | rules-var)+
rules-var                  = the symbol "%"
rule-expr                  = [ src-var? rule-name (variable | constant | '_')+]

You have to do two things to use a rule in a query. First, pass a rule set (a collection of rules) as an input and reference it in the :in section of your query with the % symbol. Second, invoke one or more rules with a rule-expr in the :where section of your query.

The example below puts the track-info rule into a collection and names the rules with %. It then invokes the rule track-info by name in the where clause:

(def rules
  '[[(track-info ?artist ?name ?duration)
     [?track :track/artists ?artist]
     [?track :track/name ?name]
     [?track :track/duration ?duration]]])

(d/q '[:find ?name ?duration
       :in $ % ?aname
       :where [?artist :artist/name ?aname]
              (track-info ?artist ?name ?duration)]
     db rules "The Beatles")

=> [["Here Comes the Sun" 186000]
    ["Hey Jude" 428000]
    ["Come Together" 257000]
    ...]

Multiple Rule Heads

When a rule name has multiple definitions, each definition is evaluated as a different logical path to the same conclusion (i.e. logical OR). In the rule below, the rule name benelux is defined three times. As a result, the rule matches artists from any of the three Benelux countries:

(def rules
  '[[(benelux ?artist)
     [?artist :artist/country :country/BE]]
    [(benelux ?artist)
     [?artist :artist/country :country/NL]]
    [(benelux ?artist)
     [?artist :artist/country :country/LU]]])

(d/q '[:find ?name
       :in $ %
       :where
       (benelux ?artist)
       [?artist :artist/name ?name]]
     db rules)

=> [["Earth and Fire"]
    ["André Brasseur"]
    ["Nico Gomez & His Afro Percussion Inc."]
    ...]
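
Because multiple rule heads act as a logical OR, the same selection can also be written inline with an or clause. The sketch below is an assumed equivalent of the benelux query, written as quoted data so it can be read without a database; the var name benelux-query is hypothetical, and no result is shown because it has not been run against the sample data.

```clojure
;; Inline alternative to the three benelux rule definitions.
;; Each branch of the or clause corresponds to one rule definition.
(def benelux-query
  '[:find ?name
    :where
    (or [?artist :artist/country :country/BE]
        [?artist :artist/country :country/NL]
        [?artist :artist/country :country/LU])
    [?artist :artist/name ?name]])

;; Assuming d is an alias for the Datomic API namespace and db is a
;; database value, as in the examples above, it would be run as:
;; (d/q benelux-query db)
```

The rule form is usually preferable when the same alternatives are needed in more than one query, since the rule can be passed to each of them.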

Required Bindings

Rules normally operate exactly like other items in a where clause. They must unify with the variables already bound, and must bind any variables not already bound.

But sometimes you know that a rule will only be correct, or only be efficient, if some variables are already bound. You can require that some variables be bound before a rule can fire by enclosing the required variables in a vector or list as the first argument to the rule. If the required variables are not bound, Datomic will report an anomaly in the incorrect category.

In the example below, the track-info rule has ?artist as a required binding, and a query that does not bind ?artist fails:

(def rules
  '[[(track-info [?artist] ?name ?duration)
     [?track :track/artists ?artist]
     [?track :track/name ?name]
     [?track :track/duration ?duration]]])

(d/q '[:find ?artist ?name ?duration
       :in $ %
       :where (track-info ?artist ?name ?duration)]
     db rules)

=> ExceptionInfo [?artist] not bound in clause: ...
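
For contrast, a query satisfies the required binding when an earlier clause binds ?artist before the rule is invoked. The following sketch reuses the artist-name lookup from the earlier track-info example; the var name bound-query is hypothetical, and the query is held as quoted data so the block does not need a database to load.

```clojure
;; Same rule as above: ?artist must be bound before the rule fires.
(def rules
  '[[(track-info [?artist] ?name ?duration)
     [?track :track/artists ?artist]
     [?track :track/name ?name]
     [?track :track/duration ?duration]]])

;; The data pattern on :artist/name binds ?artist first, so the
;; required binding is satisfied when track-info is invoked.
(def bound-query
  '[:find ?name ?duration
    :in $ % ?aname
    :where [?artist :artist/name ?aname]
           (track-info ?artist ?name ?duration)])

;; Assuming d and db as in the examples above:
;; (d/q bound-query db rules "The Beatles")
```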

Rule Database Scoping

rule-expr                  = [ src-var? rule-name (variable | constant | '_')+]

By default, rules operate against the default database named by $. As with other where clauses, you may specify a database as a src-var before the rule-name to scope the rule to that database. Databases cannot be used as arguments in a rule.

The example below passes in two sources: the $mbrainz database, and an $artists relation. Every where clause must therefore begin with a src-var name:

(d/q '[:find ?name ?duration
       :in $mbrainz $artists %
       :where [$artists ?aname]
              [$mbrainz ?artist :artist/name ?aname]
              ($mbrainz track-info ?artist ?name ?duration)]
     db [["The Beatles"]] rules)

=> [["Here Comes the Sun" 186000]
    ["Come Together" 257000]
    ["Hey Jude" 428000]
    ...]

Rule Generality

In all the examples above, the body of each rule is made up solely of data clauses. However, rules can contain any type of clause that a where clause might contain: data, expressions, or even other rule invocations.
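
As an illustration, the rule set below adds a second rule whose body contains a rule invocation and a predicate expression. This is a sketch only: short-track is a hypothetical rule name, and the 180000 threshold assumes durations are in milliseconds, which the values in the earlier results suggest but the text does not state.

```clojure
(def rules
  '[[(track-info ?artist ?name ?duration)
     [?track :track/artists ?artist]
     [?track :track/name ?name]
     [?track :track/duration ?duration]]
    ;; Hypothetical rule: invokes track-info, then filters the
    ;; results with a predicate expression on ?duration.
    [(short-track ?artist ?name)
     (track-info ?artist ?name ?duration)
     [(< ?duration 180000)]]])

;; Assuming d and db as in the examples above, it could be used as:
;; (d/q '[:find ?name
;;        :in $ % ?aname
;;        :where [?artist :artist/name ?aname]
;;               (short-track ?artist ?name)]
;;      db rules "The Beatles")
```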
