Query Reference
This topic documents the data format for Datomic datalog queries and rules. If you want to follow along at a REPL, most of the examples on this page use the mbrainz-subset database and are in the Day of Datomic Cloud repository.
Query Grammar
Syntax Used In Grammar
'' literal
"" string
[] = list or vector
{} = map {k1 v1 ...}
() grouping
| choice
? zero or one
+ one or more
Query Arg Grammar
query = [find-spec return-map-spec? with-clause? inputs? where-clauses?]
find-spec = ':find' (find-rel | find-coll | find-tuple | find-scalar)
find-rel = find-elem+
find-coll = [find-elem '...']
find-scalar = find-elem '.'
find-tuple = [find-elem+]
find-elem = (variable | pull-expr | aggregate)
variable = symbol starting with "?"
pull-expr = ['pull' variable pattern]
pattern = (pattern-name | pattern-data-literal)
pattern-name = plain-symbol
plain-symbol = symbol that does not begin with "$", "?", or "%"
aggregate = [aggregate-fn-name fn-arg+]
fn-arg = (variable | constant | src-var)
constant = any non-variable data literal
src-var = symbol starting with "$"
return-map-spec = (return-keys | return-syms | return-strs)
return-keys = ':keys' plain-symbol+
return-syms = ':syms' plain-symbol+
return-strs = ':strs' plain-symbol+
with-clause = ':with' variable+
inputs = ':in' (src-var | binding | pattern-name | rules-var)+
binding = (bind-scalar | bind-tuple | bind-coll | bind-rel)
bind-scalar = variable
bind-tuple = [ (variable | '_')+]
bind-coll = [variable '...']
bind-rel = [ [(variable | '_')+] ]
rules-var = '%'
where-clauses = ':where' clause+
clause = (not-clause | not-join-clause | or-clause | or-join-clause | expression-clause)
not-clause = [ src-var? 'not' clause+ ]
not-join-clause = [ src-var? 'not-join' [variable+] clause+ ]
or-clause = [ src-var? 'or' (clause | and-clause)+]
or-join-clause = [ src-var? 'or-join' [variable+] (clause | and-clause)+ ]
and-clause = [ 'and' clause+ ]
expression-clause = (data-pattern | pred-expr | fn-expr | rule-expr)
data-pattern = [ src-var? (variable | constant | '_')+ ]
pred-expr = [ [pred fn-arg+] ]
fn-expr = [ [fn fn-arg+] binding]
rule-expr = [ src-var? rule-name (variable | constant | '_')+]
rule-name = plain-symbol
See the pull pattern grammar for the description of the pattern-data-literal rule.
Query Rule Grammar
Note that the rule grammar reuses some terms from the query grammar above.
rule = [ [rule-head clause+]+ ]
rule-head = [rule-name rule-vars]
rule-name = plain-symbol
Queries
query = [find-spec return-map-spec? with-clause? inputs? where-clauses?]
A query consists of:
- a find-spec that specifies variables and aggregates to return
- an optional return-map-spec that returns maps instead of tuples
- an optional with-clause to control how duplicate find values are handled
- an optional inputs clause that names the databases, data, and rules available to the query engine
- optional where-clauses that constrain and transform data
At least one of inputs or where-clauses must be specified.
Datomic offers multiple ways to query:
- q: Peer API | Client API
- query: Peer API | Client API arity-1
- qseq: Peer API | Client API
Query Example
This query limits datoms to those whose :artist/name is "The Beatles", and returns the entity ids for such results:
(d/q '[:find ?e :where [?e :artist/name "The Beatles"]] db)
=> #{[26757714973567138]} ;; Result. Your exact value may differ.
Note as well the quoted vector [:find ...]. q requires the query to be passed to it as a quoted data structure. Failure to quote the query will result in an "Unable to resolve symbol:" error.
Find Specs
find-spec = ':find' (find-rel | find-coll | find-tuple | find-scalar)
find-rel = find-elem+
find-coll = [find-elem '...']
find-scalar = find-elem '.'
find-tuple = [find-elem+]
find-elem = (variable | pull-expr | aggregate)
Find Spec | Returns | Supported API |
---|---|---|
:find ?a ?b | relation | Peer & Client |
:find [?a ...] | collection | Peer |
:find [?a ?b] | single tuple | Peer |
:find ?a . | single scalar | Peer |
A find-spec is the literal :find followed by one or more find-elems, each of which can be
- a variable, whose bound value is returned directly
- a pull-expr that hierarchically selects data about an entity variable
- an aggregate that summarizes all values of a variable
The order of find-elems determines the order variables appear in a result tuple.
Variables
variable = symbol starting with "?"
A variable is a symbol that begins with ?. In a find-spec, variables control which values are returned, and what order those values appear in the result tuple.
Like symbols in the rest of Clojure, variables are case-sensitive, e.g. ?track and ?Track are different variables.
Find Variables Example
The query below specifies that the result tuples should contain the track name and duration. Note that the ?track and ?e variables are used in the query, but are not returned.
(d/q '[:find ?name ?duration :where [?e :artist/name "The Beatles"] [?track :track/artists ?e] [?track :track/name ?name] [?track :track/duration ?duration]] db)
=> [["Here Comes the Sun" 186000] ["Come Together" 257000] ["Hey Jude" 428000] ...]
In this example, the variable ?track unifies. The clauses for :track/artists, :track/name, and :track/duration all must have ?track in the entity slot to unify around the same entity.
Pull Expressions
pull-expr = ['pull' variable pattern]
pattern = (pattern-name | pattern-data-literal)
A pull expression returns information about a variable as specified by a pattern. Each variable can appear in at most one pull expression. Pull expressions are fully described in the Pull reference.
Finding Pull Expression Example
Rather than returning just a variable, this query uses a pull expression to specify which attributes to return values for about the entity whose :artist/name is "The Beatles":
(d/q '[:find (pull ?e [:artist/startYear :artist/endYear]) :where [?e :artist/name "The Beatles"]] db)
=> [[#:artist{:startYear 1957, :endYear 1970}]]
Separation of Concerns
The Pull API provides a declarative interface where you specify what information you want for an entity without specifying how to find it. Pull expressions can be used in queries to find entities and return an explicit map with the specified information about each entity.
This example uses songs-by-artist to find all tracks for an artist, then uses different pull patterns to pull different information about the resulting entities.
(def songs-by-artist
  '[:find (pull ?t pattern)
    :in $ pattern ?artist-name
    :where [?a :artist/name ?artist-name]
           [?t :track/artists ?a]])
(def track-releases-and-artists
  [:track/name
   {:medium/_tracks
    [{:release/_media
      [{:release/artists [:artist/name]}
       :release/name]}]}])
;; Pull only the :track/name
(d/q songs-by-artist db [:track/name] "Bob Dylan")
=> ([#:track{:name "California"}]
    [#:track{:name "Grasshoppers in My Pillow"}]
    [#:track{:name "Baby Please Don't Go"}]
    [#:track{:name "Man of Constant Sorrow"}]
    [#:track{:name "Only a Hobo"}] ...)
;; Use a different pull pattern to get the track name, the release name, and the artists on the release.
(d/q songs-by-artist db track-releases-and-artists "Bob Dylan")
=> ([{:track/name "California", :medium/_tracks #:release{:_media #:release{:artists [#:artist{:name "Bob Dylan"}], :name "A Rare Batch of Little White Wonder"}}}]
    [{:track/name "Grasshoppers in My Pillow", :medium/_tracks #:release{:_media #:release{:artists [#:artist{:name "Bob Dylan"}], :name "A Rare Batch of Little White Wonder"}}}]
    [{:track/name "Baby Please Don't Go", :medium/_tracks #:release{:_media #:release{:artists [#:artist{:name "Bob Dylan"}], :name "A Rare Batch of Little White Wonder"}}}]
    [{:track/name "Man of Constant Sorrow", :medium/_tracks #:release{:_media #:release{:artists [#:artist{:name "Bob Dylan"}], :name "A Rare Batch of Little White Wonder"}}}]
    [{:track/name "Only a Hobo", :medium/_tracks #:release{:_media #:release{:artists [#:artist{:name "Bob Dylan"}], :name "A Rare Batch of Little White Wonder"}}}] ...)
Custom Query Functions
You can write your own custom functions for use as aggregate, predicate, or function clauses in query. To make these functions available in Datomic Pro, follow the instructions to deploy transaction functions. To make these functions available in Datomic Cloud, follow the instructions to deploy as an ion.
You can cancel custom query functions.
Return Maps
Supplying a return-map will cause the query to return maps instead of tuples. Each entry in the :keys / :strs / :syms clause will become a key mapped to the corresponding item in the :find clause.
keyword | symbols become |
---|---|
:keys | keyword keys |
:strs | string keys |
:syms | symbol keys |
In the example below, the :keys artist and release are used to construct a map for each row returned.
(d/q '[:find ?artist-name ?release-name :keys artist release :where [?release :release/name ?release-name] [?release :release/artists ?artist] [?artist :artist/name ?artist-name]] db)
=> #{{:artist "George Jones" :release "With Love"} {:artist "Shocking Blue" :release "Hello Darkness / Pickin' Tomatoes"} {:artist "Junipher Greene" :release "Friendship"} ...}
Return maps also preserve the order of the :find clause. In particular, return maps
- implement clojure.lang.Indexed
- support nth
- support vector style destructuring
For example, the first result from the previous query can be destructured in two ways:
;; positional destructure
(let [[artist release] (first result)] ...)
;; key destructure
(let [{:keys [artist release]} (first result)] ...)
Aggregates
aggregate = [aggregate-fn-name fn-arg+]
fn-arg = (variable | constant | src-var)
An aggregate function appears in the find clause and transforms a result. Aggregate functions can take variables, constants, or src-vars as arguments.
Aggregates appear as lists in a find-spec. Query variables not in aggregate expressions will group the results and appear intact in the result.
Example Aggregate
This query binds ?a ?b ?c ?d, then groups by ?a and ?c, and produces a result for each aggregate expression for each group, yielding 5-tuples.
[:find ?a (min ?b) (max ?b) ?c (sample 12 ?d) :where ...]
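The grouping behavior can be sketched in plain Clojure, outside of Datomic. This is only a model under assumed data: rows stands in for hypothetical [?a ?b] bindings, and group-min-max is an illustrative helper, not a Datomic API.

```clojure
;; Hypothetical bindings for [?a ?b]; in a real query these come from :where.
(def rows #{[:x 3] [:x 7] [:y 5]})

(defn group-min-max
  "Models [:find ?a (min ?b) (max ?b)]: group by ?a, aggregate ?b per group."
  [rows]
  (set (for [[a group] (group-by first rows)
             :let [bs (map second group)]]
         [a (apply min bs) (apply max bs)])))

(group-min-max rows)
;; => #{[:x 3 7] [:y 5 5]}
```

The variable outside any aggregate expression (?a here) acts as the grouping key and appears unchanged in each result tuple.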
Built-In Aggregates
Each of these is described in more detail below.
aggregate | # returned | notes |
---|---|---|
avg | 1 | |
count | 1 | counts duplicates |
count-distinct | 1 | counts only unique values |
distinct | n | set of distinct values |
max | 1 | compares all types, not just numbers |
max n | n | returns up to n largest |
median | 1 | |
min | 1 | compares all types, not just numbers |
min n | n | returns up to n smallest |
rand n | n | random up to n with duplicates |
sample n | n | sample up to n, no duplicates |
stddev | 1 | |
sum | 1 | |
variance | 1 | |
Aggregates Returning a Single Value
The aggregation functions that return a single value are listed below, and all behave as their names suggest.
- min and max
The following query finds the smallest and largest track lengths:
(d/q '[:find (min ?dur) (max ?dur) :where [_ :track/duration ?dur]] db)
=> [[3000 3894000]]
The min and max aggregation functions support all database types (via comparators), not just numbers.
- sum
The following query uses sum to find the total number of tracks on all media in the database.
(d/q '[:find (sum ?count) :with ?medium :where [?medium :medium/trackCount ?count]] db)
=> [[100759]]
- count and count-distinct
More than one artist can have the same name. The following query uses count to report the total number of artist names, and count-distinct to report the total number of unique artist names.
(d/q '[:find (count ?name) (count-distinct ?name) :with ?artist :where [?artist :artist/name ?name]] db)
=> [[4601 4588]]
Note the use of a with-clause so that equal names do not coalesce.
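Why the with-clause matters can be modeled in plain Clojure. This is a sketch with made-up data, not Datomic's implementation: it assumes results behave like sets of tuples, so projecting to ?name alone collapses equal names before count sees them, while keeping ?artist in the tuple preserves them.

```clojure
;; Hypothetical [?artist ?name] bindings: two distinct artists share a name.
(def bindings #{[1 "Fire"] [2 "Fire"] [3 "Queen"]})

;; Without :with, only ?name is kept, and the set collapses duplicates.
(def names-only (set (map (fn [[_ n]] [n]) bindings)))

;; With :with ?artist, ?artist stays in the tuple during aggregation.
(def names-with-artist (set (map (fn [[a n]] [a n]) bindings)))

(count names-only)        ;; => 2
(count names-with-artist) ;; => 3
```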
- Statistics: median, avg, variance, and stddev
Are musicians becoming more verbose when naming songs? The following query reports the median, avg, and stddev of song title lengths (in characters), and includes year in the find set to break out the results by year.
(d/q '[:find ?year (median ?namelen) (avg ?namelen) (stddev ?namelen) :with ?track :where [?track :track/name ?name] [(count ?name) ?namelen] [?medium :medium/tracks ?track] [?release :release/media ?medium] [?release :release/year ?year]] db)
=> [[1968 16 18.92181098534824 12.898760656290333] [1969 16 18.147895557287608 11.263945894977244] [1970 15 18.007481296758105 12.076103750401026] [1971 15 18.203682039283294 13.715552693168124] [1972 15 17.907170949841063 11.712941060399375] [1973 16 18.19300100438759 12.656827911058622]]
Aggregates Returning Collections
Where n is specified, fewer than n items may be returned if not enough items are available.
- distinct
The distinct aggregate returns the set of distinct values in the collection.
(d/q '[:find (distinct ?sort-name) :with ?artist :where [?artist :artist/name "Fire"] [?artist :artist/sortName ?sort-name]] db)
=> [[#{"Fire"}]]
- min n and max n
The min n and max n aggregates return up to n least/greatest items. The following query returns the five shortest and five longest track lengths in the database.
(d/q '[:find (min 5 ?millis) (max 5 ?millis) :where [?track :track/duration ?millis]] db)
=> [[[3000 4000 5000 6000 7000] [3894000 3407000 2928000 2802000 2775000]]]
- rand n and sample n
The rand n aggregate selects exactly n items with potential for duplicates, and the sample n aggregate returns up to n distinct items.
The following query returns two random and two sampled artist names.
(d/q '[:find (rand 2 ?name) (sample 2 ?name) :where [_ :artist/name ?name]] db)
=> [[("Four Tops" "Ethel McCoy") ["Gábor Szabó" "Zapata"]]]
Inputs
inputs = ':in' (src-var | binding | pattern-name | rules-var)+
The inputs clause names and orders the inputs to a query. Inputs can be
- a database name, i.e. a symbol starting with $
- a variable binding, e.g. a symbol starting with ?
- a pattern name, i.e. a plain symbol
- the rules var, i.e. the symbol %
A query has as many inputs as it has :args values, and the inputs bind the :args values for use inside the query.
Default Inputs
Most queries operate against a single database. So as a convenience, the inputs clause can be elided, and will default to a single database whose name is the dollar sign $.
For example, the following three queries are equivalent:
;; with only $ as input, the :in clause can be dropped
[:find ?e :where [?e :age ?age]]
;; name the database $ explicitly; $ can still be omitted from where clauses
[:find ?e :in $ :where [?e :age ?age]]
;; use $data to name the database
[:find ?e :in $data :where [$data ?e :age ?age]]
Inputs Example
The query below takes the artist name as an input, so that this parameterized query can be re-used with different artist names.
Inside the query, $ is bound to db, and ?name is bound to "The Beatles".
(d/q '[:find (pull ?e [:artist/startYear :artist/endYear]) :in $ ?name :where [?e :artist/name ?name]] db "The Beatles")
=> [[#:artist{:startYear 1957, :endYear 1970}]]
Pattern Inputs
An input can be a pattern name, specifying a pattern to be used in pull expressions in the find clause.
The query below binds pattern to a pull pattern that selects the artist's start year and end year.
(d/q '[:find (pull ?e pattern) :in $ ?name pattern :where [?e :artist/name ?name]] db "The Beatles" [:artist/startYear :artist/endYear])
=> [[#:artist{:startYear 1957, :endYear 1970}]]
Binding Forms
A binding form tells how to map data onto variables. A variable name like ?artist-name is the simplest kind of binding, assigning its value directly to the variable. Other forms support destructuring the data into a tuple, a collection, or a relation:
Binding Form | Binds |
---|---|
?a | scalar |
[?a ?b] | tuple |
[?a ...] | collection |
[ [?a ?b ] ] | relation |
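As a loose analogy only, the four shapes resemble ordinary Clojure destructuring and for comprehensions. The sketch below is plain Clojure with sample values taken from the examples on this page; the var names scalar, tuple, coll, and rel are illustrative, and this is not how the query engine is implemented.

```clojure
;; scalar ?a: the input is bound as-is
(def scalar (let [a "John Lennon"] a))

;; tuple [?a ?b]: one input, destructured positionally
(def tuple (let [[a b] ["John Lennon" "Mind Games"]] {:a a :b b}))

;; collection [?a ...]: one binding of ?a per element
(def coll (vec (for [a ["Paul McCartney" "George Harrison"]] {:a a})))

;; relation [[?a ?b]]: one binding of ?a and ?b per inner tuple
(def rel (vec (for [[a b] [["John Lennon" "Mind Games"]
                           ["Paul McCartney" "Ram"]]]
                {:a a :b b})))

(count coll) ;; => 2
(count rel)  ;; => 2
```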
Tuple Binding
bind-tuple = [ (variable | '_')+]
A tuple binding binds a set of variables to a single value each, passed in as a collection. The query below binds both artist name and release name to find the entity ids for releases of John Lennon's Mind Games:
(d/q '[:find ?release :in $ [?artist-name ?release-name] :where [?artist :artist/name ?artist-name] [?release :release/artists ?artist] [?release :release/name ?release-name]] db ["John Lennon" "Mind Games"])
=> #{[17592186157686] [17592186157672] [17592186157690] [17592186157658]}
Collection Binding
bind-coll = [variable '...']
A collection binding binds a single variable to multiple values passed in as a collection. This can be used to ask "or" questions involving the values of the collection binding.
This query shows how to ask "What releases are associated with either Paul McCartney or George Harrison?"
(d/q '[:find ?release-name :in $ [?artist-name ...] :where [?artist :artist/name ?artist-name] [?release :release/artists ?artist] [?release :release/name ?release-name]] db ["Paul McCartney" "George Harrison"])
=> #{["My Sweet Lord"] ["Electronic Sound"] ["Give Me Love (Give Me Peace on Earth)"] ["All Things Must Pass"] ...}
Relation Binding
bind-rel = [ [(variable | '_')+] ]
A relation binding is fully general, binding multiple variables positionally to a relation (collection of tuples) passed in. This can be used to ask "or" questions involving variables in the relation binding. For example, what releases are associated with either John Lennon's Mind Games or Paul McCartney's Ram?
(d/q '[:find ?release :in $ [[?artist-name ?release-name]] :where [?artist :artist/name ?artist-name] [?release :release/artists ?artist] [?release :release/name ?release-name]] db [["John Lennon" "Mind Games"] ["Paul McCartney" "Ram"]])
=> #{[17592186157686] [17592186157672] [17592186157690] [17592186157658] [17592186063566]}
Where Clauses
where-clauses = ':where' clause+
clause = (not-clause | not-join-clause | or-clause | or-join-clause | expression-clause)
expression-clause = (data-pattern | pred-expr | fn-expr | rule-expr)
A where clause limits the results returned. The most common kind of where clause is a data pattern that is matched against datoms in the database, but there are many other kinds of clauses to support negation, disjunction, predicates, and functions.
Implicit Joins
where clauses implicitly join. If a variable appears in multiple clauses, it must bind to the same value in each of them, i.e. those matches must unify.
To start we'll form two queries to find the years of releases of The Beatles and Janis Joplin separately. (Remember the database covers only from 1968 to 1973).
;; query to find years of all Beatles releases
(d/q '[:find ?year
       :where [?artist :artist/name "The Beatles"]
              [?release :release/artists ?artist]
              [?release :release/year ?year]]
     db)
=> [[1969] [1970] [1973] [1968]] ;; (trivia question: where is that 1973 result from?)
;; query to find years of all Janis Joplin releases
(d/q '[:find ?year
       :where [?artist :artist/name "Janis Joplin"]
              [?release :release/artists ?artist]
              [?release :release/year ?year]]
     db)
=> [[1969] [1971] [1972] [1973]]
We can take advantage of implicit joins by combining these queries, using the same ?year variable in both :release/year clauses while looking up the artists separately. Now the years when both The Beatles and Janis Joplin released an album can be found.
(d/q '[:find ?year :where [?artist :artist/name "The Beatles"] [?release :release/artists ?artist] [?release :release/year ?year] [?artist2 :artist/name "Janis Joplin"] [?release2 :release/artists ?artist2] [?release2 :release/year ?year]] db)
=> [[1969] [1973]]
?year was matched for both ?release and ?release2.
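For this particular query, unifying on ?year has the same effect as intersecting the two single-artist results above. A minimal plain-Clojure sketch, with the years copied from those results:

```clojure
(require '[clojure.set :as set])

;; Years copied from the two single-artist query results.
(def beatles-years #{1968 1969 1970 1973})
(def joplin-years  #{1969 1971 1972 1973})

(set/intersection beatles-years joplin-years)
;; => #{1969 1973}
```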
Data Patterns
data-pattern = [ src-var? (variable | constant | '_')+ ]
A data pattern is a tuple that begins with an optional src-var which binds to a relation. The src-var is followed by one or more elements that match the tuples of that relation in order. The relation is almost always a Datomic database, so the components are E, A, V, Tx, and Op. The elements of a data pattern can be
- variables, which unify and bind to values
- constants, which limit results to tuples that match the constant
- the blank _, which matches anything
The example below, utilizing the mbrainz database via the mbrainz importer, has a single data pattern which operates as follows:
- $mbrainz binds to the db argument
- the constant :artist/name limits results to datoms with that value in their Attribute (A) position
- the constant "The Beatles" limits results to datoms with that value in their Value (V) position
- the variables ?e, ?tx, and ?op bind to those positions in the matching datoms, if any
(d/q '[:find ?e ?tx ?op :in $mbrainz :where [$mbrainz ?e :artist/name "The Beatles" ?tx ?op]] db)
=> [[26757714973567138 13194139533421 true]]
Blanks
Sometimes you don't care about certain elements of the tuples in a query, but you must put something in the clause in order to get to the positions that you do care about. The underscore symbol (_) is a blank placeholder, and matches anything without binding or unifying.
For example, if you wanted a random artist name, you would need a data pattern that talked about A and V, but you would not care about the E component which precedes them. The following query uses the blank in the E position:
(d/q '[:find (sample 1 ?name) :where [_ :artist/name ?name]] db)
=> [[["Aerosmith"]]]
Do not use a dummy variable instead of the blank. This will make the query engine do extra work by tracking binding and unification for a variable that you never intend to use. It will also make human readers do extra work, puzzling out that the dummy variable is intentionally not used.
Blanks do not cause unification. Clauses with multiple blanks will not unify despite appearing to have the same symbol used.
Implicit Blanks
In data patterns, you should elide any trailing components you don't care about, rather than explicitly padding with blanks. The previous example already demonstrates this by omitting the Tx and Op components from the pattern.
;; unnecessary trailing blanks
[_ :artist/name ?name _ _]
;; better
[_ :artist/name ?name]
Predicates
pred-expr = [ [pred fn-arg+] ]
A predicate is an arbitrary Java or Clojure function. Predicates must be pure functions, i.e. they must be free of side effects and always return the same thing given the same arguments.
Predicates are invoked against variables that are already bound to further constrain the result set. If the predicate returns false or nil for a set of variable bindings, that set is removed.
Predicate Example
The query below uses the built-in predicates <= and < to limit the results to artists whose name sorts greater than or equal to "Q" and less than "R", i.e. the artists whose name begins with "Q":
(d/q '[:find ?name :where [_ :artist/name ?name] [(<= "Q" ?name)] [(< ?name "R")]] db)
=> [["Quiet World"] ["Queen"] ["Quintessence"] ...]
You can use any pure function from the clojure.core namespace as a predicate.
Range Predicates
The predicates =, !=, <=, <, >, and >= are special, in that they take direct advantage of Datomic's AVET index. This makes them much more efficient than equivalent formulations using ordinary predicates. For example, the "artists whose name starts with 'Q'" query shown above is much more efficient than an equivalent version using starts-with?:
;; fast -- uses AVET index
[(<= "Q" ?name)]
[(< ?name "R")]
;; slower -- must consider every value of ?name
[(clojure.string/starts-with? ?name "Q")]
Unlike their Clojure equivalents, the Datomic range predicates require exactly two arguments.
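A related difference shows up when checking the same range outside a query. Clojure's own < and <= accept any number of arguments but only compare numbers, so a plain-Clojure version of the "Q" range check would use compare instead. The helper name in-q-range? and the sample names below are illustrative.

```clojure
(defn in-q-range?
  "True when \"Q\" <= s < \"R\" under String ordering."
  [s]
  (and (<= (compare "Q" s) 0)
       (neg? (compare s "R"))))

(filter in-q-range? ["Queen" "Pink Floyd" "Quintessence" "Ramones"])
;; => ("Queen" "Quintessence")
```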
The section Built-in Predicates and Functions lists all built-in predicates.
Functions
fn-expr = [ [fn fn-arg+] binding]
Queries can call arbitrary Java or Clojure functions. Such functions must be pure functions, i.e. they must be free of side effects and always return the same thing given the same arguments.
Functions are invoked against variables that are already bound, and their results are interpreted via binding forms to bind additional variables.
Function Example
The example below uses the integer division function quot to convert track lengths from milliseconds to minutes:
(d/q '[:find ?track-name ?minutes :in $ ?artist-name :where [?artist :artist/name ?artist-name] [?track :track/artists ?artist] [?track :track/duration ?millis] [(quot ?millis 60000) ?minutes] [?track :track/name ?track-name]] db "John Lennon")
=> #{["Crippled Inside" 3] ["Working Class Hero" 3] ["Sisters, O Sisters" 3] ["Only People" 3] ...}
An alternate example combines predicates with a function binding to find artists whose names sort at or after "Q" and are shorter than 7 characters, and shows the number of characters in each name.
(d/q '[:find ?name ?count :where [_ :artist/name ?name] [(<= "Q" ?name)] [(count ?name) ?count] [(< ?count 7)]] db)
=> [["井上陽水" 4] ["Spring" 6] ["頭脳警察" 4] ["Sonoma" 6] ["Selda" 5] ["Tony" 4] ["Smile" 5] ["Sun Ra" 6] ...]
The section Built-in Predicates and Functions lists all built-in functions.
Built-in Predicates and Functions
Datomic provides the following built-in expression functions and predicates:
- Two-argument comparison predicates: =, !=, <, <=, >, and >=.
- Two-argument mathematical operators: +, -, *, and /.
- All of the functions from the clojure.core namespace of Clojure, except eval.
- A set of functions and predicates that are aware of Datomic data structures, documented below:
- [Peer API] A set of functions that are aware of Datomic's Log:
Comparison and math operators work as in Clojure, with the exception that / works like quot when called with integer arguments. This avoids introducing Clojure's ratio type to callers in other languages that cannot support it.
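For reference, this is how the two operations differ in plain Clojure, outside a query. Inside a query, / on integers is described above as behaving like the quot results shown here.

```clojure
;; Plain Clojure: dividing integers can produce a ratio.
(/ 7 2)          ;; => 7/2
(ratio? (/ 7 2)) ;; => true

;; quot truncates toward zero and stays an integer.
(quot 7 2)       ;; => 3
(quot -7 2)      ;; => -3
```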
get-else
[(get-else src-var ent attr default) ?val-or-default]
The get-else function takes a database, an entity identifier, a cardinality-one attribute, and a default value. It returns that entity's value for the attribute, or the default value if the entity does not have a value.
The query below reports "N/A" whenever an artist's startYear is not in the database:
(d/q '[:find ?artist-name ?year :in $ [?artist-name ...] :where [?artist :artist/name ?artist-name] [(get-else $ ?artist :artist/startYear "N/A") ?year]] db ["Crosby, Stills & Nash" "Crosby & Nash"])
=> #{["Crosby, Stills & Nash" 1968] ["Crosby & Nash" "N/A"]}
get-some
[(get-some src-var ent attr+) [?attr ?val]]
The get-some function takes a database, an entity identifier, and one or more cardinality-one attributes, returning a tuple of the attribute id and value for the first attribute possessed by the entity.
The query below tries to find a :country/name for an entity, and then falls back to :artist/name:
(d/q '[:find ?e ?attr ?name :in $ ?e :where [(get-some $ ?e :country/name :artist/name) [?attr ?name]]] db :country/US)
=> [[:country/US 84 "United States"]]
ground
[(ground const) binding]
The ground function takes a single argument, which must be a constant, and returns that same argument. Programs that know information at query time should prefer ground over e.g. identity, as the former can be used inside the query engine to enable optimizations.
[(ground [:a :e :i :o :u]) [?vowel ...]]
missing?
[(missing? src-var ent attr)]
The missing? predicate takes a database, an entity identifier, and an attribute and returns true if the entity has no value for the attribute in the database.
The following query finds all artists whose start year is not recorded in the database.
(d/q '[:find ?name :where [?artist :artist/name ?name] [(missing? $ ?artist :artist/startYear)]] db)
=> #{["Sigmund Snopek III"] ["De Labanda's"] ["Baby Whale"] ...}
q
The q function allows you to perform nested queries, and takes the same arguments as the variable-arity q API function.
The example below uses a nested query to bind the ?duration variable for use by an enclosing query that returns the entity id and name of the shortest tracks:
(d/q '[:find ?track ?name ?duration :where [(q '[:find (min ?duration) :where [_ :track/duration ?duration]] $) [[?duration]]] [?track :track/duration ?duration] [?track :track/name ?name]] db)
=> [[8334298138708635 "Nutopian International Anthem" 3000] [9007199254790299 "Nutopian International Anthem" 3000]]
tuple
Calling Java Methods
Java methods can be used as query expression functions and predicates, and should be type hinted for performance. Java code used in this way must be on the Java process classpath.
Java methods should only be used when there is not an equivalent function in clojure.core.
The sections below show how to call both static methods and instance methods.
Calling Static Methods
Java static methods can be called with the (ClassName/methodName …) form. For example, the following code calls System.getProperties, binding property names to ?k and property values to ?v.
(d/q '[:find ?k ?v :where [(System/getProperties) [[?k ?v]]]])
=> #{["java.vendor.url.bug" "http://bugreport.sun.com/bugreport/"] ["sun.cpu.isalist" ""] ["sun.jnu.encoding" "UTF-8"] ...}
Calling Instance Methods
Java instance methods can be called with the (.methodName obj …) form. For example, the following code finds artists whose name contains "woo":
(d/q '[:find ?name :where [_ :artist/name ?name] [(.contains ^String ?name "woo")]] db)
=> [["Lee Hazlewood"] ["Cottonwood"] ["Chris Harwood"] ["Mirkwood"] ["Under Milkwood"] ["Dorothy Norwood"] ["Fleetwood Mac"]]
Note the ^String type hint on ?name. Type hints outside java.lang will need to be fully qualified, and complex method signatures may require more than one hint to be unambiguous.
Calling Clojure Functions
Clojure functions can be used as query expression functions and predicates. The example below uses subs as an expression function to extract prefixes of words:
(d/q '[:find ?prefix :in [?word ...] :where [(subs ?word 0 5) ?prefix]] ["hello" "antidisestablishmentarianism"])
=> [["hello"] ["antid"]]
Not Clauses
not-clause = [ src-var? 'not' clause+ ]
With not clauses, you can express that one or more logic variables inside a query must not satisfy all of a set of predicates. A not clause removes already-bound tuples that satisfy the clauses. Unless you specify an explicit src-var, not clauses will target a source named $.
Not Example
The following query uses a not clause to find the count of all artists who are not Canadian:
(d/q '[:find (count ?eid) :where [?eid :artist/name] (not [?eid :artist/country :country/CA])] db)
=> [[4538]]
How Not Clauses Work
Not clauses are evaluated like a subquery and return a set of tuples that is used to remove tuples from the query's result set via set difference. Since the removal of tuples from the query's result set is performed using set logic, not clauses have the potential to be much more efficient than expression predicates which must be applied iteratively to each tuple in the result set instead of to the entire result set.
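The set-difference step can be pictured with a small plain-Clojure model. The tuples below are hypothetical and the var names are illustrative; this sketches the idea described above, not Datomic's internals.

```clojure
(require '[clojure.set :as set])

;; Hypothetical [?eid] tuples bound by the surrounding query.
(def artists #{[1] [2] [3] [4]})

;; Hypothetical [?eid] tuples matched by the clauses inside (not ...).
(def canadian #{[2] [4]})

;; The not clause removes the matched tuples in a single set operation.
(set/difference artists canadian)
;; => #{[1] [3]}
```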
Insufficient Binding for a Not Clause
All variables used in a not clause will unify with the surrounding query. This includes both the arguments to nested expression clauses as well as any bindings made by nested function expressions. Datomic will attempt to push the not clause down until all necessary variables are bound, and will throw an ::anom/incorrect anomaly if that is not possible.
The query below demonstrates the problem. It attempts to remove eids whose :artist/country is :country/CA, without ever finding a set of eids to begin with:
(d/q '[:find (count ?eid)
       :where (not [?eid :artist/country :country/CA])]
     db)
(ex-data *e)
=> {:cognitect.anomalies/category :cognitect.anomalies/incorrect, :cognitect.anomalies/message "[?eid] not bound in not clause: (not-join [?eid] [?eid :artist/country :country/CA])", :db/error :db.error/insufficient-binding}
Not-join Clauses
not-join-clause = [ src-var? 'not-join' [variable+] clause+ ]
A not-join clause works exactly like a not clause, but also allows you to specify which variables should unify with the surrounding clause; only this list of variables needs binding before the clause can run.
The [variable+] list specifies which variables should unify.
Not-join Example
In this next query, which returns the number of artists who didn't release an album in 1970, ?artist is in the variable list and must unify with the surrounding query. ?release is used only inside the not-join clause and will not unify.
(d/q '[:find (count ?artist) :where [?artist :artist/name] (not-join [?artist] [?release :release/artists ?artist] [?release :release/year 1970])] db)
=> [[3263]]
Multiple Clauses in not or not-join
When more than one clause is supplied to not or not-join, they are evaluated as if connected by an and clause.
The following query counts the number of releases named "Live at Carnegie Hall" that were not by Bill Withers.
(d/q '[:find (count ?r)
       :where [?r :release/name "Live at Carnegie Hall"]
              (not-join [?r]
                [?r :release/artists ?a]
                [?a :artist/name "Bill Withers"])]
     db)
=> [[2]]
Or Clauses
or-clause = [ src-var? 'or' (clause | and-clause)+]
With or clauses, you can express that one or more logic variables inside a query satisfy at least one of a set of predicates. An or clause constrains the result to tuples that satisfy at least one of its clauses or and-clauses.
The following query uses an or clause to find the count of all vinyl media by listing the complete set of media that make up vinyl in the or clause:
Or Clause Example
(d/q '[:find (count ?medium)
       :where (or [?medium :medium/format :medium.format/vinyl7]
                  [?medium :medium/format :medium.format/vinyl10]
                  [?medium :medium/format :medium.format/vinyl12]
                  [?medium :medium/format :medium.format/vinyl])]
     db)
=> [[74751]]
Or Clause Variables
All clauses used in an or clause must use the same set of variables, which will unify with the surrounding query. This includes both the arguments to nested expression clauses as well as any bindings made by nested function expressions. Datomic will attempt to push the or clause down until all necessary variables are bound, and will throw an exception if that is not possible.
How Or Clauses Work
You can think of an or clause as an invocation of an anonymous rule whose definitions are the clauses inside the or. As with rules, src-vars are not currently supported within the clauses of or, but are supported on the or clause as a whole at top level.
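As a rough illustration of that mental model, the vinyl query from the Or Clause Example could be written with an explicitly named rule in place of the or clause. The rule name vinyl and the var names vinyl-rules and vinyl-query are hypothetical, and the sketch shows the analogy only, not how Datomic actually processes or clauses.

```clojure
;; One rule definition per alternative in the or clause.
;; The rule name `vinyl` is a hypothetical choice.
(def vinyl-rules
  '[[(vinyl ?medium) [?medium :medium/format :medium.format/vinyl7]]
    [(vinyl ?medium) [?medium :medium/format :medium.format/vinyl10]]
    [(vinyl ?medium) [?medium :medium/format :medium.format/vinyl12]]
    [(vinyl ?medium) [?medium :medium/format :medium.format/vinyl]]])

;; The query invokes the rule where the or clause used to be.
(def vinyl-query
  '[:find (count ?medium)
    :in $ %
    :where (vinyl ?medium)])

;; Usage, assuming `d` and `db` are set up as elsewhere on this page:
;; (d/q vinyl-query db vinyl-rules)
```

Under this reading, the rule-based query would be expected to return the same count as the or-clause version.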
And Clause
and-clause = [ 'and' clause+ ]
Inside an or clause, you may use an and clause to specify conjunction. And clauses are not available (or needed) outside of an or clause, since conjunction is the default in other clauses.
And Clause Example
The following query uses an and clause inside the or clause to find the number of artists who are either groups or females:
(d/q '[:find (count ?artist)
       :where (or [?artist :artist/type :artist.type/group]
                  (and [?artist :artist/type :artist.type/person]
                       [?artist :artist/gender :artist.gender/female]))]
     db)
=> [[2323]]
Or-join Clause
or-join-clause = [ src-var? 'or-join' [variable+] (clause | and-clause)+ ]
An or-join clause is similar to an or clause, but it allows you to specify which variables should unify with the surrounding clause; only this list of variables needs binding before the clause can run. [variable+] specifies which logic variables should unify.
Or-join Example
In this query, which returns the number of releases that are either by Canadian artists or released in 1970, ?artist is only used inside the or-join clause and doesn't need to unify with the outer clause. or-join is used to specify that only ?release needs unifying.
(d/q '[:find (count ?release)
       :where [?release :release/name]
              (or-join [?release]
                (and [?release :release/artists ?artist]
                     [?artist :artist/country :country/CA])
                [?release :release/year 1970])]
     db)
=> [[2124]]
With Clauses
A with-clause considers additional variables not named in the find-spec when forming the basis set for a query result. The with variables are then removed, leaving a bag (not a set!) of values to be consumed by the find-spec. This is particularly useful when finding aggregates.
Example with-clause
Consider the following example, where our intention is to find the year of each Bob Dylan release.
;; incorrect query
(d/q '[:find ?year
       :where [?artist :artist/name "Bob Dylan"]
              [?release :release/artists ?artist]
              [?release :release/year ?year]]
     db)
=> [[1969] [1970] [1971] [1973] [1968]]
Bob Dylan was clearly a more prolific artist than this. The query returned the years Bob Dylan released records, rather than the release years of each of the records.
Set logic combines all of the releases that came out in the same year, and this is not what is wanted for the particular query.
A with-clause corrects this query:
;; fixed query
(d/q '[:find ?year
       :with ?release
       :where [?artist :artist/name "Bob Dylan"]
              [?release :release/artists ?artist]
              [?release :release/year ?year]]
     db)
=> [[1973] [1971] [1973] [1973] [1970] [1968] [1971] [1969] [1968] [1970] [1973] [1970] [1971] [1970] [1973] [1968] [1971] [1973] [1970] [1969] [1971] [1970]]
This result is more like what we wanted. It lists the year of each release between 1968 and 1973, with one entry per release.
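The difference between the two results can be sketched in plain Clojure. The release ids and tuples below are made up for illustration, and the sketch models only the set-versus-bag semantics described above, not Datomic's implementation.

```clojure
;; Hypothetical [?release ?year] tuples; the release ids are made up.
(def tuples #{[:r1 1968] [:r2 1968] [:r3 1970]})

;; Without :with, only ?year forms the result set, so duplicate years collapse.
(def without-with (set (map (fn [[_ year]] [year]) tuples)))

;; With :with ?release, the distinct [?release ?year] tuples form the basis set.
;; ?release is dropped afterwards, leaving a bag that keeps duplicate years.
(def with-release (mapv (fn [[_ year]] [year]) tuples))

(count without-with)       ; 2
(count with-release)       ; 3
(frequencies with-release) ; {[1968] 2, [1970] 1}
```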
Rules
Datomic datalog allows you to package up sets of :where clauses into named rules. These rules make query logic reusable, and also composable, meaning that you can bind portions of a query's logic at query time.
Defining a Rule
rule = [ [rule-head clause+]+ ]
rule-head = [rule-name rule-vars]
rule-name = plain-symbol
rule-vars = [variable+ | ([variable+] variable*)]
As with transactions and queries, rules are described using data structures. A rule is a list of lists. The first list in the rule is the rule-head. It names the rule and specifies its rule-vars. The rest of the lists are clauses that make up the body of the rule.
In the example below, the rule-head names the rule track-info, and the three clauses of the rule body join artists to name and duration information about tracks:
[(track-info ?artist ?name ?duration)
 [?track :track/artists ?artist]
 [?track :track/name ?name]
 [?track :track/duration ?duration]]
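Because a rule is plain data, its parts can be picked apart with ordinary sequence functions. The sketch below quotes the track-info rule and separates the head from the body; the var names (track-info-rule, rule-head, and so on) are hypothetical and only serve the illustration.

```clojure
;; The track-info rule, quoted so it is plain data.
(def track-info-rule
  '[(track-info ?artist ?name ?duration)
    [?track :track/artists ?artist]
    [?track :track/name ?name]
    [?track :track/duration ?duration]])

(def rule-head (first track-info-rule)) ; (track-info ?artist ?name ?duration)
(def rule-name (first rule-head))       ; track-info
(def rule-vars (rest rule-head))        ; (?artist ?name ?duration)
(def rule-body (rest track-info-rule))  ; the three data patterns
```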
Using a Rule
inputs = ':in' (src-var | binding | pattern-name | rules-var)+
rules-var = the symbol "%"
rule-expr = [ src-var? rule-name (variable | constant | '_')+]
You have to do two things to use a rule in a query. First, you have to pass a rule set (collection of rules) as an input source and reference it in the :in section of your query using the '%' symbol. Second, you have to invoke one or more rules with a rule-expr in the :where section of your query.
The example below puts the track-info rule into a collection and names the rules with %. It then invokes the rule track-info by name in the where clause:
(def rules
  '[[(track-info ?artist ?name ?duration)
     [?track :track/artists ?artist]
     [?track :track/name ?name]
     [?track :track/duration ?duration]]])
(d/q '[:find ?name ?duration
       :in $ % ?aname
       :where [?artist :artist/name ?aname]
              (track-info ?artist ?name ?duration)]
     db rules "The Beatles")
=> [["Here Comes the Sun" 186000] ["Hey Jude" 428000] ["Come Together" 257000] ...]
Multiple Rule Heads
When a rule has multiple definitions, they are evaluated as different logical paths to the same conclusion (i.e. logical OR). In the rule below, the rule name benelux is defined three times. As a result, the rule matches artists from any of the three Benelux countries:
(def rules
  '[[(benelux ?artist) [?artist :artist/country :country/BE]]
    [(benelux ?artist) [?artist :artist/country :country/NL]]
    [(benelux ?artist) [?artist :artist/country :country/LU]]])
(d/q '[:find ?name
       :in $ %
       :where (benelux ?artist)
              [?artist :artist/name ?name]]
     db rules)
=> [["Earth and Fire"] ["André Brasseur"] ["Nico Gomez & His Afro Percussion Inc."] ...]
Required Bindings
Rules normally operate exactly like other items in a where clause. They must unify with the variables already bound, and must bind any variables not already bound.
But sometimes you know that a rule will only be correct, or only be efficient, if some variables are already bound. You can require that some variables be bound before a rule can fire by enclosing the required variables in a vector or list as the first argument to the rule. If the required variables are not bound, Datomic will report an incorrect anomaly.
In the example below, the track-info rule has ?artist as a required binding, and a query that does not bind ?artist fails:
(def rules
  '[[(track-info [?artist] ?name ?duration)
     [?track :track/artists ?artist]
     [?track :track/name ?name]
     [?track :track/duration ?duration]]])
(d/q '[:find ?artist ?name ?duration
       :in $ %
       :where (track-info ?artist ?name ?duration)]
     db rules)
(ex-data *e)
=> {:cognitect.anomalies/category :cognitect.anomalies/incorrect,
    :cognitect.anomalies/message "[?artist] not bound in clause: (track-info ?artist ?name ?duration)",
    :db/error :db.error/insufficient-binding}
This can be fixed by binding ?artist before the rule is invoked:
(def rules
  '[[(track-info [?artist] ?name ?duration)
     [?track :track/artists ?artist]
     [?track :track/name ?name]
     [?track :track/duration ?duration]]])
(d/q '[:find ?artist ?name ?duration
       :in $ %
       :where [?artist :artist/name "The Rolling Stones"]
              (track-info ?artist ?name ?duration)]
     db rules)
=> [[62821696364619130 "Sittin’ on a Fence" 184040] [62821696364619130 "2000 Light Years From Home" 287000] [62821696364619130 "Jumpin' Jack Flash" 242066] [62821696364619130 "What to Do" 158000] ...]
Rule Database Scoping
rule-expr = [ src-var? rule-name (variable | constant | '_')+]
By default, rules operate against the default database named by $. As with other where clauses, you may specify a database as a src-var before the rule-name to scope the rule to that database. Databases cannot be used as arguments in a rule.
The example below passes in two sources: the $mbrainz database, and an $artists relation. Every where clause must therefore begin with a src-var name:
(def rules
  '[[(track-info [?artist] ?name ?duration)
     [?track :track/artists ?artist]
     [?track :track/name ?name]
     [?track :track/duration ?duration]]])
(d/q '[:find ?name ?duration
       :in $mbrainz $artists %
       :where [$artists ?aname]
              [$mbrainz ?artist :artist/name ?aname]
              ($mbrainz track-info ?artist ?name ?duration)]
     db [["The Beatles"]] rules)
=> [["Here Comes the Sun" 186000] ["Come Together" 257000] ["Hey Jude" 428000] ...]
Rule Generality
In all the examples above, the body of each rule is made up solely of data clauses. However, rules can contain any type of clause that a where clause might contain: data, expressions, or even other rule invocations.
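As a minimal sketch of a more general rule body, the rule set below adds a second rule that invokes track-info and then applies an expression clause. The rule name long-track, the var name general-rules, and the 300000 threshold are hypothetical; the threshold assumes durations are stored in milliseconds, as the sample results on this page suggest.

```clojure
;; `long-track` and the 300000 threshold are hypothetical illustrations.
(def general-rules
  '[[(track-info ?artist ?name ?duration)
     [?track :track/artists ?artist]
     [?track :track/name ?name]
     [?track :track/duration ?duration]]
    ;; This rule body contains a rule invocation and an expression clause.
    [(long-track ?artist ?name)
     (track-info ?artist ?name ?duration)
     [(> ?duration 300000)]]])

;; Usage, assuming `d` and `db` are set up as elsewhere on this page:
;; (d/q '[:find ?name
;;        :in $ % ?aname
;;        :where [?artist :artist/name ?aname]
;;               (long-track ?artist ?name)]
;;      db general-rules "The Beatles")
```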