
Query Reference

This topic documents the data format for Datomic datalog queries and rules.

Query Grammar

Syntax Used In Grammar

'' literal
"" string
[] list or vector
{} map {k1 v1 ...}
() grouping
| choice
? zero or one
+ one or more
* zero or more

Query Arg Grammar

query             = [find-spec return-map-spec? with-clause? inputs? where-clauses?]
find-spec = ':find' find-rel
find-rel = find-elem+
find-elem = (variable | pull-expr | aggregate)
variable = symbol starting with "?"
pull-expr = ['pull' variable pattern]
pattern = (pattern-name | pattern-data-literal)
pattern-name = plain-symbol
plain-symbol = symbol that does not begin with "$", "?", or "%"
aggregate = [aggregate-fn-name fn-arg+]
fn-arg = (variable | constant | src-var)
constant = any non-variable data literal
src-var = symbol starting with "$"
return-map-spec = (return-keys | return-syms | return-strs)
return-keys = ':keys' plain-symbol+
return-syms = ':syms' plain-symbol+
return-strs = ':strs' plain-symbol+
with-clause = ':with' variable+
inputs = ':in' (src-var | binding | pattern-name | rules-var)+
binding = (bind-scalar | bind-tuple | bind-coll | bind-rel)
bind-scalar = variable
bind-tuple = [ (variable | '_')+]
bind-coll = [variable '...']
bind-rel = [ [(variable | '_')+] ]
rules-var = '%'
where-clauses = ':where' clause+
clause = (not-clause | not-join-clause | or-clause | or-join-clause | expression-clause)
not-clause = [ src-var? 'not' clause+ ]
not-join-clause = [ src-var? 'not-join' [variable+] clause+ ]
or-clause = [ src-var? 'or' (clause | and-clause)+]
or-join-clause = [ src-var? 'or-join' [variable+] (clause | and-clause)+ ]
and-clause = [ 'and' clause+ ]
expression-clause = (data-pattern | pred-expr | fn-expr | rule-expr)
data-pattern = [ src-var? (variable | constant | '_')+ ]
pred-expr = [ [pred fn-arg+] ]
fn-expr = [ [fn fn-arg+] binding]
rule-expr = [ src-var? rule-name (variable | constant | '_')+]
rule-name = plain-symbol

See the pull pattern grammar for the description of the pattern-data-literal rule.

Query Rule Grammar

Note that the rule grammar reuses some terms from the query grammar above.

rule                       = [ [rule-head clause+]+ ]
rule-head                  = [rule-name rule-vars]
rule-name                  = plain-symbol
rule-vars                  = [variable+ | ([variable+] variable*)]

Queries

query                      = [find-spec return-map-spec? with-clause? inputs? where-clauses?]

A query consists of:

  • a find-spec that specifies variables and aggregates to return
  • an optional return-map-spec that causes the query to return maps instead of tuples
  • an optional with-clause to control how duplicate find values are handled
  • an optional inputs clause that names the databases, data, and rules available to the query engine
  • optional where-clauses that constrain and transform data

At least one of inputs or where-clauses must be specified.
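As a minimal sketch of how these parts fit together, the query below combines a find-spec, an inputs clause, and where-clauses. The attributes :artist/name, :release/artists, and :release/name are assumptions taken from a hypothetical music schema; this reference does not define them.

```clojure
;; Hypothetical schema: :artist/name, :release/artists, :release/name.
[:find ?release-name                          ; find-spec
 :in $ ?artist-name                           ; inputs: a database and a scalar
 :where [?artist :artist/name ?artist-name]   ; where-clauses
        [?release :release/artists ?artist]
        [?release :release/name ?release-name]]
```

Because ?artist and ?release each appear in more than one clause, the clauses must agree on their values.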

Find Specs

find-spec                  = ':find' find-rel
find-rel                   = find-elem+
find-elem                  = (variable | pull-expr | aggregate)

A find-spec is the literal :find followed by one or more find-elems, which can be

  • a variable, whose bound value is returned directly
  • a pull-expr that hierarchically selects data about an entity variable
  • an aggregate that summarizes all values of a variable

The order of find-elems determines the order variables appear in a result tuple.

Variables

variable                   = symbol starting with "?"

A variable is a symbol that begins with ?. In a find-spec, variables control which values are returned and the order in which those values appear in the result tuple.

Like the rest of Clojure, variables are case-sensitive. ?track and ?Track are different variables.

Pull Expressions

pull-expr                  = ['pull' variable pattern]
pattern                    = (pattern-name | pattern-data-literal)

A pull expression returns information about a variable as specified by a pattern. Pull expressions are fully described in the Pull reference.

NOTE Each variable can appear in at most one pull expression.

Custom Query Functions

You can write your own custom functions for use as aggregate, predicate, or function clauses in a query. To make these functions available in Datomic Cloud, add them to the ion :allow list, as described under Calling Clojure Functions.

You can cancel custom query functions.

Return Maps

NOTE Return maps are available in client 0.8.78.

Supplying a return-map will cause the query to return maps instead of tuples. Each entry in the :keys / :strs / :syms clause will become a key mapped to the corresponding item in the :find clause.

keyword     symbols become
:keys       keyword keys
:strs       string keys
:syms       symbol keys
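The following sketch shows a :keys return map. The attribute names are assumptions from a hypothetical music schema, and the symbols artist and release are chosen only for illustration.

```clojure
;; Hypothetical schema: :release/name, :release/artists, :artist/name.
;; The first :keys symbol pairs with the first find-elem, the second with the second.
[:find ?artist-name ?release-name
 :keys artist release
 :where [?release :release/name ?release-name]
        [?release :release/artists ?artist]
        [?artist :artist/name ?artist-name]]
;; Each result is a map of the shape {:artist <artist name> :release <release name>}
```

Replacing :keys with :strs or :syms would give string or symbol keys instead.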

Aggregates

aggregate                  = [aggregate-fn-name fn-arg+]
fn-arg                     = (variable | constant | src-var)

An aggregate function appears in the find clause and transforms a result. Aggregate functions can take variables, constants, or src-vars as arguments.

Aggregates appear as lists in a find-spec. Query variables not in aggregate expressions will group the results and appear intact in the result.

Built-In Aggregates

Each of these is described in more detail below.

aggregate         # returned   notes
avg               1
count             1            counts duplicates
count-distinct    1            counts only unique values
distinct          n            set of distinct values
max               1            compares all types, not just numbers
max n             n            returns up to n largest
median            1
min               1            compares all types, not just numbers
min n             n            returns up to n smallest
rand n            n            random up to n with duplicates
sample n          n            sample up to n, no duplicates
stddev            1
sum               1
variance          1
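A short sketch of aggregates in a find-spec, assuming hypothetical :artist/country and :release/year attributes. The first query shows how a variable outside the aggregate groups the results; the second shows the form that takes n.

```clojure
;; One result tuple per distinct ?country, each carrying a count of artists.
[:find ?country (count ?artist)
 :where [?artist :artist/country ?country]]

;; Aggregates that take n put n before the variable: up to 3 smallest years.
[:find (min 3 ?year)
 :where [_ :release/year ?year]]
```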

Inputs

inputs                     = ':in' (src-var | binding | pattern-name | rules-var)+

The inputs clause names and orders the inputs to a query. Inputs can be

  • a database name, i.e. a symbol starting with $
  • a variable binding, e.g. a symbol starting with ?
  • a pattern name, i.e. a plain symbol
  • the rules var, i.e. the symbol %

A query has as many inputs as it has :args values, and the inputs bind the :args values for use inside the query.

Binding Forms

A binding form tells how to map data onto variables. A variable name like ?artist-name is the simplest kind of binding, assigning its value directly to the variable. Other forms support destructuring the data into a tuple, a collection, or a relation:

Binding Form     Binds
?a               scalar
[?a ?b]          tuple
[?a ...]         collection
[[?a ?b]]        relation

Tuple Binding

bind-tuple                 = [ (variable | '_')+]

A tuple binding binds a set of variables to a single value each, passed in as a collection.

Collection Binding

bind-coll                  = [variable '...']

A collection binding binds a single variable to multiple values passed in as a collection, and can be used to ask "or" questions about the collection.

Relation Binding

bind-rel                   = [ [(variable | '_')+] ]

A relation binding is fully general, binding multiple variables positionally to a relation (collection of tuples) passed in. This can be used to ask "or" questions involving multiple variables.
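The three destructuring forms can be sketched side by side. The attribute names and the placeholder argument values ("Artist A", "Release X", and so on) are illustrative assumptions, not data from any real database.

```clojure
;; Tuple binding. Args: db, ["Artist A" "Release X"]
[:find ?release
 :in $ [?artist-name ?release-name]
 :where [?artist :artist/name ?artist-name]
        [?release :release/artists ?artist]
        [?release :release/name ?release-name]]

;; Collection binding. Args: db, ["Artist A" "Artist B"]
;; Matches releases by either artist.
[:find ?release-name
 :in $ [?artist-name ...]
 :where [?artist :artist/name ?artist-name]
        [?release :release/artists ?artist]
        [?release :release/name ?release-name]]

;; Relation binding. Args: db, [["Artist A" "Release X"] ["Artist B" "Release Y"]]
;; Matches either (artist, release) pair.
[:find ?release
 :in $ [[?artist-name ?release-name]]
 :where [?artist :artist/name ?artist-name]
        [?release :release/artists ?artist]
        [?release :release/name ?release-name]]
```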

Where Clauses

where-clauses              = ':where' clause+
clause                     = (not-clause | not-join-clause | or-clause | or-join-clause | expression-clause)
expression-clause          = (data-pattern | pred-expr | fn-expr | rule-expr)

A where clause limits the results returned. The most common kind of where clause is a data pattern that is matched against datoms in the database, but there are many other kinds of clauses to support negation, disjunction, predicates, and functions.

Implicit Joins

Where clauses are implicitly joined. If the same variable appears in multiple clauses, those matches must unify.

Data Patterns

data-pattern               = [ src-var? (variable | constant | '_')+ ]

A data pattern is a tuple that begins with an optional src-var which binds to a relation. The src-var is followed by one or more elements that match the tuples of that relation in order. The relation is almost always a Datomic database, so the components are E, A, V, Tx, and Op. The elements of a data pattern can be

  • variables, which unify and bind to values
  • constants, which limit results to tuples that match the constant
  • the blank _ which matches anything

Blanks

The underscore symbol (_) is a blank placeholder that matches anything without binding or unifying. Because blanks do not cause unification, a clause with multiple blanks does not require those positions to hold the same value, even though the same symbol appears in each.

Implicit Blanks

In data patterns, you should elide any trailing components you don't care about, rather than explicitly padding with blanks.
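To illustrate, the two queries below ask the same question; :artist/name is an assumed attribute.

```clojure
;; Preferred: trailing V, Tx, and Op components are elided.
[:find ?e
 :where [?e :artist/name]]

;; Same meaning, but padded with unnecessary blanks.
[:find ?e
 :where [?e :artist/name _ _ _]]
```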

Predicates

pred-expr                  = [ [pred fn-arg+] ]

A predicate is an arbitrary Java or Clojure function. Predicates must be pure functions, i.e. they must be free of side effects and always return the same thing given the same arguments.

Predicates are invoked against variables that are already bound, to further constrain the result set. If the predicate returns false or nil for a set of variable bindings, that set is removed.

Built-in Predicates and Functions lists all built-in predicates.

Range Predicates

The predicates =, !=, <=, <, >, and >= are special, in that they take direct advantage of Datomic's AVET index. This makes them much more efficient than equivalent formulations using ordinary predicates. For example, a query for artists whose name starts with 'Q' is much more efficient when written with range predicates than an equivalent version using the starts-with? function.

Unlike their Clojure equivalents, the Datomic range predicates require exactly two arguments.
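A sketch of the "names starting with Q" query in both styles, assuming a hypothetical :artist/name string attribute. Note that each range predicate takes exactly two arguments, so the range needs two clauses.

```clojure
;; Range predicates: "Q" <= ?name < "R"
[:find ?name
 :where [_ :artist/name ?name]
        [(<= "Q" ?name)]
        [(< ?name "R")]]

;; Ordinary predicate: same intent, but called once per candidate ?name.
[:find ?name
 :where [_ :artist/name ?name]
        [(clojure.string/starts-with? ?name "Q")]]
```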

Built-in Predicates and Functions lists all built-in predicates.

Functions

fn-expr                    = [ [fn fn-arg+] binding]

Queries can call arbitrary Java or Clojure functions. Such functions must be pure functions, i.e. they must be free of side effects and always return the same thing given the same arguments.

Functions are invoked against variables that are already bound, and their results are interpreted via binding forms to bind additional variables.

Built-in Predicates and Functions lists all built-in functions.

Built-in Predicates and Functions

Datomic provides the following built-in expression functions and predicates:

  • Two-argument comparison predicates: =, !=, <, <=, >, and >=.
  • Two-argument mathematical operators: +, -, *, and /.

Note: Datomic's / operator is similar to Clojure's / in terms of promotion and contagion with a notable exception: Datomic's / operator does not return a clojure.lang.Ratio to callers. Instead, it returns a quotient as per quot.

get-else

[(get-else src-var ent attr default) ?val-or-default]

get-else takes a database, an entity identifier, a cardinality-one attribute, and a default value. It returns that entity's value for the attribute, or the default value if the entity does not have a value.
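For instance, assuming hypothetical :artist/name and :artist/startYear attributes, a query might substitute a default for artists that have no start year:

```clojure
;; Args: db, a collection of artist names
[:find ?artist-name ?year
 :in $ [?artist-name ...]
 :where [?artist :artist/name ?artist-name]
        [(get-else $ ?artist :artist/startYear "N/A") ?year]]
```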

get-some

[(get-some src-var ent attr+) [?attr ?val]]

get-some takes a database, an entity identifier, and one or more cardinality-one attributes, returning a tuple of the attribute's entity id and the value for the first attribute possessed by the entity.

ground

[(ground const) binding]

ground takes a single argument, which must be a constant, and returns that same argument. Programs that know information at query time should prefer ground over e.g. identity, as the former can be used inside the query engine to enable optimizations.
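A small sketch: ground combined with a collection binding turns a constant collection into a variable with one binding per element. The keywords here are arbitrary illustration values.

```clojure
;; ?vowel is bound to each of :a, :e, :i, :o, :u in turn.
[:find ?vowel
 :where [(ground [:a :e :i :o :u]) [?vowel ...]]]
```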

missing?

[(missing? src-var ent attr)]

missing? takes a database, an entity identifier, and an attribute, and returns true if the entity has no value for the attribute in the database.
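As a sketch, again assuming hypothetical :artist/name and :artist/startYear attributes, this finds the names of artists with no recorded start year:

```clojure
[:find ?name
 :where [?artist :artist/name ?name]
        [(missing? $ ?artist :artist/startYear)]]
```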

q

[(q query & inputs) ?ret]

q takes the same arguments as the variable-arity q API function, and can be used to perform nested queries.


tuple

[(tuple ?a ...) ?tup]

Given one or more values, tuple returns a tuple containing each value. See also untuple.

untuple

[(untuple ?tup) [?a ?b]]

Given a tuple, untuple can be used to name each element of the tuple. See also tuple.
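The pair can be sketched with two queries that take plain values as inputs; the argument values 1 and 2 are arbitrary.

```clojure
;; Args: 1 and 2. ?tup is bound to the tuple [1 2].
[:find ?tup
 :in ?a ?b
 :where [(tuple ?a ?b) ?tup]]

;; Args: [1 2]. ?a is bound to 1 and ?b to 2.
[:find ?b
 :in ?tup
 :where [(untuple ?tup) [?a ?b]]]
```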

Calling Java Methods

Java methods can be used as query expression functions and predicates, and should be type hinted for performance. Java code used in this way must be on the Java process classpath.

Java methods should only be used when there is not an equivalent function in clojure.core.

Calling Static Methods

[(ClassName/methodName) [[?k ?v]]]

Java static methods can be called with the (ClassName/methodName …) form.

Calling Instance Methods

[(.methodName obj)]
[(.methodName ^Class obj)]
[(.methodName obj ...) ?ret]

Java instance methods can be called with the (.methodName obj …) form. Type hints outside java.lang must be fully qualified, and complex method signatures may require more than one hint to be unambiguous.
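A sketch of both call styles using standard JDK methods; :artist/name and the search string "woo" are assumptions made for illustration.

```clojure
;; Static method: System.getProperties() returns a map-like object,
;; destructured here with a relation binding.
[:find ?k ?v
 :where [(System/getProperties) [[?k ?v]]]]

;; Instance method used as a predicate, with a type hint on the target.
;; String is in java.lang, so the hint needs no package qualifier.
[:find ?name
 :where [_ :artist/name ?name]
        [(.contains ^String ?name "woo")]]
```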

Calling Clojure Functions

Clojure functions can be used as query expression functions and predicates. Function names outside clojure.core need to be fully qualified and included on the classpath for Pro, or in the ion :allow list for Cloud. The query engine will automatically require namespaces as necessary.

Not Clauses

not-clause                 = [ src-var? 'not' clause+ ]

A not clause expresses that one or more logic variables inside a query must not satisfy all of a set of predicates. It removes already-bound tuples that satisfy its clauses. Unless you specify an explicit src-var, not clauses will target a source named $.

How Not Clauses Work

One can understand not clauses as if they turn into subqueries where all of the variables and sources unified by the negation are propagated to the subquery. The results of the subquery are removed from the enclosing query via set difference. Note that, because they are implemented using set logic, not clauses can be much more efficient than building your own expression predicate that executes a query, as expression predicates are run on each tuple in turn.

Insufficient Binding for a Not Clause

All variables used in a not clause will unify with the surrounding query. This includes both the arguments to nested expression clauses as well as any bindings made by nested function expressions. Datomic will attempt to push the not clause down until all necessary variables are bound, and will throw an ::anom/incorrect anomaly if that is not possible.

Not-join Clauses

not-join-clause            = [ src-var? 'not-join' [variable+] clause+ ]

A not-join clause works exactly like a not clause, but also allows you to specify which variables should unify with the surrounding clause; only this list of variables needs binding before the clause can run.

The [variable+] list specifies which variables should unify.

Multiple Clauses In not Or not-join

When more than one clause is supplied to not or not-join, you should read the clauses as if they are connected by "and", just as they are in :where.
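Both forms can be sketched against a hypothetical music schema (the attributes and the :country/CA ident are assumptions). Note that not and not-join are written as lists inside :where.

```clojure
;; not: count artists whose country is not :country/CA.
[:find (count ?eid)
 :where [?eid :artist/name]
        (not [?eid :artist/country :country/CA])]

;; not-join: only ?artist unifies with the enclosing query; ?release is local.
;; The two inner clauses read as "and": artists with no release in 1970.
[:find (count ?artist)
 :where [?artist :artist/name]
        (not-join [?artist]
          [?release :release/artists ?artist]
          [?release :release/year 1970])]
```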

Or Clauses

or-clause                  = [ src-var? 'or' (clause | and-clause)+]

An or clause expresses that one or more logic variables inside a query satisfy at least one of a set of predicates. It constrains the result to tuples that satisfy at least one of its clauses or and-clauses.

Or Clause Variables

All clauses used in an or clause must use the same set of variables, which will unify with the surrounding query. This includes both the arguments to nested expression clauses as well as any bindings made by nested function expressions. Datomic will attempt to push the or clause down until all necessary variables are bound, and will throw an exception if that is not possible.

How Or Clauses Work

One can imagine that an or clause turns into an invocation of an anonymous rule whose predicates comprise the clauses of the or. As with rules, src-vars are not currently supported within the clauses of or, but are supported on the or clause as a whole at top level.

And Clause

and-clause                 = [ 'and' clause+ ]

Inside an or clause, you may use an and clause to specify conjunction. The and clause is not available (or needed) outside of an or clause, since conjunction is the default in other clauses.
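A sketch of or with a nested and, assuming hypothetical :artist/type and :artist/gender attributes and enum idents. Every branch uses the same variable, ?artist.

```clojure
;; Count artists that are either groups, or persons who are female.
[:find (count ?artist)
 :where (or [?artist :artist/type :artist.type/group]
            (and [?artist :artist/type :artist.type/person]
                 [?artist :artist/gender :artist.gender/female]))]
```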

Or-join Clause

or-join-clause             = [ src-var? 'or-join' [variable+] (clause | and-clause)+ ]

An or-join clause is similar to an or clause, but it allows you to specify which variables should unify with the surrounding clause; only this list of variables needs binding before the clause can run. The [variable+] list specifies which variables should unify.
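As a sketch under the same hypothetical schema, or-join lets the branches use different internal variables, since only ?release unifies with the enclosing query:

```clojure
;; Count releases that are by a Canadian artist, or were released in 1970.
[:find (count ?release)
 :where [?release :release/name]
        (or-join [?release]
          (and [?release :release/artists ?artist]
               [?artist :artist/country :country/CA])
          [?release :release/year 1970])]
```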

With Clauses

A with-clause considers additional variables not named in the find-spec when forming the basis set for a query result. The with variables are then removed, leaving a bag (not a set!) of values to be consumed by the find-spec. This is particularly useful when finding aggregates.
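The effect is easiest to see on a small relation passed in as an input. The monster data below is made up for illustration.

```clojure
;; Args for both queries:
;; [["Cerberus" 3] ["Medusa" 1] ["Cyclops" 1] ["Chimera" 1]]

;; Without :with, the ?heads values collapse into the set {3, 1}, so the sum is 4.
[:find (sum ?heads)
 :in [[_ ?heads]]]

;; With :with, each [?monster ?heads] pair is kept, so the sum is 3 + 1 + 1 + 1 = 6.
[:find (sum ?heads)
 :with ?monster
 :in [[?monster ?heads]]]
```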

Rules

Datomic datalog allows you to package up sets of :where clauses into named rules. These rules make query logic reusable, and also composable, meaning that you can bind portions of a query's logic at query time.

Defining a Rule

rule                       = [ [rule-head clause+]+ ]
rule-head                  = [rule-name rule-vars]
rule-name                  = plain-symbol
rule-vars                  = [variable+ | ([variable+] variable*)]

As with transactions and queries, rules are described using data structures. A rule is a list of lists. The first list in the rule is the rule-head. It names the rule and specifies its rule-vars. The rest of the lists are clauses that make up the body of the rule.

Using a Rule

inputs                     = ':in' (src-var | binding | pattern-name | rules-var)+
rules-var                  = the symbol "%"
rule-expr                  = [ src-var? rule-name (variable | constant | '_')+]

You have to do two things to use a rule in a query. First, you have to pass a rule set (collection of rules) as an input source and reference it in the :in section of your query using the '%' symbol. Second, you have to invoke one or more rules with a rule-expr in the :where section of your query.
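The two steps might look like this. The rule name track-info and the :track/* and :artist/name attributes are hypothetical, chosen only to illustrate the shape of a rule set and its invocation.

```clojure
;; A rule set containing one rule. The first list is the rule-head;
;; the remaining vectors are the clauses of the rule body.
[[(track-info ?artist ?name ?duration)
  [?track :track/artists ?artist]
  [?track :track/name ?name]
  [?track :track/duration ?duration]]]

;; Args: db, the rule set above, an artist name.
;; % receives the rule set; the rule-expr invokes the rule in :where.
[:find ?name ?duration
 :in $ % ?artist-name
 :where [?artist :artist/name ?artist-name]
        (track-info ?artist ?name ?duration)]
```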

Multiple Rule Heads

Rules with multiple definitions will evaluate them as different logical paths to the same conclusion (i.e. logical OR). In the rule below, the rule name benelux is defined three times.
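A sketch of such a rule set, assuming a hypothetical :artist/country attribute and :country/BE, :country/NL, and :country/LU idents:

```clojure
;; benelux succeeds if any one of the three definitions matches.
[[(benelux ?artist)
  [?artist :artist/country :country/BE]]
 [(benelux ?artist)
  [?artist :artist/country :country/NL]]
 [(benelux ?artist)
  [?artist :artist/country :country/LU]]]
```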

Required Bindings

Rules normally operate exactly like other items in a where clause. They must unify with the variables already bound, and must bind any variables not already bound.

But sometimes you know that a rule will only be correct, or only be efficient, if some variables are already bound. You can require that some variables be bound before a rule can fire by enclosing the required variables in a vector or list as the first argument to the rule. If the required variables are not bound, Datomic will report an incorrect anomaly.
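For illustration, a hypothetical track-info rule that requires ?artist to be bound before it fires could be written with the required variable in a leading vector:

```clojure
;; [?artist] is required; ?name and ?duration may be bound by the rule.
[[(track-info [?artist] ?name ?duration)
  [?track :track/artists ?artist]
  [?track :track/name ?name]
  [?track :track/duration ?duration]]]
```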

Rule Database Scoping

rule-expr                  = [ src-var? rule-name (variable | constant | '_')+]

By default, rules operate against the default database named by $. As with other where clauses, you may specify a database as a src-var before the rule-name to scope the rule to that database. Databases cannot be used as arguments in a rule.

Rule Generality

Rules can contain any type of clause that a where clause might contain: data, expressions, or even other rule invocations.
