Query Reference
This topic documents the data format for Datomic datalog queries and rules.
Query Grammar
Syntax Used In Grammar
'' literal
"" string
[] = list or vector
{} = map {k1 v1 ...}
() grouping
| choice
? zero or one
+ one or more
Query Arg Grammar
query = [find-spec return-map-spec? with-clause? inputs? where-clauses?]
find-spec = ':find' find-rel
find-rel = find-elem+
find-elem = (variable | pull-expr | aggregate)
variable = symbol starting with "?"
pull-expr = ['pull' variable pattern]
pattern = (pattern-name | pattern-data-literal)
pattern-name = plain-symbol
plain-symbol = symbol that does not begin with "$", "?", or "%"
aggregate = [aggregate-fn-name fn-arg+]
fn-arg = (variable | constant | src-var)
constant = any non-variable data literal
src-var = symbol starting with "$"
return-map-spec = (return-keys | return-syms | return-strs)
return-keys = ':keys' plain-symbol+
return-syms = ':syms' plain-symbol+
return-strs = ':strs' plain-symbol+
with-clause = ':with' variable+
inputs = ':in' (src-var | binding | pattern-name | rules-var)+
binding = (bind-scalar | bind-tuple | bind-coll | bind-rel)
bind-scalar = variable
bind-tuple = [ (variable | '_')+]
bind-coll = [variable '...']
bind-rel = [ [(variable | '_')+] ]
rules-var = '%'
where-clauses = ':where' clause+
clause = (not-clause | not-join-clause | or-clause | or-join-clause | expression-clause)
not-clause = [ src-var? 'not' clause+ ]
not-join-clause = [ src-var? 'not-join' [variable+] clause+ ]
or-clause = [ src-var? 'or' (clause | and-clause)+]
or-join-clause = [ src-var? 'or-join' [variable+] (clause | and-clause)+ ]
and-clause = [ 'and' clause+ ]
expression-clause = (data-pattern | pred-expr | fn-expr | rule-expr)
data-pattern = [ src-var? (variable | constant | '_')+ ]
pred-expr = [ [pred fn-arg+] ]
fn-expr = [ [fn fn-arg+] binding]
rule-expr = [ src-var? rule-name (variable | constant | '_')+]
rule-name = plain-symbol
See the pull pattern grammar for the description of the pattern-data-literal rule.
Query Rule Grammar
Note that the rule grammar reuses some terms from the query grammar above.
rule = [ [rule-head clause+]+ ]
rule-head = [rule-name rule-vars]
rule-name = plain-symbol
Queries
query = [find-spec return-map-spec? with-clause? inputs? where-clauses?]
A query consists of:
- a find-spec that specifies variables and aggregates to return
- an optional return-map-spec that makes the query return maps instead of tuples
- an optional with-clause to control how duplicate find values are handled
- an optional inputs clause that names the databases, data, and rules available to the query engine
- optional where-clauses that constrain and transform data
At least one of inputs or where-clauses must be specified.
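As a minimal sketch of the overall shape, the query below combines a find-spec, an inputs clause, and where-clauses. The attribute names (:artist/name, :release/artists, :release/name) are assumptions taken from a hypothetical music schema, not something this reference defines.

```clojure
;; Hypothetical schema: :artist/name, :release/artists, :release/name
[:find ?release-name                       ; find-spec
 :in $ ?artist-name                        ; inputs: a database and a scalar binding
 :where                                    ; where-clauses
 [?artist :artist/name ?artist-name]
 [?release :release/artists ?artist]
 [?release :release/name ?release-name]]
```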
Find Specs
find-spec = ':find' find-rel
find-rel = find-elem+
find-elem = (variable | pull-expr | aggregate)
A find-spec is the literal :find followed by one or more find-elems, which can be
- a variable, whose bound value is returned directly
- a pull-expr that hierarchically selects data about an entity variable
- an aggregate that summarizes all values of a variable
The order of find-elems determines the order variables appear in a result tuple.
Variables
variable = symbol starting with "?"
A variable is a symbol that begins with ?. In a find-spec, variables control which variables are returned, and what order those variables appear in the result tuple.
Like the rest of Clojure, variables are case-sensitive. ?track and ?Track are different variables.
Pull Expressions
pull-expr = ['pull' variable pattern]
pattern = (pattern-name | pattern-data-literal)
A pull expression returns information about a variable as specified by a pattern. Pull expressions are fully described in the Pull reference.
NOTE Each variable can appear in at most one pull expression.
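The two pattern forms in the grammar can be sketched as follows. The first query supplies the pattern as a data literal; the second names a pattern that is passed in as an input. The attributes :release/name and :release/artists are assumed for illustration.

```clojure
;; pattern-data-literal: the pattern is written inline
[:find (pull ?e [:release/name])
 :in $ ?artist
 :where [?e :release/artists ?artist]]

;; pattern-name: the plain symbol "pattern" is bound by the inputs clause
[:find (pull ?e pattern)
 :in $ ?artist pattern
 :where [?e :release/artists ?artist]]
```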
Custom Query Functions
You can write your own custom functions for use as aggregate, predicate, or function clauses in query. To make these functions available in Datomic Cloud:
- Include your functions in an ion project.
- Add your functions' fully-qualified names under the :allow key in your ion-config.edn file.
- Push and deploy your ion to the compute group that will use the functions.
You can cancel custom query functions.
Return Maps
NOTE Return maps are available in client 0.8.78.
Supplying a return-map will cause the query to return maps instead of tuples. Each entry in the :keys / :strs / :syms clause will become a key mapped to the corresponding item in the :find clause.
keyword | symbols become |
---|---|
:keys | keyword keys |
:strs | string keys |
:syms | symbol keys |
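A short sketch of a return map using :keys, assuming a hypothetical music schema. The symbols after :keys are matched positionally with the find-elems.

```clojure
[:find ?artist-name ?release-name
 :keys artist release                      ; artist -> ?artist-name, release -> ?release-name
 :where [?release :release/name ?release-name]
        [?release :release/artists ?artist]
        [?artist :artist/name ?artist-name]]
;; each result is a map with the keyword keys :artist and :release
```

Replacing :keys with :strs or :syms would yield string or symbol keys instead.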
Aggregates
aggregate = [aggregate-fn-name fn-arg+]
fn-arg = (variable | constant | src-var)
An aggregate function appears in the find clause and transforms a result. Aggregate functions can take variables, constants, or src-vars as arguments.
Aggregates appear as lists in a find-spec. Query variables not in aggregate expressions will group the results and appear intact in the result.
Built-In Aggregates
Each of these is described in more detail below.
aggregate | # returned | notes |
---|---|---|
avg | 1 | |
count | 1 | counts duplicates |
count-distinct | 1 | counts only unique values |
distinct | n | set of distinct values |
max | 1 | compares all types, not just numbers |
max n | n | returns up to n largest |
median | 1 | |
min | 1 | compares all types, not just numbers |
min n | n | returns up to n smallest |
rand n | n | random up to n with duplicates |
sample n | n | sample up to n, no duplicates |
stddev | 1 | |
sum | 1 | |
variance | 1 | |
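The sketches below show three typical uses of aggregates from the table above: a single aggregate, grouping by a non-aggregated variable, and the "n" form. The attributes :artist/name and :release/year are assumptions.

```clojure
;; a single aggregate value over all matches
[:find (count ?artist)
 :where [?artist :artist/name]]

;; ?year is not inside an aggregate, so it groups the results
[:find ?year (count ?release)
 :where [?release :release/year ?year]]

;; "max n" form: returns up to the 3 largest values
[:find (max 3 ?year)
 :where [_ :release/year ?year]]
```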
Inputs
inputs = ':in' (src-var | binding | pattern-name | rules-var)+
The inputs clause names and orders the inputs to a query. Inputs can be
- a database name, i.e. a symbol starting with $
- a variable binding, e.g. a symbol starting with ?
- a pattern name, i.e. a plain symbol
- the rules var, i.e. the symbol %
A query has as many inputs as it has :args values, and the inputs bind the :args values for use inside the query.
Binding Forms
A binding form tells how to map data onto variables. A variable name like ?artist-name is the simplest kind of binding, assigning its value directly to the variable. Other forms support destructuring the data into a tuple, a collection, or a relation:
Binding Form | Binds |
---|---|
?a | scalar |
[?a ?b] | tuple |
[?a ...] | collection |
[ [?a ?b ] ] | relation |
Tuple Binding
bind-tuple = [ (variable | '_')+]
A tuple binding binds a set of variables to a single value each, passed in as a collection.
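A sketch of a tuple binding, assuming a hypothetical music schema. The matching input value would be a two-element collection such as ["John Lennon" "Mind Games"].

```clojure
[:find ?release
 :in $ [?artist-name ?release-name]        ; one input, destructured into two variables
 :where [?artist :artist/name ?artist-name]
        [?release :release/artists ?artist]
        [?release :release/name ?release-name]]
```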
Collection Binding
bind-coll = [variable '...']
A collection binding binds a single variable to multiple values passed in as a collection and can be used to ask "or" questions about the collection.
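A sketch of a collection binding under the same assumed schema. With an input such as ["Paul McCartney" "George Harrison"], the query asks for releases by either artist.

```clojure
[:find ?release-name
 :in $ [?artist-name ...]                  ; ?artist-name takes each value in turn
 :where [?artist :artist/name ?artist-name]
        [?release :release/artists ?artist]
        [?release :release/name ?release-name]]
```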
Relation Binding
bind-rel = [ [(variable | '_')+] ]
A relation binding is fully general, binding multiple variables positionally to a relation (collection of tuples) passed in. This can be used to ask "or" questions involving multiple variables.
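A sketch of a relation binding, again with assumed attribute names. The input would be a collection of tuples such as [["John Lennon" "Mind Games"] ["Paul McCartney" "Ram"]], and each tuple is one alternative.

```clojure
[:find ?release
 :in $ [[?artist-name ?release-name]]      ; each input tuple binds both variables
 :where [?artist :artist/name ?artist-name]
        [?release :release/artists ?artist]
        [?release :release/name ?release-name]]
```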
Where Clauses
where-clauses = ':where' clause+
clause = (not-clause | not-join-clause | or-clause | or-join-clause | expression-clause)
expression-clause = (data-pattern | pred-expr | fn-expr | rule-expr)
A where clause limits the results returned. The most common kind of where clause is a data pattern that is matched against datoms in the database, but there are many other kinds of clauses to support negation, disjunction, predicates, and functions.
Implicit Joins
where clauses implicitly join. If the same variable appears in multiple clauses, those matches must unify.
Data Patterns
data-pattern = [ src-var? (variable | constant | '_')+ ]
A data pattern is a tuple that begins with an optional src-var which binds to a relation. The src-var is followed by one or more elements that match the tuples of that relation in order. The relation is almost always a Datomic database, so the components are E, A, V, Tx, and Op. The elements of a data pattern can be
- variables, which unify and bind to values
- constants, which limit results to tuples that match the constant
- the blank _ which matches anything
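The three kinds of element can be seen together in a small sketch. The attribute :release/name is an assumption; in the second query the src-var $ is written explicitly.

```clojure
;; blank in E position, constant in A position, variable in V position
[:find ?release-name
 :where [_ :release/name ?release-name]]

;; the same pattern with an explicit src-var
[:find ?release-name
 :in $
 :where [$ _ :release/name ?release-name]]
```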
Blanks
The underscore symbol (_) is a blank placeholder, and matches anything without binding or unifying.
Blanks do not cause unification.
Clauses with multiple blanks will not unify despite appearing to have the same symbol used.
Implicit Blanks
In data patterns, you should elide any trailing components you don't care about, rather than explicitly padding with blanks.
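For instance, to find entities that have any value for an assumed attribute :artist/name, the first form below is preferred over the second.

```clojure
;; preferred: trailing components are simply left off
[:find ?artist
 :where [?artist :artist/name]]

;; discouraged: explicit padding with a blank
[:find ?artist
 :where [?artist :artist/name _]]
```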
Predicates
pred-expr = [ [pred fn-arg+] ]
A predicate is an arbitrary Java or Clojure function. Predicates must be pure functions, i.e. they must be free of side effects and always return the same thing given the same arguments.
Predicates are invoked against variables that are already bound to further constrain the result set. If the predicate returns false or nil for a set of variable bindings, that set is removed.
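A sketch of an ordinary predicate, using clojure.string/starts-with? from the Clojure standard library and an assumed attribute :artist/name. The function name is fully qualified because it lives outside clojure.core.

```clojure
[:find ?name
 :where [_ :artist/name ?name]                       ; binds ?name first
        [(clojure.string/starts-with? ?name "Q")]]   ; keeps only names starting with "Q"
```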
Built-in Predicates and Functions lists all built-in predicates.
Range Predicates
The predicates =, !=, <=, <, >, and >= are special, in that they take direct advantage of Datomic's AVET index. This makes them much more efficient than equivalent formulations using ordinary predicates. For example, a query for artists whose name starts with 'Q' is much more efficient when written with range predicates than an equivalent version using starts-with?.
Unlike their Clojure equivalents, the Datomic range predicates require exactly two arguments.
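A sketch of the "name starts with 'Q'" query written with range predicates, assuming a string attribute :artist/name. Each predicate takes exactly two arguments, so the range needs two clauses.

```clojure
[:find ?name
 :where [_ :artist/name ?name]
        [(<= "Q" ?name)]                   ; lower bound, inclusive
        [(< ?name "R")]]                   ; upper bound, exclusive
```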
Functions
fn-expr = [ [fn fn-arg+] binding]
Queries can call arbitrary Java or Clojure functions. Such functions must be pure functions, i.e. they must be free of side effects and always return the same thing given the same arguments.
Functions are invoked against variables that are already bound, and their results are interpreted via binding forms to bind additional variables.
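A sketch of a function expression that converts a duration to minutes with clojure.core's quot. The attributes, including :track/duration holding milliseconds, are assumptions.

```clojure
[:find ?track-name ?minutes
 :in $ ?artist-name
 :where [?artist :artist/name ?artist-name]
        [?track :track/artists ?artist]
        [?track :track/duration ?millis]
        [(quot ?millis 60000) ?minutes]    ; result bound via a scalar binding
        [?track :track/name ?track-name]]
```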
Built-in Predicates and Functions lists all built-in functions.
Built-in Predicates and Functions
Datomic provides the following built-in expression functions and predicates:
- Two-argument comparison predicates: =, !=, <, <=, >, and >=.
- Two-argument mathematical operators: +, -, *, and /.
Note: Datomic's / operator is similar to Clojure's / in terms of promotion and contagion with a notable exception: Datomic's / operator does not return a clojure.lang.Ratio to callers. Instead, it returns a quotient as per quot.
get-else
[(get-else src-var ent attr default) ?val-or-default]
get-else takes a database, an entity identifier, a cardinality-one attribute, and a default value. It returns that entity's value for the attribute, or the default value if the entity does not have a value.
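A sketch of get-else, assuming a cardinality-one attribute :artist/startYear that some artists lack. The default "N/A" is an arbitrary illustrative value.

```clojure
[:find ?artist-name ?year
 :in $ [?artist-name ...]
 :where [?artist :artist/name ?artist-name]
        [(get-else $ ?artist :artist/startYear "N/A") ?year]]
```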
get-some
[(get-some src-var ent attr+) [?attr ?val]]
get-some takes a database, an entity identifier, and one or more cardinality-one attributes, returning a tuple of the attribute's entity id and the value for the first of those attributes that the entity possesses.
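A sketch of get-some that tries an assumed :country/name attribute first and falls back to :artist/name. The result is destructured with a tuple binding.

```clojure
[:find ?e ?attr ?name
 :in $ ?e
 :where [(get-some $ ?e :country/name :artist/name) [?attr ?name]]]
```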
ground
[(ground const) binding]
ground takes a single argument, which must be a constant, and returns that same argument. Programs that know information at query time should prefer ground over e.g. identity, as the former can be used inside the query engine to enable optimizations.
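A small sketch of ground combined with a collection binding. The constant vector is known when the query is written, so no input is needed.

```clojure
[:find ?vowel
 :where [(ground [:a :e :i :o :u]) [?vowel ...]]]
```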
missing?
[(missing? src-var ent attr)]
missing? takes a database, an entity identifier, and an attribute and returns true if the entity has no value for the attribute in the database.
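A sketch of missing? that finds artists with no recorded start year. Both attributes are assumptions.

```clojure
[:find ?name
 :where [?artist :artist/name ?name]
        [(missing? $ ?artist :artist/startYear)]]
```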
q
tuple
Calling Java Methods
Java methods can be used as query expression functions and predicates, and should be type hinted for performance. Java code used in this way must be on the Java process classpath.
Java methods should only be used when there is not an equivalent function in clojure.core.
Calling Static Methods
[(ClassName/methodName) [[?k ?v]]]
Java static methods can be called with the (ClassName/methodName …) form.
Calling Instance Methods
[(.methodName obj)] [(.methodName ^Class obj)] [(.methodName obj ...) ?ret]
Java instance methods can be called with the (.methodName obj …) form. Type hints outside java.lang must be fully qualified, and complex method signatures may require more than one hint to be unambiguous.
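A sketch that uses both forms with standard JDK classes: the static method System/getProperties is bound with a relation binding, and the instance method String.endsWith is used as a type-hinted predicate.

```clojure
[:find ?k ?v
 :where [(System/getProperties) [[?k ?v]]]      ; static method, result bound as a relation
        [(.endsWith ^String ?k "version")]]     ; instance method with a type hint
```

The ^String hint needs no package qualifier because String is in java.lang.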
Calling Clojure Functions
Clojure functions can be used as query expression functions and predicates. Function names outside clojure.core need to be fully qualified and included on the classpath for Pro, or in the ion :allow list for Cloud. The query engine will automatically require namespaces as necessary.
Not Clauses
not-clause = [ src-var? 'not' clause+ ]
not clauses can express that one or more logic variables inside a query must not satisfy all of a set of predicates. A not clause removes already-bound tuples that satisfy its clauses. Unless you specify an explicit src-var, not clauses will target a source named $.
How Not Clauses Work
One can understand not clauses as if they turn into subqueries where all of the variables and sources unified by the negation are propagated to the subquery. The results of the subquery are removed from the enclosing query via set difference. Note that, because they are implemented using set logic, not clauses can be much more efficient than building your own expression predicate that executes a query, as expression predicates are run on each tuple in turn.
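A sketch of a not clause that counts artists who are not from Canada. The attributes and the :country/CA value are assumptions from a hypothetical music schema.

```clojure
[:find (count ?eid)
 :where [?eid :artist/name]                          ; binds ?eid first
        (not [?eid :artist/country :country/CA])]    ; then removes matching tuples
```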
Insufficient Binding for a Not Clause
All variables used in a not clause will unify with the surrounding query. This includes both the arguments to nested expression clauses as well as any bindings made by nested function expressions. Datomic will attempt to push the not clause down until all necessary variables are bound, and will throw an ::anom/incorrect anomaly if that is not possible.
Not-join Clauses
not-join-clause = [ src-var? 'not-join' [variable+] clause+ ]
A not-join clause works exactly like a not clause, but also allows you to specify which variables should unify with the surrounding clause; only this list of variables needs binding before the clause can run.
The variable list specifies which variables should unify.
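A sketch of not-join, under an assumed schema, that counts artists with no release in 1970. Only ?artist unifies with the surrounding query; ?release is local to the not-join.

```clojure
[:find (count ?artist)
 :where [?artist :artist/name]
        (not-join [?artist]
          [?release :release/artists ?artist]
          [?release :release/year 1970])]
```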
Multiple Clauses In not Or not-join
When more than one clause is supplied to not or not-join, you should read the clauses as if they are connected by "and", just as they are in :where.
Or Clauses
or-clause = [ src-var? 'or' (clause | and-clause)+]
or clauses can express that one or more logic variables inside a query satisfy at least one of a set of predicates. An or clause constrains the result to tuples that satisfy at least one of its clauses or and-clauses.
Or Clause Variables
All clauses used in an or clause must use the same set of variables, which will unify with the surrounding query. This includes both the arguments to nested expression clauses as well as any bindings made by nested function expressions. Datomic will attempt to push the or clause down until all necessary variables are bound, and will throw an exception if that is not possible.
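A sketch of an or clause in which every alternative uses the same single variable, ?medium. The attribute and enum values are assumptions.

```clojure
[:find (count ?medium)
 :where (or [?medium :medium/format :medium.format/vinyl7]
            [?medium :medium/format :medium.format/vinyl10]
            [?medium :medium/format :medium.format/vinyl12]
            [?medium :medium/format :medium.format/vinyl])]
```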
How Or Clauses Work
One can imagine that or clauses turn into an invocation of an anonymous rule whose predicates comprise the or clauses. As with rules, src-vars are not currently supported within the clauses of or, but are supported on the or clause as a whole at top level.
And Clause
and-clause = [ 'and' clause+ ]
Inside an or clause, you may use an and clause to specify conjunction. The and clause is not available (or needed) outside of an or clause, since conjunction is the default in other clauses.
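A sketch of an and clause nested in an or clause, counting artists that are either groups or female persons. The attributes and enum values are assumptions.

```clojure
[:find (count ?artist)
 :where (or [?artist :artist/type :artist.type/group]
            (and [?artist :artist/type :artist.type/person]
                 [?artist :artist/gender :artist.gender/female]))]
```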
Or-join Clause
or-join-clause = [ src-var? 'or-join' [variable+] (clause | and-clause)+ ]
An or-join clause is similar to an or clause, but it allows you to specify which variables should unify with the surrounding clause; only this list of variables needs binding before the clause can run. The variables specify which variables should unify.
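A sketch of or-join under an assumed schema. The two alternatives use different variables, which plain or would not allow; only ?release unifies with the surrounding query.

```clojure
[:find (count ?release)
 :where [?release :release/name]
        (or-join [?release]
          (and [?release :release/artists ?artist]
               [?artist :artist/country :country/CA])
          [?release :release/year 1970])]
```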
With Clauses
A with-clause considers additional variables not named in the find-spec when forming the basis set for a query result. The with variables are then removed, leaving a bag (not a set!) of values to be consumed by the find-spec. This is particularly useful when finding aggregates.
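A sketch of the difference, using an illustrative relation input rather than a database. Assume the input is [["Cerberus" 3] ["Medusa" 1] ["Cyclops" 1] ["Chimera" 1]].

```clojure
;; without :with, ?heads values form a set {3, 1}, so the sum is 4
[:find (sum ?heads)
 :in [[_ ?heads]]]

;; with :with, each monster contributes its own ?heads value, so the sum is 6
[:find (sum ?heads)
 :with ?monster
 :in [[?monster ?heads]]]
```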
Rules
Datomic datalog allows you to package up sets of :where clauses into named rules. These rules make query logic reusable, and also composable, meaning that you can bind portions of a query's logic at query time.
Defining a Rule
rule = [ [rule-head clause+]+ ]
rule-head = [rule-name rule-vars]
rule-name = plain-symbol
rule-vars = [variable+ | ([variable+] variable*)]
As with transactions and queries, rules are described using data structures. A rule is a list of lists. The first list in the rule is the rule-head. It names the rule and specifies its rule-vars. The rest of the lists are clauses that make up the body of the rule.
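A sketch of a rule set containing one rule definition. The rule name track-info and the attributes are hypothetical.

```clojure
[[(track-info ?artist ?name ?duration)     ; rule-head: rule-name and rule-vars
  [?track :track/artists ?artist]          ; the remaining lists are the body clauses
  [?track :track/name ?name]
  [?track :track/duration ?duration]]]
```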
Using a Rule
inputs = ':in' (src-var | binding | pattern-name | rules-var)+
rules-var = the symbol "%"
rule-expr = [ src-var? rule-name (variable | constant | '_')+]
You have to do two things to use a rule in a query. First, you have to pass a rule set (collection of rules) as an input source and reference it in the :in section of your query using the '%' symbol. Second, you have to invoke one or more rules with a rule-expr in the :where section of your query.
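Both steps can be sketched in one query. It assumes the rule set passed as % defines a hypothetical rule named track-info taking an artist, a track name, and a duration.

```clojure
[:find ?name ?duration
 :in $ % ?aname                            ; step 1: % names the rule set input
 :where [?artist :artist/name ?aname]
        (track-info ?artist ?name ?duration)]   ; step 2: rule-expr invokes the rule
```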
Multiple Rule Heads
Rules with multiple definitions will evaluate them as different logical paths to the same conclusion (i.e. logical OR). For example, a rule named benelux can be defined three times, once for each way of satisfying it.
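A sketch of a rule set in which benelux is defined three times, once per country. The attribute :artist/country and the country values are assumptions.

```clojure
[[(benelux ?artist)
  [?artist :artist/country :country/BE]]
 [(benelux ?artist)
  [?artist :artist/country :country/NL]]
 [(benelux ?artist)
  [?artist :artist/country :country/LU]]]
```

An artist satisfies (benelux ?artist) if any one of the three definitions matches.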
Required Bindings
Rules normally operate exactly like other items in a where clause. They must unify with the variables already bound, and must bind any variables not already bound.
But sometimes you know that a rule will only be correct, or only be efficient, if some variables are already bound. You can require that some variables be bound before a rule can fire by enclosing the required variables in a vector or list as the first argument to the rule. If the required variables are not bound, Datomic will report an incorrect anomaly.
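A sketch of a rule head with a required binding. The hypothetical track-info rule requires ?artist to be bound before it fires, because ?artist is enclosed in a vector as the first argument.

```clojure
[[(track-info [?artist] ?name ?duration)   ; [?artist] must already be bound
  [?track :track/artists ?artist]
  [?track :track/name ?name]
  [?track :track/duration ?duration]]]
```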
Rule Database Scoping
rule-expr = [ src-var? rule-name (variable | constant | '_')+]
By default, rules operate against the default database named by $. As with other where clauses, you may specify a database as a src-var before the rule-name to scope the rule to that database. Databases cannot be used as arguments in a rule.
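A sketch of scoping a rule to a named database. The src-var name $mbrainz, the rule name benelux, and the attribute :artist/name are all assumptions for illustration.

```clojure
[:find ?artist-name
 :in $mbrainz %
 :where ($mbrainz benelux ?artist)                     ; rule scoped to $mbrainz
        [$mbrainz ?artist :artist/name ?artist-name]]
```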
Rule Generality
Rules can contain any type of clause that a where clause might contain: data, expressions, or even other rule invocations.