Tutorial

Introduction

Information, not CRUD

Datomic models information. Information is stored in fine-grained datoms, as opposed to coarse-grained tables or documents. Information accumulates over time. Information is not forgotten as a side effect of acquiring new information.

This is in stark contrast with the Create/Read/Update/Delete (CRUD) paradigm:

CRUD      Datomic
Create    Assert
Read      Read
Update    Accumulate
Delete    Retract
  • Assertions are granular statements of fact.
  • Reads are always performed against an immutable database value at a particular point in time. Time is globally ordered in a database by ACID transactions.
  • New transactions only accumulate new data. Existing datoms never change.
  • Retractions state that an assertion no longer holds at some later point in time. The original assertion remains unchanged.

Assert/Read/Accumulate/Retract (ARAR) should be pronounced doubled and in a pirate voice "Ar Ar Ar Ar".

Iterative Development

Datomic is uniquely suited for iterative development. Change is easy, due to the granular data model and small but powerful schema. And change is always tracked within the database itself, so you do not need a parallel infrastructure of version-controlled migration files as your application evolves.

The examples in this tutorial will use Clojure, a language that is also particularly strong at interactive development. You will see data modeling, transactions, query, and multiple evolutionary steps all performed within a single REPL (read-eval-print loop) session.

Using the Tutorial

This tutorial will demonstrate iteratively building an inventory database. We will start from a blank slate and show how you can rapidly iterate as you model your domain.

Example data in this tutorial is shown in Extensible Data Notation (edn).

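If you have followed the Getting Started instructions in Clojure up through First Query, then you are already positioned at a REPL ready to go. If not, you will need to work through those instructions first, so that you have a REPL with a connection (conn) to a database.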

Now let's get started.

Assertion

List and Map Forms

An assertion adds a single atomic fact to Datomic. Assertions are represented by ordinary data structures (lists or maps). Our inventory database will need to have enumerated types for various product attributes such as color, so let's start with a list that asserts the color green:

[:db/add "foo" :db/ident :green]
  • :db/add specifies that this is an assertion
  • "foo" is a temporary entity id for the new entity
  • :db/ident is an attribute used for programmatic identifiers
  • :green is the datom's value

The same datom can be represented by a map:

{:db/ident :green}

Maps imply (and are equivalent to) a set of assertions, all about the same entity.
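
To make that equivalence concrete, the expansion from map form to list form can be sketched in plain Clojure. The map->assertions helper below is hypothetical (it is not part of the Datomic API), and it assumes the caller supplies the temporary id:

```clojure
;; Hypothetical helper, not a Datomic function: expand an entity map
;; into the equivalent list-form assertions, all about the same tempid.
(defn map->assertions
  [tempid entity-map]
  (mapv (fn [[attr value]] [:db/add tempid attr value])
        entity-map))

(map->assertions "foo" {:db/ident :green})
;; => [[:db/add "foo" :db/ident :green]]
```

A map with several attributes would expand to one :db/add list per attribute, each naming the same entity.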

Transactions

Datomic databases are updated via ACID transactions, which add a set of datoms to the database. Execute the code below at the Clojure REPL to add four colors to the inventory database in a single transaction.

(<!! (client/transact
      conn
      {:tx-data [{:db/ident :red}
                 {:db/ident :green}
                 {:db/ident :blue}
                 {:db/ident :yellow}]}))
=> ;; returns a big map

Client transactions are asynchronous and return a core.async channel. Here we have used the channel take operator <!! to block the REPL thread until the transaction result is available.

A successful transaction returns a map with information about the transaction and the state of the database. We will explore this more later.

Programming with Data

In addition to colors, our inventory database will also track sizes and types of items. Since we are programming with data, it is easy to write a helper function to make these transactions more concise. The make-idents function shown below will take a collection of keywords, and return transaction-ready maps.

(defn make-idents
  [x]
  (mapv #(hash-map :db/ident %) x))

You can quickly see that this works by trying it out at the REPL:

(def sizes [:small :medium :large :xlarge])
(make-idents sizes)
=> [#:db{:ident :small} #:db{:ident :medium} 
    #:db{:ident :large} #:db{:ident :xlarge}]

Note that because the make-idents function takes and returns pure data, no database is necessary to develop and test this function.
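
As an illustration of testing without a database, here is a minimal sketch using clojure.test, which ships with Clojure. The test name is made up for this example, and make-idents is repeated so the block runs on its own:

```clojure
(require '[clojure.test :refer [deftest is run-tests]])

;; Same definition as above, repeated so this block is self-contained.
(defn make-idents
  [x]
  (mapv #(hash-map :db/ident %) x))

;; No connection or database value is involved: input and output are plain data.
(deftest make-idents-test
  (is (= [{:db/ident :small} {:db/ident :large}]
         (make-idents [:small :large])))
  (is (= [] (make-idents []))))

(run-tests)
```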

Let's put the sizes and types into the database, and also define a collection holding the colors we already added:

(def types [:shirt :pants :dress :hat])
(def colors [:red :green :blue :yellow])
(<!! (client/transact conn {:tx-data (make-idents sizes)}))
(<!! (client/transact conn {:tx-data (make-idents types)}))

Schema

The :db/ident attribute type is preinstalled with Datomic, because it represents a concern (programmatic identifiers) that crosscuts all domains. Now we want to add some inventory-specific attributes:

  • sku, a unique string identifier for a particular product
  • color, a reference to a color entity
  • size, a reference to a size entity
  • type, a reference to a type entity

In Datomic, schema are entities just like program data. A schema entity must include:

  • :db/ident, a programmatic name
  • :db/valueType, a reference to an entity that specifies what type the attribute allows
  • :db/cardinality, a reference to an entity that specifies whether a particular entity can possess more than one value for the attribute at a given time.

So we can add our schema like this:

(def schema-1
  [{:db/ident :inv/sku
    :db/valueType :db.type/string
    :db/unique :db.unique/identity
    :db/cardinality :db.cardinality/one}
   {:db/ident :inv/color
    :db/valueType :db.type/ref
    :db/cardinality :db.cardinality/one}
   {:db/ident :inv/size
    :db/valueType :db.type/ref
    :db/cardinality :db.cardinality/one}
   {:db/ident :inv/type
    :db/valueType :db.type/ref
    :db/cardinality :db.cardinality/one}])
(<!! (client/transact conn {:tx-data schema-1}))

Notice that the :inv/sku attribute also has a :db/unique datom. This datom specifies that every :inv/sku must be unique within the database.
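
Since schema is plain data, it can be sanity-checked before it is transacted. The sketch below is one possible check, not a Datomic feature: missing-schema-keys and required-schema-keys are hypothetical names, and the check covers only the three required attributes listed above.

```clojure
(require '[clojure.set :as set])

;; The three attributes the tutorial lists as required for a schema entity.
(def required-schema-keys
  #{:db/ident :db/valueType :db/cardinality})

;; Hypothetical helper: return the required keys that a schema map lacks.
(defn missing-schema-keys
  [attr-map]
  (set/difference required-schema-keys (set (keys attr-map))))

(missing-schema-keys {:db/ident :inv/sku
                      :db/valueType :db.type/string
                      :db/unique :db.unique/identity
                      :db/cardinality :db.cardinality/one})
;; => #{}

(missing-schema-keys {:db/ident :inv/color
                      :db/valueType :db.type/ref})
;; => #{:db/cardinality}
```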

Sample Data

Now let's make some sample data. Again, no special API is necessary; we can just use the collection support in our language to make some collections. The following expression creates one example inventory entry for each combination of color, size, and type.

(def sample-data
  (->> (for [color colors size sizes type types]
         {:inv/color color
          :inv/size size
          :inv/type type})
       ;; number the combinations to give each one a unique SKU
       (map-indexed
        (fn [idx item]
          (assoc item :inv/sku (str "SKU-" idx))))
       vec))
sample-data
=> ;; 64 (4 x 4 x 4) maps

(<!! (client/transact conn {:tx-data sample-data}))
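
Because the sample data is generated in a fixed order, you can work out which combination a given SKU stands for without touching the database. The following sketch repeats the definitions so that it runs on its own, then inspects entry 42:

```clojure
(def colors [:red :green :blue :yellow])
(def sizes [:small :medium :large :xlarge])
(def types [:shirt :pants :dress :hat])

;; Same construction as sample-data above, repeated so this block runs alone.
(def sample-data
  (->> (for [color colors size sizes type types]
         {:inv/color color :inv/size size :inv/type type})
       (map-indexed (fn [idx item] (assoc item :inv/sku (str "SKU-" idx))))
       vec))

(count sample-data)
;; => 64

;; type varies fastest, then size, then color, so index
;; 42 = 2*16 + 2*4 + 2 selects the third color, third size, and third type
(= (nth sample-data 42)
   {:inv/color :blue :inv/size :large :inv/type :dress :inv/sku "SKU-42"})
;; => true
```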

Now that we have asserted some data, let's look at some different ways we can retrieve it.

Read

Database Values

Datomic maintains the entire history of your data. From this, you can query against a database value as of a particular point in time.

The db API returns the latest database value from a connection.

(def db (client/db conn))

An analogy with source control is helpful here. A Datomic connection references the entire history of your data, analogous to a source code repository. A database value from db is analogous to a checkout.

Pull

If you know an entity id, you can use the pull API to return information about that entity (or recursively about related entities). Better still, if the entity has a unique attribute, you do not even need to know its entity id. A lookup ref is a two-element list of unique attribute + value that uniquely identifies an entity, e.g.

[:inv/sku "SKU-42"]

The following call pulls the color, type, and size for SKU-42.

(<!! (client/pull
      db
      {:eid [:inv/sku "SKU-42"]
       :selector [{:inv/color [:db/ident]}
                  {:inv/size [:db/ident]}
                  {:inv/type [:db/ident]}]}))
=> #:inv{:color #:db{:ident :blue}, 
         :size #:db{:ident :large}, 
         :type #:db{:ident :dress}}

Note that the arguments and return value of pull are both just ordinary data structures, i.e. lists and maps.
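
Since selectors and results are ordinary data, both can be built and reshaped with ordinary functions. The two helpers below, ident-selector and flatten-idents, are hypothetical names used only for this sketch; they assume every listed attribute is a reference to an entity that has a :db/ident.

```clojure
;; Hypothetical helper: for each ref attribute, ask pull for the
;; :db/ident of the entity it points to.
(defn ident-selector
  [ref-attrs]
  (mapv (fn [attr] {attr [:db/ident]}) ref-attrs))

(= (ident-selector [:inv/color :inv/size :inv/type])
   [{:inv/color [:db/ident]}
    {:inv/size [:db/ident]}
    {:inv/type [:db/ident]}])
;; => true

;; Hypothetical helper: collapse a pull result of that shape
;; into a flat map of attribute -> ident keyword.
(defn flatten-idents
  [pull-result]
  (into {} (map (fn [[attr ent]] [attr (:db/ident ent)])) pull-result))

(= (flatten-idents {:inv/color {:db/ident :blue}
                    :inv/size {:db/ident :large}
                    :inv/type {:db/ident :dress}})
   {:inv/color :blue :inv/size :large :inv/type :dress})
;; => true
```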

Query

Storing and retrieving data by unique id is useful, but a database also needs to provide declarative, logic-based query. Datomic uses Datalog with negation, which has expressive power similar to SQL + recursion.

The following query finds the skus of all products that share a color with SKU-42:

(<!! (client/q
      conn
      {:query '[:find ?sku
                :where
                [?e :inv/sku "SKU-42"]
                [?e :inv/color ?color]
                [?e2 :inv/color ?color]
                [?e2 :inv/sku ?sku]]
       :args [db]}))
=> [["SKU-42"] 
    ["SKU-32"] 
    ...]

Note that the arguments and return value of q are both just ordinary data structures, i.e. lists and maps.

In the :where clauses, each list further constrains the results. For each list:

  • the first element matches the entity id
  • the second element matches an attribute
  • the third element matches an attribute's value

Symbols beginning with a question mark are datalog variables. When the same symbol occurs more than once, it causes a join. In the query above:

  • ?e joins SKU-42 to its color
  • ?e2 joins to all entities sharing the color
  • ?sku joins all ?e2 entities to their skus
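
To see what these joins compute, the same logic can be imitated over a handful of in-memory triples. This is only a rough model, not how Datomic evaluates queries: the datoms vector and shares-color-with function are made up for illustration, and keywords stand in for the color entities that real datoms would reference.

```clojure
;; Toy [entity attribute value] triples standing in for datoms.
(def datoms
  [[1 :inv/sku "SKU-42"] [1 :inv/color :blue]
   [2 :inv/sku "SKU-32"] [2 :inv/color :blue]
   [3 :inv/sku "SKU-0"]  [3 :inv/color :red]])

;; Each binding mirrors one :where clause. Comparing a fresh binding
;; with an earlier one (ec with e, c with color, es with e2) is the join.
(defn shares-color-with
  [datoms sku]
  (set
   (for [[e a1 v1]     datoms :when (and (= a1 :inv/sku) (= v1 sku))
         [ec a2 color] datoms :when (and (= ec e) (= a2 :inv/color))
         [e2 a3 c]     datoms :when (and (= a3 :inv/color) (= c color))
         [es a4 sku2]  datoms :when (and (= es e2) (= a4 :inv/sku))]
     sku2)))

(sort (shares-color-with datoms "SKU-42"))
;; => ("SKU-32" "SKU-42")
```

As in the real query, SKU-42 appears in its own results because nothing requires ?e and ?e2 to be different entities.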

Now we are confident that we can get basic inventory in and out. Just in time, too, because our stakeholders are back with more feature requests.

Accumulate

More Schema

Our stakeholders have a new request. Now it isn't just an inventory database, it also needs to track orders:

  • an order is a collection of line items
  • each line item has a count and references an item in inventory

We can model this directly in Datomic schema without translation:

(def order-schema
  [{:db/ident :order/items
    :db/valueType :db.type/ref
    :db/cardinality :db.cardinality/many
    :db/isComponent true}
   {:db/ident :item/id
    :db/valueType :db.type/ref
    :db/cardinality :db.cardinality/one}
   {:db/ident :item/count
    :db/valueType :db.type/long
    :db/cardinality :db.cardinality/one}])
(<!! (client/transact conn {:tx-data order-schema}))
=> ;; transaction result map

Notice that :db.cardinality/many captures the notion that a single order can have more than one item.

More Data

Now let's add a sample order:

(def add-order
  {:order/items
   [{:item/id [:inv/sku "SKU-25"]
     :item/count 10}
    {:item/id [:inv/sku "SKU-26"]
     :item/count 20}]})

(<!! (client/transact conn {:tx-data [add-order]}))

Here you see a nested entity map. The top level is the order itself. The nested level holds the order's two line items, each of which uses a lookup ref to point at an inventory entity.
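
As with make-idents earlier, a small function can build such order maps from simpler input. The make-order helper below is a hypothetical convenience, assuming orders arrive as a sequence of [sku count] pairs:

```clojure
;; Hypothetical helper: build a nested order map from [sku count] pairs.
;; Each line item refers to its inventory entity through a lookup ref.
(defn make-order
  [sku-counts]
  {:order/items
   (mapv (fn [[sku n]]
           {:item/id [:inv/sku sku]
            :item/count n})
         sku-counts)})

;; Produces the same data as add-order above.
(= (make-order [["SKU-25" 10] ["SKU-26" 20]])
   {:order/items [{:item/id [:inv/sku "SKU-25"] :item/count 10}
                  {:item/id [:inv/sku "SKU-26"] :item/count 20}]})
;; => true
```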

With this data in hand, let's explore some more features of query.

Read Revisited: More Query

First, don't forget to acquire the latest value of the database, after the transaction that added the order.

(def db (client/db conn))

Parameterized Query

Now let's try a more complex query. We would like to be able to suggest additional items to shoppers, so we need a query that, given any inventory item, finds all the other items that have ever appeared in the same order.

Such a query will have two parameters:

  • a database value
  • an inventory entity id

Parameters enter query via an :args list, and they are named by a corresponding :in clause. The special $ name is a placeholder for the database value.

(<!! (client/q
      conn
      {:query '[:find ?sku
                :in $ ?inv
                :where
                [?item :item/id ?inv]
                [?order :order/items ?item]
                [?order :order/items ?other-item]
                [?other-item :item/id ?other-inv]
                [?other-inv :inv/sku ?sku]]
       :args [db [:inv/sku "SKU-25"]]}))
=> [["SKU-25"] ["SKU-26"]]

Notice how variables are used to join:

  • ?inv is bound on input to the entity id for SKU-25, which
  • joins to every order ?item mentioning ?inv, which
  • joins to every ?order of that ?item, which
  • joins to every ?other-item in those orders, which
  • joins to every ?other-inv inventory entity, which
  • joins to all the skus ?sku

Rules

The "related items" feature is so nice that we would like to use it in a bunch of different queries. You can name query logic as a rule and reuse it in multiple queries.

Create a rule named ordered-together that binds two variables ?inv and ?other-inv if they have ever appeared in the same order:

(def rules
  '[[(ordered-together ?inv ?other-inv)
     [?item :item/id ?inv]
     [?order :order/items ?item]
     [?order :order/items ?other-item]
     [?other-item :item/id ?other-inv]]])

Now you can pass these rules to a query, using the special :in name %, and then refer to the rules by name:

(<!! (client/q
      conn
      {:query '[:find ?sku
                :in $ % ?inv
                :where
                (ordered-together ?inv ?other-inv)
                [?other-inv :inv/sku ?sku]]
       :args [db rules [:inv/sku "SKU-25"]]}))
=> [["SKU-25"] ["SKU-26"]]

So far we have created and accumulated data. Now let's look at what happens when things change over time.

Retract

Explicit Retract

We would like to keep a count of items in inventory, so let's add a bit more schema:

(def inventory-counts
  [{:db/ident :inv/count
    :db/valueType :db.type/long
    :db/cardinality :db.cardinality/one}])

(<!! (client/transact conn {:tx-data inventory-counts}))

Now we can assert that we have seven of SKU-21 and a thousand of SKU-42:

;; deliberate mistakes here!
(def inventory-update
  [[:db/add [:inv/sku "SKU-21"] :inv/count 7]
   [:db/add [:inv/sku "SKU-22"] :inv/count 7]
   [:db/add [:inv/sku "SKU-42"] :inv/count 100]])

(<!! (client/transact conn {:tx-data inventory-update}))

Curse my clumsy fingers, we just put some bad data into the system. We aren't supposed to have any SKU-22, but we just added seven. We can fix this with a retraction, which cancels the effect of an assertion:

(<!! (client/transact
      conn
      {:tx-data [[:db/retract [:inv/sku "SKU-22"] :inv/count 7]
                 [:db/add "datomic.tx" :db/doc "remove incorrect assertion"]]}))

The :db/retract above removes the incorrect value, but note that we are also adding an assertion about the special tempid "datomic.tx". Every transaction in Datomic is its own entity, making it easy to add facts about why a transaction was added (or who added it, or from where, etc.)

Implicit Retract

We also miskeyed the entry for SKU-42, asserting 100 instead of 1000. We can fix this simply by asserting the correct value. We do not also need to retract the old value: since :inv/count is :db.cardinality/one, Datomic knows that there can only be one value at a time and will automatically retract the previous value:

(<!! (client/transact
      conn
      {:tx-data [[:db/add [:inv/sku "SKU-42"] :inv/count 1000]
                 [:db/add "datomic.tx" :db/doc "correct data entry error"]]}))
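
The effect of a cardinality-one assertion can be pictured with a toy model in plain Clojure. This is a sketch of the observable behavior only, not Datomic's implementation; assert-one and the shape of its current argument are assumptions made for this example.

```clojure
;; Toy model. `current` maps [entity attribute] -> present value.
;; Returns the datoms, as [e a v added?], that asserting v would record.
(defn assert-one
  [current e a v]
  (let [old (get current [e a])]
    (cond
      (= old v)  []                               ; already true: nothing new
      (nil? old) [[e a v true]]                   ; no previous value: assert only
      :else      [[e a old false] [e a v true]]))) ; retract old, assert new

(assert-one {["SKU-42" :inv/count] 100} "SKU-42" :inv/count 1000)
;; => [["SKU-42" :inv/count 100 false] ["SKU-42" :inv/count 1000 true]]

(assert-one {} "SKU-21" :inv/count 7)
;; => [["SKU-21" :inv/count 7 true]]
```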

When we look only at the most recent database value, all we can see is the net effect after our corrections.

(def db (client/db conn))
(<!! (client/q
      conn
      {:query '[:find ?sku ?count
                :where
                [?inv :inv/sku ?sku]
                [?inv :inv/count ?count]]
       :args [db]}))
=> [["SKU-42" 1000] ["SKU-21" 7]]

Knowing the present truth is a starting point, but Datomic's model of time will let us do a lot more.

History

asOf Query

Imagine that SKU-22 requires cold storage. Pat notices the database entry showing we have some SKU-22 in stock and turns the thermostat down to 56F. This turns out to be very unpopular with everybody that works there. By the time somebody else checks the database to verify Pat's finding, the data error has been fixed.

This simple example shows why systems of record should never delete data, even if that data is mistaken. Other parties may have acted on that data, and a key job of record-keeping is to provide an audit trail in these situations.

With Datomic, you can make a database query as-of any previous point in time, where time can be specified either as an instant or as a transaction id. If you are following along in code, you probably don't remember the exact instant in time that you made the correction above, and you don't have to. You can query the system for the most recent transactions:

(<!! (client/q
      conn
      {:query '[:find (max 3 ?tx)
                :where
                [?tx :db/txInstant]]
       :args [db]}))
=> [[[13194139534402 13194139534401 13194139534400]]]

The max in find limits the results to the three highest valued (most recent) transaction ids. Take the smallest of these, and use as-of to back up past the two "correction" transactions. Now you can see the data about SKU-22 that justifies Pat's unpopular decision:

;; Your transaction ids may differ
(def txid 13194139534400)
(def db-before (client/as-of db txid))

(<!! (client/q
      conn
      {:query '[:find ?sku ?count
                :where
                [?inv :inv/sku ?sku]
                [?inv :inv/count ?count]]
       :args [db-before]}))
=> [["SKU-42" 100] ["SKU-21" 7] ["SKU-22" 7]]

history Query

In addition to point-in-time auditing, you can also review the entire history of your data. When you query against a history database value, the query will return all assertions and retractions, regardless of when they were in effect. The following query shows the complete history of :inv/count data for items that have a SKU:

(require '[clojure.pprint :as pp])
(def db-hist (client/history db))
(->> (<!! (client/q
           conn
           {:query '[:find ?tx ?sku ?val ?op
                     :where
                     [?inv :inv/count ?val ?tx ?op]
                     [?inv :inv/sku ?sku]]
            :args [db-hist]}))
     (sort-by first)
     (pp/pprint))
=> ([13194139534399 "SKU-21" 7 true]
    [13194139534399 "SKU-42" 100 true]
    [13194139534399 "SKU-22" 7 true]
    [13194139534400 "SKU-22" 7 false]
    [13194139534402 "SKU-42" 1000 true]
    [13194139534402 "SKU-42" 100 false])

The ?op is true for assertions, and false for retractions. You can see that:

  • Transaction …399 set the count for three SKUs.
  • Transaction …400 retracted the count for SKU-22.
  • Transaction …402 "changed" the count for SKU-42.
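
The relationship between the history view and a point-in-time view can be sketched by replaying those history rows in plain Clojure. This is an illustrative model rather than Datomic's mechanism: history-rows copies the output shown above, and counts-as-of is a hypothetical function that assumes one count per SKU.

```clojure
;; The rows from the history query above: [tx sku value added?]
(def history-rows
  [[13194139534399 "SKU-21" 7 true]
   [13194139534399 "SKU-42" 100 true]
   [13194139534399 "SKU-22" 7 true]
   [13194139534400 "SKU-22" 7 false]
   [13194139534402 "SKU-42" 1000 true]
   [13194139534402 "SKU-42" 100 false]])

;; Replay every row up to and including transaction t.
;; An assertion sets the count; a retraction removes it only if
;; the retracted value is still the current one.
(defn counts-as-of
  [rows t]
  (reduce (fn [state [tx sku value added?]]
            (cond
              (> tx t)                  state
              added?                    (assoc state sku value)
              (= (get state sku) value) (dissoc state sku)
              :else                     state))
          {}
          rows))

(counts-as-of history-rows 13194139534399)
;; => {"SKU-21" 7, "SKU-42" 100, "SKU-22" 7}

(counts-as-of history-rows 13194139534402)
;; => {"SKU-21" 7, "SKU-42" 1000}
```

With these particular ids, replaying through …399 reproduces what Pat saw, and replaying through …402 gives the corrected present.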

Conclusion

In Datomic, information is open, flexible, associative, and indelible. This matches well with the original, non-software definitions of "record" and "data", and supports building systems that

  • accurately record the past
  • provide powerful access to those records
  • evolve easily and naturally

While this tutorial has introduced the core ideas, it has only scratched the surface of Datomic's capabilities. As you build your database, consult the reference section of docs.datomic.com for more in-depth coverage.