A Brief Overview of Datomic
Welcome to Datomic! Datomic is a new kind of database, designed to be:
- distributed (conceived with the cloud in mind from day 1)
- transactional (it is fully ACID compliant and always consistent)
- immutable (it stores an accumulation of facts over time)
- queryable (Datomic uses datalog, a powerful declarative query language)
- flexible (in schema design and modification, in deployment topology, and in usage scenarios)
Datomic is an operational database management system, designed for transactional, domain-specific data. It is not designed to be a data warehouse, nor a high-churn, high-throughput system such as a time-series database or log store. It is a good fit for systems that store valuable information of record, require developer and operational flexibility, need history and audit capabilities, and require read scalability. Some examples of successful Datomic use cases include transactional data, business records, medical records, financial records, scientific records, inventory, configuration, web applications, departmental databases, and cloud applications.
What is different about Datomic?
The foremost thing that sets Datomic apart is that it accumulates immutable facts over time. Most databases assign values into named locations (a field in a particular row, a node in a particular document), and as those values change, the new values overwrite the older ones. Datomic tracks the entire history of a fact and allows you to access its previous states quickly and easily.
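To make the idea of accumulating facts concrete, here is a minimal Python sketch of an append-only fact log. It is a toy model and not the Datomic API: the names `Fact`, `FactLog`, `assert_fact`, `value_as_of`, and `history` are hypothetical, chosen only for illustration. The point it shows is that a new value is recorded alongside the old one instead of overwriting it, so earlier states stay readable.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class Fact:
    """One immutable fact (hypothetical shape, for illustration only)."""
    entity: str
    attribute: str
    value: Any
    tx: int  # number of the transaction that asserted this fact


class FactLog:
    """Append-only log: facts are added, never updated or deleted."""

    def __init__(self) -> None:
        self._facts: List[Fact] = []
        self._next_tx = 1

    def assert_fact(self, entity: str, attribute: str, value: Any) -> int:
        """Record a new fact and return its transaction number."""
        tx = self._next_tx
        self._facts.append(Fact(entity, attribute, value, tx))
        self._next_tx += 1
        return tx

    def value_as_of(self, entity: str, attribute: str,
                    tx: Optional[int] = None) -> Any:
        """Return the latest value asserted at or before `tx`.

        With tx=None, the most recent value is returned.
        """
        result = None
        for fact in self._facts:  # facts are stored in transaction order
            if (fact.entity == entity and fact.attribute == attribute
                    and (tx is None or fact.tx <= tx)):
                result = fact.value
        return result

    def history(self, entity: str, attribute: str) -> List[Tuple[int, Any]]:
        """Return every (tx, value) pair ever asserted for the attribute."""
        return [(f.tx, f.value) for f in self._facts
                if f.entity == entity and f.attribute == attribute]


log = FactLog()
t1 = log.assert_fact("user-1", "email", "old@example.com")
t2 = log.assert_fact("user-1", "email", "new@example.com")

current = log.value_as_of("user-1", "email")       # latest value
earlier = log.value_as_of("user-1", "email", t1)   # value as of the first transaction
```

In a location-based store, the second write would have replaced the first. Here both remain in the log, and a read simply chooses the point in time it wants to see.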
Second, Datomic is really a decomposed database: instead of a single monolith handling reading, writing and storing your data, those functions have been parceled out to multiple, interdependent components, each of which runs in a separate process. The components of a running Datomic installation will be some combination of:
- storage services: the destination for persistent data. Examples are: DynamoDB, Oracle, file system, etc.
- transactor: the process that controls inbound transactions, and coordinates persistence to the storage services
- peers: processes containing your application code and the Datomic peer library, which can query the persisted data
- peer servers: special kinds of peers that can act as a centralized query server for clients
- clients: processes containing your application code and the Datomic client library, a lighter-weight engine that relies on one or more peer servers to perform the heavy lifting
- console: a web server that provides a graphic interface for browsing and querying one or more Datomic databases
A running Datomic system consists of some or all of these processes running simultaneously. This decomposition means you can distribute reads across multiple processes that do not conflict with each other, or with writes, thereby providing horizontal read scalability. The transactor process acts as a single authority for inbound transactions, which is what allows the system to be ACID compliant and fully consistent.
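The division of labor described above can be sketched in a few lines of Python. This is a simplified, single-process model under stated assumptions, not Datomic's implementation: the classes `Storage`, `Transactor`, and `Peer` and their methods are hypothetical stand-ins. It illustrates two properties: all writes pass through one serializing authority, and readers work against immutable snapshots, so reads do not conflict with each other or with writes.

```python
import threading
from typing import Any, Iterable, List, Tuple

FactTuple = Tuple[str, str, Any]      # (entity, attribute, value)
Snapshot = Tuple[FactTuple, ...]      # an immutable database value


class Storage:
    """Stand-in for a storage service holding the latest snapshot."""

    def __init__(self) -> None:
        self._snapshot: Snapshot = ()

    def read(self) -> Snapshot:
        return self._snapshot

    def write(self, snapshot: Snapshot) -> None:
        self._snapshot = snapshot


class Transactor:
    """Single authority for writes: transactions are applied one at a time."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage
        self._lock = threading.Lock()

    def transact(self, facts: Iterable[FactTuple]) -> Snapshot:
        with self._lock:  # serialize all inbound transactions
            new_snapshot = self._storage.read() + tuple(facts)
            self._storage.write(new_snapshot)
            return new_snapshot


class Peer:
    """Reader process: obtains a snapshot and queries it without locking."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def db(self) -> Snapshot:
        return self._storage.read()

    @staticmethod
    def query(db: Snapshot, attribute: str) -> List[Tuple[str, Any]]:
        return [(e, v) for (e, a, v) in db if a == attribute]


storage = Storage()
transactor = Transactor(storage)
peers = [Peer(storage) for _ in range(3)]   # reads can be spread across peers

before = peers[0].db()                      # snapshot taken before the write
transactor.transact([("item-1", "count", 7)])
after = peers[1].db()                       # snapshot taken after the write
```

Because each snapshot is an immutable tuple, the `before` value is unaffected by the later transaction. Adding more `Peer` instances adds read capacity without adding contention on the single writer.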
Third, Datomic leverages existing storage systems. This means that your data can be safely persisted in a known, trusted storage service, and you can migrate to alternative services in the future as your needs change.
So how does it work?
There are two main models of developing with Datomic, which you can choose between (or integrate together). These two models are known as the Peer model and the Client model. In both cases, you must run a Transactor, which is the central hub for inbound writes. The Transactor is configured to know about your storage service, which is where indexes and values are persisted.
When you embed the Peer library in your application code, you are essentially making your application part of the database system. When you issue transactions, they are sent to the Transactor, and from there to storage. Your Peer application node, along with all other Peer nodes, is then notified of the changes to the database and receives them. When you query the data or access the raw indexes, these operations happen locally in memory, dramatically reducing read latency.
When you embed the Client library, you must be running one or more Peer Servers - special intermediate nodes that handle queries and caching (and can be scaled horizontally, just like Peers). Clients therefore have a network hop for read operations, as queries are sent to Peer Servers. The Client library is much lighter weight than the Peer library, at the cost of increasing latency on reads.
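The difference between the two read paths can be illustrated with a small Python sketch. It is a toy model, not the Datomic Peer or Client library: the classes `Transactor`, `Peer`, `PeerServer`, and `Client`, and the `remote_calls` counter, are hypothetical. The sketch assumes the Transactor pushes committed changes to every subscribed Peer, and it uses a counter to stand in for the network round trip a Client incurs on each read.

```python
from typing import Any, List, Tuple

FactTuple = Tuple[str, str, Any]  # (entity, attribute, value)


class Transactor:
    """Central hub for writes; notifies subscribed peers of new facts."""

    def __init__(self) -> None:
        self._subscribers: List["Peer"] = []

    def subscribe(self, peer: "Peer") -> None:
        self._subscribers.append(peer)

    def transact(self, facts: List[FactTuple]) -> None:
        # A real transactor would persist to storage before notifying peers;
        # that step is omitted in this sketch.
        for peer in self._subscribers:
            peer.receive(facts)


class Peer:
    """Application process with an in-memory copy of the data."""

    def __init__(self, transactor: Transactor) -> None:
        self._facts: List[FactTuple] = []
        transactor.subscribe(self)

    def receive(self, facts: List[FactTuple]) -> None:
        self._facts.extend(facts)

    def query(self, attribute: str) -> List[Tuple[str, Any]]:
        # Runs entirely against local memory: no network hop.
        return [(e, v) for (e, a, v) in self._facts if a == attribute]


class PeerServer(Peer):
    """A peer that answers queries on behalf of clients."""


class Client:
    """Lightweight process that delegates every query to a peer server."""

    def __init__(self, peer_server: PeerServer) -> None:
        self._peer_server = peer_server
        self.remote_calls = 0  # stands in for network round trips

    def query(self, attribute: str) -> List[Tuple[str, Any]]:
        self.remote_calls += 1
        return self._peer_server.query(attribute)


transactor = Transactor()
peer = Peer(transactor)
server = PeerServer(transactor)
client = Client(server)

transactor.transact([("order-1", "status", "shipped")])

peer_result = peer.query("status")      # answered locally
client_result = client.query("status")  # answered by the peer server
```

Both paths return the same answer. The Peer pays for it with memory and a heavier library, while the Client pays for it with one remote call per query.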
Try it out
Next, you can walk through a 5-minute guide that explores the most common operations of a database:
- connecting to a database
- defining a schema
- transacting data
- querying data
- seeing the history of that data
To get started, Get Datomic
The Peer-based Getting Started tutorial can be found at Peer Getting Started.