Configuration

Datomic analytics is implemented via a Datomic connector to Trino. To begin using analytics, you will need to perform the following steps:

step when
Prerequisites all Datomic Analytics Use
install Datomic analytics one-time setup
configure Trino one-time setup
configure a Datomic connection per Datomic system/db
configure a Datomic metaschema per Datomic metaschema
start and stop analytics as desired

The Datomic analytics distribution includes sample configuration and templates so that you can complete these steps in minutes. Each of these steps is explained below, along with pointers for deploying analytics for production use.

Prerequisites

Install Datomic Analytics

Download (1.17GB) and unzip the latest version (0.9.96) of the Datomic Presto Server. This zip file includes the PrestoSQL distribution (now named Trino, and referred to Trino from here on), the Datomic connector, and example configuration files for getting started.

Configure a Trino Cluster

Trino configuration lives under a directory that Trino commands call the etc-dir. The etc-dir is specified when launching trino.

The etc-samples directory of Datomic analytics is part of the Trino etc-dir configured to run a minimal Trino cluster of a single node on your local machine. You can leave these configuration files unchanged while exploring Trino on your local machine, and consult the Trino docs when you are ready to plan a production cluster.

Configure Connections

For each Datomic system, you will create a Trino catalog property file in the catalog subdirectory of the Trino etc-dir. A catalog file has a name of the form <catalog>.properties, where <catalog> is the catalog name. /etc-samples/catalog in the provided zip has an example catalog properties.

A Datomic catalog property file has the following entries:

property required? value Notes
connector.name yes "datomic" Do not change.
datomic.client.config yes a Datomic client connect map Must be on one line.
datomic.databases no a vector of Datomic database names  

The datomic.databases property lists the databases that will be available via Trino. Supplying one or more databases enables JDBC Metadata. If this property is omitted, all databases will be available and not automatically queried.

Any time you change a catalog file, you will need to stop and then start the Trino cluster for your changes to take effect.

The etc-samples directory includes a catalog named sample.properties. Before you start Trino, edit this file and set datomic.client.config to have a valid client connect map for your system.

You can also rename the file from sample.properties to something more descriptive of your system.

Enabling JDBC Metadata

Many analytics tools provide the ability to automatically explore all of the tables in your system. Such exploration involves issuing one (or many!) JDBC metadata queries, forcing each database in your system to be loaded.

Because these automatic queries can be so expensive, they are disabled by default. To enable JDBC metadata queries, you must explicitly enumerate the Datomic databases you want to query with the datomic.databases property described above.

Configure Metaschema

Metaschema files control the mapping between Datomic attributes and SQL tables and columns. Metaschema files are .edn files in the datomic subdirectory of Trino's etc-dir. Metaschema files can have any name you find convenient, and Datomic analytics will automatically associate metaschemas with any database that has matching attributes.

The etc-samples directory includes a metaschema file matching the mbrainz database.

To facilitate interactive development, Datomic analytics automatically discovers changes to metaschema files within a minute of changes. You do not need to restart your Trino cluster to pick up changes to metaschema files. However, if you use an analytics tool such as metabase that scans Trino to discover schema, you may need to re-run the scan manually to pick up metaschema changes.

Start and Stop Trino

Datomic analytics can be launched with Trino's bin/launcher command.

First make scripts executable:

chmod +x bin/launcher*

To run analytics in the foreground of a shell window, navigate to the root directory of the Datomic analytics bundle and enter:

bin/launcher run --etc-dir=etc-samples

To verify that your Trino cluster is running, go to localhost:8989, enter use the username admin and leave the password field blank. A Cluster Overview should be displayed with metrics about your cluster.

To stop Trino, you can simply interrupt or kill the process, e.g. with Ctrl-C.

Trino in Production

The sample configuration that ships with Datomic anlaytics will have you up and quering in minutes, but it only scratches the surface of things you may want to do. Trino also allows you to join across disparate data sources, and cluster for horizontal scale.

When you are ready to run Trino in production, you will want to

All of these options and more are covered in the Trino deployment docs.

Now you can use the SQL CLIent.