Configuration

Datomic analytics is implemented via a Datomic connector to Trino. To begin using analytics, perform the following steps:

Step	When
Prerequisites	All Datomic analytics use
Install Datomic analytics	One-time setup
Configure Trino	One-time setup
Configure a Datomic connection	Per Datomic system/db
Configure a Datomic metaschema	Per Datomic Metaschema
Start and stop analytics	As desired

The Datomic analytics distribution includes sample configuration and templates so that you can complete these steps in minutes. Each of these steps is explained below, along with pointers for deploying analytics for production use.

Prerequisites

An accessible Datomic Cloud system
Java 11

Install Datomic Analytics

Download (1.17GB) and unzip the latest version (0.9.96) of the Datomic Presto server. This zip file includes the PrestoSQL distribution (now named Trino, and referred to as Trino from here on), the Datomic connector, and example configuration files for getting started.

Configure a Trino Cluster

Trino configuration lives under a directory that Trino commands call the etc-dir. The etc-dir is specified when launching Trino.

The etc-samples directory of Datomic analytics is part of the Trino etc-dir configured to run a minimal Trino cluster of a single node on your local machine. You can leave these configurations files unchanged while exploring Trino on your local machine, and consult the Trino docs when you are ready to plan a production cluster.

Configure Connections

For each Datomic system, you have to create a Trino catalog property file in the catalog subdirectory of the Trino etc-dir. A catalog file has a name of the form <catalog>.properties, where <catalog> is the catalog name. /etc-samples/catalog in the provided zip has an example catalog properties.

A Datomic catalog property file has the following entries:

Property	Required?	Value	Notes
connector.name	Yes	"datomic"	Do not change
datomic.client.config	Yes	A Datomic client connect map	Must be on one line
datomic.databases	No	A vector of Datomic database names

The datomic.databases property lists the databases that are available via Trino. Supplying one or more databases enables JDBC Metadata. If this property is omitted, all databases will be available and not automatically queried.

Any time you change a catalog file, you have to stop and then start the Trino cluster for your changes to take effect.

The etc-samples directory includes a catalog named sample.properties. Before you start Trino, edit this file and set datomic.client.config to have a valid client connect map for your system.

You can also rename the file from sample.properties to something more descriptive of your system.

Enabling JDBC Metadata

Many analytics tools provide the ability to automatically explore all of the tables in your system. Such exploration involves issuing one (or many) JDBC metadata queries, forcing each database in your system to be loaded.

Because these automatic queries can be so expensive, they are disabled by default. To enable JDBC metadata queries, explicitly enumerate the Datomic databases you want to query with the datomic.databases property described above.

Configure Metaschema

Metaschema files control the mapping between Datomic attributes and SQL tables and columns. Metaschema files are .edn files in the datomic subdirectory of Trino's etc-dir. Metaschema files can have any name you find convenient, and Datomic analytics automatically associate Metaschemas with any database that has matching attributes.

The etc-samples directory includes a Metaschema file matching the mbrainz database.

To facilitate interactive development, Datomic analytics automatically discovers changes to Metaschema files within a minute of changes. You do not need to restart your Trino cluster to pick up changes to Metaschema files. However, if you use an analytics tool such as Metabase (that scans Trino to discover schema) you may need to re-run the scan manually to pick up Metaschema changes.

Start and Stop Trino

Datomic analytics can be launched with Trino's bin/launcher command.

First, make scripts executable:

chmod +x bin/launcher*

To run analytics in the foreground of a shell window,

navigate to the root directory of the Datomic analytics bundle and enter:

bin/launcher run --etc-dir=etc-samples

To verify that your Trino cluster is running, go to localhost:8989, enter use the username admin, and leave the password field blank. A cluster overview should be displayed with metrics about your cluster.
To stop Trino, interrupt or kill the process, e.g. with Ctrl-C.

Trino in Production

The sample configuration that ships with Datomic analytics will have you up and querying in minutes, but it only scratches the surface of things you may want to do. Trino also allows you to join across disparate data sources, and cluster for horizontal scale.

When you are ready to run Trino in production, you will want to

Tune your JVM config to take full advantage of available memory
Run Trino as a daemon or in a container
Configure authn/authz

All of these options and more are covered in the Trino deployment docs.

After these steps, it's possible to use the SQL CLIent.