Monitoring Ions

All the code examples shown below are available in the Ion Event Example project.

Overview

AWS CloudWatch provides a powerful set of tools for monitoring a software system running on AWS, and Datomic fully integrates with these tools.

The datomic.ion.cast namespace lets your ion application code add its own monitoring data alongside the monitoring data that Datomic already produces. Cast supports four categories of monitoring data:

  • An event is an ordinary occurrence that is of interest to an operator, such as start and stop events for a process or activity.
  • An alert is an extraordinary occurrence that requires operator intervention, such as the failure of some important process.
  • Dev is information of interest only to developers, e.g. fine-grained logging to troubleshoot a problem during development. Dev data can be much higher volume than events or alerts.
  • A metric is a numeric value in a named time series, such as the latency for an operation.

Cast is part of the com.datomic/ion library.

Events

An event is an ordinary occurrence that is of interest to an operator, such as start and stop events for a process or activity. The event function takes a map with a single required key, :msg, a string, plus any additional namespace-qualified keys you choose.

For example, the event call below logs the raw JSON sent to an AWS Lambda ion:

(cast/event {:msg "CodeDeployEvent" ::json input})

When Ion code is running on Datomic Cloud, events are posted to Datomic's CloudWatch log. Datomic transforms data to searchable JSON as follows:

  • Datomic adds the key/value pair {"Type": "Event"}.
  • Keywords are converted to CamelCase strings, with the delimiters ., /, and _ introducing a new capital letter. For example, my.app/key becomes "MyAppKey".
  • Exceptions are converted to data via Clojure's Throwable->map.
  • Double/NaN and Double/POSITIVE_INFINITY are converted to the strings "NaN" and "Infinity".
  • Other datatypes are converted to strings via Clojure's str function.
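
The keyword rule above can be illustrated with a small pure-Clojure sketch. This is not Datomic's implementation; the function name camel-case-key is hypothetical, and the sketch assumes only the delimiters listed above (., /, and _) and that the rest of each segment is left unchanged.

```clojure
(require '[clojure.string :as str])

(defn camel-case-key
  "Illustrative only: converts a keyword to a CamelCase string by
   treating ., / and _ as delimiters that start a new capital letter."
  [k]
  (let [full (subs (str k) 1)]              ; drop the leading colon
    (->> (str/split full #"[./_]")          ; split on the listed delimiters
         (remove str/blank?)                ; ignore empty segments
         (map (fn [part]
                ;; upper-case the first character, keep the rest as-is
                (str (str/upper-case (subs part 0 1)) (subs part 1))))
         (apply str))))

(camel-case-key :my.app/key) ; => "MyAppKey"
```

A sketch like this can help you predict which JSON key to search for in the CloudWatch log when you choose namespace-qualified keys for your own events.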

Alerts

An alert is an extraordinary occurrence that requires operator intervention, such as the failure of some important process. The alert function takes a map with

  • a required :msg string
  • an optional :ex Java Throwable
  • any additional namespace-qualified keys you choose

For example, the alert call below is used in the catch block of a web service ion to report an unexpected server failure:

(cast/alert {:msg "SlackHandlerFailed" :ex t})
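
As a minimal sketch of how such a catch block might look in context, the wrapper below calls a handler and raises an alert if anything is thrown. The names handle-request and handler, and the shape of the fallback response map, are assumptions made for illustration; only cast/alert and its :msg and :ex keys come from the text above.

```clojure
(require '[datomic.ion.cast :as cast])

(defn handle-request
  "Hypothetical wrapper: invokes handler on request and reports any
   unexpected failure as an alert before returning a generic response."
  [handler request]
  (try
    (handler request)
    (catch Throwable t
      ;; :msg is required; :ex carries the Throwable
      (cast/alert {:msg "SlackHandlerFailed" :ex t})
      ;; assumed fallback response, not prescribed by Datomic
      {:status 500 :body "Internal server error"})))
```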

When Ion code is running on Datomic Cloud, alerts are posted to Datomic's CloudWatch log. The data transformation is the same as for events, except that the value for Type is "Alert".

In addition to the log entry, alerts create a Datomic CloudWatch metric value named "Alerts". You can use the occurrence of this metric to trigger an action, for example sending an SNS message to an operator.

Dev

Dev is information of interest only to developers, e.g. fine-grained logging to troubleshoot a problem during development. Dev data can be much higher volume than events or alerts.

The cast/dev function takes an arbitrary map. In the example below, cast/dev shows the channel and text of a Slack post (presumably so that a developer can verify that a function is being called correctly):

(cast/dev {:msg "PostingToSlack" ::channel channel ::text text})

You can redirect cast/dev as part of your local dev workflow.

NOTE Configuring a destination for cast/dev when running in Datomic Cloud is currently not supported.

Metrics

A metric is a numeric value in a named time series, such as the latency for an operation. The metric function takes a map with the following required keys:

  • :name, a keyword
  • :value, a number that can be cast via Clojure double
  • :units, one of :msec, :bytes, :kb, :mb, :gb, :sec, :count

For example, the metric call below is used to record the occurrence of a CodeDeployEvent:

(cast/metric {:name :CodeDeployEvent :value 1 :units :count})
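
Since latency is the motivating example for metrics, here is a hedged sketch of recording one. The helper timed and the metric name in the usage comment are hypothetical; the sketch assumes the :name, :value, and :units keys used in the call above and reports wall-clock time in milliseconds.

```clojure
(require '[datomic.ion.cast :as cast])

(defn timed
  "Hypothetical helper: calls the zero-argument function f, reports its
   wall-clock duration as a :msec metric, and returns f's result.
   If f throws, no metric is recorded."
  [metric-name f]
  (let [start  (System/nanoTime)
        result (f)
        msec   (/ (- (System/nanoTime) start) 1e6)] ; nanoseconds -> milliseconds
    (cast/metric {:name metric-name :value msec :units :msec})
    result))

;; Usage (SlackPostLatency is an illustrative metric name):
;; (timed :SlackPostLatency #(Thread/sleep 25))
```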

NOTE AWS will display all metrics in lowercase with the first character capitalized. As an example, the aforementioned :CodeDeployEvent will display as Codedeployevent in both the metrics and the logs.

When Ion code is running on Datomic Cloud in the Production Topology, metrics are translated into Datomic CloudWatch metrics. Choose metric names that do not collide with the built-in metric names.

NOTE Custom metrics are not supported in the Solo Topology. Calls to cast/metric will have no effect.

Local Workflow

You may find it useful to see monitoring data when you are developing on a machine outside Datomic. You can call initialize-redirect once per process to prn all alert, event, and dev output to one of the following targets:

  • :stdout, standard output
  • :stderr, standard error
  • (string), a filename

There is no redirection when running in Datomic Cloud, so it is ok to leave calls to initialize-redirect in your production code.
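
A minimal sketch of the local workflow follows, assuming initialize-redirect takes the target as its single argument, as the description above suggests. The message text and the channel and text values are made up for illustration.

```clojure
(require '[datomic.ion.cast :as cast])

;; Call once per process, e.g. when starting a local REPL session.
;; Assumed signature: one argument naming the target.
(cast/initialize-redirect :stdout)

;; With the redirect in place, these calls print locally
;; instead of going to CloudWatch.
(cast/event {:msg "LocalSmokeTest"})
(cast/dev {:msg "PostingToSlack" ::channel "#general" ::text "hello"})
```

Passing a string instead of :stdout or :stderr would, per the list above, send the output to a file of that name.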

Java Logging

Datomic Cloud uses SLF4J to redirect output from all Java logging frameworks to cast/alert:

  • ERROR and WARN level logs produce an alert whose message includes the logger name, level, and message.
  • Lower log levels are ignored.