Troubleshooting

Not finding what you need here? Ask on the forum.

Troubleshooting CloudFormation Templates

Failures in Nested Templates

If a nested template fails, the root template also fails, and by default the console shows only the root template's CREATE_FAILED event. To see the important details in the nested template's events:

  • Click the Filter popup and choose "Deleted" (screenshot: images/template-filters.png).
  • Select the nested template, then choose the Events tab to see the cause of the failure.

Check for CloudFormation Failure

To check your CloudFormation stack for errors, find the stack in the CloudFormation console and select the checkbox at the start of its row.

If the status of the stack is "CREATE_IN_PROGRESS", the stack is still being created, and you should continue to wait for it to finish. You can monitor progress from the CloudFormation details window. The stack is done when its status is "CREATE_COMPLETE".

If the status of the stack is "CREATE_FAILED", there was an error starting the stack. See the other topics on this page for help tracking down the source of the failure.

Production Topology Create Failed

The production topology can fail to create in CloudFormation with a CREATE_FAILED event on the AWS::AutoScaling::AutoScalingGroup for the TxAutoScalingGroup resource, with the message:

You have requested more instances (2) than your current instance limit of 1 allows for the specified instance type. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit. Launching EC2 instance failed.

To fix this, request an increase in your limit on running i3.large instances.

Starting a Compute Stack Without a Storage Stack

Without a storage stack, the compute stack fails immediately and rolls back. The stack event reports the following error, where <system-name> is the system name you specified when launching your compute stack:

No export named <system-name>-FileSystemId found. Rollback requested by user.

./images/compute-only-error.png

Running in EC2 Classic

Datomic Cloud is currently only supported in accounts that support only EC2-VPC. If you attempt to run Datomic Cloud in an EC2-Classic account, the CloudFormation stack will fail. The stack event message will have an error with a Type of "Custom::ResourceCheck" and a Logical ID of "EnsureEc2Vpc".

See Setting Up for more information.

Troubleshooting Compute Nodes

Check Alerts First

Datomic publishes a metric named Alert for any situation that might require operator intervention. You can create a CloudWatch Alarm for the Alert metric, and then search the logs for specific details.

Reviewing alerts should be your first step when you are troubleshooting Datomic Cloud nodes.

Check Memory Second

Running out of memory is the number one cause of both availability and performance problems.

Every Datomic node publishes a JvmFreeMb metric. If the Minimum value for this metric dips below 15% of its initial startup value for an extended period, memory pressure is likely a problem.

  • If you believe the problem is individual large queries, upgrade to larger EC2 instance sizes.
  • If you believe the problem is a high volume of smaller queries, increase the number of instances in your compute group.
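
The 15% rule above can be sketched as a small predicate over JvmFreeMb datapoints you have already retrieved from CloudWatch. This is only an illustration: the function name, the option names, and the choice of three consecutive datapoints to stand in for "an extended period" are assumptions, not part of Datomic.

```clojure
(defn memory-pressure?
  "Returns true when `minimums` (JvmFreeMb Minimum datapoints, oldest first)
  contains a run of at least `min-run` consecutive values below
  `ratio` * `initial-mb`. All names and defaults here are illustrative."
  ([initial-mb minimums]
   (memory-pressure? initial-mb minimums {}))
  ([initial-mb minimums {:keys [ratio min-run]
                         :or   {ratio 0.15 min-run 3}}]
   (let [limit  (* ratio initial-mb)
         below? (fn [mb] (< mb limit))]
     (boolean
      ;; partition-by groups consecutive datapoints that are all below,
      ;; or all at or above, the limit; look for a long enough "below" run.
      (some (fn [run] (and (below? (first run))
                           (>= (count run) min-run)))
            (partition-by below? minimums))))))
```

For example, with an initial value of 1000 MB the limit is 150 MB, so three consecutive minimums of 140, 130, and 120 would count as memory pressure, while a single dip would not.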

Troubleshooting Client Errors

Missing Connect Arguments

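If you leave out a required argument when creating a client, the client throws an exception that names the missing key. In the example below, :query-group is missing: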

CompilerException clojure.lang.ExceptionInfo: Expected string for :query-group {:cognitect.anomalies/category :cognitect.anomalies/incorrect, :cognitect.anomalies/message "Expected string for :query-group", :datomic.client.impl.shared.validator/got {:server-type :cloud, :region "us-east-1", :system "errortest2", :endpoint "http://entry.errortest2.us-east-1.datomic.net:8182/", :proxy-port 8888}, :datomic.client.impl.shared.validator/op :client, :datomic.client.impl.shared.validator/requirements {:region string, :system string, :query-group string, :endpoint string, :server-type keyword}}, compiling:(form-init8129817634595122502.clj:1:13)
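
For comparison, here is a sketch of a config map that supplies every key listed under :requirements in the error above. All values except :query-group are copied from the error message; the :query-group value is an assumption (the system name is reused, as in the other config maps on this page).

```clojure
;; Sketch of a complete client config map for the failing call above.
{:server-type :cloud
 :region      "us-east-1"
 :system      "errortest2"
 :query-group "errortest2" ; assumed value: same name as the system
 :endpoint    "http://entry.errortest2.us-east-1.datomic.net:8182/"
 :proxy-port  8888}
```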

::cognitect.anomalies/forbidden

If you get a forbidden error when using the Client API, this means that your AWS credentials do not grant permission to this functionality.

  • Make sure that you have an IAM policy that authorizes the operation you are trying to perform.
  • Make sure that policy is associated with the identity you are running under.

ExceptionInfo Forbidden to read keyfile

> (d/create-database client {:db-name "test"})
ExceptionInfo Forbidden to read keyfile at s3://datomic-test-storagef7f305e7-1bpzuyaf5d-s3datomic-1wgl6uvtl9bei/datomic-test/datomic/access/admin/.keys. Make sure that your endpoint is correct, and that your ambient AWS credentials allow you to GetObject on the keyfile.  clojure.core/ex-info (core.clj:4739)

Ensure that you have sourced the right credentials with all necessary permissions.

Connection Failure

The client error:

Exception in thread "main" clojure.lang.ExceptionInfo: Unable to connect to system: 
{:cognitect.anomalies/category :cognitect.anomalies/fault, :cognitect.anomalies/message "SOCKS4 tunnel failed, connection closed", :cognitect.http-client/throwable #error {
 :cause "SOCKS4 tunnel failed, connection closed"
 :via
  [{:type java.io.IOException
    :message "SOCKS4 tunnel failed, connection closed"
    :at [org.eclipse.jetty.client.Socks4Proxy$Socks4ProxyConnection onFillable "Socks4Proxy.java" 166]}] ...

Accompanied by the following message from your SOCKS Proxy:

debug1: channel 2: new [dynamic-tcpip]
channel 2: open failed: administratively prohibited: open failed

Together, these messages indicate that the client was unable to reach the Datomic system through the proxy. Check the :endpoint in your configuration carefully, and use the Bastion Connection Test to ensure your proxy is configured correctly.

java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter

This issue is resolved in com.datomic/client-cloud 0.8.63.

On com.datomic/client-cloud 0.8.56 and Java 9+ you'll run into this error when connecting:

CompilerException clojure.lang.ExceptionInfo: java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter #:cognitect.anomalies{:category :cognitect.anomalies/fault, :message "java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter"}, compiling:(form-init692047429040487558.clj:1:11)

As of Java 9, the JAXB module (java.xml.bind) is no longer resolved by default, and it was removed from the JDK entirely in Java 11. To resolve this error, add the javax.xml.bind/jaxb-api {:mvn/version "2.3.0"} dependency explicitly.
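
In a deps.edn project, the dependency could be added as in the sketch below. The client-cloud coordinate and version are the ones named in this section; the rest of your :deps map is assumed to stay as it is.

```clojure
;; deps.edn (fragment)
{:deps {com.datomic/client-cloud {:mvn/version "0.8.56"}
        ;; JAXB API, no longer available by default on Java 9+
        javax.xml.bind/jaxb-api  {:mvn/version "2.3.0"}}}
```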

Jetty Dependency Conflict

The following error will occur if your project has a transitive dependency on a more recent version of Eclipse Jetty than the version used by Datomic:

#error {
 :cause "org.eclipse.jetty.util.thread.Invocable$InvocationType"
  :via
   [{:type java.lang.NoClassDefFoundError
     :message "org/eclipse/jetty/util/thread/Invocable$InvocationType"
     :at [org.eclipse.jetty.io.ManagedSelector <init> "ManagedSelector.java" 79]}
    {:type java.lang.ClassNotFoundException
     :message "org.eclipse.jetty.util.thread.Invocable$InvocationType"
     :at [java.net.URLClassLoader findClass "URLClassLoader.java" 381]}]
 ...}

The conflict can be resolved by excluding the version of Jetty used by Datomic:

[com.datomic/client-cloud "0.8.54"
 :exclusions [org.eclipse.jetty/jetty-client
              org.eclipse.jetty/jetty-http
              org.eclipse.jetty/jetty-util]]
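
The exclusion above uses Leiningen syntax. If your project uses deps.edn instead, a roughly equivalent sketch, assuming the same client-cloud version, would be:

```clojure
;; deps.edn (fragment); version copied from the Leiningen example above
{:deps {com.datomic/client-cloud
        {:mvn/version "0.8.54"
         :exclusions  [org.eclipse.jetty/jetty-client
                       org.eclipse.jetty/jetty-http
                       org.eclipse.jetty/jetty-util]}}}
```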

::cognitect.anomalies/busy

If you get a busy response from the Client API, your request rate has temporarily exceeded the capacity of a node, and has already been through an exponential backoff and retry implemented by the client. At this point you have three options:

  • When transacting, continue to retry the request at the application level with your own exponential backoff. The Mbrainz Importer example project demonstrates a batch import with retry.
  • When querying, expand the capacity of your system by upgrading from Solo to Production or by increasing the number of instances in your AutoScaling Group.
  • Give up on completing the request.
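
The first option can be sketched as a small wrapper. This is a minimal illustration, not the approach used by the Mbrainz Importer: the function names, the default attempt count, and the starting delay are all assumptions. It relies only on the anomaly being available through ex-data, as in the exceptions shown on this page.

```clojure
(defn busy?
  "True when the exception's data carries the busy anomaly category."
  [e]
  (= :cognitect.anomalies/busy
     (:cognitect.anomalies/category (ex-data e))))

(defn with-busy-retry
  "Calls the no-arg function f. On a busy anomaly, sleeps and retries,
  doubling the delay each time, for at most max-attempts calls in total.
  Any other exception, or a busy anomaly on the last attempt, is rethrown."
  [f {:keys [max-attempts initial-delay-ms]
      :or   {max-attempts 5 initial-delay-ms 200}}]
  (loop [attempt  1
         delay-ms initial-delay-ms]
    (let [result (try
                   {:value (f)}
                   (catch clojure.lang.ExceptionInfo e
                     (if (and (busy? e) (< attempt max-attempts))
                       ::retry
                       (throw e))))]
      (if (= ::retry result)
        (do (Thread/sleep (long delay-ms))
            (recur (inc attempt) (* 2 delay-ms)))
        (:value result)))))

;; Usage sketch (conn and tx-data are hypothetical names):
;; (with-busy-retry #(d/transact conn {:tx-data tx-data}) {:max-attempts 8})
```

Capping the number of attempts keeps the third option, giving up, available: once the cap is reached the busy anomaly propagates to the caller.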

Troubleshooting Socks Proxy Errors

Unsupported AWS CLI Version

Symptom: The script prints a generic AWS CLI help message before failing:

$ bash datomic-socks-proxy -r eu-central-1 good-system
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help
aws: error: argument command: Invalid choice, valid choices are:
...

Fix: Update to the latest version of the AWS CLI.

Wrong AWS Creds

$ bash datomic-socks-proxy -r eu-central-1 good-system
Unable to read bastion key, make sure your AWS creds are correct.

Fix: Make sure your AWS creds are connected to a policy that is authorized for your Datomic system.

Wrong System Name

$ bash datomic-socks-proxy -r eu-central-1 not-a-system
Datomic system not-a-system not found, make sure your system name and AWS creds are correct.

Troubleshooting Ions

Ion push Requires a Clean git Commit

If you attempt to push an ion project with uncommitted local changes, you will see the following exception:

java.lang.IllegalArgumentException: You must either specify a uname or deploy from clean git commit

As detailed in the unreproducible push documentation, you must either push from a fully clean git commit or provide an unrepro name (:uname) to deploy local uncommitted changes.

API Gateway Returns a 500 Internal Server Error

An HTTP 500 Internal Server Error returned from API Gateway indicates that the ion code threw an exception. Find the full exception text and stack trace in the Datomic system CloudWatch Logs:

  • Open the CloudWatch Logs in the AWS Console
  • Click on the Log Group named "datomic-{System}", where System is your system name. Each EC2 instance will create a separate log stream. Usually you will want to search across all log streams.
  • Click the "Search Log Group" button
  • Enter "Exception" in the search dialog
  • Find the event of interest in the search results and click the arrow to expand the details

The screenshot below shows finding an ion exception in the log: images/ion-exception-log.png

Supported Lambda Ion Return Types

Returning an unsupported type from a Lambda Ion will result in the exception:

datomic.ion.lambda.handler.exceptions.Incorrect: No implementation of method: :->bbuf of protocol: #'datomic.ion.lambda.dispatcher/ToBbuf found for class: clojure.lang.LazySeq

Lambda ions must return a String, InputStream, ByteBuffer, or File. Function signatures for all ion types can be found in the ion reference.
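
As an illustration of the error above, the sketch below builds a lazy sequence and serializes it to a String before returning. The function name and data are hypothetical, and the argument shape (a map with an :input key) is an assumption here; check the ion reference for the exact signature.

```clojure
(defn list-items
  "Hypothetical Lambda ion entry point that returns a String."
  [{:keys [input]}]
  (let [items (map #(str "item-" %) (range 3))] ; map returns a LazySeq
    ;; Returning items directly would produce the ToBbuf error above,
    ;; so realize the sequence and serialize it to a String instead.
    (pr-str (vec items))))
```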

"SOCKS4 tunnel failed, connection closed" and "Connection refused" Errors in Client

This error is thrown when the client cannot find a Datomic SOCKS proxy. If you see it while running remotely, the client is expecting a proxy, and you need to remove the proxy setting (:proxy-port) from your config map:

CompilerException clojure.lang.ExceptionInfo: Unable to connect to localhost:8182 {:cognitect.anomalies/category :cognitect.anomalies/fault, :cognitect.anomalies/message "SOCKS4 tunnel failed, connection closed", :cognitect.http-client/throwable #error {
 :cause "SOCKS4 tunnel failed, connection closed"
 :via
   [{:type java.io.IOException
 :message "SOCKS4 tunnel failed, connection closed"
 :at [org.eclipse.jetty.client.Socks4Proxy$Socks4ProxyConnection onFillable "Socks4Proxy.java" 166]}]
 :trace
   [[org.eclipse.jetty.client.Socks4Proxy$Socks4ProxyConnection onFillable "Socks4Proxy.java" 166]
   [org.eclipse.jetty.io.AbstractConnection$ReadCallback succeeded "AbstractConnection.java" 273]
   [org.eclipse.jetty.io.FillInterest fillable "FillInterest.java" 95]
   [org.eclipse.jetty.io.SelectChannelEndPoint$2 run "SelectChannelEndPoint.java" 75]
   [org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume produceAndRun "ExecuteProduceConsume.java" 213]
   [org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume run "ExecuteProduceConsume.java" 147]
   [org.eclipse.jetty.util.thread.QueuedThreadPool runJob "QueuedThreadPool.java" 654]
   [org.eclipse.jetty.util.thread.QueuedThreadPool$3 run "QueuedThreadPool.java" 572]
   [java.lang.Thread run "Thread.java" 745]]}, :config {:server-type :cloud, :region "us-east-1", :system "wrong", :query-group "wrong", :endpoint "http://entry.wrong.us-east-1.datomic.net:8182/", :proxy-port 8182, :endpoint-map {:headers {"host" "entry.wrong.us-east-1.datomic.net:8182"}, :scheme "http", :server-name "entry.wrong.us-east-1.datomic.net", :server-port 8182}}}, compiling:(sync_client_test.clj:138:13) 

The following error is thrown when the config map points to the wrong proxy port:

CompilerException clojure.lang.ExceptionInfo: Unable to connect to localhost:8188 {:cognitect.anomalies/category :cognitect.anomalies/unavailable, :cognitect.anomalies/message "Connection refused", :config {:server-type :cloud, :region "us-east-1", :system "jaret-lambda-test", :query-group "jaret-lambda-test", :endpoint "http://entry.jaret-lambda-test.us-east-1.datomic.net:8182/", :proxy-port 8188, :endpoint-map {:headers {"host" "entry.jaret-lambda-test.us-east-1.datomic.net:8182"}, :scheme "http", :server-name "entry.jaret-lambda-test.us-east-1.datomic.net", :server-port 8182}}}, compiling:(form-init7068544986673255826.clj:1:13)

With both of these errors, double-check the :endpoint and :proxy-port values in your connection config.
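
The two situations can be contrasted with a pair of config maps. This is a sketch under assumptions: "my-system" is a hypothetical system name, and 8182 is assumed to be the local port your SOCKS proxy listens on.

```clojure
;; Connecting through the SOCKS proxy: :proxy-port must match the local
;; port the proxy is actually listening on (8182 is assumed here).
{:server-type :cloud
 :region      "us-east-1"
 :system      "my-system" ; hypothetical system name
 :query-group "my-system"
 :endpoint    "http://entry.my-system.us-east-1.datomic.net:8182/"
 :proxy-port  8182}

;; Connecting without a proxy: the same map with :proxy-port removed.
{:server-type :cloud
 :region      "us-east-1"
 :system      "my-system"
 :query-group "my-system"
 :endpoint    "http://entry.my-system.us-east-1.datomic.net:8182/"}
```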