Datomic in a nutshell

In this blog post, I will take you through my journey of using the Datomic database: why we chose it, and the pros and cons of using it. Datomic is unlike the relational and NoSQL databases I have used in the past; it's unique.

My first experience with Datomic came when I started working on a microservice in Clojure. In the product we were building, we had to track changes: who changed what, and when. In the healthcare domain, audit is usually important. While looking for databases that support audit right out of the box, we came across Datomic. It is similar to version control in that it never forgets anything. Datomic seemed like a good fit for our needs, and after much evaluation and consideration we decided to use it. Everything about Datomic is unique: the architecture, the data model, the concept of assertion and retraction for saving data, and the powerful query mechanism for retrieving it.

Datomic is a relatively young ecosystem (initial release 2012). It is a distributed database with ACID transactions. The official Datomic site, its numerous official blogs, and blogs from other Datomic enthusiasts are a huge source of information, but I wanted a reference that quickly covers the Datomic essentials in a nutshell.

The datom is the building block of a Datomic database. A datom is represented as a tuple of five fields: Entity (E), Attribute (A), Value (V), Transaction Id (T), and Operation Type (Op), which records whether the fact was asserted or retracted.

A datom is an immutable fact as of a point in time. Because of this immutability, modifying data does not remove the old value; it retracts the old value and asserts the new one. Thus, the database accumulates facts over time. You can compare it to a commit in Git.
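To make the retract-then-assert idea concrete, here is a minimal sketch in Python. It is a toy model and not the Datomic API: the `Datom` tuple, the `set_value` and `current_value` helpers, the attribute name and the transaction ids are all made up for illustration, and it assumes an attribute holds a single value at a time.

```python
from typing import Any, List, NamedTuple


class Datom(NamedTuple):
    """Toy stand-in for a datom: (E, A, V, T, Op)."""
    e: int       # entity id
    a: str       # attribute
    v: Any       # value
    tx: int      # transaction id
    added: bool  # True = assertion, False = retraction


def current_value(log: List[Datom], e: int, a: str) -> Any:
    """Replay the log in order; a retraction clears the value, an assertion sets it."""
    value = None
    for d in log:
        if d.e == e and d.a == a:
            value = d.v if d.added else None
    return value


def set_value(log: List[Datom], e: int, a: str, v: Any, tx: int) -> None:
    """Append-only update: retract the current value (if any), then assert the new one."""
    old = current_value(log, e, a)
    if old is not None:
        log.append(Datom(e, a, old, tx, False))
    log.append(Datom(e, a, v, tx, True))


log: List[Datom] = []
set_value(log, 42, "patient/name", "Jane Doe", tx=1000)
set_value(log, 42, "patient/name", "Jane Smith", tx=1001)

for d in log:
    print(d)
```

Nothing is ever overwritten here: the second call appends a retraction of "Jane Doe" and an assertion of "Jane Smith", so the log keeps three datoms for what looks like one field.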

Datomic is best suited for applications with relatively low write volume. When write traffic is heavy or the data volume is large, the database grows quickly, because Datomic accumulates past facts instead of discarding them.

Unlike an RDBMS, Datomic does not have the concept of tables, columns, or rows.

An Entity in Datomic is the collection of datoms that share the same E.

Datomic's architecture is different from that of a traditional database. There are separate Datomic processes for reading data from storage and for transacting data to storage.

A transactor is the process that saves schema and data to Datomic storage; it only handles writes. A peer reads data directly from storage, so data retrieval does not go through the transactor. Datomic itself does not manage files on disk for storing data. Instead, it relies on a storage service such as DynamoDB, which persists the data.

A simple and powerful means of querying the data store is important for any database. Datomic provides queries and rules that are an extended form of Datalog. Declarative queries, with joins expressed through shared variables across clauses, make it easy to use. The query engine takes a database value and rule sets as input. A clause is the basic component of a Datalog query, and a set of clauses forms a rule. We can even call Java static and instance methods inside a clause!
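As a rough illustration of how clauses and joins work, here is a toy pattern matcher in Python. It is not Datomic's query engine: the `facts` data, the `match` and `query` helpers and the attribute names are invented for this sketch, and it only handles plain (entity, attribute, value) clauses. The idea it shows is that a variable such as `?p` appearing in two clauses forces both clauses to agree on its value, which is what produces the join.

```python
# Current facts as (entity, attribute, value) triples; illustrative data only.
facts = [
    (1, "patient/name", "Jane"),
    (1, "patient/doctor", 10),
    (2, "patient/name", "Ravi"),
    (2, "patient/doctor", 11),
    (10, "doctor/name", "Dr. Rao"),
    (11, "doctor/name", "Dr. Lee"),
]


def is_var(term):
    """Variables are strings starting with '?', as in Datalog."""
    return isinstance(term, str) and term.startswith("?")


def match(clause, fact, binding):
    """Unify one clause with one fact; return the extended binding, or None."""
    extended = dict(binding)
    for term, value in zip(clause, fact):
        if is_var(term):
            if term in extended and extended[term] != value:
                return None  # conflicts with a value bound by an earlier clause
            extended[term] = value
        elif term != value:
            return None  # constant in the clause does not match the fact
    return extended


def query(find, where, facts):
    """Evaluate clauses left to right, carrying variable bindings along."""
    bindings = [{}]
    for clause in where:
        next_bindings = []
        for b in bindings:
            for f in facts:
                extended = match(clause, f, b)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return {tuple(b[v] for v in find) for b in bindings}


# Roughly the shape of:
#   [:find ?pname ?dname
#    :where [?p :patient/name ?pname]
#           [?p :patient/doctor ?d]
#           [?d :doctor/name ?dname]]
result = query(
    find=["?pname", "?dname"],
    where=[
        ("?p", "patient/name", "?pname"),
        ("?p", "patient/doctor", "?d"),
        ("?d", "doctor/name", "?dname"),
    ],
    facts=facts,
)
print(sorted(result))
```

The real engine is far more capable (rules, predicates, aggregates, index-aware evaluation); the sketch only covers the binding-and-join core.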

As stated earlier, support for audit right out of the box was important for us. Since a Datomic database accumulates data over time, it allows the user to time travel to a historic snapshot of the database. Using the Datomic APIs, we can get a snapshot of the database in one of three ways:

  • As of a time-point – returns the database value as of that time-point. If a query uses this db value, any transactions after this time-point are not considered in the evaluation.
  • Since a time-point – any transaction before that time-point is not considered when evaluating a query.
  • History – returns a database value containing all assertions and retractions across time.
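The three views can be sketched over a toy datom log in Python. This is a simplified model rather than the Datomic API (where, to my knowledge, the corresponding Peer functions are named `as-of`, `since` and `history`); the `Datom` tuple, the helper names, the attribute and the transaction ids below are assumptions made for illustration.

```python
from typing import Any, List, NamedTuple, Set, Tuple


class Datom(NamedTuple):
    e: int       # entity id
    a: str       # attribute
    v: Any       # value
    tx: int      # transaction id; larger means later
    added: bool  # True = assertion, False = retraction


log = [
    Datom(42, "patient/status", "admitted", 1000, True),
    Datom(42, "patient/status", "admitted", 1001, False),
    Datom(42, "patient/status", "discharged", 1001, True),
]


def snapshot(datoms: List[Datom]) -> Set[Tuple[int, str, Any]]:
    """Fold assertions and retractions, in log order, into the facts that hold."""
    facts = set()
    for d in datoms:
        if d.added:
            facts.add((d.e, d.a, d.v))
        else:
            facts.discard((d.e, d.a, d.v))
    return facts


def as_of(log: List[Datom], t: int) -> Set[Tuple[int, str, Any]]:
    """Ignore every transaction after t."""
    return snapshot([d for d in log if d.tx <= t])


def since(log: List[Datom], t: int) -> Set[Tuple[int, str, Any]]:
    """Ignore every transaction at or before t."""
    return snapshot([d for d in log if d.tx > t])


def history(log: List[Datom]) -> List[Datom]:
    """Every assertion and retraction, with nothing folded away."""
    return list(log)


print(as_of(log, 1000))
print(since(log, 1000))
print(len(history(log)))
```

The as-of view at transaction 1000 still sees the patient as admitted, while the history view keeps all three datoms, including the retraction.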

For every change, we can retrieve the following pieces of information for audit:

  • Which entity and which attribute were modified
  • Who made the changes
  • What context the change was for
  • When the transaction happened
  • What was the old value
  • What is the new value
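The list above can be sketched as a small Python function that derives an audit trail from a history of datoms. This is a toy model, not the Datomic API. It leans on one real Datomic idea, that a transaction is itself an entity that can carry attributes (the built-in `:db/txInstant` records when it happened); the `audit/user` and `audit/reason` attributes, the data and the `audit_trail` helper are hypothetical names chosen for this example.

```python
from typing import Any, Dict, List, NamedTuple


class Datom(NamedTuple):
    e: int       # entity id
    a: str       # attribute
    v: Any       # value
    tx: int      # transaction id
    added: bool  # True = assertion, False = retraction


# In this toy model, datoms whose entity id equals their tx id describe the transaction.
log = [
    Datom(1000, "db/txInstant", "2023-01-05T09:00:00Z", 1000, True),
    Datom(1000, "audit/user", "nurse.amy", 1000, True),
    Datom(1000, "audit/reason", "admission", 1000, True),
    Datom(42, "patient/status", "admitted", 1000, True),
    Datom(1001, "db/txInstant", "2023-01-09T17:30:00Z", 1001, True),
    Datom(1001, "audit/user", "dr.rao", 1001, True),
    Datom(1001, "audit/reason", "discharge", 1001, True),
    Datom(42, "patient/status", "admitted", 1001, False),
    Datom(42, "patient/status", "discharged", 1001, True),
]


def audit_trail(log: List[Datom], e: int, a: str) -> List[Dict[str, Any]]:
    """One record per transaction that touched (e, a): old/new value plus tx metadata."""
    tx_meta: Dict[int, Dict[str, Any]] = {}
    changes: Dict[int, Dict[str, Any]] = {}
    for d in log:
        if d.e == d.tx:
            # Who / when / why, stored on the transaction entity.
            tx_meta.setdefault(d.tx, {})[d.a] = d.v
        elif d.e == e and d.a == a:
            change = changes.setdefault(d.tx, {"old": None, "new": None})
            change["new" if d.added else "old"] = d.v
    return [
        {"tx": tx, "entity": e, "attribute": a, **changes[tx], **tx_meta.get(tx, {})}
        for tx in sorted(changes)
    ]


trail = audit_trail(log, 42, "patient/status")
for record in trail:
    print(record)
```

Each record answers the six questions in one place: the entity and attribute, the user and reason taken from the transaction, the transaction instant, and the old and new values.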

As service owners, my team is responsible for backup and restore of the Datomic database. Datomic uses differential backup. The first time a backup is taken, Datomic takes a full backup; from then on, it takes differential backups. This avoids backing up redundant data and reduces the overall backup size. After a restore, it is mandatory to restart the transactor and peers for the restore to take effect. We automated the backup and restore processes. The backup is taken daily at a specified hour. When a restore is required, we specify the backup instance, Datomic is restored from it, and the transactor and peers are restarted, all automatically.

Overall, Datomic has worked out well for us. It supports audit from the get-go. Datalog queries are powerful and give us the flexibility of working with Clojure data structures. Also, the lack of schema rigidity, with no restriction on the attributes that an Entity can have, gives extra flexibility to the developer. We are happy with our choice!

Interesting reads: