### The "goodies" in CockroachDB

A few months ago, I was invited to present CockroachDB to a [tech
consulting office in Amsterdam](https://container-solutions.com/). The
audience was welcoming and receptive. They understood, appreciated,
and lauded the "flagship" features of CockroachDB: distribution,
scalability, high availability, and operational simplicity.

Yet a question came up which I had not heard before: *all these are
features that solve known problems; now, what are the goodies?*

The goodies, the asker clarified, are those features which:

1) the user did not expect,
2) are not present in other products, and
3) are small-ish in nature so that a casual user can easily show them off to a
   peer.

Goodies enable users to *brag* about their product choice after the
choice is made, without paying too much attention to the rational
trade-offs that motivated the choice.

I paused, and recollected. What are CockroachDB's *goodies?*

Obviously, the main CockroachDB documentation is unlikely to highlight
features directly in this way: the documentation aims to treat all
features as novel and useful, making no assumptions about which ones a
particular reader may prefer. Arguably, the doc site
is also a marketing tool aiming to convince users who do not use
CockroachDB yet, so it is bound to focus primarily on CockroachDB's
core features.

Finding "goodies" requires looking at the thing as if all its core
features were already considered familiar and uninteresting, and
contemplating what sticks out beyond that in an agreeable way.

Searching for a fancy feature that could elicit a "wow" reaction in
demonstration booths, I quickly thought about the [**Node
map**](https://www.cockroachlabs.com/docs/stable/admin-ui-access-and-navigate.html#node-map-enterprise):
a graphical visualisation of the geographical distribution of
CockroachDB nodes in the world.

Arguably, this feature is very enterprise-y (and incidentally limited
to deployments with an "Enterprise license"), and perhaps of limited
use when the database operates properly.

We can instead look at the layer underneath, another goodie of a more
technical nature: the [configuration of replication
zones](https://www.cockroachlabs.com/docs/stable/configure-replication-zones.html)
which enables a user to configure which parts of which SQL tables are
replicated on which (sub-)sets of cluster nodes.

The **zone config language** is a *DSL* (domain-specific language) which
supports a *constraint algebra* against arbitrary attributes of the
underlying data stores. It supports both positive (mandate) and
negative (avoid) conjunctions (mandate/avoid compatibility with all
properties) and disjunctions (mandate/avoid compatibility with
either/or properties). Its *constraint solver* triggers *automatic
migration events* which move the data to where the constraints require
it. It also
interacts peacefully and constructively with the automatic load
balancing that happens independently to increase performance: data is
migrated *within its constrained zone* to bring it closer to where
it is needed.
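
To give a flavor of the constraint syntax, here is a minimal sketch. It assumes a release that accepts zone configurations through SQL (older releases configure zones with the `cockroach zone` command and a YAML file instead). The table name `customers` and the attributes `region=eu` and `hdd` are hypothetical; they would have to match the locality and store attributes that the nodes were actually started with.

```sql
-- Hypothetical table and attributes, for illustration only.
-- "+" mandates an attribute, "-" avoids it; all the listed
-- constraints must hold together (a conjunction).
ALTER TABLE customers CONFIGURE ZONE USING
    num_replicas = 3,
    constraints = '[+region=eu, -hdd]';
```

With a configuration like this in place, the data would be expected to migrate to matching stores on its own, without further manual intervention.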

I described this as a solid and serious feature that is both practically
essential and appealing to an audience of erudite hackers. My
audience agreed.

Having contributed to some parts of the code base, I am aware of
several more goodies, some of which I worked on directly or indirectly.

For example, CockroachDB integrates a fancy **tracing infrastructure**
which can extract detailed debugging information. The collection of traces
can be enabled using a variety of mechanisms depending on the
troubleshooting scenario. For instance, [one can request a detailed
trace](https://www.cockroachlabs.com/docs/stable/show-trace.html) of
all the processing done by CockroachDB *on behalf of a single query*,
but *throughout all the abstraction layers inside CockroachDB*
including across all the nodes in the cluster that participated in the
query's execution. Many other tracing endpoints beyond `SHOW TRACE`
are also available, including via the web browser. It's also possible
to trace all executions through particular files or functions in
CockroachDB's source code.

Given the commonly known arduousness of debugging large distributed
systems, developers will likely find some appeal in this powerful
tool. It has certainly improved the life of CockroachDB's contributors
already.

Speaking of which, a fancy advantage of exposing tracing data within
SQL is that one can then further use SQL queries to filter, transform
and reduce particular details of traces. In fact, CockroachDB
generalizes this principle: *any internal data produced by CockroachDB
that can be structured as a table should be available for further
processing by SQL queries*.
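
As a rough sketch of that idea, the following records a trace for the current session and then filters it with ordinary SQL. The table `kv` is hypothetical, and the `SET tracing` switch as well as the column names `age`, `operation` and `message` are assumptions that may differ between CockroachDB versions.

```sql
-- Record a trace for the statements executed in between.
SET tracing = on;
SELECT count(*) FROM kv;   -- hypothetical table
SET tracing = off;

-- Post-process the trace with ordinary SQL: keep only the
-- messages that mention a scan, in chronological order.
SELECT age, operation, message
  FROM [SHOW TRACE FOR SESSION]
 WHERE message ILIKE '%scan%'
 ORDER BY age;
```

The same pattern should extend to aggregations, for instance counting trace messages per `operation`.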

Here, I do not merely mean that CockroachDB, like other SQL databases,
exposes the SQL logical schema via SQL tables
(e.g. [`information_schema`](https://www.cockroachlabs.com/docs/stable/information-schema.html))
which can be queried for introspection.

Beyond that, [any configuration or administration SQL
statement can also be used as a "virtual table" to query
data](https://www.cockroachlabs.com/docs/stable/table-expressions.html#using-the-output-of-other-statements). For
example, there exists a `SHOW JOBS` statement that lists the current
background tasks in the cluster (e.g. asynchronous online schema
changes, such as adding an index on a very large table); given that
this produces tabular data, one can refine the output with e.g. `SELECT
finished - created FROM [SHOW JOBS]` to determine the execution time
of completed jobs. This enables users to design their own views on the
current status of their cluster, without the need to request an
extension in CockroachDB's SQL syntax.
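
A slightly richer variant of the query above could look as follows. This is only a sketch: besides `created` and `finished`, it assumes that the output of `SHOW JOBS` also has `description` and `status` columns, and that completed jobs carry the status `'succeeded'`.

```sql
-- The five slowest completed jobs, with their execution time.
SELECT description,
       finished - created AS duration
  FROM [SHOW JOBS]
 WHERE status = 'succeeded'   -- assumed status value
 ORDER BY duration DESC
 LIMIT 5;
```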

There is also a command-line [SQL shell](https://www.cockroachlabs.com/docs/stable/use-the-built-in-sql-client.html) (invoked via `cockroach
sql`), analogous to the [`psql` shell](https://www.postgresql.org/docs/10/static/app-psql.html). In fact, the two are so compatible
that `psql` can connect to a CockroachDB cluster, and `cockroach
sql` can connect to a PostgreSQL database.

Despite its smaller set of features compared to `psql`, `cockroach
sql` contains its own goodies. For example, both `psql` and `cockroach sql`
can present the user with guidance about the syntax and usage of a SQL
statement using `\h`, but `cockroach sql` can also present this
help if the user types `??` and then presses the tab key *while they are
currently entering a query*. This enables the use of contextual help
without erasing the current entry, which is particularly convenient
while experimenting. To ease experimentation further, `cockroach sql`
also supports `\hf`, a command not known to `psql`, which pulls up the
documentation of individual SQL built-in functions.

On a related note, the `cockroach` executable program contains many other
functions besides the main server function (`start`, `quit`, etc.) and
the SQL shell (`sql`). Some of them are gems of their own.

`cockroach demo` is a fantastic entry point for beginners, and for
teachers constructing a SQL tutorial: in one fell swoop, it
starts a RAM-only CockroachDB server and an interactive SQL shell,
*with no additional configuration needed*. Type this command in, then you
can start typing SQL immediately and work with CockroachDB. Lovers of
`sqlite` tend to like this a lot. (I do too. It's gorgeously helpful
to try out new code during development.)

`cockroach gen man` will generate CockroachDB's [unix manual
pages](https://en.wikipedia.org/wiki/Man_page) automatically, ready to
read or install. Cockroach Labs distributes a single `cockroach`
binary, to simplify the download process, but you can still install
its documentation the Right Way, as you would for all your other beloved
unix programs.

`cockroach gen autocomplete` generates auto-completion data for
either Bash or Zsh. Avid users of the CockroachDB command line will
surely appreciate this convenience, which is designed to accelerate
operations and maintenance.

There is even an Easter egg hidden in `cockroach gen` somewhere, but I
am not telling. Will you be able to spot the CockroachDB logo?

There is much I could write about CockroachDB's unique technical
features. Yet, at this point, I would like to shift this exposé and
underline that CockroachDB's own [documentation
site](https://www.cockroachlabs.com/docs/) is itself quite a unique
achievement. To a casual observer, "it's just a documentation site for
a technical project". But for aficionados of documentation resources,
there is much to love.

For example, the documentation presents content both from the angle of
usage scenarios (e.g. "how to do this or that") and as a reference
manual (i.e. "what is everything I need to know about an aspect of the
product"). There is content both for absolute beginners (e.g. "Getting
started" guides) and technical audiences (e.g. an in-depth
presentation of CockroachDB's architecture). Cross-references are
exhaustive and relevant, so that it is particularly easy to idly surf
from one area to another, much like one can educate oneself by
casually browsing Wikipedia. Each documentation page has a link in the
top right where the reader can become an editor and propose
improvements (even propose direct changes to the text). For a project
as young as CockroachDB, the maturity of its documentation is
remarkable. (Disclaimer: I have personally contributed to parts of
it. I am very proud.)

To conclude, I would say it is rather easy to find ways to like (or
even love) CockroachDB *beyond* the moment you decide that it is suitable
for your purpose. Plenty of goodies indeed.

[Find me on twitter.](https://twitter.com/kena42)

Copyright © 2018 Raphael ‘kena’ Poss.
Permission is granted to distribute, reuse and modify this document
according to the terms of the Creative Commons Attribution-ShareAlike
4.0 International License.  To view a copy of this license, visit
[http://creativecommons.org/licenses/by-sa/4.0/](http://creativecommons.org/licenses/by-sa/4.0/).