Monday, May 24, 2021

Out of touch with C++, and other new stuff

MongoDB is quite cool. Apart from being written in C++, it's a document DB (one that can store and index JSON / JSON-like content) and a distributed database that supports sharding, replication, and eventual consistency. You can also run it as a single server, which is handy while writing application code that connects to it.

  1. The server is called mongod.
  2. There is a REPL client / interpreter called mongo which also happens to be a full-fledged JavaScript interpreter.
  3. A single Mongo instance can host multiple databases, each with its own data files.
  4. Each database can have multiple collections, analogous to tables in an RDBMS.
  5. Each collection has zero or more documents, which correspond to rows in an RDBMS, though the similarities are faint.
    1. A document is JSON or JSON-like content. JSON-like refers to the supported extensions to the JSON format, with value types like Date, binary data, etc.
    2. Every document has an identifier - the _id attribute, which must be unique within a collection. Its default type is ObjectId, a 12-byte value, but it can be of another type. The 12 bytes are partitioned from left to right as "4[epoch seconds]|3[hostname hash]|2[pid]|3[incr num]". (Newer MongoDB versions replace the hostname hash and pid with a single 5-byte random value.)
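
To make the "4|3|2|3" partitioning above concrete, here is a minimal C++ sketch that splits 12 raw bytes into those four fields. The names ObjectIdParts, read_be and split_object_id are my own and not part of any MongoDB driver, and reading every field as big-endian is a simplifying assumption for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical holder for the four fields of the layout described above.
struct ObjectIdParts {
    std::uint32_t epoch_seconds;  // bytes 0-3
    std::uint32_t host_hash;      // bytes 4-6
    std::uint16_t pid;            // bytes 7-8
    std::uint32_t counter;        // bytes 9-11
};

// Combine `len` bytes (at most 4) starting at `pos` into one integer,
// most significant byte first. Big-endian for all fields is an assumption.
constexpr std::uint32_t read_be(const std::array<std::uint8_t, 12>& bytes,
                                std::size_t pos, std::size_t len) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < len; ++i) {
        value = (value << 8) | bytes[pos + i];
    }
    return value;
}

// Split the 12 bytes from left to right as 4 | 3 | 2 | 3.
constexpr ObjectIdParts split_object_id(const std::array<std::uint8_t, 12>& bytes) {
    return ObjectIdParts{
        read_be(bytes, 0, 4),
        read_be(bytes, 4, 3),
        static_cast<std::uint16_t>(read_be(bytes, 7, 2)),
        read_be(bytes, 9, 3),
    };
}
```

Since the first field is a timestamp in seconds, an ObjectId carries a rough creation time, which is why sorting by _id roughly sorts by insertion time.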

The meat of the subject is perhaps in dealing with distributed data - sharding, indexing, consistency, and the tools used to implement these. There are also specifics like the expression language for queries, the client libraries, and working with the mongo REPL. Last but not least is the topic of building applications using Mongo, which influences how application data is expressed in Mongo and consumed from it. More on that in another post.

C++ has been an old love affair, a fact that alienates me from a lot of well-meaning programmers at the outset, and endears me to a few. But truth be told, I have not written a lot of C++ over the past few years and have grown my own contrarian opinions about the usefulness of many recent additions to the language. I didn't get a chance to work with fold expressions earlier, but I looked at them recently.

To me, fold expressions appear to be a syntactic convenience for unrolling loops over parameter packs without explicitly writing the recursive templates and specializations that were needed before. The code does become shorter, but does it really become more expressive? I don't know - I think it becomes a little cryptic / terse because the syntax doesn't intuitively express what's happening. You have to know it and get used to it, like with parameter pack expansions involving function and template expressions.
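
For comparison, here is a small sketch of the same sum written both ways. The function names (sum_recursive, sum_fold, all_positive) are made up for illustration, and the older version uses an overload as its base case rather than a specialization, which is one of several ways it used to be done.

```cpp
// Before C++17: recurse over the pack, peeling off one argument per call.
// The zero-argument overload ends the recursion.
constexpr int sum_recursive() { return 0; }

template <typename T, typename... Rest>
constexpr auto sum_recursive(T first, Rest... rest) {
    return first + sum_recursive(rest...);
}

// C++17: a binary left fold, which expands to ((0 + v1) + v2) + ... + vN.
// The leading 0 makes the empty pack well-formed.
template <typename... Ts>
constexpr auto sum_fold(Ts... values) {
    return (0 + ... + values);
}

// A unary right fold over &&. For an empty pack this yields true.
template <typename... Ts>
constexpr bool all_positive(Ts... values) {
    return ((values > 0) && ...);
}

// Both versions are usable at compile time.
static_assert(sum_recursive(1, 2, 3) == 6);
static_assert(sum_fold(1, 2, 3) == 6);
```

The fold version is clearly shorter, but whether `(0 + ... + values)` reads more naturally than the recursion is exactly the judgment call in question. The parentheses are mandatory, and the position of the `...` decides whether the fold is left or right.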

I don't have much to argue on the matter. The C++ folks can always shut me up by saying that this is a tool for library writers that people like me can ignore. Maybe they are right. But I do wonder why library writers have to put up with cryptic syntax. Isn't simplicity of value to them?

