SIGMOD hosts a yearly competition to design a system which performs a set of queries as quickly as possible. The contests provide a starting framework and test harness. They make great intermediate database projects for learning.

Sorting (2019), Join Processing (2018), Streaming N-Gram Filter (2017), Shortest Path (2016), Transaction Processing (2015), Social Network Graph Processing (2014), Streaming Full Text Search (2013), Multi-dimensional Indexing (2012), Durable Main-Memory Index Using Flash (2011), Distributed Query Engine (2010), Main Memory Transactional Index (2009)

Building BerkeleyDB

A B-Tree tutorial series implementing an ABI-compatible BerkeleyDB clone.

Introduction, Page Format, Entry Format, API Basics, Point Reads

Making an OCaml library usable from C.

S3 is a convenient way to host larger static artifacts for a website, but which S3-compatible service is the cheapest for that use case?

It’s easy to find documents containing "large" and "elephant". It’s hard to find documents in German which have "large" and "elephant" together in a sentence, or words with similar meanings to large, and provide only the 10 most relevant documents.


How to Learn

Suggested reading material for various topics.

Philosophy of How to Learn Consensus

dbdiag: ophistory

Announcing the ophistory script from the thisismiller/dbdiag GitHub repository, a diagram-as-text tool for visualizing concurrent executions of operations.

A reminder that macOS does not respect the usual ways of making data durable on disk.
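As a rough illustration of the point: on macOS, fsync() hands data to the drive but does not ask the drive to flush its own cache, and the platform's stronger request is fcntl() with F_FULLFSYNC. The sketch below is not taken from the post. The helper name durable_write is hypothetical, and falling back to a plain fsync() on other platforms is an assumption about what a portable caller might want.

```python
import fcntl
import os


def durable_write(path: str, data: bytes) -> str:
    """Write data to path and request that it reach stable storage.

    Returns the name of the flush primitive that was used.
    (Hypothetical helper for illustration only.)
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()  # push Python's userspace buffer down to the OS
        fd = f.fileno()
        # F_FULLFSYNC is only defined where the platform provides it (macOS).
        full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
        if full_fsync is not None:
            try:
                fcntl.fcntl(fd, full_fsync)
                return "F_FULLFSYNC"
            except OSError:
                # Some filesystems reject F_FULLFSYNC; fall through to fsync.
                pass
        os.fsync(fd)
        return "fsync"
```

Calling durable_write("data.bin", b"hello") reports which primitive was used, which makes it easy to check what a given machine actually did. Making a newly created file durable generally also involves syncing its parent directory, which this sketch leaves out.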

A popularity-based sampling of which databases are using which TLS implementations.

A walkthrough of how and why complex infrastructure should be built with deterministic simulation, and how to make such tests as productive as possible for developers.
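The core idea can be sketched in a few lines: every source of nondeterminism (here, message ordering and message loss) is drawn from a single seeded random number generator, so any failing run can be replayed exactly from its seed. This is a toy illustration, not the approach described in the post. The function name simulate, the 20% drop rate, and the message names are all made up for the example.

```python
import random


def simulate(seed: int, n_messages: int = 5) -> list[str]:
    """Deliver or drop messages in an order decided only by the seed."""
    rng = random.Random(seed)  # the single source of randomness
    in_flight = [f"msg{i}" for i in range(n_messages)]
    trace = []
    while in_flight:
        # The simulated network picks which message is handled next.
        msg = in_flight.pop(rng.randrange(len(in_flight)))
        # Inject a fault: drop roughly one message in five (arbitrary rate).
        if rng.random() < 0.2:
            trace.append(f"drop {msg}")
        else:
            trace.append(f"deliver {msg}")
    return trace
```

Running simulate(seed) twice with the same seed yields the same trace, so a test harness only needs to log the seed of a failing run to reproduce it.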