mindstalk | sqlite the miracle DB

At work today, Boss suggested I look at sqlite a bit, since our client code uses it. What I thought might be a brief glance turned into hours of reading, as it became rather fascinating. For those who don't know, it's an embedded SQL database, with not much code, unlike the client/server databases of Oracle or anything else you've probably heard of. As their docs put it, they're not competing with such databases, they're competing with fopen() and other filesystem access.

They call their testing "aviation grade", possibly without hyperbole: 100% branch coverage, 100% coverage of something stronger than branches (MC/DC, modified condition/decision coverage), 700x more testing code than actual library code, and a lot of that generates tests parametrically... it sounds pretty nuts. They worship Valgrind but find compiler warnings somewhat useless; getting warnings to zero added more bugs than it solved. https://www.sqlite.org/testing.html

They claim "billions and billions of deployments", which sounded like humorous hyperbole until they added being on every iPhone or Android phone, every Mac or Windows 10 machine, every major browser install... There are over 2 billion smartphones, so just from the phone OS and the phone browser, you've got 4 billion installs...

They also make a pitched case for considering a sqlite database any time you'd be considering some complex file format. With almost no code to write, you'd get consistency, robustness, complex queries, machine and language independence, and at least some ability to do partial writes[1], compared to throwing a bunch of files into a zipfile.

https://www.sqlite.org/appfileformat.html
https://www.sqlite.org/affcase1.html
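
To make the "database instead of a zipfile" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `parts` table and the helper names (`open_document`, `save_part`, `load_part`, `list_parts`) are made up for illustration; they are not taken from the sqlite docs, and a real format would want a richer schema.

```python
import sqlite3


def open_document(path):
    """Open (or create) a hypothetical 'document' stored as a sqlite file."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parts ("
        " name TEXT PRIMARY KEY,"
        " body BLOB NOT NULL)"
    )
    return conn


def save_part(conn, name, body):
    """Insert or overwrite one part; the other parts are left alone."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR REPLACE INTO parts (name, body) VALUES (?, ?)",
            (name, body),
        )


def load_part(conn, name):
    """Read a single part without loading the rest of the document."""
    row = conn.execute(
        "SELECT body FROM parts WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]


def list_parts(conn):
    """Return the part names in sorted order."""
    rows = conn.execute("SELECT name FROM parts ORDER BY name")
    return [name for (name,) in rows]
```

Saving one part is a single transaction, so a crash mid-save should leave either the old or the new version of that part, and reading one part does not mean unpacking the whole archive.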

They also had a nicely educational description of their rollback and write-ahead models. https://www.sqlite.org/atomiccommit.html
https://www.sqlite.org/wal.html
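
For anyone who wants to poke at the two models, the journal mode is selected with a PRAGMA. The sketch below, again in Python, switches a scratch database to write-ahead logging and checks that the `-wal` side file shows up next to it. The function name and the `demo.db` file name are just placeholders, and the mode a fresh database starts in depends on how the library was built, so the code reports it rather than assuming it.

```python
import os
import sqlite3
import tempfile


def journal_mode_demo():
    """Switch a scratch database to WAL and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path)

        # Whatever mode this build starts in (commonly a rollback journal).
        starting_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]

        # Ask for write-ahead logging; the PRAGMA returns the mode in effect.
        new_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

        conn.execute("CREATE TABLE t (x INTEGER)")
        with conn:
            conn.execute("INSERT INTO t VALUES (1)")

        # In WAL mode, changes are appended to a side file named <db>-wal.
        wal_file_exists = os.path.exists(path + "-wal")

        conn.close()
        return starting_mode, new_mode, wal_file_exists
```

Calling `journal_mode_demo()` returns the starting mode, the mode after the switch, and whether the WAL file was present while the connection was open.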

[1] I do wonder about this. One odd thing about sqlite is a looseness about types, and AIUI cramming numeric values into the smallest range that will hold them. So I'd think that if you UPDATED a value 100 to a value 1000000000000, you'd have to shuffle the trailing part of the file, compared to a format that e.g. reserved 8 bytes for a numeric type. But maybe they do buffer numeric or string storage. And not having to write the whole file, or not having to read the whole file (e.g. to decompress it) seem like at least partial wins.
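
A small experiment along these lines, in Python, with the caveat that it only illustrates the question rather than settling it. The first function shows the looseness about types: a column declared INTEGER will still hold text that doesn't look like a number. The second updates 100 to 1000000000000 in a tiny table and compares file sizes. The file is organized in fixed-size pages, so with only ten rows there is plenty of free space in the row's page and I would expect the size not to change; a page that was already full would presumably have to split instead. Table and function names are invented for the example.

```python
import os
import sqlite3
import tempfile


def flexible_types():
    """Store several kinds of value in a column declared INTEGER."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v INTEGER)")
    conn.executemany(
        "INSERT INTO t VALUES (?)",
        [(100,), ("123",), ("hello",), (1.5,)],
    )
    rows = conn.execute(
        "SELECT v, typeof(v) FROM t ORDER BY rowid"
    ).fetchall()
    conn.close()
    return rows


def size_before_and_after_update():
    """Grow one small integer into a large one and compare file sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path)
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]

        with conn:
            conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v INTEGER)")
            conn.executemany("INSERT INTO t (v) VALUES (?)", [(100,)] * 10)
        before = os.path.getsize(path)

        with conn:
            conn.execute("UPDATE t SET v = 1000000000000 WHERE id = 5")
        after = os.path.getsize(path)

        value = conn.execute("SELECT v FROM t WHERE id = 5").fetchone()[0]
        conn.close()
        return page_size, before, after, value
```

In `flexible_types()`, the string "123" comes back as an integer while "hello" stays text, which is the type looseness in action.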