Choosing a DB
99% of projects can just use SQLite:
It is fast enough and simple to administer, as long as there are no complex needs: blogs, simple internal apps, small businesses, etc. Litestream now also makes backups easy.
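To give a feel for how little setup this takes, here is a minimal sketch using Python's built-in sqlite3 module. The file name, the `posts` table and the helper functions are made up for illustration. WAL mode is switched on because, as far as I know, Litestream replicates the WAL, so it is a sensible default if you plan to add it later.

```python
import sqlite3

def open_db(path="app.db"):
    """Open (or create) the database file and apply a few common settings."""
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a writer is active (file-backed DBs only).
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA foreign_keys=ON")
    # Hypothetical schema for a small blog.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    return conn

def add_post(conn, title, body):
    """Insert one post and return its row id."""
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "INSERT INTO posts (title, body) VALUES (?, ?)", (title, body)
        )
    return cur.lastrowid

def list_titles(conn):
    """Return all post titles in insertion order."""
    return [row[0] for row in conn.execute("SELECT title FROM posts ORDER BY id")]
```

There is no server to install and no users to manage. The whole database is one file that you can copy or back up.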
For 99% of the remaining 1%, Postgres/MySQL suffices, mostly because they need:
- Security/auditing
- HA/DR (with failure isolation + RTO/RPO guarantees)
- Good tooling and monitoring support
- Probably also a replica/secondary for accounting/analytics
In most cases these can also serve as a basic search engine, queue, etc. This is enough for most traditional businesses (OTA, ERP, payments, small e-commerce, CMS, etc.)
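As one illustration of the "basic queue" use, here is a rough sketch of a table-backed job queue built on `SELECT ... FOR UPDATE SKIP LOCKED`, which is available in Postgres 9.5+ and MySQL 8.0+. The `jobs` table, its columns and the `claim_job` helper are hypothetical. The code assumes a DB-API connection with the `%s` parameter style (psycopg2 and PyMySQL both use it) and autocommit left off.

```python
# Sketch only: assumes a hypothetical table
#   jobs(id, payload, status) where status is 'pending' or 'running'.

CLAIM_SQL = """
SELECT id, payload
FROM jobs
WHERE status = 'pending'
ORDER BY id
LIMIT 1
FOR UPDATE SKIP LOCKED
"""

MARK_SQL = "UPDATE jobs SET status = 'running' WHERE id = %s"

def claim_job(conn):
    """Claim one pending job; return (id, payload), or None if the queue is empty.

    SKIP LOCKED makes concurrent workers skip rows another worker has
    already locked, so two workers never claim the same job.
    """
    cur = conn.cursor()
    try:
        cur.execute(CLAIM_SQL)
        row = cur.fetchone()
        if row is None:
            conn.rollback()
            return None
        cur.execute(MARK_SQL, (row[0],))
        conn.commit()  # releases the row lock; the job is now marked as running
        return row
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

This gets you a long way before a dedicated broker is needed, and the jobs live in the same transactions and backups as the rest of the data.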
The rest need a more complex solution, for example:
- Social graph / knowledge graph (dgraph, neo4j, janus, nebula, vitess+myrocks, etc)
- Search engine (elastic, solr, etc)
- Telco (VoltDB, RonDB/NDB)
- TSDB / monitoring engine (timescale, questdb, clickhouse, prometheus, M3, kdb, influx, kudu, etc) + stream processing for IIoT (kafka, pulsar, spark, flink, etc)
- Adtech/caching (aerospike, tarantool, voltdb, singlestore, scylla, redis, memcache, etc)
- Geospatial (postGIS, singlestore, cockroach, tarantool, aerospike, kinetica, geospock); for libraries, H3 or S2
- OLTP with small-scale analytics (HTAP) / data grid (singlestore, ksql, TiDB, kudu, geode, hazelcast, ignite, infinispan, etc)
- Proper queuing (rocketmq, rabbitmq, nsq, etc)
- Real DW/OLAP (clickhouse, druid, citus, singlestore, kudu, pinot, bigquery, etc)
- Data lake solution (hadoop, delta, hudi, iceberg)
Or even something custom or niche, such as:
- disruptor, chronicle, prevayler, etc [not databases per se, but in practice they effectively act as one]
- Scientific research (e.g. genbank)
- Entity resolution (tilodb)
- Accounting-focused (tigerbeetle)
- Machine learning embedding (embeddinghub)
- Time travel DB (datomic, dolt)
- Authorization platform (Keto, authzed)