Frigatebird: A columnar SQL database with push-based query execution
Named after the frigatebird, one of the fastest birds that can stay aloft for weeks with minimal effortFrigatebird is an embedded columnar database that implements a push-based Volcano execution model with morsel-driven parallelism. Queries compile into pipelines where operators push batches downstream through channels, enabling parallel execution across multiple workers.
- Push-based execution - Operators push data downstream instead of pulling, enabling natural parallelism
- Morsel-driven parallelism - Data processed in 50k-row morsels that flow through pipelines independently
- Late materialization - Columns loaded only when needed, reducing I/O for selective queries
- Three-tier caching - Uncompressed (hot) β Compressed (warm) β Disk (cold)
- Vectorized filtering - Bitmap operations process 64 rows per CPU instruction
- Dictionary encoding - Automatic compression for low-cardinality string columns
- WAL durability - Write-ahead logging with three-phase commit for crash recovery
- io_uring + O_DIRECT - Batched async I/O bypassing OS page cache (Linux)
# Start the interactive CLI
make frigatebirdsql> CREATE TABLE events (id TEXT, ts TIMESTAMP, data TEXT) ORDER BY id
OK
sql> INSERT INTO events (id, ts, data) VALUES ('sensor-1', '2024-01-15 10:30:00', 'temperature=23.5')
OK
sql> SELECT * FROM events WHERE id = 'sensor-1'
id | ts | data
sensor-1 | 2024-01-15 10:30:00 | temperature=23.5
(1 rows)note: ORDER BY is mandatory while creating a table
A query like SELECT name FROM users WHERE age > 25 AND city = 'NYC' compiles into a pipeline:
βββββββββββββββ βββββββββββββββ βββββββββββββββ
β Step 0 β β Step 1 β β Step 2 β
β column:age ββββββΆβ column:city ββββββΆβ column:name ββββββΆ Output
β filter:>25 β β filter:=NYC β β (project) β
β (root) β β β β β
βββββββββββββββ βββββββββββββββ βββββββββββββββ
187k rows 31k rows 8k rows
Each step loads only its column and filters, passing surviving row IDs downstream. The final step materializes projection columns only for rows that passed all filters.
Without late materialization: Load ALL columns Γ 187k rows = 750k values
With late materialization: age: 187k + city: 31k + name: 8k = 226k values
βββββββββββββββββββββββββββββββββββββββββββββ
70% less data loaded
Workers grab morsels (page groups) using lock-free atomic compare-and-swap:
Time βββββββββββββββββββββββββββββββββββββββββββββββΆ
Worker 0 ββββ M0:S0 ββββ βββ M0:S1 βββ ββ M0:S2 ββ
Worker 1 ββββ M1:S0 ββββ βββ M1:S1 βββ ββ M1:S2 ββ
Worker 2 ββββ M2:S0 ββββ βββ M2:S1 βββ ββ M2:S2 ββ
Worker 3 βββ M3:S0 βββ ββ M3:S1 ββ ββ M3:S2 ββ
Legend: M0:S0 = Morsel 0, Step 0
Multiple morsels execute concurrently. Channels buffer batches between steps.
| Command | Description |
|---|---|
\d |
List all tables |
\d <table> |
Describe table schema |
\dt |
List all tables |
\i <file> |
Execute SQL from file |
\q |
Quit |
DDL
CREATE TABLE <name> (...columns...) ORDER BY <col[, ...]>
DML
INSERT INTO <table> (...) VALUES (...)UPDATE <table> SET ... [WHERE ...]DELETE FROM <table> [WHERE ...]
Queries
- Single table SELECT with WHERE, ORDER BY, LIMIT, OFFSET
- Predicates:
AND,OR, comparisons,BETWEEN,LIKE,ILIKE,RLIKE,IN,IS NULL - Aggregates:
COUNT,SUM,AVG,MIN,MAX,VARIANCE,STDDEV,PERCENTILE_CONT - Aggregate modifiers:
FILTER (WHERE ...),DISTINCT - Conditional aggregates:
sumIf,avgIf,countIf GROUP BYwithHAVING,QUALIFY,DISTINCT- Window functions:
ROW_NUMBER,RANK,DENSE_RANK,LAG,LEAD,SUM,FIRST_VALUE,LAST_VALUE - Time functions:
TIME_BUCKET,DATE_TRUNC - Scalar functions:
ABS,ROUND,CEIL,FLOOR,EXP,LN,LOG,POWER,WIDTH_BUCKET
Data Types
TEXT/VARCHAR/STRINGINT/INTEGER/BIGINTFLOAT/DOUBLE/REALBOOL/BOOLEANTIMESTAMP/DATETIMEUUIDINET/IP
Data is stored in columnar format with one file per column:
storage/
βββ data.00000 # Page data (4GB max per file, auto-rotates)
βββ data.00001
wal_files/
βββ frigatebird/ # WAL for durability
βββ frigatebird-meta/ # Metadata journal
Pages are compressed with LZ4 and aligned to 4KB boundaries for O_DIRECT I/O.
cargo testSee the docs/ directory for detailed documentation:
- Architecture - System design and execution model
- Components - Deep dive into each subsystem
- Data Flow - Step-by-step query execution trace
MIT License - see LICENSE for details.

