One ingest, many readers
DuckLake caches your datasets in DuckDB backed by S3 Parquet. Hundreds of dashboard viewers hit the cache — your production databases receive only one query per TTL period.
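The "one query per TTL period" behavior can be pictured as a read-through cache with an expiry time. The sketch below is only an illustration in plain Python, not BILake's implementation: `TTLCache`, `query_production_db`, and the `sales_by_month` key are hypothetical names, and an in-memory dictionary stands in for DuckDB and S3 Parquet.

```python
import threading
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Hypothetical read-through cache: at most one source query per key per TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        # key -> (expires_at, cached value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, load: Callable[[], Any]) -> Any:
        # Holding the lock while loading keeps concurrent readers from all
        # hitting the source at once; a real system would likely lock per key.
        with self._lock:
            now = self._clock()
            entry = self._entries.get(key)
            if entry is not None and now < entry[0]:
                return entry[1]  # still fresh: serve from cache
            value = load()  # expired or missing: one query to the source
            self._entries[key] = (now + self._ttl, value)
            return value


source_queries = 0


def query_production_db():
    """Stand-in for an expensive query against a production database."""
    global source_queries
    source_queries += 1
    return [("2024-01", 120), ("2024-02", 135)]


cache = TTLCache(ttl_seconds=300)
for _ in range(500):  # 500 dashboard reads within one TTL window
    rows = cache.get("sales_by_month", query_production_db)
```

After the loop, `source_queries` is 1: every read after the first is served from the cache until the entry expires.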
Columnar queries in milliseconds
DuckDB’s in-process columnar engine runs aggregations without network overhead. A chart query that takes 30 seconds on PostgreSQL can return in about 50 ms from the local DuckDB cache.
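To see why a columnar layout suits chart aggregations, consider the toy sketch below. It is plain Python, not DuckDB, and the `region`, `product`, and `amount` fields are invented for illustration. The point is only that an aggregation over a column layout touches the two columns it needs and never reads the rest of each record.

```python
# Row layout: every record carries every field.
rows = [
    {"region": "EU", "product": "A", "amount": 10.0},
    {"region": "US", "product": "B", "amount": 7.5},
    {"region": "EU", "product": "C", "amount": 15.0},
]

# Column layout: one list per field, all in the same record order.
columns = {name: [record[name] for record in rows] for name in rows[0]}


def sum_by(group_col, value_col):
    """Group-and-sum that reads only the two columns it needs."""
    totals = {}
    for group, value in zip(columns[group_col], columns[value_col]):
        totals[group] = totals.get(group, 0.0) + value
    return totals


totals = sum_by("region", "amount")
```

Here `totals` maps each region to its summed amount, and the `product` column is never read. A real columnar engine adds compression and vectorized execution on top of this idea.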
17+ data sources
PostgreSQL, MySQL, ClickHouse, Snowflake, BigQuery, Spanner, Trino, SQL Server, Databricks, and more. One consistent interface across all connectors.
Apache Iceberg V2
Iceberg V2 metadata is written alongside every Parquet file. Spark, Trino, and Flink can read BILake’s cached data as a first-class Iceberg table — no ETL pipeline.
Settings-driven
TTL, memory limits, S3 credentials, auth parameters — all in the metadata database, editable from the Settings page. No config files, no restarts for most changes.
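One way to picture settings that live in a database rather than in config files is a key/value table read at the point of use. The sketch below is an assumption about the general pattern, not BILake's schema: the `settings` table, the key names, and `get_setting` are all hypothetical, and an in-memory SQLite database stands in for the metadata database.

```python
import sqlite3

# In-memory SQLite stands in for the metadata database (an assumption).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT NOT NULL)")
db.executemany(
    "INSERT INTO settings (key, value) VALUES (?, ?)",
    [("cache_ttl_seconds", "300"), ("duckdb_memory_limit", "2GB")],
)


def get_setting(conn, key, default=None):
    """Look the value up on every call so edits apply without a restart."""
    row = conn.execute(
        "SELECT value FROM settings WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else default


ttl_before = int(get_setting(db, "cache_ttl_seconds"))

# Simulates an edit made from the Settings page.
db.execute(
    "UPDATE settings SET value = ? WHERE key = ?", ("60", "cache_ttl_seconds")
)

ttl_after = int(get_setting(db, "cache_ttl_seconds"))
```

Because the value is fetched at the point of use, `ttl_after` reflects the edit immediately. Settings that are only read at process startup would be the ones that still need a restart.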
Tableau-style workbooks
Drag-and-drop shelf interface: drop fields onto the rows, columns, and marks shelves. A workbook holds multiple worksheets and dashboards, each shown as a sheet in one tab strip.
    Source databases (PostgreSQL / MySQL / ClickHouse / BigQuery / …)
            │  one ingest per TTL
            ▼
    Worker process ──→ DuckDB ──→ S3 Parquet ──→ Iceberg V2
                                           ↑
    Flight server (reads via httpfs) ──────┘
            ↑
    API server ←→ SvelteKit Frontend

Three processes, each independently scalable:

| Process | Role |
|---|---|
| API server | ConnectRPC, auth, CRUD, settings, River scheduler |
| Flight server | DuckDB in-process, read queries, Parquet views |
| Worker | River job consumer, dataset ingest, S3 writes |

    git clone https://github.com/hakanuzum/bilake.git
    cd bilake
    docker compose up -d
    open http://localhost

Default login: admin@example.com / admin123