Persistent DuckDB connection in auth.py for faster queries

Replace per-request subprocess spawning with a single long-lived DuckDB
Python connection (in-memory, with a read-only ATTACH). The cost of LOAD
httpfs and S3 authentication is paid once at startup, and the object cache
accumulates across requests.

Benchmarked improvement on remote: Q1 10x, Q2 3x, Q3 9x, Q4 22x faster.
Add duckdb==1.5.1 Python package to Dockerfile.
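
The pattern described above could look roughly like the following sketch. It is not the code from auth.py: the names `open_duckdb` and `PersistentConnection`, the `remote` alias, and the `db_path` argument are hypothetical, and it assumes the httpfs extension is already installed and that S3 credentials are configured elsewhere. The connection is opened lazily on first use and guarded by a lock, since a single DuckDB connection object should not be used by several threads at once.

```python
import threading


def open_duckdb(db_path):
    """Hypothetical factory: in-memory DuckDB with a read-only ATTACH."""
    import duckdb  # imported lazily so the wrapper itself has no hard dependency

    con = duckdb.connect(":memory:")
    con.execute("LOAD httpfs")  # assumes the extension is already installed
    # S3 credential setup would go here (omitted; depends on the deployment).
    con.execute(f"ATTACH '{db_path}' AS remote (READ_ONLY)")
    return con


class PersistentConnection:
    """Open one connection on first use and reuse it for every query."""

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable returning a connection
        self._con = None
        self._lock = threading.Lock()

    def query(self, sql, params=()):
        # Serialize access: one shared connection, one query at a time.
        with self._lock:
            if self._con is None:
                self._con = self._factory()  # startup cost is paid once here
            return self._con.execute(sql, params).fetchall()
```

A caller would build the wrapper once at process start, for example `db = PersistentConnection(lambda: open_duckdb(path))` where `path` is the database location, and then call `db.query(...)` from each request handler. Any factory returning an object with a DB-API style `execute(...).fetchall()` works, which also makes the wrapper easy to exercise without network access.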
2026-05-17 11:19:01 +02:00
parent 86a1669902
commit d539736afc
2 changed files with 39 additions and 22 deletions


@@ -38,7 +38,7 @@ FROM --platform=linux/amd64 debian:12-slim
RUN apt-get update -qq && \
apt-get install -y --no-install-recommends \
-    curl ca-certificates unzip bsdmainutils python3 \
+    curl ca-certificates unzip bsdmainutils python3 python3-pip \
less ncurses-bin && \
curl -fsSL \
"https://github.com/caddyserver/caddy/releases/download/v2.9.1/caddy_2.9.1_linux_amd64.tar.gz" \
@@ -59,7 +59,8 @@ RUN apt-get update -qq && \
cp /usr/local/libduckdb.so /usr/local/lib/ && \
ldconfig && \
rm /tmp/libduckdb.zip && \
-    apt-get clean && rm -rf /var/lib/apt/lists/*
+    apt-get clean && rm -rf /var/lib/apt/lists/* && \
+    pip3 install --no-cache-dir --break-system-packages duckdb==1.5.1
WORKDIR /app