Commit Graph

16 Commits

SHA1 Message Date
8f62a79bbe feat: deploy ask TUI to ask.xn--2dk.xyz
- Add ttyd service for ask on port 7682
- Update haloy.yml with new domain and GEMINI_API_KEY
- Update Caddyfile to route ask.xn--2dk.xyz to ttyd
- Update Dockerfile to include ask binary
- Update README with ask section and schema files documentation
2026-03-28 12:12:31 +01:00
e1c2377343 feat(ask): add text wrapping for wide table columns
- Implement wrap_text function to handle long cell content
- Auto-wrap table columns when content exceeds available width
- Preserve the original table rendering when every column already fits
- Remove sample_datasets project (no longer needed)
- Update .gitignore to use wildcard for target dirs
2026-03-28 11:59:02 +01:00
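
The wrapping behaviour described in e1c2377343 can be sketched roughly as follows. This is a Python illustration only, not the commit's code: the commit names a wrap_text function, but its signature here, the wrap_row helper, and the " | " column separator are all assumptions made for the example.

```python
import textwrap


def wrap_text(cell: str, width: int) -> list[str]:
    """Split one cell into lines of at most `width` characters."""
    if width <= 0:
        raise ValueError("width must be positive")
    if len(cell) <= width:
        # Content already fits: return it untouched (the "fits-all" case).
        return [cell]
    # Break on whitespace where possible; split overlong words as a last resort.
    return textwrap.wrap(cell, width=width, break_long_words=True) or [""]


def wrap_row(cells: list[str], widths: list[int]) -> list[str]:
    """Render one logical table row as one or more physical lines.

    Hypothetical helper: the separator and padding rules are assumptions.
    """
    columns = [wrap_text(cell, width) for cell, width in zip(cells, widths)]
    height = max((len(lines) for lines in columns), default=1)
    rendered = []
    for i in range(height):
        parts = [
            # Columns with fewer wrapped lines are padded with blanks.
            (lines[i] if i < len(lines) else "").ljust(width)
            for lines, width in zip(columns, widths)
        ]
        rendered.append(" | ".join(parts))
    return rendered
```

A row whose cells all fit comes back as a single line, so narrow tables render exactly as before; only rows with an overlong cell grow to several lines.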
c142080a5d schema: revert Phase 3 to S3 bigquery_tables enrichment
The GraphQL approach had broken pagination: the totalCount key was missing
from responses, and the run crashed silently. The S3 approach at least
completes cleanly, even though the metadata table currently lacks a
description column.
2026-03-28 11:27:54 +01:00
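
The failure mode noted in c142080a5d (pagination that depends on a totalCount key which may be absent) can be avoided with a loop that treats the count as optional. The sketch below is a general pattern, not the code that was reverted: the fetch_page callable and the "items" key are hypothetical, and the real Base dos Dados response shape is not shown in this log.

```python
from typing import Callable, Iterator


def iter_items(
    fetch_page: Callable[[int, int], dict], page_size: int = 100
) -> Iterator[dict]:
    """Yield every item from a paginated source without requiring totalCount.

    `fetch_page(offset, limit)` is a hypothetical callable returning a dict
    with an "items" list and, optionally, a "totalCount" integer.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        items = page.get("items", [])
        yield from items
        offset += len(items)

        # A short or empty page means the source is exhausted.
        if len(items) < page_size:
            return
        # totalCount is only an optional early stop, never a requirement.
        total = page.get("totalCount")
        if total is not None and offset >= total:
            return
```

Because termination depends on the page length, a response without totalCount ends the loop normally instead of raising a KeyError partway through.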
b5d84e3556 feat: add LLM SQL query assistant and dataset sampler
- ask.py: Python script that queries Base dos Dados in natural language using Gemini;
  it generates and executes DuckDB SQL from Portuguese questions
- ask/ (Rust): CLI companion for the SQL query assistant with system prompt
- sample_datasets.py: samples parquet files from S3 into a local DuckDB for exploration
- sample_datasets/ (Rust): CLI for dataset sampling
- context/: LLM context bundle (schemas, join keys, file tree) for query generation
2026-03-28 11:23:51 +01:00
6801db427e schema: use BD GraphQL API for enrichment, add file tree and schema artifacts
- Replace S3 bigquery_tables metadata lookup with paginated GraphQL API call
  to fetch table and column descriptions from Base dos Dados
- Add gera_schemas.py for schema compilation and S3 inventory
- Add schemas.json and file_tree.md as generated reference artifacts
- Add websocket proxy in Caddyfile for ttyd on port 7681
- Ignore generated context/ artifacts in .gitignore
- Add openai to requirements.txt
2026-03-28 11:23:38 +01:00
ed81e52254 docs: reorder README for data users, remove unused files (xdg-open, gera_schemas.py, open_gui.sh, docs/) 2026-03-26 12:01:46 +01:00
5239a03ea8 docs: expand /query curl usage, remove outdated UI references 2026-03-26 11:58:14 +01:00
41e7f7a972 replace duckdb-ui with ttyd shell: add /query HTTP endpoint, fix utf-8/locale, region config
- swap DuckDB UI for ttyd web terminal (--writable, -readonly db)
- add POST /query endpoint with X-Password auth for curl-based SQL execution
- fix UTF-8 rendering: set LANG/LC_ALL=C.UTF-8 in container
- pass BUCKET_REGION env var for correct S3 signing region
- simplify start.sh: drop Xvfb, views.duckdb generation, blocking duckdb -ui
- add less, ncurses-bin to Dockerfile for proper pager/terminal support
- update Caddyfile: single route to ttyd with flush_interval -1 for websocket
- update README to reflect current architecture and document /query usage
- remove duckdb-ui.service, schemas.json, file_tree.md (generated artifacts)
2026-03-26 11:54:46 +01:00
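
For readers who prefer Python to curl, a request to the /query endpoint from 41e7f7a972 might be built as sketched below. Only the method (POST), the path (/query), and the X-Password header come from the commit message; sending the SQL as a raw text body, the Content-Type value, and the build_query_request name are assumptions, and the host in the usage note is borrowed from the earlier deployment commit as a placeholder.

```python
import urllib.request


def build_query_request(
    base_url: str, sql: str, password: str
) -> urllib.request.Request:
    """Build (but do not send) a POST /query request.

    Assumption: the endpoint accepts the SQL statement as a raw text body.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/query",
        data=sql.encode("utf-8"),
        headers={
            # Header name taken from the commit message.
            "X-Password": password,
            # Assumed content type; the log does not specify one.
            "Content-Type": "text/plain; charset=utf-8",
        },
        method="POST",
    )
```

Passing the result of build_query_request("https://db.xn--2dk.xyz", "SELECT 1", password) to urllib.request.urlopen would send it; the response format is not described in this log, so decoding it is left to the caller.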
cd94603fac update haloy config: use hostname for server, fix env var format 2026-03-25 13:39:15 +01:00
8c22944bbb fix duckdb filename, add db file and gitignore cleanup
- Dockerfile + start.sh: use basedosdados.duckdb (not basedosdados3.duckdb)
- add basedosdados.duckdb (3.5 MB, needed for Docker build)
- add requirements.txt (local dev use)
- .gitignore: remove *.duckdb exclusion, add .DS_Store
2026-03-25 13:30:25 +01:00
0d77f83045 simplify container: skip db prep, password via env var, fixed server IP
- start.sh: remove prepara_db.py step; load S3 creds via DuckDB init file
- Caddyfile: switch to basic_auth with {env.BASIC_AUTH_HASH}, so rotating the password needs no rebuild
- Dockerfile: drop Python/pip layers (no longer needed at runtime)
- haloy.yml: set server to 89.167.95.136, add BASIC_AUTH_HASH to env
- remove requirements.txt (only needed for local prepara_db.py, not the container)
2026-03-25 13:27:51 +01:00
9eb2dee013 containerize with Haloy: Dockerfile, Caddy basicauth, haloy.yml for db.xn--2dk.xyz
- Dockerfile: debian slim, installs DuckDB CLI, Python deps, Caddy
- start.sh: runs prepara_db.py → starts Caddy (basicauth) → starts DuckDB UI
- Caddyfile: updated for container (no TLS, port 8080, Haloy handles HTTPS)
- haloy.yml: deploys to db.xn--2dk.xyz on port 8080
- requirements.txt: duckdb, boto3, python-dotenv
- prepara_db.py, open_gui.sh, duckdb-ui.service: add previously untracked files
- remove prepara_gui.py (replaced by prepara_db.py)
2026-03-25 13:23:59 +01:00
03758acdd9 add schema dump: parquet footer reader generating schemas.json and file_tree.md 2026-03-25 10:13:40 +01:00
4572fcb28e add DuckDB explorer: creates views over S3 parquets for local querying 2026-03-25 10:13:37 +01:00
dd221cff88 add export pipeline: BigQuery → GCS → Hetzner S3 (roda.sh) 2026-03-25 10:13:34 +01:00
335abbfa2f add project setup: gitignore, env sample, readme 2026-03-25 10:13:31 +01:00
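
Commit 4572fcb28e above describes creating DuckDB views over Parquet files in S3. A minimal sketch of how such view definitions could be generated is shown below; the view_sql helper, the view name, and the bucket path are hypothetical, and the repository's actual script may work differently. The only DuckDB features relied on are CREATE OR REPLACE VIEW and read_parquet, which accepts a glob pattern.

```python
def view_sql(view_name: str, s3_glob: str) -> str:
    """Return a DuckDB statement defining a view over Parquet files.

    Hypothetical helper: names and paths are supplied by the caller.
    """
    # Quote the identifier and the path literal so odd characters stay inert.
    quoted_name = '"' + view_name.replace('"', '""') + '"'
    quoted_path = "'" + s3_glob.replace("'", "''") + "'"
    return (
        f"CREATE OR REPLACE VIEW {quoted_name} AS "
        f"SELECT * FROM read_parquet({quoted_path});"
    )
```

A view stores only the query, so the Parquet data stays in S3 and is read when the view is queried; the DuckDB session needs S3 access configured (for example via the httpfs extension) before the view can be used.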