- Add 37 census documentation files for IBGE census datasets (1970-2010)
- Add dataviz wordcloud scripts and images
- Add relatorio_final.md with research findings on households and living conditions
- New data from DuckDB queries:
  - 90.7M households, 203M population
  - 53.2% Black population
  - 27.9% female-headed households
  - 46.6% urban sewage without collection/treatment
  - 15,816 favela sectors (2010)
  - 68% Black population in Fortaleza
- Add profile of 6,126 people in collective dwellings (v4002=63)
  with demographics: gender, race, education, age, marital status
- Add detailed analysis of 503 minors: 349 likely prisoners (v0502=20),
  154 dependents of staff/prisoners
- Add breakdown of female prisoners: more educated and whiter than male prisoners
- Fix language inconsistencies (Spanish, Chinese, English terms)
- Add documentation for br_ibge_censo_2022 setor_censitario (v* variables)
- Add documentation for prison population identification across census datasets
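The collective-dwellings filter described above can be sketched in a self-contained way. This uses `sqlite3` from the standard library purely for illustration (the project itself runs DuckDB over the census parquet files); the codes v4002 = 63 (collective dwelling) and v0502 = 20 (likely prisoner) come from the changelog, while the table name and the `idade` age column are assumptions:

```python
import sqlite3

# Tiny synthetic stand-in for the census microdata (table/column names assumed).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE pessoas (v4002 INTEGER, v0502 INTEGER, idade INTEGER)")
con.executemany(
    "INSERT INTO pessoas VALUES (?, ?, ?)",
    [
        (63, 20, 16),  # collective dwelling, prisoner code, minor
        (63, 20, 34),  # collective dwelling, prisoner code, adult
        (63, 10, 12),  # collective dwelling, dependent of staff (code assumed)
        (11, 20, 40),  # private dwelling -> excluded by the v4002 filter
    ],
)

# Profile people in collective dwellings (v4002 = 63), then split out the
# minors who are likely prisoners (v0502 = 20).
total = con.execute("SELECT COUNT(*) FROM pessoas WHERE v4002 = 63").fetchone()[0]
minor_prisoners = con.execute(
    "SELECT COUNT(*) FROM pessoas WHERE v4002 = 63 AND idade < 18 AND v0502 = 20"
).fetchone()[0]
print(total, minor_prisoners)
```

The same two-step filter (dwelling type first, then relationship code within minors) is what produces the 503/349/154 split reported above.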
- Add ttyd service for ask on port 7682
- Update haloy.yml with new domain and GEMINI_API_KEY
- Update Caddyfile to route ask.xn--2dk.xyz to ttyd
- Update Dockerfile to include ask binary
- Update README with ask section and schema files documentation
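The Caddyfile route described above might look like the following sketch (the domain and port come from the changelog; the exact directive layout is an assumption, and Caddy's `reverse_proxy` proxies websockets without extra configuration):

```caddyfile
ask.xn--2dk.xyz {
	reverse_proxy localhost:7682 {
		# ttyd streams over a websocket; disable response buffering
		flush_interval -1
	}
}
```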
- Implement wrap_text function to handle long cell content
- Auto-wrap table columns when content exceeds available width
- Preserve original table rendering for fits-all cases
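The changelog does not show which component implements `wrap_text`, so here is a hedged Python sketch of the idea using `textwrap`, with the fits-all fast path preserved as described above (the real signature and behaviour may differ):

```python
import textwrap

def wrap_text(cell: str, width: int) -> list[str]:
    """Split a cell's content into lines no wider than `width`.

    Returns the cell unchanged as a single line when it already fits,
    mirroring the 'preserve original rendering for fits-all cases' rule.
    """
    if len(cell) <= width:
        return [cell]
    # break_long_words handles tokens longer than the column width
    return textwrap.wrap(cell, width=width, break_long_words=True) or [""]

print(wrap_text("short", 10))
print(wrap_text("a much longer cell value", 10))
```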
- Remove sample_datasets project (no longer needed)
- Update .gitignore to use wildcard for target dirs
The GraphQL approach had broken pagination (the totalCount key was missing, so it crashes silently). The S3 approach at least completes cleanly, even though the metadata table currently lacks a description column.
- ask.py: Python script to query Base dos Dados via natural language using Gemini,
generates and executes DuckDB SQL from Portuguese questions
- ask/ (Rust): CLI companion for the SQL query assistant with system prompt
- sample_datasets.py: samples parquet files from S3 into a local DuckDB for exploration
- sample_datasets/ (Rust): CLI for dataset sampling
- context/: LLM context bundle (schemas, join keys, file tree) for query generation
- Replace S3 bigquery_tables metadata lookup with paginated GraphQL API call
to fetch table and column descriptions from Base dos Dados
- Add gera_schemas.py for schema compilation and S3 inventory
- Add schemas.json and file_tree.md as generated reference artifacts
- Add websocket proxy in Caddyfile for ttyd on port 7681
- Ignore generated context/ artifacts in .gitignore
- Add openai to requirements.txt
- swap DuckDB UI for a ttyd web terminal (ttyd started --writable, database opened -readonly)
- add POST /query endpoint with X-Password auth for curl-based SQL execution
- fix UTF-8 rendering: set LANG/LC_ALL=C.UTF-8 in container
- pass BUCKET_REGION env var for correct S3 signing region
- simplify start.sh: drop Xvfb, views.duckdb generation, blocking duckdb -ui
- add less, ncurses-bin to Dockerfile for proper pager/terminal support
- update Caddyfile: single route to ttyd with flush_interval -1 for websocket
- update README to reflect current architecture and document /query usage
- remove duckdb-ui.service, schemas.json, file_tree.md (generated artifacts)
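Calling the `/query` endpoint described above might look like the following sketch. Only the POST method and the `X-Password` header come from the changelog; the URL, payload shape, and `QUERY_PASSWORD` env-var name are assumptions (the request is built but not sent):

```python
import os
import urllib.request

def build_query_request(sql: str, url: str = "http://localhost:8080/query"):
    """Build (but do not send) a POST to the /query endpoint with the
    X-Password auth header read from the environment."""
    return urllib.request.Request(
        url,
        data=sql.encode("utf-8"),
        method="POST",
        headers={
            "X-Password": os.environ.get("QUERY_PASSWORD", ""),
            "Content-Type": "text/plain",
        },
    )

req = build_query_request("SELECT 42")
print(req.method)
```

With curl the equivalent would be a `-X POST` with an `-H "X-Password: ..."` header, as the README documents.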
- start.sh: remove prepara_db.py step; load S3 creds via DuckDB init file
- Caddyfile: switch to basic_auth with {env.BASIC_AUTH_HASH} so the password can be rotated without a rebuild
- Dockerfile: drop Python/pip layers (no longer needed at runtime)
- haloy.yml: set server to 89.167.95.136, add BASIC_AUTH_HASH to env
- remove requirements.txt (only needed for local prepara_db.py, not the container)
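The S3-credentials init file mentioned above might contain something like this sketch (the secret name and all values are placeholders; `CREATE SECRET` requires DuckDB 0.10 or later):

```sql
-- init.sql, passed to the CLI as: duckdb -init init.sql
CREATE SECRET s3_creds (
    TYPE S3,
    KEY_ID 'AKIA...',           -- placeholder
    SECRET 'your-secret-key',   -- placeholder
    REGION 'us-east-1'          -- placeholder; should match BUCKET_REGION
);
```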