Enrich spreadsheets
demografix enrich reads a spreadsheet, predicts age, gender, or nationality for each row, and appends prediction columns while preserving every existing column. The output is an enriched file, ready for analysis.
Input formats
enrich reads CSV, TSV, JSON, JSONL, and XLSX. The input format is detected from the file extension.
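As a pre-flight check before invoking the command, the extension rule can be mirrored in a few lines of Python. This is only a sketch: `detect_format` and the `SUPPORTED` mapping are hypothetical names, not part of demografix, and treating extensions case-insensitively is an assumption.

```python
from pathlib import Path

# Extensions listed in this section; the mapping itself is illustrative.
SUPPORTED = {
    ".csv": "csv",
    ".tsv": "tsv",
    ".json": "json",
    ".jsonl": "jsonl",
    ".xlsx": "xlsx",
}


def detect_format(path: str) -> str:
    """Return the format implied by the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()  # assumption: case-insensitive match
    if suffix not in SUPPORTED:
        raise ValueError(f"unsupported extension: {suffix or '(none)'}")
    return SUPPORTED[suffix]
```

Calling `detect_format("people.csv")` returns `"csv"`, while a path such as `notes.txt` raises `ValueError` before any command is run.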
Enrich a file
Point enrich at a file, choose the services, and write the result with -o:
demografix enrich people.csv -o out.csv --age --gender --nationality --name-col full_name
Each enabled service appends its own columns.
--gender adds gender, gender_count, and gender_probability.
--age adds age and age_count.
--nationality adds ranked country candidates and nationality_count.
Choose the name column
enrich needs to know which column holds the name. Name it with --name-col, using a header name or a 1-based column index:
demografix enrich people.csv -o out.csv --age --name-col 2
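To make the two forms concrete, the sketch below resolves a column spec against a header row. `resolve_name_col` is a hypothetical helper, not the tool's code, and matching a header name before trying the value as an index is an assumption about precedence.

```python
def resolve_name_col(header: list[str], spec: str) -> int:
    """Return the 0-based position of the column selected by spec."""
    # Assumption: an exact header match wins over a numeric interpretation.
    if spec in header:
        return header.index(spec)
    # Otherwise treat the spec as a 1-based column index.
    if spec.isdigit():
        index = int(spec)
        if 1 <= index <= len(header):
            return index - 1
    raise ValueError(f"no column matches {spec!r}")


header = ["id", "full_name", "email"]
print(resolve_name_col(header, "full_name"))  # prints 1
print(resolve_name_col(header, "2"))          # prints 1
```

Both calls select the same column: index 2 in 1-based terms is position 1 in Python's 0-based lists.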
When first and last names are in separate columns, use split mode:
demografix enrich people.csv -o out.csv --age \
--first-name-col first --last-name-col last
Scope by country
Gender and age accept a country. Apply one country to every row with --country, or read it per row from a column with --country-col:
demografix enrich people.csv -o out.csv --gender --age --name-col full_name --country-col country
Nationality candidates
Nationality returns several country candidates per name, ordered by descending probability. Set how many are appended with --top-n (1 to 5, default 3):
demografix enrich people.csv -o out.csv --nationality --name-col last_name --top-n 5
This appends country_1 through country_5, a matching country_N_probability for each, and nationality_count.
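The column naming pattern can be spelled out for any --top-n value. The helper below is a hypothetical sketch. The column names come from this section, but the order in which they appear in the output (each candidate followed by its probability) is an assumption.

```python
def nationality_columns(top_n: int = 3) -> list[str]:
    """List the columns appended by the nationality service for a given top-n."""
    if not 1 <= top_n <= 5:
        raise ValueError("top_n must be between 1 and 5")
    columns = []
    for rank in range(1, top_n + 1):
        columns.append(f"country_{rank}")
        # Assumption: each probability column sits next to its candidate.
        columns.append(f"country_{rank}_probability")
    columns.append("nationality_count")
    return columns


print(nationality_columns(2))
# prints ['country_1', 'country_1_probability', 'country_2', 'country_2_probability', 'nationality_count']
```

With the default of 3 this yields seven columns, and with --top-n 5 it yields eleven.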
Avoid column collisions
A fresh run refuses to overwrite an existing column. If a prediction column name already exists in the input, add a prefix with --prefix:
demografix enrich people.csv -o out.csv --age --name-col full_name --prefix pred_
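A collision can be spotted before running the command by comparing the input header with the columns each service appends. The sketch below uses the age and gender column names listed earlier in this section. `find_collisions` is a hypothetical helper, and it assumes the prefix is simply prepended to each appended column name (for example `pred_age`).

```python
# Columns appended per service, as listed in this section.
SERVICE_COLUMNS = {
    "age": ["age", "age_count"],
    "gender": ["gender", "gender_count", "gender_probability"],
}


def find_collisions(header, services, prefix=""):
    """Return planned column names that already exist in the input header."""
    # Assumption: the prefix is prepended verbatim to every appended column.
    planned = [prefix + col for svc in services for col in SERVICE_COLUMNS[svc]]
    existing = set(header)
    return [col for col in planned if col in existing]


header = ["full_name", "age", "email"]
print(find_collisions(header, ["age"]))                  # prints ['age']
print(find_collisions(header, ["age"], prefix="pred_"))  # prints []
```

An empty result suggests the run would not be refused for a name clash under these assumptions.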
Output
For enrich, -o is an output file path, not a format. The output format follows the file extension: .csv, .tsv, .json, .jsonl, or .xlsx.
Preview the cost
--dry-run validates the configuration and prints a plan (input rows, appended columns, and the number of names that would be billed) without calling the API:
demografix enrich people.csv -o out.csv --age --name-col full_name --dry-run
plan
input csv, 1200 rows
output out.csv
services age
name column full_name
country none (global)
new columns age, age_count
cost 1200 names — no API calls made
Each enabled service bills one name per non-empty row.
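The billing rule can be approximated offline. The sketch below reads it as "rows with a non-empty name cell, multiplied by the number of enabled services", which is an interpretation rather than a documented formula. The helper name and the sample rows are made up for illustration, and --dry-run remains the authoritative figure.

```python
import csv
import io


def estimate_billed_names(rows, name_col, services):
    """Estimate billed names: non-empty name rows times enabled services."""
    # Assumption: a whitespace-only name counts as empty.
    named = sum(1 for row in rows if (row.get(name_col) or "").strip())
    return named * len(services)


# Made-up sample input; the second data row has no name.
SAMPLE = """full_name,email
Ana Silva,ana@example.com
,unknown@example.com
Li Wei,li@example.com
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(estimate_billed_names(rows, "full_name", ["age", "gender"]))  # prints 4
```

Two named rows with two services give an estimate of four billed names. The same file with only --age would give two.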
Resume after an interruption
A failed request does not abort the run. The remaining rows are still processed, and the partial output is written. If the quota runs out mid-run, enrich reports how many rows it wrote.
Re-run with --resume to fill only the rows whose prediction columns are still empty:
demografix enrich out.csv -o out.csv --age --name-col full_name --resume
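To see how much work a resumed run has left, the partial output can be scanned for rows whose prediction cells are blank. This is a hypothetical sketch, not the tool's logic. It assumes a row is pending only when all of the listed prediction columns are empty, and the sample rows are made up.

```python
import csv
import io


def rows_needing_resume(rows, prediction_cols):
    """Return 1-based data-row numbers whose prediction cells are all blank."""
    pending = []
    for number, row in enumerate(rows, start=1):
        # Assumption: "still empty" means every prediction column is blank.
        if all(not (row.get(col) or "").strip() for col in prediction_cols):
            pending.append(number)
    return pending


# Made-up partial output: only the first row was enriched.
PARTIAL = """full_name,age,age_count
Ana Silva,34,1200
Li Wei,,
Omar Haddad,,
"""

rows = list(csv.DictReader(io.StringIO(PARTIAL)))
print(rows_needing_resume(rows, ["age", "age_count"]))  # prints [2, 3]
```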
Summarize a signup file
Enrich a file of signups, then aggregate the result:
demografix enrich signups.csv -o signups_enriched.csv \
--age --gender --nationality --name-col full_name
The output adds gender, age, and nationality columns to every row. Pivot it in a spreadsheet or notebook to report the demographic mix of the cohort — the gender split, the age distribution, and the leading nationalities.
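For the notebook route, a minimal aggregation over the enriched CSV might look like the sketch below. The column names `gender`, `age`, and `country_1` come from this section. The sample rows and their value formats (such as `female` or `BR`) are made up, and grouping ages by decade is just one possible choice.

```python
import csv
import io
from collections import Counter


def summarize(rows):
    """Count gender values, age decades, and top-ranked countries."""
    genders = Counter(r["gender"] for r in rows if r.get("gender"))
    # Bucket ages by decade; rows with a blank age are skipped.
    decades = Counter(
        f"{int(float(r['age'])) // 10 * 10}s" for r in rows if r.get("age")
    )
    countries = Counter(r["country_1"] for r in rows if r.get("country_1"))
    return genders, decades, countries


# Made-up enriched rows, for illustration only.
SAMPLE = """full_name,gender,age,country_1
Ana Silva,female,34,BR
Li Wei,male,41,CN
Maria Souza,female,29,BR
Omar Haddad,male,,EG
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
genders, decades, countries = summarize(rows)
print(dict(genders))
print(dict(decades))
print(countries.most_common(1))  # prints [('BR', 2)]
```

For a real file, replace the in-memory sample with `open("signups_enriched.csv", newline="")` and pass the resulting reader rows to `summarize`.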
To predict a few names without a file, or to pipe a list of names, see Basic usage.