Batch Processing (Reference)
fast-agent batch run processes row-oriented inputs and writes one JSONL envelope per row.
Inputs
Use --input with a local .jsonl, .csv, or .parquet file, or with an hf:// URI for a Hugging Face dataset:
uv run fast-agent batch run \
--input hf://datasets/evalstate/my-dataset/data/train.jsonl \
--output out.jsonl \
--model passthrough
Use --prompt for a short inline row prompt template:
uv run fast-agent batch run \
--input rows.jsonl \
--prompt "Classify this {{product}} into A, B, or C." \
--output out.jsonl \
--model passthrough
--prompt and --template are mutually exclusive. Use --template when the
row prompt is easier to maintain in a file.
Supported input formats:
| Source | Supported formats | Notes |
|---|---|---|
| Local filesystem | .jsonl, .csv, .parquet |
JSONL rows must be JSON objects. CSV rows are dictionaries keyed by header name. Parquet rows are read with DuckDB. |
hf:// Hugging Face dataset |
.jsonl, .csv, parquet |
Use hf://datasets/owner/name to read the dataset viewer parquet files, or point at a specific file such as hf://datasets/owner/name/path/file.parquet. If a repo has a single JSONL/CSV file, that file is used before the parquet fallback. |
| DuckDB | Python package or CLI | Parquet input requires either the duckdb Python package or a duckdb CLI on PATH. Install fast-agent-mcp[batch-parquet] to add the Python package. |
Dataset-level Hugging Face parquet inputs can be filtered by config and split:
uv run fast-agent batch run \
--input 'hf://datasets/evalstate/my-dataset?config=default&split=train' \
--output out.jsonl \
--model passthrough
Each loaded row becomes the template context. Column names are available as template variables, and {{row_json}} renders the complete row:
For CSV input, all values are strings because they come from CSV fields. JSONL preserves the JSON value types. Parquet scalar values are normalized for JSON output and templates; dates/times become ISO strings, decimals become strings, and bytes are decoded as UTF-8 with replacement for invalid bytes.
Parquet SQL selection
For parquet input, --sql can define the rows processed by the batch run. The query is a DuckDB SELECT query over a view named input:
uv run fast-agent batch run \
--input rows.parquet \
--output out.jsonl \
--sql "SELECT id, text FROM input WHERE split = 'eval'"
--sql is intentionally limited to parquet input. It cannot be combined with --limit, --offset, --sample, or --parallel; put filtering, ordering, and limits directly in the SQL query.
When --sql is used, output row_number values are result ordinals from the SQL result set, not stable original parquet row positions. Prefer --id-field with a stable identifier column when using SQL selection, especially with --resume.
Hugging Face Output
--hf-dataset currently applies to exported trace artifacts, not result JSONL output. Use it with --export-traces:
uv run fast-agent batch run \
--input rows.jsonl \
--output out.jsonl \
--export-traces traces/ \
--hf-dataset owner/trace-dataset
Appending and de-duplicating result rows into a Hugging Face dataset is not implemented yet.
Options
Worker and prompting
| Option | Description |
|---|---|
--model, -m |
Model override for direct mode or the selected AgentCard worker. |
--instruction PATH |
System instruction file for direct mode. Mutually exclusive with --agent-card. |
--agent-card PATH_OR_URL |
AgentCard file, directory, or URL defining the batch worker. Mutually exclusive with --instruction. |
--agent NAME |
Agent name to run when --agent-card loads multiple runnable agents. |
--prompt, -p TEXT |
Inline row prompt template. Mutually exclusive with --template. |
--template PATH |
Row prompt template file. Defaults to sending the full row JSON. |
--schema PATH |
JSON Schema file for structured results. Mutually exclusive with --schema-model. |
--schema-model IMPORT |
Pydantic BaseModel import path for structured results. Mutually exclusive with --schema. |
--shell, -x |
Enable a local shell runtime and expose the execute tool. |
Input selection
| Option | Description |
|---|---|
--input, -i PATH_OR_URI |
Required. Local .jsonl, .csv, .parquet path or hf:// Hugging Face dataset URI. |
--limit N |
Maximum selected rows to process. Useful while developing prompts and templates. |
--offset N |
Rows to skip before sampling. |
--sample N |
Deterministic sample size. |
--seed N |
Deterministic sampling seed. |
--sql QUERY |
DuckDB SELECT query over parquet input view named input. Cannot be combined with --limit, --offset, --sample, or --parallel. |
Output envelopes
| Option | Description |
|---|---|
--output, -o PATH |
Required. Output JSONL file. |
--include-input / --no-include-input |
Include the source row in each output envelope. |
--id-field FIELD |
Input field used as the row ID. Prefer this for resumable production jobs. |
--error-output PATH |
Additional JSONL file containing failed envelopes. |
--telemetry-output PATH |
JSONL file containing per-attempt normalized telemetry. |
--summary-output PATH |
Write final summary JSON to this path. |
--final-summary / --no-final-summary |
Print the final summary JSON to stdout. Disable when another process consumes stdout. |
Resume, overwrite, and failure limits
| Option | Description |
|---|---|
--resume |
Append missing or retried rows. Successful existing envelopes are skipped by row ID. |
--overwrite |
Replace existing output. Mutually exclusive with --resume. |
--max-errors N |
Stop after this many row-level failures. Cannot be combined with --parallel. |
Parallel runs
| Option | Description |
|---|---|
--parallel N |
Run N local shard workers and merge their outputs. |
--work-dir PATH |
Directory for parallel shard outputs and resume manifests. Use a stable value for resumable parallel jobs. |
--keep-temp / --no-keep-temp |
Keep parallel shard outputs after a successful merge. |
--progress-every N |
Print progress every N processed rows per worker. |
--progress / --no-progress |
Print batch progress messages to stderr. |
--parallel cannot be combined with --sql, --sample, --max-errors, or
--export-traces.
Trace export
| Option | Description |
|---|---|
--export-traces PATH |
Directory for per-row Codex trace JSONL files and manifest.jsonl. Cannot be combined with --parallel. |
--hf-dataset REPO |
Upload exported traces to a Hugging Face dataset repository. Requires --export-traces. |
--hf-dataset-path PATH |
Path or prefix inside the Hugging Face dataset for exported traces. Requires --hf-dataset. |