For the exhaustive, field-level reference — every detection rule, parser option, and
promotion step type with parameters and worked examples — see
Ingestion specs (reference).
Ingestion spec model
An ingestion spec describes how to turn an incoming file into Solya’s model:
- Detection: how the file is matched (FILENAME, COLUMNS, SHEET_NAME, FILENAME_MATCHES, HEADER_CONTAINS), with match modes ANY, ALL, COMPOSITE.
- Parsing: how columns are read (field types, formats).
- Promotion: the transformation pipeline. Step types include COERCE_TYPES, FILTER, DEDUPLICATE, ADD_COLUMN, RENAME_COLUMNS, SELECT_COLUMNS, DROP_COLUMNS, INJECT_VALUE, JOIN / SEQUENTIAL_JOIN / AGGREGATION_JOIN, UNPIVOT, TAXONOMY_MAPPING, ID_MAPPING_OUTPUT / ID_MAPPING_JOIN, GENERATE_ID. Write modes: MERGE, APPEND, OVERWRITE_PARTITION.
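To make the idea of a promotion pipeline concrete, here is a minimal sketch of how an ordered list of steps might be applied to parsed rows. It covers only three of the step types named above, and the parameter names (`mapping`, `column`, `equals`, `keys`) are hypothetical; the actual parameters are defined in the ingestion specs reference.

```python
from typing import Any

Row = dict[str, Any]


def rename_columns(rows: list[Row], mapping: dict[str, str]) -> list[Row]:
    # Rename keys found in `mapping`; leave other columns untouched.
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]


def filter_rows(rows: list[Row], column: str, equals: Any) -> list[Row]:
    # Keep only rows whose `column` equals the given value.
    return [row for row in rows if row.get(column) == equals]


def deduplicate(rows: list[Row], keys: list[str]) -> list[Row]:
    # Keep the first row seen for each combination of key columns.
    seen = set()
    out = []
    for row in rows:
        key = tuple(row.get(k) for k in keys)
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


# Hypothetical dispatch table: step type -> handler (illustration only).
STEP_HANDLERS = {
    "RENAME_COLUMNS": rename_columns,
    "FILTER": filter_rows,
    "DEDUPLICATE": deduplicate,
}


def run_pipeline(rows: list[Row], steps: list[dict]) -> list[Row]:
    # Apply each step in order; the output of one step feeds the next.
    for step in steps:
        rows = STEP_HANDLERS[step["type"]](rows, **step["params"])
    return rows


steps = [
    {"type": "RENAME_COLUMNS", "params": {"mapping": {"Cust ID": "customer_id"}}},
    {"type": "FILTER", "params": {"column": "country", "equals": "FR"}},
    {"type": "DEDUPLICATE", "params": {"keys": ["customer_id"]}},
]
rows = [
    {"Cust ID": 1, "country": "FR"},
    {"Cust ID": 1, "country": "FR"},
    {"Cust ID": 2, "country": "DE"},
]
result = run_pipeline(rows, steps)
```

Here `result` holds a single row with `customer_id` 1 and `country` FR. Step order matters: the rename runs first so that the later steps can refer to `customer_id`.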
Each spec has a status (DRAFT, ACTIVE, DEPRECATED, ARCHIVED), a scope
(GLOBAL or ORG), a priority, and an isDeployed flag (only deployed specs apply to
live files). Specs are versioned.
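The sketch below illustrates how detection and spec attributes could combine to pick a spec for an incoming file. It rests on several assumptions that the text does not confirm: FILENAME_MATCHES is treated as a glob pattern, a larger priority number wins, only ACTIVE specs are considered, and COMPOSITE mode is left out because its semantics are not described here. The class and function names are hypothetical.

```python
import fnmatch
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DetectionRule:
    type: str   # only FILENAME_MATCHES and HEADER_CONTAINS in this sketch
    value: str


@dataclass
class Spec:
    name: str
    status: str       # DRAFT, ACTIVE, DEPRECATED, ARCHIVED
    scope: str        # GLOBAL or ORG
    priority: int
    is_deployed: bool
    match_mode: str   # ANY or ALL (COMPOSITE not modelled)
    rules: list[DetectionRule] = field(default_factory=list)


def rule_matches(rule: DetectionRule, filename: str, headers: list[str]) -> bool:
    if rule.type == "FILENAME_MATCHES":
        # Assumption: glob semantics; the real matcher may use regexes.
        return fnmatch.fnmatchcase(filename, rule.value)
    if rule.type == "HEADER_CONTAINS":
        return rule.value in headers
    raise ValueError(f"rule type not covered by this sketch: {rule.type}")


def spec_matches(spec: Spec, filename: str, headers: list[str]) -> bool:
    results = [rule_matches(r, filename, headers) for r in spec.rules]
    if spec.match_mode == "ANY":
        return any(results)
    if spec.match_mode == "ALL":
        return bool(results) and all(results)
    raise ValueError(f"match mode not covered by this sketch: {spec.match_mode}")


def select_spec(specs: list[Spec], filename: str, headers: list[str]) -> Optional[Spec]:
    # Only deployed specs apply to live files; requiring ACTIVE is an assumption.
    candidates = [
        s for s in specs
        if s.is_deployed and s.status == "ACTIVE" and spec_matches(s, filename, headers)
    ]
    # Assumption: the highest priority value wins.
    return max(candidates, key=lambda s: s.priority, default=None)


specs = [
    Spec("generic-csv", "ACTIVE", "GLOBAL", 1, True, "ANY",
         [DetectionRule("FILENAME_MATCHES", "*.csv")]),
    Spec("orders", "ACTIVE", "ORG", 10, True, "ALL",
         [DetectionRule("FILENAME_MATCHES", "orders_*.csv"),
          DetectionRule("HEADER_CONTAINS", "order_id")]),
    Spec("orders-draft", "DRAFT", "ORG", 99, False, "ANY",
         [DetectionRule("FILENAME_MATCHES", "orders_*.csv")]),
]
chosen = select_spec(specs, "orders_2024.csv", ["order_id", "amount"])
```

In this example `chosen.name` is `orders`: the draft spec is skipped because it is not deployed, and the ORG-scoped spec outranks the generic one on priority.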
Ingestion runs
An ingestion run has status PENDING → RUNNING → SUCCESS / FAILED, a trigger type
(MANUAL, SCHEDULED, API), a link to the underlying Databricks run, and structured
logs at DEBUG / INFO / WARN / ERROR (capped at 5000 entries, 2000 chars per
message). It records triggeredAt, startedAt, and completedAt.
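As an illustration of the run lifecycle and log limits described above, the following sketch models a run as a small state machine. The class and method names are hypothetical, and how the caps are enforced is an assumption: here, messages are truncated to 2000 characters and entries beyond the 5000th are dropped, but the real system may behave differently (for example, by discarding the oldest entries).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

MAX_LOG_ENTRIES = 5000
MAX_MESSAGE_CHARS = 2000
LOG_LEVELS = ("DEBUG", "INFO", "WARN", "ERROR")

# Allowed transitions: PENDING -> RUNNING -> SUCCESS / FAILED.
ALLOWED = {
    "PENDING": {"RUNNING"},
    "RUNNING": {"SUCCESS", "FAILED"},
    "SUCCESS": set(),
    "FAILED": set(),
}


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class IngestionRun:
    trigger_type: str  # MANUAL, SCHEDULED, or API
    status: str = "PENDING"
    triggered_at: datetime = field(default_factory=_now)
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    logs: list = field(default_factory=list)

    def transition(self, new_status: str) -> None:
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"illegal transition {self.status} -> {new_status}")
        if new_status == "RUNNING":
            self.started_at = _now()
        else:
            self.completed_at = _now()
        self.status = new_status

    def log(self, level: str, message: str) -> bool:
        if level not in LOG_LEVELS:
            raise ValueError(f"unknown log level: {level}")
        if len(self.logs) >= MAX_LOG_ENTRIES:
            return False  # assumption: entries past the cap are dropped
        self.logs.append((level, message[:MAX_MESSAGE_CHARS]))
        return True


run = IngestionRun(trigger_type="MANUAL")
run.transition("RUNNING")
run.log("INFO", "parsing started")
run.transition("SUCCESS")
```

After these calls the run is in SUCCESS with all three timestamps set, and any further call to `transition` raises `ValueError` because SUCCESS is terminal.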
Sandbox: validate then promote
The sandbox splits ingestion into two tracked runs:
- Sandbox ingest run: parses and validates an uploaded file into temporary tables and emits findings with a severity (error/warning/info) and a stage (detect, parse, map, loader, strict scan, quarantine). Status PENDING → RUNNING → SUCCESS / FAILED.
- Sandbox promote run: copies validated data into silver. Same statuses, plus a terminal SUCCESS-but-aborted state (stats.aborted = true) when the ingest had errors and promotion-with-errors wasn’t allowed.
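The promote decision can be sketched as follows. This is a simplified illustration, not the actual implementation: the `Finding` class, the `allow_errors` parameter (standing in for "promotion-with-errors"), and the `copy_to_silver` callback are hypothetical names, and the result is reduced to a status plus the `stats.aborted` flag.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    severity: str  # error, warning, or info
    stage: str     # detect, parse, map, loader, strict scan, quarantine
    message: str


def promote(
    findings: list[Finding],
    allow_errors: bool,
    copy_to_silver: Callable[[], None],
) -> dict:
    # Abort without copying if the ingest produced errors and
    # promotion-with-errors was not allowed.
    has_errors = any(f.severity == "error" for f in findings)
    if has_errors and not allow_errors:
        return {"status": "SUCCESS", "stats": {"aborted": True}}
    copy_to_silver()
    return {"status": "SUCCESS", "stats": {"aborted": False}}


copied = []
findings = [Finding("error", "parse", "unparseable date")]
blocked = promote(findings, allow_errors=False, copy_to_silver=lambda: copied.append(1))
forced = promote(findings, allow_errors=True, copy_to_silver=lambda: copied.append(1))
```

Both calls end in SUCCESS, but only the second copies data; callers therefore need to check `stats.aborted` rather than the status alone to know whether anything reached silver.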
File ingestion records
Files dropped into the landing area are tracked with a status of UPLOADED, INGESTED,
ERROR, or DELETED, and a source of APP (UI upload) or CLIENT_IMPORTER (external
importer tool).
All runs are organization-scoped and observable from the UI and API, with the same
structured-log model across ingestion, sandbox, tag-evaluation, and alert-evaluation runs.

