This is the technical companion to the data layer guides.
For the exhaustive, field-level reference — every detection rule, parser option, and promotion step type with parameters and worked examples — see Ingestion specs (reference).

Ingestion spec model

An ingestion spec describes how to turn an incoming file into Solya’s model:
  • Detection — how the file is matched: FILENAME, COLUMNS, SHEET_NAME, FILENAME_MATCHES, HEADER_CONTAINS, with match modes ANY, ALL, COMPOSITE.
  • Parsing — how columns are read (field types, formats).
  • Promotion — the transformation pipeline. Step types include COERCE_TYPES, FILTER, DEDUPLICATE, ADD_COLUMN, RENAME_COLUMNS, SELECT_COLUMNS, DROP_COLUMNS, INJECT_VALUE, JOIN / SEQUENTIAL_JOIN / AGGREGATION_JOIN, UNPIVOT, TAXONOMY_MAPPING, ID_MAPPING_OUTPUT / ID_MAPPING_JOIN, GENERATE_ID. Write modes: MERGE, APPEND, OVERWRITE_PARTITION.
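As a rough illustration of how a promotion pipeline chains steps, the sketch below applies three of the step types listed above (RENAME_COLUMNS, FILTER, DEDUPLICATE) to in-memory rows. The parameter names (`mapping`, `column`, `equals`, `keys`) and the step dictionary shape are assumptions made for this example, not Solya's actual spec schema; see the reference for the real parameters.

```python
from typing import Any

Row = dict[str, Any]


def rename_columns(rows: list[Row], mapping: dict[str, str]) -> list[Row]:
    # Rename keys per the mapping; unmapped columns pass through unchanged.
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]


def filter_rows(rows: list[Row], column: str, equals: Any) -> list[Row]:
    # Keep only rows whose column equals the given value (hypothetical predicate form).
    return [row for row in rows if row.get(column) == equals]


def deduplicate(rows: list[Row], keys: list[str]) -> list[Row]:
    # Keep the first row seen for each combination of key columns.
    seen: set[tuple] = set()
    unique: list[Row] = []
    for row in rows:
        key = tuple(row.get(k) for k in keys)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Step type -> handler. Only three step types are modelled in this sketch.
STEP_HANDLERS = {
    "RENAME_COLUMNS": rename_columns,
    "FILTER": filter_rows,
    "DEDUPLICATE": deduplicate,
}


def run_promotion(rows: list[Row], steps: list[dict[str, Any]]) -> list[Row]:
    # Apply each step in order; the output of one step feeds the next.
    for step in steps:
        handler = STEP_HANDLERS[step["type"]]
        rows = handler(rows, **step["params"])
    return rows


rows = [
    {"Site": "A", "kwh": 10, "state": "ok"},
    {"Site": "A", "kwh": 10, "state": "ok"},
    {"Site": "B", "kwh": 7, "state": "void"},
]
steps = [
    {"type": "RENAME_COLUMNS", "params": {"mapping": {"Site": "site_id"}}},
    {"type": "FILTER", "params": {"column": "state", "equals": "ok"}},
    {"type": "DEDUPLICATE", "params": {"keys": ["site_id", "kwh"]}},
]
promoted = run_promotion(rows, steps)
```

Step order matters in a pipeline like this: the DEDUPLICATE step refers to `site_id`, which only exists once RENAME_COLUMNS has run.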
A spec has a status (DRAFT, ACTIVE, DEPRECATED, ARCHIVED), a scope (GLOBAL or ORG), a priority, and an isDeployed flag (only deployed specs apply to live files). Specs are versioned.
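The sketch below shows one way detection rules, match modes, and the isDeployed flag could combine when choosing a spec for an incoming file. Several things here are assumptions for illustration only: the rule semantics (regex for FILENAME_MATCHES, subset check for COLUMNS, substring check for HEADER_CONTAINS), the convention that a larger priority number wins, and the fact that status and scope are carried but not used for selection. COMPOSITE is left unmodelled because its semantics are not described here.

```python
import re
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Rule:
    kind: str   # e.g. "FILENAME_MATCHES", "COLUMNS", "HEADER_CONTAINS"
    value: Any


@dataclass
class Spec:
    name: str
    status: str        # DRAFT, ACTIVE, DEPRECATED, ARCHIVED
    scope: str         # GLOBAL or ORG
    priority: int
    is_deployed: bool
    match_mode: str    # ANY or ALL in this sketch
    rules: list


def rule_matches(rule: Rule, filename: str, columns: list) -> bool:
    # Assumed semantics for three rule kinds; the real ones may differ.
    if rule.kind == "FILENAME_MATCHES":
        return re.search(rule.value, filename) is not None
    if rule.kind == "COLUMNS":
        return set(rule.value) <= set(columns)
    if rule.kind == "HEADER_CONTAINS":
        return any(rule.value.lower() in c.lower() for c in columns)
    raise ValueError(f"rule kind not modelled in this sketch: {rule.kind}")


def spec_matches(spec: Spec, filename: str, columns: list) -> bool:
    results = [rule_matches(r, filename, columns) for r in spec.rules]
    if spec.match_mode == "ALL":
        return all(results)
    if spec.match_mode == "ANY":
        return any(results)
    raise ValueError(f"match mode not modelled in this sketch: {spec.match_mode}")


def select_spec(specs: list, filename: str, columns: list) -> Optional[Spec]:
    # Only deployed specs are considered, mirroring the isDeployed flag.
    candidates = [
        s for s in specs
        if s.is_deployed and spec_matches(s, filename, columns)
    ]
    # Assumption: the highest priority number wins among matching specs.
    return max(candidates, key=lambda s: s.priority, default=None)


specs = [
    Spec("generic-csv", "ACTIVE", "GLOBAL", 1, True, "ANY",
         [Rule("FILENAME_MATCHES", r"\.csv$")]),
    Spec("meter-readings", "ACTIVE", "ORG", 10, True, "ALL",
         [Rule("FILENAME_MATCHES", r"^meters_"),
          Rule("COLUMNS", ["site_id", "kwh"])]),
    Spec("meter-draft", "DRAFT", "ORG", 99, False, "ANY",
         [Rule("FILENAME_MATCHES", r"^meters_")]),
]
chosen = select_spec(specs, "meters_2024.csv", ["site_id", "kwh", "read_at"])
```

Here both deployed specs match the file, so the tie is broken by priority; the undeployed draft is ignored even though its priority is highest.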

Ingestion runs

An ingestion run moves through the statuses PENDING → RUNNING → SUCCESS / FAILED. It has a trigger type (MANUAL, SCHEDULED, API), a link to the underlying Databricks run, and structured logs at DEBUG / INFO / WARN / ERROR levels (capped at 5000 entries, with each message limited to 2000 characters). It records triggeredAt, startedAt, and completedAt.
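The run lifecycle and log caps can be modelled as a small state machine. This is a minimal sketch, not the platform's implementation: it assumes only the transitions drawn above are legal, that over-long messages are truncated, and that entries past the cap are dropped. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field

MAX_LOG_ENTRIES = 5000     # cap on entries, from the text above
MAX_MESSAGE_CHARS = 2000   # cap on message length, from the text above
LEVELS = ("DEBUG", "INFO", "WARN", "ERROR")

# Assumption: only the transitions PENDING -> RUNNING -> SUCCESS / FAILED are legal.
ALLOWED_TRANSITIONS = {
    "PENDING": {"RUNNING"},
    "RUNNING": {"SUCCESS", "FAILED"},
    "SUCCESS": set(),
    "FAILED": set(),
}


@dataclass
class IngestionRun:
    trigger_type: str                 # MANUAL, SCHEDULED, or API
    status: str = "PENDING"
    logs: list = field(default_factory=list)

    def transition(self, new_status: str) -> None:
        # Reject any move that is not in the allowed-transition table.
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(f"illegal transition {self.status} -> {new_status}")
        self.status = new_status

    def log(self, level: str, message: str) -> bool:
        if level not in LEVELS:
            raise ValueError(f"unknown log level: {level}")
        # Assumption: once the entry cap is reached, further entries are dropped.
        if len(self.logs) >= MAX_LOG_ENTRIES:
            return False
        # Assumption: messages longer than the cap are truncated, not rejected.
        self.logs.append({"level": level, "message": message[:MAX_MESSAGE_CHARS]})
        return True


run = IngestionRun(trigger_type="MANUAL")
run.transition("RUNNING")
run.log("INFO", "x" * 3000)   # stored truncated to 2000 characters
run.transition("SUCCESS")
```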

Sandbox: validate then promote

The sandbox splits ingestion into two tracked runs:
  • Sandbox ingest run — parses and validates an uploaded file into temporary tables and emits findings with a severity (error / warning / info) and a stage (detect, parse, map, loader, strict scan, quarantine). Status PENDING → RUNNING → SUCCESS / FAILED.
  • Sandbox promote run — copies validated data into silver. Same statuses, plus a terminal SUCCESS-but-aborted state (stats.aborted = true) when the ingest had errors and promotion-with-errors wasn’t allowed.
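The promote gate described above can be sketched as a single decision over the ingest findings. The function name, the `allow_errors` parameter, and the finding dictionary shape are hypothetical; only the severities, stages, and the `stats.aborted` flag come from the description.

```python
from typing import Any


def promote(findings: list[dict[str, Any]], allow_errors: bool = False) -> dict[str, Any]:
    # Count findings whose severity is "error"; warnings and info never block.
    error_count = sum(1 for f in findings if f["severity"] == "error")

    if error_count > 0 and not allow_errors:
        # Terminal outcome: the run itself succeeded, but nothing was copied.
        return {"status": "SUCCESS", "stats": {"aborted": True}}

    # The copy of validated data into silver would happen here.
    return {"status": "SUCCESS", "stats": {"aborted": False}}


findings = [
    {"severity": "warning", "stage": "parse", "message": "empty trailing column"},
    {"severity": "error", "stage": "map", "message": "unmapped required field"},
]
blocked = promote(findings)                      # errors present, not allowed
forced = promote(findings, allow_errors=True)    # errors present, explicitly allowed
```

A caller checking only `status` would treat the aborted run as a success, so in a model like this it is worth reading `stats.aborted` before assuming data reached silver.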
A spec recommendation run profiles an uploaded file and proposes a spec plus per-dataset row previews (same status set).

File ingestion records

Files dropped into the landing area are tracked with a status of UPLOADED, INGESTED, ERROR, or DELETED, and a source of APP (UI upload) or CLIENT_IMPORTER (external importer tool).
All runs are organization-scoped and observable from the UI and API, with the same structured-log model across ingestion, sandbox, tag-evaluation, and alert-evaluation runs.