VxData API surface
Conventions:
- Flat schemas everywhere: vxdata.schemas.{create,update,response} (AnyCreate/AnyUpdate/AnyResponse, discriminated on payload_type).
- Resource mutations are batch-native: each verb takes one-or-many; a single item is just a 1-element collection. No per-id REST paths; one POST /resources/{verb}.
- Writes never return objects (only counts). Reads return resources.
- Soft-delete only (tombstone via is_deleted). Reads hide tombstones; restore = update with is_deleted=false on an id you already know.
- as_of (bitemporal read) is accepted by every read; the SDK injects its client-level default unless overridden per request.
- Errors via one app-level handler: ValueError/LookupError -> 400, missing resource -> 404.
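Two of the conventions above (one-or-many inputs and the as_of default) can be sketched client-side as follows. This is a minimal illustration, not the SDK's actual code: the helper names as_batch and inject_as_of are hypothetical, and the timestamps are made-up sample values.

```python
from typing import Any, Optional


def as_batch(value: Any) -> list:
    """Normalise one-or-many input into a list (hypothetical helper)."""
    # A str, bytes, or dict counts as a single item, not as an iterable of items.
    if isinstance(value, (str, bytes, dict)):
        return [value]
    try:
        return list(value)
    except TypeError:
        # Not iterable at all: treat it as a single item.
        return [value]


def inject_as_of(body: dict, client_default: Optional[str]) -> dict:
    """Return a copy of body with as_of filled in from the client default.

    A per-request as_of always wins; with no default, nothing is added.
    """
    if "as_of" in body or client_default is None:
        return dict(body)
    return {**body, "as_of": client_default}


# Sample usage with made-up values.
read_body = inject_as_of({"identifiers": as_batch("res-1")}, "2024-01-01T00:00:00Z")
print(read_body)
```

Both helpers return new objects rather than mutating their inputs, so a request body can be reused across calls.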
Resources (CRUD)
POST /resources/create
body: { resources: AnyCreate[] }
-> { created: int } # all-or-nothing; raises if any id already active
POST /resources/read
body: { identifiers: str[], as_of?: datetime, exclude_children?: bool=false }
-> AnyResponse[] # input order preserved; missing/tombstoned skipped
POST /resources/update
body: { updates: { <identifier>: AnyUpdate } }
-> { updated: int } # all-or-nothing; is_deleted toggles delete(true)/restore(false)
POST /resources/delete
body: { identifiers: str[], cascade?: bool=true }
-> { deleted: int } # soft tombstone (+ subtree when cascade)
AnyCreate carries resource metadata (identifier, parent_identifier, license, access_level, derived_from) + flat payload fields. AnyUpdate is the same minus identifier, all optional, plus is_deleted; only provided fields change (derived_from replaces, not appends).
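The request bodies for the four verbs above could be assembled roughly as shown here. The builder function names are hypothetical, and the sample resource (identifier "res-1", payload_type "note", field "text") is invented for illustration; real payload fields come from GET /schemas.

```python
import json
from typing import Iterable, Mapping


def create_body(resources: Iterable[dict]) -> dict:
    # POST /resources/create expects { resources: AnyCreate[] }.
    return {"resources": list(resources)}


def update_body(updates: Mapping[str, dict]) -> dict:
    # POST /resources/update expects { updates: { <identifier>: AnyUpdate } }.
    return {"updates": dict(updates)}


def restore_body(identifiers: Iterable[str]) -> dict:
    # Restore is an update that sets is_deleted=false on known identifiers.
    return {"updates": {ident: {"is_deleted": False} for ident in identifiers}}


def delete_body(identifiers: Iterable[str], cascade: bool = True) -> dict:
    # POST /resources/delete expects { identifiers: str[], cascade?: bool=true }.
    return {"identifiers": list(identifiers), "cascade": cascade}


# Hypothetical sample resource; the payload fields are made up.
sample = {"identifier": "res-1", "payload_type": "note", "text": "hello"}
print(json.dumps(create_body([sample])))
print(json.dumps(restore_body(["res-1"])))
```

Because only provided fields change on update, the restore body carries is_deleted and nothing else.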
Query
POST /query
body: QueryRequest { payload_type, filters?, parent_identifier?, include_indirect_parents?,
identifiers?, select_columns?, sort_by="created_at", sort_order="desc",
cursor?, limit?, as_of? }
-> { results: AnyResponse[], next_cursor: str|null, limit: int|null }
POST /query/count
body: QueryRequest # pagination fields ignored
-> { count: int }
POST /query/parquet
body: QueryRequest
-> application/octet-stream # single parquet file
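A cursor-following loop over POST /query might look like the sketch below. It assumes that a null next_cursor marks the last page and that the cursor is passed back verbatim in the next request; neither is stated explicitly above. The post callable, iter_query, fake_post, and the payload_type value "note" are all hypothetical.

```python
from typing import Callable, Iterator


def iter_query(post: Callable[[str, dict], dict], request: dict) -> Iterator[dict]:
    """Yield every result of a QueryRequest, following next_cursor.

    `post` is any callable that sends a JSON body to a path and returns
    the decoded JSON response (an assumption; plug in your HTTP client).
    """
    body = dict(request)
    while True:
        page = post("/query", body)
        yield from page["results"]
        cursor = page.get("next_cursor")
        if cursor is None:
            # Assumed: a null cursor means there are no further pages.
            return
        body = {**body, "cursor": cursor}


def fake_post(path: str, body: dict) -> dict:
    # In-memory stand-in for the server: two pages keyed by cursor.
    pages = {
        None: {"results": [{"identifier": "a"}], "next_cursor": "c1", "limit": 1},
        "c1": {"results": [{"identifier": "b"}], "next_cursor": None, "limit": 1},
    }
    return pages[body.get("cursor")]


rows = list(iter_query(fake_post, {"payload_type": "note", "limit": 1}))
print(rows)
```

Injecting the transport keeps the loop testable without a running server.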
Lineage
GET /children?parent_identifier=&with_payloads=false&as_of=
-> TreeNode[] # immediate children (roots if no parent);
# child counts + optional payload preview (not flat AnyResponse)
GET /parents/{identifier}?as_of=
-> str[] # ancestor identifier chain, nearest-first
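The lineage URLs can be built with the standard library as sketched here. The function names and base URL are hypothetical. Serialising with_payloads as the strings "true"/"false" and percent-encoding the identifier in the path are assumptions about what the server accepts.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def children_url(base: str, parent_identifier: Optional[str] = None,
                 with_payloads: bool = False, as_of: Optional[str] = None) -> str:
    # Omitting parent_identifier asks for roots, per the listing above.
    params = {"with_payloads": "true" if with_payloads else "false"}
    if parent_identifier is not None:
        params["parent_identifier"] = parent_identifier
    if as_of is not None:
        params["as_of"] = as_of
    return f"{base}/children?{urlencode(params)}"


def parents_url(base: str, identifier: str, as_of: Optional[str] = None) -> str:
    # Percent-encode the identifier so it stays a single path segment.
    url = f"{base}/parents/{quote(identifier, safe='')}"
    if as_of is not None:
        url += "?" + urlencode({"as_of": as_of})
    return url


# Hypothetical base URL and identifier.
print(children_url("http://localhost:8000"))
print(parents_url("http://localhost:8000", "res-1", as_of="2024-01-01T00:00:00Z"))
```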
Patients / Studies (server-side identifier allocation)
POST /patients/register body: { datasource_id, external_uid } -> { identifier } # raises if exists
POST /patients/resolve body: { datasource_id, external_uid } -> { identifier } # raises if missing
POST /studies/register body: { patient_identifier, external_uid } -> { identifier }
POST /studies/resolve body: { patient_identifier, external_uid } -> { identifier }
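Since register raises when the patient exists and resolve raises when it is missing, a caller that wants "get or create" behaviour has to combine the two. The sketch below shows one way. ApiError, FakeServer, and the "patient-N" identifier format are hypothetical stand-ins, and how failures actually surface to a client (status code, exception type) is not specified above.

```python
from typing import Callable


class ApiError(Exception):
    """Hypothetical client-side error raised when the server rejects a call."""


def resolve_or_register(post: Callable[[str, dict], dict],
                        datasource_id: str, external_uid: str) -> str:
    body = {"datasource_id": datasource_id, "external_uid": external_uid}
    try:
        return post("/patients/resolve", body)["identifier"]
    except ApiError:
        # Assumed: the failure means "missing". If another client registers
        # the same patient between these two calls, register raises too.
        return post("/patients/register", body)["identifier"]


class FakeServer:
    """In-memory stand-in that mimics the raise-if-exists/missing rules."""

    def __init__(self) -> None:
        self.patients: dict = {}

    def post(self, path: str, body: dict) -> dict:
        key = (body["datasource_id"], body["external_uid"])
        if path == "/patients/resolve":
            if key not in self.patients:
                raise ApiError("missing")
            return {"identifier": self.patients[key]}
        if path == "/patients/register":
            if key in self.patients:
                raise ApiError("exists")
            # Made-up identifier format; the real server allocates its own.
            self.patients[key] = f"patient-{len(self.patients) + 1}"
            return {"identifier": self.patients[key]}
        raise ApiError(f"unknown path {path}")


server = FakeServer()
first = resolve_or_register(server.post, "ds-1", "uid-42")   # registers
second = resolve_or_register(server.post, "ds-1", "uid-42")  # resolves
print(first, second)
```

The same pattern would apply to /studies/* with patient_identifier in place of datasource_id.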
Artefacts (ergonomic only; artefact CRUD goes through /resources/* — see TODO.md)
GET /artefacts/latest?repo=&branch=&path=
-> ArtefactResponse # latest successful artefact for repo/branch/path
Stats
GET /datasources/stats -> per-datasource counts
GET /datasources/detailed-stats?datasource_id= -> detailed breakdown (all datasources if omitted)
Schemas
GET /schemas
-> { <PayloadType>: <json-schema> } # all payload definitions (subsumes the old per-name + payload-types endpoints)
S3 / object storage
GET /s3/config
-> { s3_endpoint, s3_bucket }
POST /s3/presign
body: { direction: "upload"|"download", path?, filename?, group? }
-> { url, path? } # upload needs filename+group; download needs path
POST /s3/presign/batch
body: { direction, items: [...] } # batch of the above
-> { urls: [...] }
GET /s3/parquet?path=&limit=100&offset=0
-> { columns, schema, total_rows, returned_rows, offset, data } # parquet preview reader
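The per-direction requirements of POST /s3/presign (upload needs filename and group, download needs path) can be checked before the request is sent. The sketch below is a hypothetical client-side helper. It sends only the fields each direction requires, which is a simplification, and the sample values are made up.

```python
from typing import Optional


def presign_body(direction: str, path: Optional[str] = None,
                 filename: Optional[str] = None,
                 group: Optional[str] = None) -> dict:
    """Build a body for POST /s3/presign, validating required fields."""
    if direction == "upload":
        if filename is None or group is None:
            raise ValueError("upload requires filename and group")
        return {"direction": "upload", "filename": filename, "group": group}
    if direction == "download":
        if path is None:
            raise ValueError("download requires path")
        return {"direction": "download", "path": path}
    raise ValueError(f"unknown direction: {direction!r}")


# Made-up sample values.
print(presign_body("upload", filename="scan.parquet", group="imaging"))
print(presign_body("download", path="imaging/scan.parquet"))
```

Failing early with ValueError mirrors the server-side convention of rejecting bad input, though the exact server response for a malformed presign request is not described above.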
Admin
POST /admin/materialized-views/refresh
-> { message } # force lineage MV refresh (Postgres only)