VxData API surface

Conventions:

  • Flat schemas everywhere: vxdata.schemas.{create,update,response} (AnyCreate/AnyUpdate/AnyResponse, discriminated on payload_type).
  • Resource mutations are batch-native: each verb takes one or many items, and a single item is just a 1-element collection. There are no per-id REST paths, only one POST /resources/{verb} per verb.
  • Writes never return objects (only counts). Reads return resources.
  • Soft-delete only (tombstone via is_deleted). Reads hide tombstones; restore = update with is_deleted=false on an id you already know.
  • as_of (bitemporal read) is accepted by every read; the SDK injects its client-level default unless overridden per request.
  • Errors go through one app-level handler: ValueError/LookupError -> 400, missing resource -> 404.
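
The batch-native and as_of conventions above can be illustrated on the client side. This is a minimal sketch, not the VxData SDK: as_batch, create_body and read_body are hypothetical helper names, "note" is a made-up payload_type, and serializing as_of as an ISO 8601 string is an assumption.

```python
from datetime import datetime, timezone
from typing import Any, Mapping, Optional

def as_batch(item_or_items: Any) -> list:
    """Hypothetical helper: a single mapping becomes a 1-element list."""
    if isinstance(item_or_items, Mapping):
        return [item_or_items]
    return list(item_or_items)

def create_body(resources: Any) -> dict:
    """Build the body for POST /resources/create from one or many items."""
    return {"resources": as_batch(resources)}

def read_body(identifiers: Any,
              client_as_of: Optional[datetime] = None,
              as_of: Optional[datetime] = None) -> dict:
    """Build the body for POST /resources/read.

    A per-request as_of wins over the client-level default.
    """
    if isinstance(identifiers, str):
        identifiers = [identifiers]
    body = {"identifiers": list(identifiers)}
    effective = as_of if as_of is not None else client_as_of
    if effective is not None:
        # Assumption: datetimes travel as ISO 8601 strings.
        body["as_of"] = effective.isoformat()
    return body

single = create_body({"identifier": "res-1", "payload_type": "note"})
default_as_of = datetime(2024, 1, 1, tzinfo=timezone.utc)
override = datetime(2023, 6, 1, tzinfo=timezone.utc)
pinned = read_body("res-1", client_as_of=default_as_of)
overridden = read_body(["res-1", "res-2"], client_as_of=default_as_of, as_of=override)
```

Normalizing to a list in one place keeps every verb on the same code path, which is the point of the one-or-many convention.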

Resources (CRUD)

POST /resources/create
body: { resources: AnyCreate[] }
-> { created: int } # all-or-nothing; raises if any id already active

POST /resources/read
body: { identifiers: str[], as_of?: datetime, exclude_children?: bool=false }
-> AnyResponse[] # input order preserved; missing/tombstoned skipped

POST /resources/update
body: { updates: { <identifier>: AnyUpdate } }
-> { updated: int } # all-or-nothing; is_deleted toggles delete(true)/restore(false)

POST /resources/delete
body: { identifiers: str[], cascade?: bool=true }
-> { deleted: int } # soft tombstone (+ subtree when cascade)
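
The read and delete semantics above (input order preserved, missing or tombstoned identifiers skipped, cascade tombstoning the subtree) can be modelled in memory. This is an illustrative sketch only: the dict-based store and the function names are assumptions, and soft_delete returns the set of tombstoned identifiers rather than a count, because the source does not say whether the reported count includes descendants.

```python
from typing import Dict, Iterable, List, Set

Store = Dict[str, dict]  # identifier -> resource (hypothetical in-memory model)

def read(store: Store, identifiers: Iterable[str]) -> List[dict]:
    """Return active resources in input order; skip missing and tombstoned ids."""
    found = []
    for identifier in identifiers:
        resource = store.get(identifier)
        if resource is None or resource.get("is_deleted"):
            continue
        found.append(resource)
    return found

def soft_delete(store: Store, identifiers: Iterable[str], cascade: bool = True) -> Set[str]:
    """Tombstone the given ids, plus all descendants when cascade is true."""
    targets = set(identifiers)
    if cascade:
        frontier = list(targets)
        while frontier:
            parent = frontier.pop()
            for identifier, resource in store.items():
                if resource.get("parent_identifier") == parent and identifier not in targets:
                    targets.add(identifier)
                    frontier.append(identifier)
    tombstoned = set()
    for identifier in targets:
        resource = store.get(identifier)
        if resource is not None and not resource.get("is_deleted"):
            resource["is_deleted"] = True  # soft delete: the row stays in the store
            tombstoned.add(identifier)
    return tombstoned

store = {
    "a": {"identifier": "a", "parent_identifier": None},
    "b": {"identifier": "b", "parent_identifier": "a"},
    "c": {"identifier": "c", "parent_identifier": "b"},
    "d": {"identifier": "d", "parent_identifier": None},
}
before = [r["identifier"] for r in read(store, ["c", "missing", "a"])]
removed = soft_delete(store, ["a"])
after = [r["identifier"] for r in read(store, ["d", "a", "c"])]
```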

AnyCreate carries resource metadata (identifier, parent_identifier, license, access_level, derived_from) plus flat payload fields. AnyUpdate has the same fields minus identifier, all optional, plus is_deleted; only the fields provided change, and a provided derived_from replaces the stored value rather than appending to it.
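
The partial-update rule can be sketched as a plain dictionary merge. This is a model of the described behaviour, not the server implementation: apply_update is a hypothetical name, the field values are made up, and treating derived_from as a list is an assumption. The source does not say whether an explicit null clears a field, so the sketch does not model that case.

```python
def apply_update(resource: dict, update: dict) -> dict:
    """Return a copy of resource with only the provided fields changed."""
    merged = dict(resource)
    for field, value in update.items():
        # Whole-value replacement: a provided derived_from replaces the
        # stored value instead of being appended to it.
        merged[field] = value
    return merged

stored = {
    "identifier": "res-1",
    "license": "cc-by",            # hypothetical value
    "access_level": "internal",    # hypothetical value
    "derived_from": ["res-0"],
    "is_deleted": False,
}

updated = apply_update(stored, {"derived_from": ["res-9"]})
tombstoned = apply_update(updated, {"is_deleted": True})    # delete
restored = apply_update(tombstoned, {"is_deleted": False})  # restore
```

Delete and restore are the same operation with different is_deleted values, which is why restore needs an identifier you already know: reads hide tombstones.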

Query

POST /query
body: QueryRequest { payload_type, filters?, parent_identifier?, include_indirect_parents?,
identifiers?, select_columns?, sort_by="created_at", sort_order="desc",
cursor?, limit?, as_of? }
-> { results: AnyResponse[], next_cursor: str|null, limit: int|null }

POST /query/count
body: QueryRequest # pagination fields ignored
-> { count: int }

POST /query/parquet
body: QueryRequest
-> application/octet-stream # single parquet file
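
Paging through POST /query means following next_cursor until it comes back null. This is a minimal sketch under stated assumptions: iter_query is a hypothetical helper, post stands for any callable that sends a JSON body to a path and returns the decoded response, and fake_post is an in-memory stand-in whose cursor encoding (a stringified offset) is invented for the demo. Real cursors should be treated as opaque.

```python
from typing import Callable, Iterator

def iter_query(post: Callable[[str, dict], dict], request: dict) -> Iterator[dict]:
    """Yield every result of a QueryRequest by following next_cursor."""
    cursor = request.get("cursor")
    while True:
        body = dict(request)
        if cursor is not None:
            body["cursor"] = cursor
        page = post("/query", body)
        yield from page["results"]
        cursor = page["next_cursor"]
        if cursor is None:  # null next_cursor marks the last page
            return

# In-memory stand-in for the server, for demonstration only.
ROWS = [{"identifier": f"res-{i}"} for i in range(5)]

def fake_post(path: str, body: dict) -> dict:
    start = int(body.get("cursor") or 0)
    limit = body.get("limit") or 2
    end = start + limit
    return {
        "results": ROWS[start:end],
        "next_cursor": str(end) if end < len(ROWS) else None,
        "limit": limit,
    }

all_rows = list(iter_query(fake_post, {"payload_type": "note", "limit": 2}))
```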

Lineage

GET /children?parent_identifier=&with_payloads=false&as_of=
-> TreeNode[] # immediate children (roots if no parent);
# child counts + optional payload preview (not flat AnyResponse)

GET /parents/{identifier}?as_of=
-> str[] # ancestor identifier chain, nearest-first
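
The nearest-first ordering returned by GET /parents/{identifier} can be illustrated with a walk up a parent map. This sketch assumes a hypothetical in-memory mapping from identifier to parent_identifier; the identifiers are made up, and the cycle check is a safety measure of the sketch, not something the source describes.

```python
from typing import Dict, List, Optional

def ancestor_chain(parent_of: Dict[str, Optional[str]], identifier: str) -> List[str]:
    """Return ancestor identifiers, nearest first (direct parent, then grandparent, ...)."""
    chain: List[str] = []
    seen = {identifier}
    current = parent_of.get(identifier)
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in lineage at {current!r}")
        chain.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return chain

# Hypothetical lineage: patient-1 <- study-1 <- series-1
parent_of = {
    "patient-1": None,
    "study-1": "patient-1",
    "series-1": "study-1",
}
chain = ancestor_chain(parent_of, "series-1")
```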

Patients / Studies (server-side identifier allocation)

POST /patients/register body: { datasource_id, external_uid } -> { identifier } # raises if exists
POST /patients/resolve body: { datasource_id, external_uid } -> { identifier } # raises if missing
POST /studies/register body: { patient_identifier, external_uid } -> { identifier }
POST /studies/resolve body: { patient_identifier, external_uid } -> { identifier }
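
Because register raises when the key exists and resolve raises when it is missing, callers that want get-or-create behaviour have to combine the two. The sketch below models that with a hypothetical in-memory registry; the class, the identifier format and the exception types (ValueError for an existing key, LookupError for a missing one) are all assumptions chosen for illustration.

```python
class InMemoryPatients:
    """Hypothetical stand-in for /patients/register and /patients/resolve."""

    def __init__(self) -> None:
        self._ids = {}

    def register(self, datasource_id: str, external_uid: str) -> str:
        key = (datasource_id, external_uid)
        if key in self._ids:
            raise ValueError(f"already registered: {key}")
        identifier = f"patient-{len(self._ids) + 1}"  # made-up identifier format
        self._ids[key] = identifier
        return identifier

    def resolve(self, datasource_id: str, external_uid: str) -> str:
        key = (datasource_id, external_uid)
        if key not in self._ids:
            raise LookupError(f"not registered: {key}")
        return self._ids[key]

def resolve_or_register(patients: InMemoryPatients, datasource_id: str, external_uid: str) -> str:
    """Resolve first; register only when the key is unknown."""
    try:
        return patients.resolve(datasource_id, external_uid)
    except LookupError:
        return patients.register(datasource_id, external_uid)

patients = InMemoryPatients()
first = resolve_or_register(patients, "ds-1", "ext-42")
second = resolve_or_register(patients, "ds-1", "ext-42")
```

Against a real server, two clients could both miss on resolve and race on register, so a caller may need to retry resolve when register fails.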

Artefacts (ergonomic only; artefact CRUD goes through /resources/* — see TODO.md)

GET /artefacts/latest?repo=&branch=&path=
-> ArtefactResponse # latest successful artefact for repo/branch/path

Stats

GET /datasources/stats -> per-datasource counts
GET /datasources/detailed-stats?datasource_id= -> detailed breakdown (all datasources if omitted)

Schemas

GET /schemas
-> { <PayloadType>: <json-schema> } # all payload definitions (subsumes the old per-name + payload-types endpoints)

S3 / object storage

GET /s3/config
-> { s3_endpoint, s3_bucket }

POST /s3/presign
body: { direction: "upload"|"download", path?, filename?, group? }
-> { url, path? } # upload needs filename+group; download needs path
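
The per-direction requirements of POST /s3/presign (upload needs filename and group, download needs path) can be checked before the request is sent. This is a minimal client-side sketch: presign_body is a hypothetical helper, and the file, group and path values are invented.

```python
from typing import Optional

def presign_body(direction: str,
                 path: Optional[str] = None,
                 filename: Optional[str] = None,
                 group: Optional[str] = None) -> dict:
    """Validate and build the body for POST /s3/presign."""
    if direction == "upload":
        if not filename or not group:
            raise ValueError("upload needs filename and group")
    elif direction == "download":
        if not path:
            raise ValueError("download needs path")
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    body = {"direction": direction}
    # Only send the optional fields that were actually provided.
    for key, value in (("path", path), ("filename", filename), ("group", group)):
        if value is not None:
            body[key] = value
    return body

upload = presign_body("upload", filename="scan.parquet", group="imaging")
download = presign_body("download", path="imaging/scan.parquet")
```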

POST /s3/presign/batch
body: { direction, items: [...] } # batch of the above
-> { urls: [...] }

GET /s3/parquet?path=&limit=100&offset=0
-> { columns, schema, total_rows, returned_rows, offset, data } # parquet preview reader
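
The preview reader pages with limit and offset, so a full pass advances offset by returned_rows until total_rows is reached. The sketch below assumes a hypothetical fetch(path, limit, offset) callable that returns the decoded response; fake_fetch is an in-memory stand-in, and the shape it gives data (a list of row numbers) is invented, since the source does not specify it.

```python
from typing import Callable, Iterator

def iter_preview_pages(fetch: Callable[[str, int, int], dict],
                       path: str, limit: int = 100) -> Iterator[dict]:
    """Yield successive preview pages for one parquet file."""
    offset = 0
    while True:
        page = fetch(path, limit, offset)
        if page["returned_rows"] == 0:
            return
        yield page
        offset += page["returned_rows"]
        if offset >= page["total_rows"]:
            return

TOTAL_ROWS = 250  # size of the pretend file

def fake_fetch(path: str, limit: int, offset: int) -> dict:
    """Stand-in for GET /s3/parquet; only the paging fields are realistic."""
    returned = max(0, min(limit, TOTAL_ROWS - offset))
    return {
        "total_rows": TOTAL_ROWS,
        "returned_rows": returned,
        "offset": offset,
        "data": list(range(offset, offset + returned)),
    }

pages = list(iter_preview_pages(fake_fetch, "imaging/scan.parquet"))
page_sizes = [p["returned_rows"] for p in pages]
page_offsets = [p["offset"] for p in pages]
```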

Admin

POST /admin/materialized-views/refresh
-> { message } # force lineage MV refresh (Postgres only)