
VxData API surface

Conventions:

  • Flat schemas everywhere: vxdata.schemas.{create,update,response} (AnyCreate/AnyUpdate/AnyResponse, discriminated on payload_type).
  • Resource mutations are batch-native: each verb takes one or many items, and a single item is just a 1-element collection. There are no per-id REST paths; each verb is one POST /resources/{verb}.
  • Writes never return objects (only counts). Reads return resources.
  • Soft-delete only (tombstone via is_deleted). Reads hide tombstones; restore = update with is_deleted=false on an id you already know.
  • as_of (bitemporal read) is accepted by every read; the SDK injects its client-level default unless overridden per request.
  • Errors via one app-level handler: ValueError/LookupError -> 400, missing resource -> 404.
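The batch-native and as_of conventions above can be sketched on the client side as follows. This is a minimal illustration, not the SDK's actual code: the helper names (as_batch, create_body, read_body), the "note" payload type, and the ISO 8601 encoding of as_of are all assumptions.

```python
from datetime import datetime, timezone
from typing import Optional, Union


def as_batch(items: Union[dict, str, list]) -> list:
    """Wrap a single item in a list; pass a list through unchanged."""
    return items if isinstance(items, list) else [items]


def create_body(resources: Union[dict, list]) -> dict:
    """Body for POST /resources/create: always a collection."""
    return {"resources": as_batch(resources)}


def read_body(identifiers: Union[str, list],
              as_of: Optional[datetime] = None,
              default_as_of: Optional[datetime] = None) -> dict:
    """Body for POST /resources/read; a per-request as_of wins over the default."""
    body = {"identifiers": as_batch(identifiers)}
    effective = as_of if as_of is not None else default_as_of
    if effective is not None:
        # Assumption: datetimes travel as ISO 8601 strings.
        body["as_of"] = effective.isoformat()
    return body


# Hypothetical client-level default and a single-item create.
client_default = datetime(2024, 1, 1, tzinfo=timezone.utc)
single = create_body({"identifier": "res-1", "payload_type": "note"})
pinned = read_body("res-1", default_as_of=client_default)
```

A single resource and a list of resources produce the same body shape, so callers never need a separate single-item code path.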

Resources (CRUD)

POST /resources/create
body: { resources: AnyCreate[] }
-> { created: int } # all-or-nothing; raises if any id already active

POST /resources/read
body: { identifiers: str[], as_of?: datetime, exclude_children?: bool=false }
-> AnyResponse[] # input order preserved; missing/tombstoned skipped

POST /resources/update
body: { updates: { <identifier>: AnyUpdate } }
-> { updated: int } # all-or-nothing; is_deleted toggles delete(true)/restore(false)

POST /resources/delete
body: { identifiers: str[], cascade?: bool=true }
-> { deleted: int } # soft tombstone (+ subtree when cascade)

POST /resources/restore
body: { identifiers: str[] }
-> { restored: int } # clears tombstone (inverse of delete)

AnyCreate carries resource metadata (identifier, parent_identifier, license, access_level, derived_from) plus flat payload fields. AnyUpdate has the same fields minus identifier, all optional, plus is_deleted. Only provided fields change, and a provided derived_from replaces the stored value rather than appending to it.
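Two consequences of the update semantics can be sketched as request bodies: restoring by update, and extending derived_from when the server replaces it. The helper names are hypothetical, and treating derived_from as a list of identifier strings is an assumption.

```python
from typing import Iterable


def restore_via_update(identifiers: Iterable[str]) -> dict:
    """Body for POST /resources/update that clears the tombstone on each id."""
    return {"updates": {identifier: {"is_deleted": False} for identifier in identifiers}}


def merged_derived_from(current: list, extra: list) -> list:
    """Return the full list to send, since derived_from replaces rather than appends."""
    merged = list(current)
    for identifier in extra:
        if identifier not in merged:
            merged.append(identifier)
    return merged


restore = restore_via_update(["res-1", "res-2"])

# Only derived_from is provided, so no other field of res-3 changes.
patch = {
    "updates": {
        "res-3": {"derived_from": merged_derived_from(["src-a"], ["src-b", "src-a"])}
    }
}
```

Because the value is replaced wholesale, the caller has to read the current derived_from first and send the merged list back.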

Query

POST /query
body: QueryRequest { payload_type, filters?, parent_identifier?, include_indirect_parents?,
                     identifiers?, select_columns?, sort_by="created_at", sort_order="desc",
                     cursor?, limit?, as_of? }
-> { results: AnyResponse[], next_cursor: str|null, limit: int|null }

POST /query/count
body: QueryRequest # pagination fields ignored
-> { count: int }

POST /query/parquet
body: QueryRequest
-> application/octet-stream # single parquet file
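Walking a paginated /query result can be sketched as a generator that feeds next_cursor back in as cursor. This assumes a null next_cursor means there are no further pages. The post callable is a hypothetical transport (path, body) -> decoded JSON, and fake_post below is a stand-in for a real server.

```python
from typing import Callable, Iterator

Post = Callable[[str, dict], dict]


def iter_query(post: Post, request: dict) -> Iterator[dict]:
    """Yield every result of POST /query, following next_cursor page by page."""
    cursor = request.get("cursor")
    while True:
        body = dict(request)  # never mutate the caller's request
        if cursor is not None:
            body["cursor"] = cursor
        page = post("/query", body)
        yield from page["results"]
        cursor = page["next_cursor"]
        if cursor is None:  # assumption: null cursor marks the last page
            break


# Stand-in transport serving two one-row pages.
_PAGES = {
    None: {"results": [{"identifier": "a"}], "next_cursor": "c1", "limit": 1},
    "c1": {"results": [{"identifier": "b"}], "next_cursor": None, "limit": 1},
}
calls = []


def fake_post(path: str, body: dict) -> dict:
    calls.append((path, body.get("cursor")))
    return _PAGES[body.get("cursor")]


rows = list(iter_query(fake_post, {"payload_type": "note", "limit": 1}))
```

The same QueryRequest can be sent to /query/count first to size the walk, since that endpoint ignores the pagination fields.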

Lineage

GET /parents/{identifier}?as_of=
-> str[] # ancestor identifier chain, nearest-first
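A small sketch of building the lineage request path and reading the nearest-first chain. The function names are hypothetical, and the ISO 8601 encoding of as_of is an assumption; the URL handling uses only urllib.parse from the standard library.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import quote, urlencode


def parents_path(identifier: str, as_of: Optional[datetime] = None) -> str:
    """Path for GET /parents/{identifier}, with as_of only when given."""
    path = "/parents/" + quote(identifier, safe="")
    if as_of is not None:
        # Assumption: as_of is sent as an ISO 8601 string.
        path += "?" + urlencode({"as_of": as_of.isoformat()})
    return path


def direct_parent(chain: list) -> Optional[str]:
    """Nearest-first ordering puts the direct parent at index 0."""
    return chain[0] if chain else None


def root_ancestor(chain: list) -> Optional[str]:
    """The farthest ancestor is the last element of the chain."""
    return chain[-1] if chain else None


plain = parents_path("study/1")
pinned = parents_path("res-1", datetime(2024, 1, 1, tzinfo=timezone.utc))
```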

Patients / Studies (server-side identifier allocation)

POST /patients/register body: { datasource_id, external_uid } -> { identifier } # raises if exists
POST /patients/resolve body: { datasource_id, external_uid } -> { identifier } # raises if missing
POST /studies/register body: { patient_identifier, external_uid } -> { identifier }
POST /studies/resolve body: { patient_identifier, external_uid } -> { identifier }
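Since register raises when the patient exists and resolve raises when it is missing, a caller that wants "get or create" behaviour has to combine the two. The sketch below assumes the transport surfaces a failed resolve as a Python LookupError; how errors actually reach the client is not specified here, and FakeServer is a hypothetical in-memory stand-in.

```python
from typing import Callable

Post = Callable[[str, dict], dict]


def resolve_or_register_patient(post: Post, datasource_id: str, external_uid: str) -> str:
    """Return the patient identifier, registering the patient on first sight."""
    body = {"datasource_id": datasource_id, "external_uid": external_uid}
    try:
        return post("/patients/resolve", body)["identifier"]
    except LookupError:  # assumption: a missing patient surfaces as LookupError
        return post("/patients/register", body)["identifier"]


class FakeServer:
    """In-memory stand-in that allocates identifiers server-side."""

    def __init__(self) -> None:
        self.patients: dict = {}

    def post(self, path: str, body: dict) -> dict:
        key = (body["datasource_id"], body["external_uid"])
        if path == "/patients/resolve":
            if key not in self.patients:
                raise LookupError("patient not registered")
            return {"identifier": self.patients[key]}
        if path == "/patients/register":
            if key in self.patients:
                raise ValueError("patient already registered")
            self.patients[key] = f"patient-{len(self.patients) + 1}"
            return {"identifier": self.patients[key]}
        raise ValueError(f"unknown path: {path}")


server = FakeServer()
first = resolve_or_register_patient(server.post, "ds-1", "ext-42")
second = resolve_or_register_patient(server.post, "ds-1", "ext-42")
```

Two clients can still race between the resolve and the register; in that case the losing register raises, and resolving once more returns the identifier the other client allocated.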

Artefacts

Artefacts are resources: create/read/update/delete them through /resources/* like any other payload_type.

Stats

GET /datasources/stats -> per-datasource counts
GET /datasources/detailed-stats?datasource_id= -> detailed breakdown (all datasources if omitted)

Schemas

GET /schemas
-> { <PayloadType>: <json-schema> } # all payload definitions (subsumes the old per-name + payload-types endpoints)

S3 / object storage

GET /s3/config
-> { s3_endpoint, s3_bucket }

POST /s3/presign
body: { direction: "upload"|"download", path?, filename?, group? }
-> { url, path? } # upload needs filename+group; download needs path
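The per-direction field requirements can be checked before the request is sent. A minimal sketch with a hypothetical helper name; the example file name, group, and path values are placeholders.

```python
from typing import Optional


def presign_body(direction: str, *, path: Optional[str] = None,
                 filename: Optional[str] = None, group: Optional[str] = None) -> dict:
    """Body for POST /s3/presign, validated per direction."""
    if direction == "upload":
        if filename is None or group is None:
            raise ValueError("upload needs filename and group")
        return {"direction": "upload", "filename": filename, "group": group}
    if direction == "download":
        if path is None:
            raise ValueError("download needs path")
        return {"direction": "download", "path": path}
    raise ValueError(f"direction must be 'upload' or 'download', got {direction!r}")


upload = presign_body("upload", filename="scan.bin", group="demo")
download = presign_body("download", path="demo/scan.bin")
```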

POST /s3/presign/batch
body: { direction, items: [...] } # batch of the above
-> { urls: [...] }

POST /s3/presign/upload-dir
body: { group, relpaths: str[] } # one shared {group}/{subdir} prefix, a presigned PUT per relpath
-> { root, items: [...] } # root s3:// prefix + { url, path } per input, in order
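Because items come back in input order, each relpath can be paired with its presigned URL by position. The sketch below uses a hypothetical helper name and made-up response values; it only does the pairing and leaves the actual PUT to the caller.

```python
def upload_plan(relpaths: list, response: dict) -> list:
    """Pair each requested relpath with its presigned PUT URL, by position."""
    items = response["items"]
    if len(items) != len(relpaths):
        raise ValueError("expected one item per relpath")
    return [(relpath, item["url"]) for relpath, item in zip(relpaths, items)]


# Made-up response in the documented shape: root plus { url, path } per input.
relpaths = ["a.bin", "sub/b.bin"]
response = {
    "root": "s3://bucket/demo/run-1",
    "items": [
        {"url": "https://example.invalid/put/a", "path": "demo/run-1/a.bin"},
        {"url": "https://example.invalid/put/b", "path": "demo/run-1/sub/b.bin"},
    ],
}
plan = upload_plan(relpaths, response)
```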

GET /s3/list?prefix=
-> { keys: str[] } # object keys under an s3://bucket/key prefix

GET /s3/parquet?path=&limit=100&offset=0
-> { columns, schema, total_rows, returned_rows, offset, data } # parquet preview reader
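Paging through the preview reader with limit and offset can be sketched as below. The get callable is a hypothetical transport (path, query params) -> decoded JSON, and the loop assumes returned_rows and total_rows count rows as their names suggest. The stub omits columns, schema, and data, whose shapes are not specified here.

```python
from typing import Callable, Iterator

Get = Callable[[str, dict], dict]


def iter_preview_pages(get: Get, path: str, limit: int = 100) -> Iterator[dict]:
    """Yield successive GET /s3/parquet pages until total_rows is covered."""
    offset = 0
    while True:
        page = get("/s3/parquet", {"path": path, "limit": limit, "offset": offset})
        if page["returned_rows"] == 0:  # guard against looping on an empty page
            break
        yield page
        offset += page["returned_rows"]
        if offset >= page["total_rows"]:
            break


# Stand-in transport for a 250-row file.
def fake_get(path: str, params: dict) -> dict:
    remaining = max(0, 250 - params["offset"])
    return {
        "total_rows": 250,
        "returned_rows": min(params["limit"], remaining),
        "offset": params["offset"],
    }


pages = list(iter_preview_pages(fake_get, "s3://bucket/key.parquet"))
```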