Accessing Data
How to query resources and download files from the data platform.
Querying Resources
from vxp_client import PlatformClient, F
client = PlatformClient()
# Query a specific resource type with filters
volumes = (
    client.query("Volume")
    .filter(F.volume_type == "T2", F.image_plane == "transversal")
    .limit(100)
    .collect()
)
# For convenience, you can pull all tables in full at once
dfs = client.get_all_dataframes()
patients = dfs["Patient"]
volumes = dfs["Volume"]
Explore available resource types and their fields in the frontend query tab or by looking at the payload schema docs.
For the full querying API (lineage filters, subqueries, ANY/ALL, joins), see the querying tutorial.
Downloading Files
Table data is returned directly as Polars DataFrames. Blob data (MRIs, DICOMs, PDFs) is stored on S3.
Tables may contain S3 URLs that point to blob files; for example, the Volume table has a column path_nii that holds the location of the corresponding NIfTI file.
You can either (a) download files individually by passing their S3 URLs, or (b) download every file referenced in a Polars DataFrame at once.
Download individual S3 URLs
from pathlib import Path

# Single file
local_path = client.download_files("s3://bucket/path/to/file.nii.gz", dest_dir=Path("./downloads"))
# Multiple files
local_paths = client.download_files(
    ["s3://bucket/file1.nii.gz", "s3://bucket/file2.nii.gz"],
    dest_dir=Path("./downloads"),
)
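To reason about where a downloaded file might end up, it can help to split an S3 URL into its bucket and key. The sketch below uses only the Python standard library. The helper names split_s3_url and local_target are hypothetical, and the dest_dir/<bucket>/<key> layout is an assumption made for illustration; the layout that download_files actually uses is not documented here.

```python
from pathlib import Path
from urllib.parse import urlparse


def split_s3_url(url: str) -> tuple[str, str]:
    """Split an s3:// URL into (bucket, key). Hypothetical helper."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    # netloc is the bucket name; the path (minus its leading slash) is the object key
    return parsed.netloc, parsed.path.lstrip("/")


def local_target(url: str, dest_dir: Path) -> Path:
    """Assumed layout for illustration only: dest_dir/<bucket>/<key>."""
    bucket, key = split_s3_url(url)
    return dest_dir / bucket / key
```

Mirroring the bucket and key under dest_dir keeps files with identical names from different prefixes from overwriting each other, which is why this layout is used in the sketch.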
Download all S3 URLs in a DataFrame
from pathlib import Path
volumes = client.query("Volume").filter(...).collect()  # build your dataframe of interest here
# Automatically detects S3 URL columns, downloads the files to disk in parallel,
# and replaces the S3 URLs in the dataframe with local paths
volumes_local = client.materialize_dataframe(volumes, dest_dir=Path("./downloads"))
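As a rough illustration of what detecting S3 URL columns could involve, the sketch below scans a column-to-values mapping, such as the one returned by Polars' DataFrame.to_dict(as_series=False), and reports the columns whose non-null values all start with s3://. The function find_s3_columns and the sample data are hypothetical; the detection rule that materialize_dataframe actually applies may differ.

```python
def find_s3_columns(columns: dict[str, list]) -> list[str]:
    """Return names of columns whose non-null values are all s3:// URLs.

    Hypothetical helper; not part of the client API.
    """
    found = []
    for name, values in columns.items():
        non_null = [v for v in values if v is not None]
        # Columns that are empty or entirely null are skipped
        if non_null and all(
            isinstance(v, str) and v.startswith("s3://") for v in non_null
        ):
            found.append(name)
    return found


# Made-up sample data shaped like df.to_dict(as_series=False)
sample = {
    "volume_type": ["T2", "T2"],
    "path_nii": ["s3://bucket/file1.nii.gz", None],
}
s3_columns = find_s3_columns(sample)
```

Here s3_columns contains only "path_nii", since volume_type holds plain strings and the null entry in path_nii is ignored.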