Using internal datasets

As of early July 2025, nearly all of our data is stored under /mnt/storage on the GPU cluster. The directory tree looks like this:

storage
├── abc
├── data
│   ├── clinical_trials
│   │   ├── alianca
│   │   ├── aristra
│   │   ├── bamberg
│   │   ├── basel
│   │   ├── db_backup
│   │   ├── histo_stitching
│   │   └── lund
│   ├── dicom
│   ├── histopathology
│   │   ├── lsm
│   │   ├── micro_ct
│   │   ├── public_datasets
│   │   └── wsis
│   ├── misc
│   │   ├── api
│   │   ├── docker_volumes
│   │   └── test
│   └── mri
│       ├── micro_mri
│       ├── phantom
│       ├── public_datasets
│       └── testscans
├── misc
│   └── jenna
├── projects
│   ├── artifacts
│   ├── bamberg-bids
│   ├── coverage_dashboard
│   ├── diffsim
│   ├── dvc-data-registry
│   ├── dwi_interp
│   ├── gleason_seg
│   ├── histo_stitching
│   ├── inverse
│   ├── inverse_embedding
│   ├── mismo
│   ├── vify
│   ├── vimesh
│   ├── vipolate
│   ├── viqc
│   ├── virdx_denoising
│   ├── virdx_pipeline
│   ├── vireg
│   ├── viseg
│   └── viseg_histo
└── ...

If you want to use any of our data, copy it into your relevant projects subfolder and work on the copy. Do not directly modify the data in /mnt/storage/data/clinical_trials!
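
The copy step can be sketched in Python as below. This is only an illustration, not an established team tool: the helper name `copy_dataset`, the choice to place the copy directly inside the project folder, and the requirement that the project folder already exists are all assumptions. The source is only read, never written to.

```python
import shutil
from pathlib import Path

# Storage root as described above; passed as a parameter so the helper
# can also be tried out against a scratch directory.
STORAGE_ROOT = Path("/mnt/storage")


def copy_dataset(src, project, storage_root=STORAGE_ROOT):
    """Copy a dataset directory from <storage_root>/data into
    <storage_root>/projects/<project>/<dataset name>.

    Hypothetical helper: the destination layout is an assumption.
    """
    storage_root = Path(storage_root)
    data_root = (storage_root / "data").resolve()
    src = Path(src).resolve()

    # Only accept sources that live under the shared data folder.
    # Path.is_relative_to requires Python 3.9 or newer.
    if not src.is_relative_to(data_root):
        raise ValueError(f"{src} is not inside {data_root}")

    project_dir = storage_root / "projects" / project
    if not project_dir.is_dir():
        raise FileNotFoundError(f"project folder {project_dir} does not exist")

    dest = project_dir / src.name
    # copytree raises FileExistsError if dest already exists, so an
    # earlier copy is never silently overwritten.
    shutil.copytree(src, dest)
    return dest


# Example call (not executed here):
# copy_dataset("/mnt/storage/data/clinical_trials/lund", "viseg")
```

Large datasets can take a long time to copy this way, so it may be worth copying only the subfolder you actually need.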