Deployment

Kubernetes cluster deployments

We maintain four deployments of the data-platform, each for different purposes:

  • production is the canonical production deployment. It should only contain clean data and must not be written to except by automated, well-tested processes.
  • staging acts as the final test-bed for changes before they hit production. Otherwise, the deployment is identical to production.
  • public is a variant of the data-platform that contains only less-sensitive data, e.g. from public datasets.
  • previews is a collection of deployments, automatically spun up and maintained for each active PR. This enables rapid development and demonstration of new features and data transformations.

Both production and staging contain sensitive patient data and therefore must never be accessed by internet-connected LLMs or other public-facing processes. previews gets populated with a copy of the public instance.

production, staging and public instances store blob data on our central S3 store (as of mid-February: our MinIO on the jumphost, port 10000). preview deployments get their own MinIO instance for writing new data, but can access the central S3 store for reading existing data.
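As a sketch, a preview deployment's object-storage wiring could be expressed with two endpoints — its own MinIO for writes and the central store for reads. All variable names, addresses, and the bucket name below are illustrative assumptions, not the actual configuration:

```yaml
# Hypothetical environment fragment for a preview deployment.
# Writes go to the preview's own MinIO; reads can fall back to the central store.
S3_WRITE_ENDPOINT: http://minio.preview.svc:9000   # per-preview MinIO (address assumed)
S3_READ_ENDPOINT: http://jumphost:10000            # central S3 store (address assumed)
S3_WRITE_BUCKET: dataplatform-preview              # bucket name assumed
```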

Each type of deployment lives in its own Kubernetes namespace:

  • data-platform-production
  • data-platform-staging
  • data-platform-public
  • data-platform-previews
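A namespace of this kind can be declared with a minimal manifest like the following sketch; the real namespaces may carry additional labels or resource quotas:

```yaml
# Minimal Kubernetes namespace manifest (sketch).
apiVersion: v1
kind: Namespace
metadata:
  name: data-platform-production
```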

Production, staging, and public instances receive the same amount of resources. Each preview deployment receives fewer resources, but because multiple previews can run in parallel, the previews namespace as a whole may consume more resources than any of the other three namespaces.

In the future (not implemented now), each deployment should be reachable via a readable URL: https://production.data.virdx.dev/, https://staging.data.virdx.dev/, https://public.data.virdx.dev/, and https://pr-123.previews.data.virdx.dev/.
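Under that (planned, not yet implemented) scheme, a preview URL would be derived directly from the PR number, e.g.:

```shell
# Derive the planned preview URL for a given PR number (illustrative).
pr=123
echo "https://pr-${pr}.previews.data.virdx.dev/"
```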

Legacy, docker-compose based deployments

The platform runs as docker-compose-managed Docker containers on the jumphost (192.168.10.101).

The prod deployment is the primary system for people to use. A staging deployment is available for testing new features before they are pushed to prod. Additionally, we define a local dev deployment for development and testing. All docker-compose deployment specs live in the repo under deployments/<name>/. Every deployment has a compose.yaml file that defines its containers and services. Staging and production also have deployment-local operational scripts such as deploy.sh, backup.sh, and restore_from_s3.sh.

For both prod and staging, a directory at /opt/virdx/vxplatform exists that contains a checkout of the repo and therefore the deployment specs.

On push to the main branch of this repo, we automatically build Docker images and distribute them via the GitHub Container Registry. At its core, the data platform consists of a frontend service, an API service, and a Postgres database. Storage of file data is provided by our central MinIO deployment and not managed by this repo. Each deployment has a "target S3 bucket" - a bucket address that is provided to the client so it knows where to put files.
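A build-and-push workflow of this kind typically looks like the following sketch. Job names, action versions, and the image tag are illustrative; the actual workflow files under .github/workflows/ are authoritative:

```yaml
# Illustrative sketch of a build-and-push workflow; not the actual file.
name: build-images
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push API image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ghcr.io/virdx/vxplatform-api:latest   # tag assumed
```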

A dev deployment can be spun up for local testing on a developer's machine.

Production Deployment

deployments/frankfurt_prod/compose.yaml:

  • Defines three services:
    • frontend at port 2700,
    • api at port 2701,
    • postgres at port 2702.
  • Uses the GHCR images ghcr.io/virdx/vxplatform-api:<tag> and ghcr.io/virdx/vxplatform-frontend:<tag>.
  • The postgres container's data is stored in a mounted disk location.
  • Defines the target S3 data destination to be the central MinIO instance in the dataplatform-prod bucket.
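Put together, the bullets above correspond to a compose file shaped roughly like this sketch. Image tags, the Postgres version, the mount path, and env-file wiring are assumptions:

```yaml
# Sketch of deployments/frankfurt_prod/compose.yaml; details are illustrative.
services:
  frontend:
    image: ghcr.io/virdx/vxplatform-frontend:latest  # tag assumed
    ports: ["2700:2700"]
  api:
    image: ghcr.io/virdx/vxplatform-api:latest       # tag assumed
    ports: ["2701:2701"]
    env_file: [.env.secrets]
  postgres:
    image: postgres:16                               # version assumed
    ports: ["2702:5432"]
    volumes:
      - /mnt/pgdata:/var/lib/postgresql/data         # mounted disk location (path assumed)
    env_file: [.env.secrets]
```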

deployments/frankfurt_prod/.env.secrets:

  • Contains the secrets for the postgres database as well as S3 access credentials (required by the frontend to visualize files).

How can we trigger deployment? The GitHub action at .github/workflows/deploy_frankfurt_prod.yaml allows manual triggering of a deployment. The workflow calls deployments/frankfurt_prod/deploy.sh.
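The manual-trigger pattern looks roughly like this sketch (the actual workflow file is authoritative):

```yaml
# Illustrative shape of a manually triggered deploy workflow; not the actual file.
name: deploy-frankfurt-prod
on:
  workflow_dispatch: {}   # manual trigger from the GitHub Actions tab
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: deployments/frankfurt_prod/deploy.sh
```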

How are backups created? There is currently no dedicated production backup workflow in this repo. Operational backup automation currently lives under deployments/frankfurt_staging/ and .github/workflows/backup_frankfurt_staging.yaml.

Staging Deployment

The staging deployment is nearly identical to production, with the following differences:

  • different set of ports: frontend at 2800, api at 2801, postgres at 2802
  • stores Postgres data in a named volume - there is no need to expose the Postgres data on disk here
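The named-volume variant of the Postgres service could look like this sketch (image version and mount path are assumptions):

```yaml
# Sketch of the staging Postgres service using a named volume; details assumed.
services:
  postgres:
    image: postgres:16          # version assumed
    ports: ["2802:5432"]
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata: {}                    # named volume; Docker manages its on-disk location
```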

The purpose of the staging deployment is to provide production-like data, but to allow for more frequent updates in order to test new features before pushing them to prod.

How can we trigger deployment? We re-deploy automatically off of main branch pushes. Additionally, the GitHub action at .github/workflows/deploy_frankfurt_staging.yaml allows manual triggering of a deployment. The workflow calls deployments/frankfurt_staging/deploy.sh.
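The trigger section of such a workflow combines both mechanisms; as a sketch (the actual workflow file is authoritative):

```yaml
# Illustrative trigger block: auto-deploy on main pushes, plus manual dispatch.
on:
  push:
    branches: [main]
  workflow_dispatch: {}
```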

How do we populate staging with data? We can trigger the GitHub action at .github/workflows/restore_frankfurt_staging.yaml to populate the staging database from a production backup. This will effectively overwrite any existing data in staging with a copy of production data. References to objects on S3 that are present in the production data will continue to work, because the restored records still point to objects in the dataplatform-prod bucket and staging has read access. The workflow calls deployments/frankfurt_staging/restore_from_s3.sh.

How are backups created? The GitHub action at .github/workflows/backup_frankfurt_staging.yaml creates routine backups by calling deployments/frankfurt_staging/backup.sh.

How does this not contaminate prod? The staging database is fully isolated from the production database. Any new files uploaded via staging go to the dataplatform-dev bucket, not dataplatform-prod.

Dev Deployment

The deployments/dev/compose.yaml file defines the local dev deployment that can be run on a developer's machine. Users define target ports for the services via a .env file. Containers are built from local code state rather than GHCR images. This allows for hot reloads.
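A service built from local sources rather than a GHCR image could look like this sketch. The Dockerfile path, bind-mount path, and the API_PORT variable name are assumptions:

```yaml
# Sketch of a dev service built from the local checkout; details assumed.
services:
  api:
    build:
      context: .
      dockerfile: api/Dockerfile   # path assumed
    ports: ["${API_PORT}:2701"]    # target port taken from the .env file
    volumes:
      - ./api:/app                 # bind mount enabling hot reloads (path assumed)
```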

A MinIO instance is also spawned as part of this deployment.

Run just deploy from the repo root to get started. Run just import-backup-s3 --access-key=... --secret-key=... to populate this deployment's postgres from a backup.
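For orientation, the deploy recipe might look roughly like this sketch; the real recipe in the repo's justfile is authoritative:

```
# Illustrative justfile recipe; not the actual file.
deploy:
    docker compose -f deployments/dev/compose.yaml up --build
```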

just is intentionally limited to local development. Staging and production operations are handled via deployment-specific scripts and GitHub Actions.