Module study_vify_inference

Class VifyPipelineDataModel

Represents the data model for the Vify pipeline, handling study data processing for inference.

This class encapsulates the data components required to run inference in the Vify pipeline: imaging data, masks, clinical information, and PSA-related calculations.

Attributes:

  • data_dict (dict): Dictionary containing the imaging data extracted from the study.
  • mask_dict (dict): Dictionary containing the mask data associated with the images.
  • clinical_dict (dict): Dictionary containing patient clinical data, including study ID, PSA value, and age.
  • psa_dict (dict): Dictionary containing calculated PSA density values.

Class Methods: from_study(study: Study, config: VifyPipelineConfig) -> list[VifyPipelineDataModel]: Factory method that creates data model instances from a study object.

Args:

  • study (Study): The study object containing patient and imaging data.
  • config (VifyPipelineConfig): Configuration object specifying pipeline parameters.

Returns: list[VifyPipelineDataModel]: List containing a single VifyPipelineDataModel instance.

Raises: ValueError: If either data_dict or mask_dict fails to load.
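
The factory behaviour described above (a single-element list on success, ValueError when either dictionary is missing) can be sketched as follows. This is a minimal illustration, not the real implementation: ToyDataModel is a hypothetical stand-in for VifyPipelineDataModel, the study is a plain dict rather than a Study object, the config argument is omitted, and the key names "images" and "masks" are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ToyDataModel:
    """Hypothetical stand-in for VifyPipelineDataModel (not the real class)."""

    data_dict: dict
    mask_dict: dict
    clinical_dict: dict
    psa_dict: dict

    @classmethod
    def from_study(cls, study: dict) -> list[ToyDataModel]:
        # Assumed keys: the real method reads from a Study object instead.
        data_dict = study.get("images")
        mask_dict = study.get("masks")
        # Mirror the documented contract: fail if either dictionary is missing.
        if not data_dict or not mask_dict:
            raise ValueError("data_dict or mask_dict failed to load")
        clinical_dict = {key: study.get(key) for key in ("study_id", "psa", "age")}
        # PSA density calculation is left out of this sketch.
        return [cls(data_dict, mask_dict, clinical_dict, psa_dict={})]
```

Returning a list even for a single instance keeps the return type uniform for callers that iterate over data models.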

Note: This class inherits from BasePipelineDataModel and is specifically designed to work with the feature extraction process in the Vify pipeline workflow.

Class VifyPipelineStep

Pipeline step for running Vify inference on prostate MRI studies.

This class handles the complete inference pipeline for the Vify model, including:

  • Loading and preprocessing imaging data
  • Feature extraction from prostate MRI images
  • Model prediction for lesion detection
  • Extraction of global prostate and lesion-specific features

Inherits from BasePipelineStep, the base class that provides the pipeline step interface.

The class requires a VifyPipelineConfig object at initialization, which specifies:

  • Model parameters and paths
  • Data processing settings
  • Feature extraction configuration
  • Inference parameters

Method VifyPipelineStep.extract_global_features

extract_global_features(self, input: VifyPipelineDataModel) -> dict

Extract global features from the input data model.

This method calculates various prostate-related measurements and PSA-derived metrics:

  • Prostate size in voxels
  • Prostate volume in cm³ (equivalently ml), obtained by scaling the voxel count by the voxel volume
  • PSA value in ng/ml
  • PSA density (PSAD) for whole prostate in ng/ml²
  • Zone-specific PSA density for peripheral zone (PZ) in ng/ml²
  • Zone-specific PSA density for central gland (CG) in ng/ml²

Args: input (VifyPipelineDataModel): Input data model containing:

  • mask_dict: Dictionary with binary anatomy masks
  • clinical_dict: Dictionary with PSA values
  • psa_dict: Dictionary with pre-calculated PSAD values

Returns: dict: Dictionary containing:

  • prostate_size: Size in voxels
  • prostate_volume: Volume in cm³ (equivalently ml)
  • psa: PSA value in ng/ml
  • psad: PSA density for whole prostate
  • psad_pz: PSA density for peripheral zone
  • psad_cg: PSA density for central gland
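
The arithmetic behind these values can be sketched as below. This is an illustration under stated assumptions, not the pipeline's code: the documented method reads pre-calculated PSAD values from psa_dict, whereas this sketch computes them directly, and the function name, the voxel_counts keys ("prostate", "pz", "cg"), and the spacing argument are all hypothetical. Volume is the voxel count times the voxel volume (mm³ converted to ml), and PSA density is PSA divided by that volume.

```python
import math


def global_features_sketch(
    voxel_counts: dict[str, int],
    spacing_mm: tuple[float, float, float],
    psa_ng_ml: float,
) -> dict:
    """Compute volume and PSA density from voxel counts (hypothetical helper)."""
    # 1 ml = 1 cm^3 = 1000 mm^3, so divide the voxel volume in mm^3 by 1000.
    voxel_volume_ml = math.prod(spacing_mm) / 1000.0
    volumes_ml = {zone: count * voxel_volume_ml for zone, count in voxel_counts.items()}

    def density(volume_ml: float) -> float:
        # PSA density in ng/ml^2; undefined (NaN) for an empty mask.
        return psa_ng_ml / volume_ml if volume_ml > 0 else math.nan

    return {
        "prostate_size": voxel_counts["prostate"],
        "prostate_volume": volumes_ml["prostate"],
        "psa": psa_ng_ml,
        "psad": density(volumes_ml["prostate"]),
        "psad_pz": density(volumes_ml["pz"]),
        "psad_cg": density(volumes_ml["cg"]),
    }
```

For instance, 40,000 voxels at 1 mm isotropic spacing correspond to about 40 ml, so a PSA of 6 ng/ml gives a whole-gland density of roughly 0.15 ng/ml².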

Method VifyPipelineStep.extract_lesion_features

extract_lesion_features(self, features_df: DataFrame, num_lesions: int, y_probas: DataFrame) -> list[dict]

Extract detailed features for individual lesions detected in prostate MRI.

This method processes each detected lesion to extract key characteristics including:

  • Location coordinates (x,y,z) in the prostate
  • Size measurements in both voxels and milliliters
  • Anatomical zone classification (e.g., peripheral, transition, anterior)
  • VirDx prediction score indicating likelihood of clinically significant cancer

Args:

  • features_df (pd.DataFrame): DataFrame containing extracted radiomics and shape features for each detected lesion, including coordinates, size metrics, and zone information.
  • num_lesions (int): Maximum number of lesions to analyze, as configured in the pipeline.
  • y_probas (pd.DataFrame): DataFrame containing model prediction probabilities for each lesion, with columns named 'Probability_lesion_X'.

Returns: list[dict]: List of dictionaries, one per lesion, containing:

  • virdx_score (float): Model prediction score in the range [0, 1] for cancer likelihood
  • lesion_center_of_mass_coords (tuple): (x,y,z) coordinates of lesion centroid
  • lesion_center_of_mass_zone (str): Anatomical zone name from configuration mapping
  • lesion_volume_ml (float): Lesion volume in milliliters (cm³)
  • lesion_size_voxel (int): Lesion size in number of voxels

Note:

  • Processes lesions sequentially until num_lesions is reached or no more lesions exist
  • Uses zone_value_map from config to convert numerical zone codes to anatomical names
  • Handles missing lesions by breaking the loop when coordinates are not found
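
The loop described in the notes above might look like the following sketch. Several details are assumptions made for illustration: a single feature row is represented as a plain dict instead of a DataFrame, the feature column names (lesion_1_x, lesion_1_size_voxel, lesion_1_zone, and so on) are hypothetical, lesion numbering is assumed to start at 1, and voxel_volume_ml is passed in explicitly. Only the 'Probability_lesion_X' naming and the output keys come from the documentation.

```python
def extract_lesion_features_sketch(
    row: dict,
    probas: dict,
    num_lesions: int,
    zone_value_map: dict[int, str],
    voxel_volume_ml: float,
) -> list[dict]:
    """Collect per-lesion features until num_lesions is reached or data runs out."""
    lesions = []
    for i in range(1, num_lesions + 1):
        # Hypothetical column names; the real feature table may use others.
        coords = tuple(row.get(f"lesion_{i}_{axis}") for axis in "xyz")
        if any(value is None for value in coords):
            break  # No coordinates for this lesion: stop processing.
        size_voxel = int(row[f"lesion_{i}_size_voxel"])
        lesions.append(
            {
                "virdx_score": float(probas[f"Probability_lesion_{i}"]),
                "lesion_center_of_mass_coords": coords,
                # Map the numerical zone code to an anatomical name.
                "lesion_center_of_mass_zone": zone_value_map.get(
                    row[f"lesion_{i}_zone"], "unknown"
                ),
                "lesion_volume_ml": size_voxel * voxel_volume_ml,
                "lesion_size_voxel": size_voxel,
            }
        )
    return lesions
```

Breaking on the first missing lesion, rather than skipping it, assumes that lesions are stored contiguously.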

Method VifyPipelineStep.infer

infer(self, input: VifyPipelineDataModel) -> BasePipelineOutput

Perform inference on prostate MRI data to detect and analyze potential lesions.

This method executes the complete inference pipeline:

  1. Loads and processes the input data to generate features
  2. Extracts global prostate measurements (volume, PSA metrics)
  3. Runs the VirDx model to predict lesion probabilities
  4. Extracts detailed features for each detected lesion
  5. Compiles results into a standardized output format

Args: input (VifyPipelineDataModel): Input data containing:

  • data_dict: Dictionary of MRI imaging data
  • mask_dict: Dictionary of segmentation masks
  • clinical_dict: Patient clinical data (PSA, age)
  • psa_dict: Pre-calculated PSA density values

Returns: BasePipelineOutput: Structured output containing:

  • misc: Dictionary with lesion analysis results including:
    • VirDx scores for each lesion
    • Lesion coordinates and volumes
    • Prostate measurements
    • PSA-related metrics

Note: The volumes and operations lists are empty in the current implementation.

Raises: Various exceptions may be raised during feature extraction or model prediction
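
The five steps of infer can be sketched as a simple orchestration. Everything here is hypothetical scaffolding: ToyOutput stands in for BasePipelineOutput, the step functions are passed in as callables so the sketch stays self-contained, and the layout of misc (global features merged with a "lesions" list) is an assumption about how results might be compiled.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToyOutput:
    """Hypothetical stand-in for BasePipelineOutput."""

    misc: dict
    volumes: list = field(default_factory=list)  # Empty, as documented.
    operations: list = field(default_factory=list)  # Empty, as documented.


def run_inference_sketch(
    model_input: Any,
    load_data: Callable[[Any], Any],
    extract_global: Callable[[Any], dict],
    predict: Callable[[Any], Any],
    extract_lesions: Callable[[Any, Any], list],
) -> ToyOutput:
    features = load_data(model_input)  # 1. generate features
    global_features = extract_global(model_input)  # 2. prostate and PSA metrics
    probas = predict(features)  # 3. lesion probabilities
    lesions = extract_lesions(features, probas)  # 4. per-lesion features
    # 5. compile everything into one output object (assumed layout).
    return ToyOutput(misc={**global_features, "lesions": lesions})
```

Since no exception handling is shown, any error raised by a step propagates to the caller in this sketch.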

Method VifyPipelineStep.load_data

load_data(self, input: VifyPipelineDataModel) -> DataFrame

Load and process the data from the input object to generate features for inference.

This method performs the following steps:

  1. Extracts imaging data, masks, clinical info and PSA data from input
  2. Adds placeholder values for cancer, ISUP and PI-RADS (required for compatibility)
  3. Generates features using the configured parameters (image type, workers etc.)
  4. Returns a DataFrame containing all extracted features for lesion analysis

Args: input (VifyPipelineDataModel): Input data model containing:

  • data_dict: Dictionary with imaging data
  • mask_dict: Dictionary with segmentation masks
  • clinical_dict: Dictionary with patient clinical data
  • psa_dict: Dictionary with PSA density calculations

Returns: pd.DataFrame: DataFrame containing extracted features including:

  • Lesion coordinates and volumes
  • Texture features
  • Shape features
  • Clinical parameters
  • PSA-derived metrics

Note: The feature extraction process uses the configuration parameters specified during pipeline initialization (image_type, use_feature, etc.).
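
Step 2 of load_data, adding placeholder values for fields that are unknown at inference time, could be done along these lines. The key names and the placeholder value of -1 are assumptions chosen for illustration; the documentation only states that placeholders for cancer, ISUP and PI-RADS are added for compatibility.

```python
# Hypothetical key names and placeholder value; the real pipeline may differ.
PLACEHOLDER_LABELS = {"cancer": -1, "isup": -1, "pirads": -1}


def with_placeholder_labels(clinical_dict: dict) -> dict:
    """Return a copy of clinical_dict with missing label fields filled in."""
    # Values already present in clinical_dict take precedence over placeholders.
    return {**PLACEHOLDER_LABELS, **clinical_dict}
```

Building a new dict leaves the caller's clinical_dict unchanged.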