Module study_vify_inference
Class VifyPipelineDataModel
Represents the data model for the Vify pipeline, handling study data processing for inference.
This class encapsulates the data components required to run inference in the Vify pipeline: imaging data, masks, clinical information, and PSA-related calculations.
Attributes:
- data_dict (dict): Dictionary containing the imaging data extracted from the study.
- mask_dict (dict): Dictionary containing the mask data associated with the images.
- clinical_dict (dict): Dictionary containing patient clinical data including study ID, PSA value, and age.
- psa_dict (dict): Dictionary containing calculated PSA density values.
Class Methods:
from_study(study: Study, config: VifyPipelineConfig) -> list[VifyPipelineDataModel]
Factory method that creates data model instances from a study object.
Args:
- study (Study): The study object containing patient and imaging data.
- config (VifyPipelineConfig): Configuration object specifying pipeline parameters.
Returns: list[VifyPipelineDataModel]: List containing a single VifyPipelineDataModel instance.
Raises: ValueError: If either data_dict or mask_dict fails to load.
Note: This class inherits from BasePipelineDataModel and is specifically designed to work with the feature extraction process in the Vify pipeline workflow.
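A minimal sketch of the factory contract described above, using a simplified stand-in class. SketchDataModel and from_parts are hypothetical names and are not part of study_vify_inference; the real from_study takes a Study and a VifyPipelineConfig, whose interfaces are not shown in this reference.

```python
from dataclasses import dataclass, field


@dataclass
class SketchDataModel:
    """Hypothetical stand-in for VifyPipelineDataModel (not the real class)."""

    data_dict: dict
    mask_dict: dict
    clinical_dict: dict
    psa_dict: dict = field(default_factory=dict)

    @classmethod
    def from_parts(cls, data_dict, mask_dict, clinical_dict):
        # Mirror the documented contract: fail loudly if images or masks are missing.
        if not data_dict or not mask_dict:
            raise ValueError("data_dict and mask_dict must both load successfully")
        # Return a single-element list, matching the documented return shape.
        return [cls(data_dict, mask_dict, clinical_dict)]
```

Returning a list even for a single instance keeps the caller's loop unchanged if a study ever yields more than one data model.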
Class VifyPipelineStep
Pipeline step for running Vify inference on prostate MRI studies.
This class handles the complete inference pipeline for the Vify model, including:
- Loading and preprocessing imaging data
- Feature extraction from prostate MRI images
- Model prediction for lesion detection
- Extraction of global prostate and lesion-specific features
Inherits from BasePipelineStep, the base class providing the pipeline step interface.
Initialization requires a VifyPipelineConfig object, which specifies:
- Model parameters and paths
- Data processing settings
- Feature extraction configuration
- Inference parameters
Method VifyPipelineStep.extract_global_features
extract_global_features(self, input: VifyPipelineDataModel) -> dict
Extract global features from the input data model.
This method calculates various prostate-related measurements and PSA-derived metrics:
- Prostate size in voxels
- Prostate volume in cm³ (equivalently ml), using voxel volume correction
- PSA value in ng/ml
- PSA density (PSAD) for whole prostate in ng/ml²
- Zone-specific PSA density for peripheral zone (PZ) in ng/ml²
- Zone-specific PSA density for central gland (CG) in ng/ml²
Args: input (VifyPipelineDataModel): Input data model containing:
- mask_dict: Dictionary with binary anatomy masks
- clinical_dict: Dictionary with PSA values
- psa_dict: Dictionary with pre-calculated PSAD values
Returns: dict: Dictionary containing:
- prostate_size: Size in voxels
- prostate_volume: Volume in cm³ (equivalently ml)
- psa: PSA value in ng/ml
- psad: PSA density for whole prostate
- psad_pz: PSA density for peripheral zone
- psad_cg: PSA density for central gland
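The measurements above can be illustrated with a short sketch. It assumes that volume is the voxel count multiplied by the per-voxel volume and that PSA density is PSA divided by volume. The function names, the spacing_mm parameter, and the voxel_counts dictionary are hypothetical, and the sketch computes PSAD on the fly, whereas the documented method reads pre-calculated values from psa_dict.

```python
def volume_ml(num_voxels, spacing_mm):
    """Convert a voxel count to millilitres (1 ml = 1 cm^3 = 1000 mm^3)."""
    sx, sy, sz = spacing_mm
    return num_voxels * sx * sy * sz / 1000.0


def psa_density(psa_ng_ml, volume):
    """PSA density in ng/ml^2; NaN when the volume is zero (empty mask)."""
    return psa_ng_ml / volume if volume > 0 else float("nan")


def global_features_sketch(voxel_counts, spacing_mm, psa_ng_ml):
    # voxel_counts is a hypothetical dict with "prostate", "pz" and "cg" keys.
    prostate_ml = volume_ml(voxel_counts["prostate"], spacing_mm)
    return {
        "prostate_size": voxel_counts["prostate"],
        "prostate_volume": prostate_ml,
        "psa": psa_ng_ml,
        "psad": psa_density(psa_ng_ml, prostate_ml),
        "psad_pz": psa_density(psa_ng_ml, volume_ml(voxel_counts["pz"], spacing_mm)),
        "psad_cg": psa_density(psa_ng_ml, volume_ml(voxel_counts["cg"], spacing_mm)),
    }
```

For example, 40,000 voxels at 0.5 x 0.5 x 3.0 mm spacing correspond to 30 ml, so a PSA of 6.0 ng/ml gives a whole-prostate PSAD of 0.2 ng/ml².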
Method VifyPipelineStep.extract_lesion_features
extract_lesion_features(self, features_df: DataFrame, num_lesions: int, y_probas: DataFrame) -> list[dict]
Extract detailed features for individual lesions detected in prostate MRI.
This method processes each detected lesion to extract key characteristics including:
- Location coordinates (x,y,z) in the prostate
- Size measurements in both voxels and milliliters
- Anatomical zone classification (e.g., peripheral, transition, anterior)
- VirDx prediction score indicating likelihood of clinically significant cancer
Args:
- features_df (pd.DataFrame): DataFrame containing extracted radiomics and shape features for each detected lesion, including coordinates, size metrics, and zone information.
- num_lesions (int): Maximum number of lesions to analyze, as configured in the pipeline.
- y_probas (pd.DataFrame): DataFrame containing model prediction probabilities for each lesion, with columns named 'Probability_lesion_X'.
Returns: list[dict]: List of dictionaries, one per lesion, containing:
- virdx_score (float): Model prediction score [0-1] for cancer likelihood
- lesion_center_of_mass_coords (tuple): (x,y,z) coordinates of lesion centroid
- lesion_center_of_mass_zone (str): Anatomical zone name from configuration mapping
- lesion_volume_ml (float): Lesion volume in milliliters (cm³)
- lesion_size_voxel (int): Lesion size in number of voxels
Note:
- Processes lesions sequentially until num_lesions is reached or no more lesions exist
- Uses zone_value_map from config to convert numerical zone codes to anatomical names
- Handles missing lesions by breaking the loop when coordinates are not found
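The loop described in the notes above could look roughly like the following sketch. To stay dependency-free it uses plain dictionaries in place of the two DataFrames. The feature keys (lesion_1_coords, lesion_1_zone, and so on), the 1-based lesion index, and the "unknown" fallback zone are assumptions for illustration; only the 'Probability_lesion_X' naming comes from this reference.

```python
def lesion_features_sketch(features, y_probas, num_lesions, zone_value_map):
    """Collect per-lesion results until num_lesions is reached or a lesion is missing."""
    lesions = []
    for i in range(1, num_lesions + 1):
        coords = features.get(f"lesion_{i}_coords")
        if coords is None:
            # No coordinates means no further lesions were detected.
            break
        zone_code = features[f"lesion_{i}_zone"]
        lesions.append(
            {
                "virdx_score": float(y_probas[f"Probability_lesion_{i}"]),
                "lesion_center_of_mass_coords": tuple(coords),
                # Translate the numerical zone code into an anatomical name.
                "lesion_center_of_mass_zone": zone_value_map.get(zone_code, "unknown"),
                "lesion_volume_ml": float(features[f"lesion_{i}_volume_ml"]),
                "lesion_size_voxel": int(features[f"lesion_{i}_size_voxel"]),
            }
        )
    return lesions
```

Breaking on the first missing lesion, rather than skipping it, matches the documented behaviour and assumes lesions are stored contiguously.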
Method VifyPipelineStep.infer
infer(self, input: VifyPipelineDataModel) -> BasePipelineOutput
Perform inference on prostate MRI data to detect and analyze potential lesions.
This method executes the complete inference pipeline:
- Loads and processes the input data to generate features
- Extracts global prostate measurements (volume, PSA metrics)
- Runs the VirDx model to predict lesion probabilities
- Extracts detailed features for each detected lesion
- Compiles results into a standardized output format
Args: input (VifyPipelineDataModel): Input data containing:
- data_dict: Dictionary of MRI imaging data
- mask_dict: Dictionary of segmentation masks
- clinical_dict: Patient clinical data (PSA, age)
- psa_dict: Pre-calculated PSA density values
Returns: BasePipelineOutput: Structured output containing:
- misc: Dictionary with lesion analysis results, including VirDx scores for each lesion, lesion coordinates and volumes, prostate measurements, and PSA-related metrics
Note: The volumes and operations lists are empty in the current implementation.
Raises: Exception: Any error raised during feature extraction or model prediction.
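The order of operations in infer can be sketched as a small orchestration function. Every name here is hypothetical: the four callables stand in for load_data, extract_global_features, the model's prediction call, and extract_lesion_features, and the plain dictionary stands in for BasePipelineOutput. The layout of misc (separate "global" and "lesions" keys) is an assumption, not the documented structure.

```python
def infer_sketch(model_input, load_data, extract_global, predict, extract_lesions, num_lesions):
    """Run the documented steps in order and assemble an output-like dictionary."""
    features = load_data(model_input)               # feature generation
    global_features = extract_global(model_input)   # prostate volume, PSA metrics
    y_probas = predict(features)                    # per-lesion probabilities
    lesions = extract_lesions(features, num_lesions, y_probas)
    return {
        # Both lists are documented as empty in the current implementation.
        "volumes": [],
        "operations": [],
        "misc": {"global": global_features, "lesions": lesions},
    }
```

Passing the steps in as callables keeps the sketch testable with stubs and makes the data flow between the steps explicit.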
Method VifyPipelineStep.load_data
load_data(self, input: VifyPipelineDataModel) -> DataFrame
Load and process the data from the input object to generate features for inference.
This method performs the following steps:
- Extracts imaging data, masks, clinical info and PSA data from input
- Adds placeholder values for cancer, ISUP and PI-RADS (required for compatibility)
- Generates features using the configured parameters (image type, number of workers, etc.)
- Returns a DataFrame containing all extracted features for lesion analysis
Args: input (VifyPipelineDataModel): Input data model containing:
- data_dict: Dictionary with imaging data
- mask_dict: Dictionary with segmentation masks
- clinical_dict: Dictionary with patient clinical data
- psa_dict: Dictionary with PSA density calculations
Returns: pd.DataFrame: DataFrame containing extracted features including:
- Lesion coordinates and volumes
- Texture features
- Shape features
- Clinical parameters
- PSA-derived metrics
Note: The feature extraction process uses the configuration parameters specified during pipeline initialization (image_type, use_feature, etc.).
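The placeholder step mentioned above can be sketched as follows. The key names ("cancer", "isup", "pirads") and the default value of 0 are assumptions; this reference only states that placeholder values for cancer, ISUP and PI-RADS are added for compatibility.

```python
def with_placeholder_labels(clinical_dict, placeholder=0):
    """Return a copy of clinical_dict with label placeholders filled in."""
    merged = dict(clinical_dict)  # copy so the caller's dictionary is untouched
    for key in ("cancer", "isup", "pirads"):
        # Keep any existing value; only fill in what is missing.
        merged.setdefault(key, placeholder)
    return merged
```

At inference time no ground-truth labels exist, so the placeholders presumably only satisfy code paths that expect these fields to be present.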