Module helpers
Function bottom_10
bottom_10(x: ndarray) -> float64
Function bottom_100
bottom_100(x: ndarray) -> float64
Function bottom_100_mean
bottom_100_mean(x: ndarray) -> float64
Function bottom_10_mean
bottom_10_mean(x: ndarray) -> float64
Function bottom_10p_mean
bottom_10p_mean(x: ndarray) -> float64
Computes the mean of the bottom 10 percent of values.
Args: x (np.ndarray): Input data.
Returns: np.float64: Mean
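The description above can be illustrated with a minimal sketch. The name `bottom_fraction_mean`, the `fraction` parameter, and the rounding rule (round up, keep at least one value) are assumptions made for illustration; the module's actual implementation is not shown in this reference and may differ.

```python
import numpy as np


def bottom_fraction_mean(x: np.ndarray, fraction: float = 0.10) -> np.float64:
    """Mean of the smallest `fraction` of values in x (hypothetical helper)."""
    values = np.sort(np.asarray(x, dtype=np.float64).ravel())
    # Assumption: round the count up and keep at least one value,
    # so that short (non-empty) inputs still produce a result.
    k = max(1, int(np.ceil(fraction * values.size)))
    return values[:k].mean()


sample = np.arange(1, 21)  # the integers 1..20
result = bottom_fraction_mean(sample)  # mean of the two smallest values
```

Empty inputs are not handled in this sketch.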
Function calculate_psad
calculate_psad(mask_dict: dict, psa: float) -> dict
Calculate PSA density (PSAD) and its subtypes from prostate segmentation masks.
Args:
mask_dict (dict): Dictionary containing segmentation masks with keys:
- 'anatomy': Volume containing prostate zone segmentations
- 'bin_anatomy': Volume containing binary prostate segmentation
psa (float): Patient's PSA value in ng/ml
Returns: dict: Dictionary containing the calculated PSA density values:
- 'psad': Overall PSA density (PSA/total prostate volume)
- 'psad_pz': PSA density for peripheral zone
- 'psad_cg': PSA density for central gland (transition + anterior + central zones)
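As a rough sketch of the calculation described above, PSA density is the PSA value divided by a volume in ml. The function name `psa_density_sketch`, the use of plain NumPy arrays with an explicit voxel spacing instead of Volume objects, the label convention (`pz_label=1`, every other non-zero label counted as central gland), and the choice to divide by each zone's own volume are all assumptions for illustration.

```python
import numpy as np


def psa_density_sketch(anatomy, bin_anatomy, spacing_mm, psa, pz_label=1):
    """Hypothetical PSAD calculation from label arrays and voxel spacing."""
    voxel_ml = float(np.prod(spacing_mm)) / 1000.0  # 1 ml = 1000 mm^3
    total_ml = np.count_nonzero(bin_anatomy) * voxel_ml
    pz_ml = np.count_nonzero(anatomy == pz_label) * voxel_ml
    # Assumption: all remaining non-zero labels belong to the central gland.
    cg_ml = np.count_nonzero((anatomy > 0) & (anatomy != pz_label)) * voxel_ml

    def density(volume_ml):
        # Avoid division by zero when a region is absent from the mask.
        return psa / volume_ml if volume_ml > 0 else float("nan")

    return {
        "psad": density(total_ml),
        "psad_pz": density(pz_ml),
        "psad_cg": density(cg_ml),
    }


# Synthetic example: 1000 voxels of 2 x 2 x 5 mm, half PZ and half central gland.
anatomy = np.zeros((10, 10, 10), dtype=np.uint8)
anatomy[:5] = 2
anatomy[5:] = 1
densities = psa_density_sketch(anatomy, anatomy > 0, (2.0, 2.0, 5.0), psa=4.0)
```

In this synthetic case the prostate volume is 20 ml, so a PSA of 4.0 ng/ml gives an overall density of about 0.2 ng/ml per ml.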
Function check_and_resample_masks
check_and_resample_masks(img_vol: Volume, mask_vol: Volume, softmax_vol: UnionType[Volume, None] = None, anatomical_vol: UnionType[Volume, None] = None) -> tuple
Resamples segmentation masks to match the shape, spacing, and origin of a reference image volume.
This function ensures that all provided volumes (mask, softmax predictions, and anatomical segmentations) are resampled to match the properties of the reference image volume. This is necessary for consistent analysis when volumes are acquired with different parameters.
Args:
- img_vol (Volume): Reference image volume that defines the target shape, spacing, and origin.
- mask_vol (Volume): Binary mask volume to be resampled.
- softmax_vol (Volume, optional): Probability map (softmax predictions) volume to be resampled. Defaults to None.
- anatomical_vol (Volume, optional): Anatomical segmentation volume (e.g., prostate zones) to be resampled. Defaults to None.
Returns: tuple[Volume, Volume, Volume]: Tuple containing:
- mask_resampled: Resampled binary mask matching reference volume
- softmax_resampled: Resampled probability maps (None if not provided)
- anatomy_resampled: Resampled anatomical segmentations (None if not provided)
Function get_clinical_data
get_clinical_data(study: Study, study_path: str, studies_path: str, patient_csv: UnionType[str, None] = None) -> dict
Get clinical data for a patient including PSA, cancer status, PI-RADS score, ISUP grade, and age.
This function retrieves clinical data either from the Study object directly or from a provided CSV file. When using a CSV file, it matches the study path to find the correct patient record.
Args:
- study (Study): Study object containing patient metadata and clinical information.
- study_path (str): Full path to the current study directory.
- studies_path (str): Base directory containing all study folders.
- patient_csv (str, optional): Path to CSV file containing patient clinical data. Must include columns: 'id', 'psa', 'cancer', 'pirads', 'isup'. When None, data is retrieved from the Study object. Defaults to None.
Returns: dict: Dictionary containing clinical data with keys:
- 'psa': PSA value in ng/ml
- 'cancer': Presence of clinically significant cancer (boolean)
- 'pirads': PI-RADS score (1-5)
- 'isup': ISUP grade (1-5)
- 'age': Patient age
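The CSV branch described above might look roughly like the following sketch. The helper name `lookup_clinical_row`, the matching rule (the 'id' column equals the study folder path relative to `studies_path`), and the way the 'cancer' column is parsed are assumptions; the reference does not state how the match is performed. The sample row is made-up illustrative data, and 'age' is omitted because it is not among the documented CSV columns.

```python
import csv
import os
import tempfile


def lookup_clinical_row(study_path, studies_path, patient_csv):
    """Return the clinical record for one study, or None if no row matches."""
    # Assumption: 'id' holds the study folder path relative to studies_path.
    study_id = os.path.relpath(study_path, studies_path)
    with open(patient_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            if row["id"] == study_id:
                return {
                    "psa": float(row["psa"]),
                    # Assumption: cancer status is stored as 1/0 or true/false.
                    "cancer": row["cancer"].strip().lower() in ("1", "true"),
                    "pirads": int(row["pirads"]),
                    "isup": int(row["isup"]),
                }
    return None


with tempfile.TemporaryDirectory() as studies_path:
    patient_csv = os.path.join(studies_path, "patients.csv")
    with open(patient_csv, "w", newline="") as handle:
        handle.write("id,psa,cancer,pirads,isup\n")
        handle.write("patient_001,6.5,1,4,2\n")  # illustrative values only
    record = lookup_clinical_row(
        os.path.join(studies_path, "patient_001"), studies_path, patient_csv
    )
    missing = lookup_clinical_row(
        os.path.join(studies_path, "patient_999"), studies_path, patient_csv
    )
```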
Function get_data_for_feature_extraction
get_data_for_feature_extraction(study: Study, queries: QueryObject, image_type: str, segmentation_method: QueryObject, study_path: str = 'unknown') -> UnionType[tuple[dict, dict], tuple[None, None]]
Get the image data and corresponding segmentation masks for feature extraction.
This function retrieves and processes various types of MRI images (ADC, DWI, VERDICT, etc.) along with their corresponding segmentation masks (lesions, prostate zones) from a study. It handles different query types based on the image modality and ensures proper data validation.
Args:
- study (Study): Study object containing the imaging and segmentation data.
- queries (QueryObject): List of query parameters specific to the image type (e.g., b-values for DWI).
- image_type (str): Type of MRI sequence ('adc', 'dwi', 'verdict', 'mismo_adc', 't2').
- segmentation_method (QueryObject): Method used for lesion segmentation.
- study_path (str): Path to the study directory, used for logging. Defaults to 'unknown'.
Returns: tuple[dict, dict] | tuple[None, None]: Returns either:
- A tuple of two dictionaries:
  - data_dict: Contains the image volumes for each query
  - mask_dict: Contains lesion masks, prostate zones, and binary prostate mask
- (None, None) if required data is missing or invalid
Function get_global_features
get_global_features(image_type: str, img_vol: Volume, mask: ndarray, anatomy_resampled: Volume, softmax_resampled: Volume) -> dict[str, UnionType[float64, Any]]
Function to calculate global statistical features from medical imaging data.
Computes statistical features across entire prostate regions and different anatomical zones, including intensity statistics from both the raw image data and softmax probability maps.
TODO: Allow for empty masks
Args:
- image_type (str): Type of medical image (e.g., 'adc', 'dwi', 'verdict', 't2').
- img_vol (Volume): 3D image volume containing the raw imaging data.
- mask (np.ndarray): Binary mask array identifying all lesion voxels.
- anatomy_resampled (Volume): Resampled prostate zone segmentation mask matching image dimensions.
- softmax_resampled (Volume): Resampled softmax probability maps for lesion detection.
Returns: dict[str, np.float64 | Any]: Dictionary containing computed global features with keys:
- '_zone_all': Mean intensity in each prostate zone
- '__all': Global statistical measures from image data
- '_softmax__all': Statistics from softmax values
- '_size_all': Total volume of all lesions
Function get_lesion_wise_features
get_lesion_wise_features(image_type: str, lesion_id: int, img_vol: Volume, sub_mask: ndarray, anatomy_resampled: Volume, softmax_resampled: Volume) -> dict[str, UnionType[float64, Any]]
Function to determine lesion-wise features from medical imaging data.
Calculates quantitative features for each individual lesion including location, size, volume, zonal distribution, and various statistical measures of image intensity and softmax probability values.
TODO: Allow for empty masks
Args:
- image_type (str): Type of medical image (e.g., 'adc', 'dwi', 'verdict', 't2').
- lesion_id (int): Unique identifier for the current lesion being analyzed.
- img_vol (Volume): 3D image volume containing the raw imaging data.
- sub_mask (np.ndarray): Binary mask array identifying the voxels of this specific lesion.
- anatomy_resampled (Volume): Resampled prostate zone segmentation mask matching image dimensions.
- softmax_resampled (Volume): Resampled softmax probability maps for lesion detection.
Returns: dict[str, np.float64 | Any]: Dictionary containing computed features with keys:
- 'lesion_location': Broad zone location (1=PZ, 2=other zones)
- 'lesion_precise_location': Specific zone value
- 'lesion_coords_': Centroid coordinates
- 'size': Lesion size in mm³
- 'lesionvolume': Lesion volume in ml
- 'zoneprop': Proportion in each zone
- Various statistical features from image intensities and softmax values
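The size, volume, centroid, and zone-proportion entries listed above can be sketched with plain NumPy arrays. The function name `lesion_geometry_sketch`, the explicit `spacing_mm` argument (standing in for whatever spacing information a Volume carries), the voxel-index centroid, and the output key names are assumptions for illustration. In line with the TODO above, the sketch assumes a non-empty lesion mask.

```python
import numpy as np


def lesion_geometry_sketch(sub_mask, anatomy, spacing_mm):
    """Hypothetical size, volume, centroid, and zone proportions for one lesion."""
    lesion = np.asarray(sub_mask).astype(bool)
    n_voxels = int(lesion.sum())  # assumed to be greater than zero
    size_mm3 = n_voxels * float(np.prod(spacing_mm))
    # Assumption: the centroid is reported in voxel index coordinates.
    centroid = np.argwhere(lesion).mean(axis=0)
    labels, counts = np.unique(np.asarray(anatomy)[lesion], return_counts=True)
    zone_prop = {int(label): float(count) / n_voxels for label, count in zip(labels, counts)}
    return {
        "size_mm3": size_mm3,
        "volume_ml": size_mm3 / 1000.0,  # 1 ml = 1000 mm^3
        "centroid": centroid,
        "zone_prop": zone_prop,
    }


# Synthetic example: an 8-voxel cube that straddles two zone labels.
mask = np.zeros((4, 4, 4), dtype=np.uint8)
mask[1:3, 1:3, 1:3] = 1
zones = np.zeros((4, 4, 4), dtype=np.uint8)
zones[:2] = 1
zones[2:] = 2
geometry = lesion_geometry_sketch(mask, zones, spacing_mm=(0.5, 0.5, 3.0))
```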
Function get_study_list
get_study_list(studies_path: str, patient_csv: UnionType[str, None] = None) -> list
Get a list of study paths from a directory, optionally filtered by a CSV file.
This function returns a sorted list of study paths either from:
- A CSV file containing patient IDs and additional data
- All subdirectories in the specified studies directory
Args:
- studies_path (str): Base directory containing all study folders.
- patient_csv (str, optional): Path to CSV file containing patient IDs. If provided, only studies matching IDs in this file are included. The CSV must have an 'id' column. Defaults to None.
Returns: list: Sorted list of absolute paths to study directories
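The two listing modes described above could be sketched as follows, using only the standard library. The name `list_studies_sketch`, the assumption that each 'id' value is a folder name directly under `studies_path`, and the choice to skip IDs without a matching folder are illustrative; the real function may treat missing folders differently.

```python
import csv
import os
import tempfile


def list_studies_sketch(studies_path, patient_csv=None):
    """Return sorted absolute study paths, optionally restricted to CSV ids."""
    if patient_csv is None:
        names = [entry.name for entry in os.scandir(studies_path) if entry.is_dir()]
    else:
        with open(patient_csv, newline="") as handle:
            ids = [row["id"] for row in csv.DictReader(handle)]
        # Assumption: ids without a matching folder are silently skipped.
        names = [i for i in ids if os.path.isdir(os.path.join(studies_path, i))]
    return sorted(os.path.abspath(os.path.join(studies_path, n)) for n in names)


with tempfile.TemporaryDirectory() as root:
    for name in ("patient_b", "patient_a"):
        os.mkdir(os.path.join(root, name))
    csv_path = os.path.join(root, "patients.csv")
    with open(csv_path, "w", newline="") as handle:
        handle.write("id\npatient_b\npatient_c\n")
    all_studies = [os.path.basename(p) for p in list_studies_sketch(root)]
    csv_studies = [os.path.basename(p) for p in list_studies_sketch(root, csv_path)]
```

Here the CSV file sits in the studies directory but is ignored by the directory scan because only subdirectories are listed.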
Function select_features_into_df
select_features_into_df(df: DataFrame, exclude_num_first_columns: int, features_used: UnionType[list, None], use_individual_lesions: bool = True) -> tuple[DataFrame, Series, Series, Series, Series]
Select and prepare feature columns from a DataFrame for model training/evaluation.
Args:
- df (pd.DataFrame): Input DataFrame containing all features and metadata.
- exclude_num_first_columns (int): Number of initial columns to exclude from features.
- features_used (list | None): List of feature prefixes to include. If None, no features are subselected at the global or lesion level. If an empty list, use_individual_lesions takes effect.
- use_individual_lesions (bool, optional): Whether to use individual lesion features rather than global features. Defaults to True.
Returns: tuple[pd.DataFrame,pd.Series,pd.Series,pd.Series,pd.Series]: Tuple containing:
- features: DataFrame with selected features
- gt: Ground truth cancer status
- patient_ids: Series of patient IDs
- isup: ISUP grades
- pirads: PI-RADS scores
Function top_10
top_10(x: ndarray) -> float64
Function top_100
top_100(x: ndarray) -> float64
Function top_100_mean
top_100_mean(x: ndarray) -> float64
Function top_10_mean
top_10_mean(x: ndarray) -> float64
Function top_10p_mean
top_10p_mean(x: ndarray) -> float64
Computes the mean of the top 10 percent of values.
Args: x (np.ndarray): Input data.
Returns: np.float64: Mean
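One way to compute such a top-fraction mean without fully sorting the input is `np.partition`. The sketch below is an illustration only: the name `top_fraction_mean`, the `fraction` parameter, and the rounding rule (round up, keep at least one value) are assumptions, and the module may simply sort the array instead.

```python
import numpy as np


def top_fraction_mean(x: np.ndarray, fraction: float = 0.10) -> np.float64:
    """Mean of the largest `fraction` of values in x (hypothetical helper)."""
    values = np.asarray(x, dtype=np.float64).ravel()
    k = max(1, int(np.ceil(fraction * values.size)))
    # After partitioning at index size - k, the last k slots hold the
    # k largest values (in no particular order).
    largest = np.partition(values, values.size - k)[-k:]
    return largest.mean()


top_result = top_fraction_mean(np.arange(1, 21))  # mean of the two largest of 1..20
```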