Module `helpers`

Function `bottom_10`

bottom_10(x: ndarray) -> float64

Function `bottom_100`

bottom_100(x: ndarray) -> float64

Function `bottom_100_mean`

bottom_100_mean(x: ndarray) -> float64

Function `bottom_10_mean`

bottom_10_mean(x: ndarray) -> float64

Function `bottom_10p_mean`

bottom_10p_mean(x: ndarray) -> float64

Computes the mean of the bottom 10 percent of values.

Args: x (np.ndarray): Input data.

Returns: np.float64: Mean of the bottom 10 percent of values.
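The description above can be sketched with NumPy as follows. This is only an illustration: `bottom_fraction_mean` is a hypothetical name, and the way the cutoff is chosen (a count of `int(n * 0.10)` values, with at least one value kept) is an assumption, since this reference does not say how `bottom_10p_mean` picks the 10 percent.

```python
import numpy as np


def bottom_fraction_mean(x: np.ndarray, fraction: float = 0.10) -> np.float64:
    """Mean of the smallest `fraction` of values in `x` (hypothetical helper)."""
    values = np.sort(np.asarray(x, dtype=np.float64).ravel())
    # Assumption: keep at least one value, so a non-empty input with fewer
    # than ten values still produces a number instead of NaN.
    k = max(1, int(values.size * fraction))
    return values[:k].mean()


data = np.arange(1, 21)  # 1, 2, ..., 20
print(bottom_fraction_mean(data))  # mean of the two smallest values, 1 and 2
```

A percentile threshold (for example via `np.percentile`) would be an equally plausible reading of "bottom 10 percent" and can give slightly different results when values are tied.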

Function `calculate_psad`

calculate_psad(mask_dict: dict, psa: float) -> dict

Calculate PSA density (PSAD) and its subtypes from prostate segmentation masks.

Args: mask_dict (dict): Dictionary containing segmentation masks with keys:

'anatomy': Volume containing prostate zone segmentations
'bin_anatomy': Volume containing binary prostate segmentation

psa (float): Patient's PSA value in ng/ml

Returns: dict: Dictionary containing the calculated PSA density values:

'psad': Overall PSA density (PSA/total prostate volume)
'psad_pz': PSA density for peripheral zone
'psad_cg': PSA density for central gland (transition + anterior + central zones)
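Since PSA density is PSA divided by a volume, the three returned values can be sketched from voxel counts. The following is a minimal sketch on plain NumPy arrays, not the library's implementation: the zone label values (`PZ_LABEL`, `CG_LABELS`), the `spacing_mm` argument, and the function names are all assumptions introduced for illustration, and the real function works on `Volume` objects whose interface is not documented here.

```python
import numpy as np

# Hypothetical label values; the real zone encoding is not given in this reference.
PZ_LABEL = 1
CG_LABELS = (2, 3, 4)  # transition, anterior, central zones (assumed)


def volume_ml(mask: np.ndarray, spacing_mm: tuple) -> float:
    """Volume of the non-zero voxels in millilitres (1 ml = 1000 mm^3)."""
    voxel_mm3 = float(np.prod(spacing_mm))
    return float(np.count_nonzero(mask)) * voxel_mm3 / 1000.0


def density(psa: float, vol_ml: float) -> float:
    """PSA divided by volume; NaN for an empty region instead of dividing by zero."""
    return psa / vol_ml if vol_ml > 0 else float("nan")


def psad_sketch(anatomy, bin_anatomy, spacing_mm, psa):
    total = volume_ml(bin_anatomy, spacing_mm)
    pz = volume_ml(anatomy == PZ_LABEL, spacing_mm)
    cg = volume_ml(np.isin(anatomy, CG_LABELS), spacing_mm)
    return {
        "psad": density(psa, total),
        "psad_pz": density(psa, pz),
        "psad_cg": density(psa, cg),
    }


# Toy gland: 4000 voxels of 10 mm^3 each, i.e. 40 ml (10 ml PZ, 30 ml central gland).
anatomy = np.full((10, 10, 40), 2)
anatomy[:, :, :10] = PZ_LABEL
result = psad_sketch(anatomy, anatomy > 0, (2.0, 2.0, 2.5), psa=6.0)
print(result)
```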

Function `check_and_resample_masks`

check_and_resample_masks(img_vol: Volume, mask_vol: Volume, softmax_vol: UnionType[Volume, None] = None, anatomical_vol: UnionType[Volume, None] = None) -> tuple

Resamples segmentation masks to match the shape, spacing, and origin of a reference image volume.

This function ensures that all provided volumes (mask, softmax predictions, and anatomical segmentations) are resampled to match the properties of the reference image volume. This is necessary for consistent analysis when volumes are acquired with different parameters.

Args:

img_vol (Volume): Reference image volume that defines the target shape, spacing, and origin
mask_vol (Volume): Binary mask volume to be resampled
softmax_vol (Volume, optional): Probability map/softmax predictions volume to be resampled. Defaults to None.
anatomical_vol (Volume, optional): Anatomical segmentation volume (e.g., prostate zones) to be resampled. Defaults to None.

Returns: tuple[Volume, Volume, Volume]: Tuple containing:

mask_resampled: Resampled binary mask matching reference volume
softmax_resampled: Resampled probability maps (None if not provided)
anatomy_resampled: Resampled anatomical segmentations (None if not provided)
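To make "match the shape, spacing, and origin" concrete, here is a small nearest-neighbour resampler on plain NumPy arrays. It is a sketch under stated assumptions, not the method used by `check_and_resample_masks`: the function name and its arguments are hypothetical, volumes are assumed to be axis-aligned (no direction matrix), and voxels that fall outside the source grid are filled with zero.

```python
import numpy as np


def resample_nearest(src, src_spacing, src_origin, ref_shape, ref_spacing, ref_origin):
    """Nearest-neighbour resampling of `src` onto a reference grid (hypothetical helper)."""
    out = np.zeros(ref_shape, dtype=src.dtype)
    per_axis_index = []
    for ax in range(src.ndim):
        # Physical coordinate of each reference voxel centre along this axis ...
        phys = ref_origin[ax] + np.arange(ref_shape[ax]) * ref_spacing[ax]
        # ... mapped to the nearest source voxel index.
        per_axis_index.append(
            np.rint((phys - src_origin[ax]) / src_spacing[ax]).astype(int)
        )
    grids = np.meshgrid(*per_axis_index, indexing="ij")
    # Only copy values for reference voxels that land inside the source grid.
    inside = np.ones(ref_shape, dtype=bool)
    for ax, g in enumerate(grids):
        inside &= (g >= 0) & (g < src.shape[ax])
    out[inside] = src[tuple(g[inside] for g in grids)]
    return out


mask = np.zeros((2, 2, 2), dtype=np.uint8)
mask[1, 1, 1] = 1
resampled = resample_nearest(
    mask, (2.0, 2.0, 2.0), (0.0, 0.0, 0.0), (3, 3, 3), (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)
)
print(resampled.shape)
```

Nearest-neighbour interpolation keeps label values discrete, which suits binary masks and zone segmentations; continuous probability maps are more commonly resampled with linear interpolation.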

Function `get_clinical_data`

get_clinical_data(study: Study, study_path: str, studies_path: str, patient_csv: UnionType[str, None] = None) -> dict

Get clinical data for a patient including PSA, cancer status, PI-RADS score, ISUP grade, and age.

This function retrieves clinical data either from the Study object directly or from a provided CSV file. When using a CSV file, it matches the study path to find the correct patient record.

Args:

study (Study): Study object containing patient metadata and clinical information
study_path (str): Full path to the current study directory
studies_path (str): Base directory containing all study folders
patient_csv (str, optional): Path to CSV file containing patient clinical data. Must include columns: 'id', 'psa', 'cancer', 'pirads', 'isup'. When None, data is retrieved from the Study object. Defaults to None.

Returns: dict: Dictionary containing clinical data with keys:

'psa': PSA value in ng/ml
'cancer': Presence of clinically significant cancer (boolean)
'pirads': PI-RADS score (1-5)
'isup': ISUP grade (1-5)
'age': Patient age
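The CSV branch can be pictured as a lookup by patient ID. The sketch below uses only the standard library and is not the library's code: `lookup_clinical_row` is a hypothetical name, and it assumes that the 'id' column holds the study path relative to `studies_path` and that 'cancer' is stored as 0/1 or true/false. 'age' is left out because it is not among the required CSV columns listed above.

```python
import csv
import io
import os


def lookup_clinical_row(study_path, studies_path, csv_file):
    """Return the clinical values of the row matching the study, or None."""
    # Assumption: the patient id is the study path relative to studies_path.
    study_id = os.path.relpath(study_path, studies_path)
    for row in csv.DictReader(csv_file):
        if row["id"] == study_id:
            return {
                "psa": float(row["psa"]),
                "cancer": row["cancer"].strip().lower() in ("1", "true"),
                "pirads": int(row["pirads"]),
                "isup": int(row["isup"]),
            }
    return None


demo_csv = io.StringIO(
    "id,psa,cancer,pirads,isup\n"
    "P001,6.2,1,4,2\n"
    "P002,3.1,0,2,1\n"
)
record = lookup_clinical_row("/data/studies/P001", "/data/studies", demo_csv)
print(record)
```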

Function `get_data_for_feature_extraction`

get_data_for_feature_extraction(study: Study, queries: QueryObject, image_type: str, segmentation_method: QueryObject, study_path: str = 'unknown') -> UnionType[tuple[dict, dict], tuple[None, None]]

Get the image data and corresponding segmentation masks for feature extraction.

This function retrieves and processes various types of MRI images (ADC, DWI, VERDICT, etc.) along with their corresponding segmentation masks (lesions, prostate zones) from a study. It handles different query types based on the image modality and ensures proper data validation.

Args:

study (Study): Study object containing the imaging and segmentation data
queries (QueryObject): List of query parameters specific to the image type (e.g., b-values for DWI)
image_type (str): Type of MRI sequence ('adc', 'dwi', 'verdict', 'mismo_adc', 't2')
segmentation_method (QueryObject): Method used for lesion segmentation
study_path (str, optional): Path to the study directory for logging purposes. Defaults to 'unknown'.

Returns: tuple[dict, dict] | tuple[None, None]: Either:

A tuple of two dictionaries:
- data_dict: Contains the image volumes for each query
- mask_dict: Contains lesion masks, prostate zones, and binary prostate mask
(None, None) if required data is missing or invalid

Function `get_global_features`

get_global_features(image_type: str, img_vol: Volume, mask: ndarray, anatomy_resampled: Volume, softmax_resampled: Volume) -> dict[str, UnionType[float64, Any]]

Calculates global statistical features from medical imaging data.

Computes statistical features across entire prostate regions and the different anatomical zones, including intensity statistics from both the raw image data and the softmax probability maps.

TODO: Allow for empty masks

Args:

image_type (str): Type of medical image (e.g., 'adc', 'dwi', 'verdict', 't2')
img_vol (Volume): 3D image volume containing the raw imaging data
mask (np.ndarray): Binary mask array identifying all lesion voxels
anatomy_resampled (Volume): Resampled prostate zone segmentation mask matching image dimensions
softmax_resampled (Volume): Resampled softmax probability maps for lesion detection

Returns: dict[str, np.float64 | Any]: Dictionary containing computed global features with keys:

'_zone_all': Mean intensity in each prostate zone
'__all': Global statistical measures from image data
'_softmax__all': Statistics from softmax values
'_size_all': Total volume of all lesions

Function `get_lesion_wise_features`

get_lesion_wise_features(image_type: str, lesion_id: int, img_vol: Volume, sub_mask: ndarray, anatomy_resampled: Volume, softmax_resampled: Volume) -> dict[str, UnionType[float64, Any]]

Determines lesion-wise features from medical imaging data.

Calculates quantitative features for each individual lesion, including location, size, volume, zonal distribution, and various statistical measures of image intensity and softmax probability values.

TODO: Allow for empty masks

Args:

image_type (str): Type of medical image (e.g., 'adc', 'dwi', 'verdict', 't2')
lesion_id (int): Unique identifier for the current lesion being analyzed
img_vol (Volume): 3D image volume containing the raw imaging data
sub_mask (np.ndarray): Binary mask array identifying the voxels of this specific lesion
anatomy_resampled (Volume): Resampled prostate zone segmentation mask matching image dimensions
softmax_resampled (Volume): Resampled softmax probability maps for lesion detection

Returns: dict[str, np.float64 | Any]: Dictionary containing computed features with keys:

'lesion_location': Broad zone location (1=PZ, 2=other zones)
'lesion_precise_location': Specific zone value
'lesion_coords_': Centroid coordinates
'size': Lesion size in mm³
'lesionvolume': Lesion volume in ml
'zoneprop': Proportion in each zone
Various statistical features from image intensities and softmax values
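The geometric features in this list (size in mm³, volume in ml, centroid, per-zone proportion) follow directly from the lesion mask, the zone labels, and the voxel spacing. The sketch below shows one way to derive them on plain NumPy arrays. It is an illustration only: the function name, the `spacing_mm` argument, the output keys, and the zone label values are assumptions, and, like the documented function, it does not handle an empty mask.

```python
import numpy as np


def lesion_geometry(sub_mask, anatomy, spacing_mm):
    """Size, volume, centroid and zone proportions of one lesion (hypothetical helper)."""
    voxels = sub_mask.astype(bool)
    n = int(voxels.sum())  # assumed > 0; empty masks are not handled here
    size_mm3 = n * float(np.prod(spacing_mm))
    # Count how many lesion voxels fall into each anatomical label.
    labels, counts = np.unique(anatomy[voxels], return_counts=True)
    return {
        "size_mm3": size_mm3,
        "volume_ml": size_mm3 / 1000.0,  # 1 ml = 1000 mm^3
        "centroid_index": tuple(float(c) for c in np.argwhere(voxels).mean(axis=0)),
        "zone_proportion": {int(l): float(c) / n for l, c in zip(labels, counts)},
    }


anatomy = np.zeros((4, 4, 4), dtype=int)
anatomy[:2] = 1  # assumed label for one zone
anatomy[2:] = 2  # assumed label for another zone
lesion = np.zeros((4, 4, 4), dtype=np.uint8)
lesion[1:3, 0, 0] = 1  # two voxels, one in each zone
geometry = lesion_geometry(lesion, anatomy, (0.5, 0.5, 4.0))
print(geometry)
```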

Function `get_study_list`

get_study_list(studies_path: str, patient_csv: UnionType[str, None] = None) -> list

Get a list of study paths from a directory, optionally filtered by a CSV file.

This function returns a sorted list of study paths either from:

A CSV file containing patient IDs and additional data
All subdirectories in the specified studies directory

Args:

studies_path (str): Base directory containing all study folders
patient_csv (str, optional): Path to CSV file containing patient IDs. If provided, only studies matching IDs in this file are included. The CSV must have an 'id' column. Defaults to None.

Returns: list: Sorted list of absolute paths to study directories
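The two modes described above can be sketched with the standard library. This is a hedged approximation, not the actual implementation: `list_studies` is a hypothetical name, and treating each CSV 'id' as a folder name directly under `studies_path` (and silently skipping IDs with no matching folder) is an assumption.

```python
import csv
import os
import tempfile


def list_studies(studies_path, patient_csv=None):
    """Sorted absolute study paths, optionally restricted to ids listed in a CSV."""
    if patient_csv is not None:
        with open(patient_csv, newline="") as fh:
            candidates = [row["id"] for row in csv.DictReader(fh)]
    else:
        candidates = os.listdir(studies_path)
    # Keep only names that correspond to an existing study directory.
    paths = [os.path.abspath(os.path.join(studies_path, name)) for name in candidates]
    return sorted(p for p in paths if os.path.isdir(p))


with tempfile.TemporaryDirectory() as root:
    for name in ("P002", "P001", "P003"):
        os.mkdir(os.path.join(root, name))
    csv_path = os.path.join(root, "patients.csv")
    with open(csv_path, "w", newline="") as fh:
        fh.write("id\nP003\nP001\nP999\n")
    all_names = [os.path.basename(p) for p in list_studies(root)]
    csv_names = [os.path.basename(p) for p in list_studies(root, csv_path)]

print(all_names)  # ['P001', 'P002', 'P003']
print(csv_names)  # ['P001', 'P003']
```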

Function `select_features_into_df`

select_features_into_df(df: DataFrame, exclude_num_first_columns: int, features_used: UnionType[list, None], use_individual_lesions: bool = True) -> tuple[DataFrame, Series, Series, Series, Series]

Select and prepare feature columns from a DataFrame for model training/evaluation.

Args:

df (pd.DataFrame): Input DataFrame containing all features and metadata
exclude_num_first_columns (int): Number of initial columns to exclude from features
features_used (list | None): List of feature prefixes to include. If None, features are not subselected by global or individual-lesion level. If an empty list, the use_individual_lesions setting is applied.
use_individual_lesions (bool, optional): Whether to use individual lesion features vs global features. Defaults to True.

Returns: tuple[pd.DataFrame,pd.Series,pd.Series,pd.Series,pd.Series]: Tuple containing:

features: DataFrame with selected features
gt: Ground truth cancer status
patient_ids: Series of patient IDs
isup: ISUP grades
pirads: PI-RADS scores
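The core of the selection, dropping leading metadata columns and keeping columns by prefix, can be sketched in pandas. This is a simplified illustration rather than the function's real logic: `select_by_prefix` is a hypothetical name, the metadata column names ('id', 'cancer', 'isup', 'pirads') are assumed from the CSV columns documented for `get_clinical_data`, and the `use_individual_lesions` switch is not modelled.

```python
import pandas as pd


def select_by_prefix(df, exclude_num_first_columns, features_used):
    """Drop the leading metadata columns, then keep columns matching a prefix."""
    features = df.iloc[:, exclude_num_first_columns:]
    if features_used:  # simplification: None or an empty list keeps every column
        prefixes = tuple(features_used)
        keep = [col for col in features.columns if col.startswith(prefixes)]
        features = features[keep]
    # Assumed metadata column names; the real DataFrame layout is not documented here.
    return features, df["cancer"], df["id"], df["isup"], df["pirads"]


df = pd.DataFrame(
    {
        "id": ["P001", "P002"],
        "cancer": [True, False],
        "isup": [2, 1],
        "pirads": [4, 2],
        "adc_mean": [0.8, 1.2],
        "adc_min": [0.3, 0.6],
        "t2_mean": [110.0, 140.0],
    }
)
features, gt, patient_ids, isup, pirads = select_by_prefix(df, 4, ["adc"])
print(list(features.columns))
```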

Function `top_10`

top_10(x: ndarray) -> float64

Function `top_100`

top_100(x: ndarray) -> float64

Function `top_100_mean`

top_100_mean(x: ndarray) -> float64

Function `top_10_mean`

top_10_mean(x: ndarray) -> float64

Function `top_10p_mean`

top_10p_mean(x: ndarray) -> float64

Computes the mean of the top 10 percent of values.

Args: x (np.ndarray): Input data.

Returns: np.float64: Mean of the top 10 percent of values.
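One way to compute such a value without sorting the whole array is `np.partition`. The sketch below is an assumption-laden illustration, not the library's code: `top_fraction_mean` is a hypothetical name, and the cutoff of `int(n * 0.10)` values (at least one) is a guess at what "top 10 percent" means here.

```python
import numpy as np


def top_fraction_mean(x: np.ndarray, fraction: float = 0.10) -> np.float64:
    """Mean of the largest `fraction` of values in `x` (hypothetical helper)."""
    values = np.asarray(x, dtype=np.float64).ravel()
    # Assumption: keep at least one value for short, non-empty inputs.
    k = max(1, int(values.size * fraction))
    # np.partition moves the k largest values into the last k slots (unordered).
    return np.partition(values, values.size - k)[-k:].mean()


data = np.arange(1, 21)  # 1, 2, ..., 20
print(top_fraction_mean(data))  # mean of the two largest values, 19 and 20
```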
