Evaluation Procedure in VIFY
The evaluation procedure in the vify repository is implemented in the script scripts/evaluate.py. This script is responsible for assessing the performance of trained models using configurations specified in the config/eval_config.yaml file. Below is a detailed explanation of the evaluation process, its components, and how to use it effectively.
Table of Contents
- Overview
- Configuration
- Evaluation Workflow
- Logging
- Output Structure
- How to Run the Evaluation Script
- Code Walkthrough
Overview
The evaluation script is designed to:
- Load configurations using Hydra and OmegaConf.
- Extract features from the dataset.
- Evaluate the model by predicting on the full dataset.
- Compute evaluation metrics such as AUC (Area Under the Curve).
- Generate visualizations like ROC curves, scatter plots, and violin plots.
- Save evaluation results for further analysis.
Configuration
The evaluation script relies on a configuration file located at config/eval_config.yaml. This file defines all the parameters required for evaluation; a detailed explanation of each parameter can also be found in the config file itself.
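To inspect the resolved parameters without running the script, the file can be loaded directly with OmegaConf; a minimal sketch, assuming it is run from the repository root:

from omegaconf import OmegaConf

# Load the evaluation config and dump it as YAML for inspection.
cfg = OmegaConf.load("config/eval_config.yaml")
print(OmegaConf.to_yaml(cfg))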
Evaluation Workflow
The evaluation process follows these steps:
1. Load Configuration: The configuration is loaded using Hydra and converted into a Python object for easier manipulation.
2. Feature Extraction: Features are extracted from the dataset using the extract_features_from_studies function if extract_features is set to True. Otherwise, features are loaded from disk.
3. Model Evaluation: The model is evaluated on the full dataset.
4. Metrics Computation: Metrics such as AUC, sensitivity, and specificity are computed (see the sketch after this list).
5. Visualization: ROC curves, scatter plots, and violin plots are generated based on the evaluation results.
6. Save Results: The evaluation results, metrics, and visualizations are saved to the specified output directory.
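The actual metric and plotting code lives inside vify; purely as an illustration of what steps 4 and 5 involve, the sketch below computes AUC, sensitivity, and specificity and draws a ROC curve with scikit-learn and matplotlib. The arrays y_true and y_score are placeholders for the ground-truth labels and predicted probabilities; the real script derives them from the model's predictions on the extracted features.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve

# Placeholder labels and scores standing in for the model's predictions.
y_true = np.array([0, 0, 1, 1, 0, 1])
y_score = np.array([0.10, 0.40, 0.35, 0.80, 0.20, 0.90])

auc = roc_auc_score(y_true, y_score)

# Sensitivity and specificity at a fixed decision threshold.
y_pred = (y_score >= 0.5).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

# ROC curve.
fpr, tpr, _ = roc_curve(y_true, y_score)
plt.plot(fpr, tpr, label=f"AUC = {auc:.2f}")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.savefig("roc_curve.png")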
Logging
The script supports two logging mechanisms:
- ClearML: If use_clearml is enabled, the script initializes a ClearML task and logger using the get_task_and_logger function. This allows for detailed experiment tracking, including metrics, hyperparameters, and artifacts.
- Standard Logging: If ClearML is not used, the script sets up standard Python logging with the specified logging level.
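get_task_and_logger is a helper inside the repository, so its exact signature is not shown here; the sketch below illustrates the two underlying paths using the public ClearML and standard-library logging APIs (project and task names are placeholders):

import logging

use_clearml = True  # mirrors the use_clearml flag in eval_config.yaml

if use_clearml:
    from clearml import Task

    # Register the run with ClearML so metrics, parameters, and artifacts are tracked.
    task = Task.init(project_name="vify", task_name="evaluation")
    logger = task.get_logger()
    logger.report_scalar(title="metrics", series="auc", value=0.87, iteration=0)  # placeholder value
else:
    # Plain Python logging at the configured level.
    logging.basicConfig(level=logging.INFO)
    logging.getLogger(__name__).info("Starting evaluation")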
Output Structure
The outputs of the evaluation process are stored in:
<output_path>/<project_name>/<task_name>/<timestamp>/
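Assuming the placeholders map directly onto configuration values plus a timestamp taken at run time (the exact naming scheme is determined by the script), the structure can be reproduced roughly like this:

from datetime import datetime
from pathlib import Path

# Illustrative values; in the script these come from the configuration.
output_path = Path("outputs")
project_name = "vify"
task_name = "evaluation"
timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

experiment_path = output_path / project_name / task_name / timestamp
experiment_path.mkdir(parents=True, exist_ok=True)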
How to Run the Evaluation Script
To run the evaluation script, use the following command:
python scripts/evaluate.py
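Because the script is a Hydra entry point, individual parameters can also be overridden on the command line instead of editing the YAML file; for example (assuming extract_features and features_path are top-level keys in eval_config.yaml, as the code below suggests):
python scripts/evaluate.py extract_features=false features_path=/path/to/precomputed_features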
Code Walkthrough
1. Configuration Loading
@hydra.main(
    version_base=None,
    config_path="../config",
    config_name="eval_config",
)
def run_evaluation(omega_config: OmegaConf) -> None:
    config = cast(BaseConfig, OmegaConf.to_object(omega_config))
- The @hydra.main decorator loads the configuration file.
- The configuration is converted into a Python object for easier manipulation.
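Hydra-decorated functions are typically invoked through a standard entry point at the bottom of the script; a minimal sketch of what that usually looks like (the actual guard in scripts/evaluate.py may differ):

if __name__ == "__main__":
    # Hydra parses the command line, loads eval_config.yaml, and calls the function.
    run_evaluation()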
2. Feature Extraction
if config.extract_features:
    for image_type in config.image_types:
        extract_features_from_studies(
            studies_path=config.studies_path,
            patient_csv=config.patient_csv,
            output_path=experiment_path,
            segmentation_method=config.lesion_segmentation_method.id,
            queries=config.data_queries.kwargs[image_type],
            image_type=image_type,
            feature_extraction_methods=config.feature_extraction_methods,
            lesion_cutoff=config.lesion_segmentation_method.softmax_cut_off,
            dilation_factor=config.lesion_segmentation_method.dilation_factor,
        )
    features_path = experiment_path
elif config.features_path is not None:
    features_path = str(config.features_path)
else:
    raise ValueError(
        "Either extract features must be enabled or feature path must be provided."
    )
- Features are either extracted from the dataset or loaded from disk, depending on the configuration.
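The on-disk format of the extracted features is defined by the repository's feature-extraction code and is not documented here; purely as a hypothetical illustration, if each image type produced one CSV under features_path, the combined table (all_features_df, used in the next step) could be assembled along these lines:

from pathlib import Path

import pandas as pd

features_path = "outputs/features"  # placeholder; set by the code above

# Hypothetical layout: one CSV of extracted features per image type.
feature_files = sorted(Path(features_path).glob("*.csv"))
all_features_df = pd.concat(
    [pd.read_csv(path) for path in feature_files],
    ignore_index=True,
)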
3. Model Evaluation
# Load the trained model from the path given in the configuration.
with open(config.model_path, "rb") as f:
    model = pickle.load(f)

# Run a single evaluation over the full feature table.
evaluate_full(
    csv=all_features_df,
    output_path=experiment_path,
    model=model,  # type: ignore
    training_threshold=config.classifier_threshold,
    num_predict_lesions=config.num_predict_lesions,
    plot_dict=config.plots_params.kwargs,
    exclude_num_first_columns=config.exclude_num_first_columns,
)
- The evaluate_full function performs a single evaluation over the full dataset, computing predictions and metrics.
- The model is loaded from disk from the model_path specified in the configuration.
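pickle.load only succeeds if the classes of the stored model are importable in the evaluation environment, and pickles should only be loaded from trusted sources. Assuming the model follows the usual scikit-learn estimator interface (an assumption, since the model type is simply whatever was pickled during training), a quick sanity check before evaluation could look like this:

import pickle

model_path = "model.pkl"  # placeholder; the script uses config.model_path

with open(model_path, "rb") as f:
    model = pickle.load(f)

# Most scikit-learn-style classifiers expose predict_proba, which ROC/AUC plots rely on.
if not hasattr(model, "predict_proba"):
    raise TypeError(f"{type(model).__name__} does not provide predict_proba")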
This documentation provides a comprehensive guide to the evaluation procedure in the vify repository. For further details, refer to the source code in scripts/evaluate.py and the configuration file at config/eval_config.yaml.