cryovit.run
Functions to run feature extraction, training, evaluation, and inference for users.
Functions
| Function | Description |
| --- | --- |
| run_dino | Run DINO feature extraction on the specified training data and save the results as .hdf files. |
| run_evaluation | Run evaluation on the specified test data and labels, saving result metrics as a .csv file. |
| run_inference | Run inference on the specified data files and save the results. |
| run_training | Run training on the specified data and labels. |
- run_dino(train_data: list[Path], result_dir: Path, batch_size: int, window_size: int | None = 630, visualize: bool = False) → None
Run DINO feature extraction on the specified training data and save the results as .hdf files. Each saved result file contains data, dino_features, and, in the labels/ group, any labels present in the source tomogram.
- Parameters:
train_data (list[Path]) – List of paths to the training tomograms.
result_dir (Path) – Directory where the results will be saved.
batch_size (int) – Number of samples to process in each batch.
window_size (Optional[int], optional) – Size of the sliding window for feature extraction. Defaults to 630. If None, the default size is used.
visualize (bool, optional) – Whether to visualize the extracted features. Defaults to False.
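A call to run_dino might look like the minimal sketch below. The helper names collect_tomograms and extract_features, the file suffixes, and the batch_size value are assumptions made for illustration; only the keyword arguments passed to run_dino come from the signature above.

```python
from pathlib import Path


def collect_tomograms(
    data_dir: Path, suffixes: tuple[str, ...] = (".hdf", ".mrc")
) -> list[Path]:
    """Return the tomogram files in data_dir, sorted for a reproducible order.

    The suffixes are an assumption for this sketch, not a documented list.
    """
    return sorted(
        p for p in data_dir.iterdir() if p.is_file() and p.suffix in suffixes
    )


def extract_features(data_dir: Path, result_dir: Path) -> None:
    """Call run_dino with the parameters documented above (hypothetical wrapper)."""
    # Imported lazily so the helper above works without cryovit installed.
    from cryovit.run import run_dino

    # Creating the directory up front is a precaution, not a documented requirement.
    result_dir.mkdir(parents=True, exist_ok=True)
    run_dino(
        train_data=collect_tomograms(data_dir),
        result_dir=result_dir,
        batch_size=4,     # illustrative value; tune to available memory
        window_size=630,  # the default from the signature
        visualize=False,
    )
```

Sorting the paths keeps the processing order stable between runs, which makes the saved result files easier to compare.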
- run_evaluation(test_data: list[Path], test_labels: list[Path], labels: list[str], model_path: Path, result_dir: Path, visualize: bool = True) → Path
Run evaluation on the specified test data and labels, saving result metrics as a .csv file.
- Parameters:
test_data (list[Path]) – List of paths to the test tomograms.
test_labels (list[Path]) – List of paths to the test labels.
labels (list[str]) – List of label names to evaluate.
model_path (Path) – Path to the trained model file.
result_dir (Path) – Directory where the evaluation results will be saved.
visualize (bool, optional) – Whether to visualize the evaluation results. Defaults to True.
- Returns:
Path to the evaluation results file.
- Return type:
Path
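Because run_evaluation returns a path, the metrics can be read back right after the call. The sketch below assumes the returned path points at the .csv file itself and makes no assumption about column names; load_metrics and evaluate are hypothetical helpers, not part of cryovit.

```python
import csv
from pathlib import Path


def load_metrics(csv_path: Path) -> list[dict[str, str]]:
    """Read a metrics .csv into a list of rows keyed by column name."""
    with csv_path.open(newline="") as handle:
        return list(csv.DictReader(handle))


def evaluate(
    test_data: list[Path],
    test_labels: list[Path],
    labels: list[str],
    model_path: Path,
    result_dir: Path,
) -> list[dict[str, str]]:
    """Run evaluation and return the metric rows (hypothetical wrapper)."""
    # Imported lazily so load_metrics works without cryovit installed.
    from cryovit.run import run_evaluation

    metrics_path = run_evaluation(
        test_data=test_data,
        test_labels=test_labels,
        labels=labels,
        model_path=model_path,
        result_dir=result_dir,
        visualize=False,  # the documented default is True
    )
    return load_metrics(metrics_path)
```

All values come back as strings, so convert numeric columns explicitly before comparing scores.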
- run_inference(data_files: list[Path], model_path: Path, result_dir: Path, threshold: float = 0.5) → list[Path]
Run inference on the specified data files and save the results.
- Parameters:
data_files (list[Path]) – List of paths to the input data files.
model_path (Path) – Path to the trained model file.
result_dir (Path) – Directory where the inference results will be saved.
threshold (float, optional) – Threshold for binary classification. Defaults to 0.5.
- Returns:
List of paths to the saved result files.
- Return type:
list[Path]
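The effect of threshold can be illustrated with a small stand-alone sketch. The binarize function is not cryovit's implementation; it only shows the usual meaning of a binary-classification threshold, and treating scores equal to the threshold as positive is an assumption. The segment wrapper is also hypothetical and passes only the documented arguments.

```python
from pathlib import Path


def binarize(probabilities: list[float], threshold: float = 0.5) -> list[int]:
    """Map scores to 0/1; scores at or above the threshold become 1 (assumption)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def segment(
    data_files: list[Path],
    model_path: Path,
    result_dir: Path,
    threshold: float = 0.5,
) -> list[Path]:
    """Run inference and return the saved result paths (hypothetical wrapper)."""
    # Imported lazily so binarize works without cryovit installed.
    from cryovit.run import run_inference

    return run_inference(
        data_files=data_files,
        model_path=model_path,
        result_dir=result_dir,
        threshold=threshold,
    )
```

For example, binarize([0.2, 0.5, 0.9]) gives [0, 1, 1], while binarize([0.2, 0.5, 0.9], threshold=0.6) gives [0, 0, 1]: a higher threshold keeps fewer positive predictions.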
- run_training(train_data: list[Path], train_labels: list[Path], labels: list[str], model_type: ModelType, model_name: str, label_key: str, result_dir: Path, val_data: list[Path] | None = None, val_labels: list[Path] | None = None, num_epochs: int = 50, log_training: bool = False) → Path
Run training on the specified data and labels.
- Parameters:
train_data (list[Path]) – List of paths to the training tomograms.
train_labels (list[Path]) – List of paths to the training labels.
labels (list[str]) – List of label names to train on.
model_type (ModelType) – Type of the model to train.
model_name (str) – Name of the model.
label_key (str) – Key for the label in the dataset.
result_dir (Path) – Directory where the training results will be saved.
val_data (Optional[list[Path]], optional) – List of paths to the validation tomograms. Defaults to None.
val_labels (Optional[list[Path]], optional) – List of paths to the validation labels. Defaults to None.
num_epochs (int, optional) – Number of training epochs. Defaults to 50.
log_training (bool, optional) – Whether to log training metrics to Tensorboard. Defaults to False.
- Returns:
Path to the saved model file.
- Return type:
Path
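Checking the inputs before a long training run can catch mismatched file lists early. The sketch below assumes that data and label lists are paired by position and that val_data and val_labels are supplied together; neither rule is stated in this reference. check_training_inputs and train are hypothetical helpers, and model_type is passed in by the caller because the ModelType members are not listed here.

```python
from pathlib import Path
from typing import Optional


def check_training_inputs(
    train_data: list[Path],
    train_labels: list[Path],
    val_data: Optional[list[Path]] = None,
    val_labels: Optional[list[Path]] = None,
) -> None:
    """Raise ValueError if the file lists look inconsistent (assumed pairing rules)."""
    if len(train_data) != len(train_labels):
        raise ValueError("train_data and train_labels differ in length")
    if (val_data is None) != (val_labels is None):
        raise ValueError("val_data and val_labels must be given together")
    if val_data is not None and len(val_data) != len(val_labels):
        raise ValueError("val_data and val_labels differ in length")


def train(train_data, train_labels, labels, model_type, model_name,
          label_key, result_dir, val_data=None, val_labels=None) -> Path:
    """Validate inputs, then call run_training (hypothetical wrapper)."""
    # Imported lazily so the check above works without cryovit installed.
    from cryovit.run import run_training

    check_training_inputs(train_data, train_labels, val_data, val_labels)
    return run_training(
        train_data=train_data,
        train_labels=train_labels,
        labels=labels,
        model_type=model_type,  # a ModelType member chosen by the caller
        model_name=model_name,
        label_key=label_key,
        result_dir=result_dir,
        val_data=val_data,
        val_labels=val_labels,
        num_epochs=50,       # the documented default
        log_training=False,  # set True to log metrics to Tensorboard
    )
```

The returned path is the saved model file, which can be passed as model_path to run_evaluation or run_inference.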