cryovit.run

Functions to run feature extraction, training, evaluation, and inference for users.

Functions

`run_dino`(train_data, result_dir, batch_size)	Run DINO feature extraction on the specified training data and save the results as .hdf files.
`run_evaluation`(test_data, test_labels, ...)	Run evaluation on the specified test data and labels, saving result metrics as a .csv file.
`run_inference`(data_files, model_path, result_dir)	Run inference on the specified data files and save the results.
`run_training`(train_data, train_labels, ...)	Run training on the specified data and labels.

run_dino(train_data: list[Path], result_dir: Path, batch_size: int, window_size: int | None = 630, visualize: bool = False) → None

Run DINO feature extraction on the specified training data and save the results as .hdf files. Each saved result file contains data, dino_features, and any labels present in the source tomogram in the labels/ group.

Parameters:

train_data (list[Path]) – List of paths to the training tomograms.
result_dir (Path) – Directory where the results will be saved.
batch_size (int) – Number of samples to process in each batch.
window_size (Optional[int], optional) – Size of the sliding window for feature extraction. Defaults to 630. If None, uses the default size.
visualize (bool, optional) – Whether to visualize the extracted features. Defaults to False.
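
A minimal usage sketch for run_dino, under stated assumptions: the import path `cryovit.run` is taken from the module name above, while `collect_tomograms`, `extract_features`, and the `*.hdf` input pattern are hypothetical and only illustrate how the `train_data` list might be assembled. The import is deferred so that the path-collecting helper can be used without the package installed.

```python
from pathlib import Path


def collect_tomograms(data_dir: Path, pattern: str = "*.hdf") -> list[Path]:
    """Return the files in data_dir matching pattern, sorted by name.

    Hypothetical helper: the "*.hdf" pattern is an assumption about how the
    input tomograms are stored, not something the documentation states.
    """
    return sorted(p for p in data_dir.glob(pattern) if p.is_file())


def extract_features(data_dir: Path, result_dir: Path, batch_size: int = 2) -> None:
    """Run DINO feature extraction on every tomogram found in data_dir."""
    # Deferred import so collect_tomograms stays usable without cryovit.
    from cryovit.run import run_dino

    train_data = collect_tomograms(data_dir)
    if not train_data:
        raise FileNotFoundError(f"No tomograms found in {data_dir}")

    # Creating the output directory up front is a precaution, not a
    # documented requirement of run_dino.
    result_dir.mkdir(parents=True, exist_ok=True)

    # window_size=630 and visualize=False restate the documented defaults.
    run_dino(train_data, result_dir, batch_size, window_size=630, visualize=False)
```

Sorting the paths keeps the processing order reproducible between runs, which makes it easier to match result files to their source tomograms.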

run_evaluation(test_data: list[Path], test_labels: list[Path], labels: list[str], model_path: Path, result_dir: Path, visualize: bool = True) → Path

Run evaluation on the specified test data and labels, saving result metrics as a .csv file.

Parameters:

test_data (list[Path]) – List of paths to the test tomograms.
test_labels (list[Path]) – List of paths to the test labels.
labels (list[str]) – List of label names to evaluate.
model_path (Path) – Path to the trained model file.
result_dir (Path) – Directory where the evaluation results will be saved.
visualize (bool, optional) – Whether to visualize the evaluation results. Defaults to True.

Returns:

Path to the evaluation results file.

Return type:

Path
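
Because run_evaluation returns the path to a .csv file, the metrics can be loaded with the standard library. The sketch below assumes the file has a header row; its column names are not documented here, so rows are read generically as dictionaries. `read_metrics` and `evaluate_and_load` are hypothetical names introduced for illustration.

```python
import csv
from pathlib import Path


def read_metrics(csv_path: Path) -> list[dict[str, str]]:
    """Load a metrics .csv into a list of rows keyed by column header.

    Assumes the first line of the file is a header row.
    """
    with csv_path.open(newline="") as handle:
        return list(csv.DictReader(handle))


def evaluate_and_load(
    test_data: list[Path],
    test_labels: list[Path],
    labels: list[str],
    model_path: Path,
    result_dir: Path,
) -> list[dict[str, str]]:
    """Run evaluation, then read back the metrics file it reports."""
    # Deferred import so read_metrics stays usable without cryovit.
    from cryovit.run import run_evaluation

    metrics_path = run_evaluation(
        test_data, test_labels, labels, model_path, result_dir, visualize=False
    )
    return read_metrics(metrics_path)
```

All values come back as strings; convert numeric columns explicitly once the actual column names are known.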

run_inference(data_files: list[Path], model_path: Path, result_dir: Path, threshold: float = 0.5) → list[Path]

Run inference on the specified data files and save the results.

Parameters:

data_files (list[Path]) – List of paths to the input data files.
model_path (Path) – Path to the trained model file.
result_dir (Path) – Directory where the inference results will be saved.
threshold (float, optional) – Threshold for binary classification. Defaults to 0.5.

Returns:

List of paths to the saved result files.

Return type:

list[Path]
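
To illustrate what the `threshold` parameter controls, the sketch below binarizes a list of predicted probabilities. This is an assumption about the general technique, not a description of cryovit's internals: in particular, whether the comparison is strict (`>`) or inclusive (`>=`) is not documented, and the strict form is used here. `binarize` and `segment` are hypothetical names.

```python
from pathlib import Path


def binarize(probabilities: list[float], threshold: float = 0.5) -> list[int]:
    """Map each probability to 1 if it exceeds threshold, otherwise 0.

    Assumption: strict comparison; the library may treat ties differently.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"threshold must be in [0, 1], got {threshold}")
    return [1 if p > threshold else 0 for p in probabilities]


def segment(
    data_files: list[Path],
    model_path: Path,
    result_dir: Path,
    threshold: float = 0.5,
) -> list[Path]:
    """Run inference and return the paths of the saved result files."""
    # Deferred import so binarize stays usable without cryovit.
    from cryovit.run import run_inference

    return run_inference(data_files, model_path, result_dir, threshold=threshold)
```

Lowering the threshold marks more voxels as positive, trading precision for recall; raising it does the opposite.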

run_training(train_data: list[Path], train_labels: list[Path], labels: list[str], model_type: ModelType, model_name: str, label_key: str, result_dir: Path, val_data: list[Path] | None = None, val_labels: list[Path] | None = None, num_epochs: int = 50, log_training: bool = False) → Path

Run training on the specified data and labels.

Parameters:

train_data (list[Path]) – List of paths to the training tomograms.
train_labels (list[Path]) – List of paths to the training labels.
labels (list[str]) – List of label names to train on.
model_type (ModelType) – Type of the model to train.
model_name (str) – Name of the model.
label_key (str) – Key for the label in the dataset.
result_dir (Path) – Directory where the training results will be saved.
val_data (Optional[list[Path]], optional) – List of paths to the validation tomograms. Defaults to None.
val_labels (Optional[list[Path]], optional) – List of paths to the validation labels. Defaults to None.
num_epochs (int, optional) – Number of training epochs. Defaults to 50.
log_training (bool, optional) – Whether to log training metrics to Tensorboard. Defaults to False.

Returns:

Path to the saved model file.

Return type:

Path
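
A hedged sketch of calling run_training with basic argument checks. It assumes that tomograms and label files are paired one-to-one and that validation data and labels are supplied together; the signature suggests this but does not state it. `check_paired` and `train` are hypothetical names, and `model_type` is passed through unchanged because the members of ModelType are not listed here.

```python
from pathlib import Path
from typing import Optional


def check_paired(
    data: Optional[list[Path]], labels: Optional[list[Path]], name: str
) -> None:
    """Raise ValueError unless data and labels are both None or equally long.

    Assumption: each tomogram has exactly one matching label file.
    """
    if (data is None) != (labels is None):
        raise ValueError(f"{name}: data and labels must be given together")
    if data is not None and len(data) != len(labels):
        raise ValueError(
            f"{name}: {len(data)} data files but {len(labels)} label files"
        )


def train(
    train_data, train_labels, labels, model_type, model_name, label_key,
    result_dir, val_data=None, val_labels=None,
) -> Path:
    """Validate the inputs, run training, and return the saved model path."""
    # Deferred import so check_paired stays usable without cryovit.
    from cryovit.run import run_training

    check_paired(train_data, train_labels, "train")
    check_paired(val_data, val_labels, "val")

    # num_epochs=50 and log_training=False restate the documented defaults.
    return run_training(
        train_data, train_labels, labels, model_type, model_name, label_key,
        result_dir, val_data=val_data, val_labels=val_labels,
        num_epochs=50, log_training=False,
    )
```

The returned path can be passed directly as `model_path` to run_evaluation or run_inference.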