cryovit.config
Hydra configuration classes for CryoViT experiments.
Functions
| validate_dino_config | Validates the configuration for DINOv2 feature extraction. |
| validate_experiment_config | Validates an experiment configuration. |
Classes
| BaseDataModule | Base configuration for datasets in CryoViT experiments. |
| BaseExperimentConfig | Base configuration for running CryoViT experiments. |
| BaseModel | Base configuration for models used in CryoViT experiments. |
| BaseTrainer | Base configuration for the trainer used in CryoViT experiments. |
| DinoFeaturesConfig | Base configuration for computing DINOv2 features in CryoViT experiments. |
| ExperimentPaths | Configuration for managing experiment paths in CryoViT experiments. |
- class BaseModel(_target_: str = '???', name: str = '???', input_key: str = '???', model_dir: Path | None = None, lr: float = '???', weight_decay: float = 0.001, losses: dict = '???', metrics: dict = '???', custom_kwargs: dict | None = None)
Bases: object
Base configuration for models used in CryoViT experiments.
- name
Name of the model for identification purposes.
- Type:
str
- input_key
Key to get the input data from a tomogram.
- Type:
str
- model_dir
Optional directory to download model weights to (for SAMv2 models).
- Type:
Optional[Path]
- lr
Learning rate for the model training.
- Type:
float
- weight_decay
Weight decay (L2 penalty) rate. Default is 1e-3.
- Type:
float
- losses
Configurations for loss functions used in training.
- Type:
dict[str, Any]
- metrics
Configurations for metrics used during model evaluation.
- Type:
dict[str, Any]
- custom_kwargs
Optional dictionary of custom keyword arguments to pass to the model.
- Type:
Optional[dict[str, Any]]
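The '???' defaults in the signature above are the string OmegaConf uses to mark a mandatory value that must be supplied before the configuration is used. The following is a minimal, standard-library-only sketch of that idea, not the CryoViT implementation: `ModelConfigSketch`, `missing_fields`, and the example values are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, fields
from pathlib import Path
from typing import Any, Optional

MISSING = "???"  # same string OmegaConf uses to mark a mandatory value


@dataclass
class ModelConfigSketch:
    """Hypothetical stand-in that mirrors the BaseModel fields listed above."""

    _target_: str = MISSING
    name: str = MISSING
    input_key: str = MISSING
    model_dir: Optional[Path] = None
    lr: Any = MISSING        # a float once filled in
    weight_decay: float = 1e-3
    losses: Any = MISSING    # a dict[str, Any] once filled in
    metrics: Any = MISSING   # a dict[str, Any] once filled in
    custom_kwargs: Optional[dict] = None


def missing_fields(cfg) -> list:
    """Return the names of fields that still hold the mandatory-value marker."""
    return [f.name for f in fields(cfg) if getattr(cfg, f.name) == MISSING]


# Example values are invented; only three of the mandatory fields are set.
cfg = ModelConfigSketch(name="demo_model", input_key="data", lr=1e-3)
unset = missing_fields(cfg)
print(unset)
```

Fields with real defaults (`weight_decay`, `model_dir`, `custom_kwargs`) never appear in the result; only the fields left at '???' do.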
- class BaseTrainer(_target_: str = 'pytorch_lightning.Trainer', accelerator: str = 'gpu', devices: str = '1', precision: str = '16-mixed', default_root_dir: Path | None = None, max_epochs: int | None = None, enable_checkpointing: bool = False, enable_model_summary: bool = True, log_every_n_steps: int | None = None)
Bases: object
Base configuration for the trainer used in CryoViT experiments.
- accelerator
Type of hardware acceleration. Default is ‘gpu’.
- Type:
str
- devices
Number of devices to use for training. Default is ‘1’.
- Type:
str
- precision
Precision configuration for training (e.g., ‘16-mixed’).
- Type:
str
- default_root_dir
Default root directory for saving checkpoints and logs.
- Type:
Optional[Path]
- max_epochs
The maximum number of epochs to train for.
- Type:
Optional[int]
- enable_checkpointing
Flag to enable or disable model checkpointing. Default is False.
- Type:
bool
- enable_model_summary
Enable model summarization. Default is True.
- Type:
bool
- log_every_n_steps
Frequency of logging in terms of training steps.
- Type:
Optional[int]
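In Hydra, the `_target_` key names the callable to instantiate and the remaining keys are passed to it as keyword arguments. The sketch below illustrates that split using only the defaults from the signature above; it imports neither Hydra nor PyTorch Lightning, and `split_target` is a hypothetical helper.

```python
# Defaults copied from the BaseTrainer signature above.
trainer_cfg = {
    "_target_": "pytorch_lightning.Trainer",
    "accelerator": "gpu",
    "devices": "1",
    "precision": "16-mixed",
    "default_root_dir": None,
    "max_epochs": None,
    "enable_checkpointing": False,
    "enable_model_summary": True,
    "log_every_n_steps": None,
}


def split_target(cfg: dict) -> tuple:
    """Separate the dotted import path from the keyword arguments."""
    kwargs = {k: v for k, v in cfg.items() if k != "_target_"}
    return cfg["_target_"], kwargs


target, kwargs = split_target(trainer_cfg)
module_name, _, class_name = target.rpartition(".")
print(module_name, class_name, sorted(kwargs))
```

With Hydra installed, `hydra.utils.instantiate` performs the import and the call in one step; whether CryoViT calls it directly on this config is not stated here.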
- class BaseDataModule(_target_: str = '', _partial_: bool = True, sample: Any = '???', split_id: int | None = None, split_key: str | None = 'split_id', test_sample: Any | None = None, dataset: dict = '???', dataloader: dict = '???')
Bases: object
Base configuration for datasets in CryoViT experiments.
- split_id
Optional split_id to use for validation.
- Type:
Optional[int]
- split_key
Key in the sample .csv file to use for splitting the data. Default is “split_id”.
- Type:
Optional[str]
- test_sample
Specific sample or samples used for testing.
- Type:
Optional[Any]
- dataset
Configuration for the dataset.
- Type:
dict[str, Any]
- dataloader
Configuration for the dataloader.
- Type:
dict[str, Any]
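To make the roles of `split_key` and `split_id` concrete, here is a hedged sketch of holding out one split for validation. It assumes that rows whose `split_key` column equals `split_id` form the validation set and that all other rows are used for training; the CSV contents, the `tomo_name` column, and the `split_rows` helper are invented for illustration and may not match how CryoViT reads its files.

```python
import csv
import io

# Hypothetical contents of a sample .csv file.
CSV_TEXT = """tomo_name,split_id
tomo_a.hdf,0
tomo_b.hdf,1
tomo_c.hdf,0
tomo_d.hdf,2
"""


def split_rows(csv_text, split_key="split_id", split_id=None):
    """Partition rows into (train, val) by comparing row[split_key] to split_id."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if split_id is None:
        return rows, []  # assumption: nothing is held out when split_id is None
    val = [r for r in rows if int(r[split_key]) == split_id]
    train = [r for r in rows if int(r[split_key]) != split_id]
    return train, val


train, val = split_rows(CSV_TEXT, split_id=0)
print([r["tomo_name"] for r in train], [r["tomo_name"] for r in val])
```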
- class ExperimentPaths(model_dir: Path = '???', data_dir: Path = '???', exp_dir: Path = '???', results_dir: Path = '???', tomo_name: str = 'tomograms', feature_name: str = 'dino_features', dino_name: str = 'DINOv2', sam_name: str = 'SAM2', csv_name: str = 'csv', split_name: str = 'splits.csv')
Bases: object
Configuration for managing experiment paths in CryoViT experiments.
- model_dir
Path to the folder containing downloaded models.
- Type:
Path
- data_dir
Path to the parent directory containing tomogram data and .csv files.
- Type:
Path
- exp_dir
Path to the parent directory for saving results from an experiment.
- Type:
Path
- results_dir
Path to the parent directory for saving overall results.
- Type:
Path
- tomo_name
Name of the folder in data_dir with tomograms.
- Type:
str
- feature_name
Name of the folder in data_dir with DINOv2 features.
- Type:
str
- dino_name
Name of the folder in model_dir to save DINOv2 model.
- Type:
str
- csv_name
Name of the folder in data_dir with .csv files.
- Type:
str
- split_name
Name of the .csv file with training splits.
- Type:
str
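The `*_name` attributes are folder or file names relative to `data_dir` or `model_dir`, so full paths are obtained by joining them. The sketch below shows one way this could look with `pathlib`; the root directories are hypothetical, and the location of the splits file inside the .csv folder is an assumption rather than something the attribute descriptions state.

```python
from pathlib import Path

# Hypothetical root directories; the *_dir fields have no defaults.
model_dir = Path("/work/models")
data_dir = Path("/work/data")

# Default folder and file names from the signature above.
tomo_name = "tomograms"
feature_name = "dino_features"
dino_name = "DINOv2"
csv_name = "csv"
split_name = "splits.csv"

tomo_dir = data_dir / tomo_name        # folder with tomograms
feature_dir = data_dir / feature_name  # folder with DINOv2 features
dino_dir = model_dir / dino_name       # folder for the DINOv2 model
# Assumption: the splits file sits inside the .csv folder.
split_file = data_dir / csv_name / split_name

print(tomo_dir, feature_dir, dino_dir, split_file, sep="\n")
```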
- class DinoFeaturesConfig(batch_size: int = 128, dino_dir: Path = '???', paths: ExperimentPaths = '???', datamodule: dict = '???', sample: Sample | None = '???', export_features: bool = False)
Bases: object
Base configuration for computing DINOv2 features in CryoViT experiments.
- batch_size
Number of tomogram slices to process as one batch. Default is 128.
- Type:
int
- dino_dir
Path to the DINOv2 foundation model.
- Type:
Path
- paths
Configuration for experiment paths.
- Type:
ExperimentPaths
- datamodule
Configuration for the datamodule to use for loading tomograms.
- Type:
dict[str, Any]
- sample
Sample to calculate features for. None means to calculate features for all samples.
- Type:
Optional[Sample]
- export_features
Whether to additionally compute PCA colormaps for the calculated features.
- Type:
bool
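As an illustration of what `batch_size` controls, the following sketch splits a tomogram's slices into consecutive batches of at most 128. The helper `batch_ranges` and the slice count are hypothetical; how CryoViT actually iterates over slices is not specified here.

```python
def batch_ranges(num_slices: int, batch_size: int = 128):
    """Yield (start, stop) index pairs that cover num_slices in order."""
    for start in range(0, num_slices, batch_size):
        yield start, min(start + batch_size, num_slices)


# A hypothetical tomogram with 300 slices needs three batches,
# the last of which is smaller than batch_size.
ranges = list(batch_ranges(300))
print(ranges)
```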
- class BaseExperimentConfig(name: str = '???', label_key: str = '???', additional_keys: tuple[str] = (), random_seed: int = 42, paths: ExperimentPaths = '???', model: BaseModel = '???', trainer: BaseTrainer = '???', callbacks: dict[str, Any] = '???', logger: dict[str, Any] = '???', datamodule: BaseDataModule = '???', ckpt_path: Path | None = None, resume_ckpt: bool = False)
Bases: object
Base configuration for running CryoViT experiments.
- name
Name of the experiment, should be unique for each configuration.
- Type:
str
- label_key
Key used to specify the training labels.
- Type:
str
- additional_keys
Keys to pass through additional data from the dataset.
- Type:
tuple[str]
- random_seed
Random seed set for reproducibility. Default is 42.
- Type:
int
- paths
Configuration for experiment paths.
- Type:
ExperimentPaths
- trainer
Configuration for the trainer to use.
- Type:
BaseTrainer
- callbacks
Configurations for the callbacks passed to the trainer.
- Type:
dict[str, Any]
- logger
Configurations for the loggers passed to the trainer.
- Type:
dict[str, Any]
- datamodule
Configuration for the datamodule to use.
- Type:
BaseDataModule
- ckpt_path
Optional path to a checkpoint file to resume training from.
- Type:
Optional[Path]
- resume_ckpt
Whether to resume training from the checkpoint. Default is False.
- Type:
bool
- validate_dino_config(cfg: DinoFeaturesConfig) → None
Validates the configuration for DINOv2 feature extraction.
Checks if all necessary parameters are present in the configuration. If any required parameters are missing, it logs an error message and exits the script.
- Parameters:
cfg (DinoFeaturesConfig) – The configuration object containing settings for feature extraction.
- Raises:
SystemExit – If any configuration parameters are missing.
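The check-log-exit behavior described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the function's actual source: the configuration is modeled as a nested dict, unset values are marked with OmegaConf's '???' string, and `find_missing` and `validate_sketch` are hypothetical names.

```python
import logging
import sys

MISSING = "???"  # OmegaConf's marker for a mandatory value


def find_missing(cfg: dict, prefix: str = "") -> list:
    """Collect dotted keys whose value is still the mandatory-value marker."""
    missing = []
    for key, value in cfg.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            missing.extend(find_missing(value, prefix=f"{path}."))
        elif value == MISSING:
            missing.append(path)
    return missing


def validate_sketch(cfg: dict) -> None:
    """Log the missing keys and exit, mirroring the documented behavior."""
    missing = find_missing(cfg)
    if missing:
        logging.error("Missing configuration keys: %s", ", ".join(missing))
        sys.exit(1)


# Hypothetical, partially filled configuration.
cfg = {"batch_size": 128, "dino_dir": MISSING, "paths": {"data_dir": MISSING}}
try:
    validate_sketch(cfg)
    exit_code = None
except SystemExit as exc:
    exit_code = exc.code
print(exit_code)
```

Because `sys.exit` raises `SystemExit`, callers such as tests can catch it instead of letting the process terminate.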
- validate_experiment_config(cfg: BaseExperimentConfig) → None
Validates an experiment configuration.
Checks if all necessary parameters are present in the configuration. If any required parameters are missing, it logs an error message and exits the script.
Additionally, checks that all Samples specified are valid, and logs an error and exits if any samples are not valid.
- Parameters:
cfg (BaseExperimentConfig) – The configuration object to validate.
- Raises:
SystemExit – If any configuration parameters are missing, or any samples are not valid, terminating the script.
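The additional sample check could look roughly like the sketch below. It assumes samples are identified by the member names of an enumeration; `SampleSketch` is a hypothetical stand-in for the library's `Sample` type, and its members, along with `invalid_samples` and `check_samples`, are invented for illustration.

```python
import logging
import sys
from enum import Enum


class SampleSketch(Enum):
    """Hypothetical stand-in for the library's Sample type."""

    SAMPLE_A = "sample_a"
    SAMPLE_B = "sample_b"


def invalid_samples(requested) -> list:
    """Return requested names that are not members of SampleSketch."""
    names = [requested] if isinstance(requested, str) else list(requested)
    return [n for n in names if n not in SampleSketch.__members__]


def check_samples(requested) -> None:
    """Log invalid sample names and exit, as the documented validator does."""
    bad = invalid_samples(requested)
    if bad:
        logging.error("Invalid samples: %s", ", ".join(bad))
        sys.exit(1)


check_samples(["SAMPLE_A", "SAMPLE_B"])  # valid names pass silently
try:
    check_samples(["SAMPLE_A", "unknown"])
    exit_code = None
except SystemExit as exc:
    exit_code = exc.code
print(exit_code)
```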