cryovit.utils
Utility functions to process data and models in a format recognizable by CryoVIT.
Functions
| id_generator | Generates a random string of fixed size. |
| load_data | Load data or labels from a given file path. |
| load_files_from_path | Load files from a given directory or a .txt file listing file paths. |
| load_labels | Load labels from a given file path, given a list of label names in ascending-value order. |
| load_model | Load a model from a given path. |
| read_hdf | Read data from an HDF5 file. |
| read_mrc | Read data from an MRC file. |
| read_tiff | Read data from a TIFF file. |
| save_model | Save a model to a given path. |
| save_model_from_weights | Save a model to a given path from a weights file. |
Classes
| FileMetadata | Metadata information for a file. |
| SavedModel | A class to represent a pre-trained model. |
- id_generator(size: int = 6, chars='abcdefghijklmnopqrstuvwxyz0123456789')
Generates a random string of fixed size.
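The following is a minimal sketch of how a function with this signature could behave. It is an illustrative stand-in and not the library's source: the name id_generator_sketch is hypothetical, and the use of random.choices (independent, uniform draws with replacement) is an assumption.

```python
import random
import string

# Same alphabet as the documented default: lowercase letters followed by digits.
DEFAULT_CHARS = string.ascii_lowercase + string.digits


def id_generator_sketch(size: int = 6, chars: str = DEFAULT_CHARS) -> str:
    """Return `size` characters drawn uniformly, with replacement, from `chars`.

    Hypothetical stand-in for illustration; the real function may sample differently.
    """
    return "".join(random.choices(chars, k=size))


# Example: a short random suffix, e.g. for naming a run or a temporary file.
run_id = id_generator_sketch()
```

Because the draws are random, two calls can in principle return the same string; a caller that needs uniqueness would have to check for collisions.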
- class FileMetadata(drange: tuple[float, float], dshape: tuple[int, ...], dtype: dtype, nunique: int = 0)
Bases: object
Metadata information for a file.
- drange
The dynamic range of the data.
- Type:
tuple[float, float]
- dshape
The shape of the data.
- Type:
tuple[int, …]
- dtype
The data type of the data.
- Type:
numpy.dtype
- nunique
The number of unique values in the data.
- Type:
int
- read_hdf(hdf_file: str | Path, key: str | None = None) -> tuple[str, ndarray, FileMetadata]
Read data from an HDF5 file. If no key is specified, the dataset with the most unique values is assumed to be the data.
- Parameters:
hdf_file – The path to the HDF5 file.
key – The key to read from the HDF5 file. If None, or if the key is not a valid key in the file, all keys are read and the one whose data has the most unique values is used.
- Returns:
A tuple of the key used, the data, and the metadata.
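The key-selection rule described above can be sketched without an actual HDF5 file. In this sketch a plain dictionary of sequences stands in for the file's datasets; the function name pick_key and the sample data are hypothetical, and how the real function counts unique values or breaks ties is not specified here.

```python
from typing import Mapping, Optional, Sequence


def pick_key(datasets: Mapping[str, Sequence[float]], key: Optional[str] = None) -> str:
    """Choose which dataset to treat as the data.

    A valid `key` is used as given. If `key` is None or not present,
    fall back to the dataset with the most unique values.
    """
    if key is not None and key in datasets:
        return key
    # Fallback: count distinct values per dataset and keep the largest count.
    return max(datasets, key=lambda name: len(set(datasets[name])))


# Stand-in for an HDF5 file holding a tomogram and a label volume.
fake_file = {
    "data": [0.1, 0.5, 0.9, 0.3],  # four distinct intensities
    "labels": [0, 1, 1, 0],        # two distinct label values
}

chosen_default = pick_key(fake_file)             # no key given
chosen_explicit = pick_key(fake_file, "labels")  # valid key is respected
chosen_invalid = pick_key(fake_file, "missing")  # invalid key falls back
```

The heuristic works because intensity data typically takes many more distinct values than an integer label volume stored in the same file.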
- read_mrc(mrc_file: str | Path) -> tuple[ndarray, FileMetadata]
Read data from an MRC file.
- Parameters:
mrc_file – The path to the MRC file.
- Returns:
A tuple of the data and the metadata.
- read_tiff(tiff_file: str | Path) -> tuple[ndarray, FileMetadata]
Read data from a TIFF file.
- Parameters:
tiff_file – The path to the TIFF file.
- Returns:
A tuple of the data and the metadata.
- load_data(file_path: str | Path, key: str | None = None) -> tuple[ndarray, str]
Load data or labels from a given file path. Supports .h5, .hdf5, .mrc, .mrcs formats.
- Parameters:
file_path – The path to the data file.
key – An optional key to specify which dataset to load from an HDF5 file.
- Raises:
ValueError – If the file format is unsupported.
FileNotFoundError – If the specified file does not exist.
- Returns:
A tuple of the data and the key used (empty string if not applicable).
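The documented error behavior suggests a dispatch on the file extension. The sketch below illustrates that pattern only. The helper pick_reader, the SUPPORTED table, and the order of the two checks are assumptions, and the reader names are returned as strings rather than calling any real reader.

```python
import tempfile
from pathlib import Path

# Extensions listed in the docstring, mapped to a reader family (hypothetical table).
SUPPORTED = {".h5": "hdf", ".hdf5": "hdf", ".mrc": "mrc", ".mrcs": "mrc"}


def pick_reader(file_path) -> str:
    """Return the reader family for `file_path`, raising the documented errors."""
    path = Path(file_path)
    if not path.exists():
        raise FileNotFoundError(f"{path} does not exist")
    suffix = path.suffix.lower()
    if suffix not in SUPPORTED:
        raise ValueError(f"Unsupported file format: {suffix!r}")
    return SUPPORTED[suffix]


# Demonstration on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "tomogram.mrc"
    sample.touch()
    reader_name = pick_reader(sample)

    unsupported = Path(tmp) / "notes.csv"
    unsupported.touch()
    try:
        pick_reader(unsupported)
        unsupported_error = None
    except ValueError as err:
        unsupported_error = str(err)
```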
- load_labels(file_path: str | Path, label_keys: list[str], key: str | None) -> dict[str, ndarray]
Load labels from a given file path, given a list of label names in ascending-value order. Supports .h5, .hdf5, .mrc, .mrcs, .tiff, and .tif formats.
- Parameters:
file_path – The path to the label file.
label_keys – A list of label names in ascending-value order (e.g., ['mito', 'cristae'] for 0=background, 1=mito, 2=cristae).
key – An optional key to specify which dataset to load from an HDF5 file.
- Raises:
ValueError – If the number of unique values in the label data does not match the number of provided label keys.
ValueError – If the file format is unsupported.
ValueError – If the specified key is not found in the label data.
FileNotFoundError – If the specified file does not exist.
- Returns:
A dictionary of label name to int8 label array.
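The ascending-value convention for label_keys can be illustrated with plain lists. This sketch assumes that 0 is background, that the i-th name maps to value i + 1 (as in the example above), and that each returned array is a binary mask for its label. The last point is an assumption about the output layout, not something this page states, and split_labels is a hypothetical helper.

```python
def split_labels(flat_labels, label_keys):
    """Map each label name to a 0/1 mask over `flat_labels`.

    Assumes value 0 is background and label_keys[i] corresponds to value i + 1.
    """
    name_to_value = {name: value for value, name in enumerate(label_keys, start=1)}
    return {
        name: [1 if v == value else 0 for v in flat_labels]
        for name, value in name_to_value.items()
    }


# A tiny flattened label volume: 0 = background, 1 = mito, 2 = cristae.
flat = [0, 1, 1, 2, 0]
masks = split_labels(flat, ["mito", "cristae"])
```

Passing the names in the wrong order would silently swap the masks, which is why the ordering requirement matters.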
- load_files_from_path(path: Path) -> list[Path]
Load files from a given directory or a .txt file listing file paths.
- Parameters:
path (Path) – The path to the directory or .txt file.
- Raises:
ValueError – If the path is not a directory or a .txt file.
- Returns:
A list of file paths.
- Return type:
list[Path]
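A minimal sketch of the two accepted inputs follows. The helper list_files is hypothetical; sorting the directory listing, skipping blank lines in the .txt file, and not filtering by file extension are all assumptions rather than documented behavior.

```python
import tempfile
from pathlib import Path


def list_files(path: Path) -> list[Path]:
    """Return file paths from a directory or from a .txt file with one path per line."""
    if path.is_dir():
        # Assumption: every regular file in the directory is returned, sorted by name.
        return sorted(p for p in path.iterdir() if p.is_file())
    if path.is_file() and path.suffix == ".txt":
        lines = path.read_text().splitlines()
        return [Path(line.strip()) for line in lines if line.strip()]
    raise ValueError(f"{path} is neither a directory nor a .txt file")


# Demonstration: the same two files reached through a directory and a listing.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    data_dir.mkdir()
    names = ("a.mrc", "b.mrc")
    for name in names:
        (data_dir / name).touch()

    listing = Path(tmp) / "files.txt"
    listing.write_text("\n".join(str(data_dir / name) for name in names))

    from_dir = list_files(data_dir)
    from_txt = list_files(listing)
```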
- class SavedModel(name: str, model_type: ModelType, label_key: str, model_cfg: BaseModel, weights: dict[str, Any])
Bases: object
A class to represent a pre-trained model.
- name
The name of the model.
- Type:
str
- model_type
The type of the model, e.g., ‘CryoVIT’, ‘3D U-Net’.
- Type:
ModelType
- label_key
The label key used for training the model.
- Type:
str
- model_cfg
The config dictionary to instantiate the model.
- Type:
BaseModel
- weights
The saved weights of the model.
- Type:
dict[str, Any]
- save_model(model_name: str, label_key: str, model: Module, model_cfg: BaseModel, save_path: str | Path) -> None
Save a model to a given path.
- Parameters:
model_name – The name of the model.
label_key – The label key used for training the model.
model – The model to save.
model_cfg – The config dictionary to instantiate the model.
save_path – The path to save the model to.
- save_model_from_weights(model_name: str, label_key: str, model_type: ModelType, weights_path: str | Path, save_path: str | Path, **kwargs) -> None
Save a model to a given path from a weights file.
- Parameters:
model_name – The name of the model.
label_key – The label key used for training the model.
model_type – The type of the model.
weights_path – The path to the weights file.
save_path – The path to save the model to.
**kwargs – Additional keyword arguments to pass to the model config. To access nested config parameters, use double underscores (e.g., a.b -> a__b).
- Raises:
FileNotFoundError – If the weights file does not exist.
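The double-underscore convention for **kwargs can be sketched as a small key-expansion step. The helper expand_overrides is hypothetical and only shows how a__b could be interpreted as the nested path a.b; how the library actually applies such overrides to the model config is not shown on this page.

```python
from typing import Any


def expand_overrides(**kwargs: Any) -> dict[str, Any]:
    """Turn flat keys such as a__b into nested dictionaries such as {"a": {"b": ...}}."""
    nested: dict[str, Any] = {}
    for flat_key, value in kwargs.items():
        *parents, leaf = flat_key.split("__")
        node = nested
        for part in parents:
            # Create intermediate levels on demand and descend into them.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Placeholder names and values: a.b and a.c are nested, d is top level.
overrides = expand_overrides(a__b=1, a__c=2, d=3)
```

Double underscores are used because a dot is not valid inside a Python keyword argument name.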
- load_model(model_path: str | Path, load_model: bool = True) -> tuple[Module | None, ModelType, str, str]
Load a model from a given path.
- Parameters:
model_path – The path to the model file.
load_model – Whether to load the model weights. If False, only returns the model type, name, and label key.
- Raises:
FileNotFoundError – If the specified file does not exist.
- Returns:
A tuple of the model (or None if load_model is False), the model type, the model name, and the label key.