AtomAI Models

Segmentor

class atomai.models.Segmentor(model='Unet', nb_classes=1, **kwargs)[source]

Bases: SegTrainer

Model for semantic segmentation-based analysis of images

Parameters:
  • model (Type[Union[str, Module]]) – Type of model to train: ‘Unet’, ‘ResHedNet’ or ‘dilnet’ (Default: ‘Unet’). See atomai.nets for more details. One can also pass here a custom fully convolutional neural network model.

  • nb_classes (int) – Number of classes in classification scheme (last NN layer)

  • **batch_norm (bool) – Apply batch normalization after each convolutional layer (Default: True)

  • **dropout (bool) – Apply dropouts to the three inner blocks in the middle of a network (Default: False)

  • **upsampling_mode (str) – “bilinear” or “nearest” upsampling method (Default: “bilinear”)

  • **nb_filters (int) – Number of convolutional filters (aka “kernels”) in the first convolutional block (this number doubles in the consecutive block(s); see the definitions of the Unet and dilnet models for details)

  • **with_dilation (bool) – Use dilated convolutions in the bottleneck of Unet (Default: False)

  • **layers (list) – List with a number of layers in each block. For UNet the first 4 elements in the list are used to determine the number of layers in each block of the encoder (including bottleneck layer), and the number of layers in the decoder is chosen accordingly (to maintain symmetry between encoder and decoder)

Example:

>>> # Initialize model
>>> model = aoi.models.Segmentor(nb_classes=3)
>>> # Train
>>> model.fit(images, masks, images_test, masks_test,
>>>        training_cycles=300, compute_accuracy=True, swa=True)
>>> # Predict with trained model
>>> nn_output, coordinates = model.predict(expdata)

fit(X_train, y_train, X_test=None, y_test=None, loss='ce', optimizer=None, training_cycles=1000, batch_size=32, compute_accuracy=False, full_epoch=False, swa=False, perturb_weights=False, **kwargs)[source]

Compiles a trainer and performs model training

Parameters:
  • X_train (Union[ndarray, Tensor]) – 4D numpy array or pytorch tensor of training images (n_samples, 1, height, width). One can also pass a regular 3D image stack without a channel dimension of 1 which will be added automatically

  • y_train (Union[ndarray, Tensor]) – 4D (binary) / 3D (multiclass) numpy array or pytorch tensor of training masks (aka ground truth) stacked along the first dimension. In the multiclass case, X_train is 4-dimensional while y_train is 3-dimensional because of how the cross-entropy loss is calculated in PyTorch (see https://pytorch.org/docs/stable/nn.html#nllloss).

  • X_test (Union[ndarray, Tensor, None]) – 4D numpy array or pytorch tensor of test images (stacked along the first dimension)

  • y_test (Union[ndarray, Tensor, None]) – 4D (binary) / 3D (multiclass) numpy array or pytorch tensor of test masks (aka ground truth) stacked along the first dimension.

  • loss (str) – loss function. Available loss functions are: ‘mse’ (MSE), ‘ce’ (cross-entropy), ‘focal’ (focal loss; single class only), and ‘dice’ (dice loss)

  • optimizer (Optional[Type[Optimizer]]) – weights optimizer (defaults to Adam optimizer with lr=1e-3)

  • training_cycles (int) – Number of training ‘epochs’. If full_epoch argument is set to False, 1 epoch == 1 mini-batch. Otherwise, each cycle corresponds to all mini-batches of data passing through a NN.

  • batch_size (int) – Size of training and test mini-batches

  • compute_accuracy (bool) – Computes accuracy function at each training cycle

  • full_epoch (bool) – If True, passes all mini-batches of training/test data at each training cycle and computes the average loss. If False, passes a single (randomly chosen) mini-batch at each cycle.

  • swa (bool) – Saves the recent stochastic weights and averages them at the end of training

  • perturb_weights (bool) – Time-dependent weight perturbation, \(w\leftarrow w + a / (1 + e)^\gamma\), where parameters a and gamma can be passed as a dictionary, together with parameter e_p determining every n-th epoch at which a perturbation is applied

  • **lr_scheduler (list of floats) – List with learning rates for each training iteration/epoch. If the length of the list is smaller than the number of training iterations, the last value in the list is used for the remaining iterations.

  • **print_loss (int) – Prints loss every n-th epoch

  • **accuracy_metrics (str) – Accuracy metrics (used only for printing training statistics)

  • **filename (str) – Filename for model weights (appended with “_test_weights_best.pt” and “_weights_final.pt”)

  • **plot_training_history (bool) – Plots training and test curves vs. training cycles at the end of training

  • **kwargs – One can also pass kwargs for utils.datatransform class to perform the augmentation “on-the-fly” (e.g. rotation=True, gauss_noise=[20, 60], etc.)
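
Example (a minimal sketch of training with on-the-fly augmentation; images, masks, images_test, and masks_test are assumed to be arrays prepared as described above):

>>> model = aoi.models.Segmentor(nb_classes=3)
>>> # Augmentation kwargs below are forwarded to utils.datatransform
>>> model.fit(images, masks, images_test, masks_test,
>>>           training_cycles=500, rotation=True,
>>>           gauss_noise=[20, 60], lr_scheduler=[1e-3, 1e-4])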

predict(imgdata, refine=False, logits=True, resize=None, compute_coords=True, **kwargs)[source]

Apply (trained) model to new data

Parameters:
  • imgdata – 3D image stack or a single 2D image (all greyscale)

  • refine (bool) – Atomic positions refinement with 2d Gaussian peak fitting (may take some time)

  • logits (bool) – Indicates whether the features are passed through a softmax/sigmoid layer at the end of a neural network (logits=True for AtomAI models)

  • resize (Optional[Tuple[int, int]]) – Resizes input data to (new_height, new_width) before passing to a neural network

  • compute_coords (bool) – Computes centers of the mass of individual blobs in the segmented images (Default: True)

  • **thresh (float) – Value between 0 and 1 for thresholding the NN output (Default: 0.5)

  • **d (int) – half-side of a square around each atomic position used for refinement with 2d Gaussian peak fitting. Defaults to 1/4 of average nearest neighbor atomic distance

  • **num_batches (int) – number of batches (Default: 10)

  • **norm (bool) – Normalize data to (0, 1) during pre-processing

  • **verbose (bool) – verbosity

Return type:

Tuple[ndarray, Dict[int, ndarray]]

Returns:

Semantically segmented image and coordinates of (atomic) objects
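
Example (a minimal sketch; expdata is a hypothetical 2D experimental image and model is a trained Segmentor):

>>> nn_output, coordinates = model.predict(expdata, refine=True, thresh=0.6)
>>> coords_first = coordinates[0]  # coordinates are keyed by image/frame index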

load_weights(filepath)[source]

Loads saved weights dictionary

Return type:

None

ImSpec

class atomai.models.ImSpec(in_dim, out_dim, latent_dim=2, **kwargs)[source]

Bases: ImSpecTrainer

Model for predicting spectra from images and vice versa

Parameters:
  • in_dim (Tuple[int]) – Input data dimensions. (height, width) for images or (length,) for spectra

  • out_dim (Tuple[int]) – Output dimensions. (length,) for spectra or (height, width) for images

  • latent_dim (int) – Dimensionality of the latent space (number of neurons in a fully connected “bottleneck” layer)

  • **seed (int) – Seed used when initializing model weights (Default: 1)

  • **nblayers_encoder (int) – Number of convolutional layers in the encoder

  • **nblayers_decoder (int) – Number of convolutional layers in the decoder

  • **nbfilters_encoder (int) – Number of convolutional filters in each layer of the encoder

  • **nbfilters_decoder (int) – Number of convolutional filters in each layer of the decoder

  • **batch_norm (bool) – Apply batch normalization after each convolutional layer (Default: True)

  • **encoder_downsampling (int) – Downsamples input data by this factor before passing to convolutional layers (Default: no downsampling)

  • **decoder_upsampling (bool) – Performs upsampling+convolution operation twice on the reshaped latent vector (starting from image/spectra dims 4x smaller than the target dims) before passing to the decoder

Example:

>>> in_dim = (16, 16)  # Input dimensions (images)
>>> out_dim = (64,)  # Output dimensions (spectra)
>>> # Initialize and train model
>>> model = aoi.models.ImSpec(in_dim, out_dim, latent_dim=10)
>>> model.fit(imgs_train, spectra_train, imgs_test, spectra_test,
>>>           full_epoch=True, training_cycles=120, swa=True)
>>> # Make a prediction with the trained model
>>> prediction = model.predict(imgs_test, norm=False)

fit(X_train, y_train, X_test=None, y_test=None, loss='mse', optimizer=None, training_cycles=1000, batch_size=64, compute_accuracy=False, full_epoch=False, swa=False, perturb_weights=False, **kwargs)[source]

Compiles a trainer and performs model training

Parameters:
  • X_train (Union[ndarray, Tensor]) – 4D numpy array or torch tensor with image data (n_samples x 1 x height x width) or 3D array/tensor with spectral data (n_samples x 1 x signal_length). It is also possible to pass 3D and 2D arrays without the channel dimension of 1, which will be added automatically. The X_train is typically referred to as ‘features’

  • y_train (Union[ndarray, Tensor]) – 3D numpy array or torch tensor with spectral data (n_samples x 1 x signal_length) or 4D array/tensor with image data (n_samples x 1 x height x width). It is also possible to pass 2D and 3D arrays without the channel dimension of 1, which will be added automatically. Note that if your X_train data are images, then your y_train must be spectra and vice versa (see the shape sketch after this parameter list). The y_train is typically referred to as “targets”

  • X_test (Union[ndarray, Tensor, None]) – Test data (features) of the same dimensionality (except for the number of samples) as X_train

  • y_test (Union[ndarray, Tensor, None]) – Test data (targets) of the same dimensionality (except for the number of samples) as y_train

  • loss (str) – Loss function (currently works only with ‘mse’)

  • optimizer (Optional[Type[Optimizer]]) – weights optimizer (defaults to Adam optimizer with lr=1e-3)

  • training_cycles (int) – Number of training ‘epochs’. If full_epoch argument is set to False, 1 epoch == 1 mini-batch. Otherwise, each cycle corresponds to all mini-batches of data passing through a NN.

  • batch_size (int) – Size of training and test mini-batches

  • full_epoch (bool) – If True, passes all mini-batches of training/test data at each training cycle and computes the average loss. If False, passes a single (randomly chosen) mini-batch at each cycle.

  • swa (bool) – Saves the recent stochastic weights and averages them at the end of training

  • perturb_weights (bool) – Time-dependent weight perturbation, \(w\leftarrow w + a / (1 + e)^\gamma\), where parameters a and gamma can be passed as a dictionary, together with parameter e_p determining every n-th epoch at which a perturbation is applied

  • **print_loss (int) – Prints loss every n-th epoch

  • **filename (str) – Filename for model weights (appended with “_test_weights_best.pt” and “_weights_final.pt”)

  • **plot_training_history (bool) – Plots training and test curves vs. training cycles at the end of training

  • **kwargs – One can also pass kwargs for utils.datatransform class to perform the augmentation “on-the-fly” (e.g. gauss_noise=[20, 60], etc.)
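
Example (a shape sketch assuming 500 hypothetical image-spectrum training pairs; the channel dimension of 1 is added automatically):

>>> # imgs_train: (500, 16, 16); spectra_train: (500, 64)
>>> model = aoi.models.ImSpec((16, 16), (64,), latent_dim=10)
>>> model.fit(imgs_train, spectra_train, imgs_test, spectra_test,
>>>           training_cycles=200, batch_size=64)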

predict(data, **kwargs)[source]

Apply (trained) model to new data

Parameters:
  • data (ndarray) – Input image/spectrum or batch of images/spectra

  • **num_batches (int) – number of batches (Default: 10)

  • **norm (bool) – Normalize data to (0, 1) during pre-processing

  • **verbose (bool) – verbosity (Default: True)

Return type:

ndarray

load_weights(filepath)[source]

Loads saved weights dictionary

Return type:

None

Variational Autoencoder (VAE)

class atomai.models.VAE(in_dim=None, latent_dim=2, nb_classes=0, seed=0, **kwargs)[source]

Bases: BaseVAE

Implements a standard Variational Autoencoder (VAE)

Parameters:
  • in_dim (Optional[int]) – Input dimensions for image data passed as (height, width) for grayscale data or (height, width, channels) for multichannel data

  • latent_dim (int) – Number of VAE latent dimensions

  • nb_classes (int) – Number of classes for class-conditional VAE

  • seed (int) – seed for torch and numpy (pseudo-)random numbers generators

  • **conv_encoder (bool) – use convolutional layers in encoder

  • **conv_decoder (bool) – use convolutional layers in decoder

  • **numlayers_encoder (int) – number of layers in encoder (Default: 2)

  • **numlayers_decoder (int) – number of layers in decoder (Default: 2)

  • **numhidden_encoder (int) – number of hidden units OR conv filters in encoder (Default: 128)

  • **numhidden_decoder (int) – number of hidden units OR conv filters in decoder (Default: 128)

Example:

>>> input_dim = (28, 28) # Input data dimensions (without n_samples)
>>> # Initialize model
>>> vae = aoi.models.VAE(input_dim)
>>> # Train
>>> vae.fit(imstack_train, training_cycles=100, batch_size=100)
>>> # Visualize learned manifold (for 2 latent dimensions)
>>> vae.manifold2d(origin="upper", cmap="gnuplot2")

One can also pass labels to train a class-conditioned VAE

>>> # Initialize model
>>> vae = aoi.models.VAE(input_dim, nb_classes=10)
>>> # Train
>>> vae.fit(imstack_train, labels_train, training_cycles=100, batch_size=100)
>>> # Visualize learned manifold for class 1
>>> vae.manifold2d(label=1, origin="upper", cmap="gnuplot2")

elbo_fn(x, x_reconstr, *args, **kwargs)[source]

Calculates ELBO

Return type:

Tensor

forward_compute_elbo(x, y=None, mode='train')[source]

VAE’s forward pass with training/test loss computation

Return type:

Tensor

compile_trainer(train_data, test_data=None, optimizer=None, elbo_fn=None, training_cycles=100, batch_size=32, **kwargs)

Compiles model’s trainer

Parameters:
  • train_data (Tuple[Union[Tensor, ndarray]]) – Train data and (optionally) corresponding targets or labels

  • test_data (Optional[Tuple[Union[Tensor, ndarray]]]) – Test data and (optionally) corresponding targets or labels

  • optimizer (Optional[Type[Optimizer]]) – Weights optimizer. Defaults to Adam with learning rate 1e-4

  • elbo_fn (Optional[Callable]) – function that calculates elbo loss

  • training_cycles (int) – Number of training iterations (aka “epochs”)

  • batch_size (int) – Size of mini-batch for training

  • **kwargs (Union[str, float]) – Additional keyword arguments are ‘filename’ (for saving model) and ‘memory_alloc’ (threshold for keeping data on GPU)

Return type:

None

decode(z_sample, y=None)

Takes a point in latent space and maps it to data space via the learned generative model

Parameters:
  • z_sample (Union[ndarray, Tensor]) – coordinates in latent space

  • y (Union[int, ndarray, Tensor, None]) – label (optional)

Return type:

ndarray

Returns:

Generated (“decoded”) image(s)
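
Example (a minimal sketch; vae is assumed to be a trained model with a 2D latent space):

>>> import numpy as np
>>> z = np.array([[0.0, 1.5]])  # a single point in the 2D latent space
>>> decoded_img = vae.decode(z)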

encode(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Mean and SD of the encoded continuous distribution, and alphas (“class probabilities”) for the encoded discrete distribution(s) (if any). For rVAE, the output is (z_mean, z_sd). For jVAE and jrVAE, the output is (z_mean, z_sd, alphas). In all cases, z_mean contains the encoded angle as its 1st dimension, the encoded x- and y-shifts as its 2nd and 3rd dimensions (if translation is set to True), and the standard VAE latent variables as its 4th, 5th, …, n-th dimensions (2nd, 3rd, …, n-th dimensions if translation is False)
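
Example (a minimal sketch; imstack_test is a hypothetical stack of test images and vae is a trained standard VAE, so no angle/shift dimensions are present):

>>> z_mean, z_sd = vae.encode(imstack_test)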

encode_(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Concatenated array of encoded vectors

encode_image_(img, **kwargs)

Crops and encodes a subimage around each pixel in the input image. The size of the subimage is determined by the size of the images in the VAE training data.

Parameters:
  • img (ndarray) – 2D numpy array

  • **num_batches (int) – number of batches for encoding subimages

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image and encoded array (cropping is due to finite window size)

encode_images(imgdata, **kwargs)

Encodes every pixel of every image in image stack

Parameters:
  • imgdata (ndarray) – 3D numpy array of images. Can also be a single 2D image

  • **num_batches (int) – number of batches for encoding pixels of a single image

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image stack and encoded array (cropping is due to finite window size)

encode_trajectories(imgdata, coord_class_dict, window_size, min_length, rmax, **kwargs)

Calculates trajectories and latent variable value for each point in a trajectory.

Parameters:
  • imgdata (ndarray) – NN output (preferable) or raw data

  • coord_class_dict (Dict[int, ndarray]) – atomic/defect/particle coordinates

  • window_size (int) – size of subimages to crop

  • min_length (int) – minimum length of trajectory to be included

  • rmax (int) – maximum allowed distance (projected on the xy plane) between a defect in one frame and the position of its nearest neighbor in the next one

  • **num_batches (int) – number of batches for self.encode (Default: 10)

Return type:

Tuple[List[ndarray], List[ndarray]]

Returns:

List of encoded trajectories and corresponding movie frame numbers

evaluate_model()

Evaluates model on test data

fit(X_train, y_train=None, X_test=None, y_test=None, loss='mse', **kwargs)[source]

Trains VAE model

Parameters:
  • X_train (Union[ndarray, Tensor]) – For images, 3D or 4D stack of training images with dimensions (n_images, height, width) for grayscale data or (n_images, height, width, channels) for multi-channel data. For spectra, 2D stack of spectra with dimensions (n_spectra, length)

  • y_train (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of training images/spectra

  • X_test (Union[ndarray, Tensor, None]) – 3D or 4D stack of test images or 2D stack of spectra with the same dimensions as for the X_train (Default: None)

  • y_test (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of test images/spectra

  • loss (str) – reconstruction loss function, “ce” or “mse” (Default: “mse”)

  • **capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the latent channel. Based on https://arxiv.org/pdf/1804.03599.pdf

  • **filename (str) – file path for saving model after each training cycle (“epoch”)

Return type:

None
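
Example (a minimal sketch; the capacity values are illustrative only, following the reference above):

>>> vae = aoi.models.VAE((28, 28), latent_dim=2)
>>> vae.fit(imstack_train, X_test=imstack_test, training_cycles=100,
>>>         batch_size=100, capacity=[5.0, 25000, 30])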

kld_normal(z, q_param, p_param=None)

Calculates KL divergence term between two normal distributions or (if p_param = None) between normal and standard normal distributions

Parameters:
  • z (Tensor) – latent vector (reparametrized)

  • q_param (Tuple[Tensor]) – tuple with mean and SD of the 1st distribution

  • p_param (Optional[Tuple[Tensor]]) – tuple with mean and SD of the 2nd distribution (optional)

Return type:

Tensor

load_weights(filepath)

Loads saved weights

Return type:

None

classmethod log_normal(x, mu, log_sd)

Computes log-pdf for a normal distribution

Return type:

Tensor

classmethod log_unit_normal(x)

Computes log-pdf of a unit normal distribution

Return type:

Tensor

manifold2d(**kwargs)

Performs mapping from latent space to data space, allowing the learned manifold to be visualized. This works only for a 2D latent space (not counting angle & translation dimensions)

Parameters:
  • **d (int) – grid size

  • **l1 (list) – range of 1st latent variable

  • **l2 (list) – range of 2nd latent variable

  • **label (int) – label in class-conditioned (r)VAE

  • **disc_idx (int) – discrete “class”

  • **cmap (str) – color map (Default: gnuplot)

  • **draw_grid (bool) – plot semi-transparent grid

  • **origin (str) – plot origin (e.g. ‘lower’)

Return type:

None

manifold_traversal(cont_idx, d=10, cont_idx_fixed=0, plot=True, **kwargs)

Latent space traversals for joint continuous and discrete latent representations

Return type:

ndarray

print_statistics(e)

Prints training and (optionally) test loss after each training cycle

reconstruct(x_new, **kwargs)

Forward prediction with uncertainty quantification by sampling from the encoded mean and std. Works only for regular VAE (and not for rVAE)

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **label (int) – class to be reconstructed (for cVAE, crVAE, jVAE, and jrVAE)

  • **num_samples (int) – number of samples to generate from normal distribution

Return type:

ndarray

Returns:

Ensemble of “decoded” images
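
Example (a minimal sketch; x_new is a hypothetical input image, and averaging over the returned ensemble, assumed to be stacked along the first axis, is one way to summarize it):

>>> decoded_ensemble = vae.reconstruct(x_new, num_samples=32)
>>> decoded_mean = decoded_ensemble.mean(axis=0)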

classmethod reparameterize(z_mean, z_sd)

Reparameterization trick for continuous distributions

Return type:

Tensor

classmethod reparameterize_discrete(alpha, tau)

Reparameterization trick for discrete gumbel-softmax distributions

save_model(*args)

Saves trained weights and the key model parameters

Return type:

None

save_weights(*args)

Saves trained weights

Return type:

None

set_data(X_train, y_train=None, X_test=None, y_test=None, memory_alloc=4)

Initializes train and (optionally) test data loaders

Return type:

None

set_decoder(decoder_net)

Sets a decoder network only

Return type:

None

set_encoder(encoder_net)

Sets an encoder network only

Return type:

None

set_model(encoder_net, decoder_net)

Sets encoder and decoder models

Return type:

None

train_epoch()

Trains a single epoch

classmethod visualize_manifold_learning(frames_dir, **kwargs)

Creates and stores a video showing evolution of learned 2D manifold during rVAE’s training

Parameters:
  • frames_dir (str) – directory with snapshots of manifold as .png files (the files should be named as “1.png”, “2.png”, etc.)

  • **moviename (str) – name of the movie

  • **frame_duration (int) – duration of each movie frame

Return type:

None

update_metadict()[source]

Rotational Variational Autoencoder (rVAE)

class atomai.models.rVAE(in_dim=None, latent_dim=2, nb_classes=0, translation=True, seed=0, **kwargs)[source]

Bases: BaseVAE

Implements a rotationally and translationally invariant Variational Autoencoder (VAE) based on the idea of a “spatial decoder” by Bepler et al. in arXiv:1909.11663. In addition, this class allows implementing a class-conditioned VAE and skip-VAE (arXiv:1807.04863) with rotational and translational invariance.

Parameters:
  • in_dim (Optional[int]) – Input dimensions for image data passed as (height, width) for grayscale data or (height, width, channels) for multichannel data

  • latent_dim (int) – Number of VAE latent dimensions associated with image content

  • nb_classes (int) – Number of classes for class-conditional rVAE

  • translation (bool) – account for xy shifts of image content (Default: True)

  • seed (int) – seed for torch and numpy (pseudo-)random numbers generators

  • **conv_encoder (bool) – use convolutional layers in encoder

  • **numlayers_encoder (int) – number of layers in encoder (Default: 2)

  • **numlayers_decoder (int) – number of layers in decoder (Default: 2)

  • **numhidden_encoder (int) – number of hidden units OR conv filters in encoder (Default: 128)

  • **numhidden_decoder (int) – number of hidden units in decoder (Default: 128)

  • **skip (bool) – uses generative skip model with residual paths between latents and decoder layers (Default: False)

Example:

>>> input_dim = (28, 28)  # input dimensions
>>> # Intitialize model
>>> rvae = aoi.models.rVAE(input_dim)
>>> # Train
>>> rvae.fit(imstack_train, training_cycles=100,
>>>          batch_size=100, rotation_prior=np.pi/2)
>>> rvae.manifold2d(origin="upper", cmap="gnuplot2")

One can also pass labels to train a class-conditioned rVAE

>>> # Initialize model
>>> rvae = aoi.models.rVAE(input_dim, nb_classes=10)
>>> # Train
>>> rvae.fit(imstack_train, labels_train, training_cycles=100,
>>>          batch_size=100, rotation_prior=np.pi/2)
>>> # Visualize learned manifold for class 1
>>> rvae.manifold2d(label=1, origin="upper", cmap="gnuplot2")

elbo_fn(x, x_reconstr, *args, **kwargs)[source]

Computes ELBO

Return type:

Tensor

forward_compute_elbo(x, y=None, mode='train')[source]

rVAE’s forward pass with training/test loss computation

Return type:

Tensor

fit(X_train, y_train=None, X_test=None, y_test=None, loss='mse', **kwargs)[source]

Trains rVAE model

Parameters:
  • X_train (Union[ndarray, Tensor]) – 3D or 4D stack of training images with dimensions (n_images, height, width) for grayscale data or (n_images, height, width, channels) for multi-channel data

  • y_train (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of training images

  • X_test (Union[ndarray, Tensor, None]) – 3D or 4D stack of test images with the same dimensions as for the X_train (Default: None)

  • y_test (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of test images

  • loss (str) – reconstruction loss function, “ce” or “mse” (Default: “mse”)

  • **translation_prior (float) – translation prior

  • **rotation_prior (float) – rotational prior

  • **capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the latent channel. Based on https://arxiv.org/pdf/1804.03599.pdf

  • **filename (str) – file path for saving model after each training cycle (“epoch”)

  • **recording (bool) – saves a learned 2d manifold at each training step

Return type:

None

update_metadict()[source]

compile_trainer(train_data, test_data=None, optimizer=None, elbo_fn=None, training_cycles=100, batch_size=32, **kwargs)

Compiles model’s trainer

Parameters:
  • train_data (Tuple[Union[Tensor, ndarray]]) – Train data and (optionally) corresponding targets or labels

  • test_data (Optional[Tuple[Union[Tensor, ndarray]]]) – Test data and (optionally) corresponding targets or labels

  • optimizer (Optional[Type[Optimizer]]) – Weights optimizer. Defaults to Adam with learning rate 1e-4

  • elbo_fn (Optional[Callable]) – function that calculates elbo loss

  • training_cycles (int) – Number of training iterations (aka “epochs”)

  • batch_size (int) – Size of mini-batch for training

  • **kwargs (Union[str, float]) – Additional keyword arguments are ‘filename’ (for saving model) and ‘memory_alloc’ (threshold for keeping data on GPU)

Return type:

None

decode(z_sample, y=None)

Takes a point in latent space and maps it to data space via the learned generative model

Parameters:
  • z_sample (Union[ndarray, Tensor]) – coordinates in latent space

  • y (Union[int, ndarray, Tensor, None]) – label (optional)

Return type:

ndarray

Returns:

Generated (“decoded”) image(s)

encode(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Mean and SD of the encoded continuous distribution, and alphas (“class probabilities”) for the encoded discrete distribution(s) (if any). For rVAE, the output is (z_mean, z_sd). For jVAE and jrVAE, the output is (z_mean, z_sd, alphas). In all cases, z_mean contains the encoded angle as its 1st dimension, the encoded x- and y-shifts as its 2nd and 3rd dimensions (if translation is set to True), and the standard VAE latent variables as its 4th, 5th, …, n-th dimensions (2nd, 3rd, …, n-th dimensions if translation is False)
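
Example (a sketch of unpacking the encoded latent mean for an rVAE trained with translation=True; imstack_test is a hypothetical stack of test images):

>>> z_mean, z_sd = rvae.encode(imstack_test)
>>> angle = z_mean[:, 0]       # encoded rotation angle
>>> xy_shift = z_mean[:, 1:3]  # encoded x- and y-shifts
>>> content = z_mean[:, 3:]    # remaining "content" latent variables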

encode_(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Concatenated array of encoded vectors

encode_image_(img, **kwargs)

Crops and encodes a subimage around each pixel in the input image. The size of the subimage is determined by the size of the images in the VAE training data.

Parameters:
  • img (ndarray) – 2D numpy array

  • **num_batches (int) – number of batches for encoding subimages

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image and encoded array (cropping is due to finite window size)

encode_images(imgdata, **kwargs)

Encodes every pixel of every image in image stack

Parameters:
  • imgdata (ndarray) – 3D numpy array of images. Can also be a single 2D image

  • **num_batches (int) – number of batches for encoding pixels of a single image

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image stack and encoded array (cropping is due to finite window size)

encode_trajectories(imgdata, coord_class_dict, window_size, min_length, rmax, **kwargs)

Calculates trajectories and latent variable value for each point in a trajectory.

Parameters:
  • imgdata (ndarray) – NN output (preferable) or raw data

  • coord_class_dict (Dict[int, ndarray]) – atomic/defect/particle coordinates

  • window_size (int) – size of subimages to crop

  • min_length (int) – minimum length of trajectory to be included

  • rmax (int) – maximum allowed distance (projected on the xy plane) between a defect in one frame and the position of its nearest neighbor in the next one

  • **num_batches (int) – number of batches for self.encode (Default: 10)

Return type:

Tuple[List[ndarray], List[ndarray]]

Returns:

List of encoded trajectories and corresponding movie frame numbers

evaluate_model()

Evaluates model on test data

kld_normal(z, q_param, p_param=None)

Calculates KL divergence term between two normal distributions or (if p_param = None) between normal and standard normal distributions

Parameters:
  • z (Tensor) – latent vector (reparametrized)

  • q_param (Tuple[Tensor]) – tuple with mean and SD of the 1st distribution

  • p_param (Optional[Tuple[Tensor]]) – tuple with mean and SD of the 2nd distribution (optional)

Return type:

Tensor

load_weights(filepath)

Loads saved weights

Return type:

None

classmethod log_normal(x, mu, log_sd)

Computes log-pdf for a normal distribution

Return type:

Tensor

classmethod log_unit_normal(x)

Computes log-pdf of a unit normal distribution

Return type:

Tensor

manifold2d(**kwargs)

Performs mapping from latent space to data space, allowing the learned manifold to be visualized. This works only for a 2D latent space (not counting angle & translation dimensions)

Parameters:
  • **d (int) – grid size

  • **l1 (list) – range of 1st latent variable

  • **l2 (list) – range of 2nd latent variable

  • **label (int) – label in class-conditioned (r)VAE

  • **disc_idx (int) – discrete “class”

  • **cmap (str) – color map (Default: gnuplot)

  • **draw_grid (bool) – plot semi-transparent grid

  • **origin (str) – plot origin (e.g. ‘lower’)

Return type:

None

manifold_traversal(cont_idx, d=10, cont_idx_fixed=0, plot=True, **kwargs)

Latent space traversals for joint continuous and discrete latent representations

Return type:

ndarray

print_statistics(e)

Prints training and (optionally) test loss after each training cycle

reconstruct(x_new, **kwargs)

Forward prediction with uncertainty quantification by sampling from the encoded mean and std. Works only for regular VAE (and not for rVAE)

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **label (int) – class to be reconstructed (for cVAE, crVAE, jVAE, and jrVAE)

  • **num_samples (int) – number of samples to generate from normal distribution

Return type:

ndarray

Returns:

Ensemble of “decoded” images

classmethod reparameterize(z_mean, z_sd)

Reparameterization trick for continuous distributions

Return type:

Tensor

classmethod reparameterize_discrete(alpha, tau)

Reparameterization trick for discrete gumbel-softmax distributions

save_model(*args)

Saves trained weights and the key model parameters

Return type:

None

save_weights(*args)

Saves trained weights

Return type:

None

set_data(X_train, y_train=None, X_test=None, y_test=None, memory_alloc=4)

Initializes train and (optionally) test data loaders

Return type:

None

set_decoder(decoder_net)

Sets a decoder network only

Return type:

None

set_encoder(encoder_net)

Sets an encoder network only

Return type:

None

set_model(encoder_net, decoder_net)

Sets encoder and decoder models

Return type:

None

train_epoch()

Trains a single epoch

classmethod visualize_manifold_learning(frames_dir, **kwargs)

Creates and stores a video showing evolution of learned 2D manifold during rVAE’s training

Parameters:
  • frames_dir (str) – directory with snapshots of manifold as .png files (the files should be named as “1.png”, “2.png”, etc.)

  • **moviename (str) – name of the movie

  • **frame_duration (int) – duration of each movie frame

Return type:

None

Joint Variational Autoencoder (jVAE)

class atomai.models.jVAE(in_dim=None, latent_dim=2, discrete_dim=[2], nb_classes=0, seed=0, **kwargs)[source]

Bases: BaseVAE

VAE for joint (continuous + discrete) latent representations

Parameters:
  • in_dim (Optional[int]) – Input dimensions for image data passed as (height, width) for grayscale data or (height, width, channels) for multichannel data

  • latent_dim (int) – Number of latent dimensions associated with image content

  • discrete_dim (List[int]) – List specifying dimensionalities of discrete (Gumbel-Softmax) latent variables associated with image content

  • nb_classes (int) – Number of classes for class-conditional VAE (leave it at 0 to learn discrete latent representations)

  • seed (int) – seed for torch and numpy (pseudo-)random numbers generators

  • **conv_encoder (bool) – use convolutional layers in encoder

  • **conv_decoder (bool) – use convolutional layers in decoder

  • **numlayers_encoder (int) – number of layers in encoder (Default: 2)

  • **numlayers_decoder (int) – number of layers in decoder (Default: 2)

  • **numhidden_encoder (int) – number of hidden units OR conv filters in encoder (Default: 128)

  • **numhidden_decoder (int) – number of hidden units in decoder (Default: 128)

  • **skip (bool) – uses generative skip model with residual paths between latents and decoder layers (Default: False)

Example:

>>> input_dim = (28, 28)  # input dimensions
>>> # Initialize model
>>> jvae = aoi.models.jVAE(input_dim, latent_dim=2, discrete_dim=[10],
>>>                        numlayers_encoder=3, numhidden_encoder=512,
>>>                        numlayers_decoder=3, numhidden_decoder=512)
>>> # Train
>>> jvae.fit(imstack_train, training_cycles=100, batch_size=100)
>>> # View a traversal of the learned manifold
>>> jvae.manifold_traversal(cont_idx=1, origin="upper", cmap="gnuplot2")

elbo_fn(x, x_reconstr, *args, **kwargs)[source]

Computes ELBO

Return type:

Tensor

forward_compute_elbo(x, y=None, mode='train')[source]

Joint VAE’s forward pass with training/test loss computation

Return type:

Tensor

fit(X_train, y_train=None, X_test=None, y_test=None, loss='mse', **kwargs)[source]

Trains joint VAE model

Parameters:
  • X_train (Union[ndarray, Tensor]) – For images, 3D or 4D stack of training images with dimensions (n_images, height, width) for grayscale data or (n_images, height, width, channels) for multi-channel data. For spectra, 2D stack of spectra with dimensions (n_spectra, length)

  • y_train (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of training images/spectra

  • X_test (Union[ndarray, Tensor, None]) – 3D or 4D stack of test images or 2D stack of spectra with the same dimensions as for the X_train (Default: None)

  • y_test (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of test images/spectra

  • loss (str) – reconstruction loss function, “ce” or “mse” (Default: “mse”)

  • **cont_capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the continuous latent channel. Default values: [5.0, 25000, 30]. Based on https://arxiv.org/pdf/1804.03599.pdf & https://arxiv.org/abs/1804.00104

  • **disc_capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the discrete latent channel(s). Default values: [5.0, 25000, 30]. Based on https://arxiv.org/pdf/1804.03599.pdf & https://arxiv.org/abs/1804.00104

  • **filename (str) – file path for saving model after each training cycle (“epoch”)

Return type:

None

update_metadict()[source]

compile_trainer(train_data, test_data=None, optimizer=None, elbo_fn=None, training_cycles=100, batch_size=32, **kwargs)

Compiles model’s trainer

Parameters:
  • train_data (Tuple[Union[Tensor, ndarray]]) – Train data and (optionally) corresponding targets or labels

  • test_data (Optional[Tuple[Union[Tensor, ndarray]]]) – Test data and (optionally) corresponding targets or labels

  • optimizer (Optional[Type[Optimizer]]) – Weights optimizer. Defaults to Adam with learning rate 1e-4

  • elbo_fn (Optional[Callable]) – function that calculates elbo loss

  • training_cycles (int) – Number of training iterations (aka “epochs”)

  • batch_size (int) – Size of mini-batch for training

  • **kwargs (Union[str, float]) – Additional keyword arguments are ‘filename’ (for saving model) and ‘memory_alloc’ (threshold for keeping data on GPU)

Return type:

None

decode(z_sample, y=None)

Takes a point in latent space and maps it to data space via the learned generative model

Parameters:
  • z_sample (Union[ndarray, Tensor]) – coordinates in latent space

  • y (Union[int, ndarray, Tensor, None]) – label (optional)

Return type:

ndarray

Returns:

Generated (“decoded”) image(s)

encode(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Mean and SD of the encoded continuous distribution, and alphas (“class probabilities”) for the encoded discrete distribution(s) (if any). For rVAE, the output is (z_mean, z_sd). For jVAE and jrVAE, the output is (z_mean, z_sd, alphas). In all cases, z_mean contains the encoded angle as its 1st dimension, the encoded x- and y-shifts as its 2nd and 3rd dimensions (if translation is set to True), and the standard VAE latent variables as its 4th, 5th, …, n-th dimensions (2nd, 3rd, …, n-th dimensions if translation is False)

encode_(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Concatenated array of encoded vectors

encode_image_(img, **kwargs)

Crops and encodes a subimage around each pixel in the input image. The size of the subimage is determined by the size of the images in the VAE training data.

Parameters:
  • img (ndarray) – 2D numpy array

  • **num_batches (int) – number of batches for encoding subimages

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image and encoded array (cropping is due to finite window size)

encode_images(imgdata, **kwargs)

Encodes every pixel of every image in image stack

Parameters:
  • imgdata (ndarray) – 3D numpy array of images. Can also be a single 2D image

  • **num_batches (int) – number of batches for encoding pixels of a single image

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image stack and encoded array (cropping is due to finite window size)

encode_trajectories(imgdata, coord_class_dict, window_size, min_length, rmax, **kwargs)

Calculates trajectories and latent variable value for each point in a trajectory.

Parameters:
  • imgdata (ndarray) – NN output (preferable) or raw data

  • coord_class_dict (Dict[int, ndarray]) – atomic/defect/particle coordinates

  • window_size (int) – size of subimages to crop

  • min_length (int) – minimum length of trajectory to be included

  • rmax (int) – maximum allowed distance (projected on the xy plane) between a defect in one frame and the position of its nearest neighbor in the next one

  • **num_batches (int) – number of batches for self.encode (Default: 10)

Return type:

Tuple[List[ndarray], List[ndarray]]

Returns:

List of encoded trajectories and corresponding movie frame numbers

evaluate_model()

Evaluates model on test data

kld_normal(z, q_param, p_param=None)

Calculates KL divergence term between two normal distributions or (if p_param = None) between normal and standard normal distributions

Parameters:
  • z (Tensor) – latent vector (reparametrized)

  • q_param (Tuple[Tensor]) – tuple with mean and SD of the 1st distribution

  • p_param (Optional[Tuple[Tensor]]) – tuple with mean and SD of the 2nd distribution (optional)

Return type:

Tensor

load_weights(filepath)

Loads saved weights

Return type:

None

classmethod log_normal(x, mu, log_sd)

Computes log-pdf for a normal distribution

Return type:

Tensor

classmethod log_unit_normal(x)

Computes log-pdf of a unit normal distribution

Return type:

Tensor

manifold2d(**kwargs)

Performs mapping from latent space to data space, allowing the learned manifold to be visualized. This works only for a 2D latent space (not counting angle & translation dimensions)

Parameters:
  • **d (int) – grid size

  • **l1 (list) – range of 1st latent variable

  • **l2 (list) – range of 2nd latent variable

  • **label (int) – label in class-conditioned (r)VAE

  • **disc_idx (int) – discrete “class”

  • **cmap (str) – color map (Default: gnuplot)

  • **draw_grid (bool) – plot semi-transparent grid

  • **origin (str) – plot origin (e.g. ‘lower’)

Return type:

None

manifold_traversal(cont_idx, d=10, cont_idx_fixed=0, plot=True, **kwargs)

Latent space traversals for joint continuous and discrete latent representations

Return type:

ndarray

print_statistics(e)

Prints training and (optionally) test loss after each training cycle

reconstruct(x_new, **kwargs)

Forward prediction with uncertainty quantification by sampling from the encoded mean and std. Works only for regular VAE (and not for rVAE)

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **label (int) – class to be reconstructed (for cVAE, crVAE, jVAE, and jrVAE)

  • **num_samples (int) – number of samples to generate from normal distribution

Return type:

ndarray

Returns:

Ensemble of “decoded” images

classmethod reparameterize(z_mean, z_sd)

Reparameterization trick for continuous distributions

Return type:

Tensor

classmethod reparameterize_discrete(alpha, tau)

Reparameterization trick for discrete gumbel-softmax distributions

save_model(*args)

Saves trained weights and the key model parameters

Return type:

None

save_weights(*args)

Saves trained weights

Return type:

None

set_data(X_train, y_train=None, X_test=None, y_test=None, memory_alloc=4)

Initializes train and (optionally) test data loaders

Return type:

None

set_decoder(decoder_net)

Sets a decoder network only

Return type:

None

set_encoder(encoder_net)

Sets an encoder network only

Return type:

None

set_model(encoder_net, decoder_net)

Sets encoder and decoder models

Return type:

None

train_epoch()

Trains a single epoch

classmethod visualize_manifold_learning(frames_dir, **kwargs)

Creates and stores a video showing evolution of learned 2D manifold during rVAE’s training

Parameters:
  • frames_dir (str) – directory with snapshots of manifold as .png files (the files should be named as “1.png”, “2.png”, etc.)

  • **moviename (str) – name of the movie

  • **frame_duration (int) – duration of each movie frame

Return type:

None

Joint Rotational Variational Autoencoder (jrVAE)

class atomai.models.jrVAE(in_dim=None, latent_dim=2, discrete_dim=[2], nb_classes=0, translation=True, seed=0, **kwargs)[source]

Bases: BaseVAE

Rotationally-invariant VAE for joint continuous and discrete latent representations.

Parameters:
  • in_dim (Optional[int]) – Input dimensions for image data passed as (height, width) for grayscale data or (height, width, channels) for multichannel data

  • latent_dim (int) – Number of latent dimensions associated with image content

  • discrete_dim (List[int]) – List specifying dimensionalities of discrete (Gumbel-Softmax) latent variables associated with image content

  • nb_classes (int) – Number of classes for class-conditional VAE (leave it at 0 to learn discrete latent representations)

  • translation (bool) – account for xy shifts of image content (Default: True)

  • seed (int) – seed for torch and numpy (pseudo-)random numbers generators

  • **conv_encoder (bool) – use convolutional layers in encoder

  • **numlayers_encoder (int) – number of layers in encoder (Default: 2)

  • **numlayers_decoder (int) – number of layers in decoder (Default: 2)

  • **numhidden_encoder (int) – number of hidden units OR conv filters in encoder (Default: 128)

  • **numhidden_decoder (int) – number of hidden units in decoder (Default: 128)

  • **skip (bool) – uses generative skip model with residual paths between latents and decoder layers (Default: False)

Example:

>>> input_dim = (28, 28)  # input dimensions
>>> # Initialize model
>>> jrvae = aoi.models.jrVAE(input_dim, latent_dim=2, discrete_dim=[10],
>>>                         numlayers_encoder=3, numhidden_encoder=512,
>>>                         numlayers_decoder=3, numhidden_decoder=512)
>>> # Train
>>> jrvae.fit(imstack_train, training_cycles=100,
>>>           batch_size=100, rotation_prior=np.pi/4)
>>> jrvae.manifold2d(origin="upper", cmap="gnuplot2")

elbo_fn(x, x_reconstr, *args, **kwargs)[source]

Computes ELBO

Return type:

Tensor

forward_compute_elbo(x, y=None, mode='train')[source]

Joint rVAE’s forward pass with training/test loss computation

Return type:

Tensor

fit(X_train, y_train=None, X_test=None, y_test=None, loss='mse', verbose='True', **kwargs)[source]

Trains joint rVAE model

Parameters:
  • X_train (Union[ndarray, Tensor]) – 3D or 4D stack of training images with dimensions (n_images, height, width) for grayscale data or (n_images, height, width, channels) for multi-channel data

  • y_train (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of training images

  • X_test (Union[ndarray, Tensor, None]) – 3D or 4D stack of test images with the same dimensions as for the X_train (Default: None)

  • y_test (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of test images

  • loss (str) – reconstruction loss function, “ce” or “mse” (Default: “mse”)

  • **translation_prior (float) – translation prior

  • **rotation_prior (float) – rotational prior

  • **temperature (float) – Relaxation parameter for Gumbel-Softmax distribution

  • **cont_capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the continuous latent channel. Default values: [5.0, 25000, 30]. Based on https://arxiv.org/pdf/1804.03599.pdf & https://arxiv.org/abs/1804.00104

  • **disc_capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the discrete latent channel(s). Default values: [5.0, 25000, 30]. Based on https://arxiv.org/pdf/1804.03599.pdf & https://arxiv.org/abs/1804.00104

  • **filename (str) – file path for saving model after each training cycle (“epoch”)

  • verbose (str) – display training output, “True” or “False” (Default: “True”)

Return type:

None

update_metadict()[source]

compile_trainer(train_data, test_data=None, optimizer=None, elbo_fn=None, training_cycles=100, batch_size=32, **kwargs)

Compiles model’s trainer

Parameters:
  • train_data (Tuple[Union[Tensor, ndarray]]) – Train data and (optionally) corresponding targets or labels

  • test_data (Optional[Tuple[Union[Tensor, ndarray]]]) – Test data and (optionally) corresponding targets or labels

  • optimizer (Optional[Type[Optimizer]]) – Weights optimizer. Defaults to Adam with learning rate 1e-4

  • elbo_fn (Optional[Callable]) – function that calculates elbo loss

  • training_cycles (int) – Number of training iterations (aka “epochs”)

  • batch_size (int) – Size of mini-batch for training

  • **kwargs (Union[str, float]) – Additional keyword arguments are ‘filename’ (for saving model) and ‘memory_alloc’ (threshold for keeping data on GPU)

Return type:

None

decode(z_sample, y=None)

Takes a point in latent space and maps it to data space via the learned generative model

Parameters:
  • z_sample (Union[ndarray, Tensor]) – coordinates in latent space

  • y (Union[int, ndarray, Tensor, None]) – label (optional)

Return type:

ndarray

Returns:

Generated (“decoded”) image(s)

encode(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Mean and SD of the encoded continuous distribution, and alphas (“class probabilities”) for the encoded discrete distribution(s) (if any). For rVAE, the output is (z_mean, z_sd). For jVAE and jrVAE, the output is (z_mean, z_sd, alphas). In all cases, z_mean contains the encoded angle as its 1st dimension, the encoded x- and y-shifts as its 2nd and 3rd dimensions (if translation is set to True), and the standard VAE latent variables as its 4th, 5th, …, n-th dimensions (2nd, 3rd, …, n-th dimensions if translation is False)

encode_(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Concatenated array of encoded vectors

encode_image_(img, **kwargs)

Crops and encodes a subimage around each pixel in the input image. The size of the subimage is determined by the size of the images in the VAE training data.

Parameters:
  • img (ndarray) – 2D numpy array

  • **num_batches (int) – number of batches for encoding subimages

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image and encoded array (cropping is due to finite window size)

encode_images(imgdata, **kwargs)

Encodes every pixel of every image in image stack

Parameters:
  • imgdata (ndarray) – 3D numpy array of images. Can also be a single 2D image

  • **num_batches (int) – number of batches for encoding pixels of a single image

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image stack and encoded array (cropping is due to finite window size)

encode_trajectories(imgdata, coord_class_dict, window_size, min_length, rmax, **kwargs)

Calculates trajectories and latent variable value for each point in a trajectory.

Parameters:
  • imgdata (ndarray) – NN output (preferable) or raw data

  • coord_class_dict (Dict[int, ndarray]) – atomic/defect/particle coordinates

  • window_size (int) – size of subimages to crop

  • min_length (int) – minimum length of trajectory to be included

  • rmax (int) – maximum allowed distance (projected on the xy plane) between a defect in one frame and the position of its nearest neighbor in the next one

  • **num_batches (int) – number of batches for self.encode (Default: 10)

Return type:

Tuple[List[ndarray], List[ndarray]]

Returns:

List of encoded trajectories and corresponding movie frame numbers

evaluate_model()

Evaluates model on test data

kld_normal(z, q_param, p_param=None)

Calculates KL divergence term between two normal distributions or (if p_param = None) between normal and standard normal distributions

Parameters:
  • z (Tensor) – latent vector (reparametrized)

  • q_param (Tuple[Tensor]) – tuple with mean and SD of the 1st distribution

  • p_param (Optional[Tuple[Tensor]]) – tuple with mean and SD of the 2nd distribution (optional)

Return type:

Tensor

load_weights(filepath)

Loads saved weights

Return type:

None

classmethod log_normal(x, mu, log_sd)

Computes log-pdf for a normal distribution

Return type:

Tensor

classmethod log_unit_normal(x)

Computes log-pdf of a unit normal distribution

Return type:

Tensor

manifold2d(**kwargs)

Performs mapping from latent space to data space, allowing the learned manifold to be visualized. This works only for a 2D latent space (not counting angle & translation dimensions)

Parameters:
  • **d (int) – grid size

  • **l1 (list) – range of 1st latent variable

  • **l2 (list) – range of 2nd latent variable

  • **label (int) – label in class-conditioned (r)VAE

  • **disc_idx (int) – discrete “class”

  • **cmap (str) – color map (Default: gnuplot)

  • **draw_grid (bool) – plot semi-transparent grid

  • **origin (str) – plot origin (e.g. ‘lower’)

Return type:

None

manifold_traversal(cont_idx, d=10, cont_idx_fixed=0, plot=True, **kwargs)

Latent space traversals for joint continuous and discrete latent representations

Return type:

ndarray

print_statistics(e)

Prints training and (optionally) test loss after each training cycle

reconstruct(x_new, **kwargs)

Forward prediction with uncertainty quantification by sampling from the encoded mean and std. Works only for regular VAE (and not for rVAE)

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **label (int) – class to be reconstructed (for cVAE, crVAE, jVAE, and jrVAE)

  • **num_samples (int) – number of samples to generate from normal distribution

Return type:

ndarray

Returns:

Ensemble of “decoded” images

classmethod reparameterize(z_mean, z_sd)

Reparameterization trick for continuous distributions

Return type:

Tensor

classmethod reparameterize_discrete(alpha, tau)

Reparameterization trick for discrete gumbel-softmax distributions

save_model(*args)

Saves trained weights and the key model parameters

Return type:

None

save_weights(*args)

Saves trained weights

Return type:

None

set_data(X_train, y_train=None, X_test=None, y_test=None, memory_alloc=4)

Initializes train and (optionally) test data loaders

Return type:

None

set_decoder(decoder_net)

Sets a decoder network only

Return type:

None

set_encoder(encoder_net)

Sets an encoder network only

Return type:

None

set_model(encoder_net, decoder_net)

Sets encoder and decoder models

Return type:

None

train_epoch()

Trains a single epoch

classmethod visualize_manifold_learning(frames_dir, **kwargs)

Creates and stores a video showing evolution of learned 2D manifold during rVAE’s training

Parameters:
  • frames_dir (str) – directory with snapshots of manifold as .png files (the files should be named as “1.png”, “2.png”, etc.)

  • **moviename (str) – name of the movie

  • **frame_duration (int) – duration of each movie frame

Return type:

None

Deep Kernel Learning

class atomai.models.dklGPR(indim, embedim=2, shared_embedding_space=True, **kwargs)[source]

Bases: dklGPTrainer

Deep kernel learning (DKL)-based Gaussian process regression (GPR)

Parameters:
  • indim (int) – input feature dimension

  • embedim (int) – embedding dimension (determines dimensionality of kernel space)

  • shared_embedding_space (bool) – use one embedding space for all target outputs

Keyword Arguments:
  • device – Sets device to which model and data will be moved. Defaults to ‘cuda:0’ if a GPU is available and to CPU otherwise.

  • precision – Sets tensor types for ‘single’ (torch.float32) or ‘double’ (torch.float64) precision

  • seed – Seed for enforcing reproducibility

Examples

Train a DKL-GPR model with high-dimensional inputs X and outputs y:

>>> data_dim = X.shape[-1]  # X dimensions are n_samples x d
>>> dklgp = aoi.models.dklGPR(data_dim, embedim=2, precision="double")
>>> dklgp.fit(X, y, training_cycles=100, lr=1e-2)

Make a prediction on new data (mean and variance for each ‘test’ point):

>>> mean, var = dklgp.predict(X_test, batch_size=len(X_test))

Alternatively, one can obtain a prediction as follows:

>>> samples = dklgp.sample_from_posterior(X_test, num_samples=1000)
>>> mean, var = samples.mean(0), samples.var(0)

fit(X, y, training_cycles=1, **kwargs)[source]

Initializes and trains a deep kernel GP model

Parameters:
  • X (Union[Tensor, ndarray]) – Input training data (aka features) of N x input_dim dimensions

  • y (Union[Tensor, ndarray]) – Output targets of batch_size x N or N (if batch_size=1) dimensions

  • training_cycles (int) – Number of training epochs

Keyword Arguments:
  • feature_extractor – (Optional) Custom neural network for feature extractor. Must take input/feature dims and embedding dims as its arguments.

  • freeze_weights – Freezes weights of feature extractor, that is, they are not passed to the optimizer. Used for transfer learning.

  • lr – learning rate (Default: 0.01)

  • print_loss – print loss at every n-th training cycle (epoch)

Return type:

None

fit_ensemble(X, y, training_cycles=1, n_models=5, **kwargs)[source]

Initializes and trains an ensemble of deep kernel GP models

Parameters:
  • X (Union[Tensor, ndarray]) – Input training data (aka features) of N x input_dim dimensions

  • y (Union[Tensor, ndarray]) – Output targets of batch_size x N or N (if batch_size=1) dimensions

  • training_cycles (int) – Number of training epochs

  • n_models (int) – Number of models in ensemble

Keyword Arguments:
  • feature_extractor – (Optional) Custom neural network for feature extractor. Must take input/feature dims and embedding dims as its arguments.

  • freeze_weights – Freezes weights of feature extractor, that is, they are not passed to the optimizer. Used for transfer learning.

  • lr – learning rate (Default: 0.01)

  • print_loss – print loss at every n-th training cycle (epoch)

Return type:

None
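
Example (a minimal sketch; X and y are the training features and targets described above):

>>> dklgp = aoi.models.dklGPR(X.shape[-1], embedim=2)
>>> dklgp.fit_ensemble(X, y, training_cycles=100, n_models=5, lr=1e-2)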

sample_from_posterior(X, num_samples=1000)[source]

Computes the posterior over model outputs at the provided points (X) and samples from it

Return type:

ndarray

thompson(X_cand, scalarize_func=None, maximize=True)[source]

Thompson sampling for selecting the next measurement point

Return type:

Tuple[ndarray, int]
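
Example (a sketch; X_cand is a hypothetical 2D array of candidate points, and the returned index is assumed to point into X_cand):

>>> obj, idx = dklgp.thompson(X_cand, maximize=True)
>>> next_point = X_cand[idx]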

predict(x_new, **kwargs)[source]

Prediction of mean and variance using the trained model

Return type:

Tuple[ndarray]

embed(x_new, **kwargs)[source]

Embeds the input data to a “latent” space using a trained feature extractor NN.

Return type:

Tensor
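
Example (a minimal sketch; X_test is assumed to have the same feature dimension the model was trained on, with the embedded output of shape (n_samples, embedim)):

>>> embedded = dklgp.embed(X_test)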

compile_multi_model_trainer(X, y, training_cycles=1, **kwargs)

Initializes deep kernel (feature extractor NNs + base kernels), sets optimizer and “loss” function. For vector-valued functions (multiple outputs), it assumes one latent space per output, that is, the number of neural networks is equal to the number of Gaussian processes. For example, if the outputs are spectra of length 128, one will have 128 neural networks and 128 GPs trained in parallel. It can also be used for training an ensemble of models for the same scalar output.

Return type:

None

compile_trainer(X, y, training_cycles=1, **kwargs)

Initializes deep kernel (feature extractor NN + base kernel), sets optimizer and “loss” function. For vector-valued functions (multiple outputs), it assumes a shared latent space, that is, a single neural network is connected to multiple Gaussian processes.

Parameters:
  • X (Union[Tensor, ndarray]) – Input training data (aka features) of N x input_dim dimensions

  • y (Union[Tensor, ndarray]) – Output targets of batch_size x N or N (if batch_size=1) dimensions

  • training_cycles (int) – Number of training epochs

Keyword Arguments:
  • feature_extractor – (Optional) Custom neural network for feature extractor. Must take input/feature dims and embedding dims as its arguments.

  • grid_size – Grid size for structured kernel interpolation (Default: 50)

  • freeze_weights – Freezes weights of feature extractor, that is, they are not passed to the optimizer. Used for transfer learning.

  • lr – learning rate (Default: 0.01)

Return type:

None

print_statistics(e)

run(X=None, y=None, training_cycles=1, **kwargs)

Initializes and trains a deep kernel GP model

Parameters:
  • X (Union[Tensor, ndarray, None]) – Input training data (aka features) of N x input_dim dimensions

  • y (Union[Tensor, ndarray, None]) – Output targets of batch_size x N or N (if batch_size=1) dimensions

  • training_cycles (int) – Number of training epochs

Keyword Arguments:
  • feature_extractor – (Optional) Custom neural network for feature extractor

  • freeze_weights – Freezes weights of feature extractor, that is, they are not passed to the optimizer. Used for transfer learning.

  • grid_size – Grid size for structured kernel interpolation (Default: 50)

  • lr – learning rate (Default: 0.01)

  • print_loss – print loss at every n-th training cycle (epoch)

Return type:

Type[ExactGP]

save_weights(filename)

Saves weights of the feature extractor.

Return type:

None

set_data(x, y=None, device=None)

Data preprocessing. Casts the data array to the selected tensor type and moves it to the selected device.

Return type:

Tuple[Tensor]

train_step()

Single training step with backpropagation to compute gradients and optimize weights.

Return type:

None

Load trained models

atomai.models.load_model(filepath)[source]

Loads trained AtomAI models

Parameters:

filepath (str) – filepath to meta-state dictionary with trained weights and information about the model’s structure

Return type:

Union[Segmentor, VAE, rVAE, jrVAE, jVAE, ImSpec]

Returns:

Model in evaluation state
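
Example (a minimal sketch; "model_metadict.tar" is a hypothetical path to a saved meta-state dictionary):

>>> model = aoi.models.load_model("model_metadict.tar")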

atomai.models.load_ensemble(filepath)[source]

Loads trained ensemble models

Parameters:

filepath (str) – filepath to dictionary with trained weights and key information about the model’s structure

Return type:

Tuple[Type[Module], Dict[int, Dict[str, Tensor]]]

Returns:

Single model with averaged weights and dictionary with weights of all models