AtomAI Models

Segmentor

class atomai.models.Segmentor(model='Unet', nb_classes=1, **kwargs)[source]

Bases: SegTrainer

Model for semantic segmentation-based analysis of images

Parameters:
  • model (Type[Union[str, Module]]) – Type of model to train: ‘Unet’, ‘ResHedNet’ or ‘dilnet’ (Default: ‘Unet’). See atomai.nets for more details. One can also pass here a custom fully convolutional neural network model.

  • nb_classes (int) – Number of classes in classification scheme (last NN layer)

  • **batch_norm (bool) – Apply batch normalization after each convolutional layer (Default: True)

  • **dropout (bool) – Apply dropouts to the three inner blocks in the middle of a network (Default: False)

  • **upsampling_mode (str) – “bilinear” or “nearest” upsampling method (Default: “bilinear”)

  • **nb_filters (int) – Number of convolutional filters (aka “kernels”) in the first convolutional block (this number doubles in the consecutive block(s); see the definitions of the Unet and dilnet models for details)

  • **with_dilation (bool) – Use dilated convolutions in the bottleneck of Unet (Default: False)

  • **layers (list) – List with a number of layers in each block. For UNet the first 4 elements in the list are used to determine the number of layers in each block of the encoder (including bottleneck layer), and the number of layers in the decoder is chosen accordingly (to maintain symmetry between encoder and decoder)

Example:

>>> # Initialize model
>>> model = aoi.models.Segmentor(nb_classes=3)
>>> # Train
>>> model.fit(images, masks, images_test, masks_test,
>>>        training_cycles=300, compute_accuracy=True, swa=True)
>>> # Predict with trained model
>>> nn_output, coordinates = model.predict(expdata)

fit(X_train, y_train, X_test=None, y_test=None, loss='ce', optimizer=None, training_cycles=1000, batch_size=32, compute_accuracy=False, full_epoch=False, swa=False, perturb_weights=False, **kwargs)[source]

Compiles a trainer and performs model training

Parameters:
  • X_train (Union[ndarray, Tensor]) – 4D numpy array or pytorch tensor of training images (n_samples, 1, height, width). One can also pass a regular 3D image stack without a channel dimension of 1 which will be added automatically

  • y_train (Union[ndarray, Tensor]) – 4D (binary) / 3D (multiclass) numpy array or pytorch tensor of training masks (aka ground truth) stacked along the first dimension. In the multiclass case, X_train is 4-dimensional while y_train is 3-dimensional because of how the cross-entropy loss is calculated in PyTorch (see https://pytorch.org/docs/stable/nn.html#nllloss).

  • X_test (Union[ndarray, Tensor, None]) – 4D numpy array or pytorch tensor of test images (stacked along the first dimension)

  • y_test (Union[ndarray, Tensor, None]) – 4D (binary) / 3D (multiclass) numpy array or pytorch tensor of test masks (aka ground truth) stacked along the first dimension.

  • loss (str) – loss function. Available loss functions are: ‘mse’ (MSE), ‘ce’ (cross-entropy), ‘focal’ (focal loss; single class only), and ‘dice’ (dice loss)

  • optimizer (Optional[Type[Optimizer]]) – weights optimizer (defaults to Adam optimizer with lr=1e-3)

  • training_cycles (int) – Number of training ‘epochs’. If full_epoch argument is set to False, 1 epoch == 1 mini-batch. Otherwise, each cycle corresponds to all mini-batches of data passing through a NN.

  • batch_size (int) – Size of training and test mini-batches

  • compute_accuracy (bool) – Computes accuracy function at each training cycle

  • full_epoch (bool) – If True, passes all mini-batches of training/test data at each training cycle and computes the average loss. If False, passes a single (randomly chosen) mini-batch at each cycle.

  • swa (bool) – Saves the recent stochastic weights and averages them at the end of training

  • perturb_weights (bool) – Time-dependent weight perturbation, \(w\leftarrow w + a / (1 + e)^\gamma\), where parameters a and gamma can be passed as a dictionary, together with parameter e_p determining every n-th epoch at which a perturbation is applied

  • **lr_scheduler (list of floats) – List with learning rates for each training iteration/epoch. If the length of the list is smaller than the number of training iterations, the last value in the list is used for the remaining iterations.

  • **print_loss (int) – Prints loss every n-th epoch

  • **accuracy_metrics (str) – Accuracy metrics (used only for printing training statistics)

  • **filename (str) – Filename for model weights (appended with “_test_weights_best.pt” and “_weights_final.pt”)

  • **plot_training_history (bool) – Plots training and test curves vs. training cycles at the end of training

  • **kwargs – One can also pass kwargs for utils.datatransform class to perform the augmentation “on-the-fly” (e.g. rotation=True, gauss_noise=[20, 60], etc.)
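
Example (a minimal sketch of training with on-the-fly augmentation; images, masks, images_test, and masks_test are assumed to be arrays prepared as described above):

>>> model = aoi.models.Segmentor(nb_classes=3)
>>> # Augmentation kwargs below are forwarded to utils.datatransform
>>> model.fit(images, masks, images_test, masks_test,
>>>           training_cycles=500, rotation=True,
>>>           gauss_noise=[20, 60], lr_scheduler=[1e-3, 1e-4])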

predict(imgdata, refine=False, logits=True, resize=None, compute_coords=True, **kwargs)[source]

Apply (trained) model to new data

Parameters:
  • imgdata – 3D image stack or a single 2D image (all greyscale)

  • refine (bool) – Atomic positions refinement with 2d Gaussian peak fitting (may take some time)

  • logits (bool) – Indicates whether the features are passed through a softmax/sigmoid layer at the end of a neural network (logits=True for AtomAI models)

  • resize (Optional[Tuple[int, int]]) – Resizes input data to (new_height, new_width) before passing to a neural network

  • compute_coords (bool) – Computes centers of the mass of individual blobs in the segmented images (Default: True)

  • **thresh (float) – Value between 0 and 1 for thresholding the NN output (Default: 0.5)

  • **d (int) – half-side of a square around each atomic position used for refinement with 2d Gaussian peak fitting. Defaults to 1/4 of average nearest neighbor atomic distance

  • **num_batches (int) – number of batches (Default: 10)

  • **norm (bool) – Normalize data to (0, 1) during pre-processing

  • **verbose (bool) – verbosity

Return type:

Tuple[ndarray, Dict[int, ndarray]]

Returns:

Semantically segmented image and coordinates of (atomic) objects
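
Example (a minimal sketch; expdata is a hypothetical 2D experimental image and model is a trained Segmentor):

>>> nn_output, coordinates = model.predict(expdata, refine=True, thresh=0.6)
>>> coords_first = coordinates[0]  # coordinates are keyed by image/frame index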

load_weights(filepath)[source]

Loads saved weights dictionary

Return type:

None

ImSpec

class atomai.models.ImSpec(in_dim, out_dim, latent_dim=2, **kwargs)[source]

Bases: ImSpecTrainer

Model for predicting spectra from images and vice versa

Parameters:
  • in_dim (Tuple[int]) – Input data dimensions. (height, width) for images or (length,) for spectra

  • out_dim (Tuple[int]) – Output dimensions. (length,) for spectra or (height, width) for images

  • latent_dim (int) – Dimensionality of the latent space (number of neurons in a fully connected “bottleneck” layer)

  • **seed (int) – Seed used when initializing model weights (Default: 1)

  • **nblayers_encoder (int) – Number of convolutional layers in the encoder

  • **nblayers_decoder (int) – Number of convolutional layers in the decoder

  • **nbfilters_encoder (int) – Number of convolutional filters in each layer of the encoder

  • **nbfilters_decoder (int) – Number of convolutional filters in each layer of the decoder

  • **batch_norm (bool) – Apply batch normalization after each convolutional layer (Default: True)

  • **encoder_downsampling (int) – Downsamples input data by this factor before passing to convolutional layers (Default: no downsampling)

  • **decoder_upsampling (bool) – Performs upsampling+convolution operation twice on the reshaped latent vector (starting from image/spectra dims 4x smaller than the target dims) before passing to the decoder

Example:

>>> in_dim = (16, 16)  # Input dimensions (images)
>>> out_dim = (64,)  # Output dimensions (spectra)
>>> # Initialize and train model
>>> model = aoi.models.ImSpec(in_dim, out_dim, latent_dim=10)
>>> model.fit(imgs_train, spectra_train, imgs_test, spectra_test,
>>>           full_epoch=True, training_cycles=120, swa=True)
>>> # Make a prediction with the trained model
>>> prediction = model.predict(imgs_test, norm=False)

fit(X_train, y_train, X_test=None, y_test=None, loss='mse', optimizer=None, training_cycles=1000, batch_size=64, compute_accuracy=False, full_epoch=False, swa=False, perturb_weights=False, **kwargs)[source]

Compiles a trainer and performs model training

Parameters:
  • X_train (Union[ndarray, Tensor]) – 4D numpy array or torch tensor with image data (n_samples x 1 x height x width) or 3D array/tensor with spectral data (n_samples x 1 x signal_length). It is also possible to pass 3D and 2D arrays without the channel dimension of 1, which will be added automatically. The X_train is typically referred to as ‘features’

  • y_train (Union[ndarray, Tensor]) – 3D numpy array or torch tensor with spectral data (n_samples x 1 x signal_length) or 4D array/tensor with image data (n_samples x 1 x height x width). It is also possible to pass 2D and 3D arrays without the channel dimension of 1, which will be added automatically. Note that if your X_train data are images, then your y_train must be spectra and vice versa (see the shape sketch after this parameter list). The y_train is typically referred to as “targets”

  • X_test (Union[ndarray, Tensor, None]) – Test data (features) of the same dimensionality (except for the number of samples) as X_train

  • y_test (Union[ndarray, Tensor, None]) – Test data (targets) of the same dimensionality (except for the number of samples) as y_train

  • loss (str) – Loss function (currently works only with ‘mse’)

  • optimizer (Optional[Type[Optimizer]]) – weights optimizer (defaults to Adam optimizer with lr=1e-3)

  • training_cycles (int) – Number of training ‘epochs’. If full_epoch argument is set to False, 1 epoch == 1 mini-batch. Otherwise, each cycle corresponds to all mini-batches of data passing through a NN.

  • batch_size (int) – Size of training and test mini-batches

  • full_epoch (bool) – If True, passes all mini-batches of training/test data at each training cycle and computes the average loss. If False, passes a single (randomly chosen) mini-batch at each cycle.

  • swa (bool) – Saves the recent stochastic weights and averages them at the end of training

  • perturb_weights (bool) – Time-dependent weight perturbation, \(w\leftarrow w + a / (1 + e)^\gamma\), where parameters a and gamma can be passed as a dictionary, together with parameter e_p determining every n-th epoch at which a perturbation is applied

  • **print_loss (int) – Prints loss every n-th epoch

  • **filename (str) – Filename for model weights (appended with “_test_weights_best.pt” and “_weights_final.pt”)

  • **plot_training_history (bool) – Plots training and test curves vs. training cycles at the end of training

  • **kwargs – One can also pass kwargs for utils.datatransform class to perform the augmentation “on-the-fly” (e.g. gauss_noise=[20, 60], etc.)
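
Example (a shape sketch assuming 500 hypothetical image-spectrum training pairs; the channel dimension of 1 is added automatically):

>>> # imgs_train: (500, 16, 16); spectra_train: (500, 64)
>>> model = aoi.models.ImSpec((16, 16), (64,), latent_dim=10)
>>> model.fit(imgs_train, spectra_train, imgs_test, spectra_test,
>>>           training_cycles=200, batch_size=64)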

predict(data, **kwargs)[source]

Apply (trained) model to new data

Parameters:
  • data (ndarray) – Input image/spectrum or batch of images/spectra

  • **num_batches (int) – number of batches (Default: 10)

  • **norm (bool) – Normalize data to (0, 1) during pre-processing

  • **verbose (bool) – verbosity (Default: True)

Return type:

ndarray

load_weights(filepath)[source]

Loads saved weights dictionary

Return type:

None

Variational Autoencoder (VAE)

class atomai.models.VAE(in_dim=None, latent_dim=2, nb_classes=0, seed=0, **kwargs)[source]

Bases: BaseVAE

Implements a standard Variational Autoencoder (VAE)

Parameters:
  • in_dim (Optional[int]) – Input dimensions for image data passed as (height, width) for grayscale data or (height, width, channels) for multichannel data

  • latent_dim (int) – Number of VAE latent dimensions

  • nb_classes (int) – Number of classes for class-conditional VAE

  • seed (int) – seed for torch and numpy (pseudo-)random numbers generators

  • **conv_encoder (bool) – use convolutional layers in encoder

  • **conv_decoder (bool) – use convolutional layers in decoder

  • **numlayers_encoder (int) – number of layers in encoder (Default: 2)

  • **numlayers_decoder (int) – number of layers in decoder (Default: 2)

  • **numhidden_encoder (int) – number of hidden units OR conv filters in encoder (Default: 128)

  • **numhidden_decoder (int) – number of hidden units OR conv filters in decoder (Default: 128)

Example:

>>> input_dim = (28, 28) # Input data dimensions (without n_samples)
>>> # Initialize model
>>> vae = aoi.models.VAE(input_dim)
>>> # Train
>>> vae.fit(imstack_train, training_cycles=100, batch_size=100)
>>> # Visualize learned manifold (for 2 latent dimensions)
>>> vae.manifold2d(origin="upper", cmap="gnuplot2")

One can also pass labels to train a class-conditioned VAE

>>> # Initialize model
>>> vae = aoi.models.VAE(input_dim, nb_classes=10)
>>> # Train
>>> vae.fit(imstack_train, labels_train, training_cycles=100, batch_size=100)
>>> # Visualize learned manifold for class 1
>>> vae.manifold2d(label=1, origin="upper", cmap="gnuplot2")

elbo_fn(x, x_reconstr, *args, **kwargs)[source]

Calculates ELBO

Return type:

Tensor

forward_compute_elbo(x, y=None, mode='train')[source]

VAE’s forward pass with training/test loss computation

Return type:

Tensor

compile_trainer(train_data, test_data=None, optimizer=None, elbo_fn=None, training_cycles=100, batch_size=32, **kwargs)

Compiles model’s trainer

Parameters:
  • train_data (Tuple[Union[Tensor, ndarray]]) – Train data and (optionally) corresponding targets or labels

  • test_data (Optional[Tuple[Union[Tensor, ndarray]]]) – Test data and (optionally) corresponding targets or labels

  • optimizer (Optional[Type[Optimizer]]) – Weights optimizer. Defaults to Adam with learning rate 1e-4

  • elbo_fn (Optional[Callable]) – function that calculates elbo loss

  • training_cycles (int) – Number of training iterations (aka “epochs”)

  • batch_size (int) – Size of mini-batch for training

  • **kwargs (Union[str, float]) – Additional keyword arguments are ‘filename’ (for saving model) and ‘memory_alloc’ (threshold for keeping data on GPU)

Return type:

None

decode(z_sample, y=None)

Takes a point in latent space and maps it to data space via the learned generative model

Parameters:
  • z_sample (Union[ndarray, Tensor]) – coordinates in latent space

  • y (Union[int, ndarray, Tensor, None]) – label (optional)

Return type:

ndarray

Returns:

Generated (“decoded”) image(s)
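
Example (a minimal sketch; vae is assumed to be a trained model with a 2D latent space):

>>> import numpy as np
>>> z = np.array([[0.0, 1.5]])  # a single point in the 2D latent space
>>> decoded_img = vae.decode(z)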

encode(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Mean and SD of the encoded continuous distribution, and alphas (“class probabilities”) for the encoded discrete distribution(s) (if any). For rVAE, the output is (z_mean, z_sd). For jVAE and jrVAE, the output is (z_mean, z_sd, alphas). In all cases, z_mean contains the encoded angle as its 1st dimension, the encoded x- and y-shifts as its 2nd and 3rd dimensions (if translation is set to True), and the standard VAE latent variables as its 4th, 5th, …, n-th dimensions (2nd, 3rd, …, n-th dimensions if translation is False)
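
Example (a minimal sketch; imstack_test is a hypothetical stack of test images and vae is a trained standard VAE, so no angle/shift dimensions are present):

>>> z_mean, z_sd = vae.encode(imstack_test)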

encode_(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Concatenated array of encoded vectors

encode_image_(img, **kwargs)

Crops and encodes a subimage around each pixel in the input image. The size of the subimage is determined by the size of the images in the VAE training data.

Parameters:
  • img (ndarray) – 2D numpy array

  • **num_batches (int) – number of batches for encoding subimages

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image and encoded array (cropping is due to finite window size)

encode_images(imgdata, **kwargs)

Encodes every pixel of every image in image stack

Parameters:
  • imgdata (ndarray) – 3D numpy array of images. Can also be a single 2D image

  • **num_batches (int) – number of batches for encoding pixels of a single image

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image stack and encoded array (cropping is due to finite window size)

encode_trajectories(imgdata, coord_class_dict, window_size, min_length, rmax, **kwargs)

Calculates trajectories and latent variable value for each point in a trajectory.

Parameters:
  • imgdata (ndarray) – NN output (preferable) or raw data

  • coord_class_dict (Dict[int, ndarray]) – atomic/defect/particle coordinates

  • window_size (int) – size of subimages to crop

  • min_length (int) – minimum length of trajectory to be included

  • rmax (int) – maximum allowed distance (projected on the xy plane) between a defect in one frame and the position of its nearest neighbor in the next one

  • **num_batches (int) – number of batches for self.encode (Default: 10)

Return type:

Tuple[List[ndarray], List[ndarray]]

Returns:

List of encoded trajectories and corresponding movie frame numbers

evaluate_model()

Evaluates model on test data

fit(X_train, y_train=None, X_test=None, y_test=None, loss='mse', **kwargs)[source]

Trains VAE model

Parameters:
  • X_train (Union[ndarray, Tensor]) – For images, 3D or 4D stack of training images with dimensions (n_images, height, width) for grayscale data or (n_images, height, width, channels) for multi-channel data. For spectra, 2D stack of spectra with dimensions (n_spectra, length)

  • y_train (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of training images/spectra

  • X_test (Union[ndarray, Tensor, None]) – 3D or 4D stack of test images or 2D stack of spectra with the same dimensions as for the X_train (Default: None)

  • y_test (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of test images/spectra

  • loss (str) – reconstruction loss function, “ce” or “mse” (Default: “mse”)

  • **capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the latent channel. Based on https://arxiv.org/pdf/1804.03599.pdf

  • **filename (str) – file path for saving model after each training cycle (“epoch”)

Return type:

None
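
Example (a minimal sketch; the capacity values are illustrative only, following the reference above):

>>> vae = aoi.models.VAE((28, 28), latent_dim=2)
>>> vae.fit(imstack_train, X_test=imstack_test, training_cycles=100,
>>>         batch_size=100, capacity=[5.0, 25000, 30])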

kld_normal(z, q_param, p_param=None)

Calculates KL divergence term between two normal distributions or (if p_param = None) between normal and standard normal distributions

Parameters:
  • z (Tensor) – latent vector (reparametrized)

  • q_param (Tuple[Tensor]) – tuple with mean and SD of the 1st distribution

  • p_param (Optional[Tuple[Tensor]]) – tuple with mean and SD of the 2nd distribution (optional)

Return type:

Tensor

load_weights(filepath)

Loads saved weights

Return type:

None

classmethod log_normal(x, mu, log_sd)

Computes log-pdf for a normal distribution

Return type:

Tensor

classmethod log_unit_normal(x)

Computes log-pdf of a unit normal distribution

Return type:

Tensor

manifold2d(**kwargs)

Performs mapping from latent space to data space, allowing the learned manifold to be visualized. This works only for a 2D latent space (not counting angle & translation dimensions)

Parameters:
  • **d (int) – grid size

  • **l1 (list) – range of 1st latent variable

  • **l2 (list) – range of 2nd latent variable

  • **label (int) – label in class-conditioned (r)VAE

  • **disc_idx (int) – discrete “class”

  • **cmap (str) – color map (Default: gnuplot)

  • **draw_grid (bool) – plot semi-transparent grid

  • **origin (str) – plot origin (e.g. ‘lower’)

Return type:

None

manifold_traversal(cont_idx, d=10, cont_idx_fixed=0, plot=True, **kwargs)

Latent space traversals for joint continuous and discrete latent representations

Return type:

ndarray

print_statistics(e)

Prints training and (optionally) test loss after each training cycle

reconstruct(x_new, **kwargs)

Forward prediction with uncertainty quantification by sampling from the encoded mean and std. Works only for regular VAE (and not for rVAE)

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **label (int) – class to be reconstructed (for cVAE, crVAE, jVAE, and jrVAE)

  • **num_samples (int) – number of samples to generate from normal distribution

Return type:

ndarray

Returns:

Ensemble of “decoded” images
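
Example (a minimal sketch; x_new is a hypothetical input image, and averaging over the returned ensemble, assumed to be stacked along the first axis, is one way to summarize it):

>>> decoded_ensemble = vae.reconstruct(x_new, num_samples=32)
>>> decoded_mean = decoded_ensemble.mean(axis=0)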

classmethod reparameterize(z_mean, z_sd)

Reparameterization trick for continuous distributions

Return type:

Tensor

classmethod reparameterize_discrete(alpha, tau)

Reparameterization trick for discrete gumbel-softmax distributions

save_model(*args)

Saves trained weights and the key model parameters

Return type:

None

save_weights(*args)

Saves trained weights

Return type:

None

set_data(X_train, y_train=None, X_test=None, y_test=None, memory_alloc=4)

Initializes train and (optionally) test data loaders

Return type:

None

set_decoder(decoder_net)

Sets a decoder network only

Return type:

None

set_encoder(encoder_net)

Sets an encoder network only

Return type:

None

set_model(encoder_net, decoder_net)

Sets encoder and decoder models

Return type:

None

train_epoch()

Trains a single epoch

classmethod visualize_manifold_learning(frames_dir, **kwargs)

Creates and stores a video showing evolution of learned 2D manifold during rVAE’s training

Parameters:
  • frames_dir (str) – directory with snapshots of manifold as .png files (the files should be named as “1.png”, “2.png”, etc.)

  • **moviename (str) – name of the movie

  • **frame_duration (int) – duration of each movie frame

Return type:

None

update_metadict()[source]

Rotational Variational Autoencoder (rVAE)

class atomai.models.rVAE(in_dim=None, latent_dim=2, nb_classes=0, translation=True, seed=0, **kwargs)[source]

Bases: BaseVAE

Implements a rotationally and translationally invariant Variational Autoencoder (VAE) based on the idea of a “spatial decoder” by Bepler et al. in arXiv:1909.11663. In addition, this class allows implementing a class-conditioned VAE and skip-VAE (arXiv:1807.04863) with rotational and translational invariance.

Parameters:
  • in_dim (Optional[int]) – Input dimensions for image data passed as (height, width) for grayscale data or (height, width, channels) for multichannel data

  • latent_dim (int) – Number of VAE latent dimensions associated with image content

  • nb_classes (int) – Number of classes for class-conditional rVAE

  • translation (bool) – account for xy shifts of image content (Default: True)

  • seed (int) – seed for torch and numpy (pseudo-)random numbers generators

  • **conv_encoder (bool) – use convolutional layers in encoder

  • **numlayers_encoder (int) – number of layers in encoder (Default: 2)

  • **numlayers_decoder (int) – number of layers in decoder (Default: 2)

  • **numhidden_encoder (int) – number of hidden units OR conv filters in encoder (Default: 128)

  • **numhidden_decoder (int) – number of hidden units in decoder (Default: 128)

  • **skip (bool) – uses generative skip model with residual paths between latents and decoder layers (Default: False)

Example:

>>> input_dim = (28, 28)  # input dimensions
>>> # Intitialize model
>>> rvae = aoi.models.rVAE(input_dim)
>>> # Train
>>> rvae.fit(imstack_train, training_cycles=100,
>>>          batch_size=100, rotation_prior=np.pi/2)
>>> rvae.manifold2d(origin="upper", cmap="gnuplot2")

One can also pass labels to train a class-conditioned rVAE

>>> # Initialize model
>>> rvae = aoi.models.rVAE(input_dim, nb_classes=10)
>>> # Train
>>> rvae.fit(imstack_train, labels_train, training_cycles=100,
>>>          batch_size=100, rotation_prior=np.pi/2)
>>> # Visualize learned manifold for class 1
>>> rvae.manifold2d(label=1, origin="upper", cmap="gnuplot2")

elbo_fn(x, x_reconstr, *args, **kwargs)[source]

Computes ELBO

Return type:

Tensor

forward_compute_elbo(x, y=None, mode='train')[source]

rVAE’s forward pass with training/test loss computation

Return type:

Tensor

fit(X_train, y_train=None, X_test=None, y_test=None, loss='mse', **kwargs)[source]

Trains rVAE model

Parameters:
  • X_train (Union[ndarray, Tensor]) – 3D or 4D stack of training images with dimensions (n_images, height, width) for grayscale data or (n_images, height, width, channels) for multi-channel data

  • y_train (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of training images

  • X_test (Union[ndarray, Tensor, None]) – 3D or 4D stack of test images with the same dimensions as for the X_train (Default: None)

  • y_test (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of test images

  • loss (str) – reconstruction loss function, “ce” or “mse” (Default: “mse”)

  • **translation_prior (float) – translation prior

  • **rotation_prior (float) – rotational prior

  • **capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the latent channel. Based on https://arxiv.org/pdf/1804.03599.pdf

  • **filename (str) – file path for saving model after each training cycle (“epoch”)

  • **recording (bool) – saves a learned 2d manifold at each training step

Return type:

None

update_metadict()[source]

compile_trainer(train_data, test_data=None, optimizer=None, elbo_fn=None, training_cycles=100, batch_size=32, **kwargs)

Compiles model’s trainer

Parameters:
  • train_data (Tuple[Union[Tensor, ndarray]]) – Train data and (optionally) corresponding targets or labels

  • test_data (Optional[Tuple[Union[Tensor, ndarray]]]) – Test data and (optionally) corresponding targets or labels

  • optimizer (Optional[Type[Optimizer]]) – Weights optimizer. Defaults to Adam with learning rate 1e-4

  • elbo_fn (Optional[Callable]) – function that calculates elbo loss

  • training_cycles (int) – Number of training iterations (aka “epochs”)

  • batch_size (int) – Size of mini-batch for training

  • **kwargs (Union[str, float]) – Additional keyword arguments are ‘filename’ (for saving model) and ‘memory_alloc’ (threshold for keeping data on GPU)

Return type:

None

decode(z_sample, y=None)

Takes a point in latent space and maps it to data space via the learned generative model

Parameters:
  • z_sample (Union[ndarray, Tensor]) – coordinates in latent space

  • y (Union[int, ndarray, Tensor, None]) – label (optional)

Return type:

ndarray

Returns:

Generated (“decoded”) image(s)

encode(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Mean and SD of the encoded continuous distribution, and alphas (“class probabilities”) for the encoded discrete distribution(s) (if any). For rVAE, the output is (z_mean, z_sd). For jVAE and jrVAE, the output is (z_mean, z_sd, alphas). In all cases, z_mean contains the encoded angle as its 1st dimension, the encoded x- and y-shifts as its 2nd and 3rd dimensions (if translation is set to True), and the standard VAE latent variables as its 4th, 5th, …, n-th dimensions (2nd, 3rd, …, n-th dimensions if translation is False)
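
Example (a sketch of unpacking the encoded latent mean for an rVAE trained with translation=True; imstack_test is a hypothetical stack of test images):

>>> z_mean, z_sd = rvae.encode(imstack_test)
>>> angle = z_mean[:, 0]       # encoded rotation angle
>>> xy_shift = z_mean[:, 1:3]  # encoded x- and y-shifts
>>> content = z_mean[:, 3:]    # remaining "content" latent variables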

encode_(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Concatenated array of encoded vectors

encode_image_(img, **kwargs)

Crops and encodes a subimage around each pixel in the input image. The size of the subimage is determined by the size of the images in the VAE training data.

Parameters:
  • img (ndarray) – 2D numpy array

  • **num_batches (int) – number of batches for encoding subimages

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image and encoded array (cropping is due to finite window size)

encode_images(imgdata, **kwargs)

Encodes every pixel of every image in image stack

Parameters:
  • imgdata (ndarray) – 3D numpy array of images. Can also be a single 2D image

  • **num_batches (int) – number of batches for encoding pixels of a single image

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image stack and encoded array (cropping is due to finite window size)

encode_trajectories(imgdata, coord_class_dict, window_size, min_length, rmax, **kwargs)

Calculates trajectories and latent variable value for each point in a trajectory.

Parameters:
  • imgdata (ndarray) – NN output (preferable) or raw data

  • coord_class_dict (Dict[int, ndarray]) – atomic/defect/particle coordinates

  • window_size (int) – size of subimages to crop

  • min_length (int) – minimum length of trajectory to be included

  • rmax (int) – maximum allowed distance (projected on the xy plane) between a defect in one frame and the position of its nearest neighbor in the next one

  • **num_batches (int) – number of batches for self.encode (Default: 10)

Return type:

Tuple[List[ndarray], List[ndarray]]

Returns:

List of encoded trajectories and corresponding movie frame numbers

evaluate_model()

Evaluates model on test data

kld_normal(z, q_param, p_param=None)

Calculates KL divergence term between two normal distributions or (if p_param = None) between normal and standard normal distributions

Parameters:
  • z (Tensor) – latent vector (reparametrized)

  • q_param (Tuple[Tensor]) – tuple with mean and SD of the 1st distribution

  • p_param (Optional[Tuple[Tensor]]) – tuple with mean and SD of the 2nd distribution (optional)

Return type:

Tensor

load_weights(filepath)

Loads saved weights

Return type:

None

classmethod log_normal(x, mu, log_sd)

Computes log-pdf for a normal distribution

Return type:

Tensor

classmethod log_unit_normal(x)

Computes log-pdf of a unit normal distribution

Return type:

Tensor

manifold2d(**kwargs)

Performs mapping from latent space to data space, allowing the learned manifold to be visualized. This works only for a 2D latent space (not counting angle & translation dimensions)

Parameters:
  • **d (int) – grid size

  • **l1 (list) – range of 1st latent variable

  • **l2 (list) – range of 2nd latent variable

  • **label (int) – label in class-conditioned (r)VAE

  • **disc_idx (int) – discrete “class”

  • **cmap (str) – color map (Default: gnuplot)

  • **draw_grid (bool) – plot semi-transparent grid

  • **origin (str) – plot origin (e.g. ‘lower’)

Return type:

None

manifold_traversal(cont_idx, d=10, cont_idx_fixed=0, plot=True, **kwargs)

Latent space traversals for joint continuous and discrete latent representations

Return type:

ndarray

print_statistics(e)

Prints training and (optionally) test loss after each training cycle

reconstruct(x_new, **kwargs)

Forward prediction with uncertainty quantification by sampling from the encoded mean and std. Works only for regular VAE (and not for rVAE)

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **label (int) – class to be reconstructed (for cVAE, crVAE, jVAE, and jrVAE)

  • **num_samples (int) – number of samples to generate from normal distribution

Return type:

ndarray

Returns:

Ensemble of “decoded” images

classmethod reparameterize(z_mean, z_sd)

Reparameterization trick for continuous distributions

Return type:

Tensor

classmethod reparameterize_discrete(alpha, tau)

Reparameterization trick for discrete gumbel-softmax distributions

save_model(*args)

Saves trained weights and the key model parameters

Return type:

None

save_weights(*args)

Saves trained weights

Return type:

None

set_data(X_train, y_train=None, X_test=None, y_test=None, memory_alloc=4)

Initializes train and (optionally) test data loaders

Return type:

None

set_decoder(decoder_net)

Sets a decoder network only

Return type:

None

set_encoder(encoder_net)

Sets an encoder network only

Return type:

None

set_model(encoder_net, decoder_net)

Sets encoder and decoder models

Return type:

None

train_epoch()

Trains a single epoch

classmethod visualize_manifold_learning(frames_dir, **kwargs)

Creates and stores a video showing evolution of learned 2D manifold during rVAE’s training

Parameters:
  • frames_dir (str) – directory with snapshots of manifold as .png files (the files should be named as “1.png”, “2.png”, etc.)

  • **moviename (str) – name of the movie

  • **frame_duration (int) – duration of each movie frame

Return type:

None

Joint Variational Autoencoder (jVAE)

class atomai.models.jVAE(in_dim=None, latent_dim=2, discrete_dim=[2], nb_classes=0, seed=0, **kwargs)[source]

Bases: BaseVAE

VAE for joint (continuous + discrete) latent representations

Parameters:
  • in_dim (Optional[int]) – Input dimensions for image data passed as (height, width) for grayscale data or (height, width, channels) for multichannel data

  • latent_dim (int) – Number of latent dimensions associated with image content

  • discrete_dim (List[int]) – List specifying dimensionalities of discrete (Gumbel-Softmax) latent variables associated with image content

  • nb_classes (int) – Number of classes for class-conditional VAE (leave it at 0 to learn discrete latent representations)

  • seed (int) – seed for torch and numpy (pseudo-)random numbers generators

  • **conv_encoder (bool) – use convolutional layers in encoder

  • **conv_decoder (bool) – use convolutional layers in decoder

  • **numlayers_encoder (int) – number of layers in encoder (Default: 2)

  • **numlayers_decoder (int) – number of layers in decoder (Default: 2)

  • **numhidden_encoder (int) – number of hidden units OR conv filters in encoder (Default: 128)

  • **numhidden_decoder (int) – number of hidden units in decoder (Default: 128)

  • **skip (bool) – uses generative skip model with residual paths between latents and decoder layers (Default: False)

Example:

>>> input_dim = (28, 28)  # input dimensions
>>> # Initialize model
>>> jvae = aoi.models.jVAE(input_dim, latent_dim=2, discrete_dim=[10],
>>>                        numlayers_encoder=3, numhidden_encoder=512,
>>>                        numlayers_decoder=3, numhidden_decoder=512)
>>> # Train
>>> jvae.fit(imstack_train, training_cycles=100, batch_size=100)
>>> # View a traversal of the learned manifold
>>> jvae.manifold_traversal(cont_idx=1, origin="upper", cmap="gnuplot2")

elbo_fn(x, x_reconstr, *args, **kwargs)[source]

Computes ELBO

Return type:

Tensor

forward_compute_elbo(x, y=None, mode='train')[source]

Joint VAE’s forward pass with training/test loss computation

Return type:

Tensor

fit(X_train, y_train=None, X_test=None, y_test=None, loss='mse', **kwargs)[source]

Trains joint VAE model

Parameters:
  • X_train (Union[ndarray, Tensor]) – For images, 3D or 4D stack of training images with dimensions (n_images, height, width) for grayscale data or (n_images, height, width, channels) for multi-channel data. For spectra, 2D stack of spectra with dimensions (n_spectra, length)

  • y_train (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of training images/spectra

  • X_test (Union[ndarray, Tensor, None]) – 3D or 4D stack of test images or 2D stack of spectra with the same dimensions as for the X_train (Default: None)

  • y_test (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of test images/spectra

  • loss (str) – reconstruction loss function, “ce” or “mse” (Default: “mse”)

  • **cont_capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the continuous latent channel. Default values: [5.0, 25000, 30]. Based on https://arxiv.org/pdf/1804.03599.pdf & https://arxiv.org/abs/1804.00104

  • **disc_capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the discrete latent channel(s). Default values: [5.0, 25000, 30]. Based on https://arxiv.org/pdf/1804.03599.pdf & https://arxiv.org/abs/1804.00104

  • **filename (str) – file path for saving model after each training cycle (“epoch”)

Return type:

None

update_metadict()[source]

compile_trainer(train_data, test_data=None, optimizer=None, elbo_fn=None, training_cycles=100, batch_size=32, **kwargs)

Compiles model’s trainer

Parameters:
  • train_data (Tuple[Union[Tensor, ndarray]]) – Train data and (optionally) corresponding targets or labels

  • test_data (Optional[Tuple[Union[Tensor, ndarray]]]) – Test data and (optionally) corresponding targets or labels

  • optimizer (Optional[Type[Optimizer]]) – Weights optimizer. Defaults to Adam with learning rate 1e-4

  • elbo_fn (Optional[Callable]) – function that calculates elbo loss

  • training_cycles (int) – Number of training iterations (aka “epochs”)

  • batch_size (int) – Size of mini-batch for training

  • **kwargs (Union[str, float]) – Additional keyword arguments are ‘filename’ (for saving model) and ‘memory_alloc’ (threshold for keeping data on GPU)

Return type:

None

decode(z_sample, y=None)

Takes a point in latent space and maps it to data space via the learned generative model

Parameters:
  • z_sample (Union[ndarray, Tensor]) – coordinates in latent space

  • y (Union[int, ndarray, Tensor, None]) – label (optional)

Return type:

ndarray

Returns:

Generated (“decoded”) image(s)

encode(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Mean and SD of the encoded continuous distribution, and alphas (“class probabilities”) for the encoded discrete distribution(s) (if any). For rVAE, the output is (z_mean, z_sd). For jVAE and jrVAE, the output is (z_mean, z_sd, alphas). In all cases, z_mean contains the encoded angle as its 1st dimension, the encoded x- and y-shifts as its 2nd and 3rd dimensions (if translation is set to True), and the standard VAE latent variables as its 4th, 5th, …, n-th dimensions (2nd, 3rd, …, n-th dimensions if translation is False)

encode_(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Concatenated array of encoded vectors

encode_image_(img, **kwargs)

Crops and encodes a subimage around each pixel in the input image. The size of the subimage is determined by the size of the images in the VAE training data.

Parameters:
  • img (ndarray) – 2D numpy array

  • **num_batches (int) – number of batches for encoding subimages

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image and encoded array (cropping is due to finite window size)

encode_images(imgdata, **kwargs)

Encodes every pixel of every image in image stack

Parameters:
  • imgdata (ndarray) – 3D numpy array of images. Can also be a single 2D image

  • **num_batches (int) – number of batches for encoding pixels of a single image

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image stack and encoded array (cropping is due to finite window size)

encode_trajectories(imgdata, coord_class_dict, window_size, min_length, rmax, **kwargs)

Calculates trajectories and latent variable value for each point in a trajectory.

Parameters:
  • imgdata (ndarray) – NN output (preferable) or raw data

  • coord_class_dict (Dict[int, ndarray]) – atomic/defect/particle coordinates

  • window_size (int) – size of subimages to crop

  • min_length (int) – minimum length of trajectory to be included

  • rmax (int) – maximum allowed distance (projected on the xy plane) between a defect in one frame and the position of its nearest neighbor in the next one

  • **num_batches (int) – number of batches for self.encode (Default: 10)

Return type:

Tuple[List[ndarray], List[ndarray]]

Returns:

List of encoded trajectories and corresponding movie frame numbers

evaluate_model()

Evaluates model on test data

kld_normal(z, q_param, p_param=None)

Calculates KL divergence term between two normal distributions or (if p_param = None) between normal and standard normal distributions

Parameters:
  • z (Tensor) – latent vector (reparametrized)

  • q_param (Tuple[Tensor]) – tuple with mean and SD of the 1st distribution

  • p_param (Optional[Tuple[Tensor]]) – tuple with mean and SD of the 2nd distribution (optional)

Return type:

Tensor

load_weights(filepath)

Loads saved weights

Return type:

None

classmethod log_normal(x, mu, log_sd)

Computes log-pdf for a normal distribution

Return type:

Tensor

classmethod log_unit_normal(x)

Computes log-pdf of a unit normal distribution

Return type:

Tensor

manifold2d(**kwargs)

Performs mapping from latent space to data space, allowing the learned manifold to be visualized. This works only for a 2D latent space (not counting angle & translation dimensions)

Parameters:
  • **d (int) – grid size

  • **l1 (list) – range of 1st latent variable

  • **l2 (list) – range of 2nd latent variable

  • **label (int) – label in class-conditioned (r)VAE

  • **disc_idx (int) – discrete “class”

  • **cmap (str) – color map (Default: gnuplot)

  • **draw_grid (bool) – plot semi-transparent grid

  • **origin (str) – plot origin (e.g. ‘lower’)

Return type:

None

manifold_traversal(cont_idx, d=10, cont_idx_fixed=0, plot=True, **kwargs)

Latent space traversals for joint continuous and discrete latent representations

Return type:

ndarray

print_statistics(e)

Prints training and (optionally) test loss after each training cycle

reconstruct(x_new, **kwargs)

Forward prediction with uncertainty quantification by sampling from the encoded mean and std. Works only for regular VAE (and not for rVAE)

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **label (int) – class to be reconstructed (for cVAE, crVAE, jVAE, and jrVAE)

  • **num_samples (int) – number of samples to generate from normal distribution

Return type:

ndarray

Returns:

Ensemble of “decoded” images

classmethod reparameterize(z_mean, z_sd)

Reparameterization trick for continuous distributions

Return type:

Tensor

classmethod reparameterize_discrete(alpha, tau)

Reparameterization trick for discrete gumbel-softmax distributions

save_model(*args)

Saves trained weights and the key model parameters

Return type:

None

save_weights(*args)

Saves trained weights

Return type:

None

set_data(X_train, y_train=None, X_test=None, y_test=None, memory_alloc=4)

Initializes train and (optionally) test data loaders

Return type:

None

set_decoder(decoder_net)

Sets a decoder network only

Return type:

None

set_encoder(encoder_net)

Sets an encoder network only

Return type:

None

set_model(encoder_net, decoder_net)

Sets encoder and decoder models

Return type:

None

train_epoch()

Trains a single epoch

classmethod visualize_manifold_learning(frames_dir, **kwargs)

Creates and stores a video showing evolution of learned 2D manifold during rVAE’s training

Parameters:
  • frames_dir (str) – directory with snapshots of manifold as .png files (the files should be named as “1.png”, “2.png”, etc.)

  • **moviename (str) – name of the movie

  • **frame_duration (int) – duration of each movie frame

Return type:

None

Joint Rotational Variational Autoencoder (jrVAE)

class atomai.models.jrVAE(in_dim=None, latent_dim=2, discrete_dim=[2], nb_classes=0, translation=True, seed=0, **kwargs)[source]

Bases: BaseVAE

Rotationally-invariant VAE for joint continuous and discrete latent representations.

Parameters:
  • in_dim (Optional[int]) – Input dimensions for image data passed as (height, width) for grayscale data or (height, width, channels) for multichannel data

  • latent_dim (int) – Number of latent dimensions associated with image content

  • discrete_dim (List[int]) – List specifying dimensionalities of discrete (Gumbel-Softmax) latent variables associated with image content

  • nb_classes (int) – Number of classes for class-conditional VAE (leave it at 0 to learn discrete latent representations)

  • translation (bool) – account for xy shifts of image content (Default: True)

  • seed (int) – seed for torch and numpy (pseudo-)random numbers generators

  • **conv_encoder (bool) – use convolutional layers in encoder

  • **numlayers_encoder (int) – number of layers in encoder (Default: 2)

  • **numlayers_decoder (int) – number of layers in decoder (Default: 2)

  • **numhidden_encoder (int) – number of hidden units OR conv filters in encoder (Default: 128)

  • **numhidden_decoder (int) – number of hidden units in decoder (Default: 128)

  • **skip (bool) – uses generative skip model with residual paths between latents and decoder layers (Default: False)

Example:

>>> input_dim = (28, 28)  # input dimensions
>>> # Initialize model
>>> jrvae = aoi.models.jrVAE(input_dim, latent_dim=2, discrete_dim=[10],
>>>                         numlayers_encoder=3, numhidden_encoder=512,
>>>                         numlayers_decoder=3, numhidden_decoder=512)
>>> # Train
>>> jrvae.fit(imstack_train, training_cycles=100,
>>>           batch_size=100, rotation_prior=np.pi/4)
>>> jrvae.manifold2d(origin="upper", cmap="gnuplot2")

elbo_fn(x, x_reconstr, *args, **kwargs)[source]

Computes ELBO

Return type:

Tensor

forward_compute_elbo(x, y=None, mode='train')[source]

Joint rVAE’s forward pass with training/test loss computation

Return type:

Tensor

fit(X_train, y_train=None, X_test=None, y_test=None, loss='mse', verbose='True', **kwargs)[source]

Trains joint rVAE model

Parameters:
  • X_train (Union[ndarray, Tensor]) – 3D or 4D stack of training images with dimensions (n_images, height, width) for grayscale data or (n_images, height, width, channels) for multi-channel data

  • y_train (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of training images

  • X_test (Union[ndarray, Tensor, None]) – 3D or 4D stack of test images with the same dimensions as for the X_train (Default: None)

  • y_test (Union[ndarray, Tensor, None]) – Vector with labels of dimension (n_images,), where n_images is a number of test images

  • loss (str) – reconstruction loss function, “ce” or “mse” (Default: “mse”)

  • **translation_prior (float) – translation prior

  • **rotation_prior (float) – rotational prior

  • **temperature (float) – Relaxation parameter for Gumbel-Softmax distribution

  • **cont_capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the continuous latent channel. Default values: [5.0, 25000, 30]. Based on https://arxiv.org/pdf/1804.03599.pdf & https://arxiv.org/abs/1804.00104

  • **disc_capacity (list) – List containing (max_capacity, num_iters, gamma) parameters to control the capacity of the discrete latent channel(s). Default values: [5.0, 25000, 30]. Based on https://arxiv.org/pdf/1804.03599.pdf & https://arxiv.org/abs/1804.00104

  • **filename (str) – file path for saving model after each training cycle (“epoch”)

  • verbose (str) – display training output, “True” or “False” (Default: “True”)

Return type:

None

update_metadict()[source]

compile_trainer(train_data, test_data=None, optimizer=None, elbo_fn=None, training_cycles=100, batch_size=32, **kwargs)

Compiles model’s trainer

Parameters:
  • train_data (Tuple[Union[Tensor, ndarray]]) – Train data and (optionally) corresponding targets or labels

  • test_data (Optional[Tuple[Union[Tensor, ndarray]]]) – Test data and (optionally) corresponding targets or labels

  • optimizer (Optional[Type[Optimizer]]) – Weights optimizer. Defaults to Adam with learning rate 1e-4

  • elbo_fn (Optional[Callable]) – function that calculates elbo loss

  • training_cycles (int) – Number of training iterations (aka “epochs”)

  • batch_size (int) – Size of mini-batch for training

  • **kwargs (Union[str, float]) – Additional keyword arguments are ‘filename’ (for saving model) and ‘memory_alloc’ (threshold for keeping data on GPU)

Return type:

None

decode(z_sample, y=None)

Takes a point in latent space and maps it to data space via the learned generative model

Parameters:
  • z_sample (Union[ndarray, Tensor]) – coordinates in latent space

  • y (Union[int, ndarray, Tensor, None]) – label (optional)

Return type:

ndarray

Returns:

Generated (“decoded”) image(s)

encode(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Mean and SD of the encoded continuous distribution, and alphas (“class probabilities”) for the encoded discrete distribution(s) (if any). For rVAE, the output is (z_mean, z_sd). For jVAE and jrVAE, the output is (z_mean, z_sd, alphas). In all cases, z_mean contains the encoded angle as its 1st dimension, the encoded x- and y-shifts as its 2nd and 3rd dimensions (if translation is set to True), and the standard VAE latent variables as its 4th, 5th, …, n-th dimensions (2nd, 3rd, …, n-th dimensions if translation is False)

encode_(x_new, **kwargs)

Encodes input image data using a trained VAE’s encoder

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **num_batches (int) – number of batches (Default: 10)

Return type:

Tuple[ndarray]

Returns:

Concatenated array of encoded vectors

encode_image_(img, **kwargs)

Crops and encodes a subimage around each pixel in the input image. The size of the subimage is determined by the size of the images in the VAE training data.

Parameters:
  • img (ndarray) – 2D numpy array

  • **num_batches (int) – number of batches for encoding subimages

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image and encoded array (cropping is due to finite window size)

encode_images(imgdata, **kwargs)

Encodes every pixel of every image in image stack

Parameters:
  • imgdata (ndarray) – 3D numpy array of images. Can also be a single 2D image

  • **num_batches (int) – number of batches for encoding pixels of a single image

Return type:

Tuple[ndarray, ndarray]

Returns:

Cropped original image stack and encoded array (cropping is due to finite window size)

encode_trajectories(imgdata, coord_class_dict, window_size, min_length, rmax, **kwargs)

Calculates trajectories and latent variable value for each point in a trajectory.

Parameters:
  • imgdata (ndarray) – NN output (preferable) or raw data

  • coord_class_dict (Dict[int, ndarray]) – atomic/defect/particle coordinates

  • window_size (int) – size of subimages to crop

  • min_length (int) – minimum length of trajectory to be included

  • rmax (int) – maximum allowed distance (projected on the xy plane) between a defect in one frame and the position of its nearest neighbor in the next one

  • **num_batches (int) – number of batches for self.encode (Default: 10)

Return type:

Tuple[List[ndarray], List[ndarray]]

Returns:

List of encoded trajectories and corresponding movie frame numbers

evaluate_model()

Evaluates model on test data

kld_normal(z, q_param, p_param=None)

Calculates KL divergence term between two normal distributions or (if p_param = None) between normal and standard normal distributions

Parameters:
  • z (Tensor) – latent vector (reparametrized)

  • q_param (Tuple[Tensor]) – tuple with mean and SD of the 1st distribution

  • p_param (Optional[Tuple[Tensor]]) – tuple with mean and SD of the 2nd distribution (optional)

Return type:

Tensor

load_weights(filepath)

Loads saved weights

Return type:

None

classmethod log_normal(x, mu, log_sd)

Computes log-pdf for a normal distribution

Return type:

Tensor

classmethod log_unit_normal(x)

Computes log-pdf of a unit normal distribution

Return type:

Tensor

manifold2d(**kwargs)

Performs mapping from latent space to data space, allowing the learned manifold to be visualized. This works only for a 2D latent space (not counting angle & translation dimensions)

Parameters:
  • **d (int) – grid size

  • **l1 (list) – range of 1st latent variable

  • **l2 (list) – range of 2nd latent variable

  • **label (int) – label in class-conditioned (r)VAE

  • **disc_idx (int) – discrete “class”

  • **cmap (str) – color map (Default: gnuplot)

  • **draw_grid (bool) – plot semi-transparent grid

  • **origin (str) – plot origin (e.g. ‘lower’)

Return type:

None

manifold_traversal(cont_idx, d=10, cont_idx_fixed=0, plot=True, **kwargs)

Latent space traversals for joint continuous and discrete latent representations

Return type:

ndarray

print_statistics(e)

Prints training and (optionally) test loss after each training cycle

reconstruct(x_new, **kwargs)

Forward prediction with uncertainty quantification by sampling from the encoded mean and std. Works only for regular VAE (and not for rVAE)

Parameters:
  • x_new (Union[ndarray, Tensor]) – image array to encode

  • **label (int) – class to be reconstructed (for cVAE, crVAE, jVAE, and jrVAE)

  • **num_samples (int) – number of samples to generate from normal distribution

Return type:

ndarray

Returns:

Ensemble of “decoded” images

classmethod reparameterize(z_mean, z_sd)

Reparameterization trick for continuous distributions

Return type:

Tensor

classmethod reparameterize_discrete(alpha, tau)

Reparameterization trick for discrete gumbel-softmax distributions

save_model(*args)

Saves trained weights and the key model parameters

Return type:

None

save_weights(*args)

Saves trained weights

Return type:

None

set_data(X_train, y_train=None, X_test=None, y_test=None, memory_alloc=4)

Initializes train and (optionally) test data loaders

Return type:

None

set_decoder(decoder_net)

Sets a decoder network only

Return type:

None

set_encoder(encoder_net)

Sets an encoder network only

Return type:

None

set_model(encoder_net, decoder_net)

Sets encoder and decoder models

Return type:

None

train_epoch()

Trains a single epoch

classmethod visualize_manifold_learning(frames_dir, **kwargs)

Creates and stores a video showing evolution of learned 2D manifold during rVAE’s training

Parameters:
  • frames_dir (str) – directory with snapshots of manifold as .png files (the files should be named as “1.png”, “2.png”, etc.)

  • **moviename (str) – name of the movie

  • **frame_duration (int) – duration of each movie frame

Return type:

None

Deep Kernel Learning

class atomai.models.dklGPR(indim, embedim=2, shared_embedding_space=True, **kwargs)[source]

Bases: dklGPTrainer

Deep kernel learning (DKL)-based Gaussian process regression (GPR)

Parameters:
  • indim (int) – input feature dimension

  • embedim (int) – embedding dimension (determines dimensionality of kernel space)

  • shared_embedding_space (bool) – use one embedding space for all target outputs

Keyword Arguments:
  • device – Sets device to which model and data will be moved. Defaults to ‘cuda:0’ if a GPU is available and to CPU otherwise.

  • precision – Sets tensor types for ‘single’ (torch.float32) or ‘double’ (torch.float64) precision

  • seed – Seed for enforcing reproducibility

Examples

Train a DKL-GPR model with high-dimensional inputs X and outputs y:

>>> data_dim = X.shape[-1]  # X dimensions are n_samples x d
>>> dklgp = aoi.models.dklGPR(data_dim, embedim=2, precision="double")
>>> dklgp.fit(X, y, training_cycles=100, lr=1e-2)

Make a prediction on new data (mean and variance for each ‘test’ point):

>>> mean, var = dklgp.predict(X_test, batch_size=len(X_test))

Alternatively, one can obtain a prediction as follows:

>>> samples = dklgp.sample_from_posterior(X_test, num_samples=1000)
>>> mean, var = samples.mean(0), samples.var(0)

fit(X, y, training_cycles=1, **kwargs)[source]

Initializes and trains a deep kernel GP model

Parameters:
  • X (Union[Tensor, ndarray]) – Input training data (aka features) of N x input_dim dimensions

  • y (Union[Tensor, ndarray]) – Output targets of batch_size x N or N (if batch_size=1) dimensions

  • training_cycles (int) – Number of training epochs

Keyword Arguments:
  • feature_extractor – (Optional) Custom neural network for feature extractor. Must take input/feature dims and embedding dims as its arguments.

  • freeze_weights – Freezes weights of feature extractor, that is, they are not passed to the optimizer. Used for transfer learning.

  • lr – learning rate (Default: 0.01)

  • print_loss – print loss at every n-th training cycle (epoch)

Return type:

None

fit_ensemble(X, y, training_cycles=1, n_models=5, **kwargs)[source]

Initializes and trains an ensemble of deep kernel GP models

Parameters:
  • X (Union[Tensor, ndarray]) – Input training data (aka features) of N x input_dim dimensions

  • y (Union[Tensor, ndarray]) – Output targets of batch_size x N or N (if batch_size=1) dimensions

  • training_cycles (int) – Number of training epochs

  • n_models (int) – Number of models in ensemble

Keyword Arguments:
  • feature_extractor – (Optional) Custom neural network for feature extractor. Must take input/feature dims and embedding dims as its arguments.

  • freeze_weights – Freezes weights of feature extractor, that is, they are not passed to the optimizer. Used for transfer learning.

  • lr – learning rate (Default: 0.01)

  • print_loss – print loss at every n-th training cycle (epoch)

Return type:

None
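
Example (a minimal sketch; X and y are the training features and targets described above):

>>> dklgp = aoi.models.dklGPR(X.shape[-1], embedim=2)
>>> dklgp.fit_ensemble(X, y, training_cycles=100, n_models=5, lr=1e-2)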

sample_from_posterior(X, num_samples=1000)[source]

Computes the posterior over model outputs at the provided points (X) and samples from it

Return type:

ndarray

thompson(X_cand, scalarize_func=None, maximize=True)[source]

Thompson sampling for selecting the next measurement point

Return type:

Tuple[ndarray, int]
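
Example (a sketch; X_cand is a hypothetical 2D array of candidate points, and the returned index is assumed to point into X_cand):

>>> obj, idx = dklgp.thompson(X_cand, maximize=True)
>>> next_point = X_cand[idx]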

predict(x_new, **kwargs)[source]

Prediction of mean and variance using the trained model

Return type:

Tuple[ndarray]

embed(x_new, **kwargs)[source]

Embeds the input data to a “latent” space using a trained feature extractor NN.

Return type:

Tensor
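
Example (a minimal sketch; X_test is assumed to have the same feature dimension the model was trained on, with the embedded output of shape (n_samples, embedim)):

>>> embedded = dklgp.embed(X_test)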

compile_multi_model_trainer(X, y, training_cycles=1, **kwargs)

Initializes deep kernel (feature extractor NNs + base kernels), sets optimizer and “loss” function. For vector-valued functions (multiple outputs), it assumes one latent space per output, that is, the number of neural networks is equal to the number of Gaussian processes. For example, if the outputs are spectra of length 128, one will have 128 neural networks and 128 GPs trained in parallel. It can also be used for training an ensemble of models for the same scalar output.

Return type:

None

compile_trainer(X, y, training_cycles=1, **kwargs)

Initializes deep kernel (feature extractor NN + base kernel), sets optimizer and “loss” function. For vector-valued functions (multiple outputs), it assumes a shared latent space, that is, a single neural network is connected to multiple Gaussian processes.

Parameters:
  • X (Union[Tensor, ndarray]) – Input training data (aka features) of N x input_dim dimensions

  • y (Union[Tensor, ndarray]) – Output targets of batch_size x N or N (if batch_size=1) dimensions

  • training_cycles (int) – Number of training epochs

Keyword Arguments:
  • feature_extractor – (Optional) Custom neural network for feature extractor. Must take input/feature dims and embedding dims as its arguments.

  • grid_size – Grid size for structured kernel interpolation (Default: 50)

  • freeze_weights – Freezes weights of feature extractor, that is, they are not passed to the optimizer. Used for transfer learning.

  • lr – learning rate (Default: 0.01)

Return type:

None

print_statistics(e)

run(X=None, y=None, training_cycles=1, **kwargs)

Initializes and trains a deep kernel GP model

Parameters:
  • X (Union[Tensor, ndarray, None]) – Input training data (aka features) of N x input_dim dimensions

  • y (Union[Tensor, ndarray, None]) – Output targets of batch_size x N or N (if batch_size=1) dimensions

  • training_cycles (int) – Number of training epochs

Keyword Arguments:
  • feature_extractor – (Optional) Custom neural network for feature extractor

  • freeze_weights – Freezes weights of feature extractor, that is, they are not passed to the optimizer. Used for transfer learning.

  • grid_size – Grid size for structured kernel interpolation (Default: 50)

  • lr – learning rate (Default: 0.01)

  • print_loss – print loss at every n-th training cycle (epoch)

Return type:

Type[ExactGP]

save_weights(filename)

Saves weights of the feature extractor.

Return type:

None

set_data(x, y=None, device=None)

Data preprocessing. Casts the data array to the selected tensor type and moves it to the selected device.

Return type:

Tuple[Tensor]

train_step()

Single training step with backpropagation to compute gradients and optimize weights.

Return type:

None

Load trained models

atomai.models.load_model(filepath)[source]

Loads trained AtomAI models

Parameters:

filepath (str) – filepath to meta-state dictionary with trained weights and information about the model’s structure

Return type:

Union[Segmentor, VAE, rVAE, jrVAE, jVAE, ImSpec]

Returns:

Model in evaluation state
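
Example (a minimal sketch; "model_metadict.tar" is a hypothetical path to a saved meta-state dictionary):

>>> model = aoi.models.load_model("model_metadict.tar")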

atomai.models.load_ensemble(filepath)[source]

Loads trained ensemble models

Parameters:

filepath (str) – filepath to dictionary with trained weights and key information about the model’s structure

Return type:

Tuple[Type[Module], Dict[int, Dict[str, Tensor]]]

Returns:

Single model with averaged weights and dictionary with weights of all models