Other utilities
Statistics
- class atomai.stat.imlocal(network_output, coord_class_dict_all, window_size=None, coord_class=0)[source]
Class for extraction and statistical analysis of local image descriptors. It assumes that input image data is an output of a neural network, but it can also work with regular experimental images (make sure you have extra dimensions for channel and batch size).
- Parameters
network_output (4D numpy array) – Output of a fully convolutional neural network where a class is assigned to every pixel in the input image(s). The dimensions are \(images \times height \times width \times channels\)
coord_class_dict_all (dict) – Prediction from atomnet.locator (can be from another source but must be in the same format). Each element is a \(N \times 3\) numpy array, where N is the number of detected atoms/defects, the first two columns are xy coordinates, and the third column is the class (starts with 0)
window_size (int) – Side of the square for subimage cropping
coord_class (int) – Class of atoms/defects around which the subimages will be cropped; in the atomnet.locator output the class is the 3rd column (the first two are xy positions)
Examples:
Identification of distortion domains in a single atomic image:
>>> # First obtain a "cleaned" image and atomic coordinates using a trained model
>>> nn_output, coordinates = model.predict(expdata)
>>> # Now get local image descriptors using atomai.stat.imlocal
>>> imstack = stat.imlocal(nn_output, coordinates, window_size=32, coord_class=1)
>>> # Compute PCA scree plot to estimate the number of components/sources
>>> imstack.pca_scree_plot(plot_results=True)
>>> # Do PCA analysis and plot results
>>> pca_results = imstack.imblock_pca(n_components=4, plot_results=True)
>>> # Do NMF analysis and plot results
>>> nmf_results = imstack.imblock_nmf(n_components=4, plot_results=True)
Analysis of atomic/defect trajectories from movies (3D image stack):
>>> # Get local descriptors (such as subimages centered around impurities)
>>> imstack = stat.imlocal(nn_output, coordinates, window_size=32, coord_class=1)
>>> # Calculate Gaussian mixture model (GMM) components
>>> components_img, classes_list = imstack.gmm(n_components=10, plot_results=True)
>>> # Calculate GMM components and transition probabilities for different trajectories
>>> traj_all, trans_all, fram_all = imstack.transition_matrix(n_components=10, rmax=10)
- extract_subimages_()[source]
Extracts subimages centered at certain atom class/type in the neural network output
- Return type
Tuple[ndarray]
- Returns
3-element tuple containing
stack of subimages
(x, y) coordinates of their centers
frame number associated with each subimage
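The cropping step described above can be pictured with a minimal NumPy sketch. It is not AtomAI's implementation: the helper name `crop_subimages`, the choice of treating the first coordinate column as the row index, and the rule of skipping windows that cross the image edge are all assumptions made for illustration.

```python
import numpy as np

def crop_subimages(imgdata, coord_class_dict, window_size, coord_class=0):
    """Hypothetical helper: crop square windows centered on atoms of one class.

    imgdata: 4D array (n_images, height, width, channels)
    coord_class_dict: {frame index: (N, 3) array with x, y, class}
    """
    half = window_size // 2
    subimages, centers, frames = [], [], []
    for frame, coords in coord_class_dict.items():
        image = imgdata[frame]
        # Keep only the atoms/defects of the requested class (3rd column)
        selected = coords[coords[:, 2] == coord_class]
        for x, y, _ in selected:
            # Assumption: first column indexes rows, second indexes columns
            r0 = int(round(x)) - half
            c0 = int(round(y)) - half
            r1, c1 = r0 + window_size, c0 + window_size
            # Skip windows that would cross the image border
            if r0 < 0 or c0 < 0 or r1 > image.shape[0] or c1 > image.shape[1]:
                continue
            subimages.append(image[r0:r1, c0:c1])
            centers.append((x, y))
            frames.append(frame)
    return np.array(subimages), np.array(centers), np.array(frames)

# Synthetic stand-in for a network output and the matching coordinates
rng = np.random.default_rng(0)
imgdata = rng.random((2, 64, 64, 1))
coords = {
    0: np.array([[32.0, 32.0, 0], [2.0, 2.0, 0], [40.0, 20.0, 1]]),
    1: np.array([[10.0, 50.0, 0]]),
}
stack, centers, frames = crop_subimages(imgdata, coords, window_size=16)
```

In this toy input the atom at (2, 2) is too close to the edge and the atom of class 1 is filtered out, so two subimages remain, one per frame.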
- gmm(n_components, covariance='diag', random_state=1, plot_results=False)[source]
Applies Gaussian mixture model to image stack.
- Parameters
n_components (int) – Number of components
covariance (str) – Type of covariance (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance
plot_results (bool) – Plotting gmm components
- Return type
Tuple[ndarray, List]
- Returns
3-element tuple containing
4D numpy array with GMM “centroids” (averaged images for each class)
List where each element contains 4D images belonging to each GMM class
2D numpy array with xy coordinates, label and corresponding frame number for each subimage
- pca(n_components, random_state=1, plot_results=False)[source]
Computes PCA eigenvectors for a stack of subimages.
- Parameters
n_components (int) – Number of PCA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed eigenvectors
- Return type
Tuple[ndarray]
- Returns
3-element tuple containing
4D numpy array with computed and reshaped principal axes
2D numpy array with projection of X_vec (vector with flattened subimages) on the first principal components
2D numpy array with center-of-mass coordinates and corresponding frame number for each subimage
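To make the shapes of the first two returned arrays concrete, here is a hedged NumPy sketch of PCA on a stack of subimages. It uses a plain SVD as a stand-in for whatever decomposition backend the library actually calls; the function name `pca_subimages` is hypothetical. The explained variance ratio it also returns is the quantity a scree plot displays.

```python
import numpy as np

def pca_subimages(imstack, n_components):
    """Hypothetical helper: PCA of a (n, h, w, c) stack of subimages via SVD."""
    n, h, w, c = imstack.shape
    # X_vec: one flattened subimage per row
    X_vec = imstack.reshape(n, h * w * c)
    X_centered = X_vec - X_vec.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Principal axes reshaped back into image form (4D array)
    components = Vt[:n_components].reshape(n_components, h, w, c)
    # Projection of X_vec on the first principal components (2D array)
    X_vec_t = X_centered @ Vt[:n_components].T
    explained_variance_ratio = (S ** 2) / np.sum(S ** 2)
    return components, X_vec_t, explained_variance_ratio[:n_components]

rng = np.random.default_rng(0)
imstack = rng.random((50, 8, 8, 1))
components, X_vec_t, evr = pca_subimages(imstack, n_components=4)
```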
- ica(n_components, random_state=1, plot_results=False)[source]
Computes ICA independent sources for a stack of subimages.
- Parameters
n_components (int) – Number of ICA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed sources
- Return type
Tuple[ndarray]
- Returns
3-element tuple containing
4D numpy array with computed and reshaped independent sources
2D numpy array with recovered sources from X_vec (vector with flattened subimages)
2D numpy array with center-of-mass coordinates and corresponding frame number for each subimage
- nmf(n_components, random_state=1, plot_results=False, **kwargs)[source]
Applies NMF to source separation from a stack of subimages
- Parameters
n_components (int) – Number of NMF components
random_state (int) – Random state instance
plot_results (bool) – Plots computed sources
**max_iterations (int) – Maximum number of iterations before timing out
- Return type
Tuple[ndarray]
- Returns
3-element tuple containing
4D numpy array with computed and reshaped sources
2D numpy array with transformed data according to the trained NMF model
2D numpy array with center-of-mass coordinates and corresponding frame number for each subimage
- pca_gmm(n_components_gmm, n_components_pca, plot_results=False, covariance_type='diag', random_state=1)[source]
Performs PCA decomposition on GMM-unmixed classes. Can be used when GMM allows separating different symmetries (e.g. different sublattices in graphene)
- Parameters
n_components_gmm (int) – Number of components for GMM
n_components_pca (int or list of int) – Number of PCA components. Pass a list of integers to use a different number of PCA components for each GMM class
covariance_type (str) – Type of covariance (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance
plot_results (bool) – Plotting GMM components
- Return type
Tuple[ndarray, List]
- Returns
4-element tuple containing
4D numpy array with GMM “centroids” (averaged images for each GMM class)
List of 4D numpy arrays with PCA components
List with PCA-transformed data
2D numpy array with xy coordinates, GMM-assigned labels, and corresponding frame numbers
- pca_scree_plot(plot_results=True)[source]
Computes and plots PCA ‘scree plot’ (explained variance ratio vs number of components)
- Return type
ndarray
- pca_gmm_scree_plot(n_components_gmm, covariance_type='diag', random_state=1, plot_results=True)[source]
Computes PCA scree plot for each GMM class
- Parameters
n_components_gmm (int) – Number of components for GMM
covariance_type (str) – Type of covariance (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance
plot_results (bool) – Plotting GMM components and PCA scree plot
- Return type
List[ndarray]
- Returns
List with PCA explained variances for each GMM component
- imblock_pca(n_components, random_state=1, plot_results=False, **kwargs)[source]
Computes PCA eigenvectors and their loading maps for a stack of subimages. Intended to be used for finding domains (“blocks”) (e.g. ferroic domains) in a single image.
- Parameters
n_components (int) – Number of PCA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed eigenvectors and loading maps
**marker_size (int) – Controls marker size for loading maps plot
- Return type
Tuple[ndarray]
- Returns
3-element tuple containing
4D numpy array with computed (and reshaped) principal axes
2D numpy array with projection of X_vec (vector with flattened subimages) on the first principal components
2D numpy array with coordinates of each subimage
- imblock_ica(n_components, random_state=1, plot_results=False, **kwargs)[source]
Computes ICA independent sources and their loading maps for a stack of subimages. Intended to be used for finding domains (“blocks”) (e.g. ferroic domains) in a single image.
- Parameters
n_components (int) – Number of ICA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed eigenvectors and loading maps
**marker_size (int) – controls marker size for loading maps plot
- Return type
Tuple[ndarray]
- Returns
3-element tuple containing
4D numpy array with computed (and reshaped) independent sources
2D numpy array with recovered sources from X_vec (vector with flattened subimages)
2D numpy array with coordinates of each subimage
- imblock_nmf(n_components, random_state=1, plot_results=False, **kwargs)[source]
Applies NMF to source separation. Computes sources and their loading maps for a stack of subimages. Intended to be used for finding domains (“blocks”) (e.g. ferroic domains) in a single image.
- Parameters
n_components (int) – Number of NMF components
random_state (int) – Random state instance
plot_results (bool) – Plots computed eigenvectors and loading maps
**max_iterations (int) – Maximum number of iterations before timing out
**marker_size (int) – Controls marker’s size for loading maps plots
- Return type
Tuple[ndarray]
- Returns
3-element tuple containing
4D numpy array with computed (and reshaped) sources
2D numpy array with transformed X_vec (vector with flattened subimages) according to the trained NMF model
2D numpy array with coordinates of each subimage
- classmethod plot_decomposition_results(components, X_vec_t, image_hw=None, xy_centers=None, plot_loading_maps=True, **kwargs)[source]
Plots decomposition “eigenvectors”. Plots loading maps
- Parameters
components (4D numpy array) – Computed (and reshaped) principal axes / independent sources / factorization matrix for stack of subimages
X_vec_t (2D numpy array) – Projection of X_vec on the first principal components / Recovered sources from X_vec / transformed X_vec according to the learned NMF model (is used to create “loading maps”)
image_hw (tuple) – Height and width of the “mother image”
xy_centers (n x 2 numpy array) – (x, y) coordinates of the extracted subimages
plot_loading_maps (bool) – Plots loading maps for each “eigenvector”
**marker_size (int) – Controls marker’s size for loading maps plots
- Return type
None
- classmethod get_trajectory(coord_class_dict, start_coord, rmax)[source]
Extracts a trajectory of a single defect/atom from image stack
- Parameters
coord_class_dict (dict) – Dictionary of atomic coordinates (same format as produced by atomnet.locator)
start_coord (N x 2 numpy array) – Coordinate of defect/atom in the first frame whose trajectory we are going to track
rmax (int) – Max allowed distance (projected on xy plane) between defect in one frame and the position of its nearest neighbor in the next one
- Return type
Tuple[ndarray]
- Returns
2-element tuple containing
Numpy array of defect/atom coordinates from a single trajectory
Frames corresponding to this trajectory
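The role of `rmax` can be illustrated with a small frame-to-frame tracker. This is a sketch under stated assumptions, not the library's code: the name `track_nearest` is hypothetical, the starting coordinate is taken as a single (x, y) pair, and the trajectory is assumed to end as soon as the nearest candidate in the next frame lies farther than `rmax`.

```python
import numpy as np

def track_nearest(coord_class_dict, start_coord, rmax):
    """Hypothetical tracker: follow one atom/defect through consecutive frames."""
    current = np.asarray(start_coord, dtype=float)
    trajectory, frames = [], []
    for frame in sorted(coord_class_dict):
        coords = coord_class_dict[frame]
        if len(coords) == 0:
            break
        # Distance (in the xy plane) from the last known position
        distances = np.linalg.norm(coords[:, :2] - current, axis=1)
        nearest = np.argmin(distances)
        # Assumption: the trajectory ends when the nearest neighbor is beyond rmax
        if distances[nearest] > rmax:
            break
        trajectory.append(coords[nearest])
        frames.append(frame)
        current = coords[nearest, :2]
    return np.array(trajectory), np.array(frames)

coords = {
    0: np.array([[10.0, 10.0, 0], [40.0, 40.0, 0]]),
    1: np.array([[12.0, 11.0, 0], [41.0, 39.0, 1]]),
    2: np.array([[30.0, 30.0, 0]]),  # jump larger than rmax ends the trajectory
}
traj, frames = track_nearest(coords, start_coord=[10.0, 10.0], rmax=5)
```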
- get_all_trajectories(min_length=0, run_gmm=False, rmax=10, **kwargs)[source]
Extracts trajectories for the detected defects starting from the first frame. Applies (optionally) Gaussian mixture model to a stack of local descriptors (subimages).
- Parameters
min_length (int) – Minimal length of trajectory to return
run_gmm (bool) – Optional GMM separation into different classes
rmax (int) – Max allowed distance (projected on xy plane) between defect in one frame and the position of its nearest neighbor in the next one
**n_components (int) – Number of components for Gaussian mixture model
**covariance (str) – Type of covariance for Gaussian mixture model (‘full’, ‘diag’, ‘tied’, ‘spherical’)
**random_state (int) – Random state instance for Gaussian mixture model
- Return type
Dict
- Returns
Python dictionary containing
list of numpy arrays with defects/atoms trajectories (“trajectories”)
list of frames corresponding to the extracted trajectories (“frames”)
GMM components when run_gmm=True (“gmm_components”)
- classmethod renumerate_classes(classes)[source]
Helper function for renumbering Gaussian mixture model classes for Markov transition analysis
- Return type
ndarray
- transition_matrix(n_components, covariance='diag', random_state=1, rmax=10, min_length=0, sum_all_transitions=False)[source]
Applies Gaussian mixture model to a stack of local descriptors (subimages). Extracts trajectories for the detected defects starting from the first frame. Calculates transition probability for each trajectory.
- Parameters
n_components (int) – Number of components for Gaussian mixture model
covariance (str) – Type of covariance for Gaussian mixture model (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance for Gaussian mixture model
rmax (int) – Max allowed distance (projected on xy plane) between defect in one frame and the position of its nearest neighbor in the next one
min_length (int) – Minimal length of trajectory to return
- Return type
Dict
- Returns
Python dictionary containing
List of defects/atoms trajectories (“trajectories”)
List of transition matrices for each trajectory (“transitions”)
List of frames corresponding to the extracted trajectories (“frames”)
GMM components as images (“gmm_components”)
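A transition matrix for one trajectory can be sketched as row-normalized counts of transitions between the class labels of consecutive frames. The snippet below is an illustration only: the function name is hypothetical, the labels are assumed to be contiguous integers starting at 0, and whether the library normalizes rows in exactly this way is not stated in this reference.

```python
import numpy as np

def transition_probabilities(labels, n_states):
    """Row-normalized counts of transitions between consecutive labels."""
    counts = np.zeros((n_states, n_states))
    for a, b in zip(labels[:-1], labels[1:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that never occur as a source stay all-zero
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)

# GMM class assigned to one defect in seven consecutive frames (toy data)
labels = np.array([0, 0, 1, 0, 1, 1, 2])
matrix = transition_probabilities(labels, n_states=3)
```

Entry `matrix[i, j]` is then the fraction of frames in state `i` that were followed by state `j`.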
- atomai.stat.update_classes(coordinates, nn_input, method='threshold', **kwargs)[source]
Updates atomic/defect classes based on the calculated intensities at each predicted position or local neighborhood analysis based on subimages cropped around each predicted position
- Parameters
coordinates (dict) – Output of atomnet.predictor. It is also possible to pass a single dictionary value associated with a specific image in a stack. In this case, the same image needs to be passed as ‘nn_input’.
nn_input (numpy array) – Image(s) served as an input to neural network
method (str) – Method for intensity-based update of atomic classes (‘threshold’, ‘kmeans’, ‘gmm_local’)
**thresh (float or int) – Intensity threshold value. Values above/below are set to 1/0
**n_components (int) – Number of components for k-means clustering
- Return type
Dict[int, ndarray]
- Returns
Updated coordinates
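The ‘threshold’ method can be pictured as follows: read the image intensity at each predicted position and set the class to 1 or 0 depending on which side of `thresh` it falls. This is a minimal sketch, assuming a single image, an N x 3 coordinate array, and that the first coordinate column indexes rows; the helper name is hypothetical.

```python
import numpy as np

def update_classes_threshold(coords, image, thresh):
    """Hypothetical helper: class 1 if intensity at (x, y) exceeds thresh, else 0."""
    updated = coords.copy()
    # Assumption: first column indexes rows, second indexes columns
    rows = np.round(coords[:, 0]).astype(int)
    cols = np.round(coords[:, 1]).astype(int)
    intensities = image[rows, cols]
    updated[:, 2] = (intensities > thresh).astype(coords.dtype)
    return updated

image = np.zeros((32, 32))
image[5, 5] = 0.9    # bright atom
image[20, 10] = 0.2  # dim atom
coords = np.array([[5.0, 5.0, 0.0], [20.0, 10.0, 0.0]])
updated = update_classes_threshold(coords, image, thresh=0.5)
```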
Image transforms
- class atomai.transforms.datatransform(n_channels=None, dim_order_in='channel_last', dim_order_out='channel_first', squeeze_channels=False, seed=None, **kwargs)[source]
Applies a sequence of pre-defined operations for data augmentation.
- Parameters
n_channels (int) – Number of classes (channels) in the ground truth
dim_order_in (str) – Channel first or channel last ordering in the input masks
dim_order_out (str) – Channel first or channel last ordering in the output masks
seed (int) – Seed for the random number generator (for determinism)
**custom_transform (Callable) – Python function that takes two ndarrays (images and masks) as input, applies a set of transformation to them, and returns the two transformed arrays
**rotation (bool) – Rotating image by +- 90 deg (if image is square) and horizontal/vertical flipping.
**zoom (bool or int) – Zooming-in by a specified zoom factor (Default: 2). Note that a zoom window is always square
**gauss_noise (bool or list or tuple) – Gaussian noise. You can pass min and max values as a list/tuple (Default [min, max] range: [0, 50])
**poisson_noise (bool or list or tuple) – Poisson noise. You can pass min and max values as a list/tuple (Default [min, max] range: [30, 40])
**salt_and_pepper (bool or list or tuple) – Salt and pepper noise. You can pass min and max values as a list/tuple (Default [min, max] range: [0, 50])
**blur (bool or list or tuple) – Gaussian blurring. You can pass min and max values as a list/tuple (Default [min, max] range: [1, 50])
**contrast (bool or list or tuple) – Contrast level. You can pass min and max values as a list/tuple (Default [min, max] range: [5, 20])
**background (bool) – Adds/subtracts asymmetric 2D Gaussian of random width and intensity from the image
**resize (tuple) – Values for image resizing [downscale factor (default: 2), upscale factor (default:1.5)]
- apply_gauss(X_batch, y_batch)[source]
Random application of Gaussian noise to each training image in a stack
- Return type
Tuple[ndarray]
- apply_jitter(X_batch, y_batch)[source]
Random application of jitter noise to each training image in a stack
- Return type
Tuple[ndarray]
- apply_poisson(X_batch, y_batch)[source]
Random application of Poisson noise to each training image in a stack
- Return type
Tuple[ndarray]
- apply_sp(X_batch, y_batch)[source]
Random application of salt & pepper noise to each training image in a stack
- Return type
Tuple[ndarray]
- apply_blur(X_batch, y_batch)[source]
Random blurring of each training image in a stack
- Return type
Tuple[ndarray]
- apply_contrast(X_batch, y_batch)[source]
Randomly changes the level of contrast of each training image in a stack
- Return type
Tuple[ndarray]
- apply_zoom(X_batch, y_batch)[source]
Zoom-in achieved by cropping image and then resizing to the original size. The zooming window is a square.
- Return type
Tuple[ndarray]
- apply_background(X_batch, y_batch)[source]
Emulates thickness variation in STEM or height variation in STM
- Return type
Tuple[ndarray]
- apply_rotation(X_batch, y_batch)[source]
Flips and rotates training images and corresponding ground truth images
- Return type
Tuple[ndarray]
- apply_imresize(X_batch, y_batch)[source]
Resizes training images and corresponding ground truth images
- Return type
Tuple[ndarray]
- run(images, targets)[source]
Applies a sequence of augmentation procedures to images and (except for noise) targets. Starts with user defined custom_transform if available. Then proceeds with rotation->zoom->resize->gauss->jitter->poisson->sp->blur->contrast->background. The operations that are not specified in kwargs are skipped.
- Return type
Tuple[ndarray]
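As described in the parameter list, a user-defined `custom_transform` takes two ndarrays (images and masks) and returns the two transformed arrays. The sketch below shows one possible shape for such a function; the array layouts noted in the comments are assumptions, and the flip is just an example of an operation that must be applied identically to both arrays.

```python
import numpy as np

def custom_transform(images, masks):
    """Flip every image and its mask left-right so the pair stays aligned."""
    # Assumption: images are (n, height, width) and masks are channel-last,
    # i.e. (n, height, width, channels); axis 2 is the width in both cases
    images_t = images[:, :, ::-1].copy()
    masks_t = masks[:, :, ::-1].copy()
    return images_t, masks_t

# Toy data: the mask marks pixels brighter than 10
images = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
masks = (images[..., None] > 10).astype(float)
images_t, masks_t = custom_transform(images, masks)
```

A function like this would be supplied through the `custom_transform` keyword argument, and `run` applies it before the built-in augmentations.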
Training data preparation
- atomai.utils.create_lattice_mask(lattice, xy_atoms, *args, **kwargs)[source]
Given experimental image and xy atomic coordinates creates ground truth image. Currently works only for the case where all atoms are one class. Notice that it will round fractional pixels.
- Parameters
lattice (2D numpy array) – Experimental image as 2D numpy array
xy_atoms (N x 2 numpy array) – Position of atoms in the experimental data
*arg (python function) –
Function that creates a 2D numpy array with atom and corresponding mask for each atomic coordinate. It must have two parameters, ‘scale’ and ‘rmask’ that control sizes of simulated atom and corresponding mask
Example:
>>> # Note: 'r' and 'thresh' are not defined in this snippet and must be set by the user
>>> def create_atomic_mask(scale=7, rmask=5):
...     atom = MakeAtom(r).atom2dgaussian()
...     _, mask = cv2.threshold(atom, thresh, 1, cv2.THRESH_BINARY)
...     return atom, mask
**scale (int) – Controls the atom size (width of 2D Gaussian)
**rmask (int) – Controls the atomic mask size
- Return type
ndarray
- Returns
2D numpy array with ground truth data
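The idea of turning a list of atomic positions into a ground truth image can be sketched with plain NumPy. This is not the library's routine (which, per the description above, places a simulated atom and its mask at each coordinate); here a simple disk of radius `rmask` is drawn at every rounded position, and the helper name `disk_mask` is hypothetical.

```python
import numpy as np

def disk_mask(lattice, xy_atoms, rmask=5):
    """Hypothetical helper: binary mask with a disk of radius rmask at each atom."""
    height, width = lattice.shape
    rows, cols = np.ogrid[:height, :width]
    mask = np.zeros((height, width))
    for x, y in xy_atoms:
        # Fractional positions are rounded to the nearest pixel
        r, c = int(round(x)), int(round(y))
        mask[(rows - r) ** 2 + (cols - c) ** 2 <= rmask ** 2] = 1
    return mask

lattice = np.zeros((32, 32))
xy_atoms = np.array([[8.4, 8.6], [20.0, 24.0]])
mask = disk_mask(lattice, xy_atoms, rmask=2)
```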
- atomai.utils.create_multiclass_lattice_mask(imgdata, coord_class_dict, *args, **kwargs)[source]
Given a stack of experimental images and dictionary with atomic coordinates and classes creates a ground truth image. Notice that it will round fractional pixels.
- Parameters
imgdata (3D numpy array) – Stack of experimental images
coord_class_dict (dict or N x 3 numpy array) – Dictionary with arrays containing coordinates and classes for each atom/defect. In each array, the first two columns are the positions of atoms. The third column is the “intensity”/class of each atom. It is also possible to pass a single N x 3 ndarray, which will be wrapped into a dictionary automatically.
*arg (python function) – Function that creates two 2D numpy arrays with atom and corresponding mask for each atomic coordinate. It must have three parameters, ‘scale’, ‘rmask’, and ‘intensity’ that control size and intensity of simulated atom and corresponding atomic mask
**scale (int) – Controls the atom size (width of 2D Gaussian)
**rmask (int) – Controls the atomic mask size
- Return type
Union[List[ndarray], ndarray]
- Returns
4D numpy array with ground truth data or list of 3D numpy arrays
- atomai.utils.extract_patches(images, masks, patch_size, num_patches, **kwargs)[source]
Takes batch of images and batch of corresponding masks as an input and for each image-mask pair it extracts stack of subimages (patches) of the selected size.
- Return type
Tuple[ndarray]
- atomai.utils.extract_random_subimages(imgdata, window_size, num_images, coordinates=None, **kwargs)[source]
Randomly extracts subimages centered at a certain atom class/type (usually from a neural network output) or just at random pixels (if coordinates are not known/available)
- Parameters
imgdata (numpy array) – 4D stack of images (n, height, width, channel)
window_size (int) – Side of the square for subimage cropping
num_images (int) – number of images to extract from each “frame” in the stack
coordinates (dict) – Optional. Prediction from atomnet.locator (can be from another source but must be in the same format). Each element is a \(N \times 3\) numpy array, where N is the number of detected atoms/defects, the first two columns are xy coordinates, and the third column is the class (starts with 0)
**coord_class (int) – Class of atoms/defects around which the subimages will be cropped (3rd column in the atomnet.locator output)
- Return type
Tuple[ndarray]
- Returns
3-element tuple containing
stack of subimages
(x, y) coordinates of their centers
frame number associated with each subimage
- atomai.utils.extract_subimages(imgdata, coordinates, window_size, coord_class=0)[source]
Extracts subimages centered at certain atom class/type (usually from a neural network output)
- Parameters
imgdata (numpy array) – 4D stack of images (n, height, width, channel). It is also possible to pass a single 2D image.
coordinates (dict or N x 2 numpy array) – Prediction from atomnet.locator (can be from another source but must be in the same format). Each element is a \(N \times 3\) numpy array, where N is the number of detected atoms/defects, the first two columns are xy coordinates, and the third column is the class (starts with 0). It is also possible to pass an N x 2 numpy array if the corresponding imgdata is a single 2D image.
window_size (int) – Side of the square for subimage cropping
coord_class (int) – Class of atoms/defects around which the subimages will be cropped (3rd column in the atomnet.locator output)
- Return type
Tuple[ndarray]
- Returns
3-element tuple containing
stack of subimages,
(x, y) coordinates of their centers,
frame number associated with each subimage
- class atomai.utils.MakeAtom(sc=5, r_mask=3, intensity=1, theta=0, offset=0)[source]
Creates an image of atom modelled as 2D Gaussian and a corresponding mask
- atomai.utils.FFTmask(imgsrc, maskratio=10)[source]
Takes a square real-space image and filters out a disk with radius equal to 1/maskratio * image size. Returns the FFT of the image and the filtered FFT
- Return type
Tuple[ndarray]
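The masking operation can be sketched with `numpy.fft`. The snippet assumes that "filtering out a disk" means zeroing a disk of radius image size / maskratio at the center of the shifted spectrum (i.e. removing the lowest spatial frequencies); that interpretation and the function name `fft_disk_mask` are assumptions, not a statement of how the library implements it.

```python
import numpy as np

def fft_disk_mask(image, maskratio=10):
    """Zero a central disk (low frequencies) in the shifted 2D FFT of a square image."""
    size = image.shape[0]
    # Shift so that the zero-frequency component sits at the center
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    radius = size / maskratio
    rows, cols = np.ogrid[:size, :size]
    center = size // 2
    disk = (rows - center) ** 2 + (cols - center) ** 2 <= radius ** 2
    filtered = spectrum.copy()
    filtered[disk] = 0
    return spectrum, filtered

rng = np.random.default_rng(0)
image = rng.random((64, 64))
spectrum, filtered = fft_disk_mask(image, maskratio=10)
```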
Image pre/post processing
- atomai.utils.torch_format(image_data)[source]
Reshapes (if needed), normalizes and converts image data to pytorch format for model training and prediction
- Parameters
image_data (3D or 4D numpy array) – Image stack with dimensions (n_batches x height x width) or (n_batches x 1 x height x width)
- Return type
Tensor
- atomai.utils.img_resize(image_data, rs, round_=False)[source]
Resizes a stack of images
- Parameters
image_data (3D numpy array) – Image stack with dimensions (n_batches x height x width)
rs (tuple) – Target height and width
round_ (bool) – rounding (in case of labeled pixels)
- Return type
ndarray
- Returns
Resized stack of images
- atomai.utils.img_pad(image_data, pooling)[source]
Pads the image if its size (w, h) is not divisible by \(2^n\), where n is a number of pooling layers in a network
- Parameters
image_data (3D numpy array) – Image stack with dimensions (n_batches x height x width)
pooling (int) – Downsampling factor (equal to \(2^n\), where n is a number of pooling operations)
- Return type
ndarray
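The padding rule can be illustrated with `numpy.pad`: each spatial dimension is extended up to the next multiple of the downsampling factor. This is a sketch only; the helper name is hypothetical, and zero-padding at the bottom and right edges is an assumption about where and how the padding is applied.

```python
import numpy as np

def pad_to_multiple(image_data, pooling):
    """Zero-pad height and width of a (n, h, w) stack up to a multiple of `pooling`."""
    _, height, width = image_data.shape
    # Number of extra pixels needed to reach the next multiple (0 if already divisible)
    pad_h = (-height) % pooling
    pad_w = (-width) % pooling
    # Assumption: padding is added at the bottom and right edges only
    return np.pad(image_data, ((0, 0), (0, pad_h), (0, pad_w)), mode="constant")

stack = np.ones((3, 30, 45))
padded = pad_to_multiple(stack, pooling=8)      # 3 pooling layers -> 2**3 = 8
unchanged = pad_to_multiple(np.ones((1, 16, 16)), pooling=8)
```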
- atomai.utils.crop_borders(imgdata, thresh=0)[source]
Crops image border where all values are zeros
- Parameters
imgdata (numpy array) – 3D numpy array (h, w, c)
thresh (float) – border values to crop
- Return type
ndarray
- Returns
Cropped array
- atomai.utils.filter_cells(imgdata, im_thresh=0.5, blob_thresh=50, filter_='below')[source]
Filters blobs above/below certain size for each image in the stack. The ‘imgdata’ must have dimensions (n x h x w).
- Parameters
imgdata (3D numpy array) – stack of images (without channel dimension)
im_thresh (float) – value at which each image in the stack will be thresholded
blob_thresh (int) – maximum/minimum blob size for thresholding
filter_ (string) – Select ‘above’ or ‘below’ to remove larger or smaller blobs, respectively
- Return type
ndarray
- Returns
Image stack with the same dimensions as the input data
- atomai.utils.get_blob_params(nn_output, im_thresh, blob_thresh, filter_='below')[source]
Extracts position and angle of particles in each movie frame
- Parameters
nn_output (4D numpy array) – output of neural network returned by atomnet.predictor
im_thresh (float) – value at which each image in the stack will be thresholded
blob_thresh (int) – maximum/minimum blob size for thresholding
filter_ (string) – Select ‘above’ or ‘below’ to remove larger or smaller blobs, respectively
- Return type
Dict
- Returns
Nested dictionary where for each frame there is an ordered dictionary with values of centers of the mass and angle for each detected particle in that frame.
- atomai.utils.cv_thresh(imgdata, threshold=0.5)[source]
Wrapper for opencv binary threshold method. Returns thresholded image.
- atomai.utils.cv_resize(img, rs, round_=False)[source]
Wrapper for open-cv resize function
- Parameters
img (2D numpy array) – input 2D image
rs (tuple) – target height and width
round_ (bool) – rounding (in case of labeled pixels)
- Return type
ndarray
- Returns
Resized image
- atomai.utils.cv_resize_stack(imgdata, rs, round_=False)[source]
Resizes a 3D stack of images
- Parameters
imgdata (3D numpy array) – 3D stack of images to be resized
rs (tuple or int) – target height and width
round_ (bool) – rounding (in case of labeled pixels)
- Return type
ndarray
- Returns
Resized stack of images
Atomic Coordinates
- atomai.utils.map_bonds(coordinates, nn=2, upper_bound=None, distance_ideal=None, plot_results=True, **kwargs)[source]
Generates plots with lattice bonds (color-coded according to the variation in their length)
- Parameters
coordinates (dict) – Dictionary where keys are frame numbers and values are \(N \times 3\) numpy arrays with atomic coordinates. In each array the first two columns are xy coordinates and the third column is atom class.
nn (int) – Number of nearest neighbors to search for.
upper_bound (float or int, non-negative) – Upper distance bound (in px) for querying the k-d tree for nearest neighbors. Only distances below this value will be counted.
distance_ideal (float) – Bond distance in ideal lattice. Defaults to average distance in the frame
plot_results (bool) – Plot bond maps
**savedir (str) – directory to save plots
**h (int) – image height
**w (int) – image width
- Return type
ndarray
- Returns
Array of distances to nearest neighbors for each atom
- atomai.utils.find_com(image_data)[source]
Find atoms via center of mass methods
- Parameters
image_data (2D numpy array) – 2D image (usually an output of neural network)
- Return type
ndarray
- atomai.utils.get_nn_distances(coordinates, nn=2, upper_bound=None)[source]
Calculates nearest-neighbor distances for a stack of images
- Parameters
coordinates (Union[Dict[int, ndarray], ndarray]) – Dictionary where keys are frame numbers and values are \(N \times 3\) numpy arrays with atomic coordinates. In each array the first two columns are xy coordinates and the third column is atom class. One can also pass a single numpy array (if all the coordinates correspond to a single image)
nn (int) – Number of nearest neighbors to search for.
upper_bound (float or int, non-negative) – Upper distance bound for querying the k-d tree for nearest neighbors. Only distances below this value will be counted.
- Return type
Tuple[List[ndarray]]
- Returns
Tuple with a list of \(atoms \times nn\) arrays of distances to nearest neighbors and a list of \(atoms \times (nn+1) \times 3\) arrays of coordinates (including coordinates of the “center” atom), where \(atoms\) is less than or equal to the total number of atoms in ‘coordinates’ (due to the ‘upper_bound’ criterion)
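For a single frame, the distance array described above can be reproduced with a brute-force search. Note the substitution: this sketch computes all pairwise distances with NumPy instead of querying a k-d tree, which is only practical for small N. It also assumes distinct coordinates, more than `nn` atoms, and that atoms with any neighbor beyond `upper_bound` are dropped; the function name is hypothetical.

```python
import numpy as np

def nn_distances(coordinates, nn=2, upper_bound=None):
    """Brute-force nearest-neighbor distances for an (N, 3) coordinate array."""
    xy = coordinates[:, :2]
    # Pairwise Euclidean distances, shape (N, N)
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Column 0 of the sorted order is the atom itself (distance 0), so skip it
    order = np.argsort(dist, axis=1)[:, 1:nn + 1]
    d_nn = np.take_along_axis(dist, order, axis=1)
    if upper_bound is not None:
        # Assumption: keep only atoms whose nn neighbors are all within the bound
        keep = (d_nn <= upper_bound).all(axis=1)
        d_nn = d_nn[keep]
    return d_nn

# 3 x 3 square lattice with unit spacing, all atoms of class 0
grid = np.array([[i, j, 0] for i in range(3) for j in range(3)], dtype=float)
d = nn_distances(grid, nn=2)

# Adding an isolated atom: it is removed by the upper_bound criterion
with_outlier = np.vstack([grid, [[10.0, 10.0, 0.0]]])
d_bounded = nn_distances(with_outlier, nn=2, upper_bound=1.5)
```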
- atomai.utils.get_nn_distances_(coordinates, nn=2, upper_bound=None)[source]
Calculates nearest-neighbor distances for a single image
- Parameters
coordinates (numpy array) – \(N \times 3\) array with atomic coordinates where first two columns are xy coordinates and the third column is atom class
nn (int) – Number of nearest neighbors to search for.
upper_bound (float or int, non-negative) – Upper distance bound for querying the k-d tree for nearest neighbors. Only distances below this value will be counted.
- Return type
Tuple[ndarray]
- Returns
Tuple with \(atoms \times nn\) array of distances to nearest neighbors and \(atoms \times (nn+1) \times 3\) array of coordinates (including coordinates of the “center” atom), where \(atoms\) is less than or equal to the total number of atoms in ‘coordinates’ (due to the ‘upper_bound’ criterion)
- atomai.utils.peak_refinement(imgdata, coordinates, d=None)[source]
Performs a refinement of atomic positions by fitting a 2D Gaussian, where the neural network predictions serve as the initial guess.
- Parameters
imgdata (2D numpy array) – Single experimental image/frame
coordinates (N x 3 numpy array) – Atomic coordinates where first two columns are xy coordinates and the third column is atom class
d (int) – Half-side of a square around the identified atom for peak fitting. If d is not specified, it is set to 1/4 of the average nearest-neighbor distance in the lattice.
- Return type
ndarray
- Returns
Refined array of coordinates
- atomai.utils.compare_coordinates(coordinates1, coordinates2, d_max, plot_results=False, **kwargs)[source]
Finds difference between predicted (‘coordinates1’) and “true” (‘coordinates2’) coordinates using the scipy.spatial.cKDTree method. Use ‘d_max’ to set maximum search radius. For plotting, pass figure size and experimental image using keyword arguments ‘fsize’ and ‘expdata’.
- Return type
Tuple[ndarray]
Visualization
- atomai.utils.plot_trajectories(traj, frames, **kwargs)[source]
Plots individual trajectory (as position (radius) vector)
- Parameters
traj (n x 3 ndarray) – numpy array where the first two columns are coordinates and the 3rd column contains classes
frames ((n,) ndarray) – numpy array with frame numbers
**lv (int) – latent variable value to visualize (Default: 1)
**fov (int or list) – field of view or scan size
**fsize (int) – figure size (Default: 6)
**cmap (str) – colormap (Default: jet)
- Return type
None
- atomai.utils.plot_transitions(matrix, states=None, gmm_components=None, plot_values=False, **kwargs)[source]
Plots transition matrix and (optionally) most frequent/probable transitions
- Parameters
matrix (2D numpy array) – Transition matrix
states (numpy array) – Array with states (e.g. [2, 5, 7])
gmm_components (4D numpy array) – GMM components (optional)
plot_values (bool) – Show calculated transition rates
**transitions_to_plot (int) – number of transitions (associated with largest prob values) to plot
**plot_toself (bool) – Skips transitions into self when plotting transitions with largest probs
**fsize (int) – figure size
**cmap (str) – color map
- Return type
None
- atomai.utils.plot_trajectories_transitions(trans_dict, k, plot_values=False, **kwargs)[source]
Plots trajectory with the associated transitions.
- Parameters
trans_dict (dict) – Python dictionary containing trajectories, frame numbers, transitions and the averaged GMM components. Usually this is an output of atomstat.transition_matrix
k (int) – Number of the trajectory to visualize
plot_values (bool) – Show calculated transition rates
**transitions_to_plot (int) – number of transitions (associated with largest prob values) to plot
**fsize (int) – figure size
**cmap (str) – color map
**fov (int or list) – field of view (scan size)
- Return type
None
ASE utilities
- atomai.utils.ase_obj_basic(coords_dict, frame_number, material_system, map_dict, filepath, px2ang)[source]
Takes the atomic coordinates and classes predicted by AtomAI’s Segmentor models and uses them to create a structure file readable by packages such as Atomic Simulation Environment (ASE), VESTA, etc. It uses a simple cubic cell.
- Parameters
coords_dict (Union[Dict[int, ndarray], ndarray]) – dictionary object of coordinates produced by AtomAI
frame_number (int) – image frame number (assumes a stack of images)
material_system (str) – name of material
map_dict (Dict[int, str]) – dictionary which maps atomic classes from the NN output (keys) to strings corresponding to chemical elements (values)
filepath (str) – Savepath for the ASE object
px2ang (float) – Pixels-to-angstroms conversion coefficient, which is specific to each experiment
Examples:
>>> import ase.io.vasp
>>> from atomai.utils import ase_obj_basic
>>> # Save coordinates for a specific frame (0) as an ASE object
>>> ase_obj_basic(coordinates, 0, "Graphene",
...               map_dict={0: "C", 1: "Si"},
...               filepath="/content/Drive/POSCAR",
...               px2ang=0.104)
>>> # Read the saved file using the ASE reader
>>> cell = ase.io.vasp.read_vasp("/content/Drive/POSCAR")
- Return type
None
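The roles of map_dict and px2ang can be illustrated without AtomAI or ASE. The following is a minimal sketch, not the library's code: it takes rows of (x, y, class) in pixels, maps each class to an element symbol, and scales the positions to angstroms. The helper name `pixels_to_atoms`, the sample coordinates, and the choice to skip unmapped classes are assumptions.

```python
from collections import Counter


def pixels_to_atoms(coords, map_dict, px2ang):
    """Convert (x, y, class) rows in pixels to (element, x, y) in angstroms.

    Rows whose class is missing from map_dict are skipped (an assumption
    made for this sketch).
    """
    atoms = []
    for x, y, cls in coords:
        element = map_dict.get(int(cls))
        if element is None:
            continue
        atoms.append((element, x * px2ang, y * px2ang))
    return atoms


# Hypothetical coordinates for one frame: x, y (pixels) and class label
frame_coords = [(10.0, 20.0, 0), (30.5, 42.0, 0), (55.0, 18.0, 1)]
atoms = pixels_to_atoms(frame_coords, {0: "C", 1: "Si"}, px2ang=0.104)
composition = Counter(element for element, _, _ in atoms)
print(atoms)
print(dict(composition))
```

Because px2ang depends on the magnification and scan settings of each experiment, it is worth keeping it next to the data rather than hard-coding it.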
- atomai.utils.ase_obj_adv(a_lattice, b_lattice, c_lattice, coords_dict, frame_number, material_system, map_dict, filepath, px2ang)[source]
Takes the atomic coordinates and classes predicted by AtomAI’s Segmentor models and uses them to create a structure file readable by packages such as Atomic Simulation Environment (ASE), VESTA, etc. It uses a customized cell with multiple atoms per user’s choice.
- Parameters
a_lattice (float) – lattice vector in the a direction, given as a list ([a1, a2, a3])
b_lattice (float) – lattice vector in the b direction, given as a list ([b1, b2, b3])
c_lattice (float) – lattice vector in the c direction, given as a list ([c1, c2, c3])
coords_dict (Union[Dict[int, ndarray], ndarray]) – dictionary object of coordinates produced by AtomAI
frame_number (int) – image frame number
material_system (str) – name of material
map_dict (Dict[int, str]) – dictionary which maps atomic classes from the NN output (keys) to strings corresponding to chemical elements (values)
filepath (str) – Savepath for the ASE object
px2ang (float) – Pixels-to-angstroms conversion coefficient, which is specific to each experiment
Examples:
>>> import ase.io.vasp
>>> from atomai.utils import ase_obj_adv
>>> # Save coordinates for a specific frame (0) as an ASE object
>>> ase_obj_adv([86.00000, 0.00000, 0.00000],
...             [0.00000, 86.00000, 0.00000],
...             [0.00000, 0.00000, 86.00000], coordinates, 0,
...             "Graphene", map_dict={0: "C", 1: "Si"},
...             filepath="/content/Drive/POSCAR_adv",
...             px2ang=0.104)
>>> # Read the saved file using the ASE reader
>>> cell = ase.io.vasp.read_vasp("/content/Drive/POSCAR_adv")
- Return type
None
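To show how user-supplied lattice vectors end up in a structure file, here is a rough sketch of the VASP 5 POSCAR text layout (comment line, scaling factor, three lattice vectors, element symbols, per-element counts, coordinate mode, positions). It is an illustration of the file format only and does not reproduce how AtomAI writes its output. The function name `poscar_text`, the sample positions, and setting z to 0 for atoms found in a 2D image are assumptions.

```python
def poscar_text(title, lattice, atoms):
    """Format atoms as VASP 5 POSCAR text with Cartesian positions.

    lattice: three lattice vectors (a, b, c), each a list of 3 floats (angstroms)
    atoms: list of (element, x, y, z) tuples (angstroms)
    """
    # POSCAR lists atoms grouped by species, so collect positions per element
    # while keeping the order in which elements first appear.
    grouped = {}
    for element, x, y, z in atoms:
        grouped.setdefault(element, []).append((x, y, z))

    lines = [title, "1.0"]  # comment line, then the universal scaling factor
    lines += [" ".join(f"{v:.5f}" for v in vec) for vec in lattice]
    lines.append(" ".join(grouped))
    lines.append(" ".join(str(len(pos)) for pos in grouped.values()))
    lines.append("Cartesian")
    for positions in grouped.values():
        lines += [" ".join(f"{v:.5f}" for v in p) for p in positions]
    return "\n".join(lines) + "\n"


# Same cell as in the example above; atom positions are hypothetical
lattice = [[86.0, 0.0, 0.0], [0.0, 86.0, 0.0], [0.0, 0.0, 86.0]]
atoms = [("C", 1.04, 2.08, 0.0), ("Si", 5.72, 1.87, 0.0), ("C", 3.17, 4.37, 0.0)]
print(poscar_text("Graphene", lattice, atoms))
```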
Datasets
- atomai.utils.datasets.stem_smbfo(download=True, filedir='./')[source]
Downloads the scanning transmission electron microscopy (STEM) datasets from the combinatorial library of the Sm-doped BiFeO3 (BFO) grown to cover the composition range from pure ferroelectric BFO to orthorhombic 20% Sm-doped BFO. For details, see npj Comput Mater 6, 127 (2020). https://doi.org/10.1038/s41524-020-00396-2.
- Parameters
download (bool) – downloads the dataset from the public repository
filedir (str) – directory to save the downloaded data
- Return type
Dict[str, Dict[str, ndarray]]
- Returns
Nested dictionary where the 1st dictionary describes different Sm concentrations, and the 2nd dictionary has chemical and physical descriptors for each concentration.
Examples
>>> import atomai.utils.datasets
>>> import matplotlib.pyplot as plt
>>> # Download the dataset
>>> dataset = atomai.utils.datasets.stem_smbfo()
>>> # Plot main image and polarization values for each Sm concentration
>>> for k, d in dataset.items():
...     _, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(16, 5))
...     y, x = d["xy_COM"].T  # get the center of mass for each unit cell
...     ax1.imshow(d["main_image"], origin="lower", cmap='gray')
...     ax1.set_title(k)
...     ax2.scatter(x, y, c=d["Pxy"][:, 0], s=3, cmap='RdBu_r')
...     ax3.scatter(x, y, c=d["Pxy"][:, 1], s=3, cmap='RdBu_r')
...     plt.show()
- atomai.utils.datasets.stem_graphene(download=True, filedir='./')[source]
Downloads the scanning transmission electron microscopy (STEM) datasets from the graphene samples. See https://doi.ccs.ornl.gov/ui/doi/338 for details.
- Parameters
download (bool) – downloads the dataset from the public repository
filedir (str) – directory to save the downloaded data
- Return type
Dict[int, Dict[str, Union[ndarray, Dict]]]
- Returns
Nested dictionary with STEM movies in the form of M x N x L arrays and the corresponding metadata
Examples
>>> import atomai.utils.datasets
>>> # Download the dataset
>>> dataset = atomai.utils.datasets.stem_graphene()
>>> # Get one STEM movie and the associated metadata
>>> data = dataset[3]["image_data"]  # ndarray of size (50, 1024, 1024)
>>> metadata = dataset[3]["metadata"]  # dictionary with experimental parameters