Statistical Analysis

Sliding FFT + NMF

class atomai.stat.SlidingFFTNMF(window_size_x=None, window_size_y=None, window_step_x=None, window_step_y=None, interpolation_factor=2, zoom_factor=2, hamming_filter=True, components=4)[source]

Bases: object

make_windows(image)[source]: Generate windows from an image using efficient striding operations

process_fft(windows)[source]: Perform FFT on each window with optional hamming filter and zooming

run_nmf(fft_results)[source]: Run NMF on FFT results to extract components
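The first two stages of this pipeline (strided window extraction, then a Hamming-filtered FFT of each window) can be illustrated with plain NumPy. This is a minimal sketch of the general technique, not the library's implementation: the helper names `make_windows_sketch` and `windowed_fft_sketch`, the square window, and the single `step` value are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows_sketch(image, window_size, step):
    """Cut a 2D image into square windows without copying pixel data."""
    # sliding_window_view exposes every possible window as a strided view;
    # slicing with `step` keeps only the windows on the requested grid.
    views = sliding_window_view(image, (window_size, window_size))
    return views[::step, ::step]


def windowed_fft_sketch(windows, hamming_filter=True):
    """Return the centered FFT magnitude of each window."""
    size = windows.shape[-1]
    if hamming_filter:
        # A 2D Hamming taper suppresses edge artifacts in the spectrum.
        taper = np.outer(np.hamming(size), np.hamming(size))
        windows = windows * taper
    # fft2 acts on the last two axes, so all windows are transformed at once.
    spectra = np.fft.fftshift(np.fft.fft2(windows), axes=(-2, -1))
    return np.abs(spectra)


rng = np.random.default_rng(0)
image = rng.random((64, 64))
windows = make_windows_sketch(image, window_size=16, step=8)
fft_magnitudes = windowed_fft_sketch(windows)
print(windows.shape, fft_magnitudes.shape)
```

The strided view avoids materializing overlapping windows until the taper is applied, which is the usual reason for preferring striding over an explicit Python loop.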

analyze_image(image_input, output_path=None)[source]

Full analysis pipeline for an image

Parameters:

image_input (str or numpy.ndarray) – Either a file path to an image or a numpy array containing image data
output_path (str, optional) – Path for saving output files. If None, it is auto-generated for file inputs, or the current directory is used for array inputs

Spectral Unmixing

class atomai.stat.SpectralUnmixer(method='nmf', n_components=4, normalize=False, **kwargs)[source]

Bases: object

Applies various decomposition algorithms to hyperspectral data for spectral unmixing and component analysis.

Supported methods: ‘nmf’, ‘pca’, ‘ica’, ‘gmm’.

fit(hspy_data)[source]: Fits the selected model to a hyperspectral data cube.

plot_results(x_axis_vals=None, x_axis_units=None, **kwargs)[source]
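To make the idea of fitting a decomposition to a hyperspectral cube concrete, the sketch below unmixes a synthetic cube with a basic NMF using multiplicative updates. It is an illustration under stated assumptions, not the class's actual code: the function name `nmf_unmix_sketch`, the (height, width, channels) layout, and the choice of multiplicative updates are all hypothetical.

```python
import numpy as np


def nmf_unmix_sketch(cube, n_components=4, n_iter=200, seed=0):
    """Unmix a non-negative (height, width, channels) cube with basic NMF."""
    h, w, c = cube.shape
    V = cube.reshape(h * w, c)                      # one spectrum per row
    rng = np.random.default_rng(seed)
    W = rng.random((h * w, n_components)) + 1e-3    # per-pixel abundances
    H = rng.random((n_components, c)) + 1e-3        # component spectra
    eps = 1e-12                                     # guards against division by zero
    for _ in range(n_iter):
        # Multiplicative updates for the Frobenius-norm objective ||V - WH||.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    abundance_maps = W.reshape(h, w, n_components)
    return abundance_maps, H


# Synthetic cube mixed from two non-negative spectra.
rng = np.random.default_rng(1)
true_maps = rng.random((6, 5, 2))
true_spectra = rng.random((2, 12))
cube = true_maps @ true_spectra                     # shape (6, 5, 12)

maps, spectra = nmf_unmix_sketch(cube, n_components=2)
print(maps.shape, spectra.shape)
```

Reshaping the abundances back to the spatial grid gives one map per component, which is the form in which unmixing results are usually inspected.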

Local image analysis

class atomai.stat.imlocal(network_output, coord_class_dict_all, window_size=None, coord_class=0)[source]

Bases: object

Class for extraction and statistical analysis of local image descriptors. It assumes that input image data is an output of a neural network, but it can also work with regular experimental images (make sure you have extra dimensions for channel and batch size).

Parameters:

network_output (4D numpy array) – Output of a fully convolutional neural network where a class is assigned to every pixel in the input image(s). The dimensions are \(images \times height \times width \times channels\)
coord_class_dict_all (dict) – Prediction from atomnet.locator (can be from another source but must be in the same format). Each element is a \(N \times 3\) numpy array, where N is the number of detected atoms/defects, the first 2 columns are xy coordinates and the third column is the class (starts with 0)
window_size (int) – Side of the square for subimage cropping
coord_class (int) – Class of atoms/defects around which the subimages will be cropped; in the atomnet.locator output the class is the 3rd column (the first two are xy positions)
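The expected input shapes can be illustrated with small placeholder arrays. This sketch only builds data in the format described above; keying the dictionary by integer frame index is an assumption, and all values are made up for illustration.

```python
import numpy as np

# Two frames of 64 x 64 pixels with a single channel:
# images x height x width x channels.
network_output = np.zeros((2, 64, 64, 1), dtype=np.float32)

# One (N x 3) array per frame; columns are x, y, class.
# Keying by integer frame index is an assumption of this sketch.
coord_class_dict_all = {
    0: np.array([[10.0, 12.0, 0], [30.5, 40.2, 1]]),
    1: np.array([[11.0, 12.5, 0], [31.0, 39.8, 1], [50.0, 20.0, 1]]),
}

# A regular 2D experimental image needs batch and channel axes added.
raw_image = np.random.default_rng(0).random((64, 64))
as_4d = raw_image[None, ..., None]

# Select the xy coordinates of one class (here class 1) in frame 0.
frame0 = coord_class_dict_all[0]
class1_xy = frame0[frame0[:, 2] == 1][:, :2]
print(as_4d.shape, class1_xy)
```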

Examples:

Identification of distortion domains in a single atomic image:

>>> from atomai import stat
>>> # First obtain a "cleaned" image and atomic coordinates using a trained model
>>> nn_output, coordinates = model.predict(expdata)
>>> # Now get local image descriptors using atomai.stat.imlocal
>>> imstack = stat.imlocal(nn_output, coordinates, window_size=32, coord_class=1)
>>> # Compute PCA scree plot to estimate the number of components/sources
>>> imstack.pca_scree_plot(plot_results=True)
>>> # Do PCA analysis and plot results
>>> pca_results = imstack.imblock_pca(n_components=4, plot_results=True)
>>> # Do NMF analysis and plot results
>>> nmf_results = imstack.imblock_nmf(n_components=4, plot_results=True)

Analysis of atomic/defect trajectories from movies (3D image stack):

>>> # Get local descriptors (such as subimages centered around impurities)
>>> imstack = stat.imlocal(nn_output, coordinates, window_size=32, coord_class=1)
>>> # Calculate Gaussian mixture model (GMM) components (returns a 3-element tuple)
>>> components_img, classes_list, com_frames = imstack.gmm(n_components=10, plot_results=True)
>>> # Calculate GMM components and transition probabilities for different trajectories
>>> transitions_dict = imstack.transition_matrix(n_components=10, rmax=10)

extract_subimages_()[source]

Extracts subimages centered at certain atom class/type in the neural network output

Return type:

Tuple[ndarray]

Returns:

3-element tuple containing

stack of subimages
(x, y) coordinates of their centers
frame number associated with each subimage
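The cropping step can be sketched with NumPy as follows. This is a hypothetical helper, not the method's actual code: the name `crop_subimages_sketch`, the convention that the first coordinate indexes rows, and the choice to skip patches that cross the image border are all assumptions.

```python
import numpy as np


def crop_subimages_sketch(stack, coords_by_frame, window_size, coord_class):
    """Crop square patches centered on atoms of one class (hypothetical helper)."""
    half = window_size // 2
    patches, centers, frames = [], [], []
    for frame, coords in coords_by_frame.items():
        img = stack[frame]
        height, width = img.shape[:2]
        for x, y, cls in coords:
            if int(cls) != coord_class:
                continue
            # Assumption: the first coordinate indexes rows, the second columns.
            r0 = int(round(x)) - half
            c0 = int(round(y)) - half
            r1, c1 = r0 + window_size, c0 + window_size
            if r0 < 0 or c0 < 0 or r1 > height or c1 > width:
                continue  # skip patches that would cross the image border
            patches.append(img[r0:r1, c0:c1])
            centers.append((x, y))
            frames.append(frame)
    return np.array(patches), np.array(centers), np.array(frames)


stack = np.random.default_rng(0).random((2, 32, 32, 1))
coords = {
    0: np.array([[16, 16, 1], [2, 2, 1], [10, 20, 0]], dtype=float),
    1: np.array([[8.4, 20.6, 1]]),
}
patches, centers, frames = crop_subimages_sketch(stack, coords, 8, coord_class=1)
print(patches.shape, centers.shape, frames)
```

In this toy input, the atom at (2, 2) is dropped because its window would leave the image, and the class-0 atom is ignored.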

gmm(n_components, covariance='diag', random_state=1, plot_results=False)[source]

Applies Gaussian mixture model to image stack.

Parameters:

n_components (int) – Number of components
covariance (str) – Type of covariance (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance
plot_results (bool) – Plots GMM components

Return type:

Tuple[ndarray, List]

Returns:

3-element tuple containing

4D numpy array with GMM “centroids” (averaged images for each class)
List where each element contains 4D images belonging to each GMM class
2D numpy array with xy coordinates, label and corresponding frame number for each subimage

pca(n_components, random_state=1, plot_results=False)[source]

Computes PCA eigenvectors for a stack of subimages.

Parameters:

n_components (int) – Number of PCA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed eigenvectors

Return type:

Tuple[ndarray]

Returns:

3-element tuple containing

4D numpy array with computed and reshaped principal axes
2D numpy array with projection of X_vec (vector with flattened subimages) on the first principal components
2D numpy array with center-of-mass coordinates and corresponding frame number for each subimage
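The relationship between the flattened subimages (X_vec), the reshaped principal axes, and the projection can be sketched with an SVD. This is a generic NumPy illustration, not the method's implementation; the function name `pca_subimages_sketch` is hypothetical.

```python
import numpy as np


def pca_subimages_sketch(imstack, n_components):
    """PCA of an (n, height, width, channels) stack via SVD."""
    n, h, w, c = imstack.shape
    X_vec = imstack.reshape(n, h * w * c)       # flatten each subimage
    X_centered = X_vec - X_vec.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    axes = Vt[:n_components]
    components = axes.reshape(n_components, h, w, c)   # back to image shape
    projection = X_centered @ axes.T                   # coordinates on the axes
    return components, projection


imstack_demo = np.random.default_rng(0).random((50, 8, 8, 1))
components, projection = pca_subimages_sketch(imstack_demo, n_components=4)
print(components.shape, projection.shape)
```

Each reshaped component can be displayed as an image of the same size as a subimage, while each row of the projection gives the weights of one subimage on those components.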

ica(n_components, random_state=1, plot_results=False)[source]

Computes ICA independent sources for a stack of subimages.

Parameters:

n_components (int) – Number of ICA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed sources

Return type:

Tuple[ndarray]

Returns:

3-element tuple containing

4D numpy array with computed and reshaped independent sources
2D numpy array with recovered sources from X_vec (vector with flattened subimages)
2D numpy array with center-of-mass coordinates and corresponding frame number for each subimage

nmf(n_components, random_state=1, plot_results=False, **kwargs)[source]

Applies NMF to source separation from a stack of subimages

Parameters:

n_components (int) – Number of NMF components
random_state (int) – Random state instance
plot_results (bool) – Plots computed sources
**max_iterations (int) – Maximum number of iterations before timing out

Return type:

Tuple[ndarray]

Returns:

3-element tuple containing

4D numpy array with computed and reshaped sources
2D numpy array with transformed data according to the trained NMF model
2D numpy array with center-of-mass coordinates and corresponding frame number for each subimage

pca_gmm(n_components_gmm, n_components_pca, plot_results=False, covariance_type='diag', random_state=1)[source]

Performs PCA decomposition on GMM-unmixed classes. Can be used when GMM allows separating different symmetries (e.g. different sublattices in graphene)

Parameters:

n_components_gmm (int) – Number of components for GMM
n_components_pca (int or list of int) – Number of PCA components. Pass a list of integers to use a different number of PCA components for each GMM class
covariance_type (str) – Type of covariance (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance
plot_results (bool) – Plots GMM components

Return type:

Tuple[ndarray, List]

Returns:

4-element tuple containing

4D numpy array with GMM “centroids” (averaged images for each GMM class)
List of 4D numpy arrays with PCA components
List with PCA-transformed data
2D numpy array with xy coordinates, GMM-assigned labels, and corresponding frame numbers

pca_scree_plot(plot_results=True)[source]

Computes and plots PCA ‘scree plot’ (explained variance ratio vs number of components)

Return type:

ndarray
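The quantity shown in a scree plot, the explained variance ratio per component, can be computed directly from singular values. The sketch below is a generic NumPy illustration rather than the method's code; the function name and the 90% threshold used to pick a component count are assumptions chosen for the example.

```python
import numpy as np


def explained_variance_ratio_sketch(imstack):
    """Explained variance ratio for every principal component of a stack."""
    X = imstack.reshape(len(imstack), -1)
    X = X - X.mean(axis=0)
    singular_values = np.linalg.svd(X, compute_uv=False)
    variances = singular_values ** 2 / (len(X) - 1)
    return variances / variances.sum()


scree_stack = np.random.default_rng(0).random((40, 8, 8, 1))
ratio = explained_variance_ratio_sketch(scree_stack)

# Smallest number of components whose cumulative ratio reaches 90%
# (the threshold is an arbitrary choice for this illustration).
cumulative = np.cumsum(ratio)
n_keep = int(np.searchsorted(cumulative, 0.9) + 1)
print(ratio[:5], n_keep)
```

In practice, the number of components is often read off the plot at the point where the curve flattens, rather than from a fixed threshold.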

pca_gmm_scree_plot(n_components_gmm, covariance_type='diag', random_state=1, plot_results=True)[source]

Computes PCA scree plot for each GMM class

Parameters:

n_components_gmm (int) – Number of components for GMM
covariance_type (str) – Type of covariance (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance
plot_results (bool) – Plotting GMM components and PCA scree plot

Return type:

List[ndarray]

Returns:

List with PCA explained variances for each GMM component

imblock_pca(n_components, random_state=1, plot_results=False, **kwargs)[source]

Computes PCA eigenvectors and their loading maps for a stack of subimages. Intended to be used for finding domains (“blocks”) (e.g. ferroic domains) in a single image.

Parameters:

n_components (int) – Number of PCA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed eigenvectors and loading maps
**marker_size (int) – Controls marker size for loading maps plot

Return type:

Tuple[ndarray]

Returns:

3-element tuple containing

4D numpy array with computed (and reshaped) principal axes
2D numpy array with projection of X_vec (vector with flattened subimages) on the first principal components
2D numpy array with coordinates of each subimage

imblock_ica(n_components, random_state=1, plot_results=False, **kwargs)[source]

Computes ICA independent sources and their loading maps for a stack of subimages. Intended to be used for finding domains (“blocks”) (e.g. ferroic domains) in a single image.

Parameters:

n_components (int) – Number of ICA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed independent sources and loading maps
**marker_size (int) – Controls marker size for loading maps plot

Return type:

Tuple[ndarray]

Returns:

3-element tuple containing

4D numpy array with computed (and reshaped) independent sources
2D numpy array with recovered sources from X_vec (vector with flattened subimages)
2D numpy array with coordinates of each subimage

imblock_nmf(n_components, random_state=1, plot_results=False, **kwargs)[source]

Applies NMF to source separation. Computes sources and their loading maps for a stack of subimages. Intended to be used for finding domains (“blocks”) (e.g. ferroic domains) in a single image.

Parameters:

n_components (int) – Number of NMF components
random_state (int) – Random state instance
plot_results (bool) – Plots computed sources and loading maps
**max_iterations (int) – Maximum number of iterations before timing out
**marker_size (int) – Controls marker’s size for loading maps plots

Return type:

Tuple[ndarray]

Returns:

3-element tuple containing

4D numpy array with computed (and reshaped) sources
2D numpy array with transformed X_vec (vector with flattened subimages) according to the trained NMF model
2D numpy array with coordinates of each subimage

classmethod plot_decomposition_results(components, X_vec_t, image_hw=None, xy_centers=None, plot_loading_maps=True, **kwargs)[source]

Plots decomposition “eigenvectors” and, optionally, their loading maps

Parameters:

components (4D numpy array) – Computed (and reshaped) principal axes / independent sources / factorization matrix for stack of subimages
X_vec_t (2D numpy array) – Projection of X_vec on the first principal components / Recovered sources from X_vec / transformed X_vec according to the learned NMF model (is used to create “loading maps”)
image_hw (tuple) – Height and width of the “mother image”
xy_centers (n x 2 numpy array) – (x, y) coordinates of the extracted subimages
plot_loading_maps (bool) – Plots loading maps for each “eigenvector”
**marker_size (int) – Controls marker’s size for loading maps plots

Return type:

None

classmethod get_trajectory(coord_class_dict, start_coord, rmax)[source]

Extracts a trajectory of a single defect/atom from image stack

Parameters:

coord_class_dict (dict) – Dictionary of atomic coordinates (same format as produced by atomnet.locator)
start_coord (N x 2 numpy array) – Coordinate of defect/atom in the first frame whose trajectory we are going to track
rmax (int) – Max allowed distance (projected on xy plane) between defect in one frame and the position of its nearest neighbor in the next one

Return type:

Tuple[ndarray]

Returns:

2-element tuple containing

Numpy array of defect/atom coordinates from a single trajectory
Frames corresponding to this trajectory
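Nearest-neighbor linking with a distance cutoff can be sketched as below. This is an illustration of the general idea, not the method's code: the function name `track_nearest_sketch` and the decision to stop tracking at the first frame with no neighbor within rmax are assumptions.

```python
import numpy as np


def track_nearest_sketch(coords_by_frame, start_xy, rmax):
    """Follow one defect through frames by nearest-neighbor linking."""
    current = np.asarray(start_xy, dtype=float)
    trajectory, frames = [], []
    for frame in sorted(coords_by_frame):
        # Keep only the xy columns of the (N x 3) coordinate array.
        xy = np.asarray(coords_by_frame[frame], dtype=float).reshape(-1, 3)[:, :2]
        if len(xy) == 0:
            break
        distances = np.linalg.norm(xy - current, axis=1)
        nearest = int(np.argmin(distances))
        if distances[nearest] > rmax:
            break  # assumption: stop once no neighbor lies within rmax
        current = xy[nearest]
        trajectory.append(current)
        frames.append(frame)
    return np.array(trajectory), np.array(frames)


coords_demo = {
    0: [[5, 5, 0], [20, 20, 0]],
    1: [[6, 5.5, 0], [20, 21, 0]],
    2: [[30, 30, 0], [20, 20.5, 0]],
}
trajectory, traj_frames = track_nearest_sketch(coords_demo, (5, 5), rmax=3)
print(trajectory, traj_frames)
```

In this toy input, the tracked defect is found in frames 0 and 1 and then lost in frame 2, where its closest candidate is farther away than rmax.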

get_all_trajectories(min_length=0, run_gmm=False, rmax=10, **kwargs)[source]

Extracts trajectories for the detected defects starting from the first frame. Applies (optionally) Gaussian mixture model to a stack of local descriptors (subimages).

Parameters:

min_length (int) – Minimal length of trajectory to return
run_gmm (bool) – Optional GMM separation into different classes
rmax (int) – Max allowed distance (projected on xy plane) between defect in one frame and the position of its nearest neighbor in the next one
**n_components (int) – Number of components for Gaussian mixture model
**covariance (str) – Type of covariance for Gaussian mixture model (‘full’, ‘diag’, ‘tied’, ‘spherical’)
**random_state (int) – Random state instance for Gaussian mixture model

Return type:

Dict

Returns:

Python dictionary containing

list of numpy arrays with defects/atoms trajectories (“trajectories”)
list of frames corresponding to the extracted trajectories (“frames”)
GMM components when run_gmm=True (“gmm_components”)

classmethod renumerate_classes(classes)[source]

Helper function for renumerating Gaussian mixture model classes for Markov transition analysis

Return type:

ndarray
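One plausible reading of "renumerating" is mapping whatever class labels occur in a trajectory onto consecutive integers starting at 0, so that they can index a transition matrix. The sketch below shows that mapping with NumPy; it is an assumption about the intent, and `renumber_labels_sketch` is a hypothetical name.

```python
import numpy as np


def renumber_labels_sketch(labels):
    """Map arbitrary integer labels onto 0..k-1, preserving their order."""
    _, renumbered = np.unique(np.asarray(labels).ravel(), return_inverse=True)
    return renumbered


labels_demo = [2, 5, 5, 9, 2]
renumbered_demo = renumber_labels_sketch(labels_demo)
print(renumbered_demo)  # [0 1 1 2 0]
```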

transition_matrix(n_components, covariance='diag', random_state=1, rmax=10, min_length=0, sum_all_transitions=False)[source]

Applies Gaussian mixture model to a stack of local descriptors (subimages). Extracts trajectories for the detected defects starting from the first frame. Calculates transition probability for each trajectory.

Parameters:

n_components (int) – Number of components for Gaussian mixture model
covariance (str) – Type of covariance for Gaussian mixture model (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance for Gaussian mixture model
rmax (int) – Max allowed distance (projected on xy plane) between defect in one frame and the position of its nearest neighbor in the next one
min_length (int) – Minimal length of trajectory to return

Return type:

Dict

Returns:

Python dictionary containing

List of defects/atoms trajectories (“trajectories”)
List of transition matrices for each trajectory (“transitions”)
List of frames corresponding to the extracted trajectories (“frames”)
GMM components as images (“gmm_components”)
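For a single trajectory, a transition probability matrix can be estimated by counting consecutive pairs of class labels and normalizing each row. The sketch below shows this standard Markov-chain estimate in NumPy; it is not the method's code, and the function name and the choice to leave rows without outgoing transitions at zero are assumptions.

```python
import numpy as np


def transition_matrix_sketch(labels, n_states):
    """Row-normalized counts of transitions between consecutive labels."""
    counts = np.zeros((n_states, n_states))
    for src, dst in zip(labels[:-1], labels[1:]):
        counts[src, dst] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no outgoing transitions stay all-zero instead of dividing by 0.
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)


# GMM class assigned to one defect in seven consecutive frames (toy data).
labels_traj = [0, 0, 1, 0, 1, 1, 2]
trans = transition_matrix_sketch(labels_traj, n_states=3)
print(trans)
```

Entry (i, j) is the estimated probability that a defect in class i in one frame is in class j in the next frame.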