Statistical Analysis
Sliding FFT + NMF
- class atomai.stat.SlidingFFTNMF(window_size_x=None, window_size_y=None, window_step_x=None, window_step_y=None, interpolation_factor=2, zoom_factor=2, hamming_filter=True, components=4)[source]
Bases: object
- analyze_image(image_input, output_path=None)[source]
Full analysis pipeline for an image
- Parameters:
image_input (str or numpy.ndarray) – Either a file path to an image or a numpy array containing image data
output_path (str, optional) – Path for saving output files. If None, will be auto-generated for file inputs or use current directory for array inputs
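The class name and constructor arguments (window size, window step, Hamming filter) suggest a sliding-window FFT stage ahead of NMF. The following is a minimal numpy sketch of such a windowing stage only. The function name `sliding_fft` and its defaults are hypothetical, and this is not the library's implementation.

```python
import numpy as np

def sliding_fft(image, window_size=16, window_step=8, hamming_filter=True):
    """Return FFT magnitudes of overlapping square windows and their top-left corners."""
    h, w = image.shape
    # A 2D Hamming taper reduces edge artifacts in each window's spectrum
    if hamming_filter:
        taper = np.outer(np.hamming(window_size), np.hamming(window_size))
    else:
        taper = 1.0
    spectra, corners = [], []
    for i in range(0, h - window_size + 1, window_step):
        for j in range(0, w - window_size + 1, window_step):
            patch = image[i:i + window_size, j:j + window_size] * taper
            # Move the zero frequency to the center and keep the magnitude only
            spectra.append(np.abs(np.fft.fftshift(np.fft.fft2(patch))))
            corners.append((i, j))
    return np.array(spectra), np.array(corners)

rng = np.random.default_rng(0)
image = rng.random((64, 64))          # random stand-in for an experimental image
spectra, corners = sliding_fft(image)
print(spectra.shape, corners.shape)   # (49, 16, 16) (49, 2)
```

The magnitudes are non-negative, which is the property a subsequent NMF step relies on.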
Spectral Unmixing
Local image analysis
- class atomai.stat.imlocal(network_output, coord_class_dict_all, window_size=None, coord_class=0)[source]
Bases: object
Class for extraction and statistical analysis of local image descriptors. It assumes that input image data is an output of a neural network, but it can also work with regular experimental images (make sure you have extra dimensions for channel and batch size).
- Parameters:
network_output (4D numpy array) – Output of a fully convolutional neural network where a class is assigned to every pixel in the input image(s). The dimensions are \(images \times height \times width \times channels\)
coord_class_dict_all (dict) – Prediction from atomnet.locator (can be from another source but must be in the same format). Each element is a \(N \times 3\) numpy array, where N is the number of detected atoms/defects, the first two columns are xy coordinates, and the third column is the class (starting from 0)
window_size (int) – Side of the square for subimage cropping
coord_class (int) – Class of atoms/defects around which the subimages will be cropped; in the atomnet.locator output the class is the 3rd column (the first two are xy positions)
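The input formats described above can be illustrated with plain numpy. This sketch uses random data as a stand-in for a real image, and it assumes that the dictionary keys are integer frame indices, which the text above does not state explicitly.

```python
import numpy as np

# A regular 2D experimental image (height x width); random stand-in data
image_2d = np.random.default_rng(0).random((128, 128))

# Add batch (first) and channel (last) dimensions:
# images x height x width x channels
network_output = image_2d[None, ..., None]

# One N x 3 array per frame: columns are x, y, class (classes start at 0).
# Assumption: keys are integer frame indices.
coord_class_dict_all = {
    0: np.array([[20.0, 30.0, 0], [64.5, 70.2, 1], [100.0, 90.0, 1]]),
}

print(network_output.shape)           # (1, 128, 128, 1)
print(coord_class_dict_all[0].shape)  # (3, 3)
```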
Examples:
Identification of distortion domains in a single atomic image:
>>> # First obtain a "cleaned" image and atomic coordinates using a trained model
>>> nn_output, coordinates = model.predict(expdata)
>>> # Now get local image descriptors using atomai.stat.imlocal
>>> imstack = stat.imlocal(nn_output, coordinates, window_size=32, coord_class=1)
>>> # Compute PCA scree plot to estimate the number of components/sources
>>> imstack.pca_scree_plot(plot_results=True)
>>> # Do PCA analysis and plot results
>>> pca_results = imstack.imblock_pca(n_components=4, plot_results=True)
>>> # Do NMF analysis and plot results
>>> nmf_results = imstack.imblock_nmf(n_components=4, plot_results=True)
Analysis of atomic/defect trajectories from movies (3D image stack):
>>> # Get local descriptors (such as subimages centered around impurities)
>>> imstack = stat.imlocal(nn_output, coordinates, window_size=32, coord_class=1)
>>> # Calculate Gaussian mixture model (GMM) components (3-element tuple, see gmm below)
>>> components_img, classes_list, com_frames = imstack.gmm(n_components=10, plot_results=True)
>>> # Calculate GMM components and transition probabilities for different trajectories
>>> # (returns a dictionary, see transition_matrix below)
>>> transitions_dict = imstack.transition_matrix(n_components=10, rmax=10)
- extract_subimages_()[source]
Extracts subimages centered at certain atom class/type in the neural network output
- Return type:
Tuple[ndarray]
- Returns:
3-element tuple containing
stack of subimages
(x, y) coordinates of their centers
frame number associated with each subimage
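The extraction step can be sketched in numpy as follows. The function `extract_subimages` is a hypothetical stand-in, not the library method. It assumes that the first coordinate column indexes rows and the second indexes columns, and that windows crossing the image border are skipped.

```python
import numpy as np

def extract_subimages(network_output, coord_class_dict, window_size, coord_class=0):
    """Crop square windows centered on atoms of one class."""
    half = window_size // 2
    subimages, centers, frames = [], [], []
    for frame, coords in coord_class_dict.items():
        img = network_output[frame]
        height, width = img.shape[:2]
        for x, y, cls in coords:
            if int(cls) != coord_class:
                continue
            # Assumption: x indexes rows (axis 0) and y indexes columns (axis 1)
            r0 = int(round(x)) - half
            c0 = int(round(y)) - half
            r1, c1 = r0 + window_size, c0 + window_size
            # Skip windows that would extend past the image border
            if r0 < 0 or c0 < 0 or r1 > height or c1 > width:
                continue
            subimages.append(img[r0:r1, c0:c1])
            centers.append((x, y))
            frames.append(frame)
    return np.array(subimages), np.array(centers), np.array(frames)

rng = np.random.default_rng(0)
nn_output = rng.random((2, 64, 64, 1))
coordinates = {
    0: np.array([[32.0, 32.0, 1], [2.0, 5.0, 1], [40.0, 20.0, 0]]),
    1: np.array([[30.0, 34.0, 1]]),
}
imgstack, com, frame_ids = extract_subimages(
    nn_output, coordinates, window_size=16, coord_class=1)
print(imgstack.shape, com.shape, frame_ids)  # (2, 16, 16, 1) (2, 2) [0 1]
```

In this example the atom at (2, 5) is dropped because its window leaves the image, and the atom of class 0 is ignored.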
- gmm(n_components, covariance='diag', random_state=1, plot_results=False)[source]
Applies Gaussian mixture model to image stack.
- Parameters:
n_components (int) – Number of components
covariance (str) – Type of covariance (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance
plot_results (bool) – Plotting gmm components
- Return type:
Tuple[ndarray,List]
- Returns:
3-element tuple containing
4D numpy array with GMM “centroids” (averaged images for each class)
List where each element contains 4D images belonging to each GMM class
2D numpy array with xy coordinates, label and corresponding frame number for each subimage
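The first two returned items can be pictured as per-class averages and per-class groups of subimages. The sketch below builds both from an already available label array. The labels here are hand-written stand-ins rather than the output of a fitted GMM, and `class_centroids` is a hypothetical helper.

```python
import numpy as np

def class_centroids(imgstack, labels):
    """Average the subimages assigned to each class and group them by class."""
    classes = np.unique(labels)
    # "Centroid" of a class = mean of all subimages carrying that label
    centroids = np.stack([imgstack[labels == c].mean(axis=0) for c in classes])
    members = [imgstack[labels == c] for c in classes]
    return centroids, members

rng = np.random.default_rng(0)
imgstack = rng.random((6, 8, 8, 1))      # 6 subimages, 8 x 8 pixels, 1 channel
labels = np.array([0, 1, 0, 2, 1, 0])    # stand-in for GMM-assigned classes
centroids, members = class_centroids(imgstack, labels)
print(centroids.shape)                   # (3, 8, 8, 1)
print([m.shape[0] for m in members])     # [3, 2, 1]
```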
- pca(n_components, random_state=1, plot_results=False)[source]
Computes PCA eigenvectors for a stack of subimages.
- Parameters:
n_components (int) – Number of PCA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed eigenvectors
- Return type:
Tuple[ndarray]
- Returns:
3-element tuple containing
4D numpy array with computed and reshaped principal axes
2D numpy array with projection of X_vec (vector with flattened subimages) on the first principal components
2D numpy array with center-of-mass coordinates and corresponding frame number for each subimage
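To make the shapes of the first two returned arrays concrete, here is a minimal sketch of PCA on a stack of subimages computed through an SVD in numpy. It illustrates the technique only and is not the library's code path; `pca_subimages` is a hypothetical name.

```python
import numpy as np

def pca_subimages(imgstack, n_components):
    """PCA of flattened subimages via SVD of the mean-centered data."""
    n, h, w, c = imgstack.shape
    X_vec = imgstack.reshape(n, h * w * c)       # one flattened subimage per row
    X_centered = X_vec - X_vec.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing singular value
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    axes = Vt[:n_components]
    projection = X_centered @ axes.T             # coordinates along each axis
    return axes.reshape(n_components, h, w, c), projection

rng = np.random.default_rng(0)
imgstack = rng.random((50, 8, 8, 1))
components, X_vec_t = pca_subimages(imgstack, n_components=4)
print(components.shape, X_vec_t.shape)  # (4, 8, 8, 1) (50, 4)
```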
- ica(n_components, random_state=1, plot_results=False)[source]
Computes ICA independent sources for a stack of subimages.
- Parameters:
n_components (int) – Number of ICA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed sources
- Return type:
Tuple[ndarray]
- Returns:
3-element tuple containing
4D numpy array with computed and reshaped independent sources
2D numpy array with recovered sources from X_vec (vector with flattened subimages)
2D numpy array with center-of-mass coordinates and corresponding frame number for each subimage
- nmf(n_components, random_state=1, plot_results=False, **kwargs)[source]
Applies NMF to source separation from a stack of subimages
- Parameters:
n_components (int) – Number of NMF components
random_state (int) – Random state instance
plot_results (bool) – Plots computed sources
**max_iterations (int) – Maximum number of iterations before timing out
- Return type:
Tuple[ndarray]
- Returns:
3-element tuple containing
4D numpy array with computed and reshaped sources
2D numpy array with transformed data according to the trained NMF model
2D numpy array with center-of-mass coordinates and corresponding frame number for each subimage
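For intuition about what the sources and the transformed data are, the sketch below factorizes flattened subimages with the classic multiplicative-update rule for NMF. This is a swapped-in textbook algorithm written in numpy, not the solver the library calls, and `nmf_multiplicative` is a hypothetical name.

```python
import numpy as np

def nmf_multiplicative(X, n_components, max_iterations=200, random_state=1, eps=1e-10):
    """Factorize non-negative X (n x m) as W (n x k) @ H (k x m)."""
    rng = np.random.default_rng(random_state)
    n, m = X.shape
    W = rng.random((n, n_components))
    H = rng.random((n_components, m))
    for _ in range(max_iterations):
        # Multiplicative updates keep both factors non-negative
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(0)
imgstack = rng.random((40, 8, 8, 1))
X_vec = imgstack.reshape(40, -1)
W, H = nmf_multiplicative(X_vec, n_components=4)
components = H.reshape(4, 8, 8, 1)   # sources reshaped back to image form
print(components.shape, W.shape)     # (4, 8, 8, 1) (40, 4)
```

Here `H` plays the role of the sources and `W` the role of the transformed data.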
- pca_gmm(n_components_gmm, n_components_pca, plot_results=False, covariance_type='diag', random_state=1)[source]
Performs PCA decomposition on GMM-unmixed classes. Can be used when GMM allows separating different symmetries (e.g. different sublattices in graphene)
- Parameters:
n_components_gmm (int) – Number of components for GMM
n_components_pca (int or list of int) – Number of PCA components. Pass a list of integers in order to have a different number of PCA components for each GMM class
covariance_type (str) – Type of covariance (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance
plot_results (bool) – Plotting GMM components
- Return type:
Tuple[ndarray,List]
- Returns:
4-element tuple containing
4D numpy array with GMM “centroids” (averaged images for each GMM class)
List of 4D numpy arrays with PCA components
List with PCA-transformed data
2D numpy array with xy coordinates, GMM-assigned labels, and corresponding frame numbers
- pca_scree_plot(plot_results=True)[source]
Computes and plots PCA ‘scree plot’ (explained variance ratio vs number of components)
- Return type:
ndarray
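The quantity shown on a scree plot, the explained variance ratio per component, can be computed directly from the singular values of the mean-centered data. The following numpy sketch shows that calculation without any plotting; `explained_variance_ratio` is a hypothetical helper, not the library method.

```python
import numpy as np

def explained_variance_ratio(imgstack):
    """Fraction of total variance captured by each principal component."""
    X_vec = imgstack.reshape(imgstack.shape[0], -1)
    X_centered = X_vec - X_vec.mean(axis=0)
    # Singular values come back in descending order
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

rng = np.random.default_rng(0)
imgstack = rng.random((50, 8, 8, 1))
ratio = explained_variance_ratio(imgstack)
print(ratio[:4])
```

A sharp drop ("elbow") in these values is the usual visual cue for choosing the number of components.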
- pca_gmm_scree_plot(n_components_gmm, covariance_type='diag', random_state=1, plot_results=True)[source]
Computes PCA scree plot for each GMM class
- Parameters:
n_components_gmm (int) – Number of components for GMM
covariance_type (str) – Type of covariance (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance
plot_results (bool) – Plotting GMM components and PCA scree plot
- Return type:
List[ndarray]
- Returns:
List with PCA explained variances for each GMM component
- imblock_pca(n_components, random_state=1, plot_results=False, **kwargs)[source]
Computes PCA eigenvectors and their loading maps for a stack of subimages. Intended to be used for finding domains (“blocks”) (e.g. ferroic domains) in a single image.
- Parameters:
n_components (int) – Number of PCA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed eigenvectors and loading maps
**marker_size (int) – Controls marker size for loading maps plot
- Return type:
Tuple[ndarray]
- Returns:
3-element tuple containing
4D numpy array with computed (and reshaped) principal axes
2D numpy array with projection of X_vec (vector with flattened subimages) on the first principal components
2D numpy array with coordinates of each subimage
- imblock_ica(n_components, random_state=1, plot_results=False, **kwargs)[source]
Computes ICA independent sources and their loading maps for a stack of subimages. Intended to be used for finding domains (“blocks”) (e.g. ferroic domains) in a single image.
- Parameters:
n_components (int) – Number of ICA components
random_state (int) – Random state instance
plot_results (bool) – Plots computed independent sources and loading maps
**marker_size (int) – Controls marker size for loading maps plot
- Return type:
Tuple[ndarray]
- Returns:
3-element tuple containing
4D numpy array with computed (and reshaped) independent sources
2D numpy array with recovered sources from X_vec (vector with flattened subimages)
2D numpy array with coordinates of each subimage
- imblock_nmf(n_components, random_state=1, plot_results=False, **kwargs)[source]
Applies NMF to source separation. Computes sources and their loading maps for a stack of subimages. Intended to be used for finding domains (“blocks”) (e.g. ferroic domains) in a single image.
- Parameters:
n_components (int) – Number of NMF components
random_state (int) – Random state instance
plot_results (bool) – Plots computed sources and loading maps
**max_iterations (int) – Maximum number of iterations before timing out
**marker_size (int) – Controls marker size for loading maps plots
- Return type:
Tuple[ndarray]
- Returns:
3-element tuple containing
4D numpy array with computed (and reshaped) sources
2D numpy array with transformed X_vec (vector with flattened subimages) according to the trained NMF model
2D numpy array with coordinates of each subimage
- classmethod plot_decomposition_results(components, X_vec_t, image_hw=None, xy_centers=None, plot_loading_maps=True, **kwargs)[source]
Plots decomposition “eigenvectors” and, optionally, their loading maps
- Parameters:
components (4D numpy array) – Computed (and reshaped) principal axes / independent sources / factorization matrix for stack of subimages
X_vec_t (2D numpy array) – Projection of X_vec on the first principal components / Recovered sources from X_vec / transformed X_vec according to the learned NMF model (is used to create “loading maps”)
image_hw (tuple) – Height and width of the “mother image”
xy_centers (n x 2 numpy array) – (x, y) coordinates of the extracted subimages
plot_loading_maps (bool) – Plots loading maps for each “eigenvector”
**marker_size (int) – Controls marker size for loading maps plots
- Return type:
None
- classmethod get_trajectory(coord_class_dict, start_coord, rmax)[source]
Extracts a trajectory of a single defect/atom from image stack
- Parameters:
coord_class_dict (dict) – Dictionary of atomic coordinates (same format as produced by atomnet.locator)
start_coord (N x 2 numpy array) – Coordinate of defect/atom in the first frame whose trajectory we are going to track
rmax (int) – Max allowed distance (projected on xy plane) between defect in one frame and the position of its nearest neighbor in the next one
- Return type:
Tuple[ndarray]
- Returns:
2-element tuple containing
Numpy array of defect/atom coordinates from a single trajectory
Frames corresponding to this trajectory
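The role of rmax can be illustrated with a simple nearest-neighbor linking loop. This is a minimal sketch rather than the library's implementation: it assumes the dictionary keys are consecutive frame indices starting at 0, and the helper name `link_trajectory` is hypothetical.

```python
import numpy as np

def link_trajectory(coord_class_dict, start_coord, rmax):
    """Follow one atom/defect frame to frame via its nearest neighbor."""
    trajectory = [np.asarray(start_coord, dtype=float)]
    frames = [0]
    for frame in range(1, len(coord_class_dict)):
        xy = coord_class_dict[frame][:, :2]
        if len(xy) == 0:
            break
        distances = np.linalg.norm(xy - trajectory[-1], axis=1)
        nearest = np.argmin(distances)
        # Stop once the closest candidate is farther away than rmax
        if distances[nearest] > rmax:
            break
        trajectory.append(xy[nearest])
        frames.append(frame)
    return np.array(trajectory), np.array(frames)

coordinates = {
    0: np.array([[10.0, 10.0, 0], [40.0, 40.0, 0]]),
    1: np.array([[11.0, 10.5, 0], [41.0, 39.0, 0]]),
    2: np.array([[12.5, 11.0, 0], [42.0, 38.0, 0]]),
    3: np.array([[30.0, 30.0, 0], [43.0, 37.0, 0]]),
}
traj, frames = link_trajectory(coordinates, start_coord=(10.0, 10.0), rmax=5)
print(frames)       # [0 1 2]
print(traj.shape)   # (3, 2)
```

The trajectory ends at frame 2 because the nearest candidate in frame 3 lies more than rmax away.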
- get_all_trajectories(min_length=0, run_gmm=False, rmax=10, **kwargs)[source]
Extracts trajectories for the detected defects starting from the first frame. Applies (optionally) Gaussian mixture model to a stack of local descriptors (subimages).
- Parameters:
min_length (int) – Minimal length of trajectory to return
run_gmm (bool) – Optional GMM separation into different classes
rmax (int) – Max allowed distance (projected on xy plane) between defect in one frame and the position of its nearest neighbor in the next one
**n_components (int) – Number of components for Gaussian mixture model
**covariance (str) – Type of covariance for Gaussian mixture model (‘full’, ‘diag’, ‘tied’, ‘spherical’)
**random_state (int) – Random state instance for Gaussian mixture model
- Return type:
Dict
- Returns:
Python dictionary containing
list of numpy arrays with defects/atoms trajectories (“trajectories”)
list of frames corresponding to the extracted trajectories (“frames”)
GMM components when run_gmm=True (“gmm_components”)
- classmethod renumerate_classes(classes)[source]
Helper function for renumbering Gaussian mixture model classes for Markov transition analysis
- Return type:
ndarray
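One plausible reading of this helper is that it maps the class labels seen along a trajectory onto consecutive integers starting at 0, so that they can index a transition matrix. The sketch below shows that idea with numpy; the function name and the exact behavior are assumptions, not taken from the library source.

```python
import numpy as np

def renumber_classes(classes):
    """Map arbitrary integer labels onto 0..k-1, preserving their order."""
    # return_inverse gives, for each label, its index among the sorted unique labels
    _, renumbered = np.unique(classes, return_inverse=True)
    return renumbered

labels = np.array([3, 7, 3, 9, 7])
print(renumber_classes(labels))  # [0 1 0 2 1]
```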
- transition_matrix(n_components, covariance='diag', random_state=1, rmax=10, min_length=0, sum_all_transitions=False)[source]
Applies Gaussian mixture model to a stack of local descriptors (subimages). Extracts trajectories for the detected defects starting from the first frame. Calculates transition probability for each trajectory.
- Parameters:
n_components (int) – Number of components for Gaussian mixture model
covariance (str) – Type of covariance for Gaussian mixture model (‘full’, ‘diag’, ‘tied’, ‘spherical’)
random_state (int) – Random state instance for Gaussian mixture model
rmax (int) – Max allowed distance (projected on xy plane) between defect in one frame and the position of its nearest neighbor in the next one
min_length (int) – Minimal length of trajectory to return
- Return type:
Dict
- Returns:
Python dictionary containing
List of defects/atoms trajectories (“trajectories”)
List of transition matrices for each trajectory (“transitions”)
List of frames corresponding to the extracted trajectories (“frames”)
GMM components as images (“gmm_components”)
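A transition matrix for a single trajectory can be estimated by counting consecutive pairs of class labels and normalizing each row. The following numpy sketch shows that calculation on a hand-written label sequence. It is an illustration of the general technique, and `transition_probabilities` is a hypothetical helper rather than the library's routine.

```python
import numpy as np

def transition_probabilities(states, n_states):
    """Row-normalized counts of transitions between consecutive states."""
    counts = np.zeros((n_states, n_states))
    for current, following in zip(states[:-1], states[1:]):
        counts[current, following] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that are never left stay all-zero instead of dividing by zero
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

states = np.array([0, 0, 1, 0, 1, 1, 2])   # class label in each frame of one trajectory
P = transition_probabilities(states, n_states=3)
print(P)
```

Entry `P[i, j]` estimates the probability of observing class j in the frame that follows class i.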