AVS 57th International Symposium & Exhibition
    Applied Surface Science Thursday Sessions
       Session AS1-ThM

Invited Paper AS1-ThM3
Strategies for Multivariate Analysis of Very Large Spectral Images

Thursday, October 21, 2010, 8:40 am, Room Cochiti

Session: Advanced Automation and Data Processing
Presenter: M.R. Keenan, Consultant

The sizes of spectral image data sets, always large, are becoming truly huge with modern spectral imaging techniques. Taking ToF-SIMS as one example, image depth profiling can yield data sets comprising several million individual mass spectra arrayed in three spatial dimensions. Spectral complexity is also increasing, particularly in biological applications where more mass channels and higher spectral resolution are required to separate and identify the species of interest. The tools of multivariate statistical analysis (MVA) have proven valuable aids to interpreting complex, high-dimensional data. Given the realities of huge data sets, however, straightforward application of these techniques strains the computing resources available in the typical analytical laboratory. In this paper, we propose a two-stage strategy for multivariate analysis of very large spectral images. In the extraction phase, we seek to efficiently distill the chemical information contained in the data into a minimum number of components that describe the spatial and spectral characteristics of the species making up the sample. Principal Component Analysis (PCA) of data suitably preprocessed to account for non-uniform noise is the maximally parsimonious method for extracting information. Techniques for exploiting characteristics of the raw data, such as sparsity, and approaches to estimating the noise covariance on-the-fly can make order-of-magnitude computational improvements in PCA. Owing to the physically irrelevant constraints imposed on the principal components, however, they are notoriously abstract in appearance and difficult to interpret. In the second, or interpretive, stage of MVA, we will perform rotations or transformations of the principal components that are inspired by physically meaningful sample or spectral features such as component non-negativity, sparsity, independence and simplicity.
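
The extraction stage described above can be illustrated with a minimal sketch. This is not the presenter's implementation. It assumes Poisson (counting) noise, for which one common way to account for non-uniform noise is to scale each pixel by the inverse square root of its mean count and each channel by the inverse square root of its mean count. The function name `poisson_scaled_pca`, the dense NumPy array input, and the choice to diagonalize the small channel-by-channel cross-product matrix are all assumptions made for illustration.

```python
import numpy as np

def poisson_scaled_pca(D, n_components):
    """PCA of count data after an assumed Poisson noise scaling.

    D is an (n_pixels, n_channels) array of non-negative counts.
    Returns scores (n_pixels, k), loadings (n_channels, k) and the
    k largest eigenvalues of the scaled cross-product matrix.
    """
    D = np.asarray(D, dtype=float)

    # Scaling factors: inverse square roots of the mean image and
    # mean spectrum. Empty rows or columns are left unscaled.
    row_mean = D.mean(axis=1)
    col_mean = D.mean(axis=0)
    g = 1.0 / np.sqrt(np.where(row_mean > 0, row_mean, 1.0))
    h = 1.0 / np.sqrt(np.where(col_mean > 0, col_mean, 1.0))
    Ds = (D * g[:, None]) * h[None, :]

    # With far more pixels than channels, the channel-by-channel
    # cross-product matrix is small and cheap to diagonalize.
    C = Ds.T @ Ds
    evals, evecs = np.linalg.eigh(C)            # ascending order
    order = np.argsort(evals)[::-1][:n_components]
    P = evecs[:, order]                         # loadings in scaled space
    T = Ds @ P                                  # scores in scaled space

    # Undo the scaling so that scores @ loadings.T approximates D.
    scores = T / g[:, None]
    loadings = P / h[:, None]
    return scores, loadings, evals[order]
```

Because only the cross-product matrix is decomposed, the cost of the eigenvalue step depends on the number of channels rather than the number of pixels. A sparse representation of `D` could reduce the cost of forming that matrix further, but it is not shown here.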
Abstract factor rotations, such as the Varimax method, are time-honored tools in Factor Analysis, but appear to be underutilized in chemometrics. In this talk, we will discuss a general and rapid method for performing factor rotations based on arbitrary optimization criteria. Besides making a connection between factor rotation and seemingly disparate techniques such as Independent Component Analysis (ICA) and Maximum Autocorrelation Factors (MAF), we will present several novel rotations that have potential use in spectral image analysis. An important point here is that the rotations entail relatively low computational cost, allowing us to examine our results from multiple points of view with an eye toward finding representations that best help us solve the chemical problem at hand.
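
For readers unfamiliar with Varimax, the following is a minimal sketch of the classical pairwise (Kaiser-style) scheme, in which each pair of factors is rotated in its plane by a closed-form angle. It is not the general rotation method discussed in the talk, whose details the abstract does not give. The function names `varimax` and `varimax_criterion`, the raw (unnormalized) criterion, and the stopping rule are assumptions made for illustration.

```python
import numpy as np

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.var(L**2, axis=0).sum())

def varimax(L, max_sweeps=50, tol=1e-10):
    """Rotate the columns of L (n_variables, k) toward simple structure.

    Returns the rotated loadings and the orthogonal matrix R such
    that rotated = L @ R.
    """
    L = np.array(L, dtype=float)        # work on a copy
    p, k = L.shape
    R = np.eye(k)
    for _ in range(max_sweeps):
        largest = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                x, y = L[:, i], L[:, j]
                u = x**2 - y**2
                v = 2.0 * x * y
                A, B = u.sum(), v.sum()
                # Angle that maximizes the criterion for this pair.
                num = 2.0 * (u @ v) - 2.0 * A * B / p
                den = (u @ u - v @ v) - (A**2 - B**2) / p
                phi = 0.25 * np.arctan2(num, den)
                largest = max(largest, abs(phi))
                c, s = np.cos(phi), np.sin(phi)
                G = np.array([[c, -s], [s, c]])
                # Apply the same planar rotation to loadings and to R.
                L[:, [i, j]] = L[:, [i, j]] @ G
                R[:, [i, j]] = R[:, [i, j]] @ G
        if largest < tol:               # no pair moved appreciably
            break
    return L, R
```

Each sweep costs on the order of the number of variables times the square of the number of factors, and it operates on the retained components rather than on the raw data. This is consistent with the low computational cost noted above, and it makes it practical to try several criteria on the same set of components.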