pyls.meancentered_pls

pyls.meancentered_pls(X, *, groups=None, n_cond=1, mean_centering=0, n_perm=5000, n_boot=5000, n_split=0, rotate=True, ci=95, permsamples=None, bootsamples=None, seed=None, verbose=True, n_proc=None, **kwargs)

Performs mean-centered PLS on X, sorted into groups and conditions.

Mean-centered PLS is a multivariate statistical approach that attempts to find sets of variables in a matrix which maximally discriminate between subgroups within the matrix.

While it carries the name PLS, mean-centered PLS is more closely related to principal components analysis than it is to pyls.behavioral_pls. In contrast to behavioral PLS, mean-centered PLS does not construct a cross-covariance matrix. Instead, it operates by averaging the provided data (X) within groups and/or conditions. The resultant matrix \(M\) is mean-centered, generating a new matrix \(R_{mean\_centered}\) which is submitted to singular value decomposition.

Parameters
  • X ((S, B) array_like) – Input data matrix, where S is samples and B is features

  • groups ((G,) list of int, optional) – List with the number of subjects present in each of G groups. Input data should be organized as subjects within groups (i.e., groups should be vertically stacked). If there is only one group this can be left blank. Default: None

  • n_cond (int, optional) – Number of conditions observed in data. Note that all subjects must have the same number of conditions. If both conditions and groups are present then the input data should be organized as subjects within conditions within groups (i.e., g1c1s[1-S], g1c2s[1-S], g2c1s[1-S], g2c2s[1-S]). Default: 1

  • mean_centering ({0, 1, 2}, optional) – Mean-centering method to use. This will determine how the mean-centered matrix is generated and what effects are “boosted” during the SVD. Default: 0

  • n_perm (int, optional) – Number of permutations to use for testing significance of components. Default: 5000

  • n_boot (int, optional) – Number of bootstraps to use for testing reliability of data features. Default: 5000

  • n_split (int, optional) – Number of split-half resamples to assess during permutation testing. Default: 0

  • rotate (bool, optional) – Whether to perform Procrustes rotations during permutation testing. Can inflate false-positive rates; see Kovacevic et al. (2013) for more information. Default: True

  • ci ([0, 100] float, optional) – Confidence interval to use for assessing bootstrap results. This roughly corresponds to an alpha rate; e.g., the 95% CI is approximately equivalent to a two-tailed p <= 0.05. Default: 95

  • permsamples (array_like, optional) – Resampling array to be used during permutation testing (if n_perm > 0). If not specified, a set of unique permutations will be generated. Default: None

  • bootsamples (array_like, optional) – Resampling array to be used during bootstrap resampling (if n_boot > 0). If not specified, a set of unique bootstraps will be generated. Default: None

  • seed ({int, numpy.random.RandomState, None}, optional) – Seed to use for random number generation. Helps ensure reproducibility of results. Default: None

  • verbose (bool, optional) – Whether to show progress bars as the analysis runs. Note that progress bars will not persist after the analysis is completed. Default: True

  • n_proc (int or ‘max’, optional) – How many processes to use for parallelizing permutation testing and bootstrap resampling. If not specified, processing is serial (i.e., one processor). Specify ‘max’ to use all available processors. Default: None
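
The row ordering that groups and n_cond expect (subjects within conditions within groups) can be made concrete with a small NumPy sketch. The helper name row_labels, the group sizes, and the feature count are hypothetical and chosen only for illustration; this is not code from pyls.

```python
import numpy as np


def row_labels(groups, n_cond):
    """Return a (group, condition, subject) label for each row of X.

    Assumes rows are stacked as subjects within conditions within
    groups, as described for the `groups` and `n_cond` parameters.
    """
    labels = []
    for g, n_subj in enumerate(groups):
        for c in range(n_cond):
            for s in range(n_subj):
                labels.append((g, c, s))
    return np.array(labels)


groups = [3, 2]  # hypothetical: 3 subjects in group 1, 2 in group 2
n_cond = 2       # every subject is observed in both conditions
labels = row_labels(groups, n_cond)

# X needs one row per subject-by-condition observation, in that order
n_rows = sum(groups) * n_cond
X = np.random.RandomState(0).rand(n_rows, 4)  # 4 made-up features
```

With these sizes, rows 0-2 hold group 1 / condition 1, rows 3-5 hold group 1 / condition 2, rows 6-7 hold group 2 / condition 1, and rows 8-9 hold group 2 / condition 2.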

Returns

results – Dictionary-like object containing results from the PLS analysis

Return type

pyls.structures.PLSResults

Notes

The provided mean_centering argument can be changed to highlight or “boost” potential group / condition differences by modifying how \(R_{mean\_centered}\) is generated:

  • mean_centering=0 will remove group means collapsed across conditions, emphasizing potential differences between conditions while removing overall group differences

  • mean_centering=1 will remove condition means collapsed across groups, emphasizing potential differences between groups while removing overall condition differences

  • mean_centering=2 will remove the grand mean collapsed across both groups and conditions, permitting investigation of the full spectrum of potential group and condition effects.
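
The three options can be sketched in NumPy as operations on a matrix of group-by-condition means. This is a minimal illustration rather than the pyls implementation: the helper names cell_means and mean_center are hypothetical, and the sketch assumes unweighted averages of the cell means, which may differ from how pyls handles unequal group sizes.

```python
import numpy as np


def cell_means(X, groups, n_cond):
    """Average rows of X within each group/condition cell -> (G, C, B).

    Assumes rows are ordered subjects within conditions within groups.
    """
    X = np.asarray(X, dtype=float)
    means = np.empty((len(groups), n_cond, X.shape[1]))
    start = 0
    for g, n_subj in enumerate(groups):
        for c in range(n_cond):
            means[g, c] = X[start:start + n_subj].mean(axis=0)
            start += n_subj
    return means


def mean_center(M, mean_centering=0):
    """Center the (G, C, B) cell means and flatten to (G * C, B)."""
    if mean_centering == 0:
        # remove each group's mean, collapsed across conditions
        centered = M - M.mean(axis=1, keepdims=True)
    elif mean_centering == 1:
        # remove each condition's mean, collapsed across groups
        centered = M - M.mean(axis=0, keepdims=True)
    elif mean_centering == 2:
        # remove the grand mean across groups and conditions
        centered = M - M.mean(axis=(0, 1), keepdims=True)
    else:
        raise ValueError("mean_centering must be 0, 1, or 2")
    return centered.reshape(-1, M.shape[-1])


groups, n_cond = [3, 2], 2                    # made-up design
X = np.random.RandomState(0).rand(10, 4)      # 10 rows, 4 features
M = cell_means(X, groups, n_cond)
R0 = mean_center(M, mean_centering=0)
R1 = mean_center(M, mean_centering=1)
R2 = mean_center(M, mean_centering=2)
```

Each result has one row per group/condition cell; only the quantity subtracted from the cell means changes between options.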

The singular value decomposition generates mutually orthogonal latent variables (LVs), composed of left and right singular vectors and a diagonal matrix of singular values. The i-th pair of singular vectors details the contributions of individual input features to an overall, multivariate pattern (the i-th LV), and the i-th singular value reflects the amount of variance captured by that pattern.
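
As a rough illustration of the decomposition step, the sketch below applies numpy.linalg.svd to a small, made-up centered matrix. The matrix R, its orientation, and the variable names are assumptions for the example; the proportion of variance per LV is computed with the common convention of normalizing the squared singular values.

```python
import numpy as np

rng = np.random.RandomState(1234)
R = rng.rand(4, 6)                      # made-up: 4 cells by 6 features
R = R - R.mean(axis=0, keepdims=True)   # center each feature

# Thin SVD: R = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Proportion of variance captured by each latent variable
varexp = s ** 2 / np.sum(s ** 2)

# Singular vectors are mutually orthogonal (orthonormal columns/rows)
left_ok = np.allclose(U.T @ U, np.eye(len(s)))
right_ok = np.allclose(Vt @ Vt.T, np.eye(len(s)))
```

Because the singular values are returned in descending order, the first LV always captures the largest share of variance.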

Statistical significance of the LVs is determined via permutation testing. Bootstrap resampling is used to examine the contribution and reliability of the input features to each LV. Split-half resampling can optionally be used to assess the reliability of the LVs. A cross-validated framework can optionally be used to examine how accurate the decomposition is when employed in a predictive framework.
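
The logic of the permutation test can be sketched as follows for a design with two groups and one condition. This is a simplified illustration under stated assumptions, not the pyls procedure: the function names are hypothetical, rows are shuffled freely across groups, grand-mean centering is used, and no Procrustes rotation or split-half resampling is performed.

```python
import numpy as np


def singular_values(X, groups):
    """Singular values of the grand-mean-centered matrix of group means."""
    bounds = np.cumsum([0] + list(groups))
    M = np.vstack([X[a:b].mean(axis=0)
                   for a, b in zip(bounds[:-1], bounds[1:])])
    R = M - M.mean(axis=0, keepdims=True)
    return np.linalg.svd(R, compute_uv=False)


def permutation_pvals(X, groups, n_perm=200, seed=None):
    """Share of permutations whose singular values meet or exceed the observed."""
    rng = np.random.RandomState(seed)
    observed = singular_values(X, groups)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        # shuffle rows, breaking the link between data and group labels
        permuted = X[rng.permutation(len(X))]
        exceed += singular_values(permuted, groups) >= observed
    # add one to numerator and denominator so p is never exactly zero
    return (exceed + 1) / (n_perm + 1)


rng = np.random.RandomState(0)
groups = [10, 10]
X = rng.randn(20, 5)
X[:10] += 2.0   # made-up, deliberately strong group difference
pvals = permutation_pvals(X, groups, n_perm=200, seed=1)
```

With two groups only the first singular value is meaningful (the centered matrix of group means has rank one), so only pvals[0] should be interpreted in this toy example.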

References

McIntosh, A. R., Bookstein, F. L., Haxby, J. V., & Grady, C. L. (1996). Spatial pattern analysis of functional brain images using partial least squares. NeuroImage, 3(3), 143-157.

McIntosh, A. R., & Lobaugh, N. J. (2004). Partial least squares analysis of neuroimaging data: applications and advances. NeuroImage, 23, S250-S263.

Krishnan, A., Williams, L. J., McIntosh, A. R., & Abdi, H. (2011). Partial Least Squares (PLS) methods for neuroimaging: a tutorial and review. NeuroImage, 56(2), 455-475.

Kovacevic, N., Abdi, H., Beaton, D., & McIntosh, A. R. (2013). Revisiting PLS resampling: comparing significance versus reliability across range of simulations. In New Perspectives in Partial Least Squares and Related Methods (pp. 159-170). Springer, New York, NY.