pyls.behavioral_pls

pyls.behavioral_pls(X, Y, *, groups=None, n_cond=1, n_perm=5000, n_boot=5000, n_split=0, test_size=0.25, test_split=100, covariance=False, rotate=True, ci=95, permsamples=None, bootsamples=None, seed=None, verbose=True, n_proc=None, **kwargs)

Performs behavioral PLS on X and Y.

Behavioral PLS is a multivariate statistical approach that relates two sets of variables to each other. Traditionally, one of these arrays represents a set of brain features (e.g., functional connectivity estimates) and the other represents a set of behavioral variables; however, these arrays can be any two sets of features belonging to a common group of samples.

Using a singular value decomposition, behavioral PLS attempts to find linear combinations of features from the provided arrays that maximally covary with each other. The decomposition is performed on the cross-covariance matrix \(R\), where \(R = Y^{T} \times X\), which represents the covariation of all the input features across samples.
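The decomposition described above can be sketched with plain NumPy. This is an illustrative sketch only, not the pyls implementation: the toy data, the `zscore` helper, and the variable names are assumptions, and the columns are z-scored so that \(R\) is a cross-correlation matrix, in line with the documented default `covariance=False`.

```python
import numpy as np

rng = np.random.default_rng(0)
S, B, T = 50, 8, 3                      # samples, X features, Y features (toy sizes)
X = rng.standard_normal((S, B))
Y = rng.standard_normal((S, T))


def zscore(a):
    """Column-wise z-score using the sample standard deviation (hypothetical helper)."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


Xz, Yz = zscore(X), zscore(Y)

# Cross-correlation matrix R = Y^T X, shape (T, B)
R = Yz.T @ Xz / (S - 1)

# Singular value decomposition: R = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Project each data matrix onto its singular vectors to get per-sample scores
x_scores = Xz @ Vt.T                    # shape (S, n_lv)
y_scores = Yz @ U                       # shape (S, n_lv)
```

In this sketch the covariance between the i-th pair of score columns equals the i-th singular value, which is the sense in which the linear combinations "maximally covary".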

Parameters
  • X ((S, B) array_like) – Input data matrix, where S is samples and B is features

  • Y ((S, T) array_like) – Input data matrix, where S is samples and T is features

  • groups ((G,) list of int) – List with the number of subjects present in each of G groups. Input data should be organized as subjects within groups (i.e., groups should be vertically stacked). If there is only one group this can be left blank.

  • n_cond (int) – Number of conditions observed in data. Note that all subjects must have the same number of conditions. If both conditions and groups are present then the input data should be organized as subjects within conditions within groups (i.e., g1c1s[1-S], g1c2s[1-S], g2c1s[1-S], g2c2s[1-S]).

  • n_perm (int, optional) – Number of permutations to use for testing significance of components. Default: 5000

  • n_boot (int, optional) – Number of bootstraps to use for testing reliability of data features. Default: 5000

  • n_split (int, optional) – Number of split-half resamples to assess during permutation testing. Default: 0

  • test_split (int, optional) – Number of splits for generating test sets during cross-validation. Default: 100

  • test_size ([0, 1) float, optional) – Proportion of data to partition to test set during cross-validation. Default: 0.25

  • covariance (bool, optional) – Whether to use the cross-covariance matrix instead of the cross-correlation matrix during the decomposition. Only set this if you are sure it is what you want, as many of the results may become more difficult to interpret (e.g., behavcorr will no longer be interpretable as Pearson correlation values). Default: False

  • rotate (bool, optional) – Whether to perform Procrustes rotations during permutation testing. Can inflate false-positive rates; see Kovacevic et al. (2013) for more information. Default: True

  • ci ([0, 100] float, optional) – Confidence interval to use for assessing bootstrap results. This roughly corresponds to an alpha rate; e.g., the 95%ile CI is approximately equivalent to a two-tailed p <= 0.05. Default: 95

  • permsamples (array_like, optional) – Resampling array to be used during permutation testing (if n_perm > 0). If not specified, a set of unique permutations will be generated. Default: None

  • bootsamples (array_like, optional) – Resampling array to be used during bootstrap resampling (if n_boot > 0). If not specified, a set of unique bootstraps will be generated. Default: None

  • seed ({int, numpy.random.RandomState, None}, optional) – Seed to use for random number generation. Helps ensure reproducibility of results. Default: None

  • verbose (bool, optional) – Whether to show progress bars as the analysis runs. Note that progress bars will not persist after the analysis is completed. Default: True

  • n_proc (int, optional) – Number of processes to use for parallelizing permutation testing and bootstrap resampling. If not specified, processing is serial (i.e., one processor). Can optionally be set to 'max' to use all available processors. Default: None
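The row ordering required by groups and n_cond (subjects within conditions within groups) is easy to get wrong, so a small sketch may help. The `stack_inputs` helper, the nested `data[g][c]` layout, and the toy sizes below are hypothetical and not part of pyls; only the keyword arguments passed to `pyls.behavioral_pls` come from the signature documented above. The wrapper imports pyls lazily and is defined but not called here.

```python
import numpy as np


def stack_inputs(data):
    """Stack blocks as subjects within conditions within groups.

    `data[g][c]` is assumed to be an (n_subjects_in_group_g, n_features)
    array holding group g, condition c (hypothetical layout).
    """
    groups = [len(conds[0]) for conds in data]   # subjects per group
    n_cond = len(data[0])                        # conditions per subject
    stacked = np.vstack([block for conds in data for block in conds])
    return stacked, groups, n_cond


# Toy data: group 1 has 4 subjects, group 2 has 3; 2 conditions; 5 features.
# Every block is filled with a label (11 = g1c1, 12 = g1c2, ...) so the
# resulting row order is easy to inspect.
sizes = [4, 3]
data = [[np.full((n, 5), 10 * (g + 1) + (c + 1)) for c in range(2)]
        for g, n in enumerate(sizes)]
X, groups, n_cond = stack_inputs(data)


def run_behavioral_pls(X, Y, groups, n_cond):
    """Call pyls with the stacked inputs (requires pyls to be installed)."""
    import pyls
    return pyls.behavioral_pls(X, Y, groups=groups, n_cond=n_cond,
                               n_perm=100, n_boot=100, seed=1234,
                               verbose=False)
```

Here `X` has `sum(groups) * n_cond` rows (14 in this toy case), and `Y` would need to be stacked in exactly the same order before both are passed to `run_behavioral_pls`.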

Returns

results – Dictionary-like object containing results from the PLS analysis

Return type

pyls.structures.PLSResults

Notes

The singular value decomposition generates mutually orthogonal latent variables (LVs), each composed of a left and a right singular vector and a singular value. The i-th pair of singular vectors details the contributions of individual input features to an overall, multivariate pattern (the i-th LV), and the i-th singular value reflects the amount of covariance captured by that pattern.
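As a rough illustration of these properties, the sketch below decomposes a random stand-in matrix and computes the share of cross-block covariance attributable to each LV as the squared singular value over the sum of squared singular values, a convention commonly used in the PLS literature. The matrix `R` and the name `cov_explained` are assumptions for illustration and are not claimed to match attributes of the pyls results object.

```python
import numpy as np

rng = np.random.default_rng(1)
R = rng.standard_normal((3, 8))         # stand-in for a (T, B) cross-covariance matrix

U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Share of cross-block covariance attributable to each latent variable
cov_explained = s ** 2 / np.sum(s ** 2)

# The singular vectors are mutually orthogonal (orthonormal, in fact)
left_gram = U.T @ U                     # approximately the identity matrix
right_gram = Vt @ Vt.T                  # approximately the identity matrix

# Summing the contributions of all LVs recovers the original matrix
R_hat = (U * s) @ Vt
```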

Statistical significance of the LVs is determined via permutation testing. Bootstrap resampling is used to examine the contribution and reliability of the input features to each LV. Split-half resampling can optionally be used to assess the reliability of the LVs. A cross-validated framework can optionally be used to examine how accurate the decomposition is when employed in a predictive framework.
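The permutation test mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the pyls procedure: it shuffles the rows of Y, omits the Procrustes rotation controlled by rotate, ignores groups and conditions, and uses a (count + 1) / (n_perm + 1) p-value convention. The function names and toy data are hypothetical.

```python
import numpy as np


def singular_values(X, Y):
    """Singular values of the cross-correlation matrix between Y and X."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    R = Yz.T @ Xz / (len(X) - 1)
    return np.linalg.svd(R, compute_uv=False)


def permutation_pvalues(X, Y, n_perm=500, seed=None):
    """Permutation p-value for each LV, based on its singular value."""
    rng = np.random.default_rng(seed)
    observed = singular_values(X, Y)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        # Shuffling the rows of Y breaks the sample-wise pairing with X
        perm = rng.permutation(len(Y))
        exceed += singular_values(X, Y[perm]) >= observed
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data in which X and Y share one latent signal
rng = np.random.default_rng(42)
latent = rng.standard_normal(60)
X = latent[:, None] + 0.5 * rng.standard_normal((60, 6))
Y = latent[:, None] + 0.5 * rng.standard_normal((60, 3))

observed, pvals = permutation_pvalues(X, Y, n_perm=500, seed=0)
```

Because the toy data share a strong common signal, the first LV should receive a small p-value; bootstrap resampling of feature weights additionally requires aligning the resampled singular vectors with the originals and is not shown.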

References

McIntosh, A. R., Bookstein, F. L., Haxby, J. V., & Grady, C. L. (1996). Spatial pattern analysis of functional brain images using partial least squares. NeuroImage, 3(3), 143-157.

McIntosh, A. R., & Lobaugh, N. J. (2004). Partial least squares analysis of neuroimaging data: applications and advances. NeuroImage, 23, S250-S263.

Krishnan, A., Williams, L. J., McIntosh, A. R., & Abdi, H. (2011). Partial Least Squares (PLS) methods for neuroimaging: a tutorial and review. NeuroImage, 56(2), 455-475.

Kovacevic, N., Abdi, H., Beaton, D., & McIntosh, A. R. (2013). Revisiting PLS resampling: comparing significance versus reliability across range of simulations. In New Perspectives in Partial Least Squares and Related Methods (pp. 159-170). Springer, New York, NY.

Misic, B., Betzel, R. F., de Reus, M. A., van den Heuvel, M. P., Berman, M. G., McIntosh, A. R., & Sporns, O. (2016). Network level structure-function relationships in human neocortex. Cerebral Cortex, 26, 3285-3296.