pyls.behavioral_pls

pyls.behavioral_pls(X, Y, *, groups=None, n_cond=1, n_perm=5000, n_boot=5000, n_split=0, test_size=0.25, test_split=100, covariance=False, rotate=True, ci=95, permsamples=None, bootsamples=None, seed=None, verbose=True, n_proc=None, **kwargs)

Performs behavioral PLS on X and Y.

Behavioral PLS is a multivariate statistical approach that relates two sets of variables together. Traditionally, one of these arrays represents a set of brain features (e.g., functional connectivity estimates) and the other represents a set of behavioral variables; however, these arrays can be any two sets of features belonging to a common group of samples.

Using a singular value decomposition, behavioral PLS attempts to find linear combinations of features from the provided arrays that maximally covary with each other. The decomposition is performed on the cross-covariance matrix \(R\), where \(R = Y^{T} \times X\), which represents the covariation of all the input features across samples.
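The decomposition described above can be sketched with NumPy alone. This is an illustrative sketch, not the pyls implementation: the random data, the `zscore` helper, and the division by S - 1 are assumptions made for the example, and the matrix follows the \(R = Y^{T} \times X\) orientation given here.

```python
import numpy as np

rng = np.random.default_rng(0)
S, B, T = 50, 10, 3  # samples, X features, Y features (arbitrary sizes)
X = rng.standard_normal((S, B))
Y = rng.standard_normal((S, T))


def zscore(a):
    """Column-wise z-score using the sample standard deviation."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


Xz, Yz = zscore(X), zscore(Y)

# Cross-correlation matrix, shape (T, B); dividing by S - 1 makes the
# entries Pearson correlations because both arrays were z-scored
R = (Yz.T @ Xz) / (S - 1)

# SVD: columns of U weight the Y features, rows of Vt weight the X features
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Project the samples onto the first pair of singular vectors
x_scores = Xz @ Vt[0]
y_scores = Yz @ U[:, 0]

# The covariance between the two sets of scores equals the first singular value
cov_lv1 = (x_scores @ y_scores) / (S - 1)
```

The last line shows what "maximally covary" means in practice: no other pair of unit-norm weight vectors yields scores with a larger covariance than `s[0]`.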

Parameters

X ((S, B) array_like) – Input data matrix, where S is samples and B is features
Y ((S, T) array_like) – Input data matrix, where S is samples and T is features
groups ((G,) list of int, optional) – List with the number of subjects present in each of G groups. Input data should be organized as subjects within groups (i.e., groups should be vertically stacked). If there is only one group this can be left unspecified. Default: None
n_cond (int, optional) – Number of conditions observed in data. Note that all subjects must have the same number of conditions. If both conditions and groups are present then the input data should be organized as subjects within conditions within groups (i.e., g1c1s[1-S], g1c2s[1-S], g2c1s[1-S], g2c2s[1-S]). Default: 1
n_perm (int, optional) – Number of permutations to use for testing significance of components. Default: 5000
n_boot (int, optional) – Number of bootstraps to use for testing reliability of data features. Default: 5000
n_split (int, optional) – Number of split-half resamples to assess during permutation testing. Default: 0
test_split (int, optional) – Number of splits for generating test sets during cross-validation. Default: 100
test_size ([0, 1) float, optional) – Proportion of data to partition to test set during cross-validation. Default: 0.25
covariance (bool, optional) – Whether to use the cross-covariance matrix instead of the cross-correlation matrix during the decomposition. Only set this if you are sure it is what you want, as many of the results may become more difficult to interpret (i.e., behavcorr will no longer be interpretable as Pearson correlation values). Default: False
rotate (bool, optional) – Whether to perform Procrustes rotations during permutation testing. Can inflate false-positive rates; see Kovacevic et al. (2013) for more information. Default: True
ci ([0, 100] float, optional) – Confidence interval to use for assessing bootstrap results. This roughly corresponds to an alpha rate; e.g., a 95% CI is approximately equivalent to a two-tailed p <= 0.05. Default: 95
permsamples (array_like, optional) – Resampling array to be used during permutation testing (if n_perm > 0). If not specified, a set of unique permutations will be generated. Default: None
bootsamples (array_like, optional) – Resampling array to be used during bootstrap resampling (if n_boot > 0). If not specified, a set of unique bootstraps will be generated. Default: None
seed ({int, numpy.random.RandomState, None}, optional) – Seed to use for random number generation. Helps ensure reproducibility of results. Default: None
verbose (bool, optional) – Whether to show progress bars as the analysis runs. Note that progress bars will not persist after the analysis is completed. Default: True
n_proc (int, optional) – Number of processes to use for parallelizing permutation testing and bootstrap resampling. If not specified, processing is serial (i.e., one processor). Can optionally be set to 'max' to use all available processors. Default: None
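The row ordering expected when both groups and n_cond are used can be easy to get wrong, so a small sketch may help. The design below (two groups of 3 and 2 subjects, two conditions, four features) and the label strings are hypothetical; only the stacking order follows the description of groups and n_cond above.

```python
import numpy as np

# Hypothetical design: 2 groups (3 and 2 subjects), 2 conditions, 4 features
groups = [3, 2]
n_cond = 2
n_features = 4

# One label per row, ordered as subjects within conditions within groups
rows = []
for g, n_subj in enumerate(groups, start=1):
    for c in range(1, n_cond + 1):
        for s in range(1, n_subj + 1):
            rows.append(f"g{g}c{c}s{s}")

# Placeholder data with one row per (group, condition, subject) observation
X = np.zeros((len(rows), n_features))

# The number of rows should equal the total subject count times n_cond
expected_rows = sum(groups) * n_cond
```

Under this layout, `groups` and `n_cond` would be passed as shown and the input arrays would need `sum(groups) * n_cond` rows in the order listed in `rows`.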

Returns

results – Dictionary-like object containing results from the PLS analysis

Return type

pyls.structures.PLSResults

Notes

The singular value decomposition generates mutually orthogonal latent variables (LVs), each composed of a left and a right singular vector and a corresponding singular value. The i-th pair of singular vectors details the contributions of individual input features to an overall, multivariate pattern (the i-th LV), and the singular values explain the amount of variance captured by that pattern.
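As a rough illustration of these properties, the sketch below decomposes a random stand-in matrix and checks the orthogonality of the singular vectors. Expressing each LV's contribution as its squared singular value divided by the sum of squared singular values is one common convention and is assumed here; it is not taken from this page.

```python
import numpy as np

rng = np.random.default_rng(1)
R = rng.standard_normal((3, 10))  # stand-in for a (T, B) cross-covariance matrix

U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Singular vectors are orthonormal, so the latent variables are mutually
# orthogonal: both Gram matrices should be (numerically) the identity
left_gram = U.T @ U
right_gram = Vt @ Vt.T

# Assumed convention: share of the total squared singular values per LV
varexp = s**2 / np.sum(s**2)
```

Because NumPy returns singular values in descending order, `varexp` is non-increasing and sums to one.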

Statistical significance of the LVs is determined via permutation testing. Bootstrap resampling is used to examine the contribution and reliability of the input features to each LV. Split-half resampling can optionally be used to assess the reliability of the LVs. A cross-validated framework can optionally be used to examine how accurate the decomposition is when employed in a predictive framework.
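The permutation test can be sketched in a simplified form. The version below shuffles the rows of Y, recomputes the singular values, and counts how often the permuted values reach the observed ones. It is a minimal sketch under stated assumptions, not the pyls procedure: it omits Procrustes rotation, uses the cross-covariance of centered data, and the helper names and the (count + 1) / (n_perm + 1) p-value convention are choices made for this example.

```python
import numpy as np


def singular_values(X, Y):
    """Singular values of the cross-covariance of column-centered X and Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return np.linalg.svd(Yc.T @ Xc / (len(X) - 1), compute_uv=False)


def permutation_pvals(X, Y, n_perm=1000, seed=None):
    """Permutation p-value per latent variable (no Procrustes rotation)."""
    rng = np.random.default_rng(seed)
    observed = singular_values(X, Y)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        # Shuffling the rows of Y breaks the sample-wise link between X and Y
        perm = rng.permutation(len(Y))
        exceed += singular_values(X, Y[perm]) >= observed
    # Add one to numerator and denominator so p-values are never exactly zero
    return (exceed + 1) / (n_perm + 1)


# Toy data: the first Y column depends on the first X column, the second is noise
rng = np.random.default_rng(42)
X = rng.standard_normal((60, 8))
Y = np.column_stack([
    X[:, 0] + 0.5 * rng.standard_normal(60),
    rng.standard_normal(60),
])

pvals = permutation_pvals(X, Y, n_perm=500, seed=0)
```

With a relationship this strong built into the toy data, the first latent variable should receive a small p-value. Bootstrap resampling follows a similar loop but resamples rows of X and Y together, with replacement, to estimate the variability of the feature weights.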

References

McIntosh, A. R., Bookstein, F. L., Haxby, J. V., & Grady, C. L. (1996). Spatial pattern analysis of functional brain images using partial least squares. NeuroImage, 3(3), 143-157.

McIntosh, A. R., & Lobaugh, N. J. (2004). Partial least squares analysis of neuroimaging data: applications and advances. NeuroImage, 23, S250-S263.

Krishnan, A., Williams, L. J., McIntosh, A. R., & Abdi, H. (2011). Partial Least Squares (PLS) methods for neuroimaging: a tutorial and review. NeuroImage, 56(2), 455-475.

Kovacevic, N., Abdi, H., Beaton, D., & McIntosh, A. R. (2013). Revisiting PLS resampling: comparing significance versus reliability across range of simulations. In New Perspectives in Partial Least Squares and Related Methods (pp. 159-170). Springer, New York, NY.

Misic, B., Betzel, R. F., de Reus, M. A., van den Heuvel, M. P., Berman, M. G., McIntosh, A. R., & Sporns, O. (2016). Network level structure-function relationships in human neocortex. Cerebral Cortex, 26, 3285-3296.