2. Mean-centered PLS¶
In contrast to behavioral PLS, mean-centered PLS doesn’t aim to find relationships between two sets of variables. Instead, it tries to find relationships between groupings in a single set of variables. Indeed, you can think of it almost like a multivariate t-test or ANOVA (depending on how many groups you have).
2.1. An oenological example¶
>>> from pyls.examples import load_dataset
>>> data = load_dataset('wine')
This is the same dataset as in sklearn.datasets.load_wine()
; the
formatting has just been lightly modified to better suit our purposes.
Our data
object can be treated as a dictionary, containing all the
information necessary to run a PLS analysis. The keys can be accessed as
attributes, so we can take a quick look at our input matrix:
>>> sorted(data.keys())
['X', 'groups', 'n_boot', 'n_perm']
>>> data.X.shape
(178, 13)
>>> data.X.columns
Index(['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium',
'total_phenols', 'flavanoids', 'nonflavanoid_phenols',
'proanthocyanins', 'color_intensity', 'hue',
'od280/od315_of_diluted_wines', 'proline'],
dtype='object')
>>> data.groups
[59, 71, 48]