utils.pca.get_pca
utils.pca.get_pca(df, n_components=3)
Scale the numeric columns of `df` and perform principal component analysis (PCA) on the scaled data. Non-numeric columns are ignored.
Parameters
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df | pd.DataFrame | Input DataFrame. | required |
| n_components | int | Number of principal components to compute. Defaults to 3. | 3 |
Returns
| Name | Type | Description |
| --- | --- | --- |
| tuple | tuple | A 5-tuple `(pca, scaled_data, feature_names, sample_names, pca_data)`. |

- pca (PCA): Fitted PCA object.
- scaled_data (np.ndarray): Scaled numeric data.
- feature_names (pd.Index): Names of the numeric features.
- sample_names (pd.Index): Index of the samples.
- pca_data (np.ndarray): PCA-transformed data.
Examples
>>> import pandas as pd
>>> from spotpython.utils.pca import get_pca
>>> df = pd.DataFrame({
... "A": [1, 2, 3],
... "B": [4, 5, 6],
... "C": ["x", "y", "z"]  # Non-numeric column will be ignored
... })
>>> pca, scaled_data, feature_names, sample_names, pca_data = get_pca(df)
>>> print(feature_names)
Index(['A', 'B'], dtype='object')
>>> print(pca_data.shape)
(3, 2)
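
The behavior documented above can be approximated with scikit-learn. The following is a minimal sketch, not the library's source: the name `get_pca_sketch` is hypothetical, the use of `StandardScaler` for the scaling step is an assumption, and capping `n_components` at the data dimensions is an assumption inferred from the example output shape `(3, 2)`.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def get_pca_sketch(df: pd.DataFrame, n_components: int = 3):
    """Hypothetical re-implementation for illustration only."""
    # Keep only numeric columns; non-numeric ones (e.g. strings) are dropped.
    numeric = df.select_dtypes(include=[np.number])
    feature_names = numeric.columns
    sample_names = numeric.index

    # Assumption: "scaling" means standardizing to zero mean, unit variance.
    scaled_data = StandardScaler().fit_transform(numeric)

    # Assumption: cap n_components, because PCA cannot return more
    # components than min(n_samples, n_features).
    k = min(n_components, *scaled_data.shape)
    pca = PCA(n_components=k)
    pca_data = pca.fit_transform(scaled_data)
    return pca, scaled_data, feature_names, sample_names, pca_data


df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": ["x", "y", "z"]})
pca, scaled_data, feature_names, sample_names, pca_data = get_pca_sketch(df)
print(list(feature_names))  # ['A', 'B']
print(pca_data.shape)       # (3, 2)

# Label the component scores with the original sample index.
scores = pd.DataFrame(
    pca_data,
    index=sample_names,
    columns=[f"PC{i + 1}" for i in range(pca_data.shape[1])],
)
```

Returning `feature_names` and `sample_names` alongside the arrays makes it straightforward to relabel the NumPy outputs, as the `scores` frame does here; the fitted `pca` object additionally exposes attributes such as `explained_variance_ratio_` and `components_`.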