utils.pca.plot_pca1vs2
utils.pca.plot_pca1vs2(pca, pca_data, df_name='', figsize=(12, 6))
Create a scatter plot of the first two principal components from PCA.
This function visualizes the first two principal components (PC1 vs PC2) of a fitted PCA as a scatter plot in which each point is one sample in the transformed space. Each axis shows the percentage of variance explained by its component.
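As a rough illustration of how the explained-variance percentages on the axes can be derived, the sketch below turns a sequence of explained variance ratios into axis labels. The helper name `variance_labels` and the label format `PC1 (92.5%)` are assumptions made for this example, not taken from spotpython; the ratios are illustrative values of the kind a fitted PCA exposes as `explained_variance_ratio_`.

```python
def variance_labels(explained_variance_ratio, n_components=2):
    """Build axis labels such as 'PC1 (92.5%)' from explained variance ratios.

    Hypothetical helper for illustration; not part of spotpython.
    """
    labels = []
    for i, ratio in enumerate(explained_variance_ratio[:n_components], start=1):
        # Convert the ratio to a percentage and keep one decimal place.
        labels.append(f"PC{i} ({ratio * 100:.1f}%)")
    return labels


# Illustrative ratios (assumed values), standing in for
# pca.explained_variance_ratio_ of a fitted sklearn PCA object.
ratios = [0.9246, 0.0531, 0.0171, 0.0052]
x_label, y_label = variance_labels(ratios)
print(x_label)  # PC1 (92.5%)
print(y_label)  # PC2 (5.3%)
```

The two strings could then be passed to matplotlib's `xlabel` and `ylabel`; whether plot_pca1vs2 formats its labels exactly this way is not stated in this page.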
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| pca | sklearn.decomposition.PCA | Fitted PCA object containing the explained variance ratios and components. | required |
| pca_data | array-like | PCA-transformed data, where each row represents a sample and each column represents a principal component. | required |
| df_name | str | Name of the dataset to be displayed in the plot title. Defaults to empty string. | '' |
| figsize | tuple | Size of the figure as (width, height). Defaults to (12, 6). | (12, 6) |
Returns
| Name | Type | Description |
|---|---|---|
| None | None | The function creates and displays a matplotlib plot. |
Examples
>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> from sklearn.datasets import load_iris
>>> from spotpython.utils.pca import plot_pca1vs2
>>>
>>> # Load and prepare the iris dataset
>>> iris = load_iris()
>>> X = iris.data
>>>
>>> # Fit PCA and transform the data
>>> pca = PCA()
>>> pca_data = pca.fit_transform(X)
>>>
>>> # Create PCA scatter plot
>>> plot_pca1vs2(pca,
...              pca_data,
...              df_name="Iris Dataset",
...              figsize=(10, 5))
Note
- The function assumes that pca_data contains at least two principal components (two columns).
- Sample names are taken from the index of the DataFrame that the function creates from pca_data.
- The percentage of variance explained is rounded to one decimal place.
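Because the function assumes at least two principal components, it can help to check the transformed data before plotting. The following is a minimal sketch of such a check; `first_two_components` is a hypothetical helper written for this example and is not part of spotpython, and the sample scores are made up.

```python
import numpy as np


def first_two_components(pca_data):
    """Return the PC1 and PC2 columns, or raise if fewer than two exist.

    Hypothetical helper for illustration; not part of spotpython.
    """
    data = np.asarray(pca_data)
    if data.ndim != 2 or data.shape[1] < 2:
        raise ValueError(
            f"Expected a 2-D array with at least two columns, got shape {data.shape}"
        )
    # Column 0 holds the PC1 scores, column 1 the PC2 scores.
    return data[:, 0], data[:, 1]


# Three samples with three components each (made-up values);
# only the first two columns are returned.
scores = [[1.0, 0.5, 0.1], [-0.3, 0.2, 0.0], [0.7, -0.4, 0.2]]
pc1, pc2 = first_two_components(scores)
print(pc1.shape, pc2.shape)
```

Running a check like this before calling plot_pca1vs2 surfaces a clear error when, for example, the PCA was fitted with a single component.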