utils.pca.plot_pca1vs2

utils.pca.plot_pca1vs2(pca, pca_data, df_name='', figsize=(12, 6))

Create a scatter plot of the first two principal components from PCA.

This function plots the first principal component against the second (PC1 vs PC2) of a fitted PCA as a scatter plot, where each point represents one sample in the transformed space. The percentage of variance explained by each component is shown on the corresponding axis.

Parameters

Name	Type	Description	Default
pca	`sklearn.decomposition.PCA`	Fitted PCA object containing the explained variance ratios and components.	required
pca_data	array-like	PCA-transformed data, where each row represents a sample and each column represents a principal component.	required
df_name	str	Name of the dataset to be displayed in the plot title. Defaults to empty string.	`''`
figsize	tuple	Size of the figure as (width, height). Defaults to (12, 6).	`(12, 6)`

Returns

Name	Type	Description
None	None	The function creates and displays a matplotlib plot.

Examples

>>> import numpy as np
>>> from sklearn.decomposition import PCA
>>> from sklearn.datasets import load_iris
>>> from spotpython.utils.pca import plot_pca1vs2
>>>
>>> # Load and prepare the iris dataset
>>> iris = load_iris()
>>> X = iris.data
>>>
>>> # Fit PCA and transform the data
>>> pca = PCA()
>>> pca_data = pca.fit_transform(X)
>>>
>>> # Create PCA scatter plot
>>> plot_pca1vs2(pca,
...             pca_data,
...             df_name="Iris Dataset",
...             figsize=(10, 5))

Note

The function assumes that `pca_data` has at least two principal components (columns).
Sample names are taken from the index of the DataFrame that the function creates.
The percentage of variance explained is rounded to 1 decimal place.
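
The notes above can be illustrated with a minimal sketch of how such a plot might be assembled. This is not the library's implementation: `sketch_pca1vs2` is a hypothetical name, the label and title formats are assumptions, and the DataFrame step is omitted. The only attribute read from `pca` is `explained_variance_ratio_`, so a simple stand-in object with made-up ratios is used instead of a fitted `sklearn.decomposition.PCA`.

```python
from types import SimpleNamespace

import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def sketch_pca1vs2(pca, pca_data, df_name="", figsize=(12, 6)):
    """Hypothetical re-implementation of a PC1 vs PC2 scatter plot."""
    ratios = list(pca.explained_variance_ratio_)
    rows = [list(row) for row in pca_data]

    # The plot needs at least two principal components
    if len(ratios) < 2 or any(len(row) < 2 for row in rows):
        raise ValueError("need at least two principal components")

    # Percentage of variance explained, rounded to 1 decimal place
    pct = [round(float(r) * 100, 1) for r in ratios[:2]]

    fig, ax = plt.subplots(figsize=figsize)
    ax.scatter([row[0] for row in rows], [row[1] for row in rows])
    # Label and title formats below are assumptions, not the library's wording
    ax.set_xlabel(f"PC1 - {pct[0]}%")
    ax.set_ylabel(f"PC2 - {pct[1]}%")
    ax.set_title(f"PCA Graph - {df_name}" if df_name else "PCA Graph")
    return fig, ax


# Stand-in for a fitted PCA object (made-up ratios, for illustration only)
fake_pca = SimpleNamespace(explained_variance_ratio_=[0.72, 0.23, 0.05])
# Made-up transformed data: three samples, three components
scores = [[-2.7, 0.3, 0.1], [1.3, -0.7, 0.2], [1.4, 0.4, -0.3]]

fig, ax = sketch_pca1vs2(fake_pca, scores, df_name="Iris Dataset", figsize=(10, 5))
```

Unlike `plot_pca1vs2`, which displays the plot and returns `None`, the sketch returns the figure and axes so that the labels can be inspected.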