utils.pca.plot_pca1vs2

utils.pca.plot_pca1vs2(pca, pca_data, df_name='', figsize=(12, 6))

Create a scatter plot of the first two principal components from PCA.

This function plots the first principal component against the second (PC1 vs PC2) from a fitted PCA as a scatter plot, where each point represents one sample in the transformed space. Each axis shows the percentage of variance explained by its component.

Parameters

Name Type Description Default
pca sklearn.decomposition.PCA Fitted PCA object containing the explained variance ratios and components. required
pca_data array-like PCA-transformed data, where each row represents a sample and each column represents a principal component. required
df_name str Name of the dataset to be displayed in the plot title. ''
figsize tuple Size of the figure as (width, height). (12, 6)

Returns

Name Type Description
None None The function creates and displays a matplotlib plot.

Examples

>>> from sklearn.decomposition import PCA
>>> from sklearn.datasets import load_iris
>>> from spotpython.utils.pca import plot_pca1vs2
>>>
>>> # Load and prepare the iris dataset
>>> iris = load_iris()
>>> X = iris.data
>>>
>>> # Fit PCA and transform the data
>>> pca = PCA()
>>> pca_data = pca.fit_transform(X)
>>>
>>> # Create PCA scatter plot
>>> plot_pca1vs2(pca,
...             pca_data,
...             df_name="Iris Dataset",
...             figsize=(10, 5))

Note

  • The function assumes that pca_data contains at least two principal components, that is, at least two columns.
  • Sample names are taken from the index of the DataFrame created by the function.
  • The percentage of variance explained is rounded to 1 decimal place.
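
The notes above can be illustrated with a minimal sketch of how axis labels might be derived from explained-variance ratios such as those in a fitted PCA object's explained_variance_ratio_ attribute. The helper name axis_labels, the "PC1 - 60.0%" label format, and the input values are assumptions made for illustration; they are not taken from the spotpython source.

```python
def axis_labels(explained_variance_ratio, decimals=1):
    """Build labels for PC1 and PC2 from explained-variance ratios.

    Hypothetical helper: the ratios are fractions in [0, 1], as found in
    sklearn's PCA.explained_variance_ratio_. The label format is assumed.
    """
    # Mirror the first note: at least two components are required.
    if len(explained_variance_ratio) < 2:
        raise ValueError("at least two principal components are required")
    # Convert fractions to percentages, rounded to `decimals` places.
    percents = [round(100 * float(r), decimals)
                for r in explained_variance_ratio[:2]]
    return [f"PC{i} - {p}%" for i, p in enumerate(percents, start=1)]


# Illustrative ratios only, not results from a real dataset.
labels = axis_labels([0.6, 0.3, 0.1])
print(labels)
```

Passing a sequence with fewer than two ratios raises ValueError in this sketch; how plot_pca1vs2 itself behaves in that case is not stated in this documentation.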