factor_analyzer.confirmatory_factor_analyzer

Confirmatory factor analysis using machine learning methods.

Classes

Name	Description
ConfirmatoryFactorAnalyzer	Fit a confirmatory factor analysis model using maximum likelihood.
ModelSpecification	Encapsulate the model specification for CFA.
ModelSpecificationParser	Generate the model specification for CFA.

ConfirmatoryFactorAnalyzer

factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer(
    specification=None,
    n_obs=None,
    is_cov_matrix=False,
    bounds=None,
    max_iter=200,
    tol=None,
    impute='median',
    disp=True,
)

Fit a confirmatory factor analysis model using maximum likelihood.

Parameters

Name	Type	Description	Default
specification	ModelSpecification	A model specification. This must be a `ModelSpecification` object or `None`. If `None`, a `ModelSpecification` object will be generated assuming that `n_factors` == `n_variables` and that all variables load on all factors. Note that such a model may not be identified, in which case the optimization could fail. Defaults to `None`.	`None`
n_obs	int	The number of observations in the original data set. If this is not passed and `is_cov_matrix` is `True`, then an error will be raised. Defaults to `None`.	`None`
is_cov_matrix	bool	Whether the input `X` is a covariance matrix. If `False`, assume it is the full data set. Defaults to `False`.	`False`
bounds	list of tuples	A list of (minimum, maximum) boundaries, one for each element of the input array. Its length must equal the length of `x0`, the input array built from your parsed and combined model specification: (n_factors * n_variables) + n_variables + n_factors + (((n_factors * n_factors) - n_factors) // 2). If `None`, nothing will be bounded. Defaults to `None`.	`None`
max_iter	int	The maximum number of iterations for the optimization routine. Defaults to 200.	`200`
tol	float	The tolerance for convergence. Defaults to `None`.	`None`
impute	str	How missing values in `X` are handled before fitting. The accepted values are not documented in this section. Defaults to `'median'`.	`'median'`
disp	bool	Whether to print the scipy optimization `fmin` message to standard output. Defaults to `True`.	`True`
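
The length rule for `bounds` can be checked before the analyzer is constructed. The helper below is a minimal sketch that only restates the formula from the table; `expected_bounds_length` is a hypothetical name, not part of the library, and the labels given to the four terms are an interpretation of the formula rather than something stated in this section.

```python
def expected_bounds_length(n_factors: int, n_variables: int) -> int:
    """Length of `x0` according to the formula in the parameter table."""
    n_loadings = n_factors * n_variables   # assumed: one entry per loading
    n_error_vars = n_variables             # assumed: one error variance per variable
    n_factor_vars = n_factors              # assumed: one variance per factor
    # assumed: lower triangle of the factor covariance matrix, without the diagonal
    n_factor_covs = (n_factors * n_factors - n_factors) // 2
    return n_loadings + n_error_vars + n_factor_vars + n_factor_covs


# Two factors and eight variables, as in the examples on this page.
n_entries = expected_bounds_length(n_factors=2, n_variables=8)
print(n_entries)  # 27

# A bounds list of that length. (None, None) is assumed here to mean
# "unbounded", following the convention of scipy.optimize.minimize.
bounds = [(None, None)] * n_entries
```

For the two-factor, eight-variable model this gives 16 + 8 + 2 + 1 = 27 entries, so a `bounds` list of any other length would trigger the `AssertionError` documented for `fit`.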

Raises

Name	Type	Description
	ValueError	If `is_cov_matrix` is `True`, and `n_obs` is not provided.
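
When only a covariance matrix is passed, the number of observations has to be supplied separately. The sketch below shows one way to prepare both inputs with NumPy; the random data are purely illustrative, and whether the library expects the `n - 1` denominator that `np.cov` uses by default is not stated in this section.

```python
import numpy as np

# Hypothetical raw data: 200 observations of 8 variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 8))

# Covariance matrix of the variables; rowvar=False treats each column as a
# variable and each row as an observation.
cov = np.cov(data, rowvar=False)

# The number of observations cannot be recovered from the covariance matrix,
# so it is kept alongside it.
n_obs = data.shape[0]

print(cov.shape, n_obs)  # (8, 8) 200
```

These values would then be passed as `n_obs=n_obs` and `is_cov_matrix=True` to the constructor, with `cov` given to `fit` in place of the raw data.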

Attributes

Name	Type	Description
model	ModelSpecification	The model specification object.
loadings_	numpy.ndarray	The factor loadings matrix. `None`, if `fit()` has not been called.
error_vars_	numpy.ndarray	The error variance matrix.
factor_varcovs_	numpy.ndarray	The factor covariance matrix.
log_likelihood_	float	The log likelihood from the optimization routine.
aic_	float	The Akaike information criterion.
bic_	float	The Bayesian information criterion.

Examples

import numpy as np
import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa = cfa.fit(X.values)
print(np.round(cfa.loadings_, 2))
# array([[0.99, 0.  ],
#        [0.46, 0.  ],
#        [0.35, 0.  ],
#        [0.58, 0.  ],
#        [0.  , 0.99],
#        [0.  , 0.73],
#        [0.  , 0.38],
#        [0.  , 0.5 ]])

print(np.round(cfa.factor_varcovs_, 2))
# array([[1.  , 0.17],
#        [0.17, 1.  ]])

loadings_se, variances_se = cfa.get_standard_errors()
print(np.round(loadings_se, 2))
# array([[0.07, 0.  ],
#        [0.04, 0.  ],
#        [0.04, 0.  ],
#        [0.05, 0.  ],
#        [0.  , 0.06],
#        [0.  , 0.05],
#        [0.  , 0.04],
#        [0.  , 0.04]])

print(np.round(variances_se, 2))
# array([0.12, 0.05, 0.05, 0.06, 0.1 , 0.07, 0.05, 0.05])

print(np.round(cfa.transform(X.values), 2))
# array([[-0.47, -1.09],
#        [ 2.59,  1.2 ],
#        [-0.47,  2.66],
#        ...,
#        [-1.59, -0.92],
#        [ 0.19,  0.88],
#        [-0.28, -0.77]])

[[0.99 0.  ]
 [0.46 0.  ]
 [0.35 0.  ]
 [0.58 0.  ]
 [0.   0.99]
 [0.   0.73]
 [0.   0.38]
 [0.   0.5 ]]
[[1.   0.17]
 [0.17 1.  ]]
[[0.07 0.  ]
 [0.04 0.  ]
 [0.04 0.  ]
 [0.05 0.  ]
 [0.   0.06]
 [0.   0.05]
 [0.   0.04]
 [0.   0.04]]
[0.12 0.05 0.05 0.06 0.1  0.07 0.05 0.05]
[[-0.47 -1.09]
 [ 2.59  1.2 ]
 [-0.47  2.66]
 ...
 [-1.59 -0.92]
 [ 0.19  0.88]
 [-0.28 -0.77]]

Methods

Name	Description
fit	Perform confirmatory factor analysis.
get_model_implied_cov	Get the model-implied covariance matrix (sigma) for an estimated model.
get_standard_errors	Get standard errors from the implied covariance matrix and implied means.
transform	Get the factor scores for a new data set.

fit

factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer.fit(
    X,
    y=None,
)

Perform confirmatory factor analysis.

Parameters

Name	Type	Description	Default
X	array-like	The data to use for confirmatory factor analysis. If this is a covariance matrix rather than the raw data, make sure `is_cov_matrix` was set to `True`.	required
y	`ignored`	Ignored.	`None`

Returns

Name	Type	Description
self	ConfirmatoryFactorAnalyzer	The fitted confirmatory factor analyzer object.

Raises

Name	Type	Description
	ValueError	If the specification is neither `None` nor a `ModelSpecification` object.
	AssertionError	If `is_cov_matrix` was `True` and the matrix is not square.
	AssertionError	If `len(bounds)` != `len(x0)`.

Examples

import numpy as np
import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa = cfa.fit(X.values)
print(np.round(cfa.loadings_, 2))
# array([[0.99, 0.  ],
#        [0.46, 0.  ],
#        [0.35, 0.  ],
#        [0.58, 0.  ],
#        [0.  , 0.99],
#        [0.  , 0.73],
#        [0.  , 0.38],
#        [0.  , 0.5 ]])

[[0.99 0.  ]
 [0.46 0.  ]
 [0.35 0.  ]
 [0.58 0.  ]
 [0.   0.99]
 [0.   0.73]
 [0.   0.38]
 [0.   0.5 ]]

get_model_implied_cov

factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer.get_model_implied_cov(
)

Get the model-implied covariance matrix (sigma) for an estimated model.

Returns

Name	Type	Description
model_implied_cov	numpy.ndarray	The model-implied covariance matrix.

Examples

import numpy as np
import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa = cfa.fit(X.values)
print(np.round(cfa.get_model_implied_cov(), 2))
# array([[2.08, 0.46, 0.35, 0.58, 0.17, 0.13, 0.06, 0.09],
#        [0.46, 1.17, 0.16, 0.27, 0.08, 0.06, 0.03, 0.04],
#        [0.35, 0.16, 1.07, 0.2 , 0.06, 0.04, 0.02, 0.03],
#        [0.58, 0.27, 0.2 , 1.29, 0.1 , 0.07, 0.04, 0.05],
#        [0.17, 0.08, 0.06, 0.1 , 2.04, 0.72, 0.37, 0.49],
#        [0.13, 0.06, 0.04, 0.07, 0.72, 1.48, 0.28, 0.37],
#        [0.06, 0.03, 0.02, 0.04, 0.37, 0.28, 1.12, 0.19],
#        [0.09, 0.04, 0.03, 0.05, 0.49, 0.37, 0.19, 1.29]])

[[2.08 0.46 0.35 0.58 0.17 0.13 0.06 0.09]
 [0.46 1.17 0.16 0.27 0.08 0.06 0.03 0.04]
 [0.35 0.16 1.07 0.2  0.06 0.04 0.02 0.03]
 [0.58 0.27 0.2  1.29 0.1  0.07 0.04 0.05]
 [0.17 0.08 0.06 0.1  2.04 0.72 0.37 0.49]
 [0.13 0.06 0.04 0.07 0.72 1.48 0.28 0.37]
 [0.06 0.03 0.02 0.04 0.37 0.28 1.12 0.19]
 [0.09 0.04 0.03 0.05 0.49 0.37 0.19 1.29]]
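
For orientation, the standard CFA expression for the implied covariance is Sigma = Lambda Phi Lambda' + Psi, where Lambda holds the loadings, Phi the factor covariances and Psi the diagonal error variances. The sketch below evaluates that textbook expression with the rounded estimates printed on this page; that the method computes exactly this is an assumption, and the error variances are placeholders because the fitted `error_vars_` values are not shown here.

```python
import numpy as np

# Rounded estimates copied from the examples on this page.
loadings = np.array([
    [0.99, 0.0], [0.46, 0.0], [0.35, 0.0], [0.58, 0.0],
    [0.0, 0.99], [0.0, 0.73], [0.0, 0.38], [0.0, 0.50],
])
factor_varcovs = np.array([[1.0, 0.17], [0.17, 1.0]])

# Placeholder error variances (assumption): purely illustrative values.
error_vars = np.ones(loadings.shape[0])

# Sigma = Lambda @ Phi @ Lambda.T + diag(Psi)
sigma = loadings @ factor_varcovs @ loadings.T + np.diag(error_vars)

# Off-diagonal entries do not depend on the error variances, so they can be
# compared with the printed output, up to rounding of the inputs.
print(np.round(sigma[0, 1], 2))  # 0.46
print(np.round(sigma[0, 4], 2))  # 0.17
```

Only the diagonal of `sigma` changes with the placeholder error variances, which is why the comparison is restricted to off-diagonal entries.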

get_standard_errors

factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer.get_standard_errors(
)

Get standard errors from the implied covariance matrix and implied means.

Returns

Name	Type	Description
loadings_se	numpy.ndarray	The standard errors for the factor loadings.
error_vars_se	numpy.ndarray	The standard errors for the error variances.

Examples

import numpy as np
import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa = cfa.fit(X.values)
loadings_se, variances_se = cfa.get_standard_errors()
print(np.round(loadings_se, 2))
# array([[0.07, 0.  ],
#        [0.04, 0.  ],
#        [0.04, 0.  ],
#        [0.05, 0.  ],
#        [0.  , 0.06],
#        [0.  , 0.05],
#        [0.  , 0.04],
#        [0.  , 0.04]])

print(np.round(variances_se, 2))
# array([0.12, 0.05, 0.05, 0.06, 0.1 , 0.07, 0.05, 0.05])

[[0.07 0.  ]
 [0.04 0.  ]
 [0.04 0.  ]
 [0.05 0.  ]
 [0.   0.06]
 [0.   0.05]
 [0.   0.04]
 [0.   0.04]]
[0.12 0.05 0.05 0.06 0.1  0.07 0.05 0.05]

transform

factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer.transform(
    X,
)

Get the factor scores for a new data set.

Parameters

Name	Type	Description	Default
X	array-like	The data to score using the fitted factor model, shape (`n_samples`, `n_features`).	required

Returns

Name	Type	Description
scores	numpy.ndarray	The latent variables of X, shape (`n_samples`, `n_components`).

Examples

import numpy as np
import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa = cfa.fit(X.values)
print(np.round(cfa.transform(X.values), 2))
# array([[-0.47, -1.09],
#        [ 2.59,  1.2 ],
#        [-0.47,  2.66],
#        ...,
#        [-1.59, -0.92],
#        [ 0.19,  0.88],
#        [-0.28, -0.77]])

[[-0.47 -1.09]
 [ 2.59  1.2 ]
 [-0.47  2.66]
 ...
 [-1.59 -0.92]
 [ 0.19  0.88]
 [-0.28 -0.77]]

References

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157408/

ModelSpecification

factor_analyzer.confirmatory_factor_analyzer.ModelSpecification(
    loadings,
    n_factors,
    n_variables,
    factor_names=None,
    variable_names=None,
)

Encapsulate the model specification for CFA.

This class contains a number of specification properties that are used in the CFA procedure.

Parameters

Name	Type	Description	Default
loadings	array-like	The factor loadings specification.	required
n_factors	int	The number of factors.	required
n_variables	int	The number of variables.	required
factor_names	list of str	A list of factor names, if available. Defaults to `None`.	`None`
variable_names	list of str	A list of variable names, if available. Defaults to `None`.	`None`

Attributes

Name	Description
error_vars	Get the error variance specification.
error_vars_free	Get the indices of “free” error variance parameters.
factor_covs	Get the factor covariance specification.
factor_covs_free	Get the indices of “free” factor covariance parameters.
factor_names	Get list of factor names, if available.
loadings	Get the factor loadings specification.
loadings_free	Get the indices of “free” factor loading parameters.
n_factors	Get the number of factors.
n_lower_diag	Get the lower diagonal of the factor covariance matrix.
n_variables	Get the number of variables.
variable_names	Get list of variable names, if available.

Methods

Name	Description
copy	Return a copy of the model specification.
get_model_specification_as_dict	Get the model specification as a dictionary.

copy

factor_analyzer.confirmatory_factor_analyzer.ModelSpecification.copy()

Return a copy of the model specification.

get_model_specification_as_dict

factor_analyzer.confirmatory_factor_analyzer.ModelSpecification.get_model_specification_as_dict(
)

Get the model specification as a dictionary.

Returns

Name	Type	Description
model_specification	dict	The model specification keys and values, as a dictionary.

ModelSpecificationParser

factor_analyzer.confirmatory_factor_analyzer.ModelSpecificationParser()

Generate the model specification for CFA.

This class includes two static methods to generate a `ModelSpecification` object from either a dictionary or a numpy array.

Methods

Name	Description
parse_model_specification_from_array	Generate the model specification from a numpy array.
parse_model_specification_from_dict	Generate the model specification from a dictionary.

parse_model_specification_from_array

factor_analyzer.confirmatory_factor_analyzer.ModelSpecificationParser.parse_model_specification_from_array(
    X,
    specification=None,
)

Generate the model specification from a numpy array.

The columns should correspond to the factors, and the rows should correspond to the variables. If this method is used to create the `ModelSpecification` object, then no factor names or variable names will be added as properties to that object.

Parameters

Name	Type	Description	Default
X	array-like	The data set that will be used for CFA.	required
specification	array-like	An array with the loading details. If `None`, the matrix will be created assuming all variables load on all factors. Defaults to `None`.	`None`

Returns

Name	Type	Description
ModelSpecification	ModelSpecification	A model specification object.

Raises

Name	Type	Description
	ValueError	If `specification` is not in the expected format.

Examples

import pandas as pd
import numpy as np
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
# Rows correspond to variables and columns to factors, so transpose to 8 x 2.
model_array = np.array([[1, 1, 1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1, 1, 1]]).T
model_spec = ModelSpecificationParser.parse_model_specification_from_array(X, model_array)
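
The row and column convention described above (rows are variables, columns are factors) can be made explicit by building the array from named assignments. This is a minimal sketch: `loading_pattern` is a hypothetical helper, not a library function, and the variable names `V1` to `V8` are assumed from the dictionary used elsewhere on this page.

```python
import numpy as np

variable_names = [f"V{i}" for i in range(1, 9)]  # assumed column names V1 ... V8
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}


def loading_pattern(variable_names, model_dict):
    """Return an (n_variables, n_factors) 0/1 array; 1 marks a loading to estimate."""
    pattern = np.zeros((len(variable_names), len(model_dict)), dtype=int)
    for col, indicators in enumerate(model_dict.values()):
        for name in indicators:
            pattern[variable_names.index(name), col] = 1
    return pattern


pattern = loading_pattern(variable_names, model_dict)
print(pattern.shape)  # (8, 2)
```

An array of this shape has one row per variable and one column per factor, which is the orientation the method description asks for.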

parse_model_specification_from_dict

factor_analyzer.confirmatory_factor_analyzer.ModelSpecificationParser.parse_model_specification_from_dict(
    X,
    specification=None,
)

Generate the model specification from a dictionary.

The keys in the dictionary should be the factor names, and the values should be lists of the names of the variables that load on each factor. If this method is used to create the `ModelSpecification` object, then factor names and variable names will be added as properties to that object.

Parameters

Name	Type	Description	Default
X	array-like	The data set that will be used for CFA.	required
specification	dict	A dictionary with the loading details. If `None`, the matrix will be created assuming all variables load on all factors. Defaults to `None`.	`None`

Returns

Name	Type	Description
ModelSpecification	ModelSpecification	A model specification object.

Raises

Name	Type	Description
	ValueError	If `specification` is not in the expected format.

Examples

import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
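
A dictionary that names a variable missing from the data is one plausible way for a specification to be "not in the expected format". The check below is a hypothetical pre-flight helper, not the library's own validation, and it assumes the data columns are named `V1` to `V8` as in the example.

```python
def check_model_dict(column_names, model_dict):
    """Raise ValueError if the dictionary refers to unknown columns."""
    if not isinstance(model_dict, dict):
        raise ValueError("specification must be a dict")
    known = set(column_names)
    for factor, indicators in model_dict.items():
        missing = [name for name in indicators if name not in known]
        if missing:
            raise ValueError(f"{factor}: unknown variables {missing}")


columns = [f"V{i}" for i in range(1, 9)]  # assumed column names V1 ... V8
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}

# Passes silently, because every listed variable is a known column.
check_model_dict(columns, model_dict)
```

Running such a check before parsing gives an error message that names the offending factor and variables.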