factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer

factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer(
    specification=None,
    n_obs=None,
    is_cov_matrix=False,
    bounds=None,
    max_iter=200,
    tol=None,
    impute='median',
    disp=True,
)

Fit a confirmatory factor analysis model using maximum likelihood.

Parameters

Name Type Description Default
specification ModelSpecification A model specification. This must be a ModelSpecification object or None. If None, a ModelSpecification object is generated assuming that n_factors == n_variables and that all variables load on all factors. Such a model may not be identified, in which case the optimization could fail. Defaults to None. None
n_obs int The number of observations in the original data set. If this is not passed and is_cov_matrix is True, then an error will be raised. Defaults to None. None
is_cov_matrix bool Whether the input X is a covariance matrix. If False, assume it is the full data set. Defaults to False. False
bounds list of tuples A list of (minimum, maximum) boundaries for each element of the input array. The length of this list must equal the length of x0, the input array built from your parsed and combined model specification. That length is: (n_factors * n_variables) + n_variables + n_factors + (((n_factors * n_factors) - n_factors) // 2). If None, nothing will be bounded. Defaults to None. None
max_iter int The maximum number of iterations for the optimization routine. Defaults to 200. 200
tol float The tolerance for convergence. Defaults to None. None
impute str How to handle missing values, if any, in the data: use list-wise deletion ('drop'), impute the column median ('median'), or impute the column mean ('mean'). Defaults to 'median'. 'median'
disp bool Whether to print the scipy optimization fmin message to standard output. Defaults to True. True
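
The length formula given for bounds can be checked with a small helper. This is only a sketch: the function name expected_bounds_length and the labels attached to the four terms (loadings, error variances, factor variances, factor covariances) are assumptions made for illustration and are not part of the library's API. The limits in the bounds list are placeholders.

```python
def expected_bounds_length(n_factors: int, n_variables: int) -> int:
    """Length of x0 according to the formula given for `bounds`."""
    n_loadings = n_factors * n_variables                       # first term
    n_error_vars = n_variables                                 # second term
    n_factor_vars = n_factors                                  # third term
    n_factor_covs = (n_factors * n_factors - n_factors) // 2   # fourth term
    return n_loadings + n_error_vars + n_factor_vars + n_factor_covs


# Two factors and eight variables, as in the examples on this page.
n_params = expected_bounds_length(n_factors=2, n_variables=8)
print(n_params)  # 27

# One (minimum, maximum) pair per element; these limits are placeholders.
bounds = [(-10.0, 10.0)] * n_params
```

Building the list from the computed length, rather than typing it by hand, is one way to avoid a mismatch between len(bounds) and len(x0).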

Raises

Name Type Description
ValueError If is_cov_matrix is True, and n_obs is not provided.

Attributes

Name Type Description
model ModelSpecification The model specification object.
loadings_ numpy.ndarray The factor loadings matrix. None, if fit() has not been called.
error_vars_ numpy.ndarray The error variance matrix.
factor_varcovs_ numpy.ndarray The factor covariance matrix.
log_likelihood_ float The log likelihood from the optimization routine.
aic_ float The Akaike information criterion.
bic_ float The Bayesian information criterion.
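
For orientation, the textbook definitions of the two information criteria can be sketched as below. The function names and the n_params argument are hypothetical. How the library counts free parameters, and which sign convention log_likelihood_ follows, is not stated on this page, so aic_ and bic_ may not match these formulas exactly.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Textbook Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Textbook Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Illustrative numbers only, not taken from a fitted model.
print(aic(-100.0, 5))                  # 210.0
print(round(bic(-100.0, 5, 100), 2))   # 223.03
```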

Examples

import numpy as np
import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa = cfa.fit(X.values)
print(np.round(cfa.loadings_, 2))
# array([[0.99, 0.  ],
#        [0.46, 0.  ],
#        [0.35, 0.  ],
#        [0.58, 0.  ],
#        [0.  , 0.99],
#        [0.  , 0.73],
#        [0.  , 0.38],
#        [0.  , 0.5 ]])

print(np.round(cfa.factor_varcovs_, 2))
# array([[1.  , 0.17],
#        [0.17, 1.  ]])

loadings_se, variances_se = cfa.get_standard_errors()
print(np.round(loadings_se, 2))
# array([[0.07, 0.  ],
#        [0.04, 0.  ],
#        [0.04, 0.  ],
#        [0.05, 0.  ],
#        [0.  , 0.06],
#        [0.  , 0.05],
#        [0.  , 0.04],
#        [0.  , 0.04]])

print(np.round(variances_se, 2))
# array([0.12, 0.05, 0.05, 0.06, 0.1 , 0.07, 0.05, 0.05])

print(np.round(cfa.transform(X.values), 2))
# array([[-0.47, -1.09],
#        [ 2.59,  1.2 ],
#        [-0.47,  2.66],
#        ...,
#        [-1.59, -0.92],
#        [ 0.19,  0.88],
#        [-0.28, -0.77]])

Methods

Name Description
fit Perform confirmatory factor analysis.
get_model_implied_cov Get the model-implied covariance matrix (sigma) for an estimated model.
get_standard_errors Get standard errors from the implied covariance matrix and implied means.
transform Get the factor scores for a new data set.

fit

factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer.fit(
    X,
    y=None,
)

Perform confirmatory factor analysis.

Parameters

Name Type Description Default
X array-like The data to use for confirmatory factor analysis. If this is a covariance matrix rather than the raw data, make sure is_cov_matrix was set to True. required
y ignored Ignored. None

Returns

Name Type Description
self ConfirmatoryFactorAnalyzer The fitted confirmatory factor analyzer object.

Raises

Name Type Description
ValueError If specification is neither None nor a ModelSpecification object.
AssertionError If is_cov_matrix was True and the matrix is not square.
AssertionError If len(bounds) != len(x0)
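
When fitting from a covariance matrix, the two documented preconditions are a square matrix and a known n_obs. The sketch below prepares both from synthetic data with NumPy. The data is a random stand-in, and using np.cov with its default normalization is an assumption, since this page does not say which normalization the library expects. The library call is left as a comment because it needs a model specification and is not executed here.

```python
import numpy as np

# Synthetic stand-in for a real data set: 200 observations, 8 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))

# Covariance of the columns, plus the sample size the model will need.
cov = np.cov(X, rowvar=False)
n_obs = X.shape[0]

# The matrix must be square when is_cov_matrix is True.
assert cov.ndim == 2 and cov.shape[0] == cov.shape[1]
print(cov.shape, n_obs)  # (8, 8) 200

# Not executed here; uses only the parameters documented above:
# cfa = ConfirmatoryFactorAnalyzer(model_spec, n_obs=n_obs,
#                                  is_cov_matrix=True, disp=False)
# cfa.fit(cov)
```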

Examples

import numpy as np
import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa = cfa.fit(X.values)
print(np.round(cfa.loadings_, 2))
# array([[0.99, 0.  ],
#        [0.46, 0.  ],
#        [0.35, 0.  ],
#        [0.58, 0.  ],
#        [0.  , 0.99],
#        [0.  , 0.73],
#        [0.  , 0.38],
#        [0.  , 0.5 ]])

get_model_implied_cov

factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer.get_model_implied_cov(
)

Get the model-implied covariance matrix (sigma) for an estimated model.

Returns

Name Type Description
model_implied_cov numpy.ndarray The model-implied covariance matrix.
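
In the standard CFA model, the implied covariance is the loadings times the factor covariance times the transposed loadings, plus a diagonal matrix of error variances. The NumPy sketch below applies that formula to the rounded loadings and factor covariances printed in the examples on this page. That the library computes sigma exactly this way is an assumption, and the error variances are placeholders, so only the off-diagonal entries are meaningful here.

```python
import numpy as np

# Rounded values from the examples on this page.
loadings = np.array([[0.99, 0.0], [0.46, 0.0], [0.35, 0.0], [0.58, 0.0],
                     [0.0, 0.99], [0.0, 0.73], [0.0, 0.38], [0.0, 0.50]])
factor_varcovs = np.array([[1.0, 0.17],
                           [0.17, 1.0]])

# Placeholder error variances (the fitted values are not shown on this page).
error_vars = np.ones(8)

# Standard CFA formula: sigma = Lambda @ Phi @ Lambda.T + diag(psi)
sigma = loadings @ factor_varcovs @ loadings.T + np.diag(error_vars)

print(sigma.shape)                    # (8, 8)
print(round(float(sigma[0, 1]), 2))   # 0.46
```

Because the inputs are rounded to two decimals, some entries can differ from the library's output in the last digit.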

Examples

import numpy as np
import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa = cfa.fit(X.values)
print(np.round(cfa.get_model_implied_cov(), 2))
# array([[2.08, 0.46, 0.35, 0.58, 0.17, 0.13, 0.06, 0.09],
#        [0.46, 1.17, 0.16, 0.27, 0.08, 0.06, 0.03, 0.04],
#        [0.35, 0.16, 1.07, 0.2 , 0.06, 0.04, 0.02, 0.03],
#        [0.58, 0.27, 0.2 , 1.29, 0.1 , 0.07, 0.04, 0.05],
#        [0.17, 0.08, 0.06, 0.1 , 2.04, 0.72, 0.37, 0.49],
#        [0.13, 0.06, 0.04, 0.07, 0.72, 1.48, 0.28, 0.37],
#        [0.06, 0.03, 0.02, 0.04, 0.37, 0.28, 1.12, 0.19],
#        [0.09, 0.04, 0.03, 0.05, 0.49, 0.37, 0.19, 1.29]])

get_standard_errors

factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer.get_standard_errors(
)

Get standard errors from the implied covariance matrix and implied means.

Returns

Name Type Description
loadings_se numpy.ndarray The standard errors for the factor loadings.
error_vars_se numpy.ndarray The standard errors for the error variances.

Examples

import numpy as np
import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa = cfa.fit(X.values)
loadings_se, variances_se = cfa.get_standard_errors()
print(np.round(loadings_se, 2))
# array([[0.07, 0.  ],
#        [0.04, 0.  ],
#        [0.04, 0.  ],
#        [0.05, 0.  ],
#        [0.  , 0.06],
#        [0.  , 0.05],
#        [0.  , 0.04],
#        [0.  , 0.04]])

print(np.round(variances_se, 2))
# array([0.12, 0.05, 0.05, 0.06, 0.1 , 0.07, 0.05, 0.05])

transform

factor_analyzer.confirmatory_factor_analyzer.ConfirmatoryFactorAnalyzer.transform(
    X,
)

Get the factor scores for a new data set.

Parameters

Name Type Description Default
X array-like The data to score using the fitted factor model, shape (n_samples, n_features). required

Returns

Name Type Description
scores numpy.ndarray The latent variables of X, shape (n_samples, n_components).

Examples

import numpy as np
import pandas as pd
import os
from spotoptim.factor_analyzer import (ConfirmatoryFactorAnalyzer,
                                      ModelSpecificationParser)
from spotoptim.utils import get_internal_datasets_folder
X = pd.read_csv(os.path.join(get_internal_datasets_folder(), 'test11.csv'))
model_dict = {"F1": ["V1", "V2", "V3", "V4"],
              "F2": ["V5", "V6", "V7", "V8"]}
model_spec = ModelSpecificationParser.parse_model_specification_from_dict(X, model_dict)
cfa = ConfirmatoryFactorAnalyzer(model_spec, disp=False)
cfa = cfa.fit(X.values)
print(np.round(cfa.transform(X.values), 2))
# array([[-0.47, -1.09],
#        [ 2.59,  1.2 ],
#        [-0.47,  2.66],
#        ...,
#        [-1.59, -0.92],
#        [ 0.19,  0.88],
#        [-0.28, -0.77]])

References

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157408/