factor_analyzer.factor_analyzer_utils

Utility functions, used primarily by the confirmatory factor analysis module.

Functions

Name Description
apply_impute_nan Apply a function to impute np.nan values with the mean or the median.
commutation_matrix Calculate the commutation matrix.
corr Calculate the correlation matrix.
cov Calculate the covariance matrix.
covariance_to_correlation Compute cross-correlations from the given covariance matrix.
duplication_matrix Calculate the duplication matrix.
duplication_matrix_pre_post Transform given input symmetric matrix using pre-post duplication.
fill_lower_diag Fill the lower diagonal of a square matrix, given a 1-D input array.
get_first_idxs_from_values Get the indexes for a given value.
get_free_parameter_idxs Get the free parameter indices from the flattened matrix.
get_symmetric_lower_idxs Get the indices for the lower triangle of a symmetric matrix.
get_symmetric_upper_idxs Get the indices for the upper triangle of a symmetric matrix.
impute_values Impute np.nan values with the mean or median, or drop the containing rows.
inv_chol Calculate matrix inverse using Cholesky decomposition.
merge_variance_covariance Merge variances and covariances into a single variance-covariance matrix.
partial_correlations Compute partial correlations between variable pairs.
smc Calculate the squared multiple correlations.
unique_elements Get first unique instance of every list element, while maintaining order.

apply_impute_nan

factor_analyzer.factor_analyzer_utils.apply_impute_nan(x, how='mean')

Apply a function to impute np.nan values with the mean or the median.

Parameters

Name Type Description Default
x array-like The 1-D array to impute. required
how str Whether to impute the ‘mean’ or ‘median’. Defaults to ‘mean’. 'mean'

Returns

Name Type Description
x numpy.ndarray The array, with the missing values imputed.
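
The behavior described above can be illustrated with a minimal NumPy sketch. This is not the library's implementation; the helper name `impute_1d` is hypothetical and exists only for this example.

```python
import numpy as np


def impute_1d(x, how="mean"):
    """Hypothetical helper: fill NaNs in a 1-D array with its mean or median."""
    x = np.asarray(x, dtype=float).copy()
    missing = np.isnan(x)
    if missing.any():
        # nanmean / nanmedian ignore the missing entries when computing the fill value
        fill = np.nanmedian(x) if how == "median" else np.nanmean(x)
        x[missing] = fill
    return x


values = [1.0, np.nan, 3.0, 8.0]
filled_mean = impute_1d(values, how="mean")      # NaN replaced by the mean of 1, 3, 8
filled_median = impute_1d(values, how="median")  # NaN replaced by the median of 1, 3, 8
```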

commutation_matrix

factor_analyzer.factor_analyzer_utils.commutation_matrix(p, q)

Calculate the commutation matrix.

The commutation matrix K transforms the vectorized form of a p × q matrix A into the vectorized form of its transpose: K vec(A) = vec(Aᵀ).

Parameters

Name Type Description Default
p int The number of rows. required
q int The number of columns. required

Returns

Name Type Description
commutation_matrix numpy.ndarray The commutation matrix.

References

https://en.wikipedia.org/wiki/Commutation_matrix
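
To make the defining property concrete, here is a small NumPy sketch that builds a commutation matrix directly from its definition and checks it against vec(A) and vec(Aᵀ). The names `commutation_sketch` and `vec` are hypothetical, and column-major vectorization is assumed; this is not the library's code.

```python
import numpy as np


def vec(m):
    """Stack the columns of a matrix into one vector (column-major order)."""
    return np.asarray(m).reshape(-1, order="F")


def commutation_sketch(p, q):
    """Hypothetical helper: K such that K @ vec(A) == vec(A.T) for any p x q matrix A."""
    K = np.zeros((p * q, p * q))
    for i in range(p):
        for j in range(q):
            # A[i, j] sits at j * p + i in vec(A) and at i * q + j in vec(A.T)
            K[i * q + j, j * p + i] = 1.0
    return K


A = np.arange(6.0).reshape(2, 3)
K = commutation_sketch(2, 3)
transposed_vec = K @ vec(A)
```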

corr

factor_analyzer.factor_analyzer_utils.corr(x)

Calculate the correlation matrix.

Parameters

Name Type Description Default
x array-like A 1-D or 2-D array containing multiple variables and observations. Each column of x represents a variable, and each row a single observation of all those variables. required

Returns

Name Type Description
r numpy.ndarray The correlation matrix of the variables.

cov

factor_analyzer.factor_analyzer_utils.cov(x, ddof=0)

Calculate the covariance matrix.

Parameters

Name Type Description Default
x array-like A 1-D or 2-D array containing multiple variables and observations. Each column of x represents a variable, and each row a single observation of all those variables. required
ddof int Delta degrees of freedom. The divisor used in calculations is N - ddof, where N is the number of observations. Defaults to 0. 0

Returns

Name Type Description
r numpy.ndarray The covariance matrix of the variables.
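
The column-as-variable layout and the effect of ddof, shared by corr and cov, can be illustrated with plain NumPy. This sketch uses `np.cov` and `np.corrcoef` with `rowvar=False` as stand-ins; whether the library delegates to these functions is not stated here.

```python
import numpy as np

# Four observations (rows) of two variables (columns).
x = np.array([[1.0, 2.0],
              [2.0, 4.5],
              [3.0, 5.5],
              [4.0, 9.0]])

# ddof=0 divides by N (population covariance); ddof=1 divides by N - 1.
pop_cov = np.cov(x, rowvar=False, ddof=0)
sample_cov = np.cov(x, rowvar=False, ddof=1)

# Correlation matrix of the two variables; it does not depend on ddof.
r = np.corrcoef(x, rowvar=False)
```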

covariance_to_correlation

factor_analyzer.factor_analyzer_utils.covariance_to_correlation(m)

Compute cross-correlations from the given covariance matrix.

This is a port of the R cov2cor() function.

Parameters

Name Type Description Default
m array-like The covariance matrix. required

Returns

Name Type Description
retval numpy.ndarray The cross-correlation matrix.

Raises

Name Type Description
ValueError If the input matrix is not square.
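
The conversion amounts to dividing each covariance by the product of the two corresponding standard deviations. A minimal sketch under that definition, with the hypothetical name `cov_to_corr_sketch` (not the library's implementation):

```python
import numpy as np


def cov_to_corr_sketch(m):
    """Hypothetical helper: scale a covariance matrix to a correlation matrix."""
    m = np.asarray(m, dtype=float)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        raise ValueError("The input matrix must be square.")
    sd = np.sqrt(np.diag(m))          # standard deviations of each variable
    return m / np.outer(sd, sd)       # corr[i, j] = cov[i, j] / (sd[i] * sd[j])


cov_mtx = np.array([[4.0, 2.0],
                    [2.0, 9.0]])
corr_mtx = cov_to_corr_sketch(cov_mtx)  # off-diagonal is 2 / (2 * 3)
```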

duplication_matrix

factor_analyzer.factor_analyzer_utils.duplication_matrix(n=1)

Calculate the duplication matrix.

Creates the duplication matrix D_n, the unique n² × n(n+1)/2 matrix that, for any n × n symmetric matrix A, transforms vech(A) into vec(A): D_n vech(A) = vec(A).

Parameters

Name Type Description Default
n int The dimension of the n x n symmetric matrix. Defaults to 1. 1

Returns

Name Type Description
duplication_matrix numpy.ndarray The duplication matrix.

Raises

Name Type Description
ValueError If n is not a positive integer (that is, n < 1).

References

https://en.wikipedia.org/wiki/Duplication_and_elimination_matrices
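
The identity D_n vech(A) = vec(A) can be checked numerically. The sketch below builds a duplication matrix from the definition, assuming column-major vec and a vech that stacks the lower triangle column by column; `duplication_sketch` and `vech` are hypothetical names, not library functions.

```python
import numpy as np


def vech(a):
    """Stack the lower triangle (including the diagonal) column by column."""
    a = np.asarray(a)
    return np.concatenate([a[j:, j] for j in range(a.shape[0])])


def duplication_sketch(n):
    """Hypothetical helper: D with D @ vech(A) == vec(A) for symmetric n x n A."""
    D = np.zeros((n * n, n * (n + 1) // 2))
    col = 0
    for j in range(n):
        for i in range(j, n):
            D[j * n + i, col] = 1.0  # position of A[i, j] in column-major vec(A)
            D[i * n + j, col] = 1.0  # position of its mirror element A[j, i]
            col += 1
    return D


A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 4.0],
              [0.0, 4.0, 5.0]])
D = duplication_sketch(3)
rebuilt_vec = D @ vech(A)
```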

duplication_matrix_pre_post

factor_analyzer.factor_analyzer_utils.duplication_matrix_pre_post(x)

Transform given input symmetric matrix using pre-post duplication.

Parameters

Name Type Description Default
x array-like The input matrix. required

Returns

Name Type Description
out numpy.ndarray The transformed matrix.

Raises

Name Type Description
AssertionError If x is not symmetric.

fill_lower_diag

factor_analyzer.factor_analyzer_utils.fill_lower_diag(x)

Fill the lower diagonal of a square matrix, given a 1-D input array.

Parameters

Name Type Description Default
x array-like The flattened input matrix that will be used to fill the lower diagonal of the square matrix. required

Returns

Name Type Description
out numpy.ndarray The output square matrix, with the lower diagonal filled by x.

References

[1] https://stackoverflow.com/questions/51439271/convert-1d-array-to-lower-triangular-matrix
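
A rough sketch of the idea follows. It assumes the values fill the strictly lower triangle (below the main diagonal) in row-major order; both that ordering and the helper name `fill_strict_lower` are assumptions of this example rather than documented behavior.

```python
import numpy as np


def fill_strict_lower(values):
    """Hypothetical helper: place a 1-D array below the diagonal of a square matrix."""
    values = np.asarray(values, dtype=float)
    # Solve n * (n - 1) / 2 == len(values) for the matrix dimension n.
    n = int(round((1 + np.sqrt(1 + 8 * len(values))) / 2))
    if n * (n - 1) // 2 != len(values):
        raise ValueError("Length does not match a strictly lower triangle.")
    out = np.zeros((n, n))
    out[np.tril_indices(n, k=-1)] = values  # row-major order over the lower triangle
    return out


filled = fill_strict_lower([1, 2, 3])
```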

get_first_idxs_from_values

factor_analyzer.factor_analyzer_utils.get_first_idxs_from_values(
    x,
    eq=1,
    use_columns=True,
)

Get the indexes for a given value.

Parameters

Name Type Description Default
x array-like The input matrix. required
eq str or int The given value to find. Defaults to 1. 1
use_columns bool Whether to get the first indexes using the columns. If False, then use the rows instead. Defaults to True. True

Returns

Name Type Description
tuple Tuple[List[int], List[int]] - row_idx (list): A list of row indexes. - col_idx (list): A list of column indexes.

get_free_parameter_idxs

factor_analyzer.factor_analyzer_utils.get_free_parameter_idxs(x, eq=1)

Get the free parameter indices from the flattened matrix.

Parameters

Name Type Description Default
x array-like The input matrix. required
eq str or int The value that free parameters should be equal to. np.nan fields will be populated with this value. Defaults to 1. 1

Returns

Name Type Description
idx numpy.ndarray The free parameter indexes.

get_symmetric_lower_idxs

factor_analyzer.factor_analyzer_utils.get_symmetric_lower_idxs(n=1, diag=True)

Get the indices for the lower triangle of a symmetric matrix.

Parameters

Name Type Description Default
n int The dimension of the n x n symmetric matrix. Defaults to 1. 1
diag bool Whether to include the diagonal. True

Returns

Name Type Description
indices numpy.ndarray The indices for the lower triangle.

get_symmetric_upper_idxs

factor_analyzer.factor_analyzer_utils.get_symmetric_upper_idxs(n=1, diag=True)

Get the indices for the upper triangle of a symmetric matrix.

Parameters

Name Type Description Default
n int The dimension of the n x n symmetric matrix. Defaults to 1. 1
diag bool Whether to include the diagonal. True

Returns

Name Type Description
indices numpy.ndarray The indices for the upper triangle.
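
For orientation, the same kind of triangle indexing can be sketched with NumPy's own `np.tril_indices` and `np.triu_indices`, where the offset `k` plays the role of the diag flag. The exact format of the indices returned by the two library functions is not specified in this section, so the sketch only illustrates the concept.

```python
import numpy as np

n = 3

# k=0 keeps the diagonal (like diag=True); k=-1 or k=1 leaves it out.
lower_with_diag = np.tril_indices(n, k=0)
lower_no_diag = np.tril_indices(n, k=-1)
upper_no_diag = np.triu_indices(n, k=1)

sym = np.array([[1.0, 0.2, 0.3],
                [0.2, 1.0, 0.4],
                [0.3, 0.4, 1.0]])

# Pull the off-diagonal elements of the lower triangle out of a symmetric matrix.
lower_values = sym[lower_no_diag]
```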

impute_values

factor_analyzer.factor_analyzer_utils.impute_values(x, how='mean')

Impute np.nan values with the mean or median, or drop the containing rows.

Parameters

Name Type Description Default
x array-like An array to impute. required
how str How to handle missing values: impute the ‘mean’ or ‘median’, or ‘drop’ the rows that contain them. Defaults to ‘mean’. 'mean'

Returns

Name Type Description
x numpy.ndarray The array, with the missing values imputed or with rows dropped.
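
The three behaviors named in the summary (mean, median, dropping rows) can be sketched in NumPy for a 2-D array as follows. The helper `impute_columns` is hypothetical, and both the column-wise imputation and the "drop" keyword used here are assumptions of this sketch.

```python
import numpy as np


def impute_columns(x, how="mean"):
    """Hypothetical helper: impute NaNs per column, or drop rows that contain NaNs."""
    x = np.asarray(x, dtype=float)
    if how == "drop":
        # Keep only the rows without any missing value.
        return x[~np.isnan(x).any(axis=1)].copy()
    fill = np.nanmedian(x, axis=0) if how == "median" else np.nanmean(x, axis=0)
    # Broadcast the per-column fill values into the missing positions.
    return np.where(np.isnan(x), fill, x)


data = np.array([[1.0, 10.0],
                 [np.nan, 20.0],
                 [3.0, np.nan],
                 [5.0, 60.0]])

imputed = impute_columns(data, how="mean")
dropped = impute_columns(data, how="drop")
```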

inv_chol

factor_analyzer.factor_analyzer_utils.inv_chol(x, logdet=False)

Calculate matrix inverse using Cholesky decomposition.

Optionally, also calculate the log determinant from the Cholesky factor.

Parameters

Name Type Description Default
x array-like The matrix to invert. required
logdet bool Whether to also calculate the log determinant, in addition to the inverse. Defaults to False. False

Returns

Name Type Description
tuple Tuple[np.ndarray, Optional[float]] - chol_inv (array-like): The inverted matrix. - chol_logdet (array-like or None): The log determinant, if logdet was True, otherwise None.
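
As a sketch of the technique, a symmetric positive-definite matrix x with Cholesky factor L (x = L Lᵀ) has inverse (L⁻¹)ᵀ L⁻¹ and log determinant 2 Σ log(L_ii). The NumPy version below uses the hypothetical name `inv_chol_sketch` and assumes the log determinant refers to x itself; it is not the library's implementation.

```python
import numpy as np


def inv_chol_sketch(x, logdet=False):
    """Hypothetical helper: invert an SPD matrix through its Cholesky factor."""
    x = np.asarray(x, dtype=float)
    chol = np.linalg.cholesky(x)        # lower-triangular L with x = L @ L.T
    chol_inv = np.linalg.inv(chol)
    x_inv = chol_inv.T @ chol_inv       # inv(x) = inv(L).T @ inv(L)
    x_logdet = None
    if logdet:
        # det(x) = prod(diag(L)) ** 2, so log det(x) = 2 * sum(log(diag(L)))
        x_logdet = 2.0 * np.sum(np.log(np.diag(chol)))
    return x_inv, x_logdet


spd = np.array([[4.0, 2.0],
                [2.0, 3.0]])
spd_inv, spd_logdet = inv_chol_sketch(spd, logdet=True)
```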

merge_variance_covariance

factor_analyzer.factor_analyzer_utils.merge_variance_covariance(
    variances,
    covariances=None,
)

Merge variances and covariances into a single variance-covariance matrix.

Parameters

Name Type Description Default
variances array-like The variances that will be used to fill the diagonal of the square matrix. required
covariances array-like or None The flattened input matrix that will be used to fill the lower and upper diagonal of the square matrix. If None, then only the variances will be used. Defaults to None. None

Returns

Name Type Description
variance_covariance numpy.ndarray The variance-covariance matrix.
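
A rough illustration of the merge: covariances go into the off-diagonal positions symmetrically, and variances go on the diagonal. The helper name `merge_sketch` and the row-major ordering of the flattened covariances are assumptions of this sketch, not documented behavior.

```python
import numpy as np


def merge_sketch(variances, covariances=None):
    """Hypothetical helper: build a symmetric variance-covariance matrix."""
    variances = np.asarray(variances, dtype=float).ravel()
    n = len(variances)
    out = np.zeros((n, n))
    if covariances is not None:
        covariances = np.asarray(covariances, dtype=float).ravel()
        # Assumed order: strictly lower triangle, row by row.
        out[np.tril_indices(n, k=-1)] = covariances
        out = out + out.T               # mirror into the upper triangle
    np.fill_diagonal(out, variances)    # variances on the main diagonal
    return out


merged = merge_sketch([1.0, 2.0, 3.0], [0.1, 0.2, 0.3])
diagonal_only = merge_sketch([1.0, 2.0])
```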

partial_correlations

factor_analyzer.factor_analyzer_utils.partial_correlations(x)

Compute partial correlations between variable pairs.

This is a Python port of the pcor() function implemented in the ppcor R package, which computes the partial correlation for each pair of variables in the given array, controlling for all other variables.

Parameters

Name Type Description Default
x array-like An array containing the feature values. required

Returns

Name Type Description
pcor numpy.ndarray An array containing the partial correlations of each pair of variables in the given array, controlling for all other variables.
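
One standard way to obtain such partial correlations is from the inverse covariance (precision) matrix P, using pcor_ij = -P_ij / sqrt(P_ii P_jj). The sketch below follows that formula with a pseudo-inverse; the helper name `partial_corr_sketch` is hypothetical, and the sketch is not claimed to match the library's code line for line.

```python
import numpy as np


def partial_corr_sketch(x):
    """Hypothetical helper: partial correlations from the precision matrix."""
    x = np.asarray(x, dtype=float)
    precision = np.linalg.pinv(np.cov(x, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)  # each variable is perfectly correlated with itself
    return pcor


rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
x[:, 2] += x[:, 0]               # make the third variable depend on the first

pcor = partial_corr_sketch(x)
```

With only two variables there is nothing to control for, so the partial correlation reduces to the ordinary correlation.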

smc

factor_analyzer.factor_analyzer_utils.smc(corr_mtx, sort=False)

Calculate the squared multiple correlations.

This is equivalent to regressing each variable on all others and calculating the r-squared values.

Parameters

Name Type Description Default
corr_mtx array-like The correlation matrix used to calculate SMC. required
sort bool Whether to sort the values for SMC before returning. Defaults to False. False

Returns

Name Type Description
smc numpy.ndarray The squared multiple correlations matrix.
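
The regression equivalence can be checked numerically with the identity SMC_i = 1 - 1 / (R⁻¹)_ii, where R is the correlation matrix. In the sketch below, `smc_sketch` is a hypothetical helper built on that identity, and the first variable's value is compared with an explicit least-squares R-squared.

```python
import numpy as np


def smc_sketch(corr_mtx):
    """Hypothetical helper: squared multiple correlations from a correlation matrix."""
    corr_inv = np.linalg.inv(np.asarray(corr_mtx, dtype=float))
    return 1.0 - 1.0 / np.diag(corr_inv)


rng = np.random.default_rng(42)
x = rng.normal(size=(500, 3))
x[:, 0] += 0.8 * x[:, 1]         # give the first variable some shared variance

smc_values = smc_sketch(np.corrcoef(x, rowvar=False))

# Cross-check: regress the (centered) first variable on the other two.
y = x[:, 0] - x[:, 0].mean()
others = x[:, 1:] - x[:, 1:].mean(axis=0)
coef, *_ = np.linalg.lstsq(others, y, rcond=None)
r_squared = 1.0 - np.sum((y - others @ coef) ** 2) / np.sum(y ** 2)
```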

unique_elements

factor_analyzer.factor_analyzer_utils.unique_elements(seq)

Get first unique instance of every list element, while maintaining order.

Parameters

Name Type Description Default
seq list-like The list of elements. required

Returns

Name Type Description
seq list The updated list of elements.
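
Order-preserving de-duplication of this kind is commonly written with a set of already-seen items. A minimal sketch, with the hypothetical name `unique_in_order` and the assumption that the elements are hashable:

```python
def unique_in_order(seq):
    """Hypothetical helper: keep the first occurrence of each element, in order."""
    seen = set()
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)       # remember the item so later repeats are skipped
            result.append(item)
    return result


deduplicated = unique_in_order(["a", "b", "a", "c", "b"])
```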