factor_analyzer.factor_analyzer_utils

Utility functions, used primarily by the confirmatory factor analysis module.

Functions

Name	Description
apply_impute_nan	Apply a function to impute np.nan values with the mean or the median.
commutation_matrix	Calculate the commutation matrix.
corr	Calculate the correlation matrix.
cov	Calculate the covariance matrix.
covariance_to_correlation	Compute cross-correlations from the given covariance matrix.
duplication_matrix	Calculate the duplication matrix.
duplication_matrix_pre_post	Transform given input symmetric matrix using pre-post duplication.
fill_lower_diag	Fill the lower diagonal of a square matrix, given a 1-D input array.
get_first_idxs_from_values	Get the indexes for a given value.
get_free_parameter_idxs	Get the free parameter indices from the flattened matrix.
get_symmetric_lower_idxs	Get the indices for the lower triangle of a symmetric matrix.
get_symmetric_upper_idxs	Get the indices for the upper triangle of a symmetric matrix.
impute_values	Impute np.nan values with the mean or median, or drop the containing rows.
inv_chol	Calculate matrix inverse using Cholesky decomposition.
merge_variance_covariance	Merge variances and covariances into a single variance-covariance matrix.
partial_correlations	Compute partial correlations between variable pairs.
smc	Calculate the squared multiple correlations.
unique_elements	Get first unique instance of every list element, while maintaining order.

apply_impute_nan

factor_analyzer.factor_analyzer_utils.apply_impute_nan(x, how='mean')

Apply a function to impute np.nan values with the mean or the median.

Parameters

Name	Type	Description	Default
x	array-like	The 1-D array to impute.	required
how	str	Whether to impute the ‘mean’ or ‘median’. Defaults to ‘mean’.	`'mean'`

Returns

Name	Type	Description
x	numpy.ndarray	The array, with the missing values imputed.
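The behaviour described above can be illustrated with a minimal NumPy sketch. The helper name `impute_1d` is hypothetical and is not part of factor_analyzer; it only shows the idea of replacing np.nan entries with the mean or median of the remaining values.

```python
import numpy as np

def impute_1d(x, how="mean"):
    """Hypothetical helper: fill NaNs in a 1-D array with its mean or median."""
    x = np.asarray(x, dtype=float).copy()
    # nanmean / nanmedian ignore the missing entries when computing the fill value
    fill = np.nanmean(x) if how == "mean" else np.nanmedian(x)
    x[np.isnan(x)] = fill
    return x

values = np.array([1.0, np.nan, 3.0, 8.0])
imputed_mean = impute_1d(values, how="mean")      # NaN replaced by 4.0
imputed_median = impute_1d(values, how="median")  # NaN replaced by 3.0
```

The sketch copies its input, so the original array keeps its missing values.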

commutation_matrix

factor_analyzer.factor_analyzer_utils.commutation_matrix(p, q)

Calculate the commutation matrix.

This matrix transforms the vectorized form of a p × q matrix into the vectorized form of its transpose.

Parameters

Name	Type	Description	Default
p	int	The number of rows.	required
q	int	The number of columns.	required

Returns

Name	Type	Description
commutation_matrix	numpy.ndarray	The commutation matrix.

References

https://en.wikipedia.org/wiki/Commutation_matrix
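As an illustration of the defining property K vec(A) = vec(Aᵀ), the following sketch builds a commutation matrix directly from that definition. The function `commutation_sketch` is a hypothetical stand-in, not the library's implementation, and it assumes column-major (column-stacking) vectorization.

```python
import numpy as np

def commutation_sketch(p, q):
    """Hypothetical helper: K such that K @ vec(A) == vec(A.T) for a p x q matrix A."""
    k = np.zeros((p * q, p * q))
    for i in range(p):
        for j in range(q):
            # A[i, j] sits at index j*p + i in vec(A) (column-major)
            # and at index i*q + j in vec(A.T).
            k[i * q + j, j * p + i] = 1.0
    return k

a = np.arange(6.0).reshape(2, 3)
k = commutation_sketch(2, 3)
vec_a = a.flatten(order="F")     # column-stacked vec(A)
vec_at = a.T.flatten(order="F")  # column-stacked vec(A.T)
```

Here `k @ vec_a` reproduces `vec_at`, and `k` is a permutation matrix of size pq × pq.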

corr

factor_analyzer.factor_analyzer_utils.corr(x)

Calculate the correlation matrix.

Parameters

Name	Type	Description	Default
x	array-like	A 1-D or 2-D array containing multiple variables and observations. Each column of x represents a variable, and each row a single observation of all those variables.	required

Returns

Name	Type	Description
r	numpy.ndarray	The correlation matrix of the variables.
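The column-as-variable layout can be checked with plain NumPy. This sketch uses `np.corrcoef` with `rowvar=False` as a stand-in for the same convention; whether the library calls it internally is not assumed here.

```python
import numpy as np

# Four observations (rows) of two variables (columns).
x = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
])

# rowvar=False tells NumPy that each column is a variable.
r = np.corrcoef(x, rowvar=False)  # 2 x 2 matrix, off-diagonal 0.6
```

The result has one row and one column per variable, with ones on the diagonal.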

cov

factor_analyzer.factor_analyzer_utils.cov(x, ddof=0)

Calculate the covariance matrix.

Parameters

Name	Type	Description	Default
x	array-like	A 1-D or 2-D array containing multiple variables and observations. Each column of x represents a variable, and each row a single observation of all those variables.	required
ddof	int	Delta degrees of freedom. The divisor used in calculations is N - ddof, where N is the number of observations. Defaults to 0.	`0`

Returns

Name	Type	Description
r	numpy.ndarray	The covariance matrix of the variables.
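The effect of `ddof` on the divisor can be seen in a small NumPy sketch. It uses `np.cov` with `rowvar=False` purely for illustration and does not claim to mirror the library's internals.

```python
import numpy as np

# Three observations (rows) of two variables (columns).
x = np.array([
    [1.0, 2.0],
    [3.0, 6.0],
    [5.0, 10.0],
])

# ddof=0 divides by N (population covariance); ddof=1 divides by N - 1.
cov_population = np.cov(x, rowvar=False, ddof=0)  # [[8/3, 16/3], [16/3, 32/3]]
cov_sample = np.cov(x, rowvar=False, ddof=1)      # [[4, 8], [8, 16]]
```

With N = 3 observations, the two results differ by the factor N / (N - 1) = 1.5.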

covariance_to_correlation

factor_analyzer.factor_analyzer_utils.covariance_to_correlation(m)

Compute cross-correlations from the given covariance matrix.

This is a port of the R cov2cor() function.

Parameters

Name	Type	Description	Default
m	array-like	The covariance matrix.	required

Returns

Name	Type	Description
retval	numpy.ndarray	The cross-correlation matrix.

Raises

Name	Type	Description
	ValueError	If the input matrix is not square.
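The conversion amounts to dividing each covariance by the product of the two standard deviations. A minimal sketch, using the hypothetical name `cov_to_corr_sketch` (not the library function):

```python
import numpy as np

def cov_to_corr_sketch(m):
    """Hypothetical helper: scale a covariance matrix to a correlation matrix."""
    m = np.asarray(m, dtype=float)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        raise ValueError("The input matrix must be square.")
    sd = np.sqrt(np.diag(m))     # standard deviations from the diagonal
    return m / np.outer(sd, sd)  # r_ij = cov_ij / (sd_i * sd_j)

cov_mtx = np.array([[4.0, 2.0], [2.0, 9.0]])
corr_mtx = cov_to_corr_sketch(cov_mtx)  # off-diagonal: 2 / (2 * 3) = 1/3
```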

duplication_matrix

factor_analyzer.factor_analyzer_utils.duplication_matrix(n=1)

Calculate the duplication matrix.

A function to create the duplication matrix (D_n), which is the unique n² × n(n+1)/2 matrix which, for any n × n symmetric matrix A, transforms vech(A) into vec(A), as in D_n vech(A) = vec(A).

Parameters

Name	Type	Description	Default
n	int	The dimension of the n x n symmetric matrix. Defaults to 1.	`1`

Returns

Name	Type	Description
duplication_matrix	numpy.ndarray	The duplication matrix.

Raises

Name	Type	Description
	ValueError	If `n` is not a positive integer.

References

https://en.wikipedia.org/wiki/Duplication_and_elimination_matrices
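The identity D_n vech(A) = vec(A) can be verified with a short sketch built straight from the definition. The function `duplication_sketch` is hypothetical and assumes column-major vec and a vech that stacks the lower triangle column by column.

```python
import numpy as np

def duplication_sketch(n):
    """Hypothetical helper: D with D @ vech(A) == vec(A) for symmetric n x n A."""
    d = np.zeros((n * n, n * (n + 1) // 2))
    col = 0
    for j in range(n):
        for i in range(j, n):
            d[j * n + i, col] = 1.0  # position of A[i, j] in vec(A)
            d[i * n + j, col] = 1.0  # position of A[j, i] in vec(A)
            col += 1
    return d

a = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 5.0],
    [3.0, 5.0, 6.0],
])
# vech(A): lower triangle, including the diagonal, stacked column by column
vech_a = np.array([a[i, j] for j in range(3) for i in range(j, 3)])
d = duplication_sketch(3)
vec_a = a.flatten(order="F")
```

For n = 3 the matrix has shape 9 × 6, and `d @ vech_a` reproduces `vec_a`.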

duplication_matrix_pre_post

factor_analyzer.factor_analyzer_utils.duplication_matrix_pre_post(x)

Transform given input symmetric matrix using pre-post duplication.

Parameters

Name	Type	Description	Default
x	array-like	The input matrix.	required

Returns

Name	Type	Description
out	numpy.ndarray	The transformed matrix.

Raises

Name	Type	Description
	AssertionError	If `x` is not symmetric.

fill_lower_diag

factor_analyzer.factor_analyzer_utils.fill_lower_diag(x)

Fill the lower diagonal of a square matrix, given a 1-D input array.

Parameters

Name	Type	Description	Default
x	array-like	The flattened input matrix that will be used to fill the lower diagonal of the square matrix.	required

Returns

Name	Type	Description
out	numpy.ndarray	The output square matrix, with the lower diagonal filled by x.

References

[1] https://stackoverflow.com/questions/51439271/convert-1d-array-to-lower-triangular-matrix
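A sketch of the general technique follows. It assumes that "lower diagonal" means the entries strictly below the main diagonal and that they are filled row by row; both are assumptions about the convention, and `fill_strict_lower` is a hypothetical name.

```python
import numpy as np

def fill_strict_lower(x):
    """Hypothetical helper: place a 1-D array below the diagonal of a square matrix."""
    x = np.asarray(x, dtype=float).ravel()
    # k = n(n-1)/2 values fit strictly below the diagonal of an n x n matrix,
    # so n - 1 == floor(sqrt(2k)).
    n = int(np.sqrt(2 * x.size)) + 1
    out = np.zeros((n, n))
    out[np.tril_indices(n, k=-1)] = x  # row-by-row order: (1,0), (2,0), (2,1), ...
    return out

lower = fill_strict_lower([1.0, 2.0, 3.0])
# [[0, 0, 0],
#  [1, 0, 0],
#  [2, 3, 0]]
```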

get_first_idxs_from_values

factor_analyzer.factor_analyzer_utils.get_first_idxs_from_values(
    x,
    eq=1,
    use_columns=True,
)

Get the indexes for a given value.

Parameters

Name	Type	Description	Default
x	array-like	The input matrix.	required
eq	str or int	The given value to find. Defaults to 1.	`1`
use_columns	bool	Whether to get the first indexes using the columns. If `False`, then use the rows instead. Defaults to `True`.	`True`

Returns

Name	Type	Description
tuple	Tuple[List[int], List[int]]	row_idx (list): A list of row indexes; col_idx (list): A list of column indexes.

get_free_parameter_idxs

factor_analyzer.factor_analyzer_utils.get_free_parameter_idxs(x, eq=1)

Get the free parameter indices from the flattened matrix.

Parameters

Name	Type	Description	Default
x	array-like	The input matrix.	required
eq	str or int	The value that free parameters should be equal to. `np.nan` fields will be populated with this value. Defaults to 1.	`1`

Returns

Name	Type	Description
idx	numpy.ndarray	The free parameter indexes.

get_symmetric_lower_idxs

factor_analyzer.factor_analyzer_utils.get_symmetric_lower_idxs(n=1, diag=True)

Get the indices for the lower triangle of a symmetric matrix.

Parameters

Name	Type	Description	Default
n	int	The dimension of the n x n symmetric matrix. Defaults to 1.	`1`
diag	bool	Whether to include the diagonal.	`True`

Returns

Name	Type	Description
indices	numpy.ndarray	The indices for the lower triangle.

get_symmetric_upper_idxs

factor_analyzer.factor_analyzer_utils.get_symmetric_upper_idxs(n=1, diag=True)

Get the indices for the upper triangle of a symmetric matrix.

Parameters

Name	Type	Description	Default
n	int	The dimension of the n x n symmetric matrix. Defaults to 1.	`1`
diag	bool	Whether to include the diagonal.	`True`

Returns

Name	Type	Description
indices	numpy.ndarray	The indices for the upper triangle.
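The role of the `diag` flag for both triangle functions can be illustrated with NumPy's own `tril_indices` and `triu_indices`. This reference does not show the exact form of the indices the library returns (for example, flattened versus row/column pairs), so the sketch only demonstrates the concept and is not a description of the library's output.

```python
import numpy as np

n = 3
lower_with_diag = np.tril_indices(n, k=0)  # includes the diagonal: 6 entries
lower_no_diag = np.tril_indices(n, k=-1)   # strictly below the diagonal: 3 entries
upper_no_diag = np.triu_indices(n, k=1)    # strictly above the diagonal: 3 entries

sym = np.array([
    [1.0, 0.2, 0.3],
    [0.2, 1.0, 0.5],
    [0.3, 0.5, 1.0],
])
lower_values = sym[lower_no_diag]  # [0.2, 0.3, 0.5]
upper_values = sym[upper_no_diag]  # [0.2, 0.3, 0.5]
```

For an n × n matrix, a triangle holds n(n+1)/2 entries with the diagonal and n(n-1)/2 without it.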

impute_values

factor_analyzer.factor_analyzer_utils.impute_values(x, how='mean')

Impute np.nan values with the mean or median, or drop the containing rows.

Parameters

Name	Type	Description	Default
x	array-like	An array to impute.	required
how	str	Whether to impute the ‘mean’ or ‘median’. Defaults to ‘mean’.	`'mean'`

Returns

Name	Type	Description
x	numpy.ndarray	The array, with the missing values imputed or with rows dropped.
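The two strategies named in the summary, imputing and dropping rows, can be sketched in plain NumPy. The sketch assumes imputation is done per column, which is an assumption about the convention rather than something stated in this reference.

```python
import numpy as np

x = np.array([
    [1.0, 10.0],
    [np.nan, 20.0],
    [3.0, np.nan],
])

# Strategy 1: replace each NaN with the mean of its column (assumed column-wise).
col_means = np.nanmean(x, axis=0)              # [2.0, 15.0]
imputed = np.where(np.isnan(x), col_means, x)  # broadcasts the means across rows

# Strategy 2: keep only the rows that contain no NaN at all.
complete_rows = x[~np.isnan(x).any(axis=1)]    # [[1.0, 10.0]]
```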

inv_chol

factor_analyzer.factor_analyzer_utils.inv_chol(x, logdet=False)

Calculate matrix inverse using Cholesky decomposition.

Optionally, calculate the log determinant of the Cholesky.

Parameters

Name	Type	Description	Default
x	array-like	The matrix to invert.	required
logdet	bool	Whether to also calculate the log determinant, in addition to the inverse. Defaults to False.	`False`

Returns

Name	Type	Description
tuple	Tuple[np.ndarray, Optional[float]]	chol_inv (numpy.ndarray): The inverted matrix; chol_logdet (float or None): The log determinant, if logdet was True, otherwise None.
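The technique can be sketched with NumPy alone. The function `inv_chol_sketch` is hypothetical, and it assumes the log determinant refers to the input matrix, computed from the diagonal of its Cholesky factor; the input must be symmetric positive definite.

```python
import numpy as np

def inv_chol_sketch(x, logdet=False):
    """Hypothetical helper: invert a symmetric positive-definite matrix via Cholesky."""
    x = np.asarray(x, dtype=float)
    chol = np.linalg.cholesky(x)     # lower-triangular L with x = L @ L.T
    chol_inv = np.linalg.inv(chol)   # L^{-1}
    x_inv = chol_inv.T @ chol_inv    # x^{-1} = L^{-T} @ L^{-1}
    # log det(x) = 2 * sum(log(diag(L))); returned only when requested
    x_logdet = 2.0 * np.sum(np.log(np.diag(chol))) if logdet else None
    return x_inv, x_logdet

spd = np.array([[4.0, 2.0], [2.0, 3.0]])
inverse, log_det = inv_chol_sketch(spd, logdet=True)
inverse_only, no_log_det = inv_chol_sketch(spd)
```

For this matrix the determinant is 8, so `log_det` is approximately log(8), and `no_log_det` is None.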

merge_variance_covariance

factor_analyzer.factor_analyzer_utils.merge_variance_covariance(
    variances,
    covariances=None,
)

Merge variances and covariances into a single variance-covariance matrix.

Parameters

Name	Type	Description	Default
variances	array-like	The variances that will be used to fill the diagonal of the square matrix.	required
covariances	array-like or None	The flattened input matrix that will be used to fill the lower and upper diagonal of the square matrix. If None, then only the variances will be used. Defaults to `None`.	`None`

Returns

Name	Type	Description
variance_covariance	numpy.ndarray	The variance-covariance matrix.
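A minimal sketch of the merge, under the assumption that the flattened covariances fill the entries strictly below the diagonal row by row and are then mirrored above it. The name `merge_sketch` is hypothetical.

```python
import numpy as np

def merge_sketch(variances, covariances=None):
    """Hypothetical helper: build a symmetric variance-covariance matrix."""
    variances = np.asarray(variances, dtype=float).ravel()
    n = variances.size
    out = np.zeros((n, n))
    if covariances is not None:
        # assumed order: (1,0), (2,0), (2,1), ... below the diagonal
        out[np.tril_indices(n, k=-1)] = np.asarray(covariances, dtype=float).ravel()
        out = out + out.T  # mirror into the upper triangle
    np.fill_diagonal(out, variances)
    return out

merged = merge_sketch([1.0, 2.0, 3.0], [0.1, 0.2, 0.3])
# [[1.0, 0.1, 0.2],
#  [0.1, 2.0, 0.3],
#  [0.2, 0.3, 3.0]]
diagonal_only = merge_sketch([1.0, 2.0, 3.0])
```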

partial_correlations

factor_analyzer.factor_analyzer_utils.partial_correlations(x)

Compute partial correlations between variable pairs.

This is a Python port of the pcor() function implemented in the ppcor R package, which computes the partial correlation for each pair of variables in the given array, controlling for all other variables.

Parameters

Name	Type	Description	Default
x	array-like	An array containing the feature values.	required

Returns

Name	Type	Description
pcor	numpy.ndarray	An array containing the partial correlations of each pair of variables in the given array, controlling for all other variables.
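One standard way to obtain partial correlations is from the inverse of the covariance matrix (the precision matrix P), using pcor_ij = -P_ij / sqrt(P_ii P_jj). The sketch below uses that identity; `partial_corr_sketch` is a hypothetical name, and the approach is not claimed to match the library's implementation line for line.

```python
import numpy as np

def partial_corr_sketch(x):
    """Hypothetical helper: partial correlations from the precision matrix."""
    x = np.asarray(x, dtype=float)
    precision = np.linalg.pinv(np.cov(x, rowvar=False))  # columns are variables
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)  # a variable is perfectly correlated with itself
    return pcor

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))
pcor = partial_corr_sketch(data)
```

With only two variables there is nothing to control for, so the partial correlation reduces to the ordinary correlation.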

smc

factor_analyzer.factor_analyzer_utils.smc(corr_mtx, sort=False)

Calculate the squared multiple correlations.

This is equivalent to regressing each variable on all others and calculating the r-squared values.

Parameters

Name	Type	Description	Default
corr_mtx	array-like	The correlation matrix used to calculate SMC.	required
sort	bool	Whether to sort the values for SMC before returning. Defaults to False.	`False`

Returns

Name	Type	Description
smc	numpy.ndarray	The squared multiple correlations matrix.
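The regression interpretation above corresponds to the well-known identity SMC_i = 1 - 1 / (R⁻¹)_ii for an invertible correlation matrix R. A minimal sketch, with the hypothetical name `smc_sketch` and without the `sort` option:

```python
import numpy as np

def smc_sketch(corr_mtx):
    """Hypothetical helper: squared multiple correlations from a correlation matrix."""
    corr_mtx = np.asarray(corr_mtx, dtype=float)
    inv_diag = np.diag(np.linalg.inv(corr_mtx))
    return 1.0 - 1.0 / inv_diag

r = np.array([[1.0, 0.6], [0.6, 1.0]])
smc_values = smc_sketch(r)  # both equal 0.6 ** 2 = 0.36
```

With two variables, each SMC is simply the squared correlation between them.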

unique_elements

factor_analyzer.factor_analyzer_utils.unique_elements(seq)

Get first unique instance of every list element, while maintaining order.

Parameters

Name	Type	Description	Default
seq	list-like	The list of elements.	required

Returns

Name	Type	Description
seq	list	The updated list of elements.
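Order-preserving de-duplication can be sketched in a few lines of plain Python. The name `unique_in_order` is hypothetical, and the sketch assumes the elements are hashable.

```python
def unique_in_order(seq):
    """Hypothetical helper: keep the first occurrence of each element, in order."""
    seen = set()
    out = []
    for item in seq:
        if item not in seen:  # only the first occurrence is kept
            seen.add(item)
            out.append(item)
    return out

deduplicated = unique_in_order(["a", "b", "a", "c", "b"])  # ["a", "b", "c"]
```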