Factor analysis: getting readable output from FactorAnalyzer.loadings_

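I am getting familiar with Python's FactorAnalyzer for factor analysis.

I need help printing the factors with labels, along with the factor coefficients and significance parameters.

I use loadings_, but its output is very hard to read.

Data source (downloaded to my laptop): https://vincentarelbundock.github.io/Rdatasets/datasets.html (the BFI dataset, based on a personality assessment).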

import pandas as pd
from factor_analyzer import FactorAnalyzer
import matplotlib.pyplot as plt


df = pd.read_csv("../Downloads/bfi.csv")
df.head()

# Drop the demographic columns, which are not personality items
df.drop(['gender', 'education', 'age'], axis=1, inplace=True)

# Drop rows that contain missing values
df.dropna(inplace=True)

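Based on the eigenvalues, only 6 factors are significant (eigenvalue > 1.0). Because I cannot get readable output, I cannot tell what these factors are.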
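The eigenvalue check itself is not shown in the question. A minimal sketch of the "eigenvalue greater than 1" rule, using only NumPy and pandas, might look like the following. The helper name `kaiser_count` and the toy data are hypothetical and are not from the original post; with factor_analyzer, the eigenvalues would normally come from `fa.get_eigenvalues()` after fitting.

```python
import numpy as np
import pandas as pd


def kaiser_count(data: pd.DataFrame) -> int:
    """Count eigenvalues of the correlation matrix that exceed 1.0."""
    corr = data.corr().to_numpy()
    # The correlation matrix is symmetric, so eigvalsh returns real eigenvalues
    eigenvalues = np.linalg.eigvalsh(corr)
    return int((eigenvalues > 1.0).sum())


# Hypothetical toy data: two pairs of strongly correlated columns,
# so two underlying factors are expected
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 2))
toy = pd.DataFrame({
    "x1": base[:, 0],
    "x2": base[:, 0] + 0.1 * rng.normal(size=500),
    "y1": base[:, 1],
    "y2": base[:, 1] + 0.1 * rng.normal(size=500),
})

n_keep = kaiser_count(toy)
print(n_keep)
```

On the real data, `kaiser_count(df)` would be called on the cleaned DataFrame. This rule is only one heuristic for choosing the number of factors.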

fa = FactorAnalyzer(method='minres', n_factors=6, rotation="varimax")
fa.fit(df)
print(fa.loadings_)

Output:

array([[-0.02290301, -0.03247244, 0.03316871, -0.03809335, 0.00379506, 0.10374847],
       [ 0.09939617, 0.06047379, 0.02669442, -0.53078469, -0.12030937, 0.16363839],
       [ 0.03176731, 0.259875, 0.1402256, 0.64656946, 0.05577021, -0.09704963],
       [-0.00525556, 0.40884857, 0.10953353, 0.5870038, 0.01618433, 0.03914857],
       [-0.07926603, 0.25534237, 0.22930809, 0.39176034, -0.13629257, 0.03340065],
       [-0.14364476, 0.4910488, 0.0856494, 0.45108989, 0.00911123, 0.10588827],
       [ 0.00562295, 0.12364715, 0.54015018, 0.00422137, 0.18345833, 0.13879815],
       [ 0.08435816, 0.10650466, 0.65249593, 0.05653766, 0.0792028, 0.20858043],
       [-0.03394649, 0.0497959, 0.54587749, 0.10028627, -0.0123717, 0.05447959],
       [ 0.23161662, 0.0089893, -0.67278538, -0.08998026, -0.15345088, 0.226977],
       [ 0.29340234, -0.1436436, -0.55970426, -0.04706994, 0.0256143, 0.09577898],
       [ 0.05310218, -0.52147723, 0.02649196, -0.09054497, -0.05928098, 0.33201867],
       [ 0.26318891, -0.62292324, -0.11075758, -0.07455019, -0.03044005, 0.29120361],
       [ 0.00119, 0.63056485, 0.07741736, 0.15388275, 0.21421252, 0.09215221],
       [-0.14723885, 0.68281775, 0.10390412, 0.2065131, -0.13327166, -0.03773659],
       [ 0.02197833, 0.50438366, 0.31238313, 0.04844782, 0.18521834, -0.11350852],
       [ 0.79096653, 0.033469, -0.04001445, -0.19151604, -0.07737848, -0.16815916],
       [ 0.77708495, -0.01765921, -0.02173671, -0.15558624, 0.00764293, -0.19939099],
       [ 0.72818732, -0.03614561, -0.0674602, -0.02313414, -0.01532483, 0.02192578],
       [ 0.59778566, -0.2770728, -0.1837043, 0.01861508, 0.06451108, 0.18288879],
       [ 0.53479082, -0.11293748, -0.04097176, 0.09644977, -0.1645811, 0.11185692],
       [-0.00891931, 0.3023176, 0.10733112, -0.00134206, 0.46434464, 0.16741622],
       [ 0.16146455, 0.02029611, -0.10051682, 0.04691938, -0.50064301, 0.08416413],
       [ 0.0196248, 0.40211954, 0.07042896, 0.06363394, 0.54784203, 0.12081641],
       [ 0.22872114, -0.0926477, -0.03000306, 0.14801512, 0.34628284, 0.20228616],
       [ 0.06801995, 0.00091956, -0.06223948, -0.05313796, -0.57993276, 0.10662123]])

This is my problem: I cannot figure out how to make the output usable, something like this:

       Factor 1   Factor 2   Factor 3   Factor 4   Factor 5   Factor 6

A1     value 1    value 2    value 3    value 4    value 5    value 6

A1 is the first variable. There are more than 26 variables in total (before dropping columns), and 2700 records.
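One way to get a table of that shape is to wrap the loadings array in a pandas DataFrame, with the variable names as the index and factor labels as the columns. The sketch below hard-codes the first three rows of the array printed above as a stand-in for `fa.loadings_`. The labels in `variables` are hypothetical: in the real script they would be `df.columns`. The near-zero first row suggests (an assumption, not confirmed by the post) that the CSV's row-number column was read in as a variable, so it is labelled `row_index` here. The 0.3 cutoff is also only an illustrative choice.

```python
import numpy as np
import pandas as pd

# First three rows of the printed loadings, standing in for fa.loadings_
loadings = np.array([
    [-0.02290301, -0.03247244, 0.03316871, -0.03809335, 0.00379506, 0.10374847],
    [0.09939617, 0.06047379, 0.02669442, -0.53078469, -0.12030937, 0.16363839],
    [0.03176731, 0.259875, 0.1402256, 0.64656946, 0.05577021, -0.09704963],
])

# Hypothetical labels; with the real data use index=df.columns instead
variables = ["row_index", "A1", "A2"]
factor_names = [f"Factor {i + 1}" for i in range(loadings.shape[1])]

# Rows are variables, columns are factors
table = pd.DataFrame(loadings, index=variables, columns=factor_names)
print(table.round(3))

# Print only the factors of interest, selected by label
selected = table[["Factor 1", "Factor 4"]]
print(selected.round(3))

# Blank out weak loadings (illustrative cutoff of 0.3) to see the structure
strong = table.where(table.abs() >= 0.3)
print(strong.round(3))
```

With the fitted model, the same table would be built as `pd.DataFrame(fa.loadings_, index=df.columns, columns=factor_names)`.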

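  1. How can I print the output of loadings_ in a usable form?

  2. How can I print only the factors, with labels of my choosing (6 in my case)?

  3. I have the same output problem with fa.get_factor_variance():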

    fa.get_factor_variance()

(array([2.76721162, 2.72814014, 2.07554605, 1.6108362, 1.46335442, 0.62155903]),
 array([0.10643122, 0.10492847, 0.07982869, 0.06195524, 0.05628286, 0.02390612]),
 array([0.10643122, 0.21135968, 0.29118838, 0.35314362, 0.40942648, 0.43333259]))
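The tuple returned by `get_factor_variance()` holds three arrays: the variance of each factor, the proportion of total variance, and the cumulative proportion. A possible way to make it readable, sketched here with the printed values hard-coded in place of the real call, is to stack the three arrays into a labelled DataFrame. The row labels are illustrative choices, not names defined by the library.

```python
import numpy as np
import pandas as pd

# The three printed arrays, standing in for fa.get_factor_variance()
variance, proportional, cumulative = (
    np.array([2.76721162, 2.72814014, 2.07554605, 1.6108362, 1.46335442, 0.62155903]),
    np.array([0.10643122, 0.10492847, 0.07982869, 0.06195524, 0.05628286, 0.02390612]),
    np.array([0.10643122, 0.21135968, 0.29118838, 0.35314362, 0.40942648, 0.43333259]),
)

factor_names = [f"Factor {i + 1}" for i in range(len(variance))]

# One row per statistic, one column per factor (row labels are illustrative)
summary = pd.DataFrame(
    [variance, proportional, cumulative],
    index=["Variance", "Proportion of variance", "Cumulative proportion"],
    columns=factor_names,
)
print(summary.round(3))
```

Read this way, the six factors together account for roughly 43% of the total variance, and the sixth contributes only about 2%.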

Thanks for your help!
