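PCA in R: what is the difference between the functions principal, prcomp and factanal?
As the title says, what is the difference between these three kinds of component analysis in R? I am mostly doing correlation analysis on a data set (12 variables; I am trying to find out how the different variables relate to one another), so I used prcomp() and got this: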
dd.pca<-prcomp(data[,c(2:13)],rank.=7,retx=TRUE,center=TRUE,scale=TRUE)
dd.pca
Standard deviations (1,..,p=12):
[1] 1.9494527 1.5660193 1.1560267 1.0437762 0.9395919 0.8778826 0.7886269 0.6458763 0.4527430
[10] 0.4023696 0.3822033 0.3402867
Rotation (n x k) = (12 x 7):
            PC1         PC2          PC3          PC4          PC5         PC6         PC7
V1  -0.38674247  0.33619538 -0.012743408  0.085100745 -0.136849321  0.14462859 -0.06125286
V2  -0.17734006 -0.04270515  0.264143192  0.074689422  0.904453106  0.25998244  0.05506560
V3  -0.44638494 -0.21800886  0.041582679 -0.004662845 -0.109609794 -0.03822321 -0.05055431
V4  -0.26084983  0.48442131 -0.095105178  0.016857435 -0.006188593  0.11718342 -0.10289103
V5  -0.12557886  0.19997473  0.019173470 -0.659806348  0.213936454 -0.56588640 -0.33722286
V6  -0.39964582 -0.32135726 -0.017956940  0.003251426 -0.099482008 -0.01274911  0.03767973
V7  -0.41644977 -0.28094245  0.040368096  0.082931594 -0.079058008 -0.03488048  0.02101225
V8  -0.31743788 -0.30156691 -0.194835860 -0.135504009  0.037934784 -0.13785366  0.13741551
V9  -0.04097898  0.11740607  0.396031199  0.614229211  0.022650174 -0.65628381 -0.02986520
V10 -0.30584797  0.44968244 -0.002163568  0.100461934 -0.082175041  0.20057857 -0.11597717
V11  0.05818889 -0.21338867  0.613081743 -0.109834369 -0.209532136  0.29021099 -0.61274513
V12 -0.05819242  0.16820293  0.588469694 -0.356154112 -0.191468168  0.02293186  0.67513759
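For reference, what prcomp() reports with centering and scaling can be reproduced from an eigen-decomposition of the correlation matrix. The following is only a minimal sketch: the poster's data set is not available, so the built-in mtcars data and the six columns picked here are assumptions used as a stand-in.

```r
# Assumption: mtcars stands in for the poster's data (not available here).
X <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]

# PCA on centered, scaled variables, i.e. on the correlation matrix.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Eigen-decomposition of the correlation matrix of the same columns.
eig <- eigen(cor(X))

# "Standard deviations" are the square roots of the eigenvalues.
sdev_match <- all.equal(pca$sdev, sqrt(eig$values), tolerance = 1e-6)

# "Rotation" holds the eigenvectors. The sign of each column is arbitrary,
# so the comparison is made on absolute values.
rot_match <- all.equal(abs(unname(pca$rotation)), abs(eig$vectors),
                       tolerance = 1e-6)

# Each column of the rotation matrix has unit length, so its entries are
# direction cosines rather than correlations with the variables.
col_lengths <- colSums(pca$rotation^2)

print(sdev_match)
print(rot_match)
print(round(col_lengths, 6))
```

So the Rotation matrix is a set of unit-length eigenvectors, which is a different scaling from the "loadings" that factor-analysis style output usually shows.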
That is what I had until I was shown the final result I am supposed to obtain, and it is very similar to the results I got with factanal():
dd.fa <- factanal(data,factors = 7)
dd.fa
Call:
factanal(x = data,factors = 7)
Uniquenesses:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11
0.005 0.645 0.005 0.151 0.441 0.512 0.569 0.005 0.088 0.005 0.399
V12
0.590
Loadings:
Factor1 Factor2 Factor3 Factor4 Factor5 Factor6 Factor7
V1 0.930 0.202 0.269
V2 0.153 0.494 -0.123 0.245
V3 0.307 0.922 0.132 0.156
V4 0.856 0.158 -0.258
V5 0.728 0.142
V6 0.681
V7 0.279 0.170 -0.154 0.541
V8 0.651 0.217 0.168 0.145 0.686
V9 0.130 0.881 0.275 0.173
V10 0.980 -0.140 -0.116
V11 -0.316 0.399 0.441 -0.376
V12 0.219 0.588
Factor1 Factor2 Factor3 Factor4 Factor5 Factor6 Factor7
SS loadings 2.779 1.907 1.013 0.916 0.743 0.652 0.574
Proportion Var 0.232 0.159 0.084 0.076 0.062 0.054 0.048
Cumulative Var 0.232 0.391 0.475 0.551 0.613 0.668 0.715
Test of the hypothesis that 7 factors are sufficient.
The chi square statistic is 5.03 on 3 degrees of freedom.
The p-value is 0.17
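As a side note on what factanal() fits: it is a maximum-likelihood factor model in which every variable gets its own unique variance (the "Uniquenesses" above), whereas a PCA with all components kept has no such term. The sketch below uses simulated data, which is purely an assumption for illustration (the variable names, loadings and seed are made up), and checks that communality plus uniqueness comes out close to 1 for each standardized variable.

```r
# Assumption: simulated data with two underlying factors, for illustration only.
set.seed(1)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
X <- cbind(
  a1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  a2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  a3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  b1 = 0.8 * f2 + rnorm(n, sd = 0.6),
  b2 = 0.7 * f2 + rnorm(n, sd = 0.7),
  b3 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

# Maximum-likelihood factor analysis; varimax rotation is the default.
fa <- factanal(X, factors = 2)

# Communality: variance of each variable shared with the common factors.
communality <- rowSums(unclass(fa$loadings)^2)

# In the fitted model, shared plus unique variance should be close to 1.
total <- communality + fa$uniquenesses
print(round(cbind(communality, uniqueness = fa$uniquenesses, total), 3))

# With all components kept, PCA reproduces each variable's variance exactly:
# there is no separate unique-variance term.
pca <- prcomp(X, center = TRUE, scale. = TRUE)
pca_loadings    <- pca$rotation %*% diag(pca$sdev)
pca_communality <- rowSums(pca_loadings^2)
print(round(pca_communality, 6))
```

This is one reason the factanal() loadings are not expected to equal the prcomp() rotation, even on the same data.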
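In fact, I have just found out that I need to use principal to get what I am supposed to get (I am trying to reproduce results that are normally produced by another program, where this analysis is called factor analysis in R):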
library(psych)
dd.p<-principal(data,nfactors=5,rotate="varimax")
print(dd.p,digits = 4)
Principal Components Analysis
Call: principal(r = data,nfactors = 5,rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
        RC1     RC2     RC3     RC4     RC5      h2      u2   com
V1   0.3183  0.8754  0.0369  0.0312  0.0142  0.8703  0.1297 1.267
V2   0.1844  0.0509  0.0251  0.0295  0.9526  0.9455  0.0545 1.084
V3   0.9074  0.2226  0.0924  0.0095  0.0726  0.8868  0.1132 1.156
V4  -0.0241  0.9122 -0.0579 -0.0881  0.0525  0.8465  0.1535 1.035
V5  -0.0095  0.3150  0.2238 -0.6962  0.1976  0.6732  0.3268 1.821
V6   0.9302  0.0428  0.0162  0.0081  0.0453  0.8694  0.1306 1.010
V7   0.9133  0.1144  0.0460  0.0988  0.0941  0.8679  0.1321 1.082
V8   0.7777 -0.0368 -0.1567 -0.2067  0.0677  0.6780  0.3220 1.248
V9  -0.0727  0.2228  0.1914  0.7253  0.2090  0.6613  0.3387 1.548
V10  0.0813  0.9263  0.0288  0.0379  0.0387  0.8684  0.1316 1.024
V11  0.0920 -0.3631  0.7126  0.1751  0.0018  0.6788  0.3212 1.668
V12 -0.0711  0.2293  0.8009 -0.1239  0.0340  0.7156  0.2844 1.235
RC1 RC2 RC3 RC4 RC5
SS loadings 3.2888 2.8581 1.2776 1.1205 1.0165
Proportion Var 0.2741 0.2382 0.1065 0.0934 0.0847
Cumulative Var 0.2741 0.5122 0.6187 0.7121 0.7968
Proportion Explained 0.3440 0.2989 0.1336 0.1172 0.1063
Cumulative Proportion 0.3440 0.6429 0.7765 0.8937 1.0000
Mean item complexity = 1.3
Test of the hypothesis that 5 components are sufficient.
The root mean square of the residuals (RMSR) is 0.0697
with the empirical chi square 2502.683 with prob < 0
Fit based upon off diagonal values = 0.9524
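One thing that can be checked from the numbers already posted: the five SS loadings from principal() sum to about 9.56, and so do the squares of the first five standard deviations from prcomp(). That suggests the rotated components span the same space as the first five principal components, only rescaled and rotated. The sketch below imitates this with base R only. It is an assumption that this matches what psych::principal does internally, and mtcars is again just a stand-in for the poster's data.

```r
# Assumption: mtcars stands in for the poster's data (not available here).
X <- mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")]
k <- 2  # number of components kept (the post uses 5 of 12)

pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Unrotated component loadings: eigenvectors scaled by the standard deviations,
# which turns direction cosines into variable-component correlations.
L <- pca$rotation[, 1:k] %*% diag(pca$sdev[1:k])

# Varimax rotation from the stats package (Kaiser normalisation is its default).
L_rot <- unclass(varimax(L)$loadings)

h2 <- rowSums(L_rot^2)   # communality of each variable
u2 <- 1 - h2             # variance left unexplained by the k components

ss_unrot <- pca$sdev[1:k]^2   # variance per unrotated component
ss_rot   <- colSums(L_rot^2)  # variance per rotated component

print(round(cbind(L_rot, h2 = h2, u2 = u2), 4))
print(round(rbind(unrotated = ss_unrot, rotated = ss_rot), 4))
```

The rotation redistributes variance among the components while leaving the total and each variable's h2 unchanged, which is consistent with the 9.56 total observed in both outputs above.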
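Now, my question is a bit odd. I know these are all methods for describing how different variables "explain" a data set, but what is the difference between the three? In other words, why do I keep getting different results from each of them, when every site I look at calls all three "PCA" without much distinction? Of course, the data set I used is the same in all three analyses.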