How do I transform a DataFrame to scale values by each independent variable?
year vm Gebied Naam stadsdeel Wijk variable value
0 2014 T94 Zuidoost Bijlmer-Oost Bijlmer Oost (E,G,K) ipsl_risicoperceptie 93.891900
1 2015 T94 Zuidoost Bijlmer-Oost Bijlmer Oost (E,K) ipsl_risicoperceptie 94.510000
2 2016 T94 Zuidoost Bijlmer-Oost Bijlmer Oost (E,K) ipsl_risicoperceptie 97.575000
3 2017 T94 Zuidoost Bijlmer-Oost Bijlmer Oost (E,K) ipsl_risicoperceptie 96.877500
4 2018 T94 Zuidoost Bijlmer-Oost Bijlmer Oost (E,K) ipsl_risicoperceptie 97.175000
5 2019 T94 Zuidoost Bijlmer-Oost Bijlmer Oost (E,K) ipsl_risicoperceptie 100.487500
6 2014 A08 Centrum Centrum-Oost Weesperbuurt/Plantage ipsl_risicoperceptie 97.394115
7 2015 A08 Centrum Centrum-Oost Weesperbuurt/Plantage ipsl_risicoperceptie 96.160000
8 2016 A08 Centrum Centrum-Oost Weesperbuurt/Plantage ipsl_risicoperceptie 97.750000
9 2017 A08 Centrum Centrum-Oost Weesperbuurt/Plantage ipsl_risicoperceptie 98.820000
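There are multiple variables, each with a different scale and a different minimum and maximum per year. I want to transform the values so that, within each year, each variable is on a 1-100 scale: the variable's minimum for that year maps to 1 and its maximum for that year maps to 100. The description of the variables is shown below.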
df.groupby(['variable','year']).describe()
count mean std min 25% 50% 75% max
variable year
HICindex 2014 94.0 92.282244 26.901022 31.602504 75.826290 91.257552 111.136273 157.578866
2015 94.0 90.381516 29.872063 16.600000 70.397500 83.555000 108.947500 169.840000
2016 97.0 84.735893 27.558587 29.480000 63.180000 84.490000 103.760000 169.600000
2017 97.0 81.702208 26.291037 22.490000 59.990000 82.510000 95.110000 159.820000
2018 97.0 84.484390 28.148936 26.330000 68.710000 78.710000 96.960000 170.170000
2019 97.0 80.629880 26.166200 26.530000 64.170000 76.340000 99.140000 167.383333
HVCIndex 2014 94.0 102.252289 29.787177 53.111797 84.784686 100.954751 114.647216 214.428036
2015 94.0 96.295904 28.732603 34.280000 78.850000 94.195000 114.662500 199.820000
2016 97.0 92.988050 29.093444 44.560000 74.410000 88.240000 110.220000 187.810000
2017 97.0 86.563471 28.395480 33.730000 69.100000 82.060000 100.410000 195.920000
2018 97.0 77.429003 29.287222 18.580000 61.050000 73.590000 88.950000 216.780000
2019 97.0 80.240825 30.354648 19.610000 61.795000 76.830000 90.700000 239.510000
Ioverlast 2014 94.0 110.555446 59.265498 27.438722 83.847542 102.407453 122.734040 472.941648
2015 94.0 107.138076 60.195996 30.640000 77.475000 98.160000 120.705000 480.100000
2016 97.0 112.180086 68.316333 34.140000 81.300000 102.480000 117.910000 547.700000
2017 97.0 113.205696 67.792895 31.080000 77.510000 102.310000 120.880000 498.190000
2018 97.0 108.790326 71.469753 36.500000 72.070000 98.550000 116.740000 537.760000
2019 97.0 113.551065 66.994786 22.630000 81.410000 106.540000 125.210000 507.910000
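As you can see above, the variables (3 of the 7 in total are shown) all have different ranges. I want to normalize the range of each variable per year so that they can all be plotted on a radar chart with the same fixed axis. Which transform/scaling function in pandas/sklearn is best suited to grouping by variable and year and scaling accordingly? Thanks in advance!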
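One possible approach, sketched here with plain pandas, is a min-max rescale applied per group via `groupby(...).transform(...)`. The column names `variable`, `year` and `value` match the long-format frame shown above. The helper name `scale_1_100`, the output column `scaled`, and the toy numbers are assumptions made for illustration and are not taken from the real data. Mapping a constant group to 1 is also an assumption, made only to avoid dividing by zero.

```python
import pandas as pd


def scale_1_100(s: pd.Series) -> pd.Series:
    """Linearly map a group's values so that min -> 1 and max -> 100."""
    lo, hi = s.min(), s.max()
    if hi == lo:
        # Assumption: a group with no spread is mapped to 1 everywhere,
        # because the min-max formula would otherwise divide by zero.
        return pd.Series(1.0, index=s.index)
    return 1 + 99 * (s - lo) / (hi - lo)


# Toy long-format data (made-up values, same column names as the question).
df = pd.DataFrame(
    {
        "year": [2014, 2014, 2014, 2015, 2015, 2015],
        "variable": ["HICindex"] * 6,
        "value": [10.0, 20.0, 30.0, 0.0, 2.5, 10.0],
    }
)

# Each (variable, year) group is rescaled independently of the others.
df["scaled"] = df.groupby(["variable", "year"])["value"].transform(scale_1_100)
print(df)
```

Because `transform` returns a result aligned with the original index, the scaled column can be assigned straight back to the frame. scikit-learn's `MinMaxScaler(feature_range=(1, 100))` performs the same linear mapping, but it would have to be fitted separately for each group, so the pandas version may be the simpler fit here.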