Spread data frame with non-unique values in some columns

This is the data I'm working with:

> data
         Segment Product               Value           Key
1   non-domestic      S1  517.50760307564053 Actuals Sales
2   non-domestic      S2  1235.3088913918129 Actuals Sales
3   non-domestic      S3  2141.6841816176966 Actuals Sales
4       domestic      S1 -958.38836859580044 Actuals Sales
5       domestic      S2 -1129.5593769492507 Actuals Sales
6       domestic      S3 -137.68477107274975 Actuals Sales
7  non-domestic       S1 -296.07559218703756 Quarter Sales
8  non-domestic       S2  1092.0390648120747 Quarter Sales
9  non-domestic       S3  1156.2866848179935 Quarter Sales
10     domestic       S1 -1975.0222255105061 Quarter Sales
11     domestic       S2 -2549.8125184965966 Quarter Sales
12     domestic       S3 -2608.2434152116011 Quarter Sales

I'm trying to spread it into a table with 6 rows and 4 columns (Segment, Product, Actuals Sales, Quarter Sales) and no missing values:

spread(data=data,key=Key,value=Value)

Unfortunately, this is what I get. I understand this is because of non-unique values in the columns Segment and Product.

         Segment Product       Actuals Sales       Quarter Sales
1       domestic      S1 -958.38836859580044                <NA>
2       domestic      S2 -1129.5593769492507                <NA>
3       domestic      S3 -137.68477107274975                <NA>
4      domestic       S1                <NA> -1975.0222255105061
5      domestic       S2                <NA> -2549.8125184965966
6      domestic       S3                <NA> -2608.2434152116011
7   non-domestic      S1  517.50760307564053                <NA>
8   non-domestic      S2  1235.3088913918129                <NA>
9   non-domestic      S3  2141.6841816176966                <NA>
10 non-domestic       S1                <NA> -296.07559218703756
11 non-domestic       S2                <NA>  1092.0390648120747
12 non-domestic       S3                <NA>  1156.2866848179935

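Could you help me? How can I remove the missing values and create a table in which the values in the first two columns are not duplicated?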

Here is a reproducible example:

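> dput(data)
structure(list(Segment = c("non-domestic", "non-domestic", "non-domestic",
"domestic", "domestic", "domestic", "non-domestic ", "non-domestic ",
"non-domestic ", "domestic ", "domestic ", "domestic "), Product = c("S1",
"S2", "S3", "S1", "S2", "S3", "S1", "S2", "S3", "S1", "S2", "S3"
), Value = c("517.50760307564053", "1235.3088913918129", "2141.6841816176966",
"-958.38836859580044", "-1129.5593769492507", "-137.68477107274975",
"-296.07559218703756", "1092.0390648120747", "1156.2866848179935",
"-1975.0222255105061", "-2549.8125184965966", "-2608.2434152116011"
), Key = c("Actuals Sales", "Actuals Sales", "Actuals Sales",
"Actuals Sales", "Actuals Sales", "Actuals Sales", "Quarter Sales",
"Quarter Sales", "Quarter Sales", "Quarter Sales", "Quarter Sales",
"Quarter Sales")), .Names = c("Segment", "Product", "Value", "Key"
), row.names = c(NA, -12L), class = "data.frame")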

Solution

Remove the unwanted whitespace (trimws()) and cast to wide:

library(data.table)
# `data` is the data frame from the question; setDT() converts it in place
dcast(setDT(data), trimws(Segment) + Product ~ Key, value.var = "Value", fill = NA)
#         Segment Product       Actuals Sales       Quarter Sales
# 1:     domestic      S1 -958.38836859580044 -1975.0222255105061
# 2:     domestic      S2 -1129.5593769492507 -2549.8125184965966
# 3:     domestic      S3 -137.68477107274975 -2608.2434152116011
# 4: non-domestic      S1  517.50760307564053 -296.07559218703756
# 5: non-domestic      S2  1235.3088913918129  1092.0390648120747
# 6: non-domestic      S3  2141.6841816176966  1156.2866848179935
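
A quick way to confirm that stray whitespace is the culprit is to compare the distinct Segment values before and after trimming. The sketch below uses a shortened stand-in vector (`segment`, a name chosen for illustration) with the same four spellings that appear in the question's data; it is not part of the original answer.

```r
# Stand-in for data$Segment: the last two values carry a trailing space
segment <- c("non-domestic", "domestic", "non-domestic ", "domestic ")

# Values that change when trimmed, i.e. carry leading/trailing whitespace
padded <- segment[segment != trimws(segment)]

# Number of distinct groups before and after trimming
n_before <- length(unique(segment))
n_after  <- length(unique(trimws(segment)))

print(padded)
print(c(before = n_before, after = n_after))
```

If `padded` is not empty, those rows will not line up with their trimmed counterparts when the data is cast to wide.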

A base R option using reshape():

reshape(
  transform(data,Segment = trimws(Segment)),direction = "wide",idvar = c("Segment","Product"),timevar = "Key"
)
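
By default reshape() prefixes the widened columns with the name of the value column, so the result has columns named `Value.Actuals Sales` and `Value.Quarter Sales`. A minimal sketch of stripping that prefix, using a small made-up data frame (`df`, with placeholder values) in the question's layout:

```r
# Hypothetical stand-in for the question's data: same columns, placeholder values
df <- data.frame(
  Segment = c("domestic", "domestic", "domestic ", "domestic "),
  Product = c("S1", "S2", "S1", "S2"),
  Value   = c("1", "2", "3", "4"),
  Key     = c("Actuals Sales", "Actuals Sales", "Quarter Sales", "Quarter Sales"),
  stringsAsFactors = FALSE
)

# Trim Segment, then go from long to wide as in the answer above
wide <- reshape(
  transform(df, Segment = trimws(Segment)),
  direction = "wide",
  idvar = c("Segment", "Product"),
  timevar = "Key"
)

# Drop the "Value." prefix that reshape() adds to the new columns
names(wide) <- sub("^Value\\.", "", names(wide))
print(wide)
```

Renaming is optional; it only makes the column names match the ones requested in the question.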

Your sample data actually contains some stray whitespace. Once it is removed, pivot_wider() with its id_cols argument works like a charm:

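data <- structure(list(Segment = c("non-domestic", "non-domestic", "non-domestic",
                                   "domestic", "domestic", "domestic",
                                   "non-domestic", "non-domestic", "non-domestic",
                                   "domestic", "domestic", "domestic"),
                       Product = c("S1", "S2", "S3", "S1", "S2", "S3",
                                   "S1", "S2", "S3", "S1", "S2", "S3"),
                       Value = c("517.50760307564053", "1235.3088913918129", "2141.6841816176966",
                                 "-958.38836859580044", "-1129.5593769492507", "-137.68477107274975",
                                 "-296.07559218703756", "1092.0390648120747", "1156.2866848179935",
                                 "-1975.0222255105061", "-2549.8125184965966", "-2608.2434152116011"),
                       Key = c("Actuals Sales", "Actuals Sales", "Actuals Sales",
                               "Actuals Sales", "Actuals Sales", "Actuals Sales",
                               "Quarter Sales", "Quarter Sales", "Quarter Sales",
                               "Quarter Sales", "Quarter Sales", "Quarter Sales")),
                  .Names = c("Segment", "Product", "Value", "Key"),
                  row.names = c(NA, -12L), class = "data.frame")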

library(tidyr)

data %>% pivot_wider(names_from = Key,values_from = Value,id_cols = c(Segment,Product))

#> # A tibble: 6 x 4
#>   Segment      Product `Actuals Sales`     `Quarter Sales`    
#>   <chr>        <chr>   <chr>               <chr>              
#> 1 non-domestic S1      517.50760307564053  -296.07559218703756
#> 2 non-domestic S2      1235.3088913918129  1092.0390648120747 
#> 3 non-domestic S3      2141.6841816176966  1156.2866848179935 
#> 4 domestic     S1      -958.38836859580044 -1975.0222255105061
#> 5 domestic     S2      -1129.5593769492507 -2549.8125184965966
#> 6 domestic     S3      -137.68477107274975 -2608.2434152116011

However, if your actual data also contains whitespace, you can use stringr::str_trim() before pivoting:

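data <- structure(list(Segment = c("non-domestic", "non-domestic", "non-domestic",
                                   "domestic", "domestic", "domestic",
                                   "non-domestic ", "non-domestic ", "non-domestic ",
                                   "domestic ", "domestic ", "domestic "),
                       Product = c("S1", "S2", "S3", "S1", "S2", "S3",
                                   "S1", "S2", "S3", "S1", "S2", "S3"),
                       Value = c("517.50760307564053", "1235.3088913918129", "2141.6841816176966",
                                 "-958.38836859580044", "-1129.5593769492507", "-137.68477107274975",
                                 "-296.07559218703756", "1092.0390648120747", "1156.2866848179935",
                                 "-1975.0222255105061", "-2549.8125184965966", "-2608.2434152116011"),
                       Key = c("Actuals Sales", "Actuals Sales", "Actuals Sales",
                               "Actuals Sales", "Actuals Sales", "Actuals Sales",
                               "Quarter Sales", "Quarter Sales", "Quarter Sales",
                               "Quarter Sales", "Quarter Sales", "Quarter Sales")),
                  .Names = c("Segment", "Product", "Value", "Key"),
                  row.names = c(NA, -12L), class = "data.frame")
library(tidyverse)
data %>% mutate(Segment = str_trim(Segment)) %>%
  pivot_wider(names_from = Key, values_from = Value, id_cols = c(Segment, Product))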
#> # A tibble: 6 x 4
#>   Segment      Product `Actuals Sales`     `Quarter Sales`    
#>   <chr>        <chr>   <chr>               <chr>              
#> 1 non-domestic S1      517.50760307564053  -296.07559218703756
#> 2 non-domestic S2      1235.3088913918129  1092.0390648120747 
#> 3 non-domestic S3      2141.6841816176966  1156.2866848179935 
#> 4 domestic     S1      -958.38836859580044 -1975.0222255105061
#> 5 domestic     S2      -1129.5593769492507 -2549.8125184965966
#> 6 domestic     S3      -137.68477107274975 -2608.2434152116011
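
The `<chr>` headers in the output show that the sales columns are still character strings, because Value is stored as text in the sample data. If numeric columns are wanted, one possible follow-up step is to convert them after pivoting. The sketch below uses base R on a two-row excerpt of the result above (`wide` and `sales_cols` are names chosen for illustration):

```r
# Two-row excerpt of the pivoted result, with the sales columns still as text
wide <- data.frame(
  Segment = c("non-domestic", "domestic"),
  Product = c("S1", "S1"),
  `Actuals Sales` = c("517.50760307564053", "-958.38836859580044"),
  `Quarter Sales` = c("-296.07559218703756", "-1975.0222255105061"),
  check.names = FALSE,       # keep the spaces in the column names
  stringsAsFactors = FALSE
)

# Convert only the sales columns from character to numeric
sales_cols <- c("Actuals Sales", "Quarter Sales")
wide[sales_cols] <- lapply(wide[sales_cols], as.numeric)

str(wide)
```

Converting Value before reshaping would work just as well; doing it afterwards simply keeps the pivoting step identical to the answer.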

Created on 2021-06-11 by the reprex package (v2.0.0)


I would do this with the data.table package: build two tables, then merge them.

Hope this code helps.

library(data.table)

#"test" is your data frame input
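test <- data.table(test)

# Strip stray whitespace so the merge keys match across both subsets
test[, Segment := trimws(Segment)]

a <- test[Key == "Actuals Sales", .(Segment = Segment, Product = Product, ActualsSales = Value)]
b <- test[Key == "Quarter Sales", .(Segment = Segment, Product = Product, QuarterSales = Value)]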

output <- merge(a,b,by=c("Segment","Product"))
print(output)
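
# Positional approach: assumes df holds all "Actuals Sales" rows first,
# followed by the "Quarter Sales" rows in the same Segment/Product order
qs <- df$Value[df$Key == 'Quarter Sales']
as <- df$Value[df$Key == 'Actuals Sales']
df$QS <- c(qs, rep(NA, length(qs)))
df$AS <- c(as, rep(NA, length(as)))
df$Key <- NULL

# Keep only the rows where both new columns are filled
df <- df[complete.cases(df), ]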
