How to aggregate data in R with group by while keeping the non-NA values of other columns
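I was wondering if anyone could help. I have the following dataset, in which each ID is a company that has hired different numbers of people over time, so IDs are repeated. We have the address for each ID, but it was not collected for every row: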
<!DOCTYPE html>
<html>
<canvas id="gameCanvas" width="800" height="600"></canvas>
<script>
var canvas;
var canvasContext;
var ballX = 50;
var ballY = 50;
var ballSpeedX = 10;
var ballSpeedY = 4;
var paddle1Y = 250;
var paddle2Y = 250;
const PADDLE_HEIGHT = 100;
const PADDLE_WIDTH = 10;
function calculateMousePos(evt) {
var rect = canvas.getBoundingClientRect();
var root = document.documentElement;
var mouseX = evt.clientX - rect.left - root.scrollLeft;
var mouseY = evt.clientY - rect.top - root.scrollTop;
return {
x:mouseX,y:mouseY
};
}
window.onload = function() {
canvas = document.getElementById('gameCanvas');
canvasContext = canvas.getContext('2d');
var framesPerSecond = 60;
setInterval( function(){
moveEverything();
drawEverything();
},1000/framesPerSecond );
canvas.addEventListener('mousemove',function(evt){
var mousePos = calculateMousePos(evt);
paddle2Y = mousePos.y-(PADDLE_HEIGHT/2);
})
}
function ballReset() {
ballX = canvas.width/2;
ballY = canvas.height/2;
}
function moveEverything() {
ballX = ballX + ballSpeedX;
ballY = ballY + ballSpeedY;
// left edge: bounce off paddle 1, otherwise reset the ball
if(ballX < 0) {
if(ballY > paddle1Y &&
ballY < paddle1Y+PADDLE_HEIGHT){
ballSpeedX = -ballSpeedX;
} else {
ballReset();
}
}
// right edge: bounce off paddle 2, otherwise reset the ball
if(ballX > canvas.width) {
if (ballY > paddle2Y &&
ballY < paddle2Y + PADDLE_HEIGHT) {
ballSpeedX = -ballSpeedX;
} else {
ballReset();
}
}
// top and bottom walls
if(ballY < 0) {
ballSpeedY = -ballSpeedY;
}
if(ballY > canvas.height) {
ballSpeedY = -ballSpeedY;
}
}
function drawEverything() {
colorRect(0,0,canvas.width,canvas.height,'black');
colorRect(0,paddle1Y,PADDLE_WIDTH,PADDLE_HEIGHT,'white');
colorRect(canvas.width - PADDLE_WIDTH,paddle2Y,PADDLE_WIDTH,PADDLE_HEIGHT,'white');
colorCircle(ballX,ballY,10,'white');
}
function colorCircle(centerX,centerY,radius,drawColor) {
canvasContext.fillStyle = drawColor;
canvasContext.beginPath();
canvasContext.arc(centerX,centerY,radius,0,Math.PI*2,true);
canvasContext.fill();
}
function colorRect( leftX,topY,width,height,drawColor) {
canvasContext.fillStyle = drawColor;
canvasContext.fillRect(leftX,topY,width,height);
}
</script>
</html>
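I want to group by ID and add a column showing the total number of hires for each ID, plus a column showing that ID's address. When I do this, because Address has missing values, R automatically picks the first row for each ID, which may be one with a missing address. The result should instead keep the non-missing address for each ID. The rows look like this:
ID  Address   Number of hiring
1             5
2   Montreal  2
3             3
4   Helsinki  4
1   London    1
2             3
3   Dubai     5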
I am trying to do this in R with dplyr.
Solution
You can select the first non-empty Address for each ID:
library(dplyr)
df %>%
  group_by(ID) %>%
  summarise(Address = Address[Address != ''][1],
            total_hiring = sum(Number_of_hiring, na.rm = TRUE))
# ID Address total_hiring
# <int> <chr> <int>
#1 1 London 6
#2 2 Montreal 5
#3 3 Dubai 8
#4 4 Helsinki 4
Data
df <- structure(list(ID = c(1L,2L,3L,4L,1L,2L,3L),Address = c("","Montreal","","Helsinki","London","","Dubai"),Number_of_hiring = c(5L,2L,3L,4L,1L,3L,5L)),class = "data.frame",row.names = c(NA,-7L))
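Since the title asks about non-NA values, here is a minimal sketch of the same idea for the case where the missing addresses are stored as NA rather than as empty strings. That storage is an assumption, and the names df_na and result are only illustrative. The filter becomes !is.na(Address) instead of Address != '':

```r
library(dplyr)

# Same rows as in the question, but with missing addresses stored as NA
# (an assumption; the data above uses empty strings instead).
df_na <- data.frame(
  ID = c(1L, 2L, 3L, 4L, 1L, 2L, 3L),
  Address = c(NA, "Montreal", NA, "Helsinki", "London", NA, "Dubai"),
  Number_of_hiring = c(5L, 2L, 3L, 4L, 1L, 3L, 5L)
)

result <- df_na %>%
  group_by(ID) %>%
  summarise(
    # keep only the non-NA addresses, then take the first one;
    # this yields NA when an ID has no recorded address at all
    Address = Address[!is.na(Address)][1],
    total_hiring = sum(Number_of_hiring, na.rm = TRUE)
  )

print(result)
```

If an ID ever had more than one distinct address, this would silently keep the first one encountered, so it may be worth checking for that before aggregating.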