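How to find linear regressions between groups using data.table?
I am trying to find the linear regression between every available pair of groups in the following dataset.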
library(data.table)
dt <- data.table(time = c(rep(rep(1:100,times = 1),4),rep(1:30,times = 1)),group = c(rep(c("a","b","c","d"),each = 100),rep("e",30)),value = rnorm(430))
dt[]
time group value
1: 1 a 0.1625954
2: 2 a -1.2288462
3: 3 a -0.1628570
4: 4 a 1.0597886
5: 5 a -1.1828334
---
426: 26 e -1.3762654
427: 27 e 0.3761436
428: 28 e -1.6982330
429: 29 e 0.1940263
430: 30 e -0.4631258
The output should look something like this:
group1 group2 regression
a b 1.2
a c 0.3
b c 0.5
d a 4.3
...
I am looking for a solution that uses only the data.table library.
Solution
With the new data, we can split the data by 'group' into a list. Then use combn on the names of the list to form the pairwise combinations, extract the two list elements (s1, s2), and check whether they have any 'time' values in common (intersect). Use an if condition based on length: if there are common elements, join the two groups on 'time', apply lm to the respective 'value' columns, and create a data.table with the extracted coef and the group names; otherwise return NA for that pair. Finally, rbind the list elements with rbindlist.
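library(data.table)

# Split the data into a named list with one data.table per group
lst1 <- split(dt, dt$group)

rbindlist(combn(names(lst1), 2, FUN = function(x) {
  s1 <- lst1[[x[1]]]
  s2 <- lst1[[x[2]]]
  # Time points present in both groups
  i1 <- intersect(s1$time, s2$time)
  if (length(i1) > 0) {
    # Join on 'time', then regress the second group's values (i.value)
    # on the first group's values (value) and keep the slope
    na.omit(s1[s2, on = .(time)][, .(group1 = first(s1$group),
                                     group2 = first(s2$group),
                                     regression = coef(lm(i.value ~ value))[[2]])])
  } else {
    # No shared time points: return NA with the same columns so rbindlist can stack the rows
    data.table(group1 = first(s1$group), group2 = first(s2$group), regression = NA_real_)
  }
}, simplify = FALSE))

Output:
    group1 group2  regression
 1:      a      b  0.03033996
 2:      a      c  0.06391242
 3:      a      d -0.09138112
 4:      a      e -0.27738183
 5:      b      c  0.05663270
 6:      b      d  0.05481604
 7:      b      e  0.27789495
 8:      c      d -0.13987978
 9:      c      e  0.16388299
10:      d      e  0.12380720

If we want the complete set of combinations, use expand.grid or CJ (from data.table).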
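The code for the complete-combinations variant is not preserved in this page, so the following is only a minimal sketch of how it might look with CJ, not the answerer's original code. It assumes that "complete" means every ordered pair of distinct groups (the slope of b on a generally differs from the slope of a on b). The set.seed call and the names pairs, m and res are added here for illustration.

```r
library(data.table)

set.seed(1)  # assumption: seed added so the sketch is reproducible
dt <- data.table(
  time  = c(rep(1:100, 4), 1:30),
  group = c(rep(c("a", "b", "c", "d"), each = 100), rep("e", 30)),
  value = rnorm(430)
)

# One data.table per group, named by group
lst1 <- split(dt, by = "group")

# All ordered pairs of distinct groups (hypothetical helper table)
pairs <- CJ(group1 = names(lst1), group2 = names(lst1))[group1 != group2]

res <- pairs[, {
  s1 <- lst1[[group1]]
  s2 <- lst1[[group2]]
  # Inner join on 'time' keeps only the time points shared by both groups
  m <- s1[s2, on = .(time), nomatch = NULL]
  # Slope of group2's values (i.value) on group1's values (value);
  # NA when there are too few shared time points to fit a line
  .(regression = if (nrow(m) > 1) coef(lm(i.value ~ value, data = m))[[2]] else NA_real_)
}, by = .(group1, group2)]

res
```

With five groups this yields 20 rows, one per ordered pair. expand.grid(names(lst1), names(lst1)) would produce the same pairs as a data.frame, which would then need converting with as.data.table before the grouped step.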