
Fitting a linear model on log-transformed data where n% of the data lies below the line


I want to fit a model to data that are assumed to follow a relationship of the form y = alpha*x^beta. My data look like this:

[image: plot of the data]

and can be reproduced with this dput output:

structure(list(y = c(15.8999997973442,34.4999990463257,60.0000017285347,234.099998548627,15.3000003099442,89.8999990224838,30,28.9999990463257,370.600006774068,80.2999995946884,91.3000009059906,39.9000015258789,71.0999984741211,6.20000004768372,8.99999995529652,38.0000007152557,17.5000001490116,29.400000333786,125.399999916553,4.80000007152557,0.899999976158142,40.0999994277954,2.5,45.8000001907349,133.599999904633,6.09999990463257,70.7999984622002,17.5,38.2999992370605,33.4000001698732,44.3000001907349,0.800000011920929,90.7999993562698,29.5,0.5,130.800000190735,195.300004005432,0.300000011920929,27.8999991416931,3.70000004768372,1,4.79999995231628,14.4999996423721,46.599998831749,3.3999999165535,7.40000009536743,18.5,37.6999998092651,24.800000667572,34.9000000953674,92.7000005245209,13.1999998092651,21.400000333786,110.799999713898,0.699999988079071,44.3999996185303,20.8999996185303,73.0000009536743,86.5000005364418,101.599999248981,32.3000005036592,4.1000000834465,167.699998855591,65.4999992847443,15.0999998152256,0.200000002980232,30.0999995470047,30.5,37.6999995708466,92.7999982833862,83.5999986678362,24.7000007629395,127.699999332428,25,27.8000001907349,29.6999999582767,62.800000667572,37.9999990463257,9.10000009834766,33.8000000119209,15.5000000298023,292.299997776747,15.9999995231628,68.3000026345253,28,30.3999996185303,20,5),x = c(3L,2L,6L,22L,4L,13L,7L,5L,1L,3L,9L,11L,8L,16L,1L)),row.names = 
c("494","7","476","478","462","68","357","397","105","216","53","248","366","338","478.1","190","119","147","371","418","231","208","19","337","408","90","44","488","435","13","249","434","419","408.1","209","120","47","526","82","84","3","1","485","278","15","414","467","459","137","105.1","425","492","532","170","68.1","429","347","491","29","215","151","316","352","116","465","237","376","513","472","186","453","504","157","261","403","434.1","469","333","83","417","301","242","46","234","487","278.1","134","183","19.1","288","98","411","434.2","117","375","5","356","313","356.1","359"),class = "data.frame")

I know there are many (really good!!) answers to similar questions, for example:

https://stats.stackexchange.com/questions/61747/linear-vs-nonlinear-regression?rq=1

Fitting logarithmic curve in R

Exponential curve fitting in R

But for some reason I can't get my head around it.

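Here is what I think I need to do. I want to fit a linear model in the log-transformed space of both variables, because a linear model in log-transformed space is like an exponential model in the untransformed space?! I know there are many assumptions about the error distribution. Let's set those aside for now, since this is really more about understanding the mechanics of the fit. I also want to make sure that only n% of the data lies below the fitted line. This seems like a perfect case for quantile regression. So I did the following: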

plot(df$x,df$y)
# fit a linear quantile regression to the data
library(quantreg)
lm =rq(log(y) ~ log(x),data=df,tau = .05)
pr = predict(lm)
lines(exp(pr))

But what I get is the following:

[image: the plot actually obtained]

while what I was expecting is:

[image: the expected plot]

I'm really sorry for the poor examples and for my complete misunderstanding of the basic topic. But maybe someone can see what I'm missing here.

Update

What I mean, using the mammals data in R, is this:

library(MASS)      # provides the mammals data
library(quantreg)  # provides rq()

# log-transformed data
hist(log(mammals$body))
plot(log(brain) ~ log(body), mammals)
lm_log = lm(log(brain) ~ log(body), mammals)
qr_log = rq(log(brain) ~ log(body), data = mammals, tau = .05)
abline(lm_log)
abline(qr_log)

# use the models fitted on the log-transformed variables to predict and plot
# on the untransformed scale
new_data = data.frame(body = seq(min(mammals$body), max(mammals$body), by = .5))
pr = predict(lm_log, newdata = new_data)
pr_qr = predict(qr_log, newdata = new_data)

plot(brain ~ body, mammals)
# pass the x values explicitly; lines() with a single vector plots against the index
lines(new_data$body, exp(pr), col = "green")
lines(new_data$body, exp(pr_qr), col = "blue")

This gives the following plot:

[image: the resulting plot]
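
The reasoning in the question (fit a straight line in log-log space, then read it back on the original scale) can be written out as below. This is a sketch under the assumption that x > 0 and y > 0; the symbols Q_tau, a_tau and b_tau are notation introduced here and do not come from the original post. The first line shows that y = alpha*x^beta is a power law that becomes linear when both variables are logged. The second line uses the fact that quantiles are preserved under increasing transformations such as exp, so a tau-quantile line fitted in log-log space should map back to a tau-quantile curve on the original scale.

```latex
\begin{align*}
y = \alpha x^{\beta}
  \quad &\Longrightarrow \quad
  \log y = \log\alpha + \beta \log x, \\
Q_{\tau}(\log y \mid x) = a_{\tau} + b_{\tau} \log x
  \quad &\Longrightarrow \quad
  Q_{\tau}(y \mid x) = \exp\bigl(a_{\tau} + b_{\tau} \log x\bigr)
                     = e^{a_{\tau}} \, x^{b_{\tau}}.
\end{align*}
```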

Solution

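If you only want the median line (the 0.5 quantile), I suggest the following:

library(ggplot2)  # geom_quantile() also needs the quantreg package to be installed
ggplot(data = df, aes(x = x, y = y)) + geom_point() + geom_quantile(quantiles = 0.5)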
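
The geom_quantile() call above fits the quantile line on the untransformed scale. As a rough sketch of how the same idea might be carried over to the log-log fit described in the question, the code below uses the mammals data from the update, because the dput output in the question is truncated. The names tau, fit, grid, brain_hat and share_below are illustrative choices, not part of the original post. The key point is that the predictions are computed on a sorted grid of x values and that those x values are passed explicitly to lines().

```r
# Sketch only: log-log quantile regression, plotted on the original scale.
library(MASS)      # provides the mammals data set
library(quantreg)  # provides rq()

tau <- 0.05  # target share of points below the line (illustrative value)

# Quantile regression in log-log space
fit <- rq(log(brain) ~ log(body), data = mammals, tau = tau)

# Sorted grid of body values, evenly spaced on the log scale
grid <- data.frame(
  body = exp(seq(log(min(mammals$body)), log(max(mammals$body)),
                 length.out = 200))
)

# Predict on the log scale, then back-transform with exp()
grid$brain_hat <- exp(predict(fit, newdata = grid))

# Plot on the original scale, giving lines() both x and y
plot(brain ~ body, data = mammals)
lines(grid$body, grid$brain_hat, col = "blue")

# Share of observations strictly below the fitted line; the small tolerance
# keeps points that lie on the line from being counted as below it
share_below <- mean(log(mammals$brain) < predict(fit) - 1e-8)
print(share_below)
```

The printed share_below should come out at or just under tau. Changing tau to 0.5 gives the median line in log-log space, which is the log-log counterpart of the answer's suggestion.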
