How to turn a vector into sliding matrices in R

I am trying to write a function that takes a vector and creates two sliding matrices, like this:

Input,Output
[d01,d02,d03,d04,d05,d06,d07],[d08,d09,d10,d11,d12,d13,d14]
[d02,d03,d04,d05,d06,d07,d08],[d09,d10,d11,d12,d13,d14,d15]
...

I tried to adapt some Python code to R, but I ran into problems and cannot find the error (I am not used to R).

Here is the R code:

create_dataset = function(data,n_input,n_out){
        datax = c()
        dataY = c()
        in_start = 0
        for (i in 1:range(length(data))) {
                #define the end of the input sequence
                in_end = in_start + n_input
                out_end = in_end + n_out
                        if(out_end <= length(data)){
                                x_input = data[in_start:in_end,1]
                                X = append(x_input)
                                y = append(data[in_end:out_end],1)
                        }
                #move along one time step
                in_start = in_start + 1
        }
        
   X; Y
}

Calling this function gives the following error:

> create_dataset(data,n_input = 5,n_out = 5)
Error in data[in_start:in_end,1] : incorrect number of dimensions
In addition: Warning message:
In 1:range(length(data)) :
  numerical expression has 2 elements: only the first used

Edit:

Adding the Python code that I am trying to adapt to R:

# convert history into inputs and outputs
def to_supervised(train,n_input,n_out):
    X,y = list(),list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            x_input = data[in_start:in_end,0]
            x_input = x_input.reshape((len(x_input),1))
            X.append(x_input)
            y.append(data[in_end:out_end,0])
        # move along one time step
        in_start += 1
    return array(X),array(y)

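Solution

Here are two approaches. See also "Lagging time series data".

1) In R one normally works on whole objects rather than iterating over indexes. Given inputs v, k1 and k2, we compute e as the sliding matrix with k1+k2 columns. embed returns each window with the most recent value first, so the columns are reversed with [,k:1]. The first k1 columns of e are then the first matrix and the remaining k2 columns are the second.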

# inputs
v <- 1:12   # 1,2,...,12
k1 <- k2 <- 3

k <- k1 + k2
e <- embed(v,k)[,k:1]

ik1 <- 1:k1
e[,ik1]
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    2    3    4
## [3,]    3    4    5
## [4,]    4    5    6
## [5,]    5    6    7
## [6,]    6    7    8
## [7,]    7    8    9

e[,-ik1]
##      [,1] [,2] [,3]
## [1,]    4    5    6
## [2,]    5    6    7
## [3,]    6    7    8
## [4,]    7    8    9
## [5,]    8    9   10
## [6,]    9   10   11
## [7,]   10   11   12
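
Approach (1) can be wrapped in a small helper if it is needed more than once. The following is only a sketch under a few assumptions: the function name sliding_xy and the list element names X and Y are illustrative and do not come from the answer above, v is assumed to be a plain vector, and k1 and k2 are assumed to be positive whole numbers.

```r
# Hypothetical helper wrapping approach (1); the name sliding_xy and the
# element names X and Y are illustrative choices.
sliding_xy <- function(v, k1, k2) {
  stopifnot(k1 >= 1, k2 >= 1)
  k <- k1 + k2
  stopifnot(length(v) >= k)            # embed() needs at least k elements
  # embed() puts the most recent value first, so reverse the columns
  e <- stats::embed(v, k)[, k:1, drop = FALSE]
  ik1 <- seq_len(k1)
  list(X = e[, ik1, drop = FALSE],     # first k1 columns: inputs
       Y = e[, -ik1, drop = FALSE])    # remaining k2 columns: outputs
}

res <- sliding_xy(1:12, 3, 3)
res$X   # same matrix as e[,ik1] above
res$Y   # same matrix as e[,-ik1] above
```

Using drop = FALSE keeps both results as matrices even when only one window fits or when k1 or k2 is 1.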

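2) Regarding the R code in the question:

In R, the range function takes a vector and returns a 2-element vector holding its minimum and maximum, so it is not what is wanted in the for loop. Use seq_along instead.
Indexing in R starts at 1, not 0.
The return value of a function must be a single object. We return a two-element list of matrices.
Iteratively appending to an object is inefficient in R. This can be addressed by preallocating the result or by not using a loop at all; we do not address it below because (1) above is already a better implementation.
The variable names in the question's code are inconsistent (for example, datax and dataY are initialized, but X and Y are returned).

Although this whole approach is not how R software is usually written, the following makes the minimal changes needed to get it working.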
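
The first three points can be checked directly at the console. This is a small illustrative sketch; the vector x and the other names are made up for the example and do not appear in the question.

```r
x <- c(10, 20, 30, 40, 50)   # illustrative data

# range() returns the minimum and maximum of its argument, here of the
# single number 5, so 1:range(length(x)) triggers the warning in the question
range(length(x))

# seq_along() gives the index sequence 1, 2, ..., length(x)
idx <- seq_along(x)

# Python's half-open slice data[in_start:in_end] with a 0-based in_start
# corresponds to data[(in_start + 1):in_end] in R (1-based, closed range)
in_start <- 0
in_end <- in_start + 3
first_window <- x[(in_start + 1):in_end]   # elements 1 to 3

# A function returns a single object, so several results go into a list
both <- list(first_window, x[(in_end + 1):length(x)])
```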

# data is plain vector,n_input and n_out are scalars
# result is 2 element list of matrices
create_dataset = function(data,n_input,n_out){
        X <- matrix(nrow = 0,ncol = n_input)
        Y <- matrix(nrow = 0,ncol = n_out)
        in_start <- 0
        for (i in seq_along(data)) {
                #define the end of the input sequence
                in_end <- in_start + n_input
                out_end <- in_end + n_out
                        if(out_end <= length(data)){
                                X <- rbind(X,data[(in_start+1):in_end])
                                Y <- rbind(Y,data[(in_end+1):out_end])
                        }
                #move along one time step
                in_start = in_start + 1
        }
        
   list(X,Y)
}

# inputs defined in (1)
create_dataset(v,k1,k2)

This gives the following two-element list of matrices:

[[1]]
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    2    3    4
[3,]    3    4    5
[4,]    4    5    6
[5,]    5    6    7
[6,]    6    7    8
[7,]    7    8    9

[[2]]
     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    5    6    7
[3,]    6    7    8
[4,]    7    8    9
[5,]    8    9   10
[6,]    9   10   11
[7,]   10   11   12
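
The loop version above grows X and Y with rbind on every iteration, which is the inefficiency mentioned in (2). For completeness, here is a rough sketch of a preallocating variant. It is not part of the answer above: the name create_dataset_prealloc is hypothetical, data is assumed to be a plain vector, and n_input and n_out are assumed to be positive whole numbers.

```r
# Hypothetical preallocating variant; the name is illustrative only.
create_dataset_prealloc <- function(data, n_input, n_out) {
  # number of complete input/output windows that fit in data
  n <- length(data) - n_input - n_out + 1
  if (n < 1) {
    # not enough data: return empty matrices with the right column counts
    return(list(matrix(data[0], nrow = 0, ncol = n_input),
                matrix(data[0], nrow = 0, ncol = n_out)))
  }
  # allocate both results once instead of growing them with rbind()
  X <- matrix(data[1], nrow = n, ncol = n_input)
  Y <- matrix(data[1], nrow = n, ncol = n_out)
  for (i in seq_len(n)) {
    X[i, ] <- data[i:(i + n_input - 1)]
    Y[i, ] <- data[(i + n_input):(i + n_input + n_out - 1)]
  }
  list(X, Y)
}

# same inputs as in (1)
res <- create_dataset_prealloc(1:12, 3, 3)
```

For short vectors the difference is negligible, and the embed approach in (1) remains the simpler choice.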