如何对矩阵进行加权随机抽样 - MatLab

如何解决如何对矩阵进行加权随机抽样 - MatLab

我想从一个 excel 文件（“DataTab”，请参见图 1）开始，从 m × n 矩阵创建一个加权样本。

第一列 (UGT) 表示矩阵的 ID，B-F 列表示与每个 UGT 的变量“fi”相关联的概率。 “fi”是一个具有值 10、20、30、40、50 的标量：这意味着 UGT 1101 有 50% 的概率具有值 10 或 20，UGT 1102 的概率可以是 30 (50%) 或 40 ( 50%），依此类推。

我使用了仅适用于标量的“randsample 函数”，我无法将其用于我的情况。

try:
    status,stdout,stderr = g.execute(
        ['git','worktree','add','--no-checkout','foo','HEAD'],env=env,with_extended_output=True)
    print(stdout)
except Exception as e:
    logging.exception(e)
    # more stuff

我将整个代码与“Fv2”变量一起正确工作，而不是与“Fv1”一起工作，这是我的问题的目标。

fi = [10,20,30,40,50];
p = [0.15,0.20,0.30,0.10,0.25];
n=10000;
sample = randsample (fi,n,true,p);
hist(sample,fi);
bar(fi,n*p,0.7);

“Fv1”变量必须是这样的（没有第一行，我仅作为示例显示以更好地理解）：

Image 2 - final risults

有人可以帮我吗？

@obchardon 我在 R2019b 版本上运行后必须在这里回答：

enter image description here enter image description here

问题是“p”来自一个excel文件，所以不能写成标量。我怎样才能解决这个问题？此外，第一列中“fi”的频率不考虑输入概率。使用此代码，频率受到尊重，但您应该以某种方式为每个 ugt 更改“p”。

Nsample = 10; 
DataTab = xlsread('Scenari_stabilita_R6.xlsx','S1','A2:f6');
Ugt   = csvread('raster_ugt.acs'); 
UgtV   = reshape(Ugt,[],1);
MRas   = [UgtV];
MCycle  = MRas(~idxNaN,:);
a=unique(MCycle(:,4));
idxDataTab=ismember(DataTab(:,1),a);
DataTab2 = DataTab(idxDataTab,:);
nUGT = length(a);    
fi = [10,50];
Fv = [];
    for i = 1:nUGT
        Fv1 = randsample (fi,Nsample,DataTab2(i,:));
        %Fv2 = (DataTab2(i,3)-DataTab2(i,2)).*rand(Nsample,1) +
        %DataTab2(i,2); % this line calculates uniform distribution and
        %it has to be modified into weighted sampling
        Fv = [Fv,Fv1];
    end

解决方法

一种简单的方法是 cumsum() 您的概率向量，在 n 和 0 之间生成 1 随机数并检查累积向量从哪个位置开始变小或等于随机数：

% INPUT
n  = 10
fi = [10,20,30,40,50]       % given number
p  = [0.3,0.3,0.2,0.1,0.1
      0.5,0.5,0 ].' % given probability

% COMPUTATION
sp = size(p)     
rd = rand(n,1,sp(2))                   % 10x1x2 random number between 0 and 1.
cp = reshape(cumsum(p),sp(1),sp(2))  % cumulative probability
[~,ind] = max(rd<=cp,[],2)             % Get the position where rd start to be smaller or equal to cp.                     
res = squeeze(fi(ind))                 % Extract the corresponding value

与

我们只需要为 rd 和 cp 添加一个 3rd 维度，以便它们可以被广播。

% Broadcasting along the first and second dimension
rd = 10 x 1 x 2
     ↓    ↓
cp = 1  x 5 x 2