CoPurchase Recommendation engine returns NaN

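My dataset contains roughly 100,000 entries and 800 products. When I try to predict likely matches, the engine returns NaN even for the most popular products, which should be the ones with the most entries.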

I also convert ProductId/copurchaseProductId (Guid) to strings so that I can use them.

Can anyone point out whether I am doing something wrong, or whether my dataset is simply too small?

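using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

var mlContext = new MLContext();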

IDataView traindata = mlContext.Data.LoadFromEnumerable(data: productEntries);

// The string IDs are key-encoded by the key-mapping steps further down, so here we only configure
// MatrixFactorizationTrainer: index columns, label, loss function, and the Alpha/Lambda hyperparameters.
MatrixFactorizationTrainer.Options options = new MatrixFactorizationTrainer.Options();
options.MatrixColumnIndexColumnName = nameof(ProductEntry.ProductIdEncoded);
options.MatrixRowIndexColumnName = nameof(ProductEntry.copurchaseProductIdEncoded);
options.LabelColumnName = nameof(ProductEntry.Label);
options.LossFunction =  MatrixFactorizationTrainer.LossFunctionType.SquareLossOneClass;
options.Alpha = 0.01;
options.Lambda = 0.025;

// For better results use the following parameters
options.ApproximationRank = 100;
options.C = 0.00001;

var dataProcessLine = mlContext.Transforms.Conversion.MapValueToKey(outputColumnName: nameof(ProductEntry.ProductIdEncoded), inputColumnName: nameof(ProductEntry.ProductId))
    .Append(mlContext.Transforms.Conversion.MapValueToKey(outputColumnName: nameof(ProductEntry.copurchaseProductIdEncoded), inputColumnName: nameof(ProductEntry.copurchaseProductId)));


// Step 4: Call the MatrixFactorization trainer by passing options.
var est = dataProcessLine.Append(mlContext.Recommendation().Trainers.MatrixFactorization(options: options));

// Step 5: Train the model by fitting it to the data set.
ITransformer model = est.Fit(input: traindata);


var predictionEngine = mlContext.Model.CreatePredictionEngine<ProductEntry,copurchasePrediction>(transformer: model);

//Manual test of the prediction
var allProducts = Products.Where(p => p.ActiveState > 0).ToList();
foreach (var popularProduct in mostPopularProducts.Take(5))
{
var product = allProducts.Where(p  => p.Id == popularProduct.Id ).FirstOrDefault();

var label = SplitByLanguageHelper.Split(product.Title);


var top5 = allProducts.Where(p => p.Id != product.Id)
    .Select(p => Prediction.GetPrediction(predictionEngine,product.Id,p.Id))
    .OrderByDescending(p => p.score)
    .Take(5).ToList();
var result = top5.Select(prediction => new
{
    score = prediction.score,
    OrigProductIdLabel = SplitByLanguageHelper.Split(allProducts.Where(dl => dl.Id == prediction.ProductId).FirstOrDefault().Title),
    coproductIdLabel = SplitByLanguageHelper.Split(allProducts.Where(dl => dl.Id == prediction.copurchaseProductId).FirstOrDefault().Title)

}).ToList();//all return a NaN score :(

result.Dump($"Predictions from {SplitByLanguageHelper.Split(product.Title)}");

}

public static class Prediction
{
    public static ProductcopurchasePrediction GetPrediction(
        PredictionEngine<ProductEntry, copurchasePrediction> predictionEngine,
        Guid productId,
        Guid copurchaseProductId)
    {
        copurchasePrediction prediction = predictionEngine.Predict(
            new ProductEntry
            {
                ProductId = productId.ToString(),
                copurchaseProductId = copurchaseProductId.ToString()
            });

        return new ProductcopurchasePrediction
        {
            ProductId = productId,
            copurchaseProductId = copurchaseProductId,
            score = prediction.score
        };
    }
}


public class copurchasePrediction
{
/// <summary>
/// Gets or sets the score.
/// </summary>
/// <value>The score.</value>
public float score { get; set; }
}

public class ProductEntry
{
/// <summary>
/// Gets or sets the co purchase product identifier.
/// </summary>
/// <value>The co purchase product identifier.</value>
//[KeyType(262111)]
//[NoColumn] 
public string copurchaseProductId { get; set; }

[KeyType(262111)]
public UInt32 copurchaseProductIdEncoded { get; set; }

public float Label { get; set; }

public string ProductId { get; set; }

[KeyType(262111)]
public UInt32 ProductIdEncoded { get; set; }

public override string ToString()
{
    return $"Prod: {ProductId},copurchase: {copurchaseProductId}"; 
}
}

public class ProductcopurchasePrediction
{

public Guid copurchaseProductId { get; set; }


public Guid ProductId { get; set; }

public float score { get; set; }
}

public static class SplitByLanguageHelper
{
public static string Split(string text)
{
    if (string.IsNullOrEmpty(text)) return "";

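    int firstChar = text.IndexOf("<NL>");
    int lastChar = text.IndexOf("</NL>");

    // Fall back to the raw text when the <NL>...</NL> markers are missing or out of order,
    // instead of letting Substring throw.
    if (firstChar < 0 || lastChar < firstChar + 4) return text;

    return text.Substring(firstChar + 4, lastChar - (firstChar + 4));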
}

}
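
One sanity check that might help narrow this down is sketched below. It rests on an assumption that is not confirmed anywhere in this post: that the NaN scores come from IDs the key mapping never saw in the corresponding training column, so they are encoded as a missing key at prediction time. `TrainingCoverage` and `FindUnseen` are hypothetical names made up for this illustration and are not part of ML.NET.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an ML.NET API): lists candidate IDs that never
// occur in a given training column, so they can be inspected or skipped.
public static class TrainingCoverage
{
    // Returns the distinct IDs in 'candidates' that do not appear in 'seen'.
    // The comparison is ordinal and case-sensitive, because the assumption is
    // that the string IDs have to match the training strings exactly.
    public static List<string> FindUnseen(IEnumerable<string> seen, IEnumerable<string> candidates)
    {
        var known = new HashSet<string>(seen, StringComparer.Ordinal);
        return candidates.Where(id => !known.Contains(id)).Distinct().ToList();
    }
}
```

With the data from the question, this could be called once per column, for example `TrainingCoverage.FindUnseen(productEntries.Select(e => e.ProductId), allProducts.Select(p => p.Id.ToString()))` and again with `e.copurchaseProductId`. Because the two columns are mapped separately, an ID can be present in one and absent from the other. It is also worth confirming that the training strings use the same format as `Guid.ToString()`, which by default produces lowercase hexadecimal digits separated by hyphens.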
