Converting tf1 to tf2 for the recommender system from the Google ML course causes exploding loss


I want to convert a notebook written in tf1 to tf2, more specifically the softmax recommender model from here.

This is from the ML course; the previous steps introduce the logic.

I have tensorflow version '2.3.1' on Mac OS X.

import pandas as pd
import numpy as np
from zipfile import ZipFile
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from pathlib import Path
import matplotlib.pyplot as plt
# Run tf.function code eagerly, which makes the custom loss easier to debug.
tf.config.experimental_run_functions_eagerly(True)

# tf.random.set_seed(42)
# np.random.seed(42)

# from urllib.request import urlretrieve
# import zipfile
# urlretrieve("http://files.grouplens.org/datasets/movielens/ml-100k.zip","movielens.zip")
# zip_ref = zipfile.ZipFile('movielens.zip',"r")
# zip_ref.extractall()
# print("Done. Dataset contains:")
# print(zip_ref.read('ml-100k/u.info'))

# Load each data set (users, movies, and ratings).
users_cols = ['user_id', 'age', 'sex', 'occupation', 'zip_code']
users = pd.read_csv(
    'ml-100k/u.user', sep='|', names=users_cols, encoding='latin-1')

ratings_cols = ['user_id', 'movie_id', 'rating', 'unix_timestamp']
ratings = pd.read_csv(
    'ml-100k/u.data', sep='\t', names=ratings_cols, encoding='latin-1')

# The movies file contains a binary feature for each genre.
genre_cols = [
    "genre_unknown", "Action", "Adventure", "Animation", "Children", "Comedy",
    "Crime", "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror",
    "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western"
]
movies_cols = [
    'movie_id', 'title', 'release_date', "video_release_date", "imdb_url"
] + genre_cols
movies = pd.read_csv(
    'ml-100k/u.item', sep='|', names=movies_cols, encoding='latin-1')

# Since the ids start at 1, we shift them to start at 0.
users["user_id"] = users["user_id"].apply(lambda x: str(x-1))
movies["movie_id"] = movies["movie_id"].apply(lambda x: str(x-1))
movies["year"] = movies['release_date'].apply(lambda x: str(x).split('-')[-1])
ratings["movie_id"] = ratings["movie_id"].apply(lambda x: str(x-1))
ratings["user_id"] = ratings["user_id"].apply(lambda x: str(x-1))
ratings["rating"] = ratings["rating"].apply(lambda x: float(x))

# Compute the number of movies to which a genre is assigned.
genre_occurences = movies[genre_cols].sum().to_dict()

# Since some movies can belong to more than one genre, we create different
# 'genre' columns as follows:
# - all_genres: all the active genres of the movie.
# - genre: randomly sampled from the active genres.
def mark_genres(movies, genres):
  def get_random_genre(gs):
    active = [genre for genre, g in zip(genres, gs) if g == 1]
    if len(active) == 0:
      return 'Other'
    return np.random.choice(active)
  def get_all_genres(gs):
    active = [genre for genre, g in zip(genres, gs) if g == 1]
    if len(active) == 0:
      return 'Other'
    return '-'.join(active)
  movies['genre'] = [
      get_random_genre(gs) for gs in zip(*[movies[genre] for genre in genres])]
  movies['all_genres'] = [
      get_all_genres(gs) for gs in zip(*[movies[genre] for genre in genres])]

mark_genres(movies, genre_cols)

# Create one merged DataFrame containing all the movielens data.
movielens = ratings.merge(movies, on='movie_id').merge(users, on='user_id')

# Utility to split the data into training and test sets.
def split_dataframe(df, holdout_fraction=0.1):
  """Splits a DataFrame into training and test sets.
  Args:
    df: a dataframe.
    holdout_fraction: fraction of dataframe rows to use in the test set.
  Returns:
    train: dataframe for training
    test: dataframe for testing
  """
  test = df.sample(frac=holdout_fraction, replace=False, random_state=42)
  train = df[~df.index.isin(test.index)]
  return train, test

rated_movies = (ratings[["user_id", "movie_id"]]
                .groupby("user_id", as_index=False)
                .aggregate(lambda x: list(x)))


BATCH_SIZE = 32
EMBEDDING_SIZE = 35

class SoftmaxRecommender(keras.Model):
    def __init__(self, feature_columns, **kwargs):
        super(SoftmaxRecommender, self).__init__(**kwargs)
        # Two separate DenseFeatures layers so the movie embedding weights stay accessible.
        self.movie_embedding = layers.DenseFeatures(feature_columns['movie'])
        self.other_features = layers.DenseFeatures(feature_columns['other'])
        self.hidden_layer = layers.Dense(
            EMBEDDING_SIZE,
            kernel_initializer=keras.initializers.TruncatedNormal(
                stddev=1. / np.sqrt(EMBEDDING_SIZE) / 10.))
        # self.linear = layers.Dense(num_movies, use_bias=True, activation='softmax')
        # self.softmax_activation = keras.layers.Activation('softmax')


    def call(self, inputs, train=True, **kwargs):
        # Embed the rated movies, then drop them from the inputs dict so the
        # remaining features can be fed to the other DenseFeatures layer.
        V = self.movie_embedding({'movie_id': inputs['movie_id']})
        inputs.pop('movie_id')
        other_fts = self.other_features(inputs)
        U = tf.concat([V, other_fts], axis=1)
        U = self.hidden_layer(U)
        # NB: get_weights() returns NumPy arrays, so this matmul treats the
        # embedding table as a constant.
        logits = tf.matmul(U, self.movie_embedding.get_weights()[0], transpose_b=True)
        # logits = self.linear(logits)
        # logits = self.softmax_activation(logits)
        return logits


def custom_loss(labels, logits):
    # labels = tf.reshape(labels, [-1])
    # labels_org = labels.copy()
    labels = select_random(labels)
    # labels = tf.cast(labels, 'float32')
    labels = tf.reshape(labels, [-1, 1])
    # cce = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    # preds = tf.argmax(logits, axis=1)
    # preds = tf.cast(preds, 'float32')
    # loss = cce(y_true=labels, y_pred=logits)
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
        logits=logits, labels=labels))
    return loss




years_dict = {
    movie: year for movie, year in zip(movies["movie_id"], movies["year"])
}
genres_dict = {
    movie: genres.split('-')
    for movie, genres in zip(movies["movie_id"], movies["all_genres"])
}

def select_random(x):
  """Selects a random element from each row of x."""
  def to_float(x):
    return tf.cast(x, tf.float32)
  def to_int(x):
    return tf.cast(x, tf.int64)
  batch_size = tf.shape(x)[0]
  rn = tf.range(batch_size)
  nnz = to_float(tf.math.count_nonzero(x >= 0, axis=1))
  rnd = tf.random.uniform([batch_size])
  ids = tf.stack([to_int(rn), to_int(nnz * rnd)], axis=1)
  return to_int(tf.gather_nd(x, ids))

def make_embedding_col(key, embedding_dim):
  categorical_col = tf.feature_column.categorical_column_with_vocabulary_list(
      key=key, vocabulary_list=list(set(list(movies[key].values))), num_oov_buckets=0)
  return tf.feature_column.embedding_column(
      categorical_column=categorical_col, dimension=embedding_dim,
      # default initializer: truncated normal with stddev=1/sqrt(dimension)
      initializer=keras.initializers.TruncatedNormal(
          stddev=1. / np.sqrt(embedding_dim) / 10.),
      combiner='mean')

def make_dataset(ratings):
  """Creates a batch of examples.
  Args:
    ratings: A DataFrame of ratings such that examples["movie_id"] is a list of
      movies rated by a user.
  """
  def pad(x, fill):
    return pd.DataFrame.from_dict(x).fillna(fill).values

  movie = []
  year = []
  genre = []
  label = []
  for movie_ids in ratings["movie_id"].values:
    movie.append(movie_ids)
    genre.append([x for movie_id in movie_ids for x in genres_dict[movie_id]])
    year.append([years_dict[movie_id] for movie_id in movie_ids])
    label.append([int(movie_id) for movie_id in movie_ids])

  features = {
      "movie_id": pad(movie, ""),
      "year": pad(year, ""),
      "genre": pad(genre, "")
  }

  y = pad(label, -1)
  # y = select_random(y)
  # print('y.nunique:', len(np.unique(y.numpy())))

  return features, y


train_rated_movies, test_rated_movies = split_dataframe(rated_movies)
x_train, y_train = make_dataset(train_rated_movies)
x_val, y_val = make_dataset(test_rated_movies)


# Create two separate DenseFeatures layers in the model to be able to access the movie embeddings.
feature_cols = {
    'movie': [make_embedding_col("movie_id", 35)],
    'other': [make_embedding_col("genre", 3), make_embedding_col("year", 2)]
}


model = SoftmaxRecommender(feature_columns=feature_cols)
model.compile(
    loss=custom_loss,
    optimizer=keras.optimizers.SGD(.1, clipnorm=1)
    # loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True)
)

# batch_x1 = {
#     'movie_id': x_train['movie_id'][:32],
#     'year': x_train['year'][:32],
#     'genre': x_train['genre'][:32]
# }
#
# batch_y1 = y_train[:32]
#
# logits = model(batch_x1)
# loss = custom_loss(batch_y1, logits)
#
# batch_x2 = {
#     'movie_id': x_train['movie_id'][32:64],
#     'year': x_train['year'][32:64],
#     'genre': x_train['genre'][32:64]
# }
#
# batch_y2 = y_train[32:64]
#
# logits = model(batch_x2)
# loss = custom_loss(batch_y2, logits)

history = model.fit(
    x=x_train, y=y_train,
    batch_size=32,
    epochs=10,
    verbose=2,
    validation_data=(x_val, y_val),
)

The model above, with the custom_loss shown, causes the gradients to explode.
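For comparison, here is a minimal sketch of the same loss written with tf.nn.sparse_softmax_cross_entropy_with_logits, which takes the integer ids produced by select_random directly (shape [batch]) instead of a per-row probability distribution. I believe this is closer to what the original tf1 colab computed, but treat that as an assumption:

# Sketch only: the sparse variant consumes integer class ids, whereas
# tf.nn.softmax_cross_entropy_with_logits expects `labels` to have the
# same shape as `logits` (a full distribution per row).
def sparse_custom_loss(labels, logits):
    labels = select_random(labels)  # one sampled movie id per user
    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=logits))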

I also tried adding a softmax activation in the model class and using keras.losses.SparseCategoricalCrossentropy(from_logits=True) instead of tf.nn.softmax_cross_entropy_with_logits, applying select_random to the labels once in make_dataset rather than once per batch. That approach keeps the weights stable. If I use make_dataset that way but without adding the softmax activation to the logits, the loss explodes again. This might be because all the values in the matrix are so small that softmax assigns them roughly equal probabilities.
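A minimal sketch of that variant, under my reading of the steps above (x_train2/y_train2 are hypothetical names; it assumes the commented-out softmax activation in call() is re-enabled):

# Variant sketch: sample one target id per user once, after building the
# dataset, instead of resampling inside the loss on every batch.
x_train2, y_train2 = make_dataset(train_rated_movies)
y_train2 = select_random(tf.constant(y_train2)).numpy()  # one id per row

model.compile(
    # from_logits=True, exactly as in the attempt described above
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.SGD(.1, clipnorm=1),
)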

My understanding of keras.losses.SparseCategoricalCrossentropy and of the keras.Model call() method may be wrong, and/or the way I apply the layers. I know that, with the loss growing exponentially, something could be wrong with the dataset, but the data preparation steps are copied straight from the original code.

Any help is greatly appreciated!

