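Extremely high error after full integer quantization of a regression network

How can I fix the extremely high error I get after full integer quantization of a regression network?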

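I have trained a fully connected neural network with one hidden layer of 64 nodes, and I am testing it on the Medical Cost dataset. With the original-precision model, the mean absolute error is 0.22063259780406952. With the model quantized to float16, or with integer quantization with float fallback, the difference between the original error and the low-precision model's error never exceeds 0.1. However, when I perform full integer quantization, the error rises to an unreasonable level: in this case it jumps to nearly 60. I don't know whether this is a bug in TensorFlow, whether I am using the API incorrectly, or whether this is reasonable behavior after quantization. Any help is appreciated. The code for the conversion and inference is shown below: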

Preprocessing

import math
import pathlib
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import pandas as pd
from sklearn import preprocessing as pr
from sklearn.metrics import mean_absolute_error

url = 'insurance.csv'
column_names = ["age","sex","bmi","children","smoker","region","charges"]

dataset = pd.read_csv(url,names=column_names,header=0,na_values='?')

dataset = dataset.dropna()  # Drop rows with missing values
dataset['sex'] = dataset['sex'].map({'female': 2,'male': 1})
dataset['smoker'] = dataset['smoker'].map({'yes': 1,'no': 0})

dataset = pd.get_dummies(dataset,prefix='',prefix_sep='',columns=['region'])

# this is a trick to convert a dataframe to 2d array,scale it and
# convert back to dataframe
scaled_np = pr.StandardScaler().fit_transform(dataset.values)
dataset = pd.DataFrame(scaled_np,index=dataset.index,columns=dataset.columns)

Train/test split

train_dataset = dataset.sample(frac=0.8,random_state=0)
test_dataset = dataset.drop(train_dataset.index)

train_features = train_dataset.copy()
test_features = test_dataset.copy()

train_labels = train_features.pop('charges')
test_labels = test_features.pop('charges')

Training the original model

def build_and_compile_model():
    model = keras.Sequential([
        layers.Dense(64, activation='relu', input_shape=(len(dataset.columns) - 1,)),
        layers.Dense(1)
    ])

    model.compile(loss='mean_absolute_error',optimizer=tf.keras.optimizers.Adam(0.001))
    return model


dnn_model = build_and_compile_model()
dnn_model.summary()

dnn_model.fit(train_features,train_labels,validation_split=0.2,verbose=0,epochs=100)

print("Original error = {}".format(
    dnn_model.evaluate(test_features,test_labels,verbose=0)))

Converting to a lower-precision model

converter = tf.lite.TFLiteConverter.from_keras_model(dnn_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

def representative_data_gen():
    for input_value in tf.data.Dataset.from_tensor_slices(
            train_features.astype('float32')).batch(1).take(100):
        yield [input_value]


converter.representative_dataset = representative_data_gen

# Full Integer Quantization
# Ensure that if any ops can't be quantized,the converter throws an error
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Set the input and output tensors to uint8 (APIs added in r2.3)
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model_quant = converter.convert()

dir_save = pathlib.Path(".")
file_save = dir_save / "model_16.tflite"
file_save.write_bytes(tflite_model_quant)

Instantiating the TFLite model

interpreter = tf.lite.Interpreter(model_path=str(file_save))
interpreter.allocate_tensors()

Evaluating the lower-precision model

def evaluate_model(interpreter,test_images,test_labels):
    input_details = interpreter.get_input_details()[0]
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]

    # Run predictions on every image in the "test" dataset.
    prediction_digits = []
    for test_image in test_images:
        if input_details['dtype'] == np.uint8:
            input_scale,input_zero_point = input_details['quantization']
            test_image = test_image / input_scale + input_zero_point

        test_image = np.expand_dims(test_image,axis=0).astype(input_details['dtype'])
        interpreter.set_tensor(input_index,test_image)

        # Run inference.
        interpreter.invoke()

        output = interpreter.get_tensor(output_index)
        prediction_digits.append(output[0])


    filtered_labels,correct_digits = map(
        list,zip(*[(x,y) for x,y in zip(test_labels,prediction_digits)
              if not math.isnan(y)]))
    return mean_absolute_error(filtered_labels,correct_digits)

print(evaluate_model(interpreter,test_features[:].values,test_labels))

Solution

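When applying quantization (and in machine learning generally), you need to pay attention to what your data looks like. Does it make sense to quantize the data you have to this degree?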

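In a regression problem like yours, the ground truth lies in the range [1121.8739; 63770.42801] and some of the input data is floating point as well, so training a model on that data and then quantizing it to integers is unlikely to give good results.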

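You trained the model to output values in the range [1121.8739; 63770.42801], and after int8 quantization it can only output integer values in the range [-128; 127], with nothing after the decimal point. Obviously, when you compare the quantized model's results with the ground truth, the error goes through the roof.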
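A related point worth checking, offered as an assumption and not something verified against your model: the tensor returned by a fully integer-quantized TFLite model holds integer codes, and TFLite's quantization scheme relates a code to a real value through real = scale * (code - zero_point). The sketch below shows that mapping with NumPy only. The `dequantize` helper and the scale and zero-point values are hypothetical, chosen purely for illustration.

```python
import numpy as np

def dequantize(raw, scale, zero_point):
    """Map integer codes back to real values: real = scale * (code - zero_point)."""
    # Cast first so that uint8 arithmetic cannot wrap around.
    return scale * (np.asarray(raw, dtype=np.float32) - zero_point)

# Hypothetical quantization parameters. With a real model they would be read from
# interpreter.get_output_details()[0]['quantization'], a (scale, zero_point) pair.
scale, zero_point = 0.02, 60
raw_output = np.array([0, 60, 255], dtype=np.uint8)

print(dequantize(raw_output, scale, zero_point))
```

If the raw codes are compared with the labels directly, the resulting error is measured in code units and not in the units of the target.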

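What if you still want to apply quantization? You need to move your data into the range of the quantized type; in your case, that means converting the float32 data to int8 in a way that still makes sense. Expect a substantial drop in performance in real use cases. After all, with a regression problem you are moving from a domain of roughly 4 billion possible output values (a float32 has 32 bits: 1 sign bit, 8 exponent bits and 23 mantissa bits; see Single Precision Floating Point) to a domain with only 256 (2^8) possible outputs.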
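To put a number on that loss of resolution, here is a small back-of-the-envelope calculation. It assumes the 256 codes are spread evenly over the target range quoted above, which is a simplification; a real converter picks its own scale from the representative dataset.

```python
# Range of the 'charges' target as quoted in this answer.
low, high = 1121.8739, 63770.42801
levels = 2 ** 8  # an 8-bit integer has 256 distinct codes

# Spacing between two neighbouring codes when they are spread evenly.
step = (high - low) / (levels - 1)
# A value can land at most half a step away from the nearest code.
worst_case_rounding_error = step / 2

print(f"step = {step:.2f}, worst-case rounding error = {worst_case_rounding_error:.2f}")
```

Under that assumption, neighbouring outputs are roughly 246 apart in the original units, so even a perfect model could be off by about half of that after quantization.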

That said, a really naive approach could be to apply the following transformation:

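def scale_down_data(data):
    # Linearly map [data.min(), data.max()] onto the int8 range [-128, 127].
    # np is numpy, imported as in the question.
    max_value = data.max()
    min_value = data.min()
    scaled_down = 255 * ((data - min_value) / (max_value - min_value)) - 128
    # Round before casting; astype alone would truncate toward zero.
    return np.round(scaled_down).astype(np.int8)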

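In practice, it is better to look at the distribution of your data and choose a transformation that gives more resolution where the data is dense. You also don't want to restrict the regression range to the range of the training set. And you need to do this analysis for every input or output that is not already in the quantized domain.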
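As a rough sketch of the last two points, the variant below widens the range by a margin so that values slightly outside the training set still fit, clips anything beyond that, and provides the inverse mapping needed to read predictions back in the original units. The function names, the 10% margin, and the sample values are all assumptions made for illustration; the mapping is still linear and does not address the distribution issue.

```python
import numpy as np

def fit_range(data, margin=0.1):
    """Return (lo, hi) widened on each side by a fraction of the observed span."""
    lo, hi = float(np.min(data)), float(np.max(data))
    pad = margin * (hi - lo)
    return lo - pad, hi + pad

def to_int8(data, lo, hi):
    """Linearly map [lo, hi] onto [-128, 127], clipping values outside it."""
    scaled = 255.0 * (np.asarray(data, dtype=np.float64) - lo) / (hi - lo) - 128.0
    return np.clip(np.round(scaled), -128, 127).astype(np.int8)

def from_int8(codes, lo, hi):
    """Approximate inverse of to_int8: map codes back to the original units."""
    return (np.asarray(codes, dtype=np.float64) + 128.0) / 255.0 * (hi - lo) + lo

# Illustrative target values, including the two extremes quoted above.
charges = np.array([1121.8739, 9000.0, 30000.0, 63770.42801])
lo, hi = fit_range(charges)
codes = to_int8(charges, lo, hi)
restored = from_int8(codes, lo, hi)

print(codes)
print(np.abs(restored - charges).max())  # largest round-trip error
```

The margin trades resolution for headroom: a wider range tolerates more extreme values but makes each of the 256 steps coarser.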