Preprocessing images for a TensorFlow model the same way as in PyTorch

I have a PyTorch ResNet101 encoder model whose input images are preprocessed like this:

import torch
import torchvision as tv
from PIL import Image

data_transforms = tv.transforms.Compose([
    tv.transforms.Resize((224, 224)),
    tv.transforms.ToTensor(),  # scales pixels to [0, 1] and returns a CHW float tensor
    tv.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
img = Image.open(img_path)  # img_path: path to the input image
img = img.convert('RGB')
img = data_transforms(img)
img = torch.FloatTensor(img)
img = img.unsqueeze(0)  # add the batch dimension -> [1, 3, 224, 224]
print(img)

(Image: PyTorch image tensor)

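In this case the encoder's input shape is [1, 3, 224, 224], and the image is normalized with the ImageNet mean and std. I have now exported this model to TensorFlow, so how do I apply the same image preprocessing for the TF model?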

I tried something like this:

import tensorflow as tf
from PIL import Image

img = Image.open(img_path)
img = img.convert('RGB')
img = tf.keras.preprocessing.image.img_to_array(img)
img = tf.image.resize(img, (224, 224))
img = tf.keras.applications.resnet.preprocess_input(img)  # now shape is [224, 224, 3]
img = tf.reshape(img, [1, 3, 224, 224])
print(img)

(Image: TensorFlow image tensor)

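But I am sure I am doing something wrong, because for the same image the torch tensor and the TensorFlow tensor look very different, and the outputs of the same encoder model are completely different.

Can anyone help? What should I fix in the TF preprocessing?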

Solution

This:

img = Image.open(img_path)
img = img.convert('RGB')

can be replaced with

image = tf.io.read_file(filename=filepath)
image = tf.image.decode_jpeg(image, channels=3)  # or decode_png

Also, the TensorFlow counterpart of unsqueeze is expand_dims (squeeze has tf.squeeze):

img = tf.expand_dims(img, axis=0)
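Adding the batch axis is only half of the layout change: the PyTorch tensor is channels-first, while the decoded TensorFlow image is channels-last. The sketch below uses plain NumPy on a tiny stand-in array (the array and its size are made up for illustration) to show why a reshape is not a substitute for a transpose. In TensorFlow the matching calls would be tf.transpose(img, [2, 0, 1]) and tf.expand_dims(img, axis=0).

```python
import numpy as np

# A tiny stand-in "image" in channels-last layout: height 2, width 2, 3 channels.
hwc = np.arange(12, dtype=np.float32).reshape(2, 2, 3)

# Moves the channel axis to the front, keeping each pixel's channels together.
chw = np.transpose(hwc, (2, 0, 1))

# Keeps the memory order and only relabels the axes, so channels get mixed up.
reshaped = hwc.reshape(3, 2, 2)

# Add the batch dimension, like unsqueeze(0) in PyTorch.
batch = np.expand_dims(chw, axis=0)

print(batch.shape)                               # (1, 3, 2, 2)
print(np.array_equal(chw[0], hwc[..., 0]))       # True
print(np.array_equal(reshaped[0], hwc[..., 0]))  # False
```

If this applies to the snippet in the question, the tf.reshape call there would scramble the pixels rather than produce a channels-first tensor.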

Everything else is fine; just make sure that

tf.keras.applications.resnet.preprocess_input(img) and data_transforms(img)

produce the same transformation.
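One way to check this is to run both pipelines on the same image and compare the resulting arrays. The sketch below is only an illustration: max_abs_diff is a hypothetical helper, the random arrays stand in for the real tensors, and it assumes the PyTorch result is NCHW while the TensorFlow result is still NHWC.

```python
import numpy as np

def max_abs_diff(nchw, nhwc):
    """Largest element-wise gap between an NCHW batch and an NHWC batch."""
    nhwc_as_nchw = np.transpose(nhwc, (0, 3, 1, 2))  # move channels next to the batch axis
    return float(np.max(np.abs(nchw - nhwc_as_nchw)))

# Stand-in data: in practice these would be the .numpy() results from each framework.
rng = np.random.default_rng(0)
tf_like = rng.random((1, 4, 4, 3)).astype(np.float32)
torch_like = np.transpose(tf_like, (0, 3, 1, 2)).copy()

print(max_abs_diff(torch_like, tf_like))                # 0.0: same values, different layout
print(max_abs_diff(torch_like, tf_like * 255.0) > 1.0)  # True: a scaling mismatch is obvious
```

With real images, small differences are to be expected, since the two frameworks may not resize in exactly the same way, so a tolerance is more realistic than an exact match.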

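As for the tensors looking different, I am fairly sure the cause is the division by 255.0: the PyTorch pipeline includes it (ToTensor scales pixels to [0, 1]), while the TensorFlow pipeline is missing it.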
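To make the scaling concrete, here is a minimal NumPy sketch of what ToTensor followed by Normalize does to pixel values, using the mean and std from the question. The function name torch_style_normalize is made up for illustration, and resizing is left out.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def torch_style_normalize(rgb_hwc_uint8):
    """Mimic ToTensor + Normalize on an HWC RGB image (resize not included)."""
    x = rgb_hwc_uint8.astype(np.float32) / 255.0  # ToTensor: [0, 255] -> [0, 1]
    return (x - IMAGENET_MEAN) / IMAGENET_STD     # Normalize: per channel, broadcast over H and W

black = np.zeros((2, 2, 3), dtype=np.uint8)
white = np.full((2, 2, 3), 255, dtype=np.uint8)

print(torch_style_normalize(black)[0, 0])  # roughly -2.1 to -1.8 per channel
print(torch_style_normalize(white)[0, 0])  # roughly  2.2 to  2.6 per channel
```

A correctly preprocessed tensor therefore stays within a few units of zero; values in the tens or hundreds suggest the division by 255 never happened.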

In fact, if you dig into the Keras backend, you can see that calling the preprocessing function ends up calling this function:

def _preprocess_numpy_input(x,data_format,mode):
  """Preprocesses a Numpy array encoding a batch of images.

  Arguments:
    x: Input array,3D or 4D.
    data_format: Data format of the image array.
    mode: One of "caffe","tf" or "torch".
      - caffe: will convert the images from RGB to BGR,then will zero-center each color channel with
          respect to the ImageNet dataset,without scaling.
      - tf: will scale pixels between -1 and 1,sample-wise.
      - torch: will scale pixels between 0 and 1 and then
          will normalize each channel with respect to the
          ImageNet dataset.

  Returns:
      Preprocessed Numpy array.
  """
  if not issubclass(x.dtype.type,np.floating):
    x = x.astype(backend.floatx(),copy=False)

  if mode == 'tf':
    x /= 127.5
    x -= 1.
    return x
  elif mode == 'torch':
    x /= 255.
    mean = [0.485,0.456,0.406]
    std = [0.229,0.224,0.225]
  else:
    if data_format == 'channels_first':
      # 'RGB'->'BGR'
      if x.ndim == 3:
        x = x[::-1,...]
      else:
        x = x[:,::-1,...]
    else:
      # 'RGB'->'BGR'
      x = x[...,::-1]
    mean = [103.939,116.779,123.68]
    std = None

  # Zero-center by mean pixel
  if data_format == 'channels_first':
    if x.ndim == 3:
      x[0,:,:] -= mean[0]
      x[1,:,:] -= mean[1]
      x[2,:,:] -= mean[2]
      if std is not None:
        x[0,:,:] /= std[0]
        x[1,:,:] /= std[1]
        x[2,:,:] /= std[2]
    else:
      x[:,0,:,:] -= mean[0]
      x[:,1,:,:] -= mean[1]
      x[:,2,:,:] -= mean[2]
      if std is not None:
        x[:,0,:,:] /= std[0]
        x[:,1,:,:] /= std[1]
        x[:,2,:,:] /= std[2]
  else:
    x[...,0] -= mean[0]
    x[...,1] -= mean[1]
    x[...,2] -= mean[2]
    if std is not None:
      x[...,0] /= std[0]
      x[...,1] /= std[1]
      x[...,2] /= std[2]
  return x

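Surprisingly, the default mode for ResNet50 preprocessing in Keras and TensorFlow is not tf but caffe.

The preprocessing applied to your image is therefore the one in the else branch (I repeat the else branch and the code after it so that you can follow the transformation and see what is missing):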

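  else:
    if data_format == 'channels_first':
      # 'RGB'->'BGR'
      if x.ndim == 3:
        x = x[::-1,...]
      else:
        x = x[:,::-1,...]
    else:
      # 'RGB'->'BGR'
      x = x[...,::-1]
    mean = [103.939,116.779,123.68]
    std = None

  # Zero-center by mean pixel (std is None here, so nothing is divided)
  if data_format == 'channels_first':
    ...  # same per-channel subtraction, applied along the channel axis
  else:
    x[...,0] -= mean[0]
    x[...,1] -= mean[1]
    x[...,2] -= mean[2]
  return x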

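The docstring explains:

caffe: will convert the images from RGB to BGR, then will zero-center each color channel with respect to the ImageNet dataset, without scaling.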
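The following NumPy sketch reproduces the channels-last caffe branch quoted above on a single pixel, to show that the result is in BGR order and still on the 0 to 255 scale. The function name caffe_mode and the test pixel are made up for illustration.

```python
import numpy as np

CAFFE_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_mode(rgb_hwc):
    """Channels-last 'caffe' branch: RGB -> BGR, subtract the mean, no scaling."""
    x = rgb_hwc.astype(np.float32)
    x = x[..., ::-1]           # 'RGB' -> 'BGR'
    return x - CAFFE_MEAN_BGR  # zero-center; std is None, so no division

# One pure-red pixel: R=255, G=0, B=0.
red = np.array([[[255, 0, 0]]], dtype=np.uint8)
out = caffe_mode(red)[0, 0]
print(out)  # B, G, R order; values stay on the 0-255 scale
```

The torch mode listed in the docstring may be closer to the PyTorch pipeline. To my knowledge it can be selected with tf.keras.applications.imagenet_utils.preprocess_input(x, mode='torch'), but treat that as an assumption to verify against your TensorFlow version. The result would still be channels-last, so it needs a transpose before it matches the [1, 3, 224, 224] input.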
