Tesserocr UnicodeDecodeError:

How to fix Tesserocr UnicodeDecodeError:

I am stuck on this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

Here is the code. I know it is sloppy, but I am just trying to figure out how to fix the problem.

from io import BytesIO
from zipfile import ZipFile

from aiohttp import ClientResponse
from PIL import Image
from tesserocr import PyTessBaseAPI


async def unzip(resp: ClientResponse):  # must be async because the body uses await
    """Read and unpack the archive in memory."""
    img_buff = BytesIO()  # in-memory buffer, shared by all pages
    with ZipFile(BytesIO(await resp.content.read())) as unziped_pages:  # unzip the archive in memory
        for page in unziped_pages.namelist():
            with Image.open(BytesIO(unziped_pages.read(page))) as im:  # type: Image.Image  # build each picture from its bytes
                im.save(img_buff, format='JPEG', quality=100)  # save the picture as bytes to the in-memory buffer
                result = img_buff.getvalue()  # get the bytes written to the buffer so far
                with PyTessBaseAPI() as api:    # < ---- this is where error starts
                    api.SetImageFile(result)
                    print(api.GetUTF8Text())
                    print(api.AllWordConfidences())

Here is what happens in this function:

I receive a response containing a .zip file.
Since I am not saving anything to disk, I use BytesIO to keep everything in memory (honestly, I don't quite understand why I have to use BytesIO, since resp.content.read() already returns bytes).
I use the bytes again to create a Pillow object.
I save the image to a buffer so that I can use it later.
Through the BytesIO buffer, I convert the Pillow object to bytes.
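
On the BytesIO question in the steps above: ZipFile expects a file name or a seekable file-like object, so raw bytes have to be wrapped before they can be opened as an archive. The following is a minimal sketch using only the standard library. The helper names iter_zip_members and build_demo_archive are hypothetical, and the demo archive stands in for the downloaded response.

```python
from io import BytesIO
from zipfile import ZipFile


def iter_zip_members(archive_bytes: bytes):
    """Yield (name, data) for each member of a zip archive held in memory."""
    # ZipFile needs a seekable file-like object; BytesIO wraps the raw bytes.
    with ZipFile(BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            yield name, archive.read(name)


def build_demo_archive() -> bytes:
    """Create a tiny zip in memory so the sketch runs without a download."""
    buffer = BytesIO()
    with ZipFile(buffer, "w") as archive:
        archive.writestr("page1.txt", b"first")
        archive.writestr("page2.txt", b"second")
    return buffer.getvalue()


members = list(iter_zip_members(build_demo_archive()))
print(members)
```

Each member comes back as plain bytes, so a fresh BytesIO per member is enough to hand it to a library that wants a file-like object.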

Then I get this error.

Could it be because the text in the pictures is Russian? What should I do then?

Thanks in advance!
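
One plausible reading of the message, offered as an assumption rather than a confirmed diagnosis: the byte 0xff at position 0 is how JPEG data begins, so something is trying to interpret the image bytes as UTF-8 text (for example, as a file name). The sketch below reproduces the same message without Tesseract. JPEG_MAGIC and try_decode_as_text are illustrative names.

```python
JPEG_MAGIC = b"\xff\xd8\xff\xe0"  # first bytes of a typical JPEG/JFIF file


def try_decode_as_text(data: bytes) -> str:
    """Return the decoding error message, or a note if decoding succeeded."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return str(exc)
    return "decoded without error"


message = try_decode_as_text(JPEG_MAGIC)
print(message)
# prints: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
```

If this reading is right, the error comes from the type of the argument and not from the language of the text in the pictures.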
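
A hedged sketch of one way around the problem: tesserocr's SetImageFile takes a file path, while SetImage accepts a Pillow image directly, so the JPEG round trip through a buffer may not be needed at all. For Russian text, the API can be opened with lang='rus', assuming the Russian traineddata file is installed for Tesseract. The function name ocr_zip_pages is hypothetical and this has not been run against the original archive.

```python
from io import BytesIO
from zipfile import ZipFile


def ocr_zip_pages(archive_bytes: bytes, lang: str = "rus"):
    """Return (text, word confidences) for each image in an in-memory zip."""
    # Imported lazily so this sketch can be loaded without the OCR stack.
    from PIL import Image
    from tesserocr import PyTessBaseAPI

    results = []
    # One API instance is reused for all pages instead of one per page.
    with ZipFile(BytesIO(archive_bytes)) as archive, PyTessBaseAPI(lang=lang) as api:
        for name in archive.namelist():
            with Image.open(BytesIO(archive.read(name))) as page:
                api.SetImage(page)  # pass the Pillow image itself, not bytes
                results.append((api.GetUTF8Text(), api.AllWordConfidences()))
    return results
```

Inside the coroutine this could be called as ocr_zip_pages(await resp.content.read()). It also avoids the shared img_buff, which in the code above keeps growing because every page is appended to the same buffer.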
