Pytesseract detects gibberish

How to fix Pytesseract detecting gibberish

I have a simple pytesseract script that runs in a Discord bot to detect text in images. However, when given the image here, it returns ['ESC es Sum Ls a ns ay','on','','Sa eon','Lape een ne eeren eee eserees','omeereer ee ate erence ecco at arte','Ue te eect eet rac contac',' ','ree Cee ed','ema eect eens','\x0c']. My code is:

    import cv2
    import pytesseract

    im = cv2.imread(attachment.filename)
    config = '-l eng --oem 1 --psm 3'  # English, LSTM engine, automatic page segmentation
    text = pytesseract.image_to_string(im, config=config)
    text = text.split('\n')

Solution

Thanks to Barney for the answer, but what I ended up doing was:

    import cv2
    import pytesseract
    from PIL import Image, ImageOps

    image = Image.open(attachment.filename)
    if image.mode == 'RGBA':
        # Invert only the colour channels, then re-attach the original alpha
        r, g, b, a = image.split()
        rgb_image = Image.merge('RGB', (r, g, b))

        inverted_image = ImageOps.invert(rgb_image)

        r2, g2, b2 = inverted_image.split()

        final_transparent_image = Image.merge('RGBA', (r2, g2, b2, a))

        final_transparent_image.save(attachment.filename)

    else:
        inverted_image = ImageOps.invert(image)
        inverted_image.save(attachment.filename)
    # Re-read the inverted file and run OCR with the default settings
    im = cv2.imread(attachment.filename)
    text = pytesseract.image_to_string(im)

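Basically, this inverts the colors and, for RGBA images, rebuilds the image as RGBA with the original alpha channel. I got a perfect reading from it!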
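
The inversion step can also be sketched as a small in-memory helper, which makes it easier to check on its own. This is only an illustration: the function name `invert_keep_alpha` is hypothetical, and converting other modes to RGB before inverting is an assumption of this sketch (the code above inverts non-RGBA images directly).

```python
from PIL import Image, ImageOps


def invert_keep_alpha(image):
    """Return a colour-inverted copy of a Pillow image, keeping any alpha channel."""
    if image.mode == "RGBA":
        # Invert only the colour channels, then re-attach the untouched alpha.
        r, g, b, a = image.split()
        inverted = ImageOps.invert(Image.merge("RGB", (r, g, b)))
        r2, g2, b2 = inverted.split()
        return Image.merge("RGBA", (r2, g2, b2, a))
    # Assumption of this sketch: other modes (palette, grayscale, ...) are
    # converted to RGB first so that ImageOps.invert accepts them.
    return ImageOps.invert(image.convert("RGB"))


# Tiny self-check on synthetic images (no OCR involved).
sample = Image.new("RGBA", (2, 2), (10, 20, 30, 128))
print(invert_keep_alpha(sample).getpixel((0, 0)))
```

Since `pytesseract.image_to_string` also accepts Pillow images, the returned image could probably be passed to it directly, which would avoid saving the file and reloading it with `cv2.imread`.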