H264 decoding with Apple Video Toolbox

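I am trying to get an H264 streaming application working across various platforms using a combination of Apple Video Toolbox and openh264. There is one use case that doesn't work, and I can't find any solution for it: the source uses Video Toolbox on a 2011 iMac running macOS High Sierra, and the receiver is a MacBook Pro running Big Sur.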

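On the receiver, the decoded image is about 3/4 green. If I scale the image down to about 1/8 of the original before encoding, it works fine. If I capture the frames on the MacBook and then run exactly the same decoding software in a test program on the iMac, they decode fine. Doing the same thing on the MacBook (same images, same test program) gives 3/4 green again. I have a similar problem when receiving from the openh264 encoder on a slower Windows machine. I suspect this has to do with timing, but I really don't understand H264 well enough to work it out. One thing I did notice is that the decode call returns without an error code, but about 70% of the time it returns a NULL pixel buffer.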

The "guts" of the decoding part look like this (modified from a demo on GitHub):

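void didDecompress(void *decompressionOutputRefCon, void *sourceFrameRefCon, OSStatus status, VTDecodeInfoFlags infoFlags, CVImageBufferRef pixelBuffer, CMTime presentationTimeStamp, CMTime presentationDuration)
{
    // sourceFrameRefCon is the &outputPixelBuffer passed to VTDecompressionSessionDecodeFrame.
    CVPixelBufferRef *outputPixelBuffer = (CVPixelBufferRef *)sourceFrameRefCon;
    // pixelBuffer may be NULL if no image was produced; CVPixelBufferRetain is NULL-safe.
    *outputPixelBuffer = CVPixelBufferRetain(pixelBuffer);
}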

 void initVideoDecodeToolBox ()
    {
        if (!decodeSession)
        {
            const uint8_t* parameterSetPointers[2] = { mSPS, mPPS };
            const size_t parameterSetSizes[2] = { mSPSSize, mPPSSize };
            OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault, 2, // param count
                                                                                  parameterSetPointers, parameterSetSizes, 4, // size in bytes of the NAL length prefix
                                                                                  &formatDescription);
            if (status == noErr)
            {
                CFDictionaryRef attrs = NULL;
                const void *keys[] = { kCVPixelBufferPixelFormatTypeKey, kVTDecompressionPropertyKey_RealTime };
                uint32_t v = kCVPixelFormatType_32BGRA;
                const void *values[] = { CFNumberCreate(NULL, kCFNumberSInt32Type, &v), kCFBooleanTrue };
                attrs = CFDictionaryCreate(NULL, keys, values, 2, NULL, NULL);
                VTDecompressionOutputCallbackRecord callBackRecord;
                callBackRecord.decompressionOutputCallback = didDecompress;
                callBackRecord.decompressionOutputRefCon = NULL;
                // Arguments: allocator, format, decoder specification (none), output buffer attributes, callback, session out.
                status = VTDecompressionSessionCreate(kCFAllocatorDefault, formatDescription, NULL, attrs, &callBackRecord, &decodeSession);
                CFRelease(attrs);
            }
            else
            {
                NSLog(@"IOS8VT: reset decoder session Failed status=%d", (int)status);
            }
        }
    }

CVPixelBufferRef decode ( const char *NALBuffer, size_t NALSize )
    {
        CVPixelBufferRef outputPixelBuffer = NULL;
        if (decodeSession && formatDescription )
        {
            // The NAL buffer has been stripped of the NAL length data, so this has to be put back in
            MemoryBlock buf ( NALSize + 4);
            memcpy ( (char*)buf.getData()+4, NALBuffer, NALSize );
            *((uint32*)buf.getData()) = CFSwapInt32HostToBig ((uint32)NALSize);
            
            CMBlockBufferRef blockBuffer = NULL;
            // kCFAllocatorNull: the block buffer only references buf, which must outlive it.
            OSStatus status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault, buf.getData(), NALSize+4, kCFAllocatorNull, NULL, 0, NALSize+4, 0, &blockBuffer);
            
            if (status == kCMBlockBufferNoErr)
            {
                CMSampleBufferRef sampleBuffer = NULL;
                const size_t sampleSizeArray[] = {NALSize + 4};
                // One sample, no timing entries, one size entry.
                status = CMSampleBufferCreateReady(kCFAllocatorDefault, blockBuffer, formatDescription, 1, 0, NULL, 1, sampleSizeArray, &sampleBuffer);
                
                if (status == noErr && sampleBuffer)
                {
                    VTDecodeFrameFlags flags = 0;
                    VTDecodeInfoFlags flagOut = 0;
                    
                    // With flags == 0 decoding is synchronous:
                    // didDecompress has been called by the time this call returns.
                    OSStatus decodeStatus = VTDecompressionSessionDecodeFrame ( decodeSession, sampleBuffer, flags, &outputPixelBuffer, &flagOut );

                    if (decodeStatus != noErr)
                    {
                        DBG ( "decode Failed status=" + String ( decodeStatus) );
                    }
                    CFRelease(sampleBuffer);
                }
                CFRelease(blockBuffer);
            }
        }
        return outputPixelBuffer;
    }

Note: the NAL units do not have the 00 00 00 01 delimiter, because they are streamed in blocks with an explicit length field.
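To illustrate that kind of framing, here is a minimal sketch that assumes a 4-byte big-endian length field in front of each NAL unit, which is the layout Video Toolbox expects when the format description is created with a NAL length size of 4. The names splitLengthPrefixed and nalType are hypothetical helpers, not part of any library, and the real wire format of the application is not shown in this post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One NAL unit: header byte first, no start code and no length prefix.
using Nal = std::vector<uint8_t>;

// Splits a buffer laid out as [4-byte big-endian length][NAL bytes]... into NAL units.
// Parsing stops at the first zero-length or truncated entry.
std::vector<Nal> splitLengthPrefixed(const uint8_t* data, size_t size) {
    assert(data != nullptr || size == 0);
    std::vector<Nal> nals;
    size_t pos = 0;
    while (size - pos >= 4) {
        const uint32_t len = (uint32_t(data[pos]) << 24) | (uint32_t(data[pos + 1]) << 16) |
                             (uint32_t(data[pos + 2]) << 8) | uint32_t(data[pos + 3]);
        pos += 4;
        if (len == 0 || len > size - pos) break;  // malformed or incomplete entry
        nals.emplace_back(data + pos, data + pos + len);
        pos += len;
    }
    return nals;
}

// nal_unit_type is the low 5 bits of the first byte
// (1 = non-IDR slice, 5 = IDR slice, 7 = SPS, 8 = PPS). Returns -1 for an empty NAL.
int nalType(const Nal& nal) {
    return nal.empty() ? -1 : (nal[0] & 0x1F);
}
```

Logging nalType for every received unit is a cheap way to see how many slice NALs arrive per picture.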

Decoding works fine on all the other platforms, and the encoded stream decodes fine using openh264.

Solution

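OK, I finally found the answer, so I'll leave it here for posterity. It turns out that the Video Toolbox decode function expects all the NAL units that belong to the same frame to be copied into a single sample buffer. The older Mac hands the application individual key frames that are split across separate NAL units, and the application was then sending those NAL units over the network one at a time. Unfortunately, this means that only the first NAL unit gets processed, which may cover less than a quarter of the picture, and the rest are discarded. What you need to do is work out which NALs are part of the same frame and bundle them together. Unfortunately, that requires partially parsing the PPS and the frames themselves, which is not trivial. Many thanks to the post on the Apple site that put me on the right track.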
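The bundling step can be sketched roughly as follows. This is a simplified illustration, not the exact code from the application: it assumes that every picture's first slice starts at macroblock 0 (first_mb_in_slice == 0), which holds for typical encoder output without arbitrary slice ordering or redundant pictures. The full access-unit boundary rule in the H.264 specification compares more slice header fields (frame_num, pic_parameter_set_id and others) and is what requires the deeper parsing mentioned above. FrameAssembler and its helpers are hypothetical names; SPS and PPS are assumed to be handled separately, as in the question, and other non-slice NALs are simply skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One NAL unit: header byte first, no start code and no length prefix.
using Nal = std::vector<uint8_t>;

// True for coded slices: nal_unit_type 1 (non-IDR) or 5 (IDR).
static bool isSlice(const Nal& nal) {
    if (nal.empty()) return false;
    const int type = nal[0] & 0x1F;
    return type == 1 || type == 5;
}

// True if the slice header begins with first_mb_in_slice == 0.
// That field is the first ue(v) value after the NAL header byte, and the
// Exp-Golomb code for 0 is a single '1' bit, so checking the top bit is enough.
static bool isFirstSliceOfPicture(const Nal& nal) {
    return isSlice(nal) && nal.size() > 1 && (nal[1] & 0x80) != 0;
}

// Hypothetical helper that collects the slices of one picture into one buffer.
class FrameAssembler {
public:
    // Feed NALs in stream order. Returns the previous frame, serialized as
    // [4-byte big-endian length][NAL]..., when the first slice of the next
    // picture arrives; returns an empty vector otherwise.
    std::vector<uint8_t> push(const Nal& nal) {
        std::vector<uint8_t> finished;
        if (!isSlice(nal)) return finished;  // parameter sets etc. are handled elsewhere
        if (isFirstSliceOfPicture(nal)) finished.swap(pending_);
        append(nal);
        return finished;
    }

    // Returns whatever is still buffered, for example at end of stream.
    std::vector<uint8_t> flush() {
        std::vector<uint8_t> finished;
        finished.swap(pending_);
        return finished;
    }

private:
    void append(const Nal& nal) {
        assert(nal.size() <= 0xFFFFFFFFu);
        const uint32_t len = static_cast<uint32_t>(nal.size());
        for (int shift = 24; shift >= 0; shift -= 8)
            pending_.push_back(static_cast<uint8_t>(len >> shift));
        pending_.insert(pending_.end(), nal.begin(), nal.end());
    }

    std::vector<uint8_t> pending_;
};
```

Each non-empty buffer returned by push or flush would then go into a single block buffer and sample buffer, with the sample size set to the size of the whole buffer rather than of one NAL. Note that this approach only emits a frame once the next one starts, so it adds one frame of latency unless the transport can signal the end of a frame itself.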