Source: http://blog.csdn.net/wangruihit/article/details/46550853
VideoToolbox is a framework that Apple opened up in iOS 8, providing hardware-accelerated H.264 encoding and decoding on the iOS platform.
I put this set of interfaces together largely on my own over four or five days. Along the way I mainly relied on the WWDC 2014 session 513 video on hardware codecs, the vtenc/vtdec modules of OpenWebRTC, and parts of Chromium's code:
https://src.chromium.org/svn/trunk/src/content/common/gpu/media/vt_video_decode_accelerator.cc
https://chromium.googlesource.com/chromium/src/media/+/cea1808de66191f7f1eb48b5579e602c0c781146/cast/sender/h264_vt_encoder.cc
as well as some Stack Overflow threads, such as
http://stackoverflow.com/questions/29525000/how-to-use-videotoolbox-to-decompress-h-264-video-stream
http://stackoverflow.com/questions/24884827/possible-locations-for-sequence-picture-parameter-sets-for-h-264-stream
and Apple developer forum posts such as
https://devforums.apple.com/message/1063536#1063536
Things to watch out for along the way:
1. YUV data format
WebRTC hands the encoder I420 data, which corresponds to kCVPixelFormatType_420YpCbCr8Planar in VideoToolbox. If the session uses kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange (i.e. NV12), the I420 frame must be converted to NV12 before encoding; the libyuv library can do this conversion.
I420 has three planes holding the Y, U, and V data contiguously, like: YYYYYYYYYY...UUUUUU...VVVVVV...
NV12 has only two planes: a contiguous Y plane followed by interleaved UV data, like: YYYYYYYYY...UVUVUV...
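The plane shuffle libyuv performs can be sketched in plain C. This is a hypothetical helper for tightly packed frames only; real code should call libyuv's I420ToNV12, which handles per-plane strides and is SIMD-optimized:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert a tightly packed I420 frame (Y plane, then U plane, then V plane)
 * into NV12 (Y plane, then interleaved UV plane). */
static void i420_to_nv12(const uint8_t *src, uint8_t *dst,
                         int width, int height) {
    size_t y_size = (size_t)width * height;
    size_t c_size = y_size / 4;            /* each chroma plane is (w/2)*(h/2) */
    const uint8_t *src_u = src + y_size;
    const uint8_t *src_v = src + y_size + c_size;

    memcpy(dst, src, y_size);              /* the Y plane is identical */

    uint8_t *dst_uv = dst + y_size;
    for (size_t i = 0; i < c_size; i++) {  /* interleave U and V samples */
        dst_uv[2 * i]     = src_u[i];
        dst_uv[2 * i + 1] = src_v[i];
    }
}
```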
Whether the session encodes I420 or NV12 depends on the attributes you set when creating it:
CFMutableDictionaryRef source_attrs = CFDictionaryCreateMutable(
    NULL, 0, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);

CFNumberRef number;

number = CFNumberCreate(NULL, kCFNumberSInt16Type, &codec_settings->width);
CFDictionarySetValue(source_attrs, kCVPixelBufferWidthKey, number);
CFRelease(number);

number = CFNumberCreate(NULL, kCFNumberSInt16Type, &codec_settings->height);
CFDictionarySetValue(source_attrs, kCVPixelBufferHeightKey, number);
CFRelease(number);

// The pixel format chosen here decides which layout the session expects.
OSType pixelFormat = kCVPixelFormatType_420YpCbCr8Planar;
number = CFNumberCreate(NULL, kCFNumberSInt32Type, &pixelFormat);
CFDictionarySetValue(source_attrs, kCVPixelBufferPixelFormatTypeKey, number);
CFRelease(number);

CFDictionarySetValue(source_attrs, kCVPixelBufferOpenGLESCompatibilityKey, kCFBooleanTrue);

OSStatus ret = VTCompressionSessionCreate(NULL, codec_settings->width,
    codec_settings->height, kCMVideoCodecType_H264, NULL, source_attrs, NULL,
    EncodedFrameCallback, this, &encoder_session_);
CFRelease(source_attrs);  // release on both the success and the error path
if (ret != 0) {
  WEBRTC_TRACE(webrtc::kTraceError, webrtc::kTraceVideoCoding, -1,
               "vt_encoder::InitEncode() fails to create encoder ret_val %d",
               ret);
  return WEBRTC_VIDEO_CODEC_ERROR;
}
2. The data VideoToolbox encodes comes out in AVCC format and must be converted to Annex-B before it is handed back to WebRTC. The main difference is whether each NAL unit is preceded by a length field or a start code; see the Stack Overflow threads above. Conversely, when decoding, WebRTC's Annex-B stream must be converted to AVCC before it is fed to the decoder.
Annex-B: StartCode + NALU1 + StartCode + NALU2 + ...
AVCC: NALU1 length + NALU1 + NALU2 length + NALU2 + ...
Note: the length fields in AVCC must be big-endian. Their width is configurable, usually 1, 2, or 4 bytes, and must be communicated to the decoder through the API.
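As a sketch of the Annex-B to AVCC direction: when every start code is the 4-byte form and the AVCC length field is also 4 bytes, the conversion can be done in place by overwriting each start code with a big-endian length. This is a simplified illustration only; real streams may also contain 3-byte start codes, and a payload byte pattern could be misread as a start code without proper NAL parsing:

```c
#include <stddef.h>
#include <stdint.h>

/* Replace each 4-byte start code (00 00 00 01) in an Annex-B buffer with a
 * 4-byte big-endian NAL length, in place. Assumes every NAL unit is preceded
 * by a 4-byte start code. */
static void annexb_to_avcc(uint8_t *buf, size_t size) {
    size_t nal_start = 0;  /* offset of the current NAL's start code */
    size_t pos = 4;        /* first payload byte follows the first start code */
    while (pos <= size) {
        int at_end = (pos == size);
        int at_start_code = !at_end && pos + 4 <= size &&
            buf[pos] == 0 && buf[pos + 1] == 0 &&
            buf[pos + 2] == 0 && buf[pos + 3] == 1;
        if (at_end || at_start_code) {
            /* current NAL ends here: write its length over its start code */
            uint32_t nal_len = (uint32_t)(pos - nal_start - 4);
            buf[nal_start]     = (uint8_t)(nal_len >> 24);  /* big endian */
            buf[nal_start + 1] = (uint8_t)(nal_len >> 16);
            buf[nal_start + 2] = (uint8_t)(nal_len >> 8);
            buf[nal_start + 3] = (uint8_t)(nal_len);
            if (at_end)
                break;
            nal_start = pos;
            pos += 4;
        } else {
            pos++;
        }
    }
}
```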
3. Creating a CMVideoFormatDescription
Decoding requires creating a VTDecompressionSession, which needs a CMVideoFormatDescription parameter. To build one, first extract the SPS and PPS from the bitstream, then create the format description with the following interface:
CM_EXPORT
OSStatus CMVideoFormatDescriptionCreateFromH264ParameterSets(
    CFAllocatorRef allocator,
    size_t parameterSetCount,
    const uint8_t * const * parameterSetPointers,
    const size_t * parameterSetSizes,
    int NALUnitHeaderLength,
    CMFormatDescriptionRef *formatDescriptionOut )
    __OSX_AVAILABLE_STARTING(__MAC_10_9,__IPHONE_7_0);
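Locating the SPS and PPS in an Annex-B stream can be sketched as below. This is a hypothetical helper (4-byte start codes assumed, no emulation-prevention handling); the returned pointers and lengths are what you would pass as parameterSetPointers and parameterSetSizes, with NALUnitHeaderLength matching your AVCC length-field width (e.g. 4):

```c
#include <stddef.h>
#include <stdint.h>

/* Find the first SPS (NAL type 7) and PPS (NAL type 8) in an Annex-B buffer.
 * Returns 1 if both were found. The pointers reference into buf. */
static int find_param_sets(const uint8_t *buf, size_t size,
                           const uint8_t **sps, size_t *sps_len,
                           const uint8_t **pps, size_t *pps_len) {
    *sps = *pps = NULL;
    size_t i = 0;
    while (i + 4 <= size) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 0 && buf[i + 3] == 1) {
            size_t nal = i + 4;           /* NAL unit begins after start code */
            size_t j = nal;
            while (j + 4 <= size &&       /* scan for the next start code */
                   !(buf[j] == 0 && buf[j + 1] == 0 &&
                     buf[j + 2] == 0 && buf[j + 3] == 1))
                j++;
            size_t end = (j + 4 <= size) ? j : size;
            uint8_t type = buf[nal] & 0x1F;  /* low 5 bits = NAL unit type */
            if (type == 7) { *sps = buf + nal; *sps_len = end - nal; }
            if (type == 8) { *pps = buf + nal; *pps_len = end - nal; }
            i = end;
        } else {
            i++;
        }
    }
    return *sps != NULL && *pps != NULL;
}
```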
4. Determining whether a frame VideoToolbox encoded is a keyframe
This code is taken from Ericsson's OpenWebRTC:
static bool
vtenc_buffer_is_keyframe (CMSampleBufferRef sbuf)
{
  bool result = FALSE;
  CFArrayRef attachments_for_sample;

  attachments_for_sample = CMSampleBufferGetSampleAttachmentsArray (sbuf, 0);
  if (attachments_for_sample != NULL) {
    CFDictionaryRef attachments;
    CFBooleanRef depends_on_others;

    attachments = (CFDictionaryRef)CFArrayGetValueAtIndex (attachments_for_sample, 0);
    depends_on_others = (CFBooleanRef)CFDictionaryGetValue (attachments,
        kCMSampleAttachmentKey_DependsOnOthers);
    /* a frame that depends on no other frame is a keyframe */
    result = (depends_on_others == kCFBooleanFalse);
  }

  return result;
}
5. Checking whether VideoToolbox can still decode correctly after the SPS and PPS change
Use the following interface to decide whether the session needs to be recreated:
VT_EXPORT Boolean
VTDecompressionSessionCanAcceptFormatDescription(
    VTDecompressionSessionRef session,
    CMFormatDescriptionRef newFormatDesc ) __OSX_AVAILABLE_STARTING(__MAC_10_8,__IPHONE_8_0);
6. PTS
The PTS affects VideoToolbox's encoding quality. The duration parameter is the length of one frame, expressed in sample ticks; with the usual 90 kHz video clock and a frame rate of 30 fps, duration = sampleRate / frameRate = 90000 / 30 = 3000.
The pts is the display time of the current frame in the same units, i.e. n * sampleRate / frameRate for the n-th frame.
VT_EXPORT OSStatus
VTCompressionSessionEncodeFrame(
    VTCompressionSessionRef session,
    CVImageBufferRef imageBuffer,
    CMTime presentationTimeStamp,
    CMTime duration,
    CFDictionaryRef frameProperties,
    void *sourceFrameRefCon,
    VTEncodeInfoFlags *infoFlagsOut );
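The timestamp arithmetic above is simple enough to spell out; on iOS the results would be wrapped as CMTime values (e.g. CMTimeMake(value, 90000)) before being passed to VTCompressionSessionEncodeFrame:

```c
#include <stdint.h>

/* Duration of one frame in clock ticks: sampleRate / frameRate. */
static int64_t frame_duration(int sample_rate, int frame_rate) {
    return sample_rate / frame_rate;
}

/* Presentation time of the n-th frame, in the same tick units. */
static int64_t frame_pts(int64_t n, int sample_rate, int frame_rate) {
    return n * frame_duration(sample_rate, frame_rate);
}
```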
7. Encoding options
kVTCompressionPropertyKey_AllowTemporalCompression
kVTCompressionPropertyKey_AllowFrameReordering
AllowTemporalCompression controls whether P-frames are produced.
AllowFrameReordering controls whether B-frames are produced.
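For example, a low-latency configuration that keeps P-frames but disables B-frames might look like the following fragment (assuming session is an existing VTCompressionSessionRef; this only compiles against the VideoToolbox framework):

```c
/* Allow P-frames (inter prediction), but no B-frames (no reordering delay). */
VTSessionSetProperty(session, kVTCompressionPropertyKey_AllowTemporalCompression,
                     kCFBooleanTrue);
VTSessionSetProperty(session, kVTCompressionPropertyKey_AllowFrameReordering,
                     kCFBooleanFalse);
```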
8. Use the built-in pixel buffer pool to improve performance.
Creating a compression session automatically creates a CVPixelBufferPool that serves as a pool of reusable buffers, avoiding the overhead of repeatedly allocating and freeing frame memory:
VT_EXPORT CVPixelBufferPoolRef
VTCompressionSessionGetPixelBufferPool(
    VTCompressionSessionRef session ) __OSX_AVAILABLE_STARTING(__MAC_10_8, __IPHONE_8_0);

CV_EXPORT CVReturn CVPixelBufferPoolCreatePixelBuffer(CFAllocatorRef allocator,
    CVPixelBufferPoolRef pixelBufferPool,
    CVPixelBufferRef *pixelBufferOut) __OSX_AVAILABLE_STARTING(__MAC_10_4,__IPHONE_4_0);
There are many more details beyond these, and any single mistake leads to all sorts of strange crashes or encode/decode failures. Reading through the links I listed above will help a lot.
In my tests, iOS 8 hardware encoding and decoding works very well: the video is visibly sharper than OpenH264's output, it easily sustains 30 fps, and the rate control is noticeably more accurate.