How do I combine multiple videos with AVMutableComposition without losing audio sync?
I'm trying to write a video exporter that combines N videos and places them side by side; the example below uses 3 videos. The videos are recordings of speakers' WebRTC streams, so each one has a slightly different frame rate and may also be recorded at a variable frame rate, because RTC video sometimes lowers its rate.

The problem is that when the videos are merged side by side into an hstack, the audio drifts out of sync with its video (after a while the audio no longer follows the speaker's lips).

I also tried pre-exporting the individual videos to a constant frame rate with ffmpeg, but the desync still happens. A similar audio desync occurs when I use ffmpeg's hstack filter, so in the end I gave up on that route and went back to combining with AVFoundation.

Any suggestions on how to keep the audio in sync with the combined video?
import AVFoundation

func hstackVideos() {
    let videoPaths: [String] = [
        "path/to/video1.mp4",
        "path/to/video2.mp4",
        "path/to/video3.mp4",
    ]

    let composition = AVMutableComposition()

    // For each source file, add one video track and one audio track to the composition.
    let assetInfos: [(AVURLAsset, AVAssetTrack, AVMutableCompositionTrack, AVAssetTrack, AVMutableCompositionTrack)] = videoPaths.map {
        let asset = AVURLAsset(url: URL(fileURLWithPath: $0))
        let track = composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid)!
        let videoAssetTrack = asset.tracks(withMediaType: .video)[0]
        try! track.insertTimeRange(videoAssetTrack.timeRange, of: videoAssetTrack, at: .zero)
        let audioTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)!
        let audioAssetTrack = asset.tracks(withMediaType: .audio)[0]
        try! audioTrack.insertTimeRange(audioAssetTrack.timeRange, of: audioAssetTrack, at: .zero)
        return (asset, videoAssetTrack, track, audioAssetTrack, audioTrack)
    }

    let stackComposition = AVMutableVideoComposition()
    stackComposition.renderSize = CGSize(width: 512, height: 288)
    stackComposition.frameDuration = CMTime(seconds: 1/30, preferredTimescale: 600)
    // stackComposition.frameDuration = assetInfos[0].1.minFrameDuration

    // One layer instruction per video: aspect-fill scale, crop to the column
    // width, then translate the column into its slot.
    var i = 0
    let instructions: [AVMutableVideoCompositionLayerInstruction] = assetInfos.map { (_, assetTrack, compTrack, _, _) in
        let lInst = AVMutableVideoCompositionLayerInstruction(assetTrack: compTrack)
        let w: CGFloat = 512 / CGFloat(assetInfos.count)
        let inRatio = assetTrack.naturalSize.width / assetTrack.naturalSize.height
        let cropRatio = w / 288
        let scale: CGFloat
        if inRatio < cropRatio {
            scale = w / assetTrack.naturalSize.width
        } else {
            scale = 288 / assetTrack.naturalSize.height
        }
        lInst.setCropRectangle(CGRect(x: w/scale, y: 0, width: w/scale, height: 288/scale), at: .zero)
        let transform = CGAffineTransform(scaleX: scale, y: scale)
        let t2 = transform.concatenating(CGAffineTransform(translationX: -w + CGFloat(i)*w, y: 0))
        lInst.setTransform(t2, at: .zero)
        i += 1
        return lInst
    }

    let inst = AVMutableVideoCompositionInstruction()
    inst.timeRange = CMTimeRange(start: .zero, duration: assetInfos[0].0.duration)
    inst.layerInstructions = instructions
    stackComposition.instructions = [inst]

    let exporter = AVAssetExportSession(asset: composition, presetName: AVAssetExportPresetHighestQuality)!
    let outPath = "path/to/finalVideo.mp4"
    let outUrl = URL(fileURLWithPath: outPath)
    try? FileManager.default.removeItem(at: outUrl)
    exporter.outputURL = outUrl
    exporter.videoComposition = stackComposition
    exporter.outputFileType = .mp4
    exporter.shouldOptimizeForNetworkUse = true

    // Block the calling thread until the asynchronous export finishes.
    let group = DispatchGroup()
    group.enter()
    exporter.exportAsynchronously {
        switch exporter.status {
        case .completed:
            print("SUCCESS!")
            if exporter.error != nil {
                print("Error: \(String(describing: exporter.error))")
                print("Description: \(exporter.description)")
            }
            group.leave()
        case .exporting:
            // Note: the completion handler only fires with a terminal status,
            // so this branch is effectively unreachable.
            print("Progress: \(exporter.progress)")
        case .failed:
            print("Error: \(String(describing: exporter.error))")
            print("Description: \(exporter.description)")
            group.leave()
        default:
            break
        }
    }
    group.wait()
}
[Update 29/07/2021]
I checked the input and output durations of the audio and video tracks. The results (in seconds):

Input videos:
- Video 1: (video track: 1086.586, audio track: 1086.483)
- Video 2: (video track: 1086.534, audio track: 1086.473)
- Video 3: (video track: 1086.5, audio track: 1086.483)

The output video has three audio tracks whose durations are noticeably altered: (video track: 1086.5855, audio track 1: 1079.208, audio track 2: 1083.88266666666666, audio track 3: 1086.5855)

I also noticed slight differences between the nominalFrameRate of the source and destination audio tracks (source rates: 46.786236, 46.561222, 46.762463; destination rates: 46.874996, 46.875, 46.875). This could explain the duration differences, although I don't know what a frame rate means for audio or why the exporter changes it.
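For reference, this is roughly how I collected the numbers above (a small sketch; dumpTrackInfo and its path argument are just placeholders). One observation: 46.875 is exactly 48000/1024, the packet rate of AAC audio sampled at 48 kHz, so it looks like the exporter re-encodes the audio at a fixed packet rate while the source files carry slightly irregular rates.

import AVFoundation

// Sketch: print each track's duration and nominalFrameRate for one file.
func dumpTrackInfo(path: String) {
    let asset = AVURLAsset(url: URL(fileURLWithPath: path))
    for track in asset.tracks {
        print("\(track.mediaType.rawValue) track:",
              "duration \(track.timeRange.duration.seconds) s,",
              "nominalFrameRate \(track.nominalFrameRate)")
    }
}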
I also tried using AVMutableAudioMix, but the sync problem persists.
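For completeness, this is roughly how I wired up the audio mix (a sketch; it only creates default input parameters for each composition audio track, which may be why it changed nothing):

// Sketch of the AVMutableAudioMix attempt: default input parameters for
// each composition audio track, attached to the export session.
let audioMix = AVMutableAudioMix()
audioMix.inputParameters = assetInfos.map { (_, _, _, _, compAudioTrack) in
    AVMutableAudioMixInputParameters(track: compAudioTrack)
}
exporter.audioMix = audioMix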
It looks as if the input files apply some kind of scaling to the audio tracks' durations, and that scaling is lost when the tracks are inserted into the composition. Any suggestions on how to recover it?
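For anyone experimenting with this, one direction I have been trying (an untested sketch, not a confirmed fix) is to rescale each inserted audio track so it spans exactly its video track's duration, hoping this reapplies whatever implicit scaling the source file carried:

// Untested sketch: after insertTimeRange, pin each composition audio track
// to its video track's duration to compensate for the drift measured above.
for (_, videoAssetTrack, _, _, compAudioTrack) in assetInfos {
    let inserted = CMTimeRange(start: .zero, duration: compAudioTrack.timeRange.duration)
    compAudioTrack.scaleTimeRange(inserted, toDuration: videoAssetTrack.timeRange.duration)
}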