微信公众号搜"智元新知"关注
微信扫一扫可直接关注哦!

从设备摄像头捕获多个帧并将它们发送到学习模型?

如何解决从设备摄像头捕获多个帧并将它们发送到学习模型?

我正在尝试开发一个移动应用来识别手语,使用以下代码使用 MediaPipe 从设备摄像头捕获帧以进行识别。

# Tracks and renders pose + hands + face landmarks.

# GPU buffer. (GpuBuffer)
input_stream: "input_video"

# GPU image with rendered results. (GpuBuffer)
output_stream: "output_video"

# Throttles the images flowing downstream for flow control. It passes through
# the very first incoming image unaltered,and waits for downstream nodes
# (calculators and subgraphs) in the graph to finish their tasks before it
# passes through another image. All images that come in while waiting are
# dropped,limiting the number of in-flight images in most part of the graph to
# 1. This prevents the downstream nodes from queuing up incoming images and data
# excessively,which leads to increased latency and memory usage,unwanted in
# real-time mobile applications. It also eliminates unnecessarily computation,# e.g.,the output produced by a node may get dropped downstream if the
# subsequent nodes are still busy processing prevIoUs inputs.
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:output_video"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video"
  node_options: {
    [type.googleapis.com/mediapipe.FlowLimiterCalculatorOptions] {
      max_in_flight: 1
      max_in_queue: 1
      # Timeout is disabled (set to 0) as first frame processing can take more
      # than 1 second.
      in_flight_timeout: 0
    }
  }
}

node {
  calculator: "SlrLandmarkGpu"
  input_stream: "IMAGE:throttled_input_video"
  output_stream: "POSE_LANDMARKS:pose_landmarks"
  output_stream: "POSE_ROI:pose_roi"
  output_stream: "POSE_DETECTION:pose_detection"
  output_stream: "LEFT_HAND_LANDMARKS:left_hand_landmarks"
  output_stream: "RIGHT_HAND_LANDMARKS:right_hand_landmarks"
}

# Gets image size.
node {
  calculator: "ImagePropertiesCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  output_stream: "SIZE:image_size"
}

# Converts pose,hands landmarks to a render data vector.
node {
  calculator: "SlrTrackingToRenderData"
  input_stream: "IMAGE_SIZE:image_size"
  input_stream: "POSE_LANDMARKS:pose_landmarks"
  input_stream: "POSE_ROI:pose_roi"
  input_stream: "LEFT_HAND_LANDMARKS:left_hand_landmarks"
  input_stream: "RIGHT_HAND_LANDMARKS:right_hand_landmarks"
  output_stream: "RENDER_DATA_VECTOR:render_data_vector"
}

# Draws annotations and overlays them on top of the input images.
node {
  calculator: "AnnotationOverlayCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  input_stream: "VECTOR:render_data_vector"
  output_stream: "IMAGE_GPU:output_video_pre"
}

node {
  calculator: "SlrDetectionGpu"
  input_stream: "IMAGE:output_video_pre"
  output_stream: "MEANINGSIGNAL:slr_meaning_render_data"
}

# Draws annotations and overlays them on top of the input images.
node {
  calculator: "AnnotationOverlayCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  input_stream: "slr_meaning_render_data"
  input_stream: "VECTOR:render_data_vector"
  output_stream: "IMAGE_GPU:output_video"
}

代码允许我捕获单个帧,但我无法将其更改为捕获多个帧。我想将其更改为每秒捕获 6 帧 (30fps / 5),然后将它们分组以将它们发送回学习模型进行识别。我怎么能做出这种改变?我已经尝试过,但无法将其更改为捕获多个帧,并且没有关于如何执行此操作的想法。欢迎提供任何帮助,不胜感激。

版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。