
How do you correctly implement MediaPipe side packets in an Android application? And how would you infer iris direction using the MediaPipe Iris solution?

I'm hoping someone can help me with some ideas, or point me towards further reading material, on creating a custom Android app with MediaPipe using the Iris .aar. I have pored over the official MediaPipe documentation but found it somewhat limited, and I'm now struggling to make progress. Specifically, I'm trying to add the side packet that the Iris model expects, and to extract specific landmark coordinates in real time.

My goal is to create an open-source, gaze-direction-driven text-to-speech keyboard for accessibility, using a modified MediaPipe Iris solution to infer the user's gaze direction and control the app. I'd greatly appreciate any help with this.

This is my current development plan and progress so far:

  1. Set up MediaPipe from the command line and build the examples - DONE
  2. Generate the .aars for face detection and iris tracking - DONE
  3. Set up Android Studio to build MediaPipe apps - DONE
  4. Build and test the face detection example app using the .aar - DONE
  5. Modify the face detection example to use the Iris .aar - IN PROGRESS
  6. Output the coordinates of the iris and the eye corners, and the distances between them, to estimate gaze direction in real time (see the sketch after this list). Or, alternatively, modify the graph and calculators to do that inference for me, if possible, and rebuild the .aar.
  7. Integrate gaze direction into the app's control scheme.
  8. Expand the app's functionality once the initial controls are implemented.
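
For step 6, my rough idea is the sketch below: register a packet callback on the graph's landmark output stream and compare the iris centre against the eye corners. This is only a sketch under assumptions I still need to verify: that the "face_landmarks_with_iris" stream of iris_tracking_gpu carries a single NormalizedLandmarkList per frame (the 468 face-mesh points plus 5 per iris), and that index 468 is an iris centre with 33/133 as the matching eye corners (these are just the commonly quoted face-mesh indices).

    // Sketch only: extract landmarks in real time and estimate a horizontal gaze ratio.
    // Assumed: "face_landmarks_with_iris" delivers one NormalizedLandmarkList per frame,
    // and indices 468 (iris centre) and 33/133 (eye corners) are the right ones.
    import com.google.mediapipe.formats.proto.LandmarkProto.NormalizedLandmarkList;
    import com.google.mediapipe.framework.PacketGetter;

    // In onCreate(), after the FrameProcessor has been constructed:
    processor.addPacketCallback(
        OUTPUT_LANDMARKS_STREAM_NAME,
        (packet) -> {
            try {
                NormalizedLandmarkList landmarks =
                        NormalizedLandmarkList.parseFrom(PacketGetter.getProtoBytes(packet));
                float irisX  = landmarks.getLandmark(468).getX(); // iris centre (normalized x)
                float outerX = landmarks.getLandmark(33).getX();  // outer eye corner
                float innerX = landmarks.getLandmark(133).getX(); // inner eye corner
                // Position of the iris between the two corners: ~0.5 means looking straight ahead.
                float gazeRatio = (irisX - outerX) / (innerX - outerX);
                Log.v(TAG, "horizontal gaze ratio: " + gazeRatio);
            } catch (com.google.protobuf.InvalidProtocolBufferException e) {
                Log.e(TAG, "Could not parse the landmark packet", e);
            }
        });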

So far I have generated the Iris .aar using the build file below. Does the .aar I built include the calculators for the subgraphs and the main graph, or do I need to add something else to my AAR build file?

.aar build file

load("//mediapipe/java/com/google/mediapipe:mediapipe_aar.bzl","mediapipe_aar")
mediapipe_aar(
name = "mp_iris_tracking_aar",calculators = ["//mediapipe/graphs/iris_tracking :iris_tracking_gpu_deps"],)

At the moment I have an Android Studio project that contains the following assets and the previously mentioned Iris .aar.

Android Studio Assets:
iris_tracking_gpu.binarypb
face_landmark.tflite
iris_landmark.tflite
face_detection_front.tflite

For now I'm just trying to build it as-is, so that I understand the process better and can verify that my build environment is set up correctly. I have successfully built and tested the face detection examples listed in the documentation, and they run correctly. However, when I modify the project to use the Iris .aar, it builds correctly but crashes at runtime with the exception: Side packet "focal_length_pixel" is required but was not provided.

I have tried adding the focal length code to onCreate based on the Iris example in the MediaPipe repo, but I can't work out how to modify it to work with the Iris .aar. Is there any other documentation I could read that would point me in the right direction?

I need to integrate this snippet (I think) into the modified code of the face detection example, but I'm not sure how. Thanks for your help :)

    float focalLength = cameraHelper.getFocalLengthPixels();
    if (focalLength != Float.MIN_VALUE) {
      Packet focalLengthSidePacket = processor.getPacketCreator().createFloat32(focalLength);
      Map<String, Packet> inputSidePackets = new HashMap<>();
      inputSidePackets.put(FOCAL_LENGTH_STREAM_NAME, focalLengthSidePacket);
      processor.setInputSidePackets(inputSidePackets);
    }
    haveAddedSidePackets = true;
Modified Face Tracking AAR example:
package com.example.iristracking;

// Copyright 2019 The MediaPipe Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

import android.graphics.SurfaceTexture;
import android.os.Bundle;
import android.util.Log;
import java.util.HashMap;
import java.util.Map;
import androidx.appcompat.app.AppCompatActivity;
import android.util.Size;
import android.view.SurfaceHolder;
import android.view.SurfaceView;
import android.view.View;
import android.view.ViewGroup;
import com.google.mediapipe.components.CameraHelper;
import com.google.mediapipe.components.CameraXPreviewHelper;
import com.google.mediapipe.components.ExternalTextureConverter;
import com.google.mediapipe.components.FrameProcessor;
import com.google.mediapipe.components.PermissionHelper;
import com.google.mediapipe.framework.AndroidAssetUtil;
import com.google.mediapipe.framework.Packet;
import com.google.mediapipe.glutil.EglManager;

/** Main activity of MediaPipe example apps. */
public class MainActivity extends AppCompatActivity {
private static final String TAG = "MainActivity";
private boolean haveAddedSidePackets = false;

private static final String FOCAL_LENGTH_STREAM_NAME = "focal_length_pixel";
private static final String OUTPUT_LANDMARKS_STREAM_NAME = "face_landmarks_with_iris";

private static final String BINARY_GRAPH_NAME = "iris_tracking_gpu.binarypb";
private static final String INPUT_VIDEO_STREAM_NAME = "input_video";
private static final String OUTPUT_VIDEO_STREAM_NAME = "output_video";
private static final CameraHelper.CameraFacing CAMERA_FACING = CameraHelper.CameraFacing.FRONT;

// Flips the camera-preview frames vertically before sending them into FrameProcessor to be
// processed in a MediaPipe graph, and flips the processed frames back when they are displayed.
// This is needed because OpenGL represents images assuming the image origin is at the bottom-left
// corner, whereas MediaPipe in general assumes the image origin is at top-left.
private static final boolean FLIP_FRAMES_VERTICALLY = true;

static {
    // Load all native libraries needed by the app.
    System.loadLibrary("mediapipe_jni");
    System.loadLibrary("opencv_java3");
}

// {@link SurfaceTexture} where the camera-preview frames can be accessed.
private SurfaceTexture previewFrameTexture;
// {@link SurfaceView} that displays the camera-preview frames processed by a MediaPipe graph.
private SurfaceView previewdisplayView;

// Creates and manages an {@link EGLContext}.
private EglManager eglManager;
// Sends camera-preview frames into a MediaPipe graph for processing,and displays the processed
// frames onto a {@link Surface}.
private FrameProcessor processor;
// Converts the GL_TEXTURE_EXTERNAL_OES texture from Android camera into a regular texture to be
// consumed by {@link FrameProcessor} and the underlying MediaPipe graph.
private ExternalTextureConverter converter;

// Handles camera access via the {@link CameraX} Jetpack support library.
private CameraXPreviewHelper cameraHelper;


@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main);

    previewdisplayView = new SurfaceView(this);
    setupPreviewdisplayView();

    // Initialize asset manager so that MediaPipe native libraries can access the app assets, e.g.,
    // binary graphs.
    AndroidAssetUtil.initializeNativeAssetManager(this);

    eglManager = new EglManager(null);
    processor =
            new FrameProcessor(
                    this,
                    eglManager.getNativeContext(),
                    BINARY_GRAPH_NAME,
                    INPUT_VIDEO_STREAM_NAME,
                    OUTPUT_VIDEO_STREAM_NAME);
    processor.getVideoSurfaceOutput().setFlipY(FLIP_FRAMES_VERTICALLY);

    PermissionHelper.checkAndRequestCameraPermissions(this);


}

@Override
protected void onResume() {
    super.onResume();
    converter = new ExternalTextureConverter(eglManager.getContext());
    converter.setFlipY(FLIP_FRAMES_VERTICALLY);
    converter.setConsumer(processor);
    if (PermissionHelper.cameraPermissionsGranted(this)) {
        startCamera();
    }
}

@Override
protected void onPause() {
    super.onPause();
    converter.close();
}

@Override
public void onRequestPermissionsResult(
        int requestCode, String[] permissions, int[] grantResults) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults);
    PermissionHelper.onRequestPermissionsResult(requestCode, grantResults);
}

private void setupPreviewdisplayView() {
    previewdisplayView.setVisibility(View.GONE);
    ViewGroup viewGroup = findViewById(R.id.preview_display_layout);
    viewGroup.addView(previewdisplayView);

    previewdisplayView
            .getHolder()
            .addCallback(
                    new SurfaceHolder.Callback() {
                        @Override
                        public void surfaceCreated(SurfaceHolder holder) {
                            processor.getVideoSurfaceOutput().setSurface(holder.getSurface());
                        }

                        @Override
                        public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
                            // (Re-)Compute the ideal size of the camera-preview display (the area that the
                            // camera-preview frames get rendered onto, potentially with scaling and rotation)
                            // based on the size of the SurfaceView that contains the display.
                            Size viewSize = new Size(width, height);
                            Size displaySize = cameraHelper.computeDisplaySizeFromViewSize(viewSize);

                            // Connect the converter to the camera-preview frames as its input (via
                            // previewFrameTexture), and configure the output width and height as the computed
                            // display size.
                            converter.setSurfaceTextureAndAttachToGLContext(
                                    previewFrameTexture, displaySize.getWidth(), displaySize.getHeight());
                        }

                        @Override
                        public void surfaceDestroyed(SurfaceHolder holder) {
                            processor.getVideoSurfaceOutput().setSurface(null);
                        }
                    });
}

private void startCamera() {
    cameraHelper = new CameraXPreviewHelper();
    cameraHelper.setOnCameraStartedListener(
            surfaceTexture -> {
                previewFrameTexture = surfaceTexture;
                // Make the display view visible to start showing the preview. This triggers the
                // SurfaceHolder.Callback added to (the holder of) previewdisplayView.
                previewdisplayView.setVisibility(View.VISIBLE);
            });
    cameraHelper.startCamera(this, CAMERA_FACING, /*surfaceTexture=*/ null);

}
}
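
What I think I need to do (based on how the official iristrackinggpu example handles onCameraStarted, though I'd appreciate confirmation) is set the side packet once the camera has started and before the graph receives its first frame, i.e. inside the setOnCameraStartedListener lambda. A sketch of startCamera() rewritten that way, under that assumption:

    private void startCamera() {
        cameraHelper = new CameraXPreviewHelper();
        cameraHelper.setOnCameraStartedListener(
                surfaceTexture -> {
                    previewFrameTexture = surfaceTexture;
                    // The focal length is only known once the camera has actually started,
                    // and the side packet should be set once, before the graph starts running.
                    if (!haveAddedSidePackets) {
                        float focalLength = cameraHelper.getFocalLengthPixels();
                        if (focalLength != Float.MIN_VALUE) {
                            Packet focalLengthSidePacket =
                                    processor.getPacketCreator().createFloat32(focalLength);
                            Map<String, Packet> inputSidePackets = new HashMap<>();
                            inputSidePackets.put(FOCAL_LENGTH_STREAM_NAME, focalLengthSidePacket);
                            processor.setInputSidePackets(inputSidePackets);
                        }
                        haveAddedSidePackets = true;
                    }
                    // Make the display view visible to start showing the preview.
                    previewdisplayView.setVisibility(View.VISIBLE);
                });
        cameraHelper.startCamera(this, CAMERA_FACING, /*surfaceTexture=*/ null);
    }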

Solution

override fun onResume() {
        super.onResume()
        converter = ExternalTextureConverter(eglManager?.context, NUM_BUFFERS)

        if (PermissionHelper.cameraPermissionsGranted(this)) {
            var rotation: Int = 0
            if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.R) {
                rotation = this.display!!.rotation
            } else {
                rotation = this.windowManager.defaultDisplay.rotation
            }

            converter!!.setRotation(rotation)
            converter!!.setFlipY(FLIP_FRAMES_VERTICALLY)

            startCamera(rotation)

            // Provide the "focal_length_pixel" side packet once, before the converter starts
            // feeding frames to the FrameProcessor.
            if (!haveAddedSidePackets) {
                val packetCreator = mediapipeFrameProcessor!!.getPacketCreator()
                val inputSidePackets = mutableMapOf<String, Packet>()

                focalLength = cameraHelper?.focalLengthPixels!!
                Log.i(TAG_MAIN, "OnStarted focalLength: ${cameraHelper?.focalLengthPixels!!}")
                inputSidePackets.put(
                    FOCAL_LENGTH_STREAM_NAME, packetCreator.createFloat32(focalLength.width.toFloat())
                )
                mediapipeFrameProcessor!!.setInputSidePackets(inputSidePackets)
                haveAddedSidePackets = true

                // Camera intrinsics matrix (fx 0 cx / 0 fy cy / 0 0 1), inverted to map landmark
                // pixel coordinates back towards world coordinates.
                val imageSize = cameraHelper!!.imageSize
                val calibrateMatrix = Matrix()
                calibrateMatrix.setValues(
                    floatArrayOf(
                        focalLength.width * 1.0f, 0.0f, imageSize.width / 2.0f,
                        0.0f, focalLength.height * 1.0f, imageSize.height / 2.0f,
                        0.0f, 0.0f, 1.0f
                    )
                )
                val isInvert = calibrateMatrix.invert(matrixPixels2World)
                if (!isInvert) {
                    matrixPixels2World = Matrix()
                }
            }
            converter!!.setConsumer(mediapipeFrameProcessor)
        }
    }
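
A note on the ordering above: the side packets are added before converter.setConsumer(...) is called, presumably so that no frames reach the FrameProcessor (and start the graph running) until "focal_length_pixel" has been provided, since input side packets cannot be changed once the graph is running. The intrinsics matrix built from the focal length and image size (and inverted into matrixPixels2World) looks like a separate convenience for mapping landmark pixel coordinates back towards camera/world coordinates; it isn't required just to clear the missing side-packet exception.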
