
The deep learning Inference Engine backend from the Intel OpenVINO toolkit is one of the backends supported by OpenCV DNN. As mentioned in the previous post, ARM CPU support was recently added to the Inference Engine through a dedicated ARM CPU plugin. Let's review how the OpenCV DNN module can take advantage of the Inference Engine and this plugin to run DL networks on ARM CPUs.
Several options for configuring the Inference Engine with OpenCV are described in the OpenCV wiki. We will build all the components from scratch: OpenVINO, the ARM CPU plugin, and OpenCV, and then run YOLOv4-tiny inference on a Raspberry Pi.
Cross-compiling OpenVINO and OpenCV
We will cross-compile OpenVINO with the plugin and OpenCV in a Docker container on an x86 platform. This speeds up the build considerably: a native build on a Raspberry Pi takes a while.
First, we will create a Docker image with a configured build environment that contains the OpenCV and OpenVINO dependencies and runs the build script. To do this, create a Dockerfile with the following content:
FROM debian:buster
USER root

RUN echo 'deb http://deb.debian.org/debian unstable main' > /etc/apt/sources.list.d/unstable.list

RUN dpkg --add-architecture armhf && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        build-essential \
        crossbuild-essential-armhf \
        python3-dev \
        python3-pip \
        python3-numpy/unstable \
        git-lfs \
        scons \
        wget \
        xz-utils \
        cmake \
        libusb-1.0-0-dev:armhf \
        libgtk-3-dev:armhf \
        libavcodec-dev:armhf \
        libavformat-dev:armhf \
        libswscale-dev:armhf \
        libgstreamer1.0-dev:armhf \
        libgstreamer-plugins-base1.0-dev:armhf \
        libpython3-dev:armhf && \
    rm -rf /var/lib/apt/lists/*

COPY arm_build.sh /arm_build.sh

RUN mkdir /arm
WORKDIR /arm/

CMD ["sh", "/arm_build.sh"]
Before building the Docker image, we need to create the build script. The script consists of three parts:
- Cloning the OpenCV, OpenVINO and OpenVINO contrib repositories
- Building OpenVINO with the ARM CPU plugin
- Building OpenCV with Inference Engine backend support
Create a file named arm_build.sh and add the following content to the script:
#!/bin/sh

set -x

fail()
{
    echo $1
    exit 1
}

git clone --recurse-submodules --shallow-submodules --depth 1 https://github.com/opencv/opencv.git && \
git clone --recurse-submodules --shallow-submodules --depth 1 https://github.com/openvinotoolkit/openvino.git && \
git clone --recurse-submodules --shallow-submodules --depth 1 https://github.com/openvinotoolkit/openvino_contrib.git || \
fail "Failed to clone source repositories"

cd /arm/openvino && \
mkdir openvino_install && mkdir openvino_build && cd openvino_build && \
cmake -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_INSTALL_PREFIX="../openvino_install" \
      -DCMAKE_TOOLCHAIN_FILE="../cmake/arm.toolchain.cmake" \
      -DTHREADING=SEQ \
      -DIE_EXTRA_MODULES=/arm/openvino_contrib/modules/arm_plugin \
      -DTHREADS_PTHREAD_ARG="-pthread" \
      -DCMAKE_INSTALL_LIBDIR=lib \
      -DCMAKE_CXX_FLAGS=-latomic \
      -DENABLE_TESTS=OFF -DENABLE_BEH_TESTS=OFF -DENABLE_FUNCTIONAL_TESTS=OFF .. && \
make --jobs=$(nproc --all) && make install && \
cp /arm/openvino/bin/armv7l/Release/lib/libarmPlugin.so \
   /arm/openvino/openvino_install/deployment_tools/inference_engine/lib/armv7l/ || \
fail "OpenVINO build failed"

cd /arm/opencv && mkdir opencv_install && mkdir opencv_build && cd opencv_build && \
PYTHONVER=`ls /usr/include | grep "python3.*"` && \
cmake -DCMAKE_BUILD_TYPE=Release \
      -DBUILD_LIST=imgcodecs,videoio,highgui,dnn,python3 \
      -DCMAKE_INSTALL_PREFIX="../opencv_install" \
      -DOPENCV_CONFIG_INSTALL_PATH="cmake" \
      -DCMAKE_TOOLCHAIN_FILE="../platforms/linux/arm-gnueabi.toolchain.cmake" \
      -DWITH_IPP=OFF \
      -DBUILD_TESTS=OFF \
      -DBUILD_PERF_TESTS=OFF \
      -DOPENCV_ENABLE_PKG_CONFIG=ON \
      -DPYTHON3_PACKAGES_PATH="../opencv_install/python" \
      -DPKG_CONFIG_EXECUTABLE="/usr/bin/arm-linux-gnueabihf-pkg-config" \
      -DBUILD_opencv_python2=OFF -DBUILD_opencv_python3=ON \
      -DPYTHON3_INCLUDE_PATH="/usr/include/${PYTHONVER}" \
      -DPYTHON3_NUMPY_INCLUDE_DIRS="/usr/lib/python3/dist-packages/numpy/core/include" \
      -DPYTHON3_LIMITED_API=ON \
      -DOPENCV_SKIP_PYTHON_LOADER=ON \
      -DENABLE_NEON=ON \
      -DCPU_BASELINE="NEON" \
      -DWITH_INF_ENGINE=ON \
      -DWITH_NGRAPH=ON \
      -Dngraph_DIR="/arm/openvino/openvino_build/ngraph" \
      -DINF_ENGINE_RELEASE=2021030000 \
      -DInferenceEngine_DIR="/arm/openvino/openvino_build" \
      -DINF_ENGINE_LIB_DIRS="/arm/openvino/bin/armv7l/Release/lib" \
      -DINF_ENGINE_INCLUDE_DIRS="/arm/openvino/inference-engine/include" \
      -DCMAKE_FIND_ROOT_PATH="/arm/openvino" \
      -DENABLE_CXX11=ON .. && \
make --jobs=$(nproc --all) && make install || \
fail "OpenCV build failed"
Now we are ready to build the image and run it:
docker image build -t cross_armhf .
mkdir arm && docker container run -u `id -u`:`id -g` --rm -t -v $PWD/arm:/arm cross_armhf
Once the build script finishes, you will find the OpenCV artifacts in the arm/opencv/opencv_install directory and the OpenVINO artifacts in the arm/openvino/openvino_install directory. We need to copy both sets of artifacts to the target ARM platform. If the target ARM platform is accessible over SSH, you can use the scp tool:
scp -r arm/{opencv/opencv_install/,openvino/openvino_install} <user>@<host>:<path>
Running the application
To evaluate the ARM CPU plugin, we can use the following Python application, which detects objects in a video:
import cv2 as cv
import numpy as np
import os
import argparse


def draw_boxes(image, boxes, confidences, class_ids, idxs):
    if len(idxs) > 0:
        for i in idxs.flatten():
            # extract bounding box coordinates
            left, top = boxes[i][0], boxes[i][1]
            width, height = boxes[i][2], boxes[i][3]
            # draw bounding box and label
            cv.rectangle(image, (left, top), (left + width, top + height), (0, 255, 0))
            label = "%s: %.2f" % (classes[class_ids[i]], confidences[i])
            cv.putText(image, label, (left, top - 5), cv.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0))
    return image


def make_prediction(net, layer_names, labels, frame, conf_threshold, nms_threshold):
    boxes = []
    confidences = []
    class_ids = []
    frame_height, frame_width = frame.shape[:2]

    # create a blob from a frame
    blob = cv.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(layer_names)

    # extract bounding boxes, confidences and class ids
    for output in outputs:
        for detection in output:
            # extract the scores, class id and confidence
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            # consider the predictions that are above the threshold
            if confidence > conf_threshold:
                center_x = int(detection[0] * frame_width)
                center_y = int(detection[1] * frame_height)
                width = int(detection[2] * frame_width)
                height = int(detection[3] * frame_height)
                # get top left corner coordinates
                left = int(center_x - (width / 2))
                top = int(center_y - (height / 2))
                boxes.append([left, top, width, height])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    idxs = cv.dnn.NMSBoxes(boxes, confidences, conf_threshold, nms_threshold)
    return boxes, confidences, class_ids, idxs


parser = argparse.ArgumentParser()
parser.add_argument('--model', default='yolov4-tiny.weights', help='Path to a binary file of model')
parser.add_argument('--config', default='yolov4-tiny.cfg', help='Path to network configuration file')
parser.add_argument('--classes', default='coco.names', help='Path to label file')
parser.add_argument('--conf_threshold', type=float, default=0.5, help='Confidence threshold')
parser.add_argument('--nms_threshold', type=float, default=0.3, help='Non-maximum suppression threshold')
parser.add_argument('--input', help='Path to video file')
parser.add_argument('--output', default='', help='Path to directory for output video file')
args = parser.parse_args()

# load names of classes
classes = open(args.classes).read().rstrip('\n').split('\n')

# load a network
net = cv.dnn.readNet(args.config, args.model)
net.setPreferableBackend(cv.dnn.DNN_BACKEND_INFERENCE_ENGINE)
layer_names = net.getUnconnectedOutLayersNames()

cap = cv.VideoCapture(args.input)

# define the codec and create VideoWriter object
if args.output != '':
    input_file_name = os.path.basename(args.input)
    output_file_name, output_file_format = os.path.splitext(input_file_name)
    output_file_name += '-output'
    if output_file_format != '':
        fourcc = int(cap.get(cv.CAP_PROP_FOURCC))
    else:
        output_file_format = '.mp4'
        fourcc = cv.VideoWriter_fourcc(*'mp4v')
    output_file_path = args.output + output_file_name + output_file_format
    fps = cap.get(cv.CAP_PROP_FPS)
    frame_size = (int(cap.get(cv.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv.CAP_PROP_FRAME_HEIGHT)))
    out = cv.VideoWriter(output_file_path, fourcc, fps, frame_size)

while cv.waitKey(1) < 0:
    hasFrame, frame = cap.read()
    if not hasFrame:
        break
    boxes, confidences, class_ids, idxs = make_prediction(net, layer_names, classes, frame,
                                                          args.conf_threshold, args.nms_threshold)
    frame = draw_boxes(frame, boxes, confidences, class_ids, idxs)
    if args.output != '':
        out.write(frame)
    else:
        cv.imshow('object detection', frame)
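Note that the demo leaves the video capture, and the video writer when a file is written, open after the loop finishes. A minimal optional cleanup, not part of the original demo and reusing its cap, out and args variables:

# optional cleanup after the processing loop (uses the demo's variables)
cap.release()
if args.output != '':
    out.release()
cv.destroyAllWindows()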
There is no ARM-specific code in the demo application. The default CPU target, combined with the detected ARM architecture, tells the Inference Engine to use the ARM CPU plugin for inference. If the same application is run on an x86 platform, the Inference Engine selects the mklDNN backend for model inference instead.
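For reference, here is a minimal sketch of how that selection looks in code; the model paths are placeholders, and setting the target explicitly is optional because DNN_TARGET_CPU is already the default:

import cv2 as cv

# request the Inference Engine backend explicitly
net = cv.dnn.readNet('yolov4-tiny.cfg', 'yolov4-tiny.weights')
net.setPreferableBackend(cv.dnn.DNN_BACKEND_INFERENCE_ENGINE)
# DNN_TARGET_CPU is the default target; with the Inference Engine backend it
# resolves to the ARM CPU plugin on ARM and to the mklDNN-based plugin on x86
net.setPreferableTarget(cv.dnn.DNN_TARGET_CPU)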
Before running the application, we need to download the pre-trained YOLOv4-tiny model and a video file:
wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4-tiny.cfg
wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/coco.names
wget https://raw.githubusercontent.com/intel-iot-devkit/sample-videos/master/people-detection.mp4
We also need to define the LD_LIBRARY_PATH and PYTHONPATH environment variables:
export PYTHONPATH=$PYTHONPATH:<artifacts_dir>/opencv_install/python/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<artifacts_dir>/opencv_install/lib/:<artifacts_dir>/openvino_install/deployment_tools/ngraph/lib/:<artifacts_dir>/openvino_install/deployment_tools/inference_engine/lib/armv7l/
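Before launching the demo, it can be useful to check that Python picks up the cross-compiled cv2 module and that it was built with Inference Engine support. A quick sanity check, assuming the variables above are exported (the exact wording of the build summary may differ between OpenCV versions):

import cv2 as cv

print(cv.__version__)
# the build summary should mention the Inference Engine and nGraph support
# enabled with WITH_INF_ENGINE=ON and WITH_NGRAPH=ON
info = cv.getBuildInformation()
print('\n'.join(line for line in info.splitlines()
                if 'inference' in line.lower() or 'ngraph' in line.lower()))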
Finally, we can run the application:
python3 object_detection.py --model yolov4-tiny.weights --config yolov4-tiny.cfg --classes coco.names --input people-detection.mp4
If the platform cannot display a window, you can save the output to a video file with the --output flag:
python3 object_detection.py --model yolov4-tiny.weights --config yolov4-tiny.cfg --classes coco.names --input people-detection.mp4 --output ./
Conclusion
In this article, you learned how to use the ARM CPU plugin with OpenCV and verified it by running the YOLO object detection demo. If you run into any issues with the ARM CPU plugin, feel free to file a ticket in the OpenVINO contrib repository, where the plugin lives.