Face Mask Recognition in Python (A GitHub Face Mask Detector with an Impressive Recognition Rate)

小君 2023-07-13 03:22:09 418



Author | 一颗小树x, CSDN blogger

Editor | 唐小引

Cover image | downloaded by CSDN from 东方 IC

Published by | CSDN Blog

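Yesterday I came across an interesting open-source project on GitHub that detects whether a person is wearing a face mask. After running it, I found that the recognition rate is quite high and that it copes well with different environments, so I am sharing it here.

First, thanks to AIZOOTech for the open-source project FaceMaskDetection. Its GitHub address is: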

https://github.com/AIZOOTech/FaceMaskDetection


Test environment

We used:

Operating system: Windows;
Software: PyCharm;
Model framework: TensorFlow.

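Here is the result first:

[Image: detection result on a photo of Hu Ge]

The model detects that the handsome Hu Ge is not wearing a mask. The red box marks the face, and the caption above it reads NoMask with a confidence of 1 (that is, the model is 100% sure that no mask is worn).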
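The caption above each box can be reproduced with a few lines. This is a minimal sketch based on the id2class mapping and the "%s: %.2f" format string used in the inference script later in this article; the helper name format_label and the score 0.9987 are made up for illustration. It also shows that a displayed value of 1.00 can be a rounded score rather than exactly 1.

```python
# Class-id to label mapping, as defined in the inference script shown later.
id2class = {0: "Mask", 1: "NoMask"}


def format_label(class_id, conf):
    """Build the caption drawn above a face box (hypothetical helper)."""
    return "%s: %.2f" % (id2class[class_id], conf)


# 0.9987 is an invented score; two-decimal formatting rounds it up.
print(format_label(1, 0.9987))  # NoMask: 1.00
```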

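Can it handle several people at once? See the image below.

[Image: detection result on a photo with several people]

Not bad: the model detects several people at the same time, and does so accurately.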

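What if some people wear masks and others do not?

[Image: detection result on a photo with masked and unmasked people]

This works well too: the model detects the man wearing a mask as well as the two young men without masks.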
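A result like this can also be tallied in code. The inference function in the script later in this article returns one row per face in the form [class_id, conf, xmin, ymin, xmax, ymax]; the sketch below counts the rows per class. The helper name summarize and the three example rows are hypothetical, not output from the real model.

```python
from collections import Counter

# Class-id to label mapping, as defined in the inference script.
id2class = {0: "Mask", 1: "NoMask"}


def summarize(output_info):
    """Count detections per class; each row is [class_id, conf, xmin, ymin, xmax, ymax]."""
    counts = Counter(id2class[row[0]] for row in output_info)
    return {"Mask": counts.get("Mask", 0), "NoMask": counts.get("NoMask", 0)}


# Invented detections: one masked face and two unmasked faces.
example = [
    [0, 0.98, 40, 60, 180, 220],
    [1, 0.99, 300, 50, 420, 200],
    [1, 0.97, 520, 70, 640, 230],
]
print(summarize(example))  # {'Mask': 1, 'NoMask': 2}
```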

You can try it in the browser first:

https://aizoo.com/face-mask-detection.html


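Next, let's take a closer look at the project:

It supports five mainstream deep learning frameworks (PyTorch, TensorFlow, MXNet, Keras and Caffe), with the interfaces already written. Choose whichever fits your environment, for example TensorFlow. All models are in the models folder.
It releases nearly 8,000 annotated face-mask images together with the models. The data comes from the WIDER Face and MAFA datasets; the annotations were revised and verified (mainly because MAFA and WIDER Face define the face position differently) and then open-sourced.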


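Model structure

The project uses an SSD-style architecture. So that the model can run in real time in the browser and on edge devices, it is designed to be very small, with only 1.015 million parameters. The model structure is given in the appendix of this article.

The input size is 260x260. The backbone has only 8 convolutional layers; together with the localization and classification layers there are 24 layers in total (the channel count of each layer is mostly 32, 64 or 128), which is what keeps the model so small. It detects ordinary faces reliably, but on small faces it will not match larger models.

The web page uses the TensorFlow.js library, so the model runs entirely in the browser. How fast it runs depends on your computer's hardware.

Localization and classification layers are attached to five of the convolutional layers. Their sizes and anchor settings are listed in the table below.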

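Feature map size | Anchor sizes | Anchor aspect ratios
33x33 | 0.04, 0.056 | 1, 0.62, 0.42
17x17 | 0.08, 0.11 | 1, 0.62, 0.42
9x9 | 0.16, 0.22 | 1, 0.62, 0.42
5x5 | 0.32, 0.45 | 1, 0.62, 0.42
3x3 | 0.64, 0.72 | 1, 0.62, 0.42

(Values as configured in the inference script at the end of this article.)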
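The anchor settings can be made concrete with a short sketch. The three configuration lists are copied from the inference script later in this article; everything else is an assumption for illustration: that the first detection layer sits at stride 8 with each later layer halving the size (rounded up), and that each cell gets the first anchor size combined with every aspect ratio plus each remaining size at the first ratio, a common SSD convention. The project's own utils.anchor_generator may order or shape the anchors differently.

```python
import math

# Copied from the anchor configuration in the inference script.
feature_map_sizes = [[33, 33], [17, 17], [9, 9], [5, 5], [3, 3]]
anchor_sizes = [[0.04, 0.056], [0.08, 0.11], [0.16, 0.22], [0.32, 0.45], [0.64, 0.72]]
anchor_ratios = [[1, 0.62, 0.42]] * 5


def derived_feature_map_sizes(input_size=260, first_stride=8, levels=5):
    """Assumption: first head at stride 8, then repeated halving with ceil rounding."""
    size = math.ceil(input_size / first_stride)
    sizes = []
    for _ in range(levels):
        sizes.append(size)
        size = math.ceil(size / 2)
    return sizes


def make_anchors(fmap_sizes, sizes, ratios):
    """Return normalized (xmin, ymin, xmax, ymax) anchors for all feature maps."""
    anchors = []
    for (fh, fw), level_sizes, level_ratios in zip(fmap_sizes, sizes, ratios):
        # First size with every ratio, then the remaining sizes with the first ratio.
        shapes = [(level_sizes[0] * math.sqrt(r), level_sizes[0] / math.sqrt(r))
                  for r in level_ratios]
        shapes += [(s * math.sqrt(level_ratios[0]), s / math.sqrt(level_ratios[0]))
                   for s in level_sizes[1:]]
        for row in range(fh):
            for col in range(fw):
                # Anchor centers sit in the middle of each feature map cell.
                cx, cy = (col + 0.5) / fw, (row + 0.5) / fh
                for w, h in shapes:
                    anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = make_anchors(feature_map_sizes, anchor_sizes, anchor_ratios)
print(derived_feature_map_sizes())  # [33, 17, 9, 5, 3]
print(len(anchors))                 # 5972
```

Under these assumptions each cell carries 4 anchors, so the five feature maps contribute (1089 + 289 + 81 + 25 + 9) x 4 = 5972 candidate boxes per image.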

Project directory structure

Download the project from GitHub:

https://github.com/AIZOOTech/FaceMaskDetection

After downloading the FaceMaskDetection archive and extracting it, you get the following:

[Image: contents of the extracted FaceMaskDetection folder]

How to run the program

Taking the TensorFlow model as an example, the code expects TensorFlow 1.x.

If you are on TensorFlow 2.x, change the affected calls to tf.compat.v1.xxxx so that they behave as they do in 1.x.
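The renaming can be illustrated without TensorFlow installed. The sketch below is a hypothetical text-rewriting helper, not part of the project: it prefixes a small, non-exhaustive list of 1.x symbols with tf.compat.v1. The symbol list is an assumption about what a 1.x loading script typically calls, so check it against the code you actually run.

```python
import re

# Assumed, non-exhaustive list of TensorFlow 1.x names that live under tf.compat.v1 in 2.x.
V1_SYMBOLS = ("Session", "GraphDef", "placeholder", "gfile", "global_variables_initializer")

_PATTERN = re.compile(r"\btf\.(%s)\b" % "|".join(V1_SYMBOLS))


def to_compat_v1(source):
    """Rewrite tf.<name> to tf.compat.v1.<name> for the names listed above."""
    return _PATTERN.sub(r"tf.compat.v1.\1", source)


# Invented 1.x-style snippet, used only to demonstrate the rewrite.
snippet = (
    "with tf.gfile.GFile(path, 'rb') as f:\n"
    "    graph_def = tf.GraphDef()\n"
    "sess = tf.Session(graph=graph)"
)
print(to_compat_v1(snippet))
```

Names that are already prefixed are left alone, so the helper can be applied twice without harm. For larger code bases, TensorFlow ships a tf_upgrade_v2 script that performs this kind of migration more thoroughly.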

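To run on an image (in the script shown below, --img-mode defaults to 0, so pass --img-mode 1 explicitly):

python tensorflow_infer.py --img-mode 1 --img-path /path/to/your/img

For example, the author has placed some pictures in the img directory; to use demo2.jpg:

python tensorflow_infer.py --img-mode 1 --img-path img/demo2.jpg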

Result:

[Image: detection result for img/demo2.jpg]

To run on a video:

python tensorflow_infer.py --img-mode 0 --video-path /path/to/video

Here /path/to/video is the path to the video file, including the file name.

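To detect in real time from a camera:

python tensorflow_infer.py --img-mode 0 --video-path 0

The 0 here is the device index on your computer; 0 is normally the built-in camera.

To use an external camera (for example a USB camera), change it to 1.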
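A device index given on the command line arrives as a string, while cv2.VideoCapture treats an integer as a camera index and a string as a file name. The following is a minimal sketch of one way to convert the argument so that 0, 1 and other indices all work; the helper name video_source is hypothetical and not part of the project.

```python
import argparse


def video_source(value):
    """Turn '0', '1', ... into an integer camera index; leave anything else as a file path."""
    return int(value) if value.isdigit() else value


parser = argparse.ArgumentParser(description="Face Mask Detection (argument sketch)")
parser.add_argument("--video-path", type=video_source, default=0,
                    help="path to a video file, or a camera index such as 0 or 1")

# Parse an explicit list so the sketch does not depend on the real command line.
args = parser.parse_args(["--video-path", "1"])
print(repr(args.video_path))  # 1
```

The parsed value can then be passed directly to cv2.VideoCapture.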

Here is the code of tensorflow_infer.py:

# -*- coding:utf-8 -*-
import cv2
import time
import argparse

import numpy as np
from PIL import Image
from keras.models import model_from_json
from utils.anchor_generator import generate_anchors
from utils.anchor_decode import decode_bbox
from utils.nms import single_class_non_max_suppression
from load_model.tensorflow_loader import load_tf_model, tf_inference

# Forward slashes avoid the '\f' escape that a backslash path would contain.
# sess, graph = load_tf_model('FaceMaskDetection-master/models/face_mask_detection.pb')
sess, graph = load_tf_model('models/face_mask_detection.pb')

# anchor configuration
feature_map_sizes = [[33, 33], [17, 17], [9, 9], [5, 5], [3, 3]]
anchor_sizes = [[0.04, 0.056], [0.08, 0.11], [0.16, 0.22], [0.32, 0.45], [0.64, 0.72]]
anchor_ratios = [[1, 0.62, 0.42]] * 5

# generate anchors
anchors = generate_anchors(feature_map_sizes, anchor_sizes, anchor_ratios)

# For inference the batch size is 1 and the model output shape is [1, N, 4],
# so expand the anchors to [1, anchor_num, 4].
anchors_exp = np.expand_dims(anchors, axis=0)

id2class = {0: 'Mask', 1: 'NoMask'}


def inference(image, conf_thresh=0.5, iou_thresh=0.4, target_shape=(160, 160),
              draw_result=True, show_result=True):
    '''
    Main function for detection inference.
    :param image: 3D numpy array of the image (RGB).
    :param conf_thresh: minimum threshold of the classification probability.
    :param iou_thresh: IoU threshold used by NMS.
    :param target_shape: the model input size.
    :param draw_result: whether to draw the bounding boxes onto the image.
    :param show_result: whether to display the image.
    :return: list of [class_id, conf, xmin, ymin, xmax, ymax], one per face.
    '''
    # image = np.copy(image)
    output_info = []
    height, width, _ = image.shape
    image_resized = cv2.resize(image, target_shape)
    image_np = image_resized / 255.0  # normalize to 0~1
    image_exp = np.expand_dims(image_np, axis=0)
    y_bboxes_output, y_cls_output = tf_inference(sess, graph, image_exp)

    # remove the batch dimension, since the batch size is always 1 for inference
    y_bboxes = decode_bbox(anchors_exp, y_bboxes_output)[0]
    y_cls = y_cls_output[0]
    # To speed things up, run single-class NMS instead of multi-class NMS.
    bbox_max_scores = np.max(y_cls, axis=1)
    bbox_max_score_classes = np.argmax(y_cls, axis=1)

    # keep_idxs holds the indices of the boxes that survive NMS.
    keep_idxs = single_class_non_max_suppression(y_bboxes,
                                                 bbox_max_scores,
                                                 conf_thresh=conf_thresh,
                                                 iou_thresh=iou_thresh)

    for idx in keep_idxs:
        conf = float(bbox_max_scores[idx])
        class_id = bbox_max_score_classes[idx]
        bbox = y_bboxes[idx]
        # Clip the coordinates so they stay inside the image.
        xmin = max(0, int(bbox[0] * width))
        ymin = max(0, int(bbox[1] * height))
        xmax = min(int(bbox[2] * width), width)
        ymax = min(int(bbox[3] * height), height)

        if draw_result:
            # Green for Mask, red for NoMask (the image is in RGB order here).
            if class_id == 0:
                color = (0, 255, 0)
            else:
                color = (255, 0, 0)
            cv2.rectangle(image, (xmin, ymin), (xmax, ymax), color, 2)
            cv2.putText(image, "%s: %.2f" % (id2class[class_id], conf), (xmin + 2, ymin - 2),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, color)
        output_info.append([class_id, conf, xmin, ymin, xmax, ymax])

    if show_result:
        Image.fromarray(image).show()
    return output_info


def run_on_video(video_path, output_video_name, conf_thresh):
    cap = cv2.VideoCapture(video_path)
    height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
    width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
    fps = cap.get(cv2.CAP_PROP_FPS)
    fourcc = cv2.VideoWriter_fourcc(*'XVID')
    # writer = cv2.VideoWriter(output_video_name, fourcc, int(fps), (int(width), int(height)))
    total_frames = cap.get(cv2.CAP_PROP_FRAME_COUNT)
    if not cap.isOpened():
        raise ValueError("Video open failed.")
    status = True
    idx = 0
    while status:
        start_stamp = time.time()
        status, img_raw = cap.read()
        read_frame_stamp = time.time()
        if status:
            # Convert only after a successful read; img_raw is None at the end of the stream.
            img_raw = cv2.cvtColor(img_raw, cv2.COLOR_BGR2RGB)
            inference(img_raw,
                      conf_thresh,
                      iou_thresh=0.5,
                      target_shape=(260, 260),
                      draw_result=True,
                      show_result=False)
            # Flip RGB back to BGR for display with OpenCV.
            cv2.imshow('image', img_raw[:, :, ::-1])
            cv2.waitKey(1)
            inference_stamp = time.time()
            # writer.write(img_raw)
            write_frame_stamp = time.time()
            idx += 1
            print("%d of %d" % (idx, total_frames))
            print("read_frame:%f, infer time:%f, write time:%f" % (read_frame_stamp - start_stamp,
                                                                   inference_stamp - read_frame_stamp,
                                                                   write_frame_stamp - inference_stamp))
    # writer.release()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Face Mask Detection")
    # 1: run on an image; 0: run on a video file or a live camera stream
    parser.add_argument('--img-mode', type=int, default=0, help='set 1 to run on image, 0 to run on video.')
    parser.add_argument('--img-path', type=str, help='path to your image.')
    parser.add_argument('--video-path', type=str, default='0', help='path to your video, `0` means to use camera.')
    # parser.add_argument('--hdf5', type=str, help='keras hdf5 file')
    args = parser.parse_args()
    if args.img_mode:
        imgPath = args.img_path
        img = cv2.imread(imgPath)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        inference(img, show_result=True, target_shape=(260, 260))
    else:
        video_path = args.video_path
        # A purely numeric value is a camera device index (0, 1, ...), not a file name.
        if args.video_path.isdigit():
            video_path = int(args.video_path)
        run_on_video(video_path, '', conf_thresh=0.5)
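The script relies on single_class_non_max_suppression from the project's utils.nms module, whose source is not shown here. As a rough illustration of what single-class NMS does, the following is a minimal pure-Python sketch of the standard greedy algorithm; the function names, the threshold comparisons and the example boxes are assumptions, and the project's own implementation may differ in detail.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def single_class_nms_sketch(boxes, scores, conf_thresh=0.5, iou_thresh=0.4):
    """Greedy NMS: drop low scores, then keep a box only if it does not overlap a kept one."""
    candidates = [i for i, s in enumerate(scores) if s >= conf_thresh]
    # Visit boxes from the highest score to the lowest.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    for i in candidates:
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


# Invented boxes: the second overlaps the first heavily, the fourth scores too low.
boxes = [
    (0.10, 0.10, 0.50, 0.50),
    (0.12, 0.12, 0.52, 0.52),
    (0.60, 0.60, 0.90, 0.90),
    (0.00, 0.00, 0.20, 0.20),
]
scores = [0.9, 0.8, 0.7, 0.3]
print(single_class_nms_sketch(boxes, scores))  # [0, 2]
```

Running NMS once over the best score of each box, regardless of class, is what the script's comment means by single-class NMS; the class label is read from the argmax afterwards.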