Accurate Bounding Box Prediction for Single-Shot Object Detection

Abstract
Accurate single-shot object detection is extremely challenging in real environments because of complex scenes, occlusion, ambiguity, blur, and shadow; together, these factors are referred to as the uncertainty problem. This uncertainty leads to unreliable bounding box annotations and makes it hard for detectors to learn bounding box localization. Previous methods treated the ground truth box coordinates as a rigid distribution, ignoring the localization uncertainty present in real datasets. This article proposes a novel bounding box encoding algorithm integrated into a single-shot detector (BBENet) to model a flexible distribution of bounding box localization. First, discretized ground truth labels are generated by decomposing each object's boundary into multiple boundaries. The new representation of ground truth boxes is more flexible and can cover arbitrary cases in complex scenes. During training, the detector directly learns discretized box locations instead of locations in a continuous domain. Second, the bounding box encoding algorithm reorganizes bounding box predictions to make them more accurate. Another problem in existing methods is inconsistency in estimating detection quality. Single-shot detection consists of classification and localization tasks, but popular detectors use the classification score alone as the final detection quality. This score therefore ignores localization quality, which hinders overall performance because the two tasks are positively correlated. To overcome this problem, BBENet introduces a detection quality that combines localization and classification quality to rank detections during non-maximum suppression. The localization quality is computed from how uncertain the predicted boxes are, which is a new perspective in the detection literature. The proposed BBENet is evaluated on three benchmark datasets: MS-COCO, Pascal VOC, and CrowdHuman. Without bells and whistles, BBENet outperforms existing methods by a large margin at comparable speed, achieving state-of-the-art performance among single-shot detectors.
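The abstract does not give formulas, so the following is only an illustrative sketch of the two ideas it names: predicting each box side as a distribution over discretized locations, and ranking detections by a score that combines classification and localization quality. Every function name is hypothetical, and the specific choices (softmax over bins, expected value as the decoded offset, one minus normalized entropy as the certainty measure, a product as the combination) are assumptions made for illustration, not the published BBENet formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of bin logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_side(logits):
    """Assumed decoding: expected bin index of one box side."""
    probs = softmax(logits)
    return sum(i * p for i, p in enumerate(probs))


def side_certainty(logits):
    """Assumed certainty: 1 - normalized entropy (needs at least 2 bins).

    Close to 1.0 for a sharply peaked distribution, 0.0 for a uniform one.
    """
    probs = softmax(logits)
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))


def detection_quality(cls_score, side_logits):
    """Assumed combination: class score times mean certainty of the sides."""
    loc_quality = sum(side_certainty(s) for s in side_logits) / len(side_logits)
    return cls_score * loc_quality


# Two hypothetical detections, each with four sides (left, top, right, bottom).
flat = [0.0, 0.0, 0.0, 0.0]      # very uncertain side location
peaked = [0.0, 20.0, 0.0, 0.0]   # confident side location

confident_class_vague_box = detection_quality(0.9, [flat] * 4)
modest_class_sharp_box = detection_quality(0.7, [peaked] * 4)

# Sort by the combined score, as would be done before non-maximum suppression.
ranked = sorted(
    [("vague_box", confident_class_vague_box), ("sharp_box", modest_class_sharp_box)],
    key=lambda item: item[1],
    reverse=True,
)
```

Under these assumptions, the detection with the lower classification score but sharply localized sides ranks first, which is the behavior the abstract motivates: a high class score alone does not guarantee a well-localized box.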
Author(s)
Xuan-Thuy Vo; Kang-Hyun Jo
Issued Date
2022
Type
Article
Keyword
Convolutional neural networks (CNNs); detection quality; localization quality; localization uncertainty; object detection
DOI
10.1109/TII.2021.3138336
URI
https://oak.ulsan.ac.kr/handle/2021.oak/14661
Publisher
IEEE Transactions on Industrial Informatics
Language
English
ISSN
1551-3203
Citation Volume
18
Citation Number
9
Citation Start Page
5961
Citation End Page
5971
Appears in Collections:
Medicine > Nursing
Access and License
  • Access type: Open
File List
  • No related files exist.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.