
Pedestrian Head Detection and Tracking via Global Vision Transformer

Abstract
In recent years, pedestrian detection and tracking have made significant progress in both performance and latency. However, detecting and tracking full pedestrian bodies in highly crowded environments remains a difficult computer vision task because pedestrians are partly or fully occluded by one another. This demands substantial annotation effort and complex trackers to identify invisible pedestrians in the spatial and temporal domains. To alleviate these problems, previous methods detect and track the visible parts of pedestrians (e.g., heads or visible regions), which achieves remarkable performance and improves the scalability of tracking models and dataset sizes. Following this idea, this paper proposes a simple but effective method, called PHDTT (Pedestrian Head Detection and Tracking with Transformer), to detect and track pedestrian heads in crowded scenes. First, a powerful encoder-decoder Transformer network is integrated into the tracker. It learns relations between object queries and global image features to reason about detection results in each frame, and it matches object queries and track objects between adjacent frames to perform data association, replacing additional motion prediction, IoU-based methods, and Re-ID-based methods. Both components are combined into a single end-to-end network, which makes the tracker simpler, more efficient, and more effective. Second, the proposed Transformer-based tracker is evaluated on the challenging benchmark dataset CroHD. Without bells and whistles, PHDTT achieves 60.6 MOTA, outperforming recent methods by a large margin. Test videos are available at https://bit.ly/3eOPQ2d.
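The abstract reports its result as MOTA (Multiple Object Tracking Accuracy). As a reading aid, the sketch below shows the standard definition of that metric, MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over every frame. The function name `mota` and the counts in the usage line are hypothetical and chosen only for illustration; they are not taken from the paper or from CroHD.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt: int) -> float:
    """Standard MOTA: 1 - (FN + FP + IDSW) / GT, counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts for illustration only (not results from the paper).
score = mota(false_negatives=200, false_positives=40, id_switches=10, num_gt=1000)
print(f"MOTA = {100 * score:.1f}")
```

Because the error count can exceed the number of ground-truth objects, MOTA can be negative; it is conventionally reported as a percentage, as in the abstract.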
Author(s)
Xuan-Thuy Vo; Van-Dung Hoang; Duy-Linh Nguyen; Kang-Hyun Jo
Issued Date
2022
Type
Article
Keyword
Pedestrian Head Detection; Pedestrian Head Tracking; Vision Transformer; Crowded Scenes; Surveillance Systems
DOI
10.1007/978-3-031-06381-7_11
URI
https://oak.ulsan.ac.kr/handle/2021.oak/14913
Publisher
Communications in Computer and Information Science
Language
English
ISSN
1865-0929
Citation Volume
1578
Citation Number
1
Citation Start Page
155
Citation End Page
167
Appears in Collections:
Medicine > Nursing
Access and License
  • Access type: Open
File List
  • No related files exist.
