KLI

Design and Implementation of an Automatic Person Indexing and Retrieval System Based on Overlay Text in News Interview Video Sequences

Abstract
With the advent of the digital age, consumers and professionals have created a vast amount of video data over the last few decades, and advances in data capture, storage, and communication technologies have made this data available to consumer and professional applications. This tremendous increase in the use of video data creates a need for effective methods to manage these multimedia resources by their content. In response, many researchers have developed indexing systems that ensure easy access to relevant information, as well as navigation and organization, in vast repositories of video data.
Recognizing the overlay text embedded in images and videos provides high-level semantic clues that greatly enhance automatic image and video indexing. Such text gives a concise and direct description of the video content. Overlay text therefore plays an important role in automated content analysis tasks such as scene understanding, indexing, browsing, and retrieval.
In particular, the overlay text in broadcast news video sequences conveys more meaningful information about the content than the overlay text in any other type of video. Its detection and recognition have become a hot topic in news video analysis, where overlay text identifies a person or place, the name and date of a newsworthy event, stock market figures, other news statistics, and news summaries.
This dissertation proposes a novel approach to extracting meaningful content information from broadcast news video sequences through the collaborative integration of image understanding and natural language processing. As a concrete example, we developed a person browser system that associates faces with overlaid name text in videos. Given news videos as a knowledge source, the system automatically extracts face and name text associations as content information. The proposed framework consists of a text detection module, a face detection module, and a person indexing module.
As a preprocessing step, the proposed system creates a sub-clip based on the beginning frame so that it focuses only on the frames with overlay text. In the text detection module, the system detects and extracts the overlay text, separates the name text line, and recognizes the text by optical character recognition (OCR). The face detection module extracts a face thumbnail that serves as the representative thumbnail of the interviewee. The person indexing module automatically generates the index metadata by named entity recognition (NER). Finally, the person indexing database is built automatically by combining the recognized text with the face thumbnail.
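The flow described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the `PersonRecord` type, the sample captions, and the comma-based name heuristic are all hypothetical, and the OCR output, face thumbnails, and per-frame overlay flags are assumed to come from detectors that are not shown. The sketch only shows how frames with overlay text might be grouped into sub-clips and how recognized name text might be combined with a face thumbnail into an index.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


def overlay_subclips(has_overlay: List[bool]) -> List[Tuple[int, int]]:
    """Group consecutive frames flagged as containing overlay text.

    Returns (start, end) frame ranges with `end` exclusive.
    The per-frame flags are assumed to come from a text detector.
    """
    clips: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, flag in enumerate(has_overlay):
        if flag and start is None:
            start = i                      # a run of overlay frames begins
        elif not flag and start is not None:
            clips.append((start, i))       # the run ended at the previous frame
            start = None
    if start is not None:                  # run extends to the last frame
        clips.append((start, len(has_overlay)))
    return clips


@dataclass
class PersonRecord:
    """One index entry: a name linked to a caption, thumbnail, and sub-clip."""
    name: str
    caption: str
    thumbnail: str
    frame_range: Tuple[int, int]


def toy_extract_name(caption: str) -> Optional[str]:
    """Hypothetical stand-in for NER: take the text before the first comma."""
    head = caption.split(",", 1)[0].strip()
    return head or None


def build_index(
    subclips: List[Tuple[int, int]],
    captions: List[str],
    thumbnails: List[str],
    extract_name: Callable[[str], Optional[str]] = toy_extract_name,
) -> Dict[str, List[PersonRecord]]:
    """Combine recognized caption text with face thumbnails, keyed by name."""
    index: Dict[str, List[PersonRecord]] = {}
    for clip, caption, thumb in zip(subclips, captions, thumbnails):
        name = extract_name(caption)
        if name is None:
            continue                       # no usable name in this caption
        index.setdefault(name, []).append(
            PersonRecord(name, caption, thumb, clip)
        )
    return index


# Made-up example data: overlay flags for 8 frames and two OCR'd captions.
flags = [False, True, True, False, False, True, True, True]
clips = overlay_subclips(flags)
person_index = build_index(
    clips,
    ["Jane Doe, Economist", "John Roe, Reporter"],
    ["face_0001.jpg", "face_0005.jpg"],
)
```

In a real system the name extractor would be replaced by an actual NER component, and a lookup such as `person_index["Jane Doe"]` would return the sub-clips and thumbnails associated with that person.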
The successful results of person information extraction indicate that the proposed integration of image understanding and natural language processing techniques is headed in the right direction toward our goal of accessing the real content of multimedia information.
Author(s)
이상희
Issued Date
2019
Awarded Date
2019-08
Type
Dissertation
Keyword
Overlay text; News; Indexing; Retrieve; Named entity
URI
https://oak.ulsan.ac.kr/handle/2021.oak/6585
http://ulsan.dcollection.net/common/orgView/200000224447
Alternative Author(s)
Sanghee Lee
Affiliation
University of Ulsan
Department
Graduate School, Department of Electrical, Electronic and Information Systems Engineering
Advisor
조강현
Degree
Doctor
Publisher
University of Ulsan, Graduate School, Department of Electrical, Electronic and Information Systems Engineering
Language
eng
Rights
Theses of the University of Ulsan are protected by copyright.
Appears in Collections:
Electricity Electronics & Computer Engineering > 2. Theses (Ph.D)
Authorize & License
  • Authorize: Public
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.