Key Frame Extraction for Text Based Video Retrieval Using Maximally Stable Extremal Regions

This paper presents a new approach to text-based video content retrieval. The proposed scheme consists of three main processes: key frame extraction, text localization, and keyword matching. For key frame extraction, we propose a Maximally Stable Extremal Region (MSER) based feat...

Bibliographic Details
Main Authors: Werachard Wattanarachothai (Author), Karn Patanukhom (Author)
Format: Book
Published: European Alliance for Innovation (EAI), 2015-04-01T00:00:00Z.
Subjects: cbvr, text-based video retrieval, key frame extraction, shot boundary, mser, Education, Technology
Online Access: Connect to this object online.

MARC

LEADER 00000 am a22000003u 4500
001 doaj_b83a3714e37245a7ac4f9996164adfff
042 |a dc 
100 1 0 |a Werachard Wattanarachothai  |e author 
700 1 0 |a Karn Patanukhom  |e author 
245 0 0 |a Key Frame Extraction for Text Based Video Retrieval Using Maximally Stable Extremal Regions 
260 |b European Alliance for Innovation (EAI),   |c 2015-04-01T00:00:00Z. 
500 |a 10.4108/icst.iniscom.2015.258410 
500 |a 2032-9253 
520 |a This paper presents a new approach to text-based video content retrieval. The proposed scheme consists of three main processes: key frame extraction, text localization, and keyword matching. For key frame extraction, we propose a Maximally Stable Extremal Region (MSER) based feature designed to segment the video into shots with different text contents. In the text localization process, the MSERs in each key frame are clustered into text lines based on their similarity in position, size, color, and stroke width. The Tesseract OCR engine is then used to recognize the text regions. To improve the recognition results, we input four images obtained from different pre-processing methods to the Tesseract engine. Finally, the target query keyword is matched against the OCR results using an approximate string search scheme. Experiments show that, with the MSER feature, videos can be segmented into an efficient number of shots, with better precision and recall than sum-of-absolute-differences and edge-based methods. 
546 |a EN 
690 |a cbvr 
690 |a text-based video retrieval 
690 |a key frame extraction 
690 |a shot boundary 
690 |a mser 
690 |a Education 
690 |a L 
690 |a Technology 
690 |a T 
655 7 |a article  |2 local 
786 0 |n EAI Endorsed Transactions on e-Learning, Vol 2, Iss 7, Pp 1-9 (2015) 
787 0 |n http://eudl.eu/doi/10.4108/icst.iniscom.2015.258410 
787 0 |n https://doaj.org/toc/2032-9253 
856 4 1 |u https://doaj.org/article/b83a3714e37245a7ac4f9996164adfff  |z Connect to this object online.
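
Illustrative note: the abstract's final step matches a query keyword against OCR output using an approximate string search scheme. The record does not say which scheme the authors used. The sketch below assumes a plain Levenshtein-based substring search (Sellers' variant), and the function names and the `max_errors` threshold are hypothetical, chosen only for illustration.

```python
def min_edit_distance_in_text(keyword: str, text: str) -> int:
    """Smallest Levenshtein distance between keyword and any substring of text.

    Assumed technique (not confirmed by the record): Sellers' dynamic program,
    where a match may start at any position in the text at zero cost.
    """
    # prev[j]: best distance between the keyword prefix processed so far
    # and some substring of text ending at position j.
    prev = [0] * (len(text) + 1)  # an empty keyword matches anywhere for free
    for i, kc in enumerate(keyword, start=1):
        curr = [i] + [0] * len(text)  # i keyword chars against empty text
        for j, tc in enumerate(text, start=1):
            cost = 0 if kc == tc else 1
            curr[j] = min(
                prev[j] + 1,         # keyword character missing from the text
                curr[j - 1] + 1,     # extra character in the text
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return min(prev)  # the match may end anywhere in the text


def keyword_matches(keyword: str, ocr_text: str, max_errors: int = 1) -> bool:
    """True if keyword occurs in ocr_text with at most max_errors edits."""
    distance = min_edit_distance_in_text(keyword.lower(), ocr_text.lower())
    return distance <= max_errors


if __name__ == "__main__":
    # OCR commonly confuses "l" with "1"; one substitution is tolerated here.
    print(keyword_matches("retrieval", "video retrieva1 system"))
    print(keyword_matches("retrieval", "key frame extraction"))
```

Tolerating a small number of edits lets a query survive typical OCR character confusions, at the cost of more false matches for short keywords, so in practice the threshold would likely scale with keyword length.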