Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video

Zayene, Oussama; Touj, Sameh Masmoudi; Hennebert, Jean; Ingold, Rolf; Essoukri Ben Amara, Najoua

doi:10.1049/iet-cvi.2017.0468

2018

Abstract

This study presents a novel approach to Arabic video text recognition based on recurrent neural networks. Text embedded in videos is a rich source of information for indexing and automatically annotating multimedia documents. However, video text recognition is a non-trivial task owing to challenges such as the variability of text patterns and the complexity of backgrounds. For Arabic, the presence of diacritic marks, the cursive nature of the script and the non-uniform intra- and inter-word distances introduce additional challenges. The proposed system is a segmentation-free method that relies on a multi-dimensional long short-term memory (MDLSTM) network coupled with a connectionist temporal classification (CTC) layer. It is shown that an efficient pre-processing step and a compact representation of Arabic character models bring robust performance and yield a lower error rate than other recently published methods. The authors’ system is trained and evaluated on the public AcTiV-R dataset under different evaluation protocols. It also outperforms current state-of-the-art approaches on the public ALIF dataset in terms of recognition rates at both character and line levels.

Details

Title

Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video

Author(s)

Zayene, Oussama (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia ; DIVA Group, University of Fribourg, Switzerland)
Touj, Sameh Masmoudi (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia)
Hennebert, Jean (School of Engineering and Architecture (HEIA-FR), HES-SO University of Applied Sciences Western Switzerland)
Ingold, Rolf (DIVA Group, University of Fribourg, Switzerland)
Essoukri Ben Amara, Najoua (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia)

Date

2018-08

Published in

IET Computer Vision

Volume

2018, vol. 12, no. 5, pp. 710-719

Pagination & equivalents

10 p.

DOI

https://doi.org/10.1049/iet-cvi.2017.0468

ISSN

1751-9632

Article Type

Scientific

Faculty

Engineering and Architecture

School

HEIA-FR

Institute

iCoSys - Institute of Artificial Intelligence and Complex Systems

Record Appears in

Scientific Articles
Global
