Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video

Zayene, Oussama; Touj, Sameh Masmoudi; Hennebert, Jean; Ingold, Rolf; Essoukri Ben Amara, Najoua

doi:10.1049/iet-cvi.2017.0468

2018

Abstract

This study presents a novel approach to Arabic video text recognition based on recurrent neural networks. Text embedded in videos is a rich source of information for indexing and automatically annotating multimedia documents. However, video text recognition is a non-trivial task because of challenges such as the variability of text patterns and the complexity of backgrounds. For Arabic, the presence of diacritic marks, the cursive nature of the script and the non-uniform intra- and inter-word distances introduce additional challenges. The proposed system is a segmentation-free method that relies on a multi-dimensional long short-term memory network coupled with a connectionist temporal classification layer. It is shown that an efficient pre-processing step and a compact representation of Arabic character models bring robust performance and yield a lower error rate than other recently published methods. The authors’ system is trained and evaluated on the public AcTiV-R dataset under different evaluation protocols. The results are promising, and the system also outperforms current state-of-the-art approaches on the public ALIF dataset in terms of recognition rates at both the character and line levels.
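
As a rough illustration of two ideas named in the abstract, the sketch below shows how a connectionist temporal classification (CTC) output can be turned into a character sequence without segmenting the text line, and how character-level and line-level recognition rates might be computed. It is a minimal sketch and not the authors' implementation: the blank index `BLANK = 0`, the greedy (best-path) decoding rule, the function names, and the edit-distance-based definition of the character recognition rate are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

BLANK = 0  # assumed index of the CTC blank label (hypothetical choice)


def ctc_greedy_decode(frame_labels: Sequence[int], blank: int = BLANK) -> List[int]:
    """Collapse a per-frame best-label path: merge repeats, then drop blanks."""
    decoded: List[int] = []
    previous = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def recognition_rates(predictions: Sequence[Sequence],
                      references: Sequence[Sequence]) -> Tuple[float, float]:
    """Return (character rate, line rate) under the assumed definitions.

    Character rate: (total reference characters - total edit distance)
                    / total reference characters.
    Line rate:      fraction of lines recognised without any error.
    """
    total_chars = sum(len(r) for r in references)
    total_errors = sum(edit_distance(p, r) for p, r in zip(predictions, references))
    exact_lines = sum(1 for p, r in zip(predictions, references) if list(p) == list(r))
    return (total_chars - total_errors) / total_chars, exact_lines / len(references)
```

In this sketch, the per-frame labels would come from taking the most probable class at each time step of the network output. The blank symbol is what allows repeated characters to survive the collapse step, so no explicit character segmentation is needed.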

Details

Title Multi-dimensional long short-term memory networks for artificial Arabic text recognition in news video

Author(s) Zayene, Oussama (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia ; DIVA Group, University of Fribourg, Switzerland)
Touj, Sameh Masmoudi (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia)
Hennebert, Jean (School of Engineering and Architecture (HEIA-FR), HES-SO University of Applied Sciences Western Switzerland)
Ingold, Rolf (DIVA Group, University of Fribourg, Switzerland)
Essoukri Ben Amara, Najoua (LATIS Laboratory, National Engineering School of Sousse, University of Sousse, Tunisia)

Date 2018-08

Published in IET Computer Vision

Volume 2018, vol. 12, no. 5, pp. 710-719

Pagination 10 p.

DOI https://doi.org/10.1049/iet-cvi.2017.0468

ISSN 1751-9632

Article type scientific

Domain Engineering and Architecture

School HEIA-FR

Institute iCoSys - Institut des systèmes complexes

This document appears in Scientific articles
Global
