Frinken, Volkmar; Bunke, Horst (2009). Evaluating retraining rules for semi-supervised learning in neural network based cursive word recognition. In: 10th International Conference on Document Analysis and Recognition ICDAR 2009 (pp. 31-35). Washington, DC: IEEE Computer Society. doi:10.1109/ICDAR.2009.18
© 2009 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Training a system to recognize handwritten words is a task that requires a large amount of data with their correct transcription. However, the creation of such a training set, including the generation of the ground truth, is tedious and costly. One way of reducing the high cost of labeled training data acquisition is to exploit unlabeled data, which can be gathered easily. Making use of both labeled and unlabeled data is known as semi-supervised learning. One of the most general versions of semi-supervised learning is self-training, where a recognizer iteratively retrains itself on its own output on new, unlabeled data. In this paper we propose to apply semi-supervised learning, and in particular self-training, to the problem of cursive, handwritten word recognition. The special focus of the paper is on retraining rules that define what data are actually being used in the retraining phase. In a series of experiments it is shown that the performance of a neural network based recognizer can be significantly improved through the use of unlabeled data and self-training if appropriate retraining rules are applied.
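The self-training loop described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: a toy nearest-centroid classifier stands in for the neural network recognizer, the confidence score is a softmax over negative distances, and the two retraining rules (`accept_all`, `accept_confident`) are hypothetical examples rather than the rules evaluated in the paper.

```python
import math


def train_centroids(samples):
    """Fit a toy nearest-centroid classifier: one mean vector per label."""
    sums, counts = {}, {}
    for x, label in samples:
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, x):
    """Return (label, confidence) for x.

    Confidence is a softmax over negative Euclidean distances
    (an illustrative choice, not the paper's confidence measure).
    """
    weights = {lab: math.exp(-math.dist(x, c)) for lab, c in centroids.items()}
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


def accept_all(label, confidence):
    """Hypothetical retraining rule: use every self-labeled sample."""
    return True


def accept_confident(threshold):
    """Hypothetical retraining rule: keep samples at or above a confidence threshold."""
    def rule(label, confidence):
        return confidence >= threshold
    return rule


def self_train(labeled, unlabeled, rule, iterations=3):
    """Retrain repeatedly on labeled data plus rule-selected self-labeled data."""
    model = train_centroids(labeled)
    for _ in range(iterations):
        pseudo = []
        for x in unlabeled:
            label, confidence = predict(model, x)
            if rule(label, confidence):
                pseudo.append((x, label))
        # The labeled set is always kept; only the pseudo-labeled part changes.
        model = train_centroids(labeled + pseudo)
    return model


labeled = [((0.0, 0.0), "a"), ((4.0, 4.0), "b")]
unlabeled = [(0.5, 0.5), (3.5, 3.5), (2.1, 2.0)]
model = self_train(labeled, unlabeled, accept_confident(0.9))
print(model)
```

In this toy setup the ambiguous point `(2.1, 2.0)` is excluded by the threshold rule but would be absorbed into class `"b"` under `accept_all`, which shows how the choice of retraining rule changes the retrained model.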
|Item Type:||Conference or Workshop Item (Paper)|
|Division/Institute:||08 Faculty of Science > Institute of Computer Science (INF)|
|UniBE Contributor:||Frinken, Volkmar and Bunke, Horst|
|Subjects:||000 Computer science, knowledge & systems; 500 Science > 510 Mathematics|
|Publisher:||IEEE Computer Society|
|Date Deposited:||04 Oct 2013 15:22|
|Last Modified:||07 Dec 2014 01:54|
|URI:||http://boris.unibe.ch/id/eprint/37085 (FactScience: 206719)|