Recognition of Handwritten Source Code Characters With Deep Neural Networks

Barış Kılıçlar; Metehan Makinacı

Barış Kılıçlar¹, Metehan Makinacı²*
¹Dokuz Eylul University, İzmir, Turkey
²Dokuz Eylul University, İzmir, Turkey
* Corresponding author: metehan.makinaci@deu.edu.tr

Presented at the 4th International Symposium on Innovative Approaches in Engineering and Natural Sciences (ISAS WINTER-2019 (ENS)), Samsun, Turkey, Nov 22, 2019

SETSCI Conference Proceedings, 2019, 9, Page(s): 431-436, https://doi.org/10.36287/setsci.4.6.111

Published Date: 22 December 2019

Abstract

In this paper we present an application of deep learning techniques to the recognition of handwritten source code characters. Although there are many works on the handwritten character recognition (HCR) problem, very few address offline handwritten source code character recognition, which additionally requires recognizing characters specific to source code. We designed and implemented an application that performs preprocessing, histogram-based segmentation, and normalization on scanned exam papers containing code written in the C programming language. The constructed dataset includes 7093 source code character samples. We enriched this dataset with character samples from the CROHME database by transforming them into offline samples. With the resulting 16275 samples in 95 classes, we trained and tested several convolutional neural network (CNN) models. CNN is a deep learning architecture that has been shown to produce state-of-the-art performance for handwritten character recognition tasks as well as for various other computer vision applications. Experimental evaluations gave performance rates between 90.53% and 94.09%. We conclude that CNN-based classifiers are powerful tools for the handwritten source code character recognition task.
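
The abstract does not say how the histogram-based segmentation is carried out. A common reading is projection-profile segmentation: count ink pixels per row to find text lines, then count ink pixels per column within each line to find characters. The following is a minimal sketch under that assumption, not the authors' implementation. The function names, the binary list-of-lists image layout (1 = ink), and the `min_ink` threshold are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # assumed binary image: 1 = ink, 0 = background


def row_profile(img: Image) -> List[int]:
    """Ink-pixel count per row (horizontal projection histogram)."""
    return [sum(row) for row in img]


def col_profile(img: Image) -> List[int]:
    """Ink-pixel count per column (vertical projection histogram)."""
    if not img:
        return []
    return [sum(col) for col in zip(*img)]


def find_runs(profile: List[int], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return half-open [start, end) ranges where profile >= min_ink."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value >= min_ink and start is None:
            start = i                      # a run of ink begins
        elif value < min_ink and start is not None:
            runs.append((start, i))        # a gap ends the current run
            start = None
    if start is not None:
        runs.append((start, len(profile)))  # run touches the border
    return runs


def segment_characters(img: Image) -> List[Tuple[int, int, int, int]]:
    """Split into lines by rows, then into characters by columns.

    Returns half-open (top, bottom, left, right) bounding boxes.
    """
    boxes = []
    for top, bottom in find_runs(row_profile(img)):
        line = img[top:bottom]
        for left, right in find_runs(col_profile(line)):
            boxes.append((top, bottom, left, right))
    return boxes


# Toy "page": two blobs on the first text line, one on the second.
page = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
boxes = segment_characters(page)
print(boxes)
```

A sketch like this only separates characters that have a blank column between them. Touching or overlapping handwritten characters would need extra handling, which the abstract does not describe. Each resulting box would then be cropped and resized to a fixed size (the normalization step) before being passed to a classifier.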

Keywords - Handwritten Character Recognition, Deep Neural Networks, Convolutional Neural Networks, Deep Learning, Handwritten Source Code Classification
