Development of images to braille conversion software for alphabets using template matching and rule based

(1)

DEVELOPMENT OF IMAGES TO BRAILLE CONVERSION SOFTWARE FOR ALPHABETS USING TEMPLATE MATCHING AND RULE BASED (TEXT TO 8 DOT BRAILLE WITH IMPLEMENTATION FOR DeBrailiso)

IZZATI FARRAHWAH BINTI MAZLAN

A report submitted in partial fulfilment of the requirements for the award of the degree of Bachelor of Computer Science

(Computer Systems &Networking)

Faculty Of Computer Systems & Software Engineering UNIVERSITI MALAYSIA PAHANG

MAY2O11 PERPUSTAKAAN

UNIVERSITI MALAYSIA PAHANG No. Perolehan No. Panggflan

0691431

1K

Tarikh

'p3 L3 0 NOV 2,012

(2)

ABSTRACT

An images to Braille Conversion Software/Hardware (Alphabets) is a system that helps Blind people to read available reading materials. The current system only can support uppercase English alphabets from A to Z using six dots. Hence, to complete the current system, we have developed a complete images to Braille conversion system to process and convert the images of normal texts into Braille dots (8 dots) taking into account (i) lower/upper case English alphabets, (ii) numbers and (iii) special characters/symbols all together. Particularly, this research focuses on the part one, which is the processing and conversion of lower/upper case alphabets. This system receives the input images, process and compares it with characters templates that have been stored in the database. The processing of images is done into five steps systematically as (i) binaiization and pixel inversion, (ii) noise removal. (iii) segmentation and clustering, 'iv) line identification and finally (v) character extraction. A Gill is develop to observe the processes and displayed Braille dots to ensure the accuracy. Followed by a hardware is develop to demonstrate the working ability to display the Braille dots in terms LED. This system is targeted to integrate with Numbers and Special Characters and then use with hand phone or camera as a handy plug-in/plug-out device. So that the Blind people can carry it anywhere anytime with them to capture picture and make it readable for them. They also will be able to read the received SMS in their hand phone. The system also will read the material aloud for additional convenience for this group of people so that they can read and listen as per need. The developed integrated system may offer a easy life for the Blind community by widening the reading materials in a affordable and convenient way.

(3)

1

Penukaran gambar ke Braille Software / Hardware (Alphabets) adalah sistem yang membantu orang buta untuk membaca bahan bacaan yang sedia. Sistem saat mi hanya boieh menyokong Bahasa Inggeris hurufabjad dan As ampai Z menggunakan enamtitik. iOleh kerana itu, untuk rneiengkapkan system saat mi, kami telah mengembangkan suatu gambar yang lengkap untuk system penukaran Braille untuk memproses dan menukar imej teks normal menjadi titik-titik Braille (8 titik) dengan rnempertimbangkan (i) iebih rendah / huruf huruf bahasa inggeris, (ii) nombor dan (iii) aksara khas / symbol sernua bersarna-sama. Khususnya, kajian ml menumpukan pada satu bahagian, yang merupakan pemprosesan dan penukaran yang lebih rendah I abjad huruf besar. Sistem iiii menerima input gambar, proses dan membandingkannya dengan watak template yang telah disimpan di dalam database .Pengoiahan citra dilakukan kedalam lima langkah sistematik sebagai binarization (i) dan inversi pixel, (ii) penghapusan hingar, (iii) segmentasi dan clustering, (iv) pengenalan ganis dan akhirnya (v) ekstraksi aksara. Sebuah GUI mengembangkan untuk mengamati proses dan dipaparkan Braille titik untuk mernastikan ketepatannya. Dilanjutkan dengan peranti keras adaiah mengembangkan untuk menunjukkan kemampuan bekerja untuk memaparkan titik-titik Braille dalam hal LED. Sistem mi bertujuan untuk mengintegrasikan dengan Angka dan Watak Khusus dan kemudian digunakan dengan hand phone atau kamera sebagai peranti plug-in/plug-out kepada. Sehingga orang yang buta dapa tnembawa di mana sahaja pada bila-bila saja dengan mereka untuk menangkap gambar dan membuainya dibaca bagi mereka. Mereka juga akan mampu membaca SMS yang diterima di telefon tangan mereka. Sistem ml juga - akan membaca bahan-bahan yang keras untuk keselesaan tam, bahan untuk sekumpulan orang sehingga mereka ioieh membaca dan mendengar sesuai keperluan. Sistem yang terintegrasi yang dibangunkan mungkin menawarkan kehidupan yang mudah bagi masyarakat buta dengan memperluaskan bahan bacaan dengan cara yang berpatutan dan selesa.

(4)

TABLE OF CONTENTS

CHAPTER TITLE PAGE

SUPERVISOR'S DECLARATION i STUDENT'SDECLARATION DEDICATION ill ACKNOWLEDGEMENT iv ABSTRACT v ABSTRAK vi

TABLE OF CONTENTS vii

LIST OF TABLES xi

LIST OF FIGURES xii

1 INTRODUCTION

1.1 introduction I

1.2 Problem Statement 2

1.3 Aim and Objectives 3

1.4 Scope 4

1.5 Study Module 4

1.6 Thesis organization

(5)

vi"

2 LITERATURE REVIEW

2.1 Introduction ₇

2.2 Optical Character Recognition 9

2.3 Pattern Recognition 11 2.4 Template matching 14 2.5 Text detection ₁₅ 2.6 Neural Networks ₁₇ 2.7 Character Recognition 18 2.8 Summary ₂₁ 3 METHODOLOGY 3.1 Introduction ₂₂ 3.2 System Planning ₂₄ 3.3 System design ₂₅

3.3.1 User Interface Description ₂₆

3.4 Project Implementation ₂₇ 3.4.1 Design Assumptions ₂₉ 3.4.2 Template Preparation 29 3.4.3 Work Flow ₃₀ 3.4.3.1 Text to Sound 32 3.4.4 Hardware Implementation ₃₂ 3.4.4.1 Device Planning 34

3.4.4.2 Calculating bit for Braille character 34

3.4.4.3 Design Layout 34

3.4.4.4 Schematic Design 35

3.4.4.5 Developing Device 35 3.4.4.6 Connecting with Parallel Port 35

3.4.4.7 Testing ₃₆

3.4.4.8 Implementation of Software to the 36 Device

3.5 Debugging and Testing 36

(6)

ix 4 IMPLEMENTATION 4.1 introduction ₃₈ 4.2 Text to Sound ₄₀ 4.3 Hardware Implementation 41 4.3.1 Device Planning ₄₂

4.3.2 Calculating bit for Braille character 42

4.3.3 Design Layout ₄₃

4.3.4 Schematic Design 44

4.3.5 Developing Device ₄₅

4.3.6 Connecting with Parallel port 45

4.3.7 Testing ₄₈

4.3.8 Implementation of Software to the Device 48

4.4 Summary ₄₉ 5 6 7 INTEGRATION 5.1 Introduction ₅₀

5.2 Module 1: Upper and Lower Case 51

5.3 Module 2: Upper and Lower Case ₅₆

5.4 Module 3: Upper and Lower Case 59

5.5 Summary ₆₀

RESULTS AND DISCUSSION

6.1 Introduction ₆₁

6.2 Results of the System ₆₄

6.2.1 Testing ₆₄

6.3 Summary of OCR Performance Evaluation 67

CONCLUSION

7.1 Project Summary 68

7.2 Achieved Objectives 69

(7)

6

(8)

LIST OF TABLES

TABLES NO TITLE PAGE

2.1 Comparison of Method used in character ₈

recognition

2.2 Example of pattern recognition applications 12

4.1 Description for each parallel port pin 47

5.1 _{Differentiation of text style} ₅₃

(9)

LIST OF FIGURES

FIGURE TITLE _PAGE

1.1 Task Distribution ₄

1.2 Task Distribution by Module 5

2.1 Steps in a character recognition system 10

2.2 Left: prototype from the sample page image. Right: the 14 resulting template

2.3 Samples of text images from database 16

2.4 Modelling of an Artificial Neural Network 18

2.5 Flowchart that display character learning step which is ₁₉ preprocess of character recognition steps

2.6 Discriminate Line ₂₀

2.7 Character Grouping Algorithm 21

3.1 System Development Life Cycle (SDLC) 23

3.2 Task Distribution by Organization ₂₄

3.3 Task Distribution by Module ₂₄

3.4 System Architecture 25

3.5 Data Identification for Lower Case ₂₆

3.6 Graphical User Interface ₂₆

3.7 Main Steps of OCR Algorithm ₂₈

3.8 _{Example of an A alphabet in binarize and pixel-inverted} ₃₀ order

3.9 System Work Flow ₃₁

3.10 _{Hardware Flow Chart} ₃₃

(10)

4.1 Flow Chart 39

4.2 Coding for Text to Speech 40

4.3 Hardware Flow Chart 41

4.4 Braille characters reading 43

4.5 Hardware Design Layout ₄₄

4.6 Device Schematic Design 44

4.7 25-way female D-type parallel port connecter 46

4.8 Old parallel port cable 46

4.9 LED Braille display system ₄₉

4.10 Prototype LED Braille device 49

5.1 Flow of Module 1 ₅₁

5.2 Creating a template from image 52

5.3 Creating image character: removing all unwanted fill 53

5.4 Angle Correction 55

5.5 Flow of Module 2 ₅₆

5.6 Flow of Module 3 59

5.7 Prototype device for Braille printer ₆₀

6.1 Text image of uppercase characters group with font type 62 of Aria! Black

6.2 Text image of lowercase characters with font type of 62 Aria! Black

6.3 Text Image of number characters 62

6.4 Text image of symbol characters 62

6.5 Text image multiple characters fro combination of ₆₃ uppercase characters and number characters

6.6 Text image multiple characters for combination of ₆₃ uppercase characters, number characters, and symbol

characters

6.7 Text image multiple characters for combination of 63 lowercase characters and symbol characters

6.8 Choosing images tising Open Image 65

6.9 Open image browsing window ₆₅

6.10 The result for system integration 65

6.11 _{Symbol of Braille character after Braille Simulation} ₆₆ button is clicked

(11)

CHAPTER 1

INTRODUCTION

1.1 Introduction

In character recognition, there are two types which is on-line character recognition and off-line character recognition. In Optical Character Recognition (OCR), using of off-line character recognition usually make the system recognizes the fixed static shaped of the character. But on-line character recognition more to handwriting that recognizes the dynamic motion in handwriting. For example on-line character recognition is used of gestures in the Penpoint ' OS. Dynamic character recognition, real-time character recognition, and Intelligent Recognition or ICR is also referred as on-line character recognition. Optical Character Recognition (OCR) is a system that required of calibration to read a specific font using translation of mechanical or electronic translation, it translate by scanning images of typewritten, handwritten or printed text into machine-encoded text. Nowadays, it was widely used for converting books and documents into electronic files. As if it is been computerized for an example, a record-keeping system in an office or maybe to

(12)

2 publish the text using the website. it is also now possible to edit text, search phrase of word, store more compactly, print or display of scanning artifacts, and to apply the techniques to text-to-speech, text mining and machine translation to it. This system need to be programmed according to images of each character that need to be translated, and usually works on a particular font at a time.

1.2 Problem statement

Like other people, blind people also want to read even though they cannot see. They can read by using their fingers with a special character of alphabet by rose of dots on stiff paper that called Braille. Nowadays there are device for blind peoples to use for them to read such as typewriters and printers that can print out the special characters which is Braille. With this they can use the computers to print out their materials in Braille.

In future, OCR systems can be integrated into devices such as mobile phones to convert any image file (captured by Camera/mobile phone or scanned by a scanner) to machine readable/editable format that can be printed or displayed on Braille. Presently, Samsung's Braille is available in the market to read and send real-time SMS only but unable to convert images or any other text files to Braille.

Presently there are not many convenient materials that are available for them to read. The available devices that are on market now contain limited functions for them in this new era. Furthermore, the devices are costly and most of it is still beyond the ability of individual to afford it. Also, many blind people prefer to useBraille, as they are already used to it.

Based on the reasons, developing a complete device or system is needed to ease the reading for blind people. The device will be portable and small so that it will be easy for them to use and to carry it wherever they go every time and to read

(13)

3

anything they want. The development process will consists of several stages which will start from reading the input which is an image file, processing it and to converting it to an editable format. Then the process will go through few stages until Braille characters are produce.

This project is continued from previous project by Yusser A. Taqi Al-Qazwini on "Early Stages towards the Development of a Device That Eases the Reading of Blind People". The previous project only considered 6 dots Braille and it covers only capital letters. There is no function that will support small letters, special symbols and numbers.

This thesis had been upgrade from the previous project and focus on developing system and device for Braille using 8 dots format and it cover for uppercase and lower case alphabets that usually use in 'computer symbol.

1.3 Aim and Objectives

The aim of this project is to completing Module 3 to develop a system and prototype device for converting text image to Braille characters that focus 'only for lower and uppercase alphabets.

To achieve the aim, the objectives are:

1. To investigate the latest suitable pattern recognition and latest device for upper and lower case alphabets.

2. To added text to speech function

3. To develop a prototype device of converting text image to Braille system using 8 dots for upper and lower case alphabets.

4. To integrate with Module 1 and 2 with upper and lower case alphabets, numbers and symbols.

(14)

4

1.4 Scope

I. This system will convert text image to Braille using suitable pattern recognition and latest device for upper and lower case alphabets.

2. Additional features with text to speech.

3. This system will support 8 dots Braille character.

4. This system will integrate with Module I and 2 with upper and lower case alphabets, numbers and symbols.

1.5 Study Module

How Blinded People Read?

Braille

T

_i Sound

'1

66 Dots Braille Format . 8 Dot Braille Format

• Uppercase and

Uppercase

j:1

Lowercase

11 1

Numbers

J

Special

___j IPhabets

J

Symbols

The Project

(15)

izzali Siti Hidayah Mohd Farraliiyah Bhlili MlsWan Selahuddin Bin BbdiMazlan Numbers Jaafar

Lower Case Symbols

Module 1 INPUT / ORIGINAL IMAGES TO CLEAR IMAGES;

SitiHl&ayahBinliMlswan _{Webcam, scanner, Image digitaJition,}

Bitjop,

Noise, angle, Gray, black and white, edge and segmentation

Module 2 CLEAR IMAGES TO TEXT;

MoM SoizhudtUn Bin _{Horizontal projection, Vertical projection, Template} Jaafar _{Matching and Rule Base}

Module 3 TEXT TO BRAILLE:

Izzad FazaJdyJa Bind A44.4 with souid an hardwart

Mazian

Figure 1.2 Task Distribution by Module

1.6 Thesis Organization

Thesis organization is about how to organize the thesis and general introduction to what will be describe in all of the chapters in the thesis. The explanation about all of the chapters is briefly summarized. Chapter 1 covers the background, problem statement, aims and objectives, scope, study module, and thesis organization for the project.

Chapter 2 discussed about the literature review. This chapter contains research information that is related to the project to get the general overview about the techniques, its applications, and other approaches by doing online surveying.

Chapter 3 covers the methodology. In this chapter, all of the techniques that will be use, all of the process that involve, and the development phase is discussed.

(16)

Chapter 4 is about implementation, where the code for the proposed system is explained in details, also including the assumptions taken during the code implementation.

Chapter 5 is about the integration of nodule by using upper and lower case. The result of combination is shown in this chapter.

Chapter 6 covers about the results and discussion of the system. It is based on the experimental result that had been done during the implementation.

Finally, chapter 7 is about conclusion and contribution of the project. Also discuss suggestion about future works based on the analysis and recommendation of the proposed method for what need to be repair and enhance for the future.

(17)

CILAPTER 2

LITERATURE REVIEW

2.1 Introduction

In this chapter presents the general view of Optical Character Recognition (OCR) and also about the pattern recognition, its approaches and its applications.

(18)

Table 2.1 Comparison of method used in character recognition. - rrcr oa iftbr PJX*-- ec V, bmw 2Jfl. 1WL Th1C 1tX) %iare 15 cS1r ]gwith NITh I / / I I T-zrd5, the OCorPrniled x x x X7. I / tYrdn Sthpl Uthng reed An Arab I / ' / ) I Rrcg?ith S'n 1. 6it1ror x I X x Con-caiion oi Inflethon 1ndiamLanuage tT.mg Mophoic,gjcai Pr

i— i— •— i— i— ¶ 1

Cbre

StxtUthg

Ij Agcthin

/

(19)

x

ith3 ]m37nttWFUl

K.kLgOCR uaUr

Anrmb4eU r- i-i yr-

T

- - -

___ Appic21iOfl io

Dgra&d

In the last decade that had been quoted by Nagy, Natker and Rice (1999), [2] quoted that the acceptance rates from readers on hand-printed digits and constrained alphanumeric fields have risen significantly (from readers usually run at high reject/error ratio). Furthermore, many researchers now view off-line and on-line cursive writing as the next challenge or turn to multi-lingual recognition in a variety of scripts [2].

2.2 Optical Character Recognition _{[1] 13}₁ [61

In a project of early stages towards the development of a device that eases the reading of blind people by Yusser A. Taqi Ai-Qazwin (2010), [1] had used optical character recognition (OCR) algorithm that is proposed to be integrated into a Braille system. In general, OCR technology is all about recognizes the characters of a source document using the optical properties of the equipment and media.

Furthermore, it improves the, accuracy of data collection and reduces the time required by human to enter the input of data. It requires hardware that will scan and convert the images org character data from the source document into digital form. There are three primary methods for OCR to read a source document which are by optically readable marks, bar codes, and optically readable characters that includes handwritten characters.

(20)

10 An OCR system typically includes of the following processing steps:

1. Gray-level scanning at a resolution, typically 300-1000 dots per inch, 2. Preprocessing:

i) Binarization (two-level thresholding), using a global or a locally adaptive method;

ii) Segmentation to isolate individual characters;

iii) (optional) conversion to another character representation (e.g. skeleton or contour curve);

3. Feature extraction;

4. Recognition using one or more classifiers;

5. Contextual verification or postprocessing.

PAPER DOCUMENT SCANNING GRAY SCALE IMAGE REPROCESSING SINGLE

FEATURE FEATURE EXTRACTION

VECTORS CLASSIFICATION CLASSIFIED CHARACTERS POSTPREPROCESSINc3 CLASSIFIED TEXT

(21)

11 OCR can convert typed and handwritten document into digital data that will scan the shape of a character, compare the character with pre-defined shape, and then convert the character into corresponding bit pattern for storage in the main memory of the computer, but this is for some of OCR readers. Another function of OCR is to allow printed documents to be stored in a computer, such as text, images or photographs.

For this research, OCR functioning by converting the scanned image files into editable text document. When the OCR software received its input which is the images, it will then process it and then compares the characters with a set of OCR fonts stored in its available database. After that, the characters that had been compared will then be translated into Braille characters by using Braille coder algorithm. In an image, there are letters or characters that are made by many of tiny dots which is pixels that then will altogether form a picture of text Screen readers and refreshable Braille displays that enable OCR to read the text contained images.

2.3 Pattern Recognition [41 [101

The main approaches that had been used in this study are pattern recognition. This approach recognizes unknown instances of objects by classify it, or assign it to one of the set of possible classes or labels. In this researches, pattern recognition is about how the machine observe the environment involve, learn about the distinguish patters of interest from their background, and make sound and reasonable decisions about the categories of the patterns.

In patterns recognition, there are several things that need to focus on which is about the variety similar symbols that must be recognized distinctly, then about the imbalance and paucity of training data available, and the impossibility of final

(22)

12 verification through spell check. All of this must be focused on importantly in order to get the perfect output.

In pattern recognition, there are many techniques that are used in wide variety of commercial applications. Table 2.2 below show the examples of pattern recognition applications that had been recently renewed because of the emerging applications which are challenging and also computationally more demanding nowadays.

Table 2.2 Examples of pattern recognition applications [4]

Problem Domain Application Input Pattern Pattern Classes

Bioinformatics Sequemcc analysis

_PNA/P!tin

_{Knowi, type}

_s

_of sequence genes! patterns

Data mining Searching for Points in Compact and

meaningful multidimensional well-separated

patterns space clusters

Document Internet search Text document Semantic

classification _{categories (e.g..}

business, sports, etc.) Document image Reading machine Document image Alphanumeric

analysis for the blind characters, words

Industrial Printed circuit Intensity or range Defective /

non-automation board inspection image _{defective nature of}

product Multimedia Internet search Video clip Video games (e.g.,

database retrieval _{action, dialogue,}

- etc.)

Personal Face, iris, Authorized users

(23)

13 Remote sensing Forecasting croMu1tispectral _{Land use}

yield image categories, growth

pattern of crops

Speech Telephone Speech waveform Spoken words

recognition directory enquiry without operator

assistance

In order to constructing a good classifier for a given problem is generally have to:

1. Select the pattern recognition approach whose assumptions best match the characteristics of the problem;

2. Use a set of training data that is large enough and representative of the problem;

3. Decide on the "right" features.

On deciding which pattern recognition approach will be the best matches the characteristics of the problem requires an understanding of both the assumptions implicitly made by each approach and a characterization of the problem.

In pattern recognition techniques, we can divide into two techniques which are structural and statistical techniques. We can assume that structural techniques is on how the parts fit together to form the whole by knowing and looking at the parts of an object. Whereas for the statistical techniques is about measured some properties of the whole and the values of properties which are varies between classes.

(24)

prototypes

eeeeeeee eeeeeae

14

2.4 Template Matching [1] [3][4][81

In the OCR algorithm that review from the researched, combines both template matching and number of pixels feature to achieve accurate results of the proposed OCR that shows the recognition accuracy ranges are between _97.5%_{and 100%.} Template matching is one of the simplest and earliest approaches in the pattern recognition and also fairly standard image processing technique but are varies significantly across languages in order to find the most similar form. More than that, template matching requires shorter time and does not need sample training.

A template is a 2D array which is constructed from a set of prototypes from the same class.

template

Figure2.2 Left: prototype from the sample page image. Right: the resulting template.

[8]

The template is then been matched against the image of each word in the pruned list. The matching function must be satisfying two criteria [3]:

1. It must produce low match error for words which are similar to the template.

2. It must produce high match error for words which are dissimilar.

Meanwhile, the process comprises three main steps which are: 1. _{Extracting all the characters from an image.}

2. _{Building a library that contains one representative for each character} class.