School of Science & Engineering
Department of Electrical & Computer Engineering
Non-Visual Representation of Complex Documents for Use in Digital Talking Books
Azadeh Nazemi
This thesis is presented for the Degree of Doctor of Philosophy
of Curtin University
Declaration
To the best of my knowledge and belief this thesis contains no material previously published by any other person except where due acknowledgment has been made. This thesis contains no material which has been accepted for the award of any other degree or diploma in any university.
Signature:
Parts of this thesis have been previously published as listed below:
1. Azadeh Nazemi, Cesar Ortega-Sanchez & Iain Murray. (2011). Digital Talking Book Player for the Visually Impaired Using FPGAs. Proceedings of the 2011 International Conference on Reconfigurable Computing and FPGAs, RECONFIG '11 (pp. 493-496). IEEE Computer Society Conference Publications, Washington, DC, USA. ©2011. ISBN: 978-0-7695-4551-6. doi:10.1109/ReConFig.2011.28
2. Azadeh Nazemi, Iain Murray & Nazanin Mohammad. (2012). Mathspeak: An audio method for presenting mathematical formulae to blind students. Proceedings of the 5th International Conference on Human System Interactions (HSI) (pp. 48-58).
3. Azadeh Nazemi & Iain Murray. (2012). A Novel Complete Reading Embedded System for the Vision Impaired. Proceedings of the 3rd Annual International Conference on Computer Science Education: Innovation and Technology (CSEIT 2012). Singapore. (Best Student Paper Award Winner).
4. Azadeh Nazemi & Iain Murray. (2013). A method to provide accessibility for visual components to vision impaired. International Journal of Human Computer Interaction, 4(1).
5. Azadeh Nazemi & Iain Murray. (2013). Mathematical Formula Recognition and Transformation to a Linear Format Suitable for Vocalization. International Journal on Computer Science and Engineering, 5(9).
6. Azadeh Nazemi & Iain Murray. (2013). An Open Source Reading System For Print Disabilities. International Journal of Information Technology and Computer Science (IJITCS), 12(2).
7. Azadeh Nazemi, Iain Murray & David A. McMeekin. (2014). Layout Analysis for Scanned PDF and Transformation to the Structured PDF Suitable for Vocalization and Navigation. Computer and Information Science, 7(1), 162-172. Publisher: Canadian Centre of Science and Education.
8. Azadeh Nazemi, Iain Murray & David A. McMeekin. (2014). Multilingual Text to Speech in embedded systems using RC8660. International Journal of Computers & Technology, 13(4). Publisher: Council For Innovative Research.
9. Azadeh Nazemi, Iain Murray & David A. McMeekin. (2014). A Method to Provide High Volume Transaction Outputs Accessibility to Vision Impaired Using Layout Analysis. Transactions on Machine Learning and Artificial Intelligence, 2(3). Publisher: Scholar Publishing.
10. Azadeh Nazemi, Iain Murray & David A. McMeekin. (2014). Practical segmentation methods for logical and geometric layout analysis to improve scanned PDF accessibility to Vision Impaired. International Journal of Signal Processing, Image Processing and Pattern Recognition, August 2014 issue of IJSIP.
11. Azadeh Nazemi, Iain Murray & David A. McMeekin. (2014). Mathematical Information Retrieval (MIR) from Scanned PDF Documents and MathML Conversion, Transactions on Computer Vision and Application. Volume 7(10.12.2014) IPSJ.
12. Azadeh Nazemi, Iain Murray & David A. McMeekin. (2014). A Method to Implement DAISY Online Delivery Protocol. Proceedings of the 8th International Conference on Telecommunication System, Services, and Application (TSSA 2014), Kuta, Bali. IEEE Conference Publications.
13. Azadeh Nazemi, Iain Murray & David A. McMeekin. (2015). Unbalanced Chemical Equations Conversion to Mark-up Format and Representation to Vision Impaired Students. International Journal Computer Applications in Engineering Education.23(2), 4/2015, Publisher: Wiley.
Acknowledgments
I express sincere appreciation and thanks to my Supervisor, Dr. Iain Murray, who has been a tremendous mentor for me. He is the reason I decided to pursue a career in research. I thank him for encouraging my research and for allowing me to grow as a research scientist. His advice on both my research and my life has been priceless. I would also like to thank my Co-supervisor, Dr. Cesar Ortega Sanchez, for scientific advice and knowledge and many insightful discussions and suggestions, and my Associate Supervisor, Dr. David McMeekin, for helping me immensely. I would especially like to thank the respondents of my surveys, including CAVI students. I would like to thank Mr Russel Wilkinson for technical resources and help.
A special thanks to my family. Words cannot express how grateful I am to my husband, Pejman, and my son, Hiva, for all of the sacrifices that they have made on my behalf. Your prayer for me was what sustained me thus far. My special thanks are due to my mother, father, brother and sister-in-law for their constant support and unconditional love.
I would also like to thank all of my friends, especially Chandrika, who supported me in writing and helped me to strive towards my goal.
Abstract
According to a World Intellectual Property Organization (WIPO) estimate, only 5% of the world's one million print titles published every year are accessible to the approximately 340 million blind, visually impaired or print-disabled people. Equal access to information is a basic right of all people. Essential information such as flyers, brochures, event calendars, programs, catalogues and booking information needs to be accessible to everyone. Information helps people to make decisions, be involved in society and live independent lives. Article 21, Section 4.2 of the United Nations Convention on the Rights of Persons with Disabilities advocates the right of blind and partially sighted people to take control of their own lives. However, this entitlement is not always available to them without access to information. Today, electronic documents have become pervasive. For vision-impaired people, electronic documents need to be available in specific formats to be accessible. If these formats are not made available, vision-impaired people are greatly disadvantaged compared to the general population. Therefore, addressing electronic document accessibility for them is an extremely important concern. To address the accessibility issues of electronic documents, this research aims to design an affordable, portable, stand-alone and simple-to-use Complete Reading System to provide accessible electronic documents to the vision-impaired.
Contents
Acknowledgments
Abstract
List of Figures
List of Tables
1 Introduction
1.1 Introduction
1.2 Statement of the Problem
1.3 Aims of the Research
1.4 Outcome of the Research
1.5 Thesis Overview
1.6 Summary
2 Current State of Technology
2.1 Introduction
2.2 Print Disability
2.3 Reading and Learning Methods
2.3.1 Different Types of Reading and Learning Methods
2.3.2 Learning Styles for the Sighted Print Disabled
2.3.3 Learning Styles for the Vision-Impaired
2.4 Traditional Alternate Access
2.4.1 Braille
2.4.2 Analogue Talking Books
2.5 Adaptive and Assistive Technologies for Vision-Impaired
2.5.1 Assistive Technologies based on Tactile Methods
2.5.2.1 Screen Reader
2.5.2.2 Sonification
2.5.2.3 Digital Accessible Information System (DAISY)
2.5.2.4 Electronic Publication (ePub)
2.6 Documents Accessibility Issues
2.7 Categories of Electronic Documents Accessibility
2.8 Portable Document Format (PDF)
2.9 Categories of PDF Documents Accessibility
2.9.1 Scanned PDF
2.9.2 Structured PDF
2.9.3 Tagged PDF
2.9.4 PDF/Universal Accessibility
2.10 Components of PDF Documents
2.10.1 Accessibility of Non-Textual Components
2.10.2 Approaches for Accessibility of Non-Textual Components
2.10.3 Graphical Components Accessibility Requirements
2.10.4 Alternate Access Methods for Visual Printed Material
2.10.5 Mathematical Expressions
2.10.6 Tactile Approaches for Accessibility of Mathematical Expressions
2.10.7 Audio Approaches for Accessibility of Mathematical Expressions Using Mark-up Formats
2.10.8 Accessibility of High Volume Transaction Output (HVTO)
2.11 Accessibility Issues of Chemical Equation Presentation
2.11.1 Represent Chemical Equations to Vision-Impaired Students
2.11.2 Concepts of Chemical Equation and Balancing
2.11.3 Current Methods to Represent Chemical Equation to Vision-Impaired Students
2.12 Summary
3 Methodology
3.1 Introduction
3.2 Objectives
3.3 Research Methodologies
3.4 Alternative Methodologies for Software Development
3.4.1 Waterfall
3.5 Agile Methodology and this Research
3.6 Summary
4 Requirements Analysis
4.1 Introduction
4.2 Requirement Analysis Using Survey Results
4.2.1 Accessibility of Electronic Documents
4.2.2 Communication with non-Textual Components in Electronic Document
4.3 Design Documents and Prototype Using Licensing Approaches
4.4 Summary
5 System Components
5.1 Introduction
5.2 Overview of CRS Modules
5.3 DAISY Player
5.3.1 Introduction
5.3.2 Developed DAISY Player
5.3.3 Features of Developed DAISY Player
5.4 Creation of Accessible and Navigable Document from Scanned PDF
5.4.1 html OCR (hOCR)
5.4.2 Segmentation and Layout Analysis
5.4.3 Segmentation by Recognition by Adaptive Subdivision of Transformation (RAST) and Voronoi Methods
5.4.4 Preprocessing
5.4.4.1 Format Conversion
5.4.4.2 Convert to Binary
5.4.4.3 Scaling
5.4.4.4 Black Borders Removal
5.4.4.5 Margin Removal
5.4.4.6 Skew Detection and Correction
5.4.5 Block Segmentation
5.4.6 Text/non-Text Segmentation
5.4.7 Line Segmentation
5.5 Mathematical Information Retrieval (MIR) from Scanned PDF
5.5.1 MIR Overview
5.5.3 Mathematical Expression Accessibility Issues in Scanned PDF Documents
5.5.4 Previous Methods of Mathematical Formulae Extraction
5.5.5 Global Line Labelling by Feature Extraction and SVM
5.5.6 EF Extraction Using Word Segmentation
5.5.7 Recursive Symbol Segmentation and Bracket Rule
5.5.8 Generating a Symbol Dictionary Using InftyMDB-1
5.5.9 Mathematical Symbol Recognition Using K-Nearest Neighbours (kNN) Based on Binary Vector
5.5.10 Symbol Layout or Structural Semantic Analysis
5.5.11 Merging Primitive Components and MathML Generation
5.5.12 Rendering Math-TEX Family with MATHSPEAK
5.6 Mathematical Graph Accessibility with MathGraphReader
5.6.1 Current Method
5.6.2 Proposed Method
5.6.2.1 Examined Graphs
5.6.2.2 General Concepts
5.6.2.3 Data Extraction
5.6.3 Testing and Evaluation
5.7 Not-in-Order Components: Tables and TableReader
5.7.1 Tables Categories
5.7.1.1 Regular table
5.7.1.2 Irregular table
5.8 High Volume Transactional Output (HVTO) Segmentation
5.9 Non-Textual (Graphical) Components Accessibility
5.9.1 ChartRecognition
5.10 Examined Non-textual Components in this Research
5.10.1 Bar Chart and BarChartReader
5.10.2 Line Chart and LineChartReader
5.10.3 Pie Chart and PieChartReader
5.10.4 GNUPLOT Evaluation Tool
5.11 A Method to Present Chemical Equations to Vision-Impaired Students
5.11.1 Representing Chemical Equation in Markup Format
5.11.2 Species Classification and Reactants/Products Extraction
5.11.3 Chemical Elements Extraction and Symbol Replacement
5.11.4 Calculation of Total Quantity of Element at Left and Right Side and Comparison
5.11.5 Insert Unknown Coefficients Before Species Including Reactants and Products
5.11.6 Defining Algebraic Equations to Obtain Coefficients and Balance Chemical Equation
5.11.7 Tagging Classified Information and Markup Format Generation
5.12 Summary
6 Implementation of Hardware Platform
6.1 Introduction
6.2 FPGA Prototype
6.2.1 FPGA Prototype Requirements
6.2.2 FPGA DTB Books Player Functionality
6.2.3 FPGA DTB Player Hardware Components
6.2.4 FPGA DTB Player
6.3 Embedded Platform for Reading System
6.4 User Interaction Methods
6.4.1 Introduction
6.4.2 Speech Recognition
6.4.3 Customised Tactile Keypad
6.4.4 Users Feedback about USB keypad
6.4.5 Joysticks
6.4.6 Text to Speech (TTS)
6.4.7 Braille Terminal
6.5 Summary
7 Testing for Evaluation and Usability
7.1 Introduction
7.2 Block Segmentation
7.3 Chart Recognition
7.4 BarChartReader
7.5 PieChartReader
7.6 LineChartReader
7.7 MathGraphReader
7.8 MATHSPEAK
7.9 Alternative Text Description for Chemical Equation
7.10 Mathematical Handwritten Documents Segmentation Results
7.11 Summary
8 Conclusion
8.1 Future Work
8.2 Final Reflection
List of Figures
2.1 The PHANToM (Murray, 2008)
2.2 2-Dimensional mathematical equations contain subscripts and superscripts
4.1 Survey participants age group (left) and survey participants gender (right)
4.2 Respondents' perception of the most common document format
4.3 Respondents' most preferred format
4.4 Important factors in selecting a document format
4.5 Accessibility Tool
4.6 Most inaccessible graphical components
4.7 Frequent encounter inaccessible graphical representations
4.8 Best perceived accessibility method for graphical components
4.9 Navigation importance for access to graphical components
4.10 Preferred type of tactile method
4.11 Frequent usage of Haptic device
5.1 Overview of CRS modules
5.2 Three DAISY standards comparison
5.3 Steps of reading session by developed DAISY player
5.4 An image before (left) and after (right) margin removal
5.5 Rotated PDF document
5.6 Top-left: a multiple columns document including sidebar. Top-right: the result of using Voronoi segmentation. Bottom-left: a multiple columns document. Bottom-right: result of using Horizontal/Vertical segmentation (blocks are specified with blue line separators)
5.7 Left to right: A sample Binary-Block-Segment, dilated, non-Textual and Text-Only images
5.8 Text/non-Text Segmentation
5.9 Different types of lines in mathematical documents: Top: Text, Middle: EF and Bottom: IF
5.10 A Fully-Segmented EF line
5.11 A 2-D IF line (top) and over-segmentation result (middle and bottom)
5.12 Two 2-D IF lines (top) and under-segmented IF 2-D line result (bottom)
5.13 EF 2-D line (top) and broken symbol (bottom)
5.14 Mathematical PDF document (top) and its line segmentation result (bottom)
5.15 Overview of MIR
5.16 A sample of mathematical document (left) and extracted features of its segmented lines (right). These features are: left margin (LM), right margin (RM), height (H), width (W), vertical space (VS) and aspect ratio (AR)
5.17 Threshold value=Mode(x) (left), Threshold value=Median(x) (right)
5.18 Global Line Labeling
5.19 An EF line sample (top) and word segmentation result of the sample line (bottom)
5.20 Voronoi result for sample of EF
5.21 VCE and HCE result for a sample
5.22 Slopes between two adjacent symbols
5.23 Relationship between two adjacent mathematical symbols Height-Ratio
5.24 8 different relationships between a symbol and its neighbours
5.25 A Mathematical graph and its similar product by GNUPLOT
5.26 From top to bottom: Illustration of a regular table (Binary-Table.png), vertical borders (vertical.png), horizontal borders (horizontal.png), vertical/horizontal borders intersection (intersection.png) and the values of cells bounding boxes
5.27 Sample of an irregular table and its cells bounding box values
5.28 HVTO Segmentation: The sample of a bill (British Gas, 2013) (left) and result of horizontal/vertical segmentation (right)
5.29 Chart-Legend Segmentation Results: Left: from top to bottom a sample of Pie Chart, Pie-Chart-Only and Legend-Only images. Right: from top to bottom a sample of Bar Chart, Bar-Chart-Only and Legend-Only images
5.30 Chart classification
5.31 Horizontal/Vertical lines removal and binarizing result for samples of pie chart and bar chart. Left: top to bottom, Pie-Chart-Only and its Binarized-Removed-Horizontal-Vertical-Lines. Right: top to bottom Bar-Chart-Only and its Binarized-Removed-Horizontal-Vertical-Lines
5.32 Horizontal/Vertical Axes Segmentation
5.33 a) Bar chart; b) Chart-only; c) Legend-only; d) Vertical axis; e) Horizontal axis; and f) Binary-Bars-Only images by BarChartReader
5.34 a) A line chart sample; b) Horizontal axis; c) Vertical axis; d) Horizontal grid lines removal result; and e) Marker-Only images by LineChartReader
5.35 From left to right: a) A sample of Pie Chart; b) Pie-Only; c) Legend-Only; d) Transparent-Background-Pie-Only; and e) Color-Reduced-Transparent-Pie-Only
5.36 A sample line chart (left) and generated line chart by GNUPLOT based on data extracted by LineChartReader (right)
5.37 A sample bar chart (left) and generated bar chart by GNUPLOT based on data extracted by BarChartReader (right)
5.38 Algebraic finding to balance three samples
6.1 Finite state machine for PmodDA2
6.2 FPGA-based embedded system for DTB player
6.3 Block diagram of customized board for reading systems
6.4 Arduino Hex Keypad Schematic
6.5 CRS customised keypad
6.6 Mobile keypad arrangement
6.7 NLS Digital Talking Book Player (Maine State Library, 2013)
6.8 Reading system connected to Braille terminal
7.1 Engineering drawing
7.2 Recognised as pie chart
7.3 Recognised as bar chart
7.4 Recognised as line chart
7.5 Examined bar charts by BarChartReader. Maximum/Minimum bars and total number of bars were properly recognised
7.6 Bar chart samples which were examined by BarChartReader
7.7 Sample of Pie charts, their slices percentage and equivalent bar charts generated by PieChartReader
7.8 Line Chart Samples processed by LineChartReader
7.9 Sample 1 Mathematical graph
7.10 Sample 2 Mathematical graph
7.11 Sample 3 Mathematical graph
7.12 Sample 4 Mathematical graph
7.13 Handwritten mathematical sample (left), line segmentation result (middle) and character segmentation of two lines (right)
List of Tables
4.1 Results summary for survey 1
4.2 Results summary for survey 2
5.1 hOCR Existing tags and recommended tags by this research
5.2 Mathematical Symbol Categories Based on symbol Aspect Ratio
5.3 Predefined size for three symbol categories
5.4 Different relations between two adjacent symbols in mathematical expressions
5.5 Graph location in plane
5.6 Supported functions with GNUPLOT and MathGraphReader application
5.7 Evaluation responses
5.8 Category 1 of symbols for chemical elements
5.9 Category 2 of symbols for chemical elements
5.10 Algebraic equations
5.11 Mark-up Format Tags
6.1 Push button values for different functions
6.2 Basic Functions in VOCA file in Voice Recognition System for CRS
6.3 Using joystick as a user interface
6.4 TTS control commands
7.1 Chart recognition features
8.1 Research objectives, proposed methods and used techniques - 1
8.2 Research objectives, proposed methods and used techniques - 2
8.3 Research objectives and proposed methods to achieve them - 3
List of Abbreviations
ASCII American Standard Code for Information Interchange
ASTER Audio System for Technical Readings
BNF Backus Normal Form
CRMFS Compressed ROM File System
CRS Complete Reading System
DAC Digital to Analogue Converter
DAISY Digital Accessible Information System
DAS Document Accessibility Services
DFA Deterministic Finite Automaton
DFU Device Firmware Update
dpi Dots Per Inch
DRM Digital Rights Management
DTB Digital Talking Book
DTE/DCE Data Terminal Equipment/Data Circuit-terminating Equipment
ePUB Electronic Publication
FPGA Field Programmable Gate Array
FTP File Transfer Protocol
GPIO General Purpose Input/Output
HMM Hidden Markov Model
HVTO High Volume Transactional Output
ICAD The International Community for Auditory Display
I/O Input/Output
kNN k Nearest Neighbours
LVCSR Large Vocabulary Continuous Speech Recognition
MAC Media Access Controller
MathML Mathematical Markup Language
MIR Mathematical Information Retrieval
MMU Memory Management Unit
MPC Magick Persistent Cache image file format
NCX Navigation Center eXtended
NLS National Library Service
OCR Optical Character Recognition
OPF Open Packaging Format
OS Operating System
PCM Pulse Code Modulation
PDF Portable Document Format
PLB Processor Local Bus
PPM Portable Pixel Map
RAST Recognition by Adaptive Subdivision of Transformation
SIGHT Summarizing Information Graphics Textually
SMIL Synchronised Multimedia Integration Language
STEM Science, Technology, Engineering and Math
SBC Single Board Computer
SVM Support Vector Machine
TFTP Trivial File Transfer Protocol
TTS Text to Speech
UART Universal Asynchronous Receiver Transmitter
VEM Visual Extraction Module
WHO World Health Organisation
XED Structured Electronic Document
XCDF Structured Canonical Document Format
XHTML Extensible Hyper Text Markup Language
XML Extensible Markup Language
Chapter 1
Introduction
1.1 Introduction
Advances in technology have led to the pervasive use of electronic documents. These documents carry a wide range of everyday information: flyers, brochures and news; statements presented by the financial services, insurance, utilities and government sectors, such as bills or bank statements; and learning materials such as reference books or user manuals. The accessibility of electronic documents for vision-impaired people is therefore a significant concern, and it is the major problem that this research addresses.
1.2 Statement of the Problem
The right to access information is internationally recognised in Article 21 of the United Nations Convention on the Rights of Persons with Disabilities (United Nations, 2006). According to World Health Organisation (WHO) statistics, as updated in October 2013, 285 million people are estimated to be visually impaired worldwide: 39 million are blind and 246 million have low vision. Approximately 90% of visually impaired people live in developing countries (Geoffrey et al., 2010) and 65% of them are aged 50 or older. This age group comprises approximately 20% of the world's population. There are 19 million visually impaired children, of whom 1.4 million are irreversibly blind and need social, vocational, economic and educational support (WHO, 2013). They experience great difficulty in accessing printed material. Access to information and printed resources is an essential requirement for an independent and constructive life. In addition, more people will be at risk of age-related visual impairment as the elderly population increases in many countries (WHO, 2011). The lack of accessible information is one of the most critical obstacles preventing vision-impaired students from completing post-secondary education and undertaking academic studies in science, technology and engineering fields, which may lead to lower career success.
1.3 Aims of the Research
This research focuses on the accessibility of printed and electronic documents, including websites, books, invoices, letters and leaflets, to a broad vision-impaired audience. It aims to study the issues in accessing the most commonly used electronic documents, to formulate the definition of the research problem, and to address the accessibility issues of different components of electronic documents such as charts, graphs, tables, mathematical expressions and chemical equations. To achieve this goal, this research considers the design and implementation of a Complete Reading System (CRS) for vision-impaired people. The CRS is a fully functional, simply operated, standalone and affordable system which gives vision-impaired people the opportunity to access, navigate and bookmark documents during reading sessions. The CRS supports vision-impaired users, specifically students in high school and higher education, in accessing reference books and user manuals just as sighted people do, by providing navigation, bookmarking and searching abilities. The CRS is a modular software system.
1.4 Outcome of the Research
During this research, an initial prototype of the system was designed and implemented to provide vision-impaired people with access to the most commonly used electronic documents. In designing the CRS, efforts have been made to offer electronic document accessibility and a fully featured multimedia reading experience to vision-impaired people, even those with little or no computer experience. The CRS is an embedded device with a software application which provides features and flexibility, and a hardware platform that provides cost-effective performance without the requirement for expensive computers or smart phones. The findings support making a standalone, low-cost CRS for vision-impaired people that:
Enables electronic documents such as HTML, PDF, POSTSCRIPT, plain text, DOC, DOCX, and ODT to be accessible to vision-impaired users through audio;
Plays Digital Accessible Information System (DAISY) books with navigation, searching and bookmarking abilities for reading user manuals, reference books and encyclopedias;
Performs layout analysis;
Runs multilayer segmentation on the document as follows:
1. Block segmentation
3. Line segmentation
4. Word segmentation
5. Character segmentation;
Generates comprehensive text descriptions from non-textual components and converts text descriptions to accessible audio output;
Preserves the author's reading order or the structure of not-in-order components, such as papers with more than one column;
Converts mathematical formula images to mark-up format using Mathematical Optical Character Recognition (MOCR);
Converts High Volume Transaction Output (HVTO) to accessible formats for vision-impaired customers; and
Converts the text format of chemical equations to an alternative mark-up and audio description to help vision-impaired students to balance chemical equations.
1.5 Thesis Overview
The thesis consists of eight chapters, structured as follows. Chapter 2 provides a literature review of current knowledge, findings, definitions and theoretical and methodological contributions to electronic document accessibility for vision-impaired users. Chapter 3 compares the Waterfall and Agile methodologies as two methods for software development. Chapter 4 presents the results of two surveys conducted among vision-impaired people to gauge their access to electronic documents and their communication with non-textual components. Chapter 5 explains the different software modules developed as approaches to solve issues regarding electronic document accessibility. Chapter 6 describes hardware implementation methods for the Complete Reading System, followed by the selection and driving of appropriate input/output interfaces for communication between the system and its users. Chapter 7 presents the outcomes of applying the modules developed in Chapter 5 to random samples in order to evaluate the system. Chapter 8 recapitulates the research that has been carried out, and Appendix B contains the code of the software modules developed in Chapter 5.
To demonstrate the contributions of this research, complete alternative text descriptions were provided for all figures in this thesis so that they are accessible to vision-impaired users who utilize screen readers.
1.6 Summary
Electronic documents may contain non-textual components (such as charts and graphs), non-linear components (such as tables), not-in-order components (such as papers with more than one column), and multidimensional components (such as mathematical expressions and chemical equations). These documents have a number of accessibility issues for those who are vision-impaired and use assistive technology. The next chapter is the literature review of established knowledge and ideas about electronic document accessibility issues and the current solutions to address them.
Chapter 2
Current State of Technology
2.1 Introduction
This chapter includes the current knowledge, findings, definitions and theoretical and methodological contributions to electronic document accessibility for vision-impaired users.
2.2 Print Disability
George Kerscher coined the term print disabled (circa 1988-1989) to describe persons who could not access print (Kendrick, 2001). The definition is as follows: A print-disabled person is a person who cannot effectively read print because of a visual, physical, perceptual, developmental, cognitive, or learning disability (Reading Rights Coalition, 1989). A print disability prevents a person from gaining information from printed material in the standard way and requires them to utilize alternative methods to access that information. Print disabilities include visual impairments, learning disabilities, or physical disabilities that impede the ability to manipulate a book in some way (Learning Ally, 2012).
Additionally, the Higher Education Opportunity Act defines print disabled as a student with a disability who experiences barriers to accessing instructional material in non-specialized formats (Title 17 of the Copyright Act - Council for Exceptional Children, 2008).
The Google Library Project Settlement defines print disabled as a user who is unable to read standard printed material due to blindness, visual disability, physical limitations, organic dysfunction or dyslexia (Reading Rights Coalition, 1989).
The reasons for print disability vary but may include:
Vision impairment or blindness;
Physical dexterity problems such as multiple sclerosis, Parkinson's disease, arthritis or paralysis;
Learning disabilities, such as dyslexia, brain injury or cognitive impairment;
Literacy difficulties; and
Early dementia (Vision Australia, 2012).
2.3 Reading and Learning Methods
In an active learning environment, learners can restructure new information and their prior knowledge into new knowledge about the content and practise using it, for example through visual aids, demonstrations or activities integrated into class presentations. In a passive learning environment, learners receive only what they are told, for example by listening to a tape recorder (Herr, 2006). Studies by and for educators identify three basic styles of learning (Barbe et al., 1979). Sighted, print-disabled and vision-impaired learners use these three styles differently.
2.3.1 Different Types of Reading and Learning Methods
The three basic learning styles are:
Tactile/kinetic: This type of learner learns best through moving, doing and touching. They prefer a hands-on approach, actively exploring the physical world around them. They remember best what was done, but may have difficulty recalling what was said or seen. They need direct involvement, may find it hard to sit still for long periods, and may become distracted by their need for activity and exploration when they are not able to move. These learners often need to take frequent breaks or move around when bored (Baxter, 2013).
Auditory: Auditory learners learn best through listening to lectures, discussions, talking things through and listening to what others have to say. Auditory learners focus on tone of voice, pitch, speed and other aspects of verbal presentations. Written information may have little meaning until it is heard. These learners prefer to sit where they can hear, but may not pay attention to what is happening up front. They may talk to themselves or others when bored. Auditory learners often benefit from reading text aloud and using a tape recorder. Learners using this style would rather listen to things being explained than read about them. However, other noises may become a distraction, resulting in a need for a relatively quiet place (Baxter, 2013).
Visual: This category of learner learns best by looking at graphics, watching a demonstration, or reading. For the visual learner, it is easy to look at charts and graphs, but they may have difficulty focusing while listening to an explanation. Visual learners often think in pictures and prefer graphical representations of concepts through charts, diagrams, or tables (Baxter, 2013).
Although most people use a combination of these three learning styles, there is usually a clear preference for one (Manktelow & Carlson, 2014). Knowing and understanding the types of learning styles is important for students of any age. A balanced, intelligent child is able to develop all three types of learning styles (Felder & Brent, 2005). Just because a child has a dominant learning style does not mean that the other learning styles cannot be improved (Cleaver, 2011). People may have to adapt to new learning styles as their lifestyles change. For example, a visual learner who is experiencing the effects of ageing on their eyesight may need to shift toward the auditory learning style. Conversely, a youngster who has successfully learned through hands-on, tactile methods may need to adapt to visual and auditory learning as they enter higher education.
2.3.2 Learning Styles for the Sighted Print Disabled
Dyslexia is a type of sighted print disability, which causes difficulties with specific language skills, particularly reading. Students with dyslexia may experience difficulties in other language skills such as spelling, writing and speaking. Moreover, people with dyslexia have problems with discriminating sounds within a word, which is a key factor in their reading. It is important for these individuals to be taught by a method that involves several senses (hearing, seeing, touching) at the same time. They can benefit from listening to books-on-tape and from writing on computers (Ramus, 2003; Silverman, 2000).
2.3.3 Learning Styles for the Vision-Impaired
Individuals cannot be simply categorized into one of these three learning styles; they may require a combination of two styles to understand and comprehend new material (James & Gardner, 1995). While all students learn through visual, audio and tactile or kinaesthetic methods, most blind and vision-impaired students learn by both audio and tactile methods (Gardier et al., 1997). Students who are blind or vision-impaired often prefer using printed course materials that have been converted into Braille, large print and digital recordings (Miner et al., 2001).
2.4 Traditional Alternate Access
Since the vision-impaired learn mainly through tactile and audio methods, printed documents need to be available in specific alternate formats. Traditionally, alternative formats such as Braille, large print or audio are used by the vision-impaired to read material.
2.4.1 Braille
In 1829 Louis Braille, the inventor of the Braille system, stated that Braille is knowledge, and knowledge is power. Braille is a tactile form of reading and writing used by people who are blind or vision impaired. It is based on a six-dot cell with two columns of three, like the six on a die. The dots in the first column are numbered 1, 2 and 3 from the top down, and the dots in the second column are numbered 4, 5 and 6 from the top down. By using any number of these six dots, 63 different patterns can be formed (Vision Australia, 2012).
It is a method of writing words, music and plain song by means of dots, for use by the blind and arranged by them (Sullivan, 2002). The range of paper-based Braille material is limited. Braille books, whilst of great value, particularly in conveying mathematics or musical scores, are heavy and bulky. For a vision-impaired person over the age of 85 who has newly acquired sight loss, adapting to a new way of reading may be difficult and learning Braille may be problematic (European Blind Union, 2011). Effective use of Braille requires tactile sensitivity, and not all vision-impaired people have this sensitivity (Murray, 2008).
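The count of 63 patterns follows from the six-dot cell: each dot is either raised or flat, giving 2^6 = 64 combinations, one of which (no dots raised) is the blank cell. The following minimal Python sketch enumerates the non-empty combinations and maps a set of dot numbers to the corresponding Unicode Braille pattern character. The function names are illustrative only and are not part of any Braille library; the sketch relies on the Unicode convention that the Braille block starts at U+2800 and that dot n sets bit n-1.

```python
from itertools import combinations

# The Unicode Braille Patterns block starts at U+2800; dot n sets bit n-1.
BRAILLE_BASE = 0x2800


def cell_to_char(dots):
    """Return the Unicode Braille character for a collection of raised dots (1-6)."""
    mask = 0
    for dot in dots:
        if not 1 <= dot <= 6:
            raise ValueError(f"dot number out of range: {dot}")
        mask |= 1 << (dot - 1)
    return chr(BRAILLE_BASE + mask)


def all_patterns():
    """Enumerate every non-empty combination of the six dots."""
    patterns = []
    for size in range(1, 7):
        patterns.extend(combinations(range(1, 7), size))
    return patterns


patterns = all_patterns()
print(len(patterns))                 # number of distinct non-blank cells
print(cell_to_char((1, 2)))          # dots 1 and 2 raised
```

The enumeration confirms the figure quoted above without relying on any particular Braille code table; which letter or contraction each pattern stands for depends on the Braille code in use and is not modelled here.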
2.4.2 Analogue Talking Books
One of the most common audio methods is the traditional talking book, which is an analogue representation of a print publication, usually human read and recorded on cassette tape or record (Friedmann, 2008). These media suffer several significant disadvantages.
Distribution of physical media is costly;
They are easily damaged and wear over time; and
They offer only sequential access to information and extremely limited navigation. Navigation is sometimes available but requires the cassette number, side and tone number.
2.5 Adaptive and Assistive Technologies for Vision-Impaired
Print disability occurs in different forms and ranges of severity (Miner et al., 2001). Consequently, assistive technology, which includes any device, piece of equipment or system that helps people with a print disability, must be flexible. Assistive technology provides alternative strategies to compensate for weaknesses and capitalize on talents. Print material needs to be reproduced in a format such as Braille, large print hard copy, sound recording of text narration, or digital files for people with a print disability (Australian Copyright Council, 2007).
A wide range of adaptive and assistive software and devices are available for people with a print disability. Many adaptations are simple and readily available (University of Washington, 2013). Most technologies use tactile and/or audio methods to convey information.
A screen magnifier can make monitors easier to read (Macular Degeneration Foundation, 2012). Software reconfiguration or special software can reverse images on computer screens from the conventional black on white to white on black (or other combinations) for individuals who are light-sensitive. For many partially sighted readers, well-designed print information using a minimum 12-point font on good quality non-shiny paper makes the document accessible (European Blind Union, 2011). Additionally, accessibility features in computer operating systems and other programs are useful for those with vision disabilities (National Center Accessible Information Technology Education University Washington, 2013). Some of the available computer programs that support the accessibility of electronic documents are:
Text to Speech using computer-synthesised voice and screen readers can be used to read text on the screen such as Voice Over for Macintosh (Schwarzenegger, 1991) and JAWS for Windows (Henter, 1989);
Text to Braille, which translates text into refreshable Braille, such as Duxbury in Windows (Gilda et al., 2014) and liblouis and libbrlapi using Brltty in Linux (Boyer, 2002);
Screen magnifier software to enlarge text on the screen; and
Text recognition applications such as Optical Character Recognition (OCR).
2.5.1 Assistive Technologies based on Tactile Methods
Braille embossers provide hard copy output, which is bulky. In addition, the Braille embosser is noisy and expensive. Nowadays, Braille users can read computer screens and other electronic devices using a Braille terminal or refreshable Braille display. A refreshable Braille display is a piece of hardware that provides Braille output from computer input. The 8-dot refreshable Braille cells change, or refresh, according to the part of the screen that has the computer's attention, and the display is used with screen reading software such as JAWS for Windows (Assistive Technology and Accessibility Centers, 2007). Refreshable Braille displays provide word-by-word translation of text on the screen into Braille on a separate display. The display reproduces words using vertical pins that raise and lower to form Braille characters in real time as the text is scanned (Miner et al., 2001). Picture in a Flash (PIAF) is a tactile graphic maker which makes raised line drawings on special paper. Tactile graphics, including tactile pictures, tactile diagrams, tactile maps and tactile graphs, are images with raised surfaces that convey non-textual information such as maps, paintings, graphs and diagrams. Production of tactile maps is one of the most common uses for tactile graphics (McCallum et al., 2005).
Adaptive software and devices, such as refreshable Braille displays, can be extremely expensive. Even people using free or inexpensive software need computer equipment with adequate memory and processing speed, as well as assistance with learning new software. Thus people who are not in full-time employment may not be able to afford such equipment. Consequently, many of them are unable to use most sophisticated adaptive technology (Australian Copyright Council, 2007).
2.5.2 Assistive Technologies based on Audio Methods
2.5.2.1 Screen Reader
Screen readers convert screen information to audio using computer speech synthesizers and allow vision-impaired people to use the computer, create a document using a word processor like MS Word, read any article on the internet, communicate through instant messaging software, create a blog post and write an email.
Vision-impaired people listen to a screen reader reading the text displayed on the screen (Theofanos & Redish, 2003).
Screen reader users do not usually have the chance to learn the correct spelling of certain words, especially uncommon ones such as medical terms. After hearing a word whose spelling they do not know, they can make the screen reader read it character by character, but this is very time consuming and, in some cases, they cannot distinguish S's from F's (Bohman, 2014).
2.5.2.2 Sonification
Sonification, or non-speech audio, is also used to convey information or perceptual data. It can, for example, represent the behaviour of a graphed equation, presenting mathematical concepts in a different way. Sonification is useful for all students, regardless of disability or learning difficulties, in a web browser or reading tool that supports audio (Hermann et al., 2011). Auditory perception has advantages in temporal, amplitude and frequency resolution that open possibilities as an alternative or complement to visualization techniques (Kramer, 1993). A Geiger detector, which conveys information about the level of radiation, and a church bell, which conveys the current time, are very basic examples of Sonification (Flowers, 2005). Though many experiments with data Sonification have been explored in forums such as the International Community for Auditory Display (ICAD), Sonification faces many challenges to widespread use for presenting and analysing data (Brown & Brewster, 2003). Many Sonification attempts are coded from scratch due to the lack of a flexible tool for Sonification research and data exploration (Matheson, 2013). In some cases, it is difficult to provide adequate context for interpreting sonified data. Sonification has the potential to bring computing to a new level of naturalness and depth of experience for the user. Sonification requires that interfaces use audio in the first place. It faces the problem that certain interfaces perform poorly at the outset and may need more user engagement, with practice and a longer learning period. To evaluate Sonification, rather than comparing interactive visual displays with interactive auditory displays, it may be better to ask whether interactive sound can improve a user's performance in a combined audio-visual display (Hermann et al., 2011).
2.5.2.3 Digital Accessible Information System (DAISY)
DAISY is an international standard for digital talking books and multimedia representation of print publications that can be either human read, utilize synthetic speech or contain both to support audio methods (Leith, 2006). Unlike analogue talking books, an important feature of DAISY books is the easy and rapid navigation ability within sequential and hierarchical structures consisting of text marked up by such elements as sentence, paragraph and page, including specific page numbers. DAISY synchronises text with audio, which allows the user to navigate via multiple levels, search, bookmark, annotate, and alter playback speed, as well as many other features (Kerscher, 2003). Using DAISY, people with a print disability can locate particular chapters or page numbers in a digital text or sound recording file almost as easily as a sighted person (Russo, 2010). This standard is increasingly being adopted internationally (Tank & Kerscher, 2007).
By synchronising audio, images and text, DAISY multimedia can address the needs of each type of learner. Full-text/Full-audio DAISY books synchronise the audio playback with written text displayed on a computer screen to the benefit of visual learners. Easy navigation of information produced in DAISY offers tactile/kinetic learners the opportunity to:
Explore documents;
Interact with information;
Retain attention; and
Improve learning skill.
A DAISY player is a device for people with print disabilities to read, search, navigate, annotate and bookmark materials such as novels, reference books and user manuals. DAISY players are implemented in either hardware or software. Hardware DAISY players, like CD players or MP3 players, can be of great assistance to auditory learners who benefit from audio playback, whether presented through a text-to-speech feature or human narration. Hardware DAISY players offer better portability and can access online content: books can be downloaded, streamed over the internet, or copied into the player via a USB port. Software DAISY players enable DAISY books to be played on a computer or on mobile devices such as iPads and mobile phones (Nazemi et al., 2014). Different DAISY players offer different levels of functionality. Some are very basic, only offering access to the audio and navigational structure of the DAISY book, while other players offer enhanced functionality, such as the ability to search text, record audio notes for future reference, synchronise audio and text during playback and view text on the screen, which may be helpful for people with dyslexia. Both hardware and software playback devices have significant drawbacks. Hardware players tend to be expensive due to the relatively limited market (compared to mainstream consumer devices), and software players require the user to be competent with a computer.
2.5.2.4 Electronic Publication (ePub)
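The multi-level navigation described for DAISY can be illustrated with a small sketch. The structure below is a hypothetical, simplified stand-in for a book's navigation data (it is not the DAISY file format itself): each navigation point carries a heading level, a title and the time at which its audio starts. Moving "to the next item at level n" then means skipping forward to the next point whose level is n or higher in the hierarchy, something a cassette-based talking book cannot do.

```python
from dataclasses import dataclass


@dataclass
class NavPoint:
    level: int          # 1 = chapter, 2 = section, and so on (assumed convention)
    title: str
    audio_start: float  # seconds from the start of the recording


def next_at_level(points, current, level):
    """Return the index of the next point at `level` or above it, else None."""
    for index in range(current + 1, len(points)):
        if points[index].level <= level:
            return index
    return None


book = [
    NavPoint(1, "Chapter 1", 0.0),
    NavPoint(2, "Section 1.1", 12.5),
    NavPoint(2, "Section 1.2", 80.0),
    NavPoint(1, "Chapter 2", 200.0),
]

# From Chapter 1, jump chapter by chapter (level 1) or section by section (level 2).
print(book[next_at_level(book, 0, 1)].title)
print(book[next_at_level(book, 0, 2)].title)
```

A real player would read this structure from the book's navigation files and seek the synchronised audio to `audio_start`; both steps are omitted here.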
ePub is a standard for digitized text and for publishing feature-enhanced eBooks. Semantic markup makes it possible to access scientific and mathematical expressions in a more meaningful and useful way. ePub uses structure in content, image descriptions and alt text, and MathML for mathematical expressions. Each section in the document is marked up using appropriate styles such as h1 and h2, so the document has a correct hierarchy of sections and page numbers in precisely the same manner as DAISY. ePub uses proper and complete markup for text and tabular data. Images, and content embedded in images, which are not accessible to the vision-impaired, have a description, caption or alt text. ePub borrows heavily from the DAISY standards and the W3C and Web Accessibility Initiative (WAI) specifications, to the point where the ePub and DAISY standards organisations are working towards merging the two very similar standards in the near future. All of the features in ePub3 reading systems have been part of DAISY readers since version.
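The requirement for a correct hierarchy of sections can be checked mechanically. The sketch below is an illustration rather than part of any ePub toolchain: it assumes the content is available as HTML-style markup with h1 to h6 headings, collects the heading levels with Python's standard html.parser module, and reports any place where a level is skipped (for example an h3 directly after an h1). The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Record the level of every h1-h6 start tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_levels(markup):
    """Return (position, previous_level, level) for each heading that skips a level."""
    collector = HeadingCollector()
    collector.feed(markup)
    problems = []
    previous = 0  # before the first heading, only h1 is acceptable
    for position, level in enumerate(collector.levels):
        if level > previous + 1:
            problems.append((position, previous, level))
        previous = level
    return problems


good = "<h1>Book</h1><h2>Chapter</h2><h3>Section</h3><h2>Chapter</h2>"
bad = "<h1>Book</h1><h3>Section</h3>"
print(skipped_levels(good))
print(skipped_levels(bad))
```

A document that passes this check can be navigated level by level by a reading system; a skipped level leaves a gap in the structure that a listener cannot see and may not expect.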
2.6 Documents Accessibility Issues
Most print disability organisations use various forms of software to produce copies in accessible formats. Some organisations may ask the publisher for a digital file of the work, or may use other methods for getting a digital file, such as scanning the text. Publishers in Australia are not legally obliged to supply digital files (Australian Copyright Council, 2007). Access to the digital format greatly reduces the time and expense of converting a text into Braille or other alternative formats. Where a digital file has been provided in an image format such as scanned PDF or Quark (Barrett, 2007), the print disability organisation must extract the text using specialised software. This process generates errors such as incorporation of page numbers into text, substitution of letters and words, and displacement of sections of text. Therefore, the editing process required after text extraction consists of:
Error corrections;
Checking for correct reading order;
Writing a text content description of an illustration or diagram;
Incorporation of text and visual cues; and
Formatting.
Thus, the editing task is time consuming and needs an editor to work on each file. This task is sometimes outsourced to specialist contractors for subjects such as mathematics and science. Where symbols are used, editors must have detailed knowledge of the subject. In such a case, the original file produced by the publisher and the edited file delivered to vision-impaired individuals by a print disability organisation usually look quite different. In some cases, the results are not visually attractive, are hard to use even for sighted people, and are unreadable by blind or partially sighted people (Vystrcil et al., 2011).
2.7 Categories of Electronic Documents Accessibility
In terms of accessibility to assistive technology, electronic documents have been divided into three categories:
Accessible and navigable due to containing mark-up tags, such as DAISY, ePub, HTML and tagged PDF. Information that was previously unavailable to blind people, such as newspapers, encyclopedias or telephone directories, now becomes accessible on the internet using HTML (European Blind Union, 2011). The barrier of access to journals has been significantly reduced with the immediate online availability of journals in electronic formats like HTML and PDF files, which can be read by screen output programs.
Accessible due to being text or text convertible but not navigable, such as plain text and structured untagged PDF documents.
Not accessible, nor navigable, requiring image processing and/or OCR to extract text such as images and scanned PDF documents.
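The three categories above depend on two properties of a document: whether it carries real text, and whether that text is marked up with tags. As a minimal sketch, the decision can be written as a small function. The two boolean inputs are assumed to be known already (detecting them for a real file is a separate problem), and the names used are illustrative.

```python
from enum import Enum


class Category(Enum):
    NAVIGABLE = "accessible and navigable"
    TEXT_ONLY = "accessible but not navigable"
    IMAGE_ONLY = "neither accessible nor navigable"


def categorise(has_text, has_markup):
    """Place a document in one of the three accessibility categories."""
    if not has_text:
        # Only images of text: image processing and/or OCR is needed first.
        return Category.IMAGE_ONLY
    if has_markup:
        return Category.NAVIGABLE
    return Category.TEXT_ONLY


examples = {
    "tagged PDF": (True, True),
    "plain text": (True, False),
    "scanned PDF": (False, False),
}
for name, flags in examples.items():
    print(name, "->", categorise(*flags).value)
```

The sketch treats markup without any real text as inaccessible, since tags alone give assistive technology nothing to read.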
2.8 Portable Document Format (PDF)
PDF is a common way for organisations to publish documents; it is the most used electronic format for online presentations and the most popular file format after HTML (Harris, 2001). PDF documents preserve the fonts, images, graphics and layout of any source document and are ideal for printing exactly as the author intended. Text information, pictures and signatures can be scanned into a PDF and easily emailed to recipients. Upon the document's arrival, the receiver can open and view it using a vast array of different PDF viewing applications such as Adobe Reader and Apple Preview. As one of the most common digital document formats, PDF has historically been a major challenge for the vision-impaired community. A lack of standards and the growing variety of PDF export programs that do not support tagging have created major accessibility issues for this community (Wild, 2010). PDF content is presented to assistive technology as a textual representation of the document. Accessibility of PDF documents by assistive technologies depends on the manufacturers of those technologies incorporating PDF support into their products. Several manufacturers have done this with recent versions of their products, but for the many users of earlier versions of the technology, PDF will remain inaccessible (Hudson, 2004). Secured PDF is another accessibility issue. During PDF creation, some authors add restrictions to prevent users from printing, copying, extracting, commenting or editing text, and these restrictions can interfere with a screen reader's ability to convert text to speech. Digital Rights Management (DRM) technology that publishers use to control or restrict digital media content on electronic devices can have a negative impact on access by users with vision impairment. As a result, documents may not automatically be accessible to screen readers and may require conversion tools (Adobe Systems Incorporate, 2008).
In spite of recent changes by Adobe Acrobat in order to optimize accessibility of the secured PDF for assistive technology, it seems many are still inaccessible. Content authors are rarely familiar with the requirements of publishing with successful accessibility and have little reason to learn (Johnson, 2004).
2.9 Categories of PDF Documents Accessibility
2.9.1 Scanned PDF
Scanned PDF documents are the least accessible type of document for users of assistive technology. A scanned PDF document is an image file and contains images of text, not the real text. Text on the page is not searchable or selectable. To the assistive technology user, the document appears completely blank: tags are not available and images are not identified through alternative text (ALT-text), which conveys the same essential information as the image. Although the page can be viewed with a PDF viewer, screen readers cannot recognise the content. Since the scanned PDF is an image format, it is inaccessible to assistive technologies such as a screen reader, which reads plain text. Therefore, information retrieval requires Optical Character Recognition (OCR) (O'Brein, 2012). The OCR software processes the scanned PDF file and, through text extraction, generates an editable text document. To make a scanned document accessible, it must be converted from the image of the document into selectable and scalable text. This text document can then be edited, formatted, searched and indexed as well as translated or converted to speech. A problem that OCR software does not solve is the accurate regeneration of the full text layout. Text obtained from OCR comprises unexpected segments, and these segments may be out of order in terms of the expected document reading order (Bailey, 2005). Additionally, an image with a resolution of less than 75 dpi may not be converted accurately into text by OCR (Neal, 2011).
2.9.2 Structured PDF
Structured PDF is somewhat accessible and is best for documents without complex structure such as columns, tables, footers and side bars (Bailey, 2005). The document has no tags and images have no ALT-text. The columns, rows and headers of tables are not defined, and screen readers may read text out of order, skip or incorrectly interpret sections of text, or read tables across rather than by cell.
2.9.3 Tagged PDF
Tagged PDF is fully accessible to software that supports PDF interpretation. Documents are tagged using many elements of the tag structure to identify sections, divisions, captions, tables and images (Bailey, 2005). Tables are fully rendered using tags to identify columns, rows, headers and data content. Reading order is identified throughout the document. Tagging in PDF is designed to provide a structure similar to HTML. Tagged PDFs are created with built-in accessibility like HTML, not added on (Bohman, 2002). Adding tags creates a duplicate of the document that is marked up for accessibility. PDF tags provide a hidden structure and have no visible effect on the file. Structured PDF documents are made fully accessible and navigable for screen reader users by tagging certain elements within the document. Screen reader users can often understand a properly tagged PDF as well as an HTML document. HTML tags and PDF tags often use similar tag names and organisation structures. Improper PDF tagging, lack of tools and misunderstanding of usability cause these electronic documents to be inaccessible for users with vision impairments (Xenos Group Inc, 2010).
2.9.4 PDF/Universal Accessibility
The PDF/UA standard defines technical requirements for universally accessible PDF documents by identifying a set of relevant PDF functions (including text content, images, form fields, comments, bookmarks and metadata) based on ISO 32000-1 (PDF 1.7) and specifies how they must be used in PDF/UA-compliant documents. It does not address elements which have no direct impact on accessibility, such as the compression algorithms used for image data (Drümmer & Chang, 2012). However, as soon as a PDF/UA-compliant file is converted to a scanned PDF, its accessibility features no longer work.
In addition, for some components such as mathematical expressions and chemical equations, PDF/UA needs to be equipped with sets of MathML and ChemML. Otherwise the results of assistive technology have usability issues due to the multidimensional and non-linear nature of these components.
2.10 Components of PDF Documents
2.10.1 Accessibility of Non-Textual Components
PDF documents may contain textual and non-textual components, tables, mathematical expressions, chemical equations and non-alphanumeric symbols. Non-textual or graphical components such as figures, charts, diagrams, and graphs enable readers to easily acquire the nature of the underlying information (Lin et al., 2012). The use of non-textual graphical information such as line graphs, bar charts, and pie charts is rapidly increasing in digital scientific literature and business reports. These graphical components are commonly used to present data in an easy-to-interpret way. Graphs are frequently used in economics, mathematics and other scientific subjects. A vast amount of science, technology, engineering and mathematics (STEM) information is usually presented visually. Illustrations are often easier to understand for sighted people. These graphics are widely used in newspapers, text books, web pages, metro maps and instruction manuals. They provide significant cognitive benefits over text. These graphical components have an important role in conveying, clarifying and simplifying information (McCathieNevile & Koivunen, 2000). The majority of information graphics that appear in formal reports, newspapers and magazines are intended to convey a message or communicative intention (Elzer et al., 2007). Traditionally, charts are used to display trends and relationships, communicate processes or display complicated data simply. These charts may be designed for experts and trained users for data visualization, or for popular media without complicated scientific reasoning (Greenbacker et al., 2011).
Charts are typically intended to convey a message that is an important part of the document, and this information is generally not repeated in the article (Carberry et al., 2006). Since these components are inaccessible, data visualization techniques are not useful for vision-impaired users, who miss all the information conveyed in images. A partially sighted person sees the author's schema or flow chart but cannot decipher the labels. A colour blind person sees a pie chart but will not understand it if only colour is used to indicate each section. Students and professionals in the STEM fields who are blind or have low vision must find other ways to access this data. In many cases, they still rely on sighted people to read and describe images for them. This creates dependency, which can be inefficient and time consuming. They are unable to see and understand these graphical parts and so lose important parts of the information. They are unable to learn about the processes involved in reading, analysing, and interpreting information presented in graphs and charts, which are frequently used in mathematics and scientific materials to present and summarize data. It is fair to say that lack of access to diagrams and other graphical content significantly limits educational and workplace opportunities for people with vision impairment. This is in contrast with textual content, to which assistive technologies have improved access (Diagram2012, 2012). To be accessible, graphical components require a complete equivalent in text, such as a short description or a text alternative.
2.10.2 Approaches for Accessibility of Non-Textual Components
Scanners with interaction-compatible OCR software can be used to read printed materials and store them electronically on computers for later access. Such systems provide independent access to abstracts, journals, syllabuses and homework assignments; however, many OCR reading machine packages are not able to convert technical information like chemical and mathematical equations into text and are not capable of providing verbal descriptions of pictures and other graphical information (Miner et al., 2001).
There are several approaches to address the accessibility of charts using alternative methods. Many projects have attempted to make graphic components accessible to vision-impaired users by reproducing the image in an alternative medium, such as audio (Meijer, 1992), touch (Ina, 1996) or a combination of the two (Ramloll et al., 2000; Roth et al., 2001).
Graphics like bar and line graphs can be printed in raised dots or Braille. Traditionally, graphs and diagrams are presented in Braille, or as raised dots and lines on swell paper (Yu & Brewster, 2002). Tactile graphics are images that use raised surfaces which vision-impaired users can feel. They are used to convey non-textual information such as maps, paintings, graphs and diagrams. Picture in a Flash (PIAF) is one method of producing tactile graphs. The general shape of the graph can be understood by touching it carefully, but hardware is needed to generate tactile charts (Goncu & Marriott, 2008). Discriminability and searchability are the main limiting factors of this method and must be considered when designing tactile symbols for charts (Watanabe et al., 2012). Tactile symbols without these properties do not help vision-impaired people to explore the concepts in charts. Several problems are associated with Braille and tactile techniques:
The cost of translating into an accessible graphics format, the use of expensive tactile graphics and peripheral devices, and the lack of congruence with the original visual graphic limit the use of tactile graphs (Diagram 2012, 2012).
Only a small proportion of blind people can use Braille, because reading it requires sufficient tactile sensitivity, which not all vision-impaired people have (Murray, 2008).
Blind people can only get a rough idea or estimation about the content (Yu & Brewster, 2002).
Tactile diagrams are not durable.
It is not easy to make changes to tactile diagrams.
Non-textual components can be made accessible to the vision-impaired through verbal description and audio formats. Viable alternatives include generation of a tactile graph, delivering the information in text, interaction with an audio graph, or a combined tactile and audio approach (NCAM, 2009). When describing charts or figures within the text, the reader should state the figure and caption numbers before starting a verbal description of the image. After completing a verbal description of the image, the reader should return to the text. These techniques help to express graphical data in non-visual ways.
Haptic feedback is a tactile feedback technology which recreates the sense of touch by applying forces, vibrations, or motions to the user. This technology is useful for guiding and assisting the user's navigation on the graph, but it is not efficient for presenting exact data values to the user. Figure 2.1 shows a haptic PHANToM device.
Figure 2.1: The PHANToM (Murray, 2008)
Due to nature of human touch, conveying large amounts of information through the touch channel is dicult and the narrow bandwidth can be eas-ily overloaded. As a result haptic and tactile signals could be used to represent variables that don't change frequently but require attention. In addition it may take users some time to familiarize themselves with the new interface. The lim-itations of force feedback devices hinder users' exploration of the graphs (Yu & Brewster, 2002).
Sonification is conveying data via sound pitch and 2-dimensional acoustics (Brown & Brewster, 2003). It is difficult to convey data accurately with the acoustic method (non-speech sound), and moreover, since acoustics are volatile, information can easily be misheard.
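The idea of mapping data to sound pitch can be illustrated with a minimal Python sketch. This is not the method of Brown & Brewster (2003); the function names, the 220 to 880 Hz range and the logarithmic interpolation are assumptions chosen only for illustration.

```python
def value_to_frequency(value, lo, hi, f_min=220.0, f_max=880.0):
    """Map a data value in [lo, hi] to a frequency in [f_min, f_max] Hz.

    The interpolation is logarithmic (an assumption), because perceived
    pitch is roughly logarithmic in frequency.
    """
    if hi == lo:
        return f_min                      # flat series: use a single pitch
    t = (value - lo) / (hi - lo)          # normalised position in 0..1
    return f_min * (f_max / f_min) ** t


def sonify(series):
    """Return one frequency (Hz) per data point of a numeric series."""
    lo, hi = min(series), max(series)
    return [value_to_frequency(v, lo, hi) for v in series]


# Hypothetical data series: the lowest value gets the lowest pitch.
frequencies = sonify([0, 5, 10])
```

The sketch also makes the stated weakness visible: a listener hears only relative pitch, so recovering the exact value behind a given tone is hard.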
Research by McMullen and Fitzpatrick, which aimed to make talking tactile diagrams viable as a method of delivering graphical material to visually impaired students at a distance, explained the merits and deficiencies of the system described and also provided psychological observations into how blind learners approach tactile diagrams and the cognitive processes that are used in their comprehension (McMullen & Fitzpatrick, 2008).
The Interactive SIGHT (Summarizing Information GrapHics Textually) system provides the vision-impaired with the high-level knowledge that one would gain from viewing an information graphic. SIGHT uses image processing techniques to extract communicative signals from a chart (Elzer et al., 2007), but it is still limited to presenting information from bar charts within web pages. In this research the accessibility and usability of bar charts, line charts and pie charts have been reviewed.
Previous studies by Elzer et al. recommend a Visual Extraction Module (VEM) to provide chart accessibility. The VEM is responsible for analyzing the graphic's image file and producing an XML representation containing information such as the graphic type (bar chart, pie chart) and the textual pieces of the graphic (such as its caption). For a bar chart, the representation includes the number of bars in the graph, the labels of the axes, and information for each bar such as the label, the height of the bar, and the colour of the bar (Elzer et al., 2007).
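To make the kind of XML representation described above concrete, the following Python sketch builds one for a bar chart. The element and attribute names, the function name and the sample chart values are all hypothetical; the actual schema used by Elzer et al. is not given here.

```python
import xml.etree.ElementTree as ET


def bar_chart_to_xml(caption, x_label, y_label, bars):
    """Build an XML description of a bar chart.

    `bars` is a list of (label, height, colour) tuples. Every element and
    attribute name below is a hypothetical choice, not the VEM schema.
    """
    root = ET.Element("graphic", type="bar chart")
    ET.SubElement(root, "caption").text = caption
    axes = ET.SubElement(root, "axes")
    ET.SubElement(axes, "x").text = x_label
    ET.SubElement(axes, "y").text = y_label
    # Record the number of bars, then one element per bar.
    bars_el = ET.SubElement(root, "bars", count=str(len(bars)))
    for label, height, colour in bars:
        ET.SubElement(bars_el, "bar", label=label,
                      height=str(height), colour=colour)
    return root


# Made-up sample chart, for illustration only.
chart = bar_chart_to_xml(
    "Enrolments by year", "Year", "Students",
    [("2010", 120, "blue"), ("2011", 150, "red")],
)
xml_text = ET.tostring(chart, encoding="unicode")
```

A representation of this kind can then be traversed by a text or speech generator without any further image processing.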
2.10.3 Graphical Components Accessibility Requirements
Providing alternative access to content is one of the primary ways that authors can make their documents accessible to people with disabilities. Text equivalents are always required for graphical information. Alternative content should serve the same function for users with disabilities as the primary content does for users without disabilities.
The factors to focus on when creating alternatives for graphical components are:
The purposes for using graphs, task characteristics and discipline characteristics;
The differences between presenting information visually and aurally;
How and what graphical parts are represented in the mind (Friel et al., 2001);
The problems of non-visual exploration of graphical components;
Accurate understanding of the ways in which non-textual parts benefit sighted people (Brown et al., 2013; Brown, 2007); and
Obtaining comprehensive information and advanced techniques for graphical understanding (Huang & Tan, 2007).
2.10.4 Alternate Access Methods for Visual Printed Material
Considering the learning styles of the vision-impaired, alternate access methods for representing visual printed material to the vision-impaired are divided into two categories: the tactile method and the audio method.
Tactile representation is active or dynamic. The graphs and diagrams are presented in Braille or as raised dots and lines on swell paper (Yu et al., 2001).
Tactile graphics are images that use raised surfaces which a vision-impaired person can feel. They are used to convey non-textual information such as maps, paintings, graphs and diagrams, and the user can explore the graphic (Cohen et al., 2006).
Audio representation is passive or static, and the user is presented with a representation of the entire visual part at one time with limited user input (Conrod, 1996).
2.10.5 Mathematical Expressions
Mathematical expressions are one of the most significant components within scientific and engineering PDF documents. Students with vision impairment encounter barriers in studying mathematics, particularly at higher education levels (Murray, 2008). Accessing and doing mathematics is one of the biggest obstacles for them in school and at university. The lack of easy access to mathematical resources is a barrier to higher education for many vision-impaired students and puts them at an unfair disadvantage in school, academia, and industry (Jayant, 2006). Results from the National Assessment of Educational Progress show that there is a great disparity between the mathematical skills of students with disabilities and students without disabilities (Noble, 2008). Students with print disabilities must be offered an equal chance with sighted students in mathematics subjects. For students who are blind or vision-impaired, providing an alternative text-only format from PDF documents is very important. They can easily access text format with screen-reader software that converts text into audio, but making mathematics accessible to vision-impaired users is a complicated process. Traditionally, a specialist checks the accuracy of a mathematical description with the aid of a Braille reader, which is a time-consuming process.
Some PDF documents containing mathematical expressions can be accessed via standard assistive technology. Although screen readers convert mathematical expressions into audio, the result cannot convey the conceptual meaning of the original mathematical expressions. The language of mathematics is not purely descriptive and sequential. In most cases, making a text description can cause ambiguity.
A screen reader reads mathematical documents in a linear way from left to right, but due to the multidimensional nature of some materials, such as mathematical formulae containing subscripts and superscripts, the result would be ambiguous. As can be observed from the following mathematical expressions (Figure 2.2), there are three different expressions that a screen reader reads from left to right in a linear manner, producing one result for all of them, such as:
Figure 2.2: 2-Dimensional mathematical equations contain subscripts and superscripts
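The ambiguity of a purely linear reading can be demonstrated with a short Python sketch. The three expressions used here are hypothetical stand-ins and are not necessarily those shown in Figure 2.2; the tuple-based tree encoding and the spoken markers "superscript" and "subscript" are likewise assumptions made for illustration.

```python
def linear_reading(node):
    """Read an expression tree left to right, ignoring 2-D structure."""
    if isinstance(node, str):
        return [node]
    _kind, *children = node               # kind is "seq", "sup" or "sub"
    tokens = []
    for child in children:
        tokens.extend(linear_reading(child))
    return tokens


def structured_reading(node):
    """Read the same tree, but announce where each script starts and ends."""
    if isinstance(node, str):
        return [node]
    kind, *children = node
    if kind == "seq":                     # plain left-to-right sequence
        tokens = []
        for child in children:
            tokens.extend(structured_reading(child))
        return tokens
    base, script = children               # "sup" or "sub" node
    marker = {"sup": "superscript", "sub": "subscript"}[kind]
    return (structured_reading(base) + [marker]
            + structured_reading(script) + ["end " + marker])


# Three different (hypothetical) expressions.
expr_a = ("seq", ("sup", "x", "2"), "plus", "1")   # x^2 + 1
expr_b = ("seq", ("sub", "x", "2"), "plus", "1")   # x_2 + 1
expr_c = ("sup", "x", ("seq", "2", "plus", "1"))   # x^(2+1)

linear = [" ".join(linear_reading(e)) for e in (expr_a, expr_b, expr_c)]
structured = [" ".join(structured_reading(e)) for e in (expr_a, expr_b, expr_c)]
```

In this sketch all three entries of `linear` are the same string, "x 2 plus 1", whereas the three entries of `structured` differ from one another.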
Reading and writing mathematics is inherently different from reading and writing text. Mathematics can even be considered a language of its own (Karshmer et al., 1999).
Presenting mathematical formulae in an accessible form is very complex. If mathematical equations contain fraction bars, to prevent ambiguity, it is important to indicate the numerator and the denominator and be clear about the quantities being multiplied, divided, added, or subtracted (Karshmer & Bledsoe, 2002). Students with vision impairments can learn mathematics when they have access to the proper combination of computer hardware, software and other assistive technologies.
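As a minimal sketch of how a numerator and denominator might be indicated explicitly, the Python function below verbalises a small expression tree. The phrases "start fraction", "over" and "end fraction", the tuple encoding and the function name are assumptions for illustration, not a rule set taken from the cited work.

```python
def speak(node):
    """Verbalise an expression tree, marking fraction boundaries explicitly."""
    if isinstance(node, str):
        return node
    op, left, right = node
    if op == "frac":
        # Announce where the fraction starts and ends so that the extent of
        # the numerator and denominator is unambiguous to the listener.
        return f"start fraction {speak(left)} over {speak(right)} end fraction"
    words = {"+": "plus", "-": "minus", "*": "times"}
    return f"{speak(left)} {words[op]} {speak(right)}"


# Two hypothetical expressions that a naive reading would both render
# as "a plus b over c".
whole_sum_over_c = ("frac", ("+", "a", "b"), "c")    # (a + b) / c
a_plus_fraction = ("+", "a", ("frac", "b", "c"))     # a + (b / c)

spoken_1 = speak(whole_sum_over_c)
spoken_2 = speak(a_plus_fraction)
```

Here `spoken_1` is "start fraction a plus b over c end fraction" and `spoken_2` is "a plus start fraction b over c end fraction", so the listener can tell which quantities are being divided.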
In contrast with image-based mathematical expressions there is MathJax which:
is an open-source JavaS