Quantifying scribal behavior : a novel approach to digital paleography

(1)

A NOVEL APPROACH TO DIGITAL PALEOGRAPHY

Vinodh Rajan Sampath

A Thesis Submitted for the Degree of PhD at the

University of St Andrews

2016

Full metadata for this item is available in St Andrews Research Repository

at:

http://research-repository.st-andrews.ac.uk/

Please use this identifier to cite or link to this item: http://hdl.handle.net/10023/9429

This item is protected by original copyright

(2)

A Novel Approach to Digital

Paleography

by

Vinodh Rajan Sampath

This thesis is submitted to the

U

niversity of

S

t

A

ndrews

in partial fulfilment for the degree of

D

octor of

P

hilosophy

submitted on

August 2016

(3)

(4)

We propose a novel approach for analyzing scribal beha-vior quantitatively using information about the handwriting of characters. To implement this approach, we develop a computational framework that recovers this information and decomposes the characters into primitives (called strokes) to create a hierarchically structured representation. We then pro-pose a number of intuitive metrics quantifying various facets of scribal behavior, which are derived from the recovered in-formation and character structure. We further propose the use of techniques modeling the generation of handwriting to directly study the changes in writing behavior.

(5)

(6)

proximately 33, 000 words in length, has been written by me, and that it is the record of work carried out by me, or principally by myself in collaboration with others as acknowledged, and that it has not been submitted in any previous application for a higher degree.

I was admitted as a research student in September 2012, and as a candid-ate for the degree of Doctor of Philosophy in May 2016; the higher study for which this is a record was carried out in the University of St Andrews between 2012 and 2016.

Date . . . Signature of Candidate . . . .

Supervisor’s Declaration

I hereby certify that the candidate has fulfilled the conditions of the Resolution and Regulations appropriate for the degree of Doctor of Philosophy in the University of St Andrews and that the candidate is qualified to submit this thesis in application for that degree.

Date . . . Signature of Supervisor . . . .

Permission for publication

(7)

requested below, and that the library has the right to migrate my thesis into new electronic forms as required to ensure continued access to the thesis. I have obtained any third-party copyright permissions that may be required in order to allow such access and migration, or have requested the appropriate embargo below.

The following is an agreed request by candidate and supervisor regarding the publication of this thesis:

No embargo on any electronic nor print copy.

Date . . . Signature of Candidate . . . .

(8)

From the inscriptions of king A´soka, around 3rd century BCE

This thesis is dedicated to those who carved these edicts and helped spread writing all over the Indian subcontinent two millennia ago

1_{This is an extremely overloaded term that is usually translated as either law or}

(9)

(10)

My journey towards this thesis has been long and moly pleasant but not without the usual highs and lows that come as part and parcel of the journey itself. This document wouldn’t have been made possible if not for the following people who supported me along the way. Firand foremowould be my supervisor Dr Mark-Jan Nederhof. He has been an excellent guide and a wonderful mentor hugely supporting my research and providing conruive critiques when necessary. At times, he was the one who lent an ear when I ranted about my research. The entire thesis and my research reached the current shape because of his conant guidance. Given my background as an engineer used to building things, Dr Nederhof had a major part in mentoring me to be a researcher.

I also should mention a few others without whom I would not have

arted this PhD. Dr Jean-Luc Chevillard (CNRS, Université Paris-Diderot) was the one who initially prodded me into doing a PhD and has been a gentle guide and mentor all along. Dr Shriramana Sharma (Nerur, India), a long time friend and collaborator, supported me a lot with my PhD ap-plication. . My mate Prem Krishnan provided me with the much needed academic papers at unearthly hours while I was writing my PhD proposal.

SICSA provided the necessary funding for this research through their Prize PhD Studentship. I am grateful to my second supervisor Dr Aaron

(11)

er-and focus my research. The school adminiration team, especially Ms Judi Robertson, Ms Julie Dunsire and Mr Alex Bain, helped with all the day-to-day needs and formalities of my research and conference travel.

Dr Réjean Plamondon (École Polytechnique de Montréal) provided the software that implemented the Sigma-Lognormal model. Thanks to invitations from Dr Eva Wilden (CSMC, University of Hamburg), I was able to visit the University of Hamburg twice to present my work and exchange fruitful discussions at the Centre for the Study of Manuscript Cultures. I was also able to visit the University of Cambridge thanks to Dr Vincenzo Vergiani, who kindly invited me to discuss my research and provided valuable manuscript data.

I also utilize this space for everyone who made my journey towards my PhD a lot more intereing and enjoyable. My officemates Shyam Reyal and Martin McCaffery provided a conducive environment and the much needed diraion (and occasional support) for my research.Merci

& to you both! I would like to mention my Scottish Gaelic teacher Mòrag, with whom Iudied Gaelic for two years, for providing a refresh-ing break from my research. Tapadh Leibh, a Mhòrag! Bu mhòr a chòrd na

clasichean agaibh rium.

I am indebted to my parents, my uncle and aunt, and in fa to all of my extended family, without whose support I wouldn’t be here.

. I am also thankful to my close friends in Scotland who have supported and been with me when I was going through a very tough phase in my life.

I guess I kept the laspot for the be. My boyfriend (and fellow PhD

udent) Juan José Mendoza Santana was with me all through the writing

age and coped up with my terrible mood swings. Without his conant support, de-ressing and motivation I would not have made through the writing phase sane. Eres el mejor. Sólo quiero decirte que eoy muy feliz

por haberte encontrado en el momento perfeo y haber entrado en mi vida.

Muchas gracias por ear siempre conmigo y apoyarme, cariño mío. ¡Sin ti,

earía perdido! Sólo desearía poder pasar el reo de mi vida junto a ti in

(12)

Contents ix

1 Introduction 1

1.1 Motivation . . . 1

1.2 Hypothesis . . . 6

1.3 Research Questions . . . 6

1.4 Contributions . . . 7

1.5 Limitations . . . 7

1.6 Publications . . . 8

1.7 Structure . . . 8

2 Background and Related Work 11 2.1 Digital Paleography . . . 11

2.2 Character Features . . . 15

2.3 Handwriting Models . . . 19

2.4 Summary . . . 23

3 Quantitative Analysis of Scribal Handwriting: Methods and Metrics 25 3.1 Kinematics of Handwriting . . . 25

3.2 Digital Paleographic Framework for Quantitative Analysis of Handwriting . . . 28

(13)

3.4 Summary . . . 76

4 Case Study: Analysis of the Development of Indic Scripts 79 4.1 Data Set . . . 80

4.2 Quantitative Analysis of Metrics . . . 84

4.3 Handwriting Modeling of Indic Script Development . . . . 105

4.4 Open Data . . . 118

4.5 Summary . . . 119

5 Evaluation 121 5.1 Participants . . . 121

5.2 Evaluation Design . . . 123

5.3 Evaluation of the Framework . . . 125

5.4 Evaluation of Metrics . . . 135

5.5 Evaluation of Quantitative Analysis . . . 140

5.6 Summary . . . 145

6 Conclusion and Future Work 147 6.1 Limitations . . . 149

6.2 Future Work . . . 150

A University Ethics Approval for User Evaluation Study 153

B Questionnaire for User Evaluation Study 155

References 193

List of Tables 205

(14)

Introduction

This invention [Writing], O king... will make the Egyptians wiser and will improve their memories; for it is an elixir of memory and wisdom that I have discovered.

Thoth, Egyptian god of writing to Thamos, king of Egypt1

easily1

1.1 Motivation

P

aleography as a discipline concerning historical handwriting

probably dates back to the times of the Benedictine monks of St Maur, when they started cataloging handwritten manuscripts three centuries ago (Aussems & Brink, 2009). Ever since, the discipline has firmly established itself as an integral part within the broader area of history. Paleography as a field has evolved substantially in these three hundred odd years. There were multiple attempts to move it from the

realm of subjectivity2to objectivity3, as progressive efforts were made to

1_From_{Plato’s Phaedrus}_{(Fowler, 1925)}

2_{This denotes here overt dependence on personal internal opinions and intuitions}

that cannot be explicitly stated and substantiated through external facts and evidences.

3_{This denotes here dependence on external facts and evidences. True objectivity is}

(15)

introduce methodological approaches to paleographic analyses starting from the middle of the twentieth century such as Mallon (1952) and

Gilissen (1973). The term Digital Paleographywas originally coined only

in 2005 (Ciula, 2005), to describe an innovative approach that applied computational methods to paleography. The very late rise of digital methods is not surprising as the discipline was long held to be an inexact science; an art of appreciating aesthetics that defies objectification and thus quantification. Not surprisingly, quantitative methods are still seen with skepticism, and even now the trained eye of a paleographer is generally trusted more. As such, Digital Paleography does not yet have a wider adoption into the mainstream field of paleography.

On a wider scale, scripts are often seen as simple carriers of languages and usually relegated to an auxiliary role. Research on scripts until recently has been minimal and niche, except for the field of paleography. They are an important part of the cultural heritage of humanity and their analysis and study require more research. Fortunately, there is a growing interest in the analysis of scripts per se. Altmann and Fengxiang

(2008) published a volume titledAnalyses of Scripts: Properties of Characters

and Writing Systems to explore various properties such as complexity, ornamentality and distinctivity. Changizi et al. (2006) discuss the various contour configurations of written symbols and their similarity to the en-vironment in which they were produced. They also study the distribution of the configurations of various scripts. Changizi and Shimojo (2005) further discuss complexity of characters and the redundancy of stroke combinations of various major writing systems across history. It is to be noted that analyses by Changizi and most methods described in Altmann and Fengxiang (2008) are performed manually. Traditionally, analysis and study in paleography have also been done manually. As mentioned earlier, digital paleographic methods are at present making more inroads into the field. However, applying quantitative analysis on paleographic data is not yet popular or standardized (Stokes, 2009a). Paleographers rarely use such approaches and tend more towards semantic approaches.

One of the reasons is that these systems are seen as black boxes, where

(16)

We think this is partially due to the difficulty of quantifying script related features, and partially due to the lack of defined methods and metrics with theoretical and qualitative underpinnings. Hence, there is a distinct need for metrics and methods, which can be used in quantitative analysis but still have sound semantic and qualitative interpretations. This helps the researcher to perform the analysis efficiently and at the same time have a better understanding of the methods and metrics that are being employed. Such an approach must also lend itself to the application of computational techniques, which would enable users to perform analyses with much more ease. Thus, a well grounded computational approach would be extremely helpful in encouraging analysis of scripts, within and outside the context of paleography.

Paleography is traditionally associated with the classification of letters, which usually takes the form of assigning provenance to a scribal artifact. One of the most important tasks of a paleographer is to assign a scribe, a geographical location or a time period to a piece of handwritten arti-fact. This is usually expressed as detection of (scribal) hands. Stansbury

(2009) classifies paleography into two different approaches - theLinnaean

approach and the Darwinianapproach. The former is more focused on

classification as we just saw and the latter focuses on explaining "the ways that scribes created and modified scripts". We could call this approach

Descriptive Paleography. It can explain fundamental paleographic phenom-ena such as "the evolution of scripts and their relationships to each other" as well as "looking for mechanisms to explain these phenomena".

Jean Mallon was a pioneer in what Stansbury refers to as Darwinian paleography. Mallon et al. (1939) attempts to explain the evolution of Latin minuscule letters from the corresponding majuscule letters. While elaborating the physical processes that drive the evolution, he noted "This evolution of the majuscule to minuscule proceeds under the influence of the muscles of the hand, which always traces the strokes of the capital in the same order, then over time joins them, rounding and simplifying

them under the control of the eye"4 (Mallon, 1937). Also, Peignot (1937)

4_{cette évolution de la capitale à la minuscule se déroule sous l’influence des muscles}

(17)

while discussing the same evolution process observed that some lower case letters retain the form of their capital letters "only because these simple forms were easily written, and that scribe’s hands did not feel

the need to simplify"5. It can be noticed that even in the early 20th

century, paleographers were interested in understanding the physical processes that produce handwriting and used it to explain the evolution of characters from their perspective.

Figure 1.1: Evolution of Latin minuscule from majuscule as derived by Mallon (pointypo, 2013)

Blanchard (1999), following Mallon, explains the evolution of lower case Greek letters in terms of changes in handwriting and stroke behavior.

He formulates a concept called The Unit Ductus6 that is "unchanged

across the ages". He notices the various changes that occur to characters’ strokes such as rounding of corners, change in angles, merging of strokes, etc. He uses them along with the reconstructed handwriting motion to explain the evolution of minuscule Greek letters from their corresponding majuscule letters.

Figure 1.2: Evolution of Greek minuscule letter ’α’ from majuscule ’A’

(Blanchard, 1999, 8)

5_{c’est uniquement parce que ces formes simples s’écrivaient facilement et que la}

main des scribes n’a pas éprouvé le besoin de les simplifier.

(18)

Figure 1.3: Evolution of Greek minuscule letter ’ϑ’ from majuscule ’Θ’

and minuscule ’ϕ’ from majuscule ’Φ’ (Blanchard, 1999, 20)

Characters in paleography (and outside of it) have for the most part been analyzed as inanimate forms. But in fact, characters inherently

contain information regarding movements of the writing implement (i.e.

Ductus), which animates them and provides profound information about their formation and therefore about scribes. Canart (2006) has also argued that common paleographic procedures only allows us to see the static aspect and we could use appropriate descriptions that can serve as clues to the dynamic aspect of writing. While focusing on the movements of implements may not appear to be outright applicable to the Linnaean approach, it is fundamental for the Darwinian approach as seen above. In fact, Stansbury (2009) also notices the lack of handwriting-motion based analysis in digital paleography and says:

"[...] manuscripts have the potential to deliver up a vast quantity of information about how scribes wrote. Perhaps the automated analysis of script will soon turn its attention to reconstructing the motion of the scribe’s pen on the page and [...] explore the ways that these strokes evolved. It is then that we will begin to be able to measure the scribe’s art."

This thesis is an attempt to measure the art of a scribeby proposing

(19)

also in much more detail. Stansbury (2009) correctly notes that the Darwinian approach may have the "most interesting collaboration with technology". In a similar manner, Aussems and Brink (2009) additionally point out that an inter-disciplinary approach could lead to ways that change "the way we look at scribal hands and medieval handwriting", moving away from paleography in its traditional form. In fact, this thesis focuses and expands on such a framework, which would provide metrics and methods that can be used to explain changes in strokes and the relationship between characters.

However, these approaches need not necessarily be seen as mutually

exclusive with the Linnaean approach. This attempt to quantifyandassist

descriptive paleography, while helping to understand scribal behavior, can also greatly aid in the classification of scripts. By focusing on the fundamental nature of these characters, we can propose methods that are more intuitive and semantic, which the current digital paleographic methods usually lack.

1.2 Hypothesis

The central hypothesis of this thesis is that information on the handwrit-ing of characters (either recovered or injected into paleographic data) provides insights into scribal behavior and assists in creating intuitive computational methods for quantitative digital paleographic analyses.

1.3 Research Questions

In order to validate the central hypothesis, we seek to answer the follow-ing research questions.

1. What is the computational framework required to extract handwrit-ing information present in characters and analyze them?

(20)

3. How can the recovered information be directly used to quantitat-ively study changes in scribal behavior?

1.4 Contributions

We make the following contributions through this thesis.

1. We propose a modular framework that performs recovery of charac-ters’ handwriting information and decomposes the characters into proper primitives suitable for digital paleographic analysis.

2. We propose a range of intuitive quantitative metrics (and accompa-nying statistical methods) that can quantify handwriting informa-tion contained in characters and be used to analyze scribal behavior.

3. We propose the use of the Sigma-Lognormal model of handwriting generation to quantitatively analyze shape changes (and hence scribal behavior) in scripts.

1.5 Limitations

Scripts can generally be divided intoconnected andunconnectedwriting.

In connected writing, such as cursive Roman, individual characters are conjoined, thereby reducing pen-lifts and making writing faster. This

form of connected writing is commonly referred to ascursivestyle (but

there are cursive styles such as Ancient Egyptian Hieratic where indi-vidual letters are not necessarily connected). The approach proposed in this thesis is ideally designed to work with writing styles that have uncon-nected characters. If the metrics are to be adapted for conuncon-nected writing, the writing would have to be segmented into individual characters as part of preprocessing and then analyzed. Alternatively, if the analysis is performed on the individual base graphemic (hence unconnected) set instead of character exemplars from manuscripts/epigraphs, the metrics can be readily applied to any script.

The methods described in this thesis may also not be ideal forblock

(21)

graphemes (radicals in Chinese and jamo in Hangul) are arranged in blocks to form characters. Even though a single block can be considered as a character for practical purposes, specific methods based on the ar-rangement of graphemes might be better suited for such block scripts.

1.6 Publications

Some material presented in this thesis has also appeared as the following peer-reviewed publications.

1. Rajan, V. (2016). Quantifying Scripts: Defining metrics of characters for quantitative and descriptive analysis. Digital Scholarship in the Humanities. [To Appear]

2. Rajan, V. (2015). How Handwriting Evolves: An Initial Quantitative Analysis of the Development of Indic Scripts. Proceedings of the 17th International Graphonomics Society Conference.

3. Rajan, V. (2014). Framework for Quantitative Analysis of Scripts. DH2014 Book of Abstracts, Digital Humanities 2014.

1.7 Structure

The structure of this thesis is outlined below.

Chapter 2discusses the background literature pertaining to the meth-ods employed in the thesis, along with the previous work that is related to our work. It also discusses the shortcomings of the previous approaches.

Chapter 3describes the methodology that is being proposed in this thesis. It describes a script analysis framework to analyze scripts using the recovered trajectory information of characters. It also details vari-ous metrics that are derived using the framework for quantitative and descriptive analysis. It further discusses how handwriting modeling techniques can be utilized in the context of analyzing shape change of characters.

(22)

illustrates the use of handwriting modeling to understand changes in Indic scribal behavior.

Chapter 5presents the evaluation of the framework, the metrics and some of the analyses of our case study.

Chapter 6summarizes the thesis and presents our conclusions. It also establishes the limitations of our work and proposes future work that needs to be done.

(23)

(24)

Background and Related Work

You have invented an elixir not of memory, but of remind-ing; and you offer your pupils the appearance of wisdom, not true wisdom, for they will read many things without instruction and will therefore seem to know many things [...] since they are not wise, but only appear wise.

Thamos to Thoth1

2.1 Digital Paleography

D

igitalPaleographyis a fairly recent term dating back to 2005,

as seen in section 1.1. Before we discuss it, we must first estab-lish the context for quantitative methods in paleography, which by extension forms the groundwork for the application of digital methods. Some early examples of which include Loew (1914) and Mallon (1952). Loew (1914) described in detail various criteria for dating and localizing such as abbreviations and variant letter forms. This was followed by the significant work of Mallon (1952), who proposed seven factors that must be considered while distinguishing scribal hands. More than a decade later, Meuthen and Prevenier (1968) made some additions to Mallon’s original list, which are generally considered redundant. However, they

(25)

found that some factors suggested by Mallon such as angle of writing could also be expressed quantitatively. But it was Gilissen (1973) who prominently proposed the methodological quantification of those factors. He defined methods of analysis and an investigative process for each of those methods. He also tried to apply the scientific method to paleo-graphy by creating formulas for some of the factors rather than relying on their verbose descriptions (see §2.2 for more details). As discussed previously in section 1.1, paleography has been mostly viewed as an art that is to be imparted through subjective analysis rather than a science that can be objectified (Gumbert, 1976; Costamagna et al., 1995; Pratesi, 1998). This lack of trust in objectifying paleography is also extended to the application of quantitative methods, which by design requires the objectification of methodology. Not surprisingly, there were reservations among paleographers to accept newly proposed quantitative methods in paleography. Poulle (1974) critiqued Gilissen’s methodology and was skeptic about the universality of the methods, noting several deficiencies in his criteria. However, he finally acknowledged the contribution as a turning point. A similar critique is provided in d’Haenens (1975). Around the same time, Bischoff and Koch (1979) correctly predicted that due to (advancements in) technical means paleography was on the path of be-coming an art of measurement from being an art of aesthetics (Stansbury, 2009). This indeed was to become true in the future. Derolez (2003) comments that the existing methods in paleography tend to be overtly subjective, often depending upon the authority of the author and the faith of the reader (Stokes, 2009a). He proposes replacing the qualitative

techniques with quantitative ones. Aussems (2006) coins the termScribal

Fingerprint to denote objective and quantifiable characteristics that are unique to a scribe. He also uses this to perform quantitative paleography on a medieval manuscript. He confirms that numerical techniques can indeed be very convenient by facilitating a quick but at the same time more accurate and objective analysis (Aussems & Brink, 2009).

(26)

characters, as this is relevant to the area of our interest in this thesis. Through these systems, we intend to showcase the growing need to have a combined quantitative and qualitative approach to analyze characters.

In terms of quantitative methods in digital paleography, A System

for Palaeographic Investigations (SPI) (Ciula, 2005, 2009; Aiolli & Ciula, 2009) was developed at the University of Pisa to help paleographers to

classify and identify scripts. In fact, the term Digital Paleography was

coined to describe this attempt. The system aims to provide quantitat-ive support to analyze unseen documents in the context of documents already processed by it. The system consists of several modules that perform the processing. The segmentation module allows users to extract individual characters from manuscripts. These extracts are then used by the analysis module to create, what they refer to as, tangent-based models for each character by essentially averaging them. Relationships between a new sample and data already existing in the paleographic database (through the constructed models) are also given by this module. The system additionally allows users to morph and visualize transform-ations of a character. Bulacu and Schomaker (2007b, 2007a) develop a

writer identification tool called theGroningen Automatic Writer

Identifica-tion System(GRAWIS) that uses probability distribution functions (PDFs) to characterize individual writers. Various PDFs were used to encode both textural and allographic features. Other similar work such as Bulacu et al. (2003) and Bulacu and Schomaker (2006) are also of interest. Stokes (2007) proposes an analysis of scribal hands through image-processing and data-mining. He extracts features based on pixel information to perform quantitative paleographic analyses. He selects five different fea-tures (extracted from forensic recognition such as Bulacu and Schomaker (2007b)) and tests their usefulness for studying medieval handwriting. The features are used for a clustering algorithm in an experiment trying to group related samples of handwriting, which is largely successful with few errors. He also suggests that tools for paleographic analysis must be used with caution but can be effective to supplement human

judgments. Stokes also releases a program called Hand Analyzer(Stokes,

2009b, 2009a) that follows the principles outlined in Stokes (2007). The

(27)

quantitative features to describe scribal hands, based on images. The files then can then be used for measuring statistical distances between two scribal hands. Azmi et al. (2011) perform digital paleographic analysis of Jawi manuscripts for classification of writing styles through features extracted from constructing scalene triangle blocks. Similarly, Soumya and Kumar (2014) attempt to classify ancient Indian epigraphs using random forests based on their time period. Similar approaches have also been performed on Greek inscriptions in Papaodysseus et al. (2010). There are several work such as Wolf et al. (2011) that invoke complex statistical processes to attempt the classification of characters. We do not enumerate them all.

Most of these systems do not aim at extracting information that is

eas-ily understood by paleographers. They useblack box like features, which

are not readily interpretable by them. Hassner et al. (2013) point out that "high-level terminology, natural to paleography, should be integrated into computerized paleographic systems". This is partially explored by

Brink et al. (2012), who propose a new feature called Quill. The Quill

feature uses the relation between the ink direction and the ink width to identify writers. The feature aims to be intuitive and easily explainable as it is directly derived through the modeling of trace production by a quill. Herzog et al. (2010) propose a completely autonomous system to extract strokes from historical scripts using Constrained Delaunay Triangulation (CDT), motivated by stroke extraction procedures in Chinese script. The extracted strokes are meant to be used as a proxy for shape features in the context of various paleographic analyses. However, they do not propose any further methods for analyzing the information. Also, the strokes are at a very high level and are not very effective in characterizing the handwriting process of the characters (which we are more interested in).

(28)

also be adopted). This approach mainly aims at character retrieval and search at different levels of character abstractions using string based attributes rather than proper quantitative analysis. In Levy et al. (2012), characters were manually described using a standard set of string-based

descriptors. These are then analyzed for the distinctive nature of their

appearance using Gene Set Enrichment Analysis (GESA). They show that by using appropriate statistical methods, it is possible to get meaningful insights in paleography. Stokes et al. (2014) present a script framework

calledDigipalthat attempts to provide clear and convincing

paleograph-ical descriptions by using a formal model for describing handwriting. This involves tagging the manually segmented characters with structured string descriptors. Rather than performing qualitative or quantitative analysis, Digipal aims to be an exploratory and also a pedagogical tool. While all these methods lean towards descriptive analysis, they are also very qualitative and do not invoke quantitative features very much.

We can see that descriptive systems are mostly not quantitative, and

quantitative systems are not necessarily descriptive. Also, the ductus

feature, which can be defined as the direction and order of strokes to produce a character, has not been given high priority (nor has it been the basis of analysis) in most of the systems discussed. Apart from a few systems that focus on descriptors for scribal hands, most of the systems do not focus on deriving a descriptive analysis of them. This is understandable as these systems are more interested in classification. Hence, there is a distinct need for a system that enables descriptive digital paleography based on quantitative analysis. In the next section, we look into both paleographic and non-paleographic features that were proposed to analyze characters.

2.2 Character Features

(29)

that must be considered when trying to distinguish between various scribal hands.

1. Form (morphology of letters)

2. Angle of writing (in relation to the base line)

3. Ductus

4. Modulus (proportions of letters)

5. Contrast (the difference in thickness between the hair lines and the shadow lines)

6. Writing support

7. Internal characteristics (nature of the text)

However, they are originally meant to be descriptors rather than quantified values. Others that follow tend to be similar either defining additional criteria or requiring more details. Many of these criteria are text-based and/or linguistic differentiators such as orthography, abbre-viation, punctuation, etc. As described earlier, it is Gilissen (1973) who prominently proposed quantifying features such as modulus and the angle of writing. There also have been several variants and improve-ments of Mallon’s fundamental differentiators such as M. P. Brown (1996), Rumble (1994) and T. J. Brown (1993). Burgers et al. (1995) using the ex-isting objective features of paleography and including some aspects from forensic analysis comes up with a methodology appropriately named as

Burger’s Methodology. The methodology is adapted (with minor modific-ation) to medieval manuscripts in Aussems (2006), which also contains a detailed analysis of each feature in it. He chooses eight features to analyze, out of which four are quantifiable. They are as follows: (i) Angle of inclination (ii) Angle of writing (iii) Modulus (iv) Degree and type of curvisation of connecting characters (Aussems & Brink, 2009). At this

point, we can comfortably invoke the term features, a term frequently

(30)

we eschew the term at least in the context of paleography (in the future chapters), it is frequently used in the context of computer science. A sig-nificant overview of major paleographic features that have been proposed over time can be seen in Stokes (2009a) and Aussems (2006).

It must be noted thatductusis usually included in many of the feature

sets but never given importance. In fact, there are no major feature sets that include ductus as the main factor, or list other major factors, save for a few, that are based or derived mainly from the ductus feature. It was generally thought to be either not quantifiable or not considered to be very usable for paleographic analyses. It is also highly difficult to recover it from an image of a specimen of writing. Apart from paleography, several other fields are also interested in quantifying characters through various features, which we elucidate below.

The most related area to digital paleography is automatic forensic document analysis. In this, a handwriting sample is usually compared to other samples to identify the writer of the sample in question. Such systems can identify, within a degree of uncertainty, if two documents were written by the same person or not(Impedovo & Pirlo, 2008). This is not very different from the detection of hands performed in paleography, and the related methodologies from forensic analysis can, in theory, be directly applied to it. Some of the digital paleographic systems described earlier were based on such methods. However, most forensic systems work as black boxes and rely on complex statistical methods that are hard to understand (T. Davis, 2007; Hassner et al., 2013). The methods are also mostly based on the static image of a sample and do not involve the handwriting information. As interesting as it may be, classification of scripts is not main motive of our research. Hence, we will not be focusing on features used in these systems, and interested readers may refer to literature reviews in related work such as Bulacu and Schomaker (2007a) that discuss applications of forensic methods to paleography. An interesting discussion comparing and contrasting paleographic and forensic handwriting identification can be seen in T. Davis (2007).

(31)

dis-tintinctivity were discussed in detail. This work is done in the context of quantitative linguistics to derive quantitative descriptions of writing systems. Interestingly, some of the metrics proposed such as distinctiv-ity (Anti´c & Altmann, 2005) are very much applicable in the context of paleography. For instance, Hegenbarth-Reichardt and Altmann (2008)

apply one of the metrics that definecomplexity to the study of the

devel-opment of Egyptian Hieratic from Egyptian Hieroglyphs. By quantifying complexity, they analyze the process of simplification of Hieroglyphs and attempt to fit a mathematical model for the process.

(32)

but many of them are not particularly aimed at quantifying any specific property of characters (in terms of handwriting) and most do not provide substantial qualitative underpinnings for those features. They are mostly proposed as pure statistical descriptors to construct feature vectors for pattern recognition systems. However, there have been some preliminary applications of these features for semantic analysis. Long Jr et al. (2000) use their proposed features to analyze the subjective similarity of the gestures, but they do it indirectly using multi-dimensional scaling (MDS). Similarly, Vatavu et al. (2011) use some of the features to correlate with

perceived execution difficultyof gestures. Though the above features were aimed at pen gestures, the features that closely correspond to the physical attributes can very well be adoptable for descriptive paleography as well.

In the next section, we will see various techniques that can mathemat-ically model the production of handwriting.

2.3 Handwriting Models

(33)

movement simulation methods and shape simulation methods (Dolinsk `y & Takagi, 2007).

Handwriting Models

Shape Simulation

Movement Simulation

Top-Down Approach

Bottom-Up Approach

Figure 2.1: Categorization of handwriting models

Shape-simulation techniques do not require any of the dynamics asso-ciated with handwriting movement. They only consider the static shape of the generated handwritten character. Wang et al. (2002) propose a generative model that is based on control-points and B-splines. They train a model using handwriting samples and extract training vectors, which are then used to synthesize trajectories of whole words. In a similar way, Choi et al. (2003) use Bayesian networks to train handwriting samples. The shape is generated by searching for the most probable input point sequences. Xu et al. (2005) create aesthetic Chinese calligraphy through machine learning by additionally incorporating geometric constraints that reject unaesthetic shapes. Most of these techniques include using statistical methods to generate handwriting through (machine) learning from pre-existing samples. While this is an efficient paradigm to generate handwriting, it does not throw any light on the process of generation.

(34)

para-meters such as velocity, force, pressure, spatial target, etc. (Plamondon & Maarse, 1989). Many models have been proposed under this bottom-up approach.

Hollerbach (1981) suggests a continuous model for handwriting gen-eration consisting of two orthogonal sinusoidal oscillatory moments (horizontal and vertical), which are superimposed horizontally with a constant velocity. The oscillations are responsible for the shape of charac-ters, while the horizontal sweep motion combines them. The model is given by parameters such as horizontal/vertical velocity amplitude, hori-zontal/vertical frequencies, horihori-zontal/vertical phases, horizontal sweep and amplitude of motion. However, the model is mathematically com-plex and also not very intuitive. Singer and Tishby (1994) also propose a similar oscillatory model, but based on cycloid motion. Other models depending on orthogonal muscle movements can be seen in Denier and Thuring (1965), Eden (1968), and Koster and Vredenbregt (1971).

Morasso and Ivaldi (1982) detail a computational model of generating handwriting that is more intuitive. Here, handwriting is considered to be

composed of basic curve elements called strokes, which overlap during

production to form a visible trajectory i.e. shape of a character. The curves, which are represented as polynomial segments, are composed piecewise as a weighted sum over time to produce a smooth trajectory.

In this way, the strokes are effectively hidden and are not immediately

(35)

of elliptic stroke primitives, which are defined as mathematically complex beta functions. We see a clear convergence of ideas here, concerning handwriting being composed of primitive segments called strokes.

Of particular significance to us is the kinematic theory of rapid hand movements proposed by Plamondon (1995). It describes human handwrit-ing as composed of strokes, which have asymmetric bell-shaped velocity profiles that are represented through log-normal functions. This closely mirrors the actual process, where an (ideal) handwriting is characterized by multiple bell-shaped velocity profiles. The kinematic theory offers a family of hierarchal handwriting models (Plamondon & Djioua, 2006) such as the Delta-Lognormal model and the Sigma-Lognormal model. The former can only predict rectilinear strokes, whereas the latter can predict complex curvilinear strokes as well. Since characters are made up of such strokes, Sigma-Lognormal is suitable for modeling actual character shapes. Under this model, each stroke is represented as a vector of six parameters. The handwriting trajectory is finally represented as a

vectorial sum of all the constituent strokes. A stroke sis given by:

s= f(D,t0,θs,θe,µ,σ) (2.1)

D is the amplitude of the stroke,t0 is the initiation time of the stroke,

θs andθe are starting and ending angles of the stroke, and µ andσ are

neuro-muscular parameters. These parameters can be used to manipulate the shape of an individual stroke and also to control the amount of overlapping with a succeeding stroke.

(36)

handwriting specimens. Here, the parameters of the Sigma-Lognormal model are varied to generate additional specimens as required. They propose that these synthetic data can be used to train and test online handwriting classifiers. Almaksour et al. (2011) use a similar method to increase and improve handwriting classifiers through synthetic gestures derived from deforming the model gestures. Ramaiah et al. (2014) take advantage of the model to add distortions to handwritten text, which are then employed as CAPTCHA. It is also used to generate and analyze graffiti tags by Berio and Leymarie (2015). According to them, this model allows the production of curves that are very similar, both visually and kinetically, to those made by humans using modern implements of writ-ing. One of the reasons for such a widespread usage of this model is its extreme simplicity with only six parameters as we have seen. This makes it not only very effective but also an elegant way to computationally model human handwriting movements. We will be using this technique in particular for our further analysis. It will be expanded and explained in detail in the next chapter.

2.4 Summary

(37)

(38)

Quantitative Analysis of Scribal

Handwriting: Methods and

Metrics

Writing, Phaedrus, has this strange quality, and is very like painting; for the creatures of painting stand like living beings, but if one asks them a question, they preserve a solemn silence.

Socrates to Phaedrus, an Athenian aristocrat1

3.1 Kinematics of Handwriting

H

andwriting is one of the key themes driving the research

presented in this thesis. It is, therefore, necessary that we first elaborate on handwriting and its production before we present and discuss our work.

Handwriting is a dynamic process that is produced by the movement of an implement on a surface. Even though writing a character is often seen as a single contiguous hand movement, it is actually made up of several sub-movements. It is a fluid process, in which these movements

(39)

Figure 3.1: Velocity profile of the characterS

overlap and compose to form a character (Morasso, 1981). The over-all movement of an implement across the writing surface to produce a

character is called a trajectory. Handwriting, being dynamic, can be

de-scribed in terms of physical parameters such as velocity and acceleration with respect to an implement’s movement. It is usually categorized as a ballistic activity, where sub-movements that each exhibit three distinct phases - an acceleration phase, a velocity phase and finally a deceleration phase (Teulings & Schomaker, 1993). During such a sub-movement, an implement initially accelerates until it comes close to the mid-point of that movement, after which its velocity stabilizes for a very brief moment. This is followed by a deceleration, where the velocity decreases as it reaches the end point. If it is just a single such sub-movement in isola-tion, the implement comes to a complete halt. Otherwise, it is followed by another acceleration phase for the next consecutive sub-movement. This behavior is understandable as an implement has to accelerate to reach its target and then decelerate gradually as it nears the target. This results in a characteristic bell-shaped velocity profile with a distinct peak, which typically occurs during an ideal, smooth and uninterrupted writing process. The sub-movement which corresponds to a single

bell-shaped velocity profile we call primitive stroke. When there are several

(40)

it results in a velocity profile with several peaks each corresponding to an individual primitive stroke. Thus, the writing of a character can be kinematically represented as consisting of several contiguous bell-shaped velocity profiles. Figure 3.1 shows the velocity profile of the character

S (as written by a modern implement) with three distinct bell-shaped

peaks each corresponding to a primitive stroke.

The moment when an implement touches a surface is called a

pen-downevent and the moment when it leaves the surface is a pen-upevent.

When a character requires only a pen-up and pen-down event pair it

is called aunistroke character. For instance, Latin letter S is a unistroke

character (in modern writing), where the lifting of the implement occurs only once i.e. when the writing is complete. But in some cases, an implement has to make multiple discrete contacts with a surface. These

are calledmultistroke characters. For example, writing Latin lettertrequires

the implement to leave the surface at the end of the vertical stroke and then touch the surface again to complete the horizontal stroke. Thus, it

consists of two pairs of pen-up and pen-down events. The termstrokehere

refers to the overall movement of an implement in continuous contact with a surface i.e. between a pen-up and a pen-down event (which we

later specifically refer to as pen-strokes). The movement of an implement

between two consecutive pen-up and pen-down events in multistroke

(41)

3.2 Digital Paleographic Framework for

Quantitative Analysis of Handwriting

As discussed in the previous chapter, while several digital paleographic systems are aimed at analyzing characters, none of them are particularly interested in analyzing scribal behavior quantitatively for descriptive paleography. As a result, more often than not, they do not have an established rigorous way to study characters’ shapes. They analyze the shapes in terms of a collection of pixels on a screen, which we think is not appropriate for descriptive analysis. We argue that if we are to analyze characters systematically, we need a suitable computational framework that operates on a proper paradigm. This requires having a theoretical appreciation of the underlying handwriting processes. To elaborate, we have to understand the processes behind stroke creation and interaction that define the corresponding scribal behavior. The paradigm we choose

is that of the ductusfeature of characters.

(42)

support.

Most of the modules proposed in the framework are computational and therefore can be largely automated. While one of the motivating factors of the framework is automated analysis, we do recognize the role of expert users interacting with our system. We are of the opin-ion that they should be able to inject their knowledge as required, as it can enhance computational processes. Hassner et al. (2014) argue for a human-oriented approach in digital paleography and note, "’human in the loop’ can and should be integrated into all stages" to "overcome the shortcomings of strictly automatic approaches". Accordingly, we incor-porate user input into our framework and allow users to override and perform manual operations as well. It will be seen that the framework, in fact, provides various avenues for interaction with each step of its processing. Effectively, this results in a semi-automatic approach that can be augmented with human judgments as required.

Such a human-aided approach is, as matter of fact, better suited to dealing with paleographic scripts, which frequently require reconstruc-tions and as a result also frequent subjective decision making. While we do not aim to completely eliminate human subjectivity from the process, we attempt to streamline the amount of subjectivity involved by making the underlying process more explicit. This allows users to interact with the system within our paradigm at a level comfortable to them. They are usually quite wary of using entirely automated approaches and are not completely convinced to trust the output of a given software that performs a task that has traditionally been performed manually. The pro-posed framework is not designed to be used as a black-box application, where a user imports characters only for the framework to expunge a

collection of opaque numerical values. It is intended to be used as a gray

box, where users can see the principles and guiding factors behind each

analysis and interact with them. As noted by Stokes (2007), any ad-hoc involvement by the user to manipulate and/or improve the results must be logged for the sake of reproducibility. Any implementation of this framework should include this feature.

(43)

bereproducibleas it is strongly grounded on the theoretical principles of handwriting generation. This also makes the results of the framework more interpretable or communicable as they can be discussed directly in terms of handwriting behavior with respect to specific scribes and/or

specific characters. The framework is also flexible, in that any change

in the underlying assumptions can be comfortably incorporated with

minimal overhead. Overall, it aims to provide an overarching common

frameworkto quantitatively analyze characters for descriptive paleography, through the handwriting information of characters. Within this broad context, the framework is very open-ended, extendable and customizable

to any particular situation. It is also highly modular with its modules

being self-contained with the outputs of the preceding modules being the inputs for the succeeding modules, and their results can, therefore, be improved or edited as and when required.

In the following sections, we discuss the individual modules contained in the framework. We explain in detail their motivation, inputs, workings, outputs and also the associated limitations/assumptions (if any).

3.2.1 Spline Representation

The first module of the framework pertains to the initial digital repres-entation of characters. As seen in section 3.2, many paleographic systems use pixel representations for their analyses as they frequently employ image-based techniques. Though simple and convenient, they are not ideal for analyzing characters in terms of handwriting production. There-fore, we attempt to find a suitable representation for characters in the context of our analysis.

In the field of computer-aided design (CAD), mathematical

represent-ations calledsplinesare often used to represent complex shapes. These are

(44)

Character

Spline Representation

Trajectory Reconstruction

Stroke Segmentation

Structure Representation

Metrics Extraction

Figure 3.2: Modules of the framework

natural and suitable for our processing as well.

There are several types of splines and of particular interest areBezier

(45)

offer the option of localized shape control. Any changes made to a curve segment is localized to that particular piecewise polynomial. To represent longer curves, a larger number of piecewise polynomials is used without resorting to higher-order representations. B-splines, like Bezier splines, also have control points that can be manipulated without significant effort. For these reasons, we computationally represent characters using B-splines.

Conversion of characters’ shapes into their B-spline based representa-tion can be done either manually or automatically. In the former case, the shapes of characters are explicitly constructed using a set of B-splines, with users defining them for their analysis. But very often, characters already have existing digital representations that are image-based. Hence, we need to provide a way to automatically convert image representations to the required B-spline representations. This is done in multiple stages. In the first stage, corners of the imported images are detected using a robust standard corner detection algorithm (Chen, Zou, Zhang & Dou, 2009) such as the Harris operator. Once they are detected, we attempt to find and list all connected pixels between all the adjacent pairs of corners. Using a standard curve reduction algorithm, these lists are reduced to the bare minimum required to capture the shape of curve segments between the corners. We find the Douglas-Peucker algorithm (Douglas & Peucker, 1973) to be particularly effective in performing this task. These reduced lists are then used to create the corresponding B-splines through stand-ard spline interpolation. This finally results in a spline representation of a character, where the constituent curve segments are now B-splines. Figures 3.3 and 3.4 shows sample results of spline conversion. When we encounter loops in a character, we insert pseudo-nodes to facilitate trajectory reconstruction (which will be discussed in the next section). In figure 3.4, nodes F and G are essentially pseudo-nodes.

(46)

Figure 3.3: Spline representation (right) of a character (left)

Figure 3.4: Spline representation of a character with pseudo-nodes on loops

As such, the framework leaves open the ability to augment additional information.

3.2.2 Trajectory Reconstruction

Using spline representations we only capture characters’ shapes, to be more precise, their static appearance. They do not (yet) contain any information about production, i.e. trajectories, which is required to prop-erly analyze characters. The second module of the framework pertains to the extraction of trajectories from spline representations.

(47)

notes paleographic artifacts usually do not preserve such information. However, based on the static shape of characters it is possible to perform a reasonable reconstruction of their trajectories. Doermann and Rosenfeld (1995) suggest that the recovery can be obtained by:

1. Global cues such as

Relative direction of handwriting

Minimizing of effort/energy required for production

2. Local cues such as:

Striations

Stroke width variations

We are more focused on a high-level reconstruction and therefore are more interested in global cues rather than local cues. We feel that global cues provide a generic abstraction about writing characters. Trajectory recovery techniques may be classified into three approaches. The first approach is the graph theoretic approach, as suggested by Bunke et al. (1997), Jäger (1996) and others, which performs the trajectory search on a graph. In the second approach suggested by Doermann and Rosenfeld (1995), Lee and Pan (1992), and Lallican and Viard-Gaudin (1997), the search is performed on the image contour or skeleton of an image and typically includes local cues to aid the recovery. In the third approach, seen in the work of Lau et al. (2003) and Nel et al. (2005), recovery systems are typically trained with online data that is then used to recover the information from a test set. Nguyen and Blumenstein (2010) have made a comprehensive survey of various techniques for trajectory reconstruction.

In our context, the machine learning approach is not applicable as we do not have existing trajectory data to begin with. The image-based approach as it stands focuses more on local cues and hence is also not very suitable for our purpose, as we prefer a higher level approach to recovery. Based on our need for a global approach, graph-based methods are well suited for us.

(48)

the case of paleographic scripts, where original trajectories are often not known, there is not a unique answer. Very fittingly, his work is able to provide several alternative theoretical trajectories ranked according to their viability. It also facilitates imposing several additional heuristics specific to a script as the recovery is performed using high-level hand-writing behavior as desired. In Jäger’s method, a character is mapped into a graph and Eülerian paths for that graph are generated. These paths are then ranked based on the length minimization and curvature minimization principles. Usually, humans tend to follow shorter paths (i.e. length minimization) and write smoother strokes with less deviation (i.e. curvature minimization), as opposed to longer paths with rugged strokes. It is assumed that an ideal trajectory, in this way, attempts to minimize the effort to produce a character. We have adapted Jäger’s algorithm to fit our purpose. We elucidate below the modified algorithm for recovering trajectories.

Figure 3.5: Graph representation of a character

We begin by abstracting the spline representation into a higher level graph representation. Each node in the graph represents a corner, and the B-spline segments are represented as edges. The data structure for the edges holds the corresponding B-spline segments. We then enumerate all possible paths of traversal available in the given graph, which is done by calculating the Eülerian paths for the graph (Ore & Ore, 1962). A path

required to visit all edges in a graph at least once is called anEülerian path.

In terms of trajectory, this translates into the pen movements required to trace the entire shape of a character. We proceed to calculate various

(49)

a normalized score by assigning weights to each of these costs. This normalized score indirectly corresponds to the effort required to write

a character. The top n paths are ranked according to their score and

presented to users. We illustrate this with the following example.

Assume we have the following path p for the character in figure 3.5:

A→B →C→D→ C→E→F →G→E →H

We calculate the length cost of the path, len(p), by summing the

length of all edges i.e. B-spline segments in that path.

len(p) =len(A,B) +len(B,C) +...+len(E,H) (3.1)

For path curvature, we calculate the absolute value of the angle between successive edges, which is computed by calculating the angle between tangents of the two curves at the point of intersection. In figure 3.5, edge pair (G,E) and (E,H) have smooth transition and hence lower cost, compared to edge pair (D,C) and (C,E) which requires a sharp turn.

The curvature cost, curv(p), is the sum of all the angles covered when

writing the path.

curv(p) =curv((A,B),(B,C)) +curv((B,C),(C,D)) +...+curv((G,E),(E,H))

(3.2) We also calculate the directional cost by invoking common heuristics. Most scripts have a preferential direction of writing such as left to right or top to bottom. For instance, if we are sure that the usual direction of a script is left to right and top to bottom, then trajectories following these directions are given higher priority by penalizing other paths. We do this by assigning negative scores to such paths. The heuristics have to be

decided based on the script under consideration. dir(p) would then give

the directional cost for the path.

The total cost for the path is calculated by:

cost(p) =w1·len(p) +w2·curv(p) +w3·dir(p) (3.3)

The weightsw1,w2 andw3 are fractions that sum to 1 and are assigned

(50)

Figure 3.6: Reconstructed trajectory of character. PU and PD are pen-up and pen-down events respectively

scripts. The individual costs are normalized between 0 and 1 for this calculation.

From the returned topn paths, users can select a trajectory of their

choice. However, during this selection they also need to consider other external factors such as consistency with trajectories of other characters in the same script, allographic variations, etc. See figure 3.6 for the reconstructed trajectory of the character in figure 3.5. In some cases, par-ticularly with calligraphic scripts, the basic assumption of reconstruction,

namely effort reduction, might not hold at all. In such cases,

augment-ation of users’ practical external knowledge particularly enhances this process. In such cases, they can override the suggested trajectory by either partially modifying it or just replacing it with their own choice. Also, the heuristics used can always be customized to a specific script under consideration. For multistroke characters, the trajectories must be reconstructed for each pen-stroke and then they should be ordered based on a higher level heuristics to recover the overall path. For instance, a longer pen-stroke is most likely to be written first followed by shorter ones.

(51)

3.2.3 Stroke Segmentation

A precursor to performing any kind of analysis on a structure would be to decompose it into its fundamental constituent parts. Such decom-position provides a finer view of the structure and gives insight into its construction, and also relations among its constituent parts. Similarly, to analyze a character we propose that it also needs to be decomposed into its fundamental parts i.e. strokes. Stansbury (2009) supports this argument by stating that "analyzing letterforms into their component strokes and pen angles" is a fruitful approach for digital analysis. Bishop (1961) further elaborates that "[e]ven more than in ductus and sense of form and proportion the idiosyncratic is to be found in the production of single strokes, in the behaviour of the pen as it turns a curve or a corner [...]" (Stokes, 2009a). This reiterates the importance of extracting strokes from characters using a proper underlying principle.

There have been several approaches to decompose characters. Edelman and Flash (1987) decompose characters into four different templates -hook, cup, gamma and oval. We feel such predefined decomposition is not suitable for the creation of proper primitives required for our analysis. Writing is a natural process, which cannot be simply reduced to a set of predefined templates. Changizi et al. (2006) decompose the characters intoseparable strokes using three subjects who decide (unanimously) on the decomposition. Such a completely subjective process relies on some underlying criteria that are often unknown and do not directly corres-pond to the handwriting process. A better alternative is to have a specific process as a guiding factor to perform character decompositions, which helps in automation, and at the same time, it also provides a reasonable set of guidelines through which users may choose to interact with the process. In our particular context, the primitives of characters would be the fundamental strokes involved in their production. This is consistent with the way they are internalized and produced by humans. Based on our chosen paradigm, we propose to use handwriting information (as reconstructed by the previously discussed module) to decompose characters into their fundamental parts.

(52)

spline representations. The glyphic segments generated do not directly correspond to the actual strokes of characters. Therefore, using the recovered trajectory, characters are reconstituted as a set of B-splines representing the overall composite strokes that directly correspond to their trajectories. For instance, if two curve segments are part of the same smooth stroke, they are combined into a single stroke. At this point, we have already decomposed the character into composite strokes. But we have to further decompose these to extract primitive strokes. Figure 3.7 demonstrates the restructuring of the character shown in figure 3.3. Note how the glyph segments have been combined to form composite strokes.

Figure 3.7: Character with composite strokes as a result of restructuring .

With the reconstituted representation of characters, we now attempt to retrieve the primitives. This retrieval is performed by segmenting the trajectory at appropriate points. Hence, the process is appropriately

namedstroke segmentation. As expressed in section 3.1, writing a character

is not a discrete, but a continuous process where individual strokes overlap. Based on a character’s trajectory, we proceed to find specific points where the (apparent) primitive strokes connect. It has been shown that the minimal velocity points occur where the curvature is maximal or minimal and also where strokes are explicitly delineated such as at sharp junctions (Li, Parizeau & Plamondon, 1998). When the composite strokes are segmented at sharp junctions, we extract what are termed

(53)

yields such calculations. The xand ycoordinates of a B-spline are given

as a function of a parametert.

x=x(t) y=y(t) (3.4)

The curvature k at each point of a B-spline is calculated using the

derivatives of the above parametric functions.

k= x

0_y_”₋_y0_x_”

(x02₊_y02₎3₂ (3.5)

where the prime and double prime refer to the first derivative and the second derivative respectively.

Using this equation, we calculate the curvature at each point of a disjoint stroke and then attempt to find the local maxima and minima to extract the segmentation points. One must notice that this method is extremely sensitive and can detect even very minor changes to curvature. To overcome this, we impose additional heuristics to appropriately filter them. Initially, we set an empirical threshold for curvature - only points above this are considered to be segmentation points. Additionally, if the distances between two or more of these points are less than a threshold, they are combined. The disjoint strokes are then segmented at all these points where the primitive strokes overlap and connect. See figures 3.8 and 3.6 for illustration. These segmented primitive strokes are a good approximation of the underlying strokes. In this way, we produce a natural set of primitives corresponding to a character. For a multistroke character, the pen-drag between the individual strokes is included as an additional invisible stroke because it also involves a movement of hands. It may also be necessary to override the automated stroke segmentation process on a case-by-case basis. For instance, given a trajectory, it might be more practical to have one longer stroke instead of two successive short strokes.

(54)

Figure 3.8: Disjoint strokes decomposed into primitive strokes. The different colors refer to up-strokes (red) and down-strokes (green). See §3.2.4.

the definition of stroke needs to be redefined to suit the specific physical process that produces the writing under consideration.

3.2.4 Structure Representation

Following the decomposition of a character into its primitives, we create a hierarchical structure based on the results of decomposition to represent the character.

Character Pen Strokes Disjoint Strokes Primitive Strokes

Up-Strokes

Down-Strokes

Figure 3.9: Stroke hierarchy

A stroke hierarchy is first constructed as shown in figure 3.9. At the

top are pen strokes, which are the total movements written without lifting

the pen. The multi-stroke character illustrated in figure 3.7 has two pen

strokes. These pen strokes can be divided into what we calldisjoint strokes

that are delineated by sharp junctions, as seen in the previous section.

Disjoint strokes are further composed ofprimitive strokes, which are again

(55)

and more stable (Maarse & Thomassen, 1983). Up-strokes are faster to produce (Isokoski, 2001), which may also lead to their lower stability. Maarse and Thomassen (1983) defines strokes that are produced between 210° and 280° to be down-strokes. The range of angles appears to be very restrictive (as it considers only Roman handwriting). Hence, we have included strokes that are pointed downwards within 210° and 330° as down-strokes and all others as strokes. The criteria to judge up-strokes and down-up-strokes should be modified (or even inverted) based on the writing style and/or writing implement.

The structural representation of a character is then built based on our proposed hierarchy, which can abstract characters at any stroke level as needed. In this new representation, a character is composed of strokes rather than curve segments. They are also represented as B-splines similar to the curve segments of a character. Such a fundamental representation using primitives derives quantitative features that are more descriptive. This representation can be used for other kinds of related analyses as well (see §3.3, specifically §3.3.3). If necessary, a structured and detailed XML representation as suggested by Terras and Robertson (2004) could also be adopted.

Additionally, this representation can be used to view the stroke invent-ory for a script. This inventinvent-ory can be abstracted at any stroke level suited to an analysis. To create the script inventory, we collate all the individual strokes and reduce the inventory list by merging similar strokes based on a cost threshold that is empirically chosen. This can be particularly useful if the motivation is to study patterns appearing in scripts and how they change and evolve.

3.2.5 Metrics Extraction

One of the main purposes of this framework is to extract metrics that describe characters to facilitate easy and effective quantitative analysis

of scribal behavior. We use the term metrics here synonymously with