Transcribing and annotating
audio and video:
Jeff Good MPI EVA and the Rosetta Project [email protected]
Goals of presentation
• Discuss basic concepts of audio and video transcription and annotation
• Illustrate process of transcription and annotation using Elan
What is annotation?
• In recent years, a new conception of language documentation has been emerging (see, e.g., Himmelmann (1998), Woodbury (2003))
• This view takes primary sources of data (e.g., audio and video) to be the foundational materials for language documentation
What is annotation?
• “Traditional” linguistics is then conceptualized as annotations on primary data, including
  • Transcription of audio or video
  • Annotation for grammatical
Text annotation example
Cicko,    [ch’aara  ’a  goj,]      ’i      bu’u.
cat.erg    fish     &   see.cvpan  3s.abs  b.eat.prs
‘The cat sees a fish and eats it.’
1
Extra layer of annotation
Why does this matter?
• It’s a pretty different way of doing “language documentation” than before
• It forms the conceptual underpinnings of the functionality of annotation tools
• It can be a lot more work at first...
• ...with (hopefully) a worthwhile payoff
Good annotation
• Under present thinking, good annotations should have the following properties
  • Archival format
  • Time-aligned to primary data
  • Transparent, documented terminology
Archival format
• How do you make sure your annotations are in an archival format?
• Short answer: Use a tool designed for research purposes (e.g., Elan, Shoebox)
• What not to do?
  • Use FileMaker, Microsoft Word, etc.
Archival format
• Simple tip: If the annotation file isn’t designed to be easily opened in a plain text editor (e.g., Notepad, TextEdit), it’s not archival
• The biggest mistake people make isn’t deliberately choosing a program that uses a “bad” format; it’s not even thinking about formats before using some program
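The plain-text tip can be turned into a rough automated check. The sketch below is only a heuristic under stated assumptions: the function name `looks_like_plain_text` is hypothetical, and "plain text" is taken to mean "decodes as UTF-8 and contains no NUL bytes". A file in another text encoding (e.g., UTF-16) would be rejected, so a failed check is a prompt to look closer, not a verdict.

```python
import codecs
from pathlib import Path


def looks_like_plain_text(path, sample_size=4096):
    """Heuristic: True if the start of the file decodes as UTF-8 and has no NUL bytes."""
    data = Path(path).read_bytes()[:sample_size]
    if b"\x00" in data:
        # NUL bytes are typical of binary formats (and of UTF-16 text).
        return False
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte character cut off by the sample limit.
        decoder.decode(data, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

Running this over a project folder before archiving gives a quick list of files that deserve a second look.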
Time-aligned annotation
• When you’re annotating audio and video, ideally, you want the annotations to be time-aligned
• That is, you want them to be “linked” to appropriate sections of the audio and video recording
• This allows you or another researcher to have access to the primary data on which an annotation is based
Terminology
• When doing annotation of linguistic data, there will always be a need for specialized terminology
• For example, transcription systems, like IPA, are a type of specialized terminology
• Interlinear glossing also uses specialized terminology (e.g., “sg” for “Singular”, itself a specialized term)
Terminology
• When possible, use existing standard term sets (and document that you’ve done this)
  • For example, IPA with notes on any modifications/interpretation
  • Leipzig glossing rules for interlinear abbreviations
Terminology
• Document the use of any special conventions you devise for your data
• Develop controlled vocabularies and make use of any features of your tools supporting their use
  • Controlled vocabulary: A standardized list of terms used for annotating data
Terminology
• Possible controlled vocabularies
  • Yes/No
  • Speaker identifiers
  • Left/Center/Right (for eye gaze)
  • Grammatical phenomena of particular interest
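As a minimal sketch of what a controlled vocabulary buys you, the following checks annotation values against a fixed list of allowed terms. The tier names (`eye_gaze`, `yes_no`) and the function `find_invalid_values` are hypothetical and invented for illustration; this is not how any particular tool implements the feature.

```python
# Hypothetical controlled vocabularies, keyed by an invented tier name.
CONTROLLED_VOCABULARIES = {
    "eye_gaze": {"Left", "Center", "Right"},
    "yes_no": {"Yes", "No"},
}


def find_invalid_values(tier_name, values, vocabularies=CONTROLLED_VOCABULARIES):
    """Return (position, value) pairs for values outside the tier's vocabulary."""
    allowed = vocabularies[tier_name]
    return [(i, v) for i, v in enumerate(values) if v not in allowed]


# Catches inconsistent capitalization and spelling variants.
problems = find_invalid_values("eye_gaze", ["Left", "left", "Right", "Centre"])
```

Here `problems` lists the second and fourth values, which are exactly the inconsistencies that make later searching unreliable.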
Elan
• Elan is a time-aligned annotation tool available at: http://www.mpi.nl/tools
• Supports annotation of
  • Audio (in WAV format)
  • Video (in MPEG and Quicktime format)
Elan
• Noteworthy features of Elan
  • Designed in the context of language documentation
  • Supports Unicode
  • Export/import of Shoebox files
Annotation tiers
• In Elan, tiers are where annotations are located
• A tier can be thought of as a “line” in an analyzed text. For example:
  • Transcription tier
  • Morpheme-analysis tier
  • Interlinear tier
Anatomy of an Elan window
[Screenshot of an Elan window, with labeled parts: Annotation viewer, Wave form]
Elan in action
• A brief demonstration of Elan, including
  • The tiers I’ve been using
  • Making a new annotation
Elan’s tier types
• Time-aligned
  • The “foundational” annotation, directly aligned to audio or video.
  • Typical example: sentence transcription
Elan’s tier types
• Time subdivision
  • Must be linked to a basic time-aligned tier
  • Allows you to make subdivisions of that tier with their own timestamps
  • Typical example: Words in a sentence
Elan’s tier types
• Symbolic subdivision
  • Must be linked to another tier
  • Allows that tier to be subdivided without times associated with the subdivision
  • Typical example: Morpheme subdivision
Elan’s tier types
• Symbolic association
  • Must be linked to another tier
  • Cannot be subdivided further
Schematic example of tier types
Puer puellam amat.
    Sentence-level transcription time-aligned with wave form
puer | puellam | amat
    Time subdivision of sentence into words
puer | puell am | am a t
    Symbolic subdivision of words into morphemes
boy | girl ACC | love PRS 3s
    Symbolic association of morphemes with glosses
The boy loves the girl.
    Symbolic association of sentence transcription with free translation
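The schematic example can also be written out as a small data structure, which makes the difference between the tier types concrete. This is only a sketch: the `Annotation` class, the tier names, and the millisecond values are all invented for illustration and are not Elan's internal model. It also assumes that time subdivisions must cover the parent's time span without gaps or overlaps.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Annotation:
    value: str
    # (start_ms, end_ms) for time-aligned annotations; None for symbolic ones.
    span: Optional[Tuple[int, int]] = None
    # Dependent annotations, keyed by a (hypothetical) tier name.
    children: Dict[str, List["Annotation"]] = field(default_factory=dict)


def check_time_subdivision(parent, tier):
    """True if the children on `tier` tile the parent's span with no gaps or overlaps."""
    parts = parent.children[tier]
    if parent.span is None or any(p.span is None for p in parts):
        return False
    cursor = parent.span[0]
    for part in parts:
        start, end = part.span
        if start != cursor or end <= start:
            return False
        cursor = end
    return cursor == parent.span[1]


# Time-aligned tier: the sentence, with invented times.
sentence = Annotation("Puer puellam amat.", span=(0, 1500))

# Time subdivision: words, each with its own timestamps.
words = [
    Annotation("puer", span=(0, 400)),
    Annotation("puellam", span=(400, 1000)),
    Annotation("amat", span=(1000, 1500)),
]
sentence.children["words"] = words

# Symbolic subdivision: morphemes, ordered but without times.
words[1].children["morphemes"] = [Annotation("puell"), Annotation("am")]

# Symbolic association: exactly one gloss per morpheme, one translation per sentence.
for morpheme, gloss in zip(words[1].children["morphemes"], ["girl", "ACC"]):
    morpheme.children["gloss"] = [Annotation(gloss)]
sentence.children["translation"] = [Annotation("The boy loves the girl.")]

words_ok = check_time_subdivision(sentence, "words")
```

Only the sentence and word annotations carry times; everything below them inherits its place in the recording from the annotation it depends on.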
Tier types
• These tier types weren’t invented out of the blue for Elan
• They correspond to the “meanings” of different kinds of linguistic parsing
• The tool designers allow for flexibility of tier types (a good thing)
• It’s up to the linguist to understand their data well enough to use the right tier type
Tier templates
• It is likely that for a given project you’ll have some tier sets you’ll use often
• Elan provides the ability to save a set of tiers as a “template” that can be
Aside: .eaf files
• How does Elan store annotations?
  • Inside XML files using an .eaf extension
• What does that mean?
  • Your archivist will be happy
  • Hopefully, you’ll never need to know more than that
Elan conclusion
• This is just an introduction
• Elan has more features than I have made use of or can describe here
• My wish list for features
  • Integration with a phonetic analysis tool (e.g., Praat)
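One practical consequence of the .eaf files being XML: if you ever do need to know more, the annotations can be read with generic tools outside Elan. The sketch below uses Python's standard XML parser on a heavily simplified, hand-written sample. The element and attribute names follow the .eaf layout as I understand it and should be treated as assumptions to check against a real file; the function name `read_aligned_annotations` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written sample; a real .eaf file contains much more.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="transcription" LINGUISTIC_TYPE_REF="utterance">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>Puer puellam amat.</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_aligned_annotations(eaf_text):
    """Return (tier_id, start_ms, end_ms, text) for each time-aligned annotation."""
    root = ET.fromstring(eaf_text)
    # Map time-slot identifiers to millisecond values (None if a slot has no time).
    times = {}
    for slot in root.iter("TIME_SLOT"):
        value = slot.get("TIME_VALUE")
        times[slot.get("TIME_SLOT_ID")] = int(value) if value is not None else None
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            text = ann.findtext("ANNOTATION_VALUE", default="")
            start = times[ann.get("TIME_SLOT_REF1")]
            end = times[ann.get("TIME_SLOT_REF2")]
            rows.append((tier.get("TIER_ID"), start, end, text))
    return rows


rows = read_aligned_annotations(SAMPLE_EAF)
```

To read a file from disk, pass its contents, e.g. `read_aligned_annotations(open(path, "rb").read())`.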
Other tools
• There are a number of annotation tools out there
• Two that seem to also be popular among linguists
  • Transcriber (http://trans.sourceforge.net/) (apparently good for conversational recordings)
  • Praat (http://praat.org) (primarily known as phonetic analysis software, but also has facilities for time-aligned annotation)
Why annotate?
• Time-aligned annotation is a lot of work
• For me, it’s much more time-consuming than just jotting things down in a notebook
Why annotate?
• Intangible reasons
  • It’s currently considered good documentary practice
  • It facilitates wider use of resources by other people
  • Time-aligned annotations combine linguistic analysis with the
Why annotate?
• Tangible reasons
  • It allows you (and others) to double-check your analysis more easily
  • The ability to do searches across structured annotations facilitates analysis
  • Makes creation of sound “clips” much easier
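To illustrate the point about sound clips: once an annotation carries start and end times, cutting the matching stretch out of a WAV recording takes only a few lines. This is a minimal sketch using Python's standard `wave` module; the function name `extract_clip` and its millisecond parameters are assumptions for illustration, and it handles only uncompressed WAV files.

```python
import wave


def extract_clip(source_path, clip_path, start_ms, end_ms):
    """Copy the audio between start_ms and end_ms into a new WAV file.

    Returns the number of audio frames written.
    """
    with wave.open(source_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Convert times to frame positions, clamped to the length of the recording.
        start_frame = min(start_ms * rate // 1000, total)
        end_frame = min(end_ms * rate // 1000, total)
        count = max(end_frame - start_frame, 0)
        src.setpos(start_frame)
        frames = src.readframes(count)
    with wave.open(clip_path, "wb") as dst:
        # Reuse the source's channels, sample width, and rate; the frame count
        # in the header is corrected automatically when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
    return count
```

Given the times of an annotation, a call such as `extract_clip("session.wav", "example.wav", 400, 1000)` (file names hypothetical) produces a clip of just that stretch.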
Conclusion
• If you’re going to go through the trouble to make good recordings...
• ...it’s worth going through the trouble of annotating them well.
• Unsure of how to proceed?
  • Consult the E-MELD School of Best Practices
References
E-MELD School of Best Practices
http://emeld.org/school/
Leipzig glossing rules
http://www.eva.mpg.de/lingua/files/morpheme.html
Himmelmann, Nikolaus P. 1998. Documentary and descriptive linguistics. Linguistics 36:161–195.
Woodbury, Tony. 2003. Defining documentary linguistics. In P. Austin (Ed.) Language documentation and description, volume 1, 33–51. London: Hans Rausing Endangered