Future work - From Discourse Structure To Text Specificity: Studies Of Coherence Preferences

In this final section, we summarize the future directions opened up by our work.

The instantiation relation. One aspect that we have not explored when characterizing instantiation (Section 3.2) is textual entailment. As pointed out in Section 3.2.4, even though in theory the second argument of instantiation should entail the first, few instances of the relation in the PDTB are automatically recognized as entailment. We found that the entailment relationship appears to hold at the phrase or clause level, and often depends on context and external knowledge. We believe future work exploring these directions can benefit both RTE systems and instantiation recognition.

When comparing instantiation with specification, we pointed out that the change in specificity across specification arguments may not be as large as that in instantiation, especially if the first argument of specification does not need to be particularly general or the second argument particularly specific (Section 3.4). This hypothesis, if confirmed, will bring new insight into discourse relations and specificity. We leave a fair test of this hypothesis to future work, using a measure of specificity that is independent of either relation (e.g., human judgements such as those outlined in Section 4.1).
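
As a rough illustration of how such a test could be set up, the sketch below compares the average change in specificity from the first to the second argument for each relation. The rating scale (higher means more specific), the function name, and all numbers are hypothetical placeholders, not data or code from this work.

```python
from statistics import mean

def mean_specificity_shift(pairs):
    """Average (arg2 - arg1) specificity over (arg1, arg2) rating pairs."""
    return mean(arg2 - arg1 for arg1, arg2 in pairs)

# Hypothetical human ratings; higher = more specific (assumed scale).
# Each tuple is (rating of first argument, rating of second argument).
instantiation_pairs = [(2, 5), (1, 4), (3, 5)]
specification_pairs = [(3, 4), (2, 4), (4, 4)]

inst_shift = mean_specificity_shift(instantiation_pairs)
spec_shift = mean_specificity_shift(specification_pairs)

# The hypothesis predicts a larger shift for instantiation.
print(f"instantiation: {inst_shift:.2f}, specification: {spec_shift:.2f}")
```

With real annotations, the two sets of shifts would be compared with a significance test rather than by their means alone.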

Subsentential specificity. We present an annotation guideline and a pilot corpus for annotating the degree of sentence specificity, and the cause and effect of underspecified text (Section 4.1). In this annotation, we did not distinguish whether or not an underspecified text segment is elaborated in the upcoming context. We also leave to future work the analysis of the content of the underspecified segments and their associated questions, which can be useful for gaining further insight into what needs elaboration and what causes vagueness.

Our work proposes the first model to predict underspecified words within a sentence (Section 4.2). As pointed out in Section 4.3, there are multiple ways to improve the model: training on more data so that more powerful models can be adopted, gaining sharper attention weights using repeated attention, and obtaining structure on top of the current token-level prediction with structured attention. Future work can also tackle the prediction of the number of underspecified tokens, along with how and where they can be resolved.
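
To make "sharper attention weights" concrete, the sketch below uses entropy as one possible measure of sharpness: the more the weights concentrate on a few tokens, the lower the entropy. This is not the model from Section 4.2 and does not implement repeated attention. The tokens and scores are hypothetical, and scaling the scores merely stands in for any mechanism that sharpens the distribution.

```python
import math

def softmax(scores):
    """Normalize raw scores into attention weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(weights):
    """Shannon entropy (in nats); lower means sharper attention."""
    return -sum(w * math.log(w) for w in weights if w > 0)

# Hypothetical tokens and unnormalized attention scores.
tokens = ["they", "made", "some", "changes"]
scores = [0.1, 0.3, 1.2, 0.9]

diffuse = softmax(scores)
# Scaling the scores is only a stand-in for a sharpening mechanism.
sharp = softmax([3 * s for s in scores])

print(f"diffuse entropy: {entropy(diffuse):.3f}")
print(f"sharp entropy:   {entropy(sharp):.3f}")
```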

Cross-lingual analysis. In Section 5.1, we pointed out that both Arabic and Chinese have sentences that need multiple English sentences to translate. However, we did not find these sentences to be especially problematic for Arabic-English translation. Future work can explore this negative result and uncover the linguistic constructs that lead to this contrast between Arabic and Chinese. Future work can also look into more languages, especially those with more extreme differences in punctuation usage (e.g., Thai).

To identify content-heavy sentences in Chinese that need multiple English sentences to translate, we developed a system with rich syntactic features. We also pointed out differences in the distribution of discourse relations between the split components of a heavy sentence and those not involved in splitting (Section 6.3.3). As pointed out in Section 5.3, one obvious future direction is to incorporate the insights from our work to improve Chinese-to-English machine translation.

In terms of specificity, we discovered strong associations between content-heavy Chinese sentences, text specificity and the second argument of instantiation (Chapter 6). Future work can further explore specificity across different languages, e.g., sentence specificity prediction in Chinese.

Specificity and sentence simplification. Section 7.1 shows strong associations between specificity and simplified sentences. When characterizing sentences that need simplification, specificity is as indicative as, and complementary to, readability. Furthermore, we found that the simplified version of a sentence often uses multiple sentences to express the content of the original. We leave it to future work to incorporate specificity and our insights on content-heavy sentences into sentence simplification systems.

Specificity and demographics. Section 7.2 presents our pilot study exploring how the perception of specificity varies with autism symptoms. While our study did not lead to statistically significant findings, we pointed out several ways to improve the experiment: recruiting subjects with clinically diagnosed ASD, expanding the applicability of our stimuli, and extending our analysis to subject-produced summaries.

Our work exploring links between specificity and gender, reading ability and autism symptoms opens new directions for future work to go further into socio-demographic aspects and personal background (Section 7.3). Research on social media text has found distinctive language usage across people of different genders, income levels, personalities and political views. We leave it to future work to investigate how specificity is perceived and organized when these aspects vary.

