Although the idea is straightforward, we face two problems in practice. First, for frequent semantic frames, the number of substitution candidates can be very large. Substitution would generate many new sentence pairs and could easily exceed the capacity of our system. To deal with this problem, we pre-filter the SSRs so that each semantic signature is associated with no more than 100 SSRs. As can be seen from the criteria for extracting SSRs, all entries in the SSR rule set satisfy the commonly used phrase extraction heuristics; the set of SSRs is therefore a subset of the phrase table. Because of this, we use the features in the phrase table to sort the rules, and keep the 100 rules with the highest arithmetic mean of the feature values.
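The pre-filtering step can be sketched as follows; the function name and the data layout (a mapping from each semantic signature to its candidate rules with phrase-table feature values) are illustrative assumptions, not the paper's implementation:

```python
def prefilter_ssrs(ssrs, max_per_signature=100):
    """Keep at most `max_per_signature` rules per semantic signature,
    ranked by the arithmetic mean of their phrase-table feature values.

    `ssrs` maps a signature to a list of (rule, feature_values) pairs.
    Names and structure here are illustrative, not from the paper.
    """
    filtered = {}
    for signature, rules in ssrs.items():
        ranked = sorted(rules,
                        key=lambda r: sum(r[1]) / len(r[1]),
                        reverse=True)
        filtered[signature] = ranked[:max_per_signature]
    return filtered
```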
We evaluate our system on the semantic role labeling portion of the CoNLL-2009 shared task (Hajič et al., 2009), on all seven languages, namely Catalan, Chinese, Czech, English, German, Japanese and Spanish. For each language, certain tokens in each sentence in the dataset are marked as predicates. Each predicate takes other words in the same sentence as arguments, with their relationships marked by labeled dependency arcs. A sentence may contain no predicates.
Automatic generation of questions (AQG) is an important and challenging research area in natural language processing. AQG systems can be useful for educational applications such as assessment of reading comprehension, intelligent tutoring, dialogue agents, and instructional games. Most research on AQG focuses on factoid questions: questions that are generated from reading passages and ask about information expressed in the text itself (as opposed to, e.g., readers' opinions of the text or external knowledge related to the text). Traditional architectures for AQG involve syntactic and semantic analysis of text, with rule-based and template-based modules for converting linguistic analyses into questions. Many of these systems employ semantic role labeling (SRL) as an important analytic component (Mazidi and Tarau, 2016; Huang and He, 2016). Recently, neural network architectures have also been proposed for the AQG task (Du et al., 2017; Serban et al., 2016). In this paper we present an automatic question generation system based on semantic role labeling. The system generates questions directly from semantic analysis, without templates. Our system includes two innovations. While previous SRL-based AQG systems generated only wh-questions,
First, we take advantage of the network’s dynamic connectivity to highlight the portion of the tree that bears semantic information. We augment the nodes that can influence parsing decisions at the current state by explicitly adding the vectors of latent variables related to the most recent child bearing a semantic role label of any type (REL, A0 to A5, or AM-X) to the connectivity of the current decision. These additions yield a model that is sensitive to regularities in structurally defined sequences of nodes bearing semantic role labels, within and across constituents. These extensions enlarge the locality domain over which dependencies between predicates bearing the REL label, arguments bearing an A0-A5 label, and adjuncts bearing an AM-X role can be specified, and capture both linear and hierarchical constraints between predicates, arguments and adjuncts. Enlarging the locality domain this way ensures, for instance, that the derivation of the role DIR in Figure 1 is not independent of the derivations of the roles TMP, REL (the predicate) and A0.
show that by performing joint learning of all the arguments in the same proposition (for the same predicate), SRL accuracy is improved. To test the efficacy of joint learning for nominalized predicates in Chinese, we conducted a similar experiment, using the perceptron reranker described in Shen and Joshi (2004). Arguments and adjuncts of the same predicate instance (proposition) are chained together, with the joint probability being the product of the probabilities of the individual arguments, and the top K propositions are selected as the reranking candidates. When the arguments are given and the input is hand-crafted gold-standard parses from the treebank, selecting the top 10 propositions yields an oracle score of 97%. This initial promise does not pan out, however. Performing reranking on the top 10 propositions did not lead to significant improvement, using the five feature classes described in (Haghighi et al., 2005). These are features that are hard to implement for individual arguments: core argument label sequence, flattened core argument label sequence, core argument labels and phrase type sequence, repeated core argument labels with phrase types, and repeated core argument labels with phrase types and adjacency information. We speculate that the lack of improvement is due to the fact that the constraint that core (numbered) arguments should not share the same semantic role label is not as rigid for Chinese nominalized predicates as it is for English verbs. However, further error analysis is needed to substantiate this speculation.

5 Related Work
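The candidate-generation step, with the joint probability taken as the product of per-argument probabilities, can be sketched as follows; the data layout and names are illustrative assumptions, and the brute-force enumeration stands in for whatever efficient search the actual system used:

```python
import itertools

def top_k_propositions(arg_candidates, k=10):
    """Enumerate labellings for a predicate's argument slots and keep the
    top-k by joint probability, computed as the product of the per-argument
    probabilities (an independence assumption, as in the text).

    `arg_candidates` is a list with one entry per argument slot, each a
    list of (label, probability) pairs. Illustrative sketch only.
    """
    propositions = []
    for combo in itertools.product(*arg_candidates):
        joint = 1.0
        for _, p in combo:
            joint *= p
        propositions.append(([label for label, _ in combo], joint))
    propositions.sort(key=lambda x: x[1], reverse=True)
    return propositions[:k]
```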
Following previous work, we regard Chinese SRL as a sequence labeling task, which assigns a label to each word in the sequence. To identify the boundary information of semantic roles, we adopt the IOBES tagging schema for the labels, as shown in Figure 1. For sequence labeling, it is important to capture dependencies in the sequence, especially for SRL, where the semantic role label for a word not only relies on its local information but is also determined by long-range dependencies on other words. The advantage of an RNN is its ability to better capture contextual information, which is beneficial for capturing dependencies in SRL. Moreover, we enrich the basic RNN model with a bidirectional LSTM RNN, which can model bidirectional and long-range dependencies simultaneously.
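The IOBES schema marks each token as Beginning, Inside, or Ending a multi-token role span, as a Single-token span, or as Outside any span. A minimal sketch of the conversion, assuming non-overlapping spans (the function name and span representation are illustrative):

```python
def spans_to_iobes(length, spans):
    """Convert labelled argument spans to per-token IOBES tags.

    `spans` is a list of (start, end, role) with `end` exclusive.
    Assumes spans do not overlap; illustrative sketch only.
    """
    tags = ["O"] * length
    for start, end, role in spans:
        if end - start == 1:
            tags[start] = "S-" + role          # single-token span
        else:
            tags[start] = "B-" + role          # span beginning
            for i in range(start + 1, end - 1):
                tags[i] = "I-" + role          # span interior
            tags[end - 1] = "E-" + role        # span end
    return tags
```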
Our approach to resolving PP-attachment ambiguity is based on a Support Vector Machines learner (Cortes and Vapnik, 1995). The feature set contains complex information extracted automatically from candidate syntax trees generated by parsing (Charniak, 2000), trees that will be improved by more accurate PP-attachment decisions. Some of these features have proven effective for semantic information labeling (Gildea and Jurafsky, 2002). The feature set also includes unsupervised information obtained from a very large corpus (the World Wide Web). Features containing manually annotated semantic information about the verb and about the objects of the verb have also been used. We adopted the standard approach of distinguishing between verb and noun attachment; thus the classifier has to choose between two classes: V when the prepositional phrase is attached to the verb, and N when the prepositional phrase is attached to the preceding head noun.

3 Data
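The binary V/N decision can be sketched with a minimal linear SVM trained by Pegasos-style sub-gradient descent, as a stand-in for the full SVM learner; the feature vectors here are toy placeholders, nothing like the rich syntactic and Web-derived features the text describes:

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Minimal linear SVM (Pegasos sub-gradient descent), standing in for
    the learner of Cortes and Vapnik (1995). X: feature vectors;
    y: +1 for verb attachment (V), -1 for noun attachment (N).
    Illustrative sketch, not the paper's system."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1 - eta * lam) * wj for wj in w]   # regularization step
            if margin < 1:                            # hinge-loss step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict_attachment(w, x):
    """Return 'V' or 'N' from the sign of the decision function."""
    score = sum(wj * xj for wj, xj in zip(w, x))
    return "V" if score >= 0 else "N"
```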
However, these methods cannot model the internal correlations among labels. To capture such correlations, subsequent work, including ML-DT (Clare and King, 2001), Rank-SVM (Elisseeff and Weston, 2002), LP (Tsoumakas and Katakis, 2006), ML-KNN (Zhang and Zhou, 2007), and CC (Read et al., 2011), attempted to capture this relationship; these methods demonstrated improvements but captured only low-order correlations. A milestone in this field is the application of sequence-to-sequence learning to multi-label text classification (Nam et al., 2017). Sequence-to-sequence learning concerns the transformation of one type of sequence into another, and its most common architecture is the attention-based sequence-to-sequence (Seq2Seq) model. The Seq2Seq model (Sutskever et al., 2014) was initially designed for neural machine translation (NMT), and was later extended with attention (Bahdanau et al., 2014; Luong et al., 2015). Seq2Seq is able to encode a given source text and decode the representation into a new sequence that approximates the target text, and with the attention mechanism, the decoder can extract vital source-side information to improve the quality of decoding. Multi-label text classification can be regarded as the prediction of a target label sequence given a source text, which can be modeled by Seq2Seq. Moreover, with deep recurrent neural networks (RNNs), this approach can model the high-order correlations within the source text as well as those within the label sequence.
Based on these error types, a new reranker-centered evaluation method can be defined. Indeed, using this categorization, the reranker can be seen as a binary classifier: from this perspective, the job of the reranker is to detect the sentences for which the CRF answer was wrong, and leave the right ones as they are. Thus, for every sentence, either the answer is changed (positive instances) or not (negative instances). With this idea in mind, the four main categories can be translated to the standard true/false positive/negative categories in a straightforward way: if the reranker changes the answer correctly (WR), the instance is a true positive; if it changes the answer incorrectly (RW), the instance is a false positive; and similarly for the last two cases: RR and WW correspond respectively to true negative and false negative instances (the former was rightly not changed, and the latter should have been changed). Thanks to this interpretation, performance measures like precision, recall and F-score can be calculated for the semantic reranker, independently of the performance of the CRF component. An example of using such performance scores is given in Table 5.
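Under this mapping, the standard measures follow directly from the four category counts; a minimal sketch (the function name is illustrative):

```python
def reranker_scores(wr, rw, rr, ww):
    """Precision, recall and F-score for the reranker viewed as a binary
    classifier over sentences: WR = true positive (a wrong answer correctly
    changed), RW = false positive, RR = true negative, WW = false negative.
    The RR count does not enter precision/recall, as expected."""
    precision = wr / (wr + rw) if wr + rw else 0.0
    recall = wr / (wr + ww) if wr + ww else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```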
The TRIPS component WordFinder can construct lexical entries for words not explicitly found in the core lexicon, using a mapping between WordNet and the TRIPS ontology. This mechanism provides broad coverage of words in general use. However, certain “everyday” words have specialized usage in biology. For instance, “association” is not just a vague relationship but a specific kind of binding between molecules. Other words are used in idiosyncratic constructions. For instance, “the protein localizes to the nucleus”, which means the protein exists in the nucleus, required a novel syntactic template (and semantic characterization). These words pose particular difficulties for our system, as our automatically derived general constructions would be inadequate. For such cases we often have to provide hand-tailored lexical entries with appropriate syntactic templates and semantic restrictions to distinguish the everyday and biological senses of the words.
Syntax-based statistical machine translation (SSMT) has achieved significant progress during recent years (Galley et al., 2006; May and Knight, 2007; Liu et al., 2006; Huang et al., 2006), showing that deep linguistic knowledge, if used properly, can improve MT performance. Semantics-based SMT, as a natural extension of SSMT, has begun to receive more attention from researchers (Liu and Gildea, 2008; Wu and Fung, 2009). Semantic structures have two major advantages over syntactic structures in terms of helping machine translation. First of all, semantic roles tend to agree better between two languages than syntactic constituents do (Fung et al., 2006). This property motivates the approach of using the consistency of semantic roles to select MT outputs (Wu and Fung, 2009). Secondly, the set of semantic roles of a predicate models the skeleton of a sentence, which is crucial to the readability of MT output. By skeleton, we mean the main structure of a sentence, including the verbs and their arguments. In spite of the theoretical potential of semantic roles, there has not been much success in using them to improve SMT systems.
Significant numbers of prepositional phrases (PPs) in the Penn treebank are tagged with their semantic role relative to the governing verb. For example, Figure 1 shows a fragment of the parse tree for the sentence [Japan’s reserves of gold, convertible foreign currencies, and special drawing rights] fell by a hefty $1.82 billion in October to $84.29 billion [the Finance Ministry said], in which the three PPs governed by the verb fell are tagged as, respectively: PP-EXT (“extent”), meaning how much the reserves fell; PP-TMP (“temporal”), meaning when the reserves fell; and PP-DIR (“direction”), meaning the direction of the fall. According to our analysis, there are 143 preposition semantic roles in the treebank. However, many of these semantic roles are very similar to one another; for example, the following semantic roles were found in the treebank: PP-LOC, PP-LOC-1, PP-LOC-2, PP-LOC-3, PP-LOC-4, PP-LOC-5, PP-LOC-CLR, PP-LOC-CLR-2, PP-LOC-CLR-TPC-1. Inspection of the data revealed no systematic semantic differences between these PP types. Indeed, for most PPs, it was impossible to distinguish the subtypes of a given superclass (e.g. PP-LOC in our example). We therefore decided to collapse the PP semantic roles based on their first semantic feature. For example, all semantic roles that start with PP-LOC are collapsed to the single class PP-LOC. Table 1 shows the distribution of the collapsed preposition semantic roles.
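The collapsing rule, keeping only the PP prefix plus the first semantic feature, can be sketched in a few lines (the function name is illustrative):

```python
def collapse_pp_role(tag):
    """Collapse a treebank PP semantic role to its first semantic feature,
    e.g. PP-LOC-CLR-2 -> PP-LOC and PP-LOC-CLR-TPC-1 -> PP-LOC.
    Tags without a feature (bare "PP") are returned unchanged.
    A minimal sketch of the collapsing rule described in the text."""
    parts = tag.split("-")
    return "-".join(parts[:2]) if len(parts) >= 2 else tag
```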
In this paper we have presented a technique to extend an existing parser to produce richer output, annotated with function labels. We show that state-of-the-art results in both function labelling and parsing can be achieved. Applications of these results are manifold, including information extraction and question answering, where shallow semantic annotation is necessary. The technique illustrated in this paper is widely applicable to the other semantic annotation schemes available today, such as PropBank and FrameNet, and can be easily extended. Work to extend this technique to PropBank annotation is underway. Since function labels describe dependence relations between the predicative head and its complements, whether they be arguments or adjuncts, this paper suggests that a left-corner parser and its probabilistic model, which are defined entirely on configurational criteria, can be used to produce a dependency output. Consequences of this observation will be explored in future work.
We propose a joint multi-label transfer learning setting based on LSTM, and show that it can be an effective solution for multi-relational semantic similarity tasks. Due to the small size of multi-relational semantic similarity datasets and the recent success of LSTM-based sentence representations (Wieting and Gimpel, 2018; Conneau et al., 2017), the model is pre-trained on a large corpus and transfer learning is applied via fine-tuning. In our setting, the network is jointly trained on multiple relations by outputting multiple predictions (one for each relation) and aggregating the losses during back-propagation. This differs from the traditional multi-task learning setting, where the model makes one prediction at a time, switching between the tasks. We treat the multi-task setting and the single-task setting (i.e., where a separate model is learned for each relation) as baselines, and show that the multi-label setting outperforms them in many cases, achieving state-of-the-art performance on all but one relation of the Human Activity Phrase dataset (Wilson and Mihalcea, 2017).
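The loss-aggregation idea, one prediction head per relation with the per-relation losses summed into a single training signal, can be sketched with a toy linear model; this is a schematic stand-in for the LSTM-based network, and all names, the squared-error loss, and the single-step update are illustrative assumptions:

```python
def joint_step(weights, features, targets, lr=0.01):
    """One gradient step for a joint multi-label regressor: one linear
    head per relation over shared features, with the per-relation
    squared errors summed into a single loss before the update.

    `weights` and `targets` map relation names to a weight vector and a
    gold score, respectively. Illustrative sketch only.
    """
    total_loss = 0.0
    grads = {}
    for rel, w in weights.items():
        pred = sum(wj * xj for wj, xj in zip(w, features))
        err = pred - targets[rel]
        total_loss += err ** 2                 # losses aggregated here
        grads[rel] = [2 * err * xj for xj in features]
    new_weights = {rel: [wj - lr * g for wj, g in zip(weights[rel], grads[rel])]
                   for rel in weights}
    return new_weights, total_loss
```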
Our news tweet annotation approach consists of four steps. First, we submit hot queries to Twitter and obtain a list of tweets for each query. Second, for each list of tweets, we single out news excerpts using heuristic rules and remove them from the list, conduct SRL on the news excerpts using SRL-BS, and cluster them according to similarity in content and predicate-argument structure. Third, for each list of tweets, we try to merge every remaining tweet into one of the news excerpt clusters according to its content similarity to the cluster. Tweets that can be placed into a news group are regarded as news tweets. Finally, the semantic structures of the news excerpts are passed to the news tweets in the same group through word alignment.
It can be seen that the frame assignment accuracy is relatively low for all three texts (between 37% and 47%). However, only a relatively small proportion of the misclassifications are due to true errors made by the system. Furthermore, a large proportion of the errors (41% to 48%, with an average of 46.8%) is due to cases where important information is missing from FrameNet (Type (iii) errors). Consequently, improving the semantic role labeller by optimising the feature space or the machine learning framework is going to have very little effect. A much more promising path would be to investigate methods that might enable the SRL system to deal gracefully with unseen data. One possible strategy is discussed in the next section.
In a second step, each argument candidate is labelled with a semantic role. Every SRL system has a classification model, which can be of one of two types: an independent model or a joint model. While an independent model decides the label of each argument candidate independently of the other candidates, a joint model finds the best overall labelling of all candidates in the sentence at the same time. Independent models are fast but are prone to inconsistencies such as overlapping, repeated, or missing arguments. For example, Figure 6 shows some of these inconsistencies arising when analyzing the Vietnamese sentence Do học chăm, Nam đã đạt thành tích cao (By studying hard, Nam got a high achievement).
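The argument-overlap inconsistency that an independent model can produce is easy to detect mechanically; a minimal sketch, assuming spans are given as (start, end) token offsets with the end exclusive (the function name and representation are illustrative):

```python
def has_overlap(spans):
    """Detect the argument-overlap inconsistency: two predicted argument
    spans sharing at least one token. Spans are (start, end) with `end`
    exclusive; after sorting, it suffices to compare adjacent spans."""
    spans = sorted(spans)
    return any(a_end > b_start
               for (_, a_end), (b_start, _) in zip(spans, spans[1:]))
```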
This paper investigates a new problem: grounded semantic role labeling. Besides semantic roles explicitly mentioned in language descriptions, our approach also grounds implicit roles that are not explicitly specified. As implicit roles also capture important participants related to an action (e.g., tools used in the action), our approach provides a more complete representation of action semantics, which can be used by artificial agents for further reasoning and planning in the physical world. Our empirical results on a complex cooking domain have shown that, by incorporating semantic role information with visual features, our approach achieves better performance than baseline approaches. Our results have also shown that grounded semantic role labeling is a challenging problem that often depends on the quality of automated visual processing (e.g., object tracking and recognition).
Semantic databases are a stable starting point for developing knowledge-based systems. Since creating language resources demands substantial time, funding and human effort, a possible solution is to import a resource annotation from one language to another. This paper presents the creation of a semantic role database for Romanian, starting from the English FrameNet semantic resource. The intuition behind the importing program is that most of the frames defined in the English FN are likely to be valid cross-lingually, since semantic frames express conceptual structures that are language-independent at the deep structure level. The surface realization, by contrast, is carried out according to each language's syntactic constraints. In the paper we present the advantages of choosing to import the English FrameNet annotation instead of annotating a new corpus. We also take into account the mismatches encountered in the validation process. The rules created to manage particular situations are used to improve the import program. We believe the information and argumentation in this paper could be of interest to those who wish to develop FrameNet-like systems for other languages.
With the impressive success of deep neural networks in various NLP tasks (Zhang et al., 2016; Qin et al., 2017; Cai et al., 2017), a series of neural SRL systems have been proposed. Foland and Martin (2015) presented a dependency semantic role labeler using convolutional and time-domain neural networks, while FitzGerald et al. (2015) exploited neural networks to jointly embed arguments and semantic roles, akin to the work of Lei et al. (2015), which induced a compact feature representation using a tensor-based approach. Recently, researchers have considered multiple ways to effectively integrate syntax into SRL learning. Roth and Lapata (2016) introduced dependency path embeddings to model syntactic information, with notable success. Marcheggiani and Titov (2017) leveraged graph convolutional networks to incorporate syntax into neural models. Differently, Marcheggiani et al. (2017) proposed a