To solve these problems, we propose a convolution-enhanced bilingual recursive neural network (ConvBRNN), which exploits word alignments to guide the generation of phrase structures and then integrates embeddings of different linguistic units on the phrase structures into bilingual semantic modeling. Specifically, we develop a new recursive neural network in which the composition criterion for tree construction is the degree of consistency with word alignments rather than the reconstruction error. Furthermore, we propose a variant of the tree-based convolutional neural network (Mou et al., 2015) to fully access all embeddings on the phrase structures, which can be used to produce better phrase representations (see Section 3.2). All of this makes ConvBRNN more suitable for the subsequent bilingual semantic modeling, where a bilinear model is introduced to interact with and compare the source and target phrase representations in terms of their degree of semantic equivalence. To train our model, we introduce two max-margin losses: one for the bilingual semantic structure inference and the other for the semantic similarity model, both of which are differentiable.
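As a rough, hypothetical illustration of the bilinear comparison and max-margin objective described above, the following NumPy sketch scores a source/target phrase pair and penalizes a sampled negative target that outscores the correct one; the parameter names (`B`, `margin`) and dimensions are assumptions, not the paper's actual settings.

```python
import numpy as np

d = 50                                   # assumed phrase-embedding dimension
rng = np.random.default_rng(0)
B = rng.normal(scale=0.1, size=(d, d))   # bilinear interaction matrix (hypothetical name)

def similarity(src, tgt):
    """Bilinear score of how semantically equivalent two phrase vectors are."""
    return float(src @ B @ tgt)

def max_margin_loss(src, tgt_pos, tgt_neg, margin=1.0):
    """Require the correct target phrase to outscore a sampled negative by the margin."""
    return max(0.0, margin - similarity(src, tgt_pos) + similarity(src, tgt_neg))
```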
In this paper, we proposed a new neural network architecture, the Inside-Outside Recursive Neural Network (IORNN), that can process trees both bottom-up and top-down. The key idea is to extend the RNN such that every node in the tree has two vectors associated with it: an inner representation for its content and an outer representation for its context. The inner and outer representations of any constituent can be computed simultaneously and interact with each other. This way, information can flow top-down, bottom-up, inward, and outward. Thanks to this property, by applying the IORNN to dependency parses, we have shown that using an ∞-order generative model for dependency parsing, which has never been done before, is practical.
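A minimal sketch of the inner/outer idea on a binary tree, assuming simple single-matrix combination functions; the parameter names, shapes, and exact formulas below are placeholders rather than the IORNN's actual definitions.

```python
import numpy as np

d = 50
rng = np.random.default_rng(0)
W_i = rng.normal(scale=0.1, size=(d, 2 * d))  # combines the children's inner vectors (assumed form)
W_o = rng.normal(scale=0.1, size=(d, 2 * d))  # combines parent context with sibling content (assumed form)
embed = {w: rng.normal(scale=0.1, size=d) for w in ["the", "cat", "sat"]}  # toy word embeddings

class Node:
    def __init__(self, word=None, left=None, right=None):
        self.word, self.left, self.right = word, left, right
        self.inner = None   # content representation (computed bottom-up)
        self.outer = None   # context representation (computed top-down)

def compute_inner(node):
    """Bottom-up pass: a node's inner vector summarizes what is inside its span."""
    if node.word is not None:
        node.inner = embed[node.word]
    else:
        compute_inner(node.left)
        compute_inner(node.right)
        node.inner = np.tanh(W_i @ np.concatenate([node.left.inner, node.right.inner]))

def compute_outer(node, context):
    """Top-down pass: a node's outer vector summarizes everything outside its span."""
    node.outer = context
    if node.word is None:
        # each child's context combines the parent's context with its sibling's content
        left_ctx = np.tanh(W_o @ np.concatenate([node.outer, node.right.inner]))
        right_ctx = np.tanh(W_o @ np.concatenate([node.outer, node.left.inner]))
        compute_outer(node.left, left_ctx)
        compute_outer(node.right, right_ctx)

tree = Node(left=Node(word="the"), right=Node(left=Node(word="cat"), right=Node(word="sat")))
compute_inner(tree)                 # inner vectors first, so outer vectors can reuse them
compute_outer(tree, np.zeros(d))    # the root's context is a fixed vector (zeros here)
```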
Using neural networks for syntactic parsing has become popular recently, thanks to the promising results that neural-net-based parsers have achieved. For constituent parsing, Socher et al. (2013), using a recursive neural network (RNN), obtained an F1 score close to the state of the art on the Penn WSJ corpus. For dependency parsing, the inside-outside recursive neural network (IORNN) reranker proposed by Le and Zuidema (2014) is among the top systems, which include Chen and Manning (2014)'s extremely fast transition-based parser employing a traditional feed-forward neural network.
The word order between source and target languages significantly influences translation quality in machine translation. Preordering can effectively address this problem. Previous preordering methods require manual feature design, which makes language-dependent development costly. In this paper, we propose a preordering method with a recursive neural network that learns features from raw inputs. Experiments show that the proposed method achieves gains in translation quality comparable to the state-of-the-art method, but without manual feature design.
Recently, many neural network models have been applied to Chinese word segmentation. However, such models focus more on collecting local information, while long-distance dependencies are not well learned. To integrate local features with long-distance dependencies, we propose a dependency-based gated recursive neural network. Local features are first collected by a bi-directional long short-term memory network, then combined and refined into long-distance dependencies via a gated recursive neural network. Experimental results show that our model is a competitive model for Chinese word segmentation.
We propose the Adaptive Recursive Neural Network (AdaRNN) for target-dependent Twitter sentiment classification. AdaRNN employs more than one composition function and adaptively chooses among them depending on the context and linguistic tags. For a given tweet, we first convert its dependency tree for the target of interest. Next, AdaRNN learns how to adaptively propagate the sentiments of words to the target node. AdaRNN enables the sentiment propagations to be sensitive to both linguistic and semantic categories by using different compositions. The experimental results illustrate that AdaRNN improves over the baselines without hand-crafted rules.
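The adaptive choice among composition functions can be pictured, very roughly, as a gated mixture: a small selector looks at the children (and, in the actual model, also linguistic tags) and softly weights several candidate composition matrices. The sketch below is only a schematic reading of that idea with invented parameter names; it omits the paper's exact gating inputs.

```python
import numpy as np

d, n_comp = 50, 3                                    # assumed dimension and number of composition functions
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(n_comp, d, 2 * d))   # one candidate composition matrix per function
S = rng.normal(scale=0.1, size=(n_comp, 2 * d))      # selector producing mixture weights (hypothetical)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def adaptive_compose(left, right):
    """Softly choose among the candidate compositions based on the children, then mix them."""
    c = np.concatenate([left, right])
    weights = softmax(S @ c)                         # how much each composition function contributes
    return np.tanh(np.einsum('k,kij,j->i', weights, W, c))
```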
Recursive neural models utilize the recursive structure (usually a parse tree) of a phrase or sentence for semantic composition. In the Recursive Neural Network (Socher et al., 2011), the tree with the least reconstruction error is built, and the vectors for interior nodes are composed by a global matrix. The Matrix-Vector Recursive Neural Network (MV-RNN) (Socher et al., 2012) assigns a matrix to every word so that it can capture the relationship between two children. In Recursive Neural Tensor Networks (RNTN) (Socher et al., 2013b), the composition process is performed on a parse tree in which every node is annotated with fine-grained sentiment labels, and a global tensor is used for composition. Adaptive Multi-Compositionality (Dong et al., 2014) uses multiple weighted composition matrices instead of sharing a single matrix.
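To make the contrast concrete, here is a minimal sketch of the two composition styles mentioned above: a single global matrix over the concatenated children (plain recursive composition) versus an additional global tensor whose slices give each output dimension its own bilinear interaction (RNTN-style). Shapes and initialization are illustrative assumptions.

```python
import numpy as np

d = 50
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(d, 2 * d))         # global composition matrix (plain recursive net)
V = rng.normal(scale=0.1, size=(d, 2 * d, 2 * d))  # global tensor (RNTN-style), one slice per output dim
b = np.zeros(d)

def compose_matrix(c1, c2):
    """Plain composition: one shared matrix applied to the concatenated child vectors."""
    c = np.concatenate([c1, c2])
    return np.tanh(W @ c + b)

def compose_tensor(c1, c2):
    """Tensor composition: adds a bilinear term c^T V[k] c for each output dimension k."""
    c = np.concatenate([c1, c2])
    bilinear = np.einsum('i,kij,j->k', c, V, c)
    return np.tanh(bilinear + W @ c + b)
```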
We propose a method for implicit discourse relation recognition using a recursive neural network (RNN). Many previous studies have used the word-pair feature to compare the meaning of two sentences for implicit discourse relation recognition. Our proposed method differs in that we use various-sized sentence expression units and compare the meaning of the expressions between two sentences by converting the expressions into vectors using the RNN. Experiments showed that our method significantly improves the accuracy of identifying implicit discourse relations compared with the word-pair method.
based models. Kim (2014) reports 4 different CNN models using max-over-time pooling, of which CNN-non-static and CNN-multichannel are the more sophisticated. The MaxTDNN sentence model is based on the architecture of the Time-Delay Neural Network (TDNN) (Waibel et al., 1989; Collobert and Weston, 2008). The dynamic convolutional neural network (DCNN) (Kalchbrenner et al., 2014) uses the dynamic k-max pooling operator as a non-linear sub-sampling function, in which the choice of k depends on the length of the given sentence. Methods in the fourth block are RecNN-based models. The Recursive Neural Tensor Network (RecNTN) (Socher et al., 2013b) is an extension of the plain RecNN, which also depends on an external syntactic structure. The Recursive Autoencoder (RAE) (Socher et al., 2011) learns the representations of sentences by minimizing the reconstruction error. The Matrix-Vector Recursive Neural Network (MV-RecNN) (Socher et al., 2012) is an extension of RecNN that assigns a vector and a matrix to every node in the parse tree. AdaSent (Zhao et al., 2015) adopts a recursive neural network with a DAG structure.
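For reference, (dynamic) k-max pooling keeps, per feature row, the k largest activations in their original left-to-right order, with k shrinking at deeper layers as a function of the sentence length. The sketch below follows that description; the exact schedule in `dynamic_k` is an assumption patterned on the DCNN, not quoted from the paper.

```python
import numpy as np

def k_max_pooling(feature_map, k):
    """Keep, in each row, the k largest activations in their original order.

    feature_map: array of shape (n_filters, sentence_length).
    """
    idx = np.argsort(feature_map, axis=1)[:, -k:]   # column indices of the k largest values per row
    idx = np.sort(idx, axis=1)                      # restore left-to-right order
    return np.take_along_axis(feature_map, idx, axis=1)

def dynamic_k(sent_len, layer, total_layers, k_top):
    """k is proportional to the sentence length, decreases with depth, and never drops below k_top."""
    return max(k_top, int(np.ceil((total_layers - layer) / total_layers * sent_len)))
```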
Richard Socher, Brody Huval, Christopher D. Manning, and Andrew Y. Ng. 2012. Semantic compositionality through recursive matrix-vector spaces. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 1201–1211. Association for Computational Linguistics.
Figure 1 gives an illustration of how our approach models the complicated combinations of the context characters. Given the sentence “雨 (Rainy) 天 (Day) 地面 (Ground) 积水 (Accumulated water)”, the target character is “地”. This sentence is very complicated because any two consecutive characters can be combined into a word. To predict the label of the target character “地” under the given context, GRNN detects the combinations recursively from the bottom layer to the top. Then, we receive a score vector of tags by incorporating all the combination information in the network. The contributions of this paper can be summarized as follows:
This paper describes our submission to the Shared Task on Similar Language Translation at the Fourth Conference on Machine Translation (WMT 2019). We submitted three systems for the Hindi → Nepali direction, in which we examined the performance of a Recursive Neural Network (RNN) based Neural Machine Translation (NMT) system, a semi-supervised NMT system where monolingual data of both languages is utilized using the architecture of Artetxe et al. (2017), and a system trained with extra synthetic sentences generated from copies of the source and target sentences, without using any additional monolingual data.
Socher et al. (2012) propose the Matrix-Vector Recursive Neural Network (MV-RNN), where, instead of using only vectors for words, an additional matrix for each word is used to capture operator semantics in language. To apply the RNN to relation classification, they find the path in the parse tree between the two entities and apply compositions bottom-up. Hashimoto et al. (2013) follow the same design but introduce a different composition function. They make use of word-POS pairs and use untied weights based on the phrase categories of the pair.
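Schematically, the matrix-vector scheme pairs each word (or node) with both a vector, its meaning, and a matrix, how it operates on a neighbor; when two nodes combine, each child's matrix is first applied to its sibling's vector before the usual composition, and the matrices themselves are also composed. The sketch below is a generic rendering of that idea with made-up parameter names, not the paper's exact equations.

```python
import numpy as np

d = 50
rng = np.random.default_rng(0)
W_v = rng.normal(scale=0.1, size=(d, 2 * d))   # composes the two transformed child vectors
W_m = rng.normal(scale=0.1, size=(d, 2 * d))   # composes the two child matrices

def mv_compose(a, A, b, B):
    """Combine (vector, matrix) pairs: each matrix first modifies the sibling's vector."""
    parent_vec = np.tanh(W_v @ np.concatenate([B @ a, A @ b]))
    parent_mat = W_m @ np.vstack([A, B])       # (d, 2d) @ (2d, d) -> (d, d)
    return parent_vec, parent_mat
```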
Text classification methods for tasks like factoid question answering typically use manually defined string matching rules or bag-of-words representations. These methods are ineffective when question text contains very few individual words (e.g., named entities) that are indicative of the answer. We introduce a recursive neural network (rnn) model that can reason over such input by modeling textual compositionality. We apply our model, qanta, to a dataset of questions from a trivia competition called quiz bowl. Unlike previous rnn models, qanta learns word and phrase-level representations that combine across sentences to reason about entities. The model outperforms multiple baselines and, when combined with information retrieval methods, rivals the best human players.
Recently, neural network based dependency parsing has attracted much interest, as it can effectively alleviate the problems of data sparsity and feature engineering by using dense features. However, it is still a challenging problem to sufficiently model the complicated syntactic and semantic compositions of the dense features in neural network based methods. In this paper, we propose two heterogeneous gated recursive neural networks: a tree-structured gated recursive neural network (Tree-GRNN) and a directed acyclic graph structured gated recursive neural network (DAG-GRNN). We then integrate them to automatically learn the compositions of the dense features for transition-based dependency parsing. Specifically, Tree-GRNN models the feature combinations for the trees in the stack, which already have partial dependency structures. DAG-GRNN models the feature combinations of the nodes whose dependency relations have not been built yet. Experimental results on two prevalent benchmark datasets (PTB3 and CTB5) show the effectiveness of our proposed model.
Neural Inside-Outside Parsers The Inside-Outside Recursive Neural Network (IORNN) (Le and Zuidema, 2014) is closest to ours. It is a graph-based dependency parser that uses beam search and can reliably find accurate parses when retaining a k-best list. In contrast, our model produces the most likely parse given the learned compatibility of the constituents. The Neural CRF Parser (Durrett and Klein, 2015), similar to DIORA, performs exact inference on the structure of a sentence, although it requires a set of grammar rules and labeled parse trees during training. DIORA, like Liu et al. (2018), has a single grammar rule that applies to any pair of constituents and does not use structural supervision.
Recently, deep learning techniques have been widely used for exploring the semantic representations behind complex structures. This provides us with an opportunity to model the ADP structure in a neural network framework. Thus, we propose a dependency-based framework where two neural networks are used to model shortest dependency paths and dependency subtrees separately. One convolutional neural network (CNN) is applied over the shortest dependency path, because CNNs are suitable for capturing the most useful features in a flat structure. A recursive neural network (RNN) is used for extracting semantic representations from the dependency subtrees, since RNNs are good at modeling hierarchical structures. To connect these two networks, each word on the shortest
Negation words, such as no and not, play a fundamental role in modifying the sentiment of textual expressions. We will refer to a negation word as the negator and the text span within the scope of the negator as the argument. Commonly used heuristics to estimate the sentiment of negated expressions rely simply on the sentiment of the argument (and not on the negator or the argument itself). We use a sentiment treebank to show that these existing heuristics are poor estimators of sentiment. We then modify these heuristics to be dependent on the negators and show that this improves prediction. Next, we evaluate a recently proposed composition model (Socher et al., 2013) that relies on both the negator and the argument. This model learns the syntax and semantics of the negator's argument with a recursive neural network. We show that this approach performs better than those mentioned above. In addition, we explicitly incorporate the prior sentiment of the argument and observe that this information can help reduce fitting errors.
Fine-grained opinion analysis aims to extract aspect and opinion terms from each sentence for opinion summarization. Supervised learning methods have proven to be effective for this task. However, in many domains, the lack of labeled data hinders the learning of a precise extraction model. In this case, unsupervised domain adaptation methods are desired to transfer knowledge from the source domain to any unlabeled target domain. In this paper, we develop a novel recursive neural network that can effectively reduce domain shift at the word level through syntactic relations. We treat these relations as invariant "pivot information" across domains to build structural correspondences and generate an auxiliary task to predict the relation between any two adjacent words in the dependency tree. In the end, we demonstrate state-of-the-art results on three benchmark datasets.
combination of a recursive neural network and a recurrent neural network, and in turn integrates their respective capabilities: (1) new information can be used to generate the next hidden state, as in recurrent neural networks, so that the language model and the translation model can be integrated naturally; (2) a tree structure can be built, as in recursive neural networks, so as to generate the translation candidates in a bottom-up manner. A semi-supervised training approach is proposed to train the parameters, and the phrase pair embedding is explored to model translation confidence directly. Experiments on a Chinese to English translation task show that our proposed R2NN can outperform the state-