Language Modeling for Code Mixing: The Role of Linguistic Theory based Synthetic Data

Academic year: 2020

Figure

Figure 1: Parse trees of a pair of equivalent (a) English and (b) Spanish sentences, with corresponding hierarchical structure (due to production rules), internal nodes (non-terminal categories) and leaf nodes (terminal symbols), and parse trees of (c) incorrectly code-mixed and (d) correctly code-mixed variants of these sentences (as per the EC theory).
Figure 2: (a) The parse of an English sentence as per Stanford CoreNLP. This parse is projected onto the parallel Spanish sentence Lo hará and modified during this process, to produce corresponding (b) English and (c) Spanish parse trees.
Table 1: Size of the datasets. Numbers in parenthesis show the vocabulary size, i.e., the no
Figure 4: Scatter plot of fractional increase in word frequency in gCM (y-axis) vs original frequency (x-axis).