• No results found

L5: Word Embeddings (I)

N/A
N/A
Protected

Academic year: 2021

Share "L5: Word Embeddings (I)"

Copied!
33
0
0

Loading.... (view fulltext now)

Full text

(1)

L5: Word Embeddings (I)

Spring 2021

COS 484/584

(Advanced) Natural Language Processing

(2)

Word embeddings

Representing words as vectors

Distributional hypothesis

PPMI vector models

word2vec

What is it?

How does it work?

More variants

Evaluation

Wednesday

lecture

(3)

Before we get started…

Zoom poll:

All the questions & answers will be on the slides

We will ask you to choose A/B/C/D in the poll

We heard that people like having more polls :)

In-lecture questions:

Q: What is the dimension of W?

Please type in chat if you have the answer!

In the meantime, feel free to ask any question you may have in the chat

Question from feedback: 484 and 584 will be graded on different curves

(4)
(5)

Recap

n-gram models

Naive Bayes

<latexit sha1_base64="SHc94MUjFBUuu2Tns4OFs7TvySc=">AAAC6HicbZLLbhMxFIY900JDuIWyZGO1QipCima6AJaBblgGmqSVMlHkcU4aU19G9pkq0ShPwKaLIsSWR2LXp6GeTFQ1lyNZOv7Pd+zflzSTwmEU3Qbhzu6jx3u1J/Wnz56/eNl4td9zJrccutxIY89T5kAKDV0UKOE8s8BUKuEsvTwp62dXYJ0wuoOzDAaKXWgxFpyhl4aN/+2jBGGKBU6AepE6P4ym5VQxnL+jCcsya6b0ATiniRIjWs1PO5+/d0oOhQJ3j/m1VrCybQNyaxCvdlyFjF5h3DZm3ZTv2WDUNkfDxmHUjBZBN5N4mRy2DpL3N7etWXvY+JeMDM8VaOSSOdePowwHBbMouIR5PckdZIxfsgvo+1Qzb2FQLB5qTt96ZUTHxvqhkS7Uhx0FU87NVOpJ73bi1muluK3Wz3H8aVAIneUImlcbjXNJ0dDy1elIWOAoZz5h3ArvlfIJs4yj/xt1fwnx+pE3k95xM/7QjL752/hCqqiRN+SAHJGYfCQt8pW0SZfwAIKfwU3wK/wRXoe/wz8VGgbLntdkJcK/d74H76U=</latexit>

P (the cat sat on the mat)

⇡ P (the | START) ⇥ P (cat | the) ⇥ P (sat | cat) ⇥

P (on

| sat) ⇥ P (the | on) ⇥ P (mat | the)

(6)

Recap

Q: What is the notion of “word representations” in these models?

(7)

Representing words as discrete symbols

In traditional NLP, we regard words as discrete symbols:

hotel

,

conference

,

motel

— a localist representation

Words can be represented by

one-hot

vectors:

one 1, the rest 0’s

Vector dimension = number of words in vocabulary (e.g., 500,000)

hotel = [0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0]

motel = [0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0]

(8)

Why is this representation not good?

If we use

word identity

as features,

it requires exact same word to be in training and test

hotel = [0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0]

motel = [0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0]

Training

Test

hotel = [35, 22, 17, ..]

motel = [34, 21, 14, ..]

Training

Test

If we use

word vectors

as features,

We can generalize to similar but unseen words at testing time!!!

(9)

How do we know the meaning of a word?

You can look up the word in a dictionary/thesaurus!

The meaning of words can be defined by other words!

http://wordnetweb.princeton.edu/

“tejuino”

Key idea: you can know the meaning of a word by looking at its context words

“Princeton”

1. a university town in central New Jersey

(10)

Representing words by their context

Distributional hypothesis: words that occur in similar contexts tend to have

similar meanings

J.R.Firth 1957

“You shall know a word by the company it keeps”

One of the most successful ideas of modern statistical

NLP!

These context words will represent “banking”.

When a word w appears in a text, its context is the set of words that appear nearby

(within a fixed-size window).

(11)

Distributional hypothesis

“tejuino”

C1: A bottle of ___ is on the table.

C2: Everybody likes ___.

C3: Don’t have ___ before you drive.

C4: We make ___ out of corn.

(12)

C1

C2

C3

C4

tejuino

1

1

1

1

loud

0

0

0

0

motor-oil

1

0

0

0

tortillas

0

1

0

1

choices

0

1

0

0

wine

1

1

1

0

C1: A bottle of ___ is on the table.

C2: Everybody likes ___.

C3: Don’t have ___ before you drive.

C4: We make ___ out of corn.

“words that occur in similar contexts tend to have similar meanings”

Distributional hypothesis

(13)

Words as vectors

We’ll build a new model of meaning focusing on similarity

Each word is a vector

Similar words are “nearby in space”

A first solution: we can just use word-word co-occurrence counts to represent the meaning of words!

context words:

4 words to the left,

4 words to the right

Q: What is the dimension of these vectors?

(14)

Words as vectors

context words: 4

words to the left, 4

words to the right

C1

C2

C3

C4

tejuino

1

1

1

1

loud

0

0

0

0

motor-oil

1

0

0

0

tortillas

0

1

0

1

choices

0

1

0

0

wine

1

1

1

0

C1: A bottle of ___ is on the table.

C2: Everybody likes ___.

C3: Don’t have ___ before you drive.

C4: We make ___ out of corn.

vs.

Using is too sparse.

(15)

Measuring similarity

<latexit sha1_base64="y31rrPAo+I1V+TGKi/3m/PqV3XU=">AAACZnicdZFLSwMxFIUz46vWV62ICzfBIihImRFRNwXRjUsF+4BOHTJppg1mHuZRKGn+pDvXbvwZZtoKPi8EDue7l5ucRDmjQnreq+MuLC4tr5RWy2vrG5tble1qS2SKY9LEGct4J0KCMJqSpqSSkU7OCUoiRtrR003B2yPCBc3SBznOSS9Bg5TGFCNprbBiApyJoyBBchjFWpkT+KlH5hg2YBBzhHUgVBJq2vDNo560JkarkMJRSI2x6JnLPxoeTwv8Dx3NqAkrNa/uTQv+Fv5c1MC87sLKS9DPsEpIKjFDQnR9L5c9jbikmBFTDpQgOcJPaEC6VqYoIaKnpzEZeGidPowzbk8q4dT9OqFRIsQ4iWxnkYH4yQrzL9ZVMr7saZrmSpIUzxbFikGZwSJz2KecYMnGViDMqb0rxENkg5X2Z8o2BP/nk3+L1mndP69792e1q+t5HCWwDw7AEfDBBbgCt+AONAEGb86qU3V2nHd3091192atrjOf2QHfyoUfgHW8lg==</latexit>

cos(u, v) =

P

|V |

i=1

u

i

v

i

qP

|V |

i=1

u

2

i

qP

|V |

i=1

v

i

2

<latexit sha1_base64="JMbUOvQBplAe8JCwsQJEixz8Dus=">AAACSXicbVDLSgMxFM20Pmp9VV26CRahgpQZEXUjFN24rGAf0BlqJs20oZnJkGQKZTq/58adO//BjQtFXJlpK62tBwLnnnMv9+a4IaNSmearkcmurK6t5zbym1vbO7uFvf265JHApIY546LpIkkYDUhNUcVIMxQE+S4jDbd/m/qNARGS8uBBDUPi+KgbUI9ipLTULjzamMuS7SPVc704Sk7hLx8kJ/Aa2p5AOJ750MYdruaaktgezWx7BGflQJdJu1A0y+YYcJlYU1IEU1TbhRe7w3Hkk0BhhqRsWWaonBgJRTEjSd6OJAkR7qMuaWkaIJ9IJx4nkcBjrXSgx4V+gYJjdX4iRr6UQ9/VnemRctFLxf+8VqS8KyemQRgpEuDJIi9iUHGYxgo7VBCs2FAThAXVt0LcQzo7pcPP6xCsxS8vk/pZ2boom/fnxcrNNI4cOARHoAQscAkq4A5UQQ1g8ATewAf4NJ6Nd+PL+J60ZozpzAH4g0z2B6XGtT8=</latexit>

cos(u, v) =

u

· v

kukkvk

Q: Why cosine similarity instead of dot product

u ⋅ v

?

A common similarity metric:

cosine

of the angle

between the two vectors (the larger, the more

(16)

Zoom poll

What is the range of

cos(u, v)

in this model?

(a) [-1, 1]

(b) [0, 1]

(c) [0,

)

(d) (

−∞ +∞

+∞

,

)

<latexit sha1_base64="y31rrPAo+I1V+TGKi/3m/PqV3XU=">AAACZnicdZFLSwMxFIUz46vWV62ICzfBIihImRFRNwXRjUsF+4BOHTJppg1mHuZRKGn+pDvXbvwZZtoKPi8EDue7l5ucRDmjQnreq+MuLC4tr5RWy2vrG5tble1qS2SKY9LEGct4J0KCMJqSpqSSkU7OCUoiRtrR003B2yPCBc3SBznOSS9Bg5TGFCNprbBiApyJoyBBchjFWpkT+KlH5hg2YBBzhHUgVBJq2vDNo560JkarkMJRSI2x6JnLPxoeTwv8Dx3NqAkrNa/uTQv+Fv5c1MC87sLKS9DPsEpIKjFDQnR9L5c9jbikmBFTDpQgOcJPaEC6VqYoIaKnpzEZeGidPowzbk8q4dT9OqFRIsQ4iWxnkYH4yQrzL9ZVMr7saZrmSpIUzxbFikGZwSJz2KecYMnGViDMqb0rxENkg5X2Z8o2BP/nk3+L1mndP69792e1q+t5HCWwDw7AEfDBBbgCt+AONAEGb86qU3V2nHd3091192atrjOf2QHfyoUfgHW8lg==</latexit>

cos(u, v) =

P

|V |

i=1

u

i

v

i

qP

|V |

i=1

u

2

i

qP

|V |

i=1

v

i

2

The answer is (b). Cosine similarity ranges between -1 and 1. In this model, all the values of , are

non-negative.

(17)

Any issues with this model?

Raw frequency count is a bad representation!

Frequency is clearly useful; if “

pie

” appears a lot near “

cherry

”,

that's useful information.

But overly frequent words like “

the

”, “

it

", or “

they

” also appear a lot

near “

cherry

”. They are not very informative about the context.

Solution: use a weighting function instead of raw counts!

Pointwise Mutual Information (PMI):

Do events x and y co-occur more or less than if they were independent?

<latexit sha1_base64="0tusD2XDzO40FlsKmJyJKaQ2f1k=">AAACIHicbZC7SgNBFIZn4y3G26pgYzMYhQQk7KYwNkLQRgthBXOBbAizk9k4ZPbCzKwYln0TbXwOOxsLRbTTJ7DzFZxNUmjigYGP/z+HM+d3QkaFNIwPLTMzOze/kF3MLS2vrK7p6xt1EUQckxoOWMCbDhKEUZ/UJJWMNENOkOcw0nD6J6nfuCZc0MC/lIOQtD3U86lLMZJK6ugV20PyinuxdX6WFG724aAIj6DNgl4nLifQdjnCsTUykhSK0Coo7Oh5o2QMC06DOYZ8tfr99bB1u2t19He7G+DII77EDAnRMo1QtmPEJcWMJDk7EiREuI96pKXQRx4R7Xh4YAL3lNKFbsDV8yUcqr8nYuQJMfAc1ZmeIya9VPzPa0XSPWzH1A8jSXw8WuRGDMoApmnBLuUESzZQgDCn6q8QXyGViVSZ5lQI5uTJ01Avl8yDknGh0jgGo8qCbbADCsAEFVAFp8ACNYDBHXgEz+BFu9eetFftbdSa0cYzm+BPaZ8/UUqk6Q==</latexit>

PMI(x, y) = log

2

P (x, y)

P (x)P (y)

<latexit sha1_base64="rCtVbjp02gGre7DJ+I/SdJxYZnw=">AAACaHicfVHbSuQwGE6ru+q4u9bjIt4EXWEEGdq5UG+EQW/0QpgFR4XpUNJMOgbTpiR/1aEUX8RH8Sm88wEE8cZXMHO4GEfxh8CX70CSL2EquAbXfbLsickfP6emZ0qzv37/mXPmF860zBRlDSqFVBch0UzwhDWAg2AXqWIkDgU7D68Oe/r5NVOay+QUuilrxaST8IhTAoYKnDs/JnCp4rx+clyUfWC3kN9I1S4CbxuPbqtbeB/7QnaCvFpgP1KE5vXvA8W4YQt/JIwlcDbcitsf/Bl4Q7BRq729PKzc/6sHzqPfljSLWQJUEK2bnptCKycKOBWsKPmZZimhV6TDmgYmJGa6lfeLKvCmYdo4ksqsBHCfHU3kJNa6G4fG2atFj2s98iutmUG018p5kmbAEjo4KMoEBol7reM2V4yC6BpAqOLmrpheEtMhmL8pmRK88Sd/BmfVirdTcf+bNg7QYKbRGlpHZeShXVRDR6iOGoiiZ2vWWrKWrVfbsf/aqwOrbQ0zi+jD2Ovvnbu+Tg==</latexit>

PMI(word

1

, word

2

) = log

2

P (word

1

, word

2

)

P (word

1

)P (word

2

)

(18)

Positive Pointwise Mutual Information (PPMI)

PMI ranges from

to

PMI(

) > 0

PMI(

) < 0

When one or both words are rare, there is high sampling error in their probabilities

Negative values of PMI are frequently not reliable

A simple fix: replace all the negative PMI values by 0s

−∞ +∞

w

1

, w

2

⟹ P(w

1

, w

2

) > P(w

1

)P(w

2

)

w

1

, w

2

⟹ P(w

1

, w

2

) < P(w

1

)P(w

2

)

Warning: negative PMI values may be statistically significant,

and informative in practice, if both words are quite common.

<latexit sha1_base64="3uixNOLD7TwJ9GW6Wnc47ISqZqk=">AAACfnicfVFNbxMxEPVuCw0pH0k59mKoQAm0YTeH0gtSgQs9IC1S01bKRiuvM5tY9dorexYarfZn8Et650dw66G/hAtO0kObIkay9PzmPY39Ji2ksBgEV56/tv7g4UbjUXPz8ZOnz1rtrROrS8NhwLXU5ixlFqRQMECBEs4KAyxPJZym55/n/dPvYKzQ6hhnBYxyNlEiE5yho5LWzzhnODV5FUVfj+pOjHCB1Q9txnUS7tLb136XfqBOfRFLyLATSz1Jqn5N48wwXkX/t9argi69SzjJLg1obMRkit2ktRP0gkXR+yC8ATuHH4PL6/YvFSWt3/FY8zIHhVwya4dhUOCoYgYFl1A349JCwfg5m8DQQcVysKNqEV9NXzlmTDNt3FFIF+xtR8Vya2d56pTzsOxqb07+qzcsMTsYVUIVJYLiy0FZKSlqOt8FHQsDHOXMAcaNcG+lfMpcnOg21nQhhKtfvg9O+r1wvxd8c2l8IstqkG3yknRISN6TQ/KFRGRAOPnjvfDeeG994r/29/x3S6nv3XiekzvlH/wFvgLFIA==</latexit>

PPMI(word

1

, word

2

) = max

log

2

P (word

1

, word

2

)

P (word

1

)P (word

2

)

, 0

(19)
(20)

Zoom poll

Assume that we have a text corpus of 1M tokens, we use the left 4 words and the

right 4 words as context words, what is N (the denominator for computing these

probabilities) approximately?

(a) 1M

(b) 4M

(c) 8M

(d) not enough information

(21)

Zoom poll

Which of the following statements is correct:

(a) PPMI(cherry, pie) > 0, PPMI(cherry, result) < 0, PPMI(digital, result) > 0

(b) PPMI(cherry, pie) > 0, PPMI(cherry, result) = 0, PPMI(digital, result) > 0

(c) PPMI(cherry, pie) > 0, PPMI(cherry, result) = 0, PPMI(digital, result) = 0

(d) PPMI(cherry, pie) > 0, PPMI(cherry, result) < 0, PPMI(digital, result) < 0

(22)

PPMI - A running example

<latexit sha1_base64="9ch7to3vmpOMuFlsnCDAK9aGXo0=">AAACOXicbZDLSgMxFIYz3q23qks3wSJUkDpjq3UjiG50IVSwKrSlZNLTNpi5kJwRyzCv5ca3cCe4caGIW1/ATDsLbwcCX/5zDsn/u6EUGm37yRobn5icmp6Zzc3NLywu5ZdXLnUQKQ51HshAXbtMgxQ+1FGghOtQAfNcCVfuzXHav7oFpUXgX+AghJbHer7oCs7QSO18rekx7Csvrp2dJsUmwh3GvA9KDZItOrqGApJNekBl0GvvFO2SXa5W6TY1UHF2MyhX04lKqbzfzheMMCz6F5wMCiSrWjv/2OwEPPLARy6Z1g3HDrEVM4WCS0hyzUhDyPgN60HDoM880K146DyhG0bp0G6gzPGRDtXvGzHztB54rplMferfvVT8r9eIsLvfioUfRgg+Hz3UjSTFgKYx0o5QwFEODDCuhPkr5X2mGEcTds6E4Py2/Bcud0rOXsk+rxQOj7I4ZsgaWSdF4pAqOSQnpEbqhJN78kxeyZv1YL1Y79bHaHTMynZWyY+yPr8AWByneQ==</latexit>

PMI(cherry, pie) = log

2

(0.0377/0.0415/0.0437) = 4.38

<latexit sha1_base64="e+fEK0kHYUL5iHO/4P8VlW44n/A=">AAACPXicbZBLSyNBFIWrfcb4is7STWEQImislvjYCGHcjAshglEhCaG6cpMUVj+oui2Gpv+Ym/kP7tzNxoUyuJ3tVCe98HWh4Ktzz6XqHi9S0iBjT87U9Mzs3Hxhobi4tLyyWlpbvzJhrAU0RahCfeNxA0oG0ESJCm4iDdz3FFx7t6dZ//oOtJFhcImjCDo+HwSyLwVHK3VLl22f41D7SeP8LK20Ee4xEUPQepTu0MlVg4kVptv0hKpw0N2vsCpj7JjuUQs19yAHVsscu26VHXVL5bHHFv0Kbg5lklejW3ps90IR+xCgUNyYlssi7CRcoxQK0mI7NhBxccsH0LIYcB9MJxlvn9Itq/RoP9T2BEjH6vuJhPvGjHzPOrNdzedeJn7Xa8XYP+4kMohihEBMHurHimJIsyhpT2oQqEYWuNDS/pWKIddcoA28aENwP6/8Fa72q+5hlV3UyvWfeRwFskE2SYW45IjUyS/SIE0iyAP5Q17Iq/PbeXb+Om8T65STz/wgH8r59x+pT6kZ</latexit>

PMI(cherry, result) = log

2

(0.0008/0.0415/0.0404) =

1.07

<latexit sha1_base64="DQKFvkWo54BRyQ/Ichg13Bmd6Kk=">AAACPnicbVDPSyMxGM2ou2rXXasevQTLQgWdzdRi9SCIXvQgVLAqtKVk0q81mPlB8o1YhvnLvPg3ePPoxYMiXj2aaXtw1QeBl/e9j+Q9P1bSIGP3zsTk1I+f0zOzhV9zv//MFxcWT02UaAENEalIn/vcgJIhNFCigvNYAw98BWf+5X4+P7sCbWQUnuAghnbA+6HsScHRSp1ioxVwvNBBWj86zMothGtMu7IvkatsjY7uGkyiMFulO1RF/U6lzFzGahv0H2VuZbtaGRJWZdXcsc7cGusUS7knB/1KvDEpkTHqneJdqxuJJIAQheLGND0WYzvlGqVQkBVaiYGYi0veh6alIQ/AtNNh/Iz+tUqX9iJtT4h0qH7cSHlgzCDwrTMPaz7PcvG7WTPB3lY7lWGcIIRi9FAvURQjmndJu1KDQDWwhAst7V+puOCaC7SNF2wJ3ufIX8lpxfU2XXZcLe3ujeuYIctkhZSJR2pklxyQOmkQQW7IA3kiz86t8+i8OK8j64Qz3lki/8F5ewdzYKl8</latexit>

(23)

From sparse vectors to dense vectors

Still, the vectors we get from word-word occurrence matrix are sparse

(most are 0’s) & long (vocabulary size)

Alternative: we want to represent words as short (50-300 dimensional) &

dense (real-valued) vectors

The basis of all the modern NLP systems

employees =

0

B

B

B

B

B

B

B

B

B

B

B

B

@

0.286

0.792

0.177

0.107

10.109

0.542

0.349

0.271

0.487

1

C

C

C

C

C

C

C

C

C

C

C

C

A

(24)

Why dense vectors?

Short vectors are easier to use as features in ML systems

Dense vectors may generalize better than explicit counts

Sparse vectors can’t capture high-order co-occurrence

co-occurs with “car”, co-occurs with “automobile”

They should be similar but they aren’t because “car” and “automobile” are

distinct dimensions

In practice, they work better!

(25)

How to get dense vectors?

Singular value decomposition (SVD) of PPMI weighted co-occurrence matrix

(26)

How to get dense vectors?

Singular value decomposition (SVD) of PPMI weighted co-occurrence matrix

Each row of the matrix W is a -dimensional vector

for each word

w

k

This idea originates from Latent Semantic Analysis

(Deerwester

et al., 1990)

(applied on word-document matrix)

Alternative approach: learning word vectors directly from text

Popular methods: word2vec

(Mikolov et al., 2013)

,

Glove

(Pennington et

al., 2014)

,

FastText

(Bojanowski et al., 2017)

Key idea: Instead of counting how often each word co-occurs with another word

and perform matrix factorization, we use the dense vector of to predict (a

machine learning problem!)

w

(27)

Count-based vs prediction-based word vectors

Recommended reading:

(Baroni et al., 2014)

(28)
(29)

Word embeddings

= Learned representations from text for representing words

Output:

v

cat

=

0

B

B

@

0.224

0.130

0.290

0.276

1

C

C

A

<latexit sha1_base64="ZS11t+SATcIQYaaJ4VZuEjXjz0Y=">AAACOXicbZDPShxBEMZrjIk65s8aj3poIoFcsvRsQlSIIHjxuIKrws6y9PTWro09PUN3jbgM8wx5m1zyFt4ELx4U8ZoXSM+uiFE/aPj4VRVd9SW5Vo44vwhmXs2+fjM3vxAuvn33/kNj6eOByworsSMzndmjRDjUymCHFGk8yi2KNNF4mJzs1PXDU7ROZWafxjn2UjEyaqikII/6jfZpv4wJz6j0pKq2wjjBkTJlngqy6qwKv/Jmq/WdxXHIm9E3XpsabfIpaq3/CGM0g4eBfmONN/lE7LmJ7s3a9uqvmAFAu984jweZLFI0JLVwrhvxnHqlsKSkxiqMC4e5kCdihF1vjUjR9crJ5RX77MmADTPrnyE2oY8nSpE6N04T3+n3O3ZPazV8qdYtaLjRK5XJC0Ijpx8NC80oY3WMbKAsStJjb4S0yu/K5LGwQpIPO/QhRE9Pfm4OWs3Ih7rn0/gJU83DCnyCLxDBOmzDLrShAxJ+wyVcw03wJ7gKboO7aetMcD+zDP8p+PsPf3eqbQ==</latexit><latexit sha1_base64="X7JObiHYNXwbsISLOmkjXbSsJws=">AAACOXicbZBNSyNBEIZ7/Fh1dHejHvXQrCx42dCTFT9AQfDiMYJRIRNCT6cSG3t6hu4aMQzzG/w3Xrz5E7wJXjwo4lW825OIrLovNLw8VUVXvVGqpEXGbryR0bHxbxOTU/70zPcfPyuzcwc2yYyAhkhUYo4ibkFJDQ2UqOAoNcDjSMFhdLJT1g9PwViZ6H3sp9CKeU/LrhQcHWpX6qftPEQ4w9yRotjywwh6UudpzNHIs8L/w6q12goNQ59Vg7+sNCXaYENUW1v1Q9Cd94F2ZYlV2UD0qwnezNL24nm4/HJ1Xm9XrsNOIrIYNArFrW0GLMVWzg1KoaDww8xCysUJ70HTWc1jsK18cHlBfzvSod3EuKeRDui/EzmPre3Hket0+x3bz7US/q/WzLC73sqlTjMELYYfdTNFMaFljLQjDQhUfWe4MNLtSsUxN1ygC9t3IQSfT/5qDmrVwIW659LYJENNkgXyiyyTgKyRbbJL6qRBBLkgt+SePHiX3p336D0NW0e8t5l58kHe8yuUTqy7</latexit><latexit sha1_base64="X7JObiHYNXwbsISLOmkjXbSsJws=">AAACOXicbZBNSyNBEIZ7/Fh1dHejHvXQrCx42dCTFT9AQfDiMYJRIRNCT6cSG3t6hu4aMQzzG/w3Xrz5E7wJXjwo4lW825OIrLovNLw8VUVXvVGqpEXGbryR0bHxbxOTU/70zPcfPyuzcwc2yYyAhkhUYo4ibkFJDQ2UqOAoNcDjSMFhdLJT1g9PwViZ6H3sp9CKeU/LrhQcHWpX6qftPEQ4w9yRotjywwh6UudpzNHIs8L/w6q12goNQ59Vg7+sNCXaYENUW1v1Q9Cd94F2ZYlV2UD0qwnezNL24nm4/HJ1Xm9XrsNOIrIYNArFrW0GLMVWzg1KoaDww8xCysUJ70HTWc1jsK18cHlBfzvSod3EuKeRDui/EzmPre3Hket0+x3bz7US/q/WzLC73sqlTjMELYYfdTNFMaFljLQjDQhUfWe4MNLtSsUxN1ygC9t3IQSfT/5qDmrVwIW659LYJENNkgXyiyyTgKyRbbJL6qRBBLkgt+SePHiX3p336D0NW0e8t5l58kHe8yuUTqy7</latexit><latexit sha1_base64="yUhkDlYwUUEoQ+3MeiaCkTTY5/M=">AAACOXicbZBNSyNBEIZ71PVj3NWsHr00BsHLhp4oxgUFwYvHCEaFTAg9nUps7OkZumvEMMzf8uK/8CZ48bCLePUP2JME8euFhpenquiqN0qVtMjYvTc1PfNjdm5+wV/8+WtpufJ75dQmmRHQEolKzHnELSipoYUSFZynBngcKTiLLg/L+tkVGCsTfYLDFDoxH2jZl4KjQ91K86qbhwjXmDtSFPt+GMFA6jyNORp5Xfh/WK1e36Zh6LNasMVKU6K/bIzqjR0/BN17G+hWqqzGRqJfTTAxVTJRs1u5C3uJyGLQKBS3th2wFDs5NyiFgsIPMwspF5d8AG1nNY/BdvLR5QXdcKRH+4lxTyMd0fcTOY+tHcaR63T7XdjPtRJ+V2tn2N/t5FKnGYIW44/6maKY0DJG2pMGBKqhM1wY6Xal4oIbLtCF7bsQgs8nfzWn9VrgQj1m1YO9SRzzZI2sk00SkAY5IEekSVpEkBvyQP6R/96t9+g9ec/j1ilvMrNKPsh7eQWaI6kG</latexit>

v

dog

=

0

B

B

@

0.124

0.430

0.200

0.329

1

C

C

A

<latexit sha1_base64="7A8Mq63LMTMc+l3UeNcP10h1a/Y=">AAACOXicbVBNSxxBFHyjJjGTDzd6zKVRhFyy9KyCBiIIuXgRNsRVYWdZenrero09PUP3G3EZ5m958V94C3jQgyJe8wfSuyuSqAUNRdUr+r1KCq0ccf47mJmde/X6zfzb8N37Dx8XGp8W911eWokdmevcHibCoVYGO6RI42FhUWSJxoPk+MfYPzhB61Ru9mhUYC8TQ6MGSgryUr/RPulXMeEpVWk+rOutME5wqExVZIKsOq3Dr7wZtdZZHIe8ub7Gx8RLLc6n0lrrWxijSR8D/cYKb/IJ2HMSPZCVbbb76woA2v3GRZzmsszQkNTCuW7EC+pVwpKSGuswLh0WQh6LIXY9NSJD16sml9ds1SspG+TWP0Nsov6bqETm3ChL/KTf78g99cbiS163pMFmr1KmKAmNnH40KDWjnI1rZKmyKEmPPBHSKr8rk0fCCkm+7NCXED09+TnZbzUj3+5P38Z3mGIePsMyfIEINmAbdqANHZBwBpdwA7fBeXAd3AX309GZ4CGzBP8h+PMXGHGq4A==</latexit><latexit sha1_base64="uVvmVkONvo7ZgSwrHlzj6B+jPNQ=">AAACOXicbVDPaxNBGJ1tba1rf2zr0ctgKHhpmE0CtdBCQRAvQkSTFrJLmJ39kgyZnV1mvg0Ny/5bXvwvvAlePCjixYP/gJOkiDZ9MPB473vM972kUNIiY5+9jc0HW9sPdx75j3f39g+Cw6O+zUsjoCdylZvrhFtQUkMPJSq4LgzwLFFwlUxfLvyrGRgrc/0e5wXEGR9rOZKCo5OGQXc2rCKEG6zSfFzXF36UwFjqqsg4GnlT+yesGbY6NIp81uy02YI4qcXYSmq3zvwIdPo3MAwarMmWoOskvCWNS/rm3a9p/Ko7DD5FaS7KDDQKxa0dhKzAuOIGpVBQ+1FpoeBiyscwcFTzDGxcLS+v6bFTUjrKjXsa6VL9N1HxzNp5lrhJt9/E3vUW4n3eoMTRi7iSuigRtFh9NCoVxZwuaqSpNCBQzR3hwki3KxUTbrhAV7bvSgjvnrxO+q1m6Np969o4JyvskKfkGXlOQnJKLslr0iU9IsgH8oV8I9+9j95X74f3czW64d1mnpD/4P3+A5WBq/0=</latexit><latexit sha1_base64="uVvmVkONvo7ZgSwrHlzj6B+jPNQ=">AAACOXicbVDPaxNBGJ1tba1rf2zr0ctgKHhpmE0CtdBCQRAvQkSTFrJLmJ39kgyZnV1mvg0Ny/5bXvwvvAlePCjixYP/gJOkiDZ9MPB473vM972kUNIiY5+9jc0HW9sPdx75j3f39g+Cw6O+zUsjoCdylZvrhFtQUkMPJSq4LgzwLFFwlUxfLvyrGRgrc/0e5wXEGR9rOZKCo5OGQXc2rCKEG6zSfFzXF36UwFjqqsg4GnlT+yesGbY6NIp81uy02YI4qcXYSmq3zvwIdPo3MAwarMmWoOskvCWNS/rm3a9p/Ko7DD5FaS7KDDQKxa0dhKzAuOIGpVBQ+1FpoeBiyscwcFTzDGxcLS+v6bFTUjrKjXsa6VL9N1HxzNp5lrhJt9/E3vUW4n3eoMTRi7iSuigRtFh9NCoVxZwuaqSpNCBQzR3hwki3KxUTbrhAV7bvSgjvnrxO+q1m6Np969o4JyvskKfkGXlOQnJKLslr0iU9IsgH8oV8I9+9j95X74f3czW64d1mnpD/4P3+A5WBq/0=</latexit><latexit sha1_base64="QWAMEHLqCoErtEma6Wi/777uLqM=">AAACOXicbVDLSiNBFK12fMT2lRmXbgqD4MZQHQUVHAi4cRnBqJAOobr6JhZWVzdVt8XQ9G/NZv5idgNuXCji1h+w8kB8HSg4nHMPde+JMiUtMvbfm/kxOze/UFn0l5ZXVteqP3+d2zQ3AtoiVam5jLgFJTW0UaKCy8wATyIFF9H18ci/uAFjZarPcJhBN+EDLftScHRSr9q66RUhwi0WcTooy99+GMFA6iJLOBp5W/o7rB409mgY+qy+t8tGxEkNxibSbuPQD0HHb4FetcbqbAz6lQRTUiNTtHrVf2GcijwBjUJxazsBy7BbcINSKCj9MLeQcXHNB9BxVPMEbLcYX17SLafEtJ8a9zTSsfo+UfDE2mESuUm335X97I3E77xOjv2DbiF1liNoMfmonyuKKR3VSGNpQKAaOsKFkW5XKq644QJd2b4rIfh88ldy3qgHrt1TVmseTeuokA2ySbZJQPZJk5yQFmkTQf6QO/JAHr2/3r335D1PRme8aWadfID38gqQ96kA</latexit>

v

the

=

0

B

B

@

0.234

0.266

0.239

0.199

1

C

C

A

<latexit sha1_base64="odYGt+syjpaXyhzBR2lH0qeQ+vM=">AAACOHicbZBBSxtBFMffWtuma1tje6yHoSJ4adjVogYUBC/etGBUyIYwO3lJBmdnl5m3krDsZ+in6cWP0VvpxYMiXv0EnU2CWPUPAz/+7z3mvX+cKWkpCP54c6/mX795W3vnL7z/8HGxvvTpxKa5EdgSqUrNWcwtKqmxRZIUnmUGeRIrPI3P96v66QUaK1N9TOMMOwkfaNmXgpOzuvXDi24REY6ooCGW5a4fxTiQusgSTkaOSj9orG98Z1FUwebmDDaaFXwLGmGz6Ueoew/93fpK0AgmYs8hnMHK3vLPiAHAUbf+O+qlIk9Qk1Dc2nYYZNQpuCEpFJZ+lFvMuDjnA2w71DxB2ykmh5ds1Tk91k+Ne5rYxH08UfDE2nESu06339A+rVXmS7V2Tv3tTiF1lhNqMf2onytGKatSZD1pUJAaO+DCSLcrE0NuuCCXte9CCJ+e/BxO1huhS/CHS2MHpqrBF/gKaxDCFuzBARxBCwT8gr9wDTfepXfl3Xp309Y5bzbzGf6Td/8PLj6qUQ==</latexit><latexit sha1_base64="kC++ZoeREatr7v57EgIo3XrJSss=">AAACOHicbZDPattAEMZXSds46p84ybE9LA0FX2okOzgxJGDIpbe6UDsBy5jVemwvWa3E7sjYCD2DnyaXHvsIuYVeekgpvRZ678o2pXX6wcKPb2bYmS9MpDDoeXfO1vajx092Srvu02fPX+yV9w+6Jk41hw6PZayvQmZACgUdFCjhKtHAolDCZXh9UdQvp6CNiNVHnCfQj9hYiZHgDK01KL+fDrIAYYYZTiDPz90ghLFQWRIx1GKWu161Vj+mQVBAo7GGerOAt17VbzbdANTwT/+gfORVvaXoQ/DXcNR6tQgqvz4v2oPybTCMeRqBQi6ZMT3fS7CfMY2CS8jdIDWQMH7NxtCzqFgEpp8tD8/pG+sM6SjW9imkS/fviYxFxsyj0Hba/SZms1aY/6v1Uhyd9jOhkhRB8dVHo1RSjGmRIh0KDRzl3ALjWthdKZ8wzTjarF0bgr958kPo1qq+TfCDTeOMrFQiL8lrUiE+OSEt8o60SYdwckO+kHvyzfnkfHW+Oz9WrVvOeuaQ/CPn529DFayf</latexit><latexit sha1_base64="kC++ZoeREatr7v57EgIo3XrJSss=">AAACOHicbZDPattAEMZXSds46p84ybE9LA0FX2okOzgxJGDIpbe6UDsBy5jVemwvWa3E7sjYCD2DnyaXHvsIuYVeekgpvRZ678o2pXX6wcKPb2bYmS9MpDDoeXfO1vajx092Srvu02fPX+yV9w+6Jk41hw6PZayvQmZACgUdFCjhKtHAolDCZXh9UdQvp6CNiNVHnCfQj9hYiZHgDK01KL+fDrIAYYYZTiDPz90ghLFQWRIx1GKWu161Vj+mQVBAo7GGerOAt17VbzbdANTwT/+gfORVvaXoQ/DXcNR6tQgqvz4v2oPybTCMeRqBQi6ZMT3fS7CfMY2CS8jdIDWQMH7NxtCzqFgEpp8tD8/pG+sM6SjW9imkS/fviYxFxsyj0Hba/SZms1aY/6v1Uhyd9jOhkhRB8dVHo1RSjGmRIh0KDRzl3ALjWthdKZ8wzTjarF0bgr958kPo1qq+TfCDTeOMrFQiL8lrUiE+OSEt8o60SYdwckO+kHvyzfnkfHW+Oz9WrVvOeuaQ/CPn529DFayf</latexit><latexit sha1_base64="nKGHRyDYykkoHfjg5UdytxCiaVE=">AAACOHicbZBNSyNBEIZ7/Hb8iu7RS7NB8GKYUVEDCsJe9qbCRoVMCD2dStKkp2forgkJw/wsL/4Mb8tePLiIV3+BPXEQv15oeHiriq56w0QKg57315manpmdm19YdJeWV1bXKusblyZONYcGj2Wsr0NmQAoFDRQo4TrRwKJQwlU4+FXUr4agjYjVHxwn0IpYT4mu4Ayt1a6cDdtZgDDCDPuQ5yduEEJPqCyJGGoxyl2vtru3T4OggIODEvbqBex4Nb9edwNQnbf+dqXq1byJ6FfwS6iSUuftyl3QiXkagUIumTFN30uwlTGNgkvI3SA1kDA+YD1oWlQsAtPKJofndMs6HdqNtX0K6cR9P5GxyJhxFNpOu1/ffK4V5ne1Zordo1YmVJIiKP76UTeVFGNapEg7QgNHObbAuBZ2V8r7TDOONmvXhuB/PvkrXO7WfJvghVc9PS7jWCCb5CfZJj45JKfkNzknDcLJDflHHsh/59a5dx6dp9fWKaec+UE+yHl+AUjqqOo=</latexit>

v

language

=

0

B

B

@

0.290

0.441

0.762

0.982

1

C

C

A

<latexit sha1_base64="b4xc+Okp2p3fvc/1+aow+HYTGjA=">AAACPXicbZBNaxsxEIZn03y42yZ1m2MvIibQSxetCU0CKRh66aWQktgJeI3RymNHRKtdpNkQs+wf66X/obfecsmhpfTaa2U7hHy9IHh4ZwbNvGmhlSPOfwZLz5ZXVtcaz8MXL9c3XjVfv+m5vLQSuzLXuT1NhUOtDHZJkcbTwqLIUo0n6fmnWf3kAq1TuTmmaYGDTEyMGispyFvD5vHFsEoIL6nSwkxKMcG6/hgmKU6UqYpMkFWXdcij9j5nSRK+59HOTjwjHu1+aC9gf68dJmhGt/3DZotHfC72GOIbaHXYl6MrADgcNn8ko1yWGRqSWjjXj3lBg0pYUlJjHSalw0LIc79e36MRGbpBNb++ZtveGbFxbv0zxObu3YlKZM5Ns9R3+v3O3MPazHyq1i9pvDeolClKQiMXH41LzShnsyjZSFmUpKcehLTK78rkmbBCkg889CHED09+DL12FPMo/urTOICFGvAWtuAdxLALHfgMh9AFCd/gCn7B7+B7cB38Cf4uWpeCm5lNuKfg3386Saz9</latexit><latexit sha1_base64="hZMi7uSxdZH/uZAh+zd/W8RR0ZM=">AAACPXicbZBNSxxBEIZ7jPFjNLqao5cmInhx6FnED1AQAiEXQYmrws6w9PTWrs329AzdNcsuw/yxXPIfvHnLJYeI6NGrvbsiifpCw8NbVXTVm+RKWmTsxpv6MP1xZnZu3l9Y/LS0XFtZPbdZYQQ0RKYyc5lwC0pqaKBEBZe5AZ4mCi6S3tdR/aIPxspMn+EwhzjlXS07UnB0Vqt21m+VEcIAS8V1t+BdqKpDP0qgK3WZpxyNHFQ+C+r7jEaRv8WC7e1wRCzY3alPYH+v7keg2y/9rdo6C9hY9C2Ez7B+RI9/PPTibyet2nXUzkSRgkahuLXNkOUYl9ygFAoqPyos5Fz03HpNh5qnYONyfH1FN5zTpp3MuKeRjt1/J0qeWjtME9fp9ruyr2sj871as8DOXlxKnRcIWkw+6hSKYkZHUdK2NCBQDR1wYaTblYorbrhAF7jvQghfn/wWzutByILw1KVxQCaaI2vkC9kkIdklR+Q7OSENIshP8pv8JbfeL++Pd+fdT1qnvOeZz+Q/eY9Pt1muGg==</latexit><latexit sha1_base64="hZMi7uSxdZH/uZAh+zd/W8RR0ZM=">AAACPXicbZBNSxxBEIZ7jPFjNLqao5cmInhx6FnED1AQAiEXQYmrws6w9PTWrs329AzdNcsuw/yxXPIfvHnLJYeI6NGrvbsiifpCw8NbVXTVm+RKWmTsxpv6MP1xZnZu3l9Y/LS0XFtZPbdZYQQ0RKYyc5lwC0pqaKBEBZe5AZ4mCi6S3tdR/aIPxspMn+EwhzjlXS07UnB0Vqt21m+VEcIAS8V1t+BdqKpDP0qgK3WZpxyNHFQ+C+r7jEaRv8WC7e1wRCzY3alPYH+v7keg2y/9rdo6C9hY9C2Ez7B+RI9/PPTibyet2nXUzkSRgkahuLXNkOUYl9ygFAoqPyos5Fz03HpNh5qnYONyfH1FN5zTpp3MuKeRjt1/J0qeWjtME9fp9ruyr2sj871as8DOXlxKnRcIWkw+6hSKYkZHUdK2NCBQDR1wYaTblYorbrhAF7jvQghfn/wWzutByILw1KVxQCaaI2vkC9kkIdklR+Q7OSENIshP8pv8JbfeL++Pd+fdT1qnvOeZz+Q/eY9Pt1muGg==</latexit><latexit sha1_base64="QL8ADr7tR9srk1Wpqs+QUhsdqR4=">AAACPXicbZBLSywxEIXT6r1q672OunQTHAQ3Nulh8AEKghuXCo4K08OQztSMwXS6SarFoek/5sb/4M6dGxeKuHVr5oH4OhD4OFVFqk6cKWmRsXtvYnLqz9/pmVl/bv7f/4XK4tKpTXMjoCFSlZrzmFtQUkMDJSo4zwzwJFZwFl8eDOpnV2CsTPUJ9jNoJbynZVcKjs5qV06u2kWEcI2F4rqX8x6U5Z4fxdCTusgSjkZelz4LajuMRpG/wYJ6PRwQC7Y2ayPY2a75EejOR3+7UmUBG4r+hHAMVTLWUbtyF3VSkSegUShubTNkGbYKblAKBaUf5RYyLi7dek2HmidgW8Xw+pKuOadDu6lxTyMdup8nCp5Y209i1+n2u7DfawPzt1ozx+52q5A6yxG0GH3UzRXFlA6ipB1pQKDqO+DCSLcrFRfccIEucN+FEH4/+Sec1oKQBeExq+7vjuOYIStklayTkGyRfXJIjkiDCHJDHsgTefZuvUfvxXsdtU5445ll8kXe2zuyz6sd</latexit>

f : V

<latexit sha1_base64="v4LU3fRnmQUVApJPSDxJstTKueI=">AAACBnicbVDLSsNAFJ3UV62vqEsRBovgqiQiKK6KblxWsQ9oYplMJu3QyUyYmSgldOXGX3HjQhG3foM7/8ZJm4W2HrhwOOde7r0nSBhV2nG+rdLC4tLySnm1sra+sbllb++0lEglJk0smJCdACnCKCdNTTUjnUQSFAeMtIPhZe6374lUVPBbPUqIH6M+pxHFSBupZ+9H57AFPUn7A42kFA/Qi5EeBEF2M74Le3bVqTkTwHniFqQKCjR69pcXCpzGhGvMkFJd10m0nyGpKWZkXPFSRRKEh6hPuoZyFBPlZ5M3xvDQKCGMhDTFNZyovycyFCs1igPTmd+oZr1c/M/rpjo68zPKk1QTjqeLopRBLWCeCQypJFizkSEIS2puhXiAJMLaJFcxIbizL8+T1nHNdWru9Um1flHEUQZ74AAcARecgjq4Ag3QBBg8gmfwCt6sJ+vFerc+pq0lq5jZBX9gff4AqouYng==</latexit><latexit sha1_base64="v4LU3fRnmQUVApJPSDxJstTKueI=">AAACBnicbVDLSsNAFJ3UV62vqEsRBovgqiQiKK6KblxWsQ9oYplMJu3QyUyYmSgldOXGX3HjQhG3foM7/8ZJm4W2HrhwOOde7r0nSBhV2nG+rdLC4tLySnm1sra+sbllb++0lEglJk0smJCdACnCKCdNTTUjnUQSFAeMtIPhZe6374lUVPBbPUqIH6M+pxHFSBupZ+9H57AFPUn7A42kFA/Qi5EeBEF2M74Le3bVqTkTwHniFqQKCjR69pcXCpzGhGvMkFJd10m0nyGpKWZkXPFSRRKEh6hPuoZyFBPlZ5M3xvDQKCGMhDTFNZyovycyFCs1igPTmd+oZr1c/M/rpjo68zPKk1QTjqeLopRBLWCeCQypJFizkSEIS2puhXiAJMLaJFcxIbizL8+T1nHNdWru9Um1flHEUQZ74AAcARecgjq4Ag3QBBg8gmfwCt6sJ+vFerc+pq0lq5jZBX9gff4AqouYng==</latexit><latexit sha1_base64="v4LU3fRnmQUVApJPSDxJstTKueI=">AAACBnicbVDLSsNAFJ3UV62vqEsRBovgqiQiKK6KblxWsQ9oYplMJu3QyUyYmSgldOXGX3HjQhG3foM7/8ZJm4W2HrhwOOde7r0nSBhV2nG+rdLC4tLySnm1sra+sbllb++0lEglJk0smJCdACnCKCdNTTUjnUQSFAeMtIPhZe6374lUVPBbPUqIH6M+pxHFSBupZ+9H57AFPUn7A42kFA/Qi5EeBEF2M74Le3bVqTkTwHniFqQKCjR69pcXCpzGhGvMkFJd10m0nyGpKWZkXPFSRRKEh6hPuoZyFBPlZ5M3xvDQKCGMhDTFNZyovycyFCs1igPTmd+oZr1c/M/rpjo68zPKk1QTjqeLopRBLWCeCQypJFizkSEIS2puhXiAJMLaJFcxIbizL8+T1nHNdWru9Um1flHEUQZ74AAcARecgjq4Ag3QBBg8gmfwCt6sJ+vFerc+pq0lq5jZBX9gff4AqouYng==</latexit><latexit sha1_base64="v4LU3fRnmQUVApJPSDxJstTKueI=">AAACBnicbVDLSsNAFJ3UV62vqEsRBovgqiQiKK6KblxWsQ9oYplMJu3QyUyYmSgldOXGX3HjQhG3foM7/8ZJm4W2HrhwOOde7r0nSBhV2nG+rdLC4tLySnm1sra+sbllb++0lEglJk0smJCdACnCKCdNTTUjnUQSFAeMtIPhZe6374lUVPBbPUqIH6M+pxHFSBupZ+9H57AFPUn7A42kFA/Qi5EeBEF2M74Le3bVqTkTwHniFqQKCjR69pcXCpzGhGvMkFJd10m0nyGpKWZkXPFSRRKEh6hPuoZyFBPlZ5M3xvDQKCGMhDTFNZyovycyFCs1igPTmd+oZr1c/M/rpjo68zPKk1QTjqeLopRBLWCeCQypJFizkSEIS2puhXiAJMLaJFcxIbizL8+T1nHNdWru9Um1flHEUQZ74AAcARecgjq4Ag3QBBg8gmfwCt6sJ+vFerc+pq0lq5jZBX9gff4AqouYng==</latexit>

! R

d

V: a pre-defined vocabulary

d: dimension of word vectors (e.g. 300)

Text corpora:

Wikipedia + Gigaword 5: 6B tokens

Twitter: 27B tokens

Common Crawl: 840B tokens

Input: a large text corpora, V, d

Each word is represented by a low-dimensional (e.g., d = 300), real-valued vector

Each coordinate/dimension of the vector doesn’t have a particular interpretation

(30)

Word embeddings

Basic property: similar words have similar vectors

word = “sweden”

ranges between -1 and 1

(31)

Word embeddings

(Pennington et al, 2014): GloVe: Global Vectors for Word Representation

(32)

Word embeddings

They have some other nice properties too!

<latexit sha1_base64="es0YwS886huHywvI34iQv96Vwjg=">AAACNXicbVC7TgMxEPTxDOEVoKSxeEg0RHcUQImgoaAIEgGkJIp8ziZY8dmHvReITvkpCvgA/oAKCgoQoqWCGidBvEeyNJqZ1XonjKWw6Pt33sDg0PDIaGYsOz4xOTWdm5k9tDoxHIpcS22OQ2ZBCgVFFCjhODbAolDCUdjc6fpHLTBWaHWA7RgqEWsoURecoZOqub1WNS0jnGMaMdXp0FX6KZzpvlRmcWz0+ZfRFKrxM3qaALhoNbfo5/0e6F8SfJDFraXXy+vW+Fuhmrsp1zRPIlDIJbO2FPgxVlJmUHAJnWw5sRAz3mQNKDmqWAS2kvau7tBlp9RoXRv3FNKe+n0iZZG17Sh0yYjhif3tdcX/vFKC9c1KKlScICjeX1RPJEVNuxXSmjDAUbYdYdwI91fKT5hhHF3RWVdC8Pvkv+RwLR+s5/1918Y26SND5skCWSEB2SBbZJcUSJFwckFuyQN59K68e+/Je+5HB7yPmTnyA97LO4dnsro=</latexit>

(33)

Word embeddings

They have some other nice properties too!

(Mikolov et al, 2013): Exploiting Similarities among Languages for Machine Translation

<latexit sha1_base64="NBjnMjX2WTMYbZaOLaPgzLpJ6NY=">AAACEnicbVBNS8NAEN34WetX1KOXxSK0l5KIqMeiF48V7Ae0pWy2m3bpJht2J6Ul5Dd48a948aCIV0/e/Ddu2wja+mDg8d4MM/O8SHANjvNlrayurW9s5rby2zu7e/v2wWFdy1hRVqNSSNX0iGaCh6wGHARrRoqRwBOs4Q1vpn5jxJTmMryHScQ6AemH3OeUgJG6dmlUbAMbQ0JjAkqmJdwmUaTkGDfwj+WbZWmpaxecsjMDXiZuRgooQ7Vrf7Z7ksYBC4EKonXLdSLoJEQBp4Kl+XasWUTokPRZy9CQBEx3ktlLKT41Sg/7UpkKAc/U3xMJCbSeBJ7pDAgM9KI3Ff/zWjH4V52Eh1EMLKTzRX4sMEg8zQf3uGIUxMQQQhU3t2I6IIpQMCnmTQju4svLpH5Wdi/Kzt15oXKdxZFDx+gEFZGLLlEF3aIqqiGKHtATekGv1qP1bL1Z7/PWFSubOUJ/YH18A4VNngE=</latexit>

References

Related documents