(1)

Learning Multilingual Semantics from Big Data on the Web

Gerard de Melo

Assistant Professor, Tsinghua University

http://gerard.demelo.org

(2)
(3)
(4)

Big Data on the Web


(5)

From Big Data to Multilingual Semantics?

(6)

Manual Knowledge Organization

Universal Bibliographic Repertory (Répertoire Bibliographique Universel, RBU) by Paul Otlet and Henri La Fontaine in 1895

index cards with answers to queries

Image: http://commons.wikimedia.org/wiki/File:Mundaneum_Tir%C3%A4ng_Karteikaarten.jpg

(7)

Manual Knowledge Organization

Universal Bibliographic Repertory (Répertoire Bibliographique Universel, RBU) by Paul Otlet and Henri La Fontaine in 1895

index cards with answers to queries

Alex Wright: This was a sort of “analog search engine”

Image: Mundaneum

(8)

Zipfian Distribution


(9)

Big Data on the Web


(10)

Goal: Large Yet Reasonably Clean Knowledge

(11)

Outline

Large-Scale Knowledge Graphs

Semantics in Action

Models for the Future

(12)

Outline

Large-Scale Knowledge Graphs

Semantics in Action

Models for the Future

(13)

Lexical Knowledge

Portuguese-Chinese Dictionary by Ruggieri et al. (1580s): the first European-Chinese dictionary

(14)

Wiktionary

Provides translations, antonyms, etc.

(15)

Wiktionary

(16)

Wiktionary

(17)

Etymological Wordnet

e.g. “salary” < Lat. “salarius” < Lat. “sal” (salt)

(18)

Etymological Wordnet

LREC 2014

(19)

Etymological Wordnet

LREC 2014

(20)

Etymological Wordnet: Old English Example

(21)

Lexical Ambiguities

(22)

Lexical Ambiguities

Hipsters in London

Images: https://www.flickr.com/photos/poisonbabyfood/4274634681 https://www.facebook.com/alexander.balabanov.82

(23)

Lexical Ambiguities

Reunion

(24)

Lexical Ambiguities

Reunion

Images:

https://commons.wikimedia.org/wiki/File:Reunions_Class_of_82_2007.jpg

https://commons.wikimedia.org/wiki/File:Riviere_Langevin_Trou_Noir_P1440224-35.jpg

and many more...

(25)
(26)

Multilingual Lexical Knowledge

(27)

UWN: Universal Wordnet

Before: manual work over two decades, but not many large wordnets

Our Approach:

● Exploit translation resources on the Web

● Learn regression model with sophisticated graph-based features

(28)

UWN: Universal Wordnet

(29)

UWN: Universal Wordnet

over 1,000,000 words in over 100 languages

CIKM 2009

ICGL 2008

Best Paper Award

(30)
(31)

UWN: Getting Started

Simple API for JVM Languages

import java.io.File

val uwn = new UWN(new File("plugins/"))

// Print every meaning of the French word "souris"

for (m <- uwn.getMeanings("souris", "fra")) println(m)

Or Just Download the TSV File

(32)

Adding Other Sources

Language-specific, Domain-specific, Arbitrary Databases

(33)

Adding Other Sources


(34)

Adding Other Sources

Rob Matthews: printed small sample of Wikipedia

Actually, a printed Wikipedia corresponds to 2000 Britannica volumes

Source: http://www.labnol.org/internet/wikipedia-printed-book/9136/

(35)

Merging Structured Data

Use identity links to connect what is equivalent

ACL 2010, AAAI 2013

(36)

Merging Structured Data

ACL 2010, AAAI 2013

(37)

Merging Structured Data

Trentino

(38)

Merging Structured Data

One bad link is enough to make a connected component inconsistent

ACL 2010, AAAI 2013

(39)

Entity Integration: Challenges

Source: Peter Mika

(40)

Merging Structured Data

Distinctness Assertions

$D_i$ = ({en:Province of Trento, en:Trentino}, {en:Trentino-South Tyrol, en:Trentino-Alto Adige/Südtirol})

ACL 2010, AAAI 2013
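The interplay between identity links and distinctness assertions can be illustrated with a small sketch. It is not the algorithm from the cited papers, only a consistency check: it builds connected components from identity links with union-find and reports every assertion whose subsets end up in the same component. The `it:` node names and the function names are hypothetical.

```python
from itertools import combinations


def find(parent, x):
    """Return the representative of x's set, with path compression."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def violated_assertions(equiv_edges, assertions):
    """Return indices of distinctness assertions violated by the identity links.

    equiv_edges: iterable of (node, node) identity links.
    assertions:  list of assertions; each is a list of disjoint node sets
                 whose members must not share a connected component.
    """
    parent = {}
    for a, b in equiv_edges:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        parent[find(parent, a)] = find(parent, b)

    violated = []
    for idx, groups in enumerate(assertions):
        # Components reached by each subset of the assertion.
        reps = [{find(parent, n) for n in g if n in parent} for g in groups]
        if any(r1 & r2 for r1, r2 in combinations(reps, 2)):
            violated.append(idx)
    return violated


d_i = [
    {"en:Province of Trento", "en:Trentino"},
    {"en:Trentino-South Tyrol", "en:Trentino-Alto Adige/Südtirol"},
]
# Hypothetical cross-lingual links, for illustration only.
good_links = [("en:Trentino", "it:Provincia autonoma di Trento"),
              ("en:Trentino-South Tyrol", "it:Trentino-Alto Adige")]
bad_link = ("it:Provincia autonoma di Trento", "en:Trentino-South Tyrol")

print(violated_assertions(good_links, [d_i]))               # []
print(violated_assertions(good_links + [bad_link], [d_i]))  # [0]
```

A single wrong link merges both components, which is why one bad link is enough to make the whole component inconsistent.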

(41)

Merging Structured Data

How to reconcile equivalence and distinctness evidence?

a) ignore some equivalence information (delete certain edges)

b) ignore some distinctness information (remove node from distinctness assertion)

ACL 2010, AAAI 2013

(42)

Merging Structured Data

Min. cost solution: NP-hard, APX-hard

ACL 2010, AAAI 2013

(43)

Merging Structured Data

Linear Program Relaxation

Finally, use region growing algorithm in the spirit of Leighton & Rao 1988

Approximation Guarantee: $4\ln(nq+1)$ for $n$ distinctness assertions, $q = \max_{i,j} |D_{i,j}|$, but independent of $|D_i|$!

(44)

Merging Structured Data

Linear Program Relaxation

Nice: This generalizes the Hungarian Algorithm to various advanced types of non-standard matchings (cf. de Melo, AAAI 2013)

(45)

Separated Concepts (Multilingual Wikipedia)

(46)

Application: Lexvo.org

Semantic Web Journal 2014

(47)

Lexvo.org

Semantic Web Journal 2014

(48)

Lexvo.org

Interdisciplinary Work, e.g. in Digital Humanities

Semantic Web Journal 2014

(49)

Taxonomic Organization

A user wants a list of “Art Schools in Europe”

(50)

Multilingual Taxonomies

A Swedish user wants a list of “Konstskolor i Europa” (“Art Schools in Europe”)

(51)

Taxonomic Integration: MENTA Approach

De Melo & Weikum (2010). CIKM Best Interdisciplinary Paper Award

(52)

Taxonomic Integration: MENTA Approach

De Melo & Weikum (2010). CIKM Best Interdisciplinary Paper Award

(53)

Taxonomic Integration: MENTA Approach

De Melo & Weikum (2010). CIKM Best Interdisciplinary Paper Award

(54)

Taxonomic Integration: MENTA Approach

De Melo & Weikum (2010). CIKM Best Interdisciplinary Paper Award

(55)

Taxonomic Integration: MENTA

Predict Individual Taxonomic Links:

Article → Category

Category → WordNet

De Melo & Weikum (2010). CIKM Best Interdisciplinary Paper Award

(56)

Taxonomic Integration: MENTA

Predict Individual Taxonomic Links:

Article → Category

Category → WordNet

(57)

Taxonomic Integration: MENTA

(58)

Taxonomic Integration: MENTA

Fersental (Bersntol, Valle dei Mòcheni)

Image: https://de.wikipedia.org/wiki/Datei:Bersntol_palae.jpg

(59)

Taxonomic Integration: MENTA

(60)

Taxonomic Integration: MENTA

Fersental (Bersntol, Valle dei Mòcheni)

Image: https://de.wikipedia.org/wiki/Datei:Language_distribution_Trentino_2011.png

(61)

Taxonomic Integration: MENTA

(62)

Taxonomic Integration: MENTA

Use Identity Constraint Algorithm to form equivalence classes

Markov Chain Random Walk with Restarts to Rank Parents
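As a rough illustration of ranking candidate parents with a random walk with restarts, the sketch below runs power iteration on a tiny weighted graph. It is a generic textbook formulation, not the MENTA implementation; the graph, the edge weights, and the restart probability of 0.15 are all assumptions made for the example.

```python
def random_walk_with_restart(graph, start, restart=0.15, iterations=100):
    """Approximate visit probabilities of a walk that keeps restarting at `start`.

    graph: dict mapping node -> dict of {neighbour: edge weight}.
    """
    nodes = set(graph) | {n for nbrs in graph.values() for n in nbrs}
    prob = {n: 0.0 for n in nodes}
    prob[start] = 1.0
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for node, mass in prob.items():
            out = graph.get(node, {})
            total = sum(out.values())
            if total == 0:
                # Dead end: send all probability mass back to the start node.
                nxt[start] += mass
                continue
            nxt[start] += restart * mass
            for nbr, weight in out.items():
                nxt[nbr] += (1 - restart) * mass * weight / total
        prob = nxt
    return prob


# Hypothetical candidate-parent edges with made-up link scores.
graph = {
    "Fersental": {"Valley": 0.9, "Language": 0.2},
    "Valley": {"Landform": 1.0},
    "Language": {"Communication": 1.0},
}
scores = random_walk_with_restart(graph, "Fersental")
parents = sorted(graph["Fersental"], key=scores.get, reverse=True)
print(parents)  # ['Valley', 'Language']
```

Nodes that are reachable through many strong links accumulate more probability mass, so they rank higher as parents.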

(63)

Taxonomic Integration: MENTA

(64)

Taxonomic Integration: MENTA

(65)

New Algorithm: Structured Output Prediction

Belief Propagation exploiting Kirchhoff’s Matrix Tree Theorem for efficient handling of tree factor

Chu-Liu-Edmonds directed spanning tree algorithm for decoding

Bansal et al. ACL 2014. Best Paper Runner-Up

(66)

UWN/MENTA

Biggest (ontological) taxonomy

CIKM 2010 Best Paper Award

(67)

UWN/MENTA

multilingual extension of WordNet for word senses and taxonomical information over 200 languages

(68)

Outline

Large-Scale Knowledge Graphs

Semantics in Action

(69)

Language Education


(70)

UWN

http://www.lexvo.org/uwn/

(71)

UWN

http://www.lexvo.org/uwn/

(72)

UWN

http://www.lexvo.org/uwn/

(73)

UWN

http://www.lexvo.org/uwn/

(74)

Application: Sense-Disambiguated Example Sentences

(75)

Application: Sense-Disambiguated Example Sentences

(76)

Application: Sense-Disambiguated Example Sentences

(77)

Application: Monolingual Language Users

(78)

Application: Monolingual Language Users

(79)

Thesauri

(80)

Thesauri

(81)

Application: Machine Translation

OpenWN-PT: Used by Google Translate

(82)

Machine Learning

Diagram labels: Examples, Incorrect, Correct

(83)

Machine Learning

Diagram labels: Examples, Incorrect, Learning

(84)

Machine Learning

Diagram labels: Examples, Incorrect, Correct, Learning, Classifier Model

(85)

Machine Learning

Diagram labels: Examples, Incorrect, Probably Incorrect!, Learning, Prediction

(86)

Better Machine Learning

Diagram labels: Examples, Incorrect, Correct, Probably Incorrect!, Learning, Classifier Model, Prediction

Better Classifier! + Better Labels for Test Data

(87)
(88)
(89)
(90)

UWN Senses in MT?

Issue: Senses should be less fine-grained

(91)

No Word Left Behind

(92)

No Word Left Behind

(93)

No Word Left Behind

(94)

No Word Left Behind

(95)

Similar: Part-Of-Speech Tagging

● British fans gathered at the stadium to... → ADJECTIVE

● Didgeridoo fans gathered at the park to...

“Didgeridoo” is similar to: “horn” (NOUN), “drums” (NOUN), “accordion” (NOUN)

(96)

Similar: Part-Of-Speech Tagging

● British fans gathered at the stadium to... → ADJECTIVE

● ...Astrálach is ea an didiridiú → ???

Gaelic “didiridiú” translates to “didgeridoo” (NOUN) in English

(97)

Sentence Level

(98)

Sentence Level

(99)

Sentence Level

(100)

What about Document-Level Tasks?

(101)

Document Level

“10th street new york jaguar show”

Words: “new” 1.0, “york” 1.0, “jaguar” 1.0, “automobile” 0.0, “car” 0.0, “10th” 1.0, “street” 1.0, “show” 1.0, ...

Similar: “10th New show in York”, “New Jaguar show”, “Show New Street in York”

Concepts: New_York 1.0, Jaguar (car) 0.0, Jaguar (animal) 1.0, Automobile/Car 0.0, 10th Street 1.0, Performance 1.0, ...

Similar: “10th street nyc jaguar show”

(102)

Document Level

“10th street new york jaguar show”

Words: “new” 1.0, “york” 1.0, “jaguar” 1.0, “automobile” 0.0, “car” 0.0, “10th” 1.0, “street” 1.0, “show” 1.0, ...

Similar: “10th New show in York”, “New Jaguar show”, “Show New Street in York”

Concepts: New_York 1.0, Jaguar (car) 0.0, Jaguar (animal) 1.0, Automobile/Car 0.0, 10th Street 1.0, Performance 1.0, ..., Animal 0.5, Vehicle 0.0

Similar: “10th street nyc jaguar show”, “10th street nyc animal show”, “Exposición de jaguares Nueva York” (“Jaguar exhibition New York”)

Expansion (de Melo & Siersdorfer 2007)
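The effect of adding concept features to a bag of words can be sketched as follows. This is a toy illustration rather than the method of de Melo & Siersdorfer: the lexicon `CONCEPT_OF`, the hypernym table, and the 0.5 weight for the expanded concept are assumptions taken loosely from the numbers on the slide.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy lexicon: surface token -> concept, concept -> (hypernym, weight).
CONCEPT_OF = {"jaguar": "Jaguar (animal)", "jaguares": "Jaguar (animal)",
              "show": "Performance", "exposición": "Performance"}
HYPERNYM = {"Jaguar (animal)": ("Animal", 0.5)}


def features(text, use_concepts):
    """Bag of words, optionally extended with concept and hypernym features."""
    vec = {}
    for tok in text.lower().split():
        vec["w:" + tok] = 1.0
        concept = CONCEPT_OF.get(tok) if use_concepts else None
        if concept:
            vec["c:" + concept] = 1.0
            if concept in HYPERNYM:
                parent, weight = HYPERNYM[concept]
                vec["c:" + parent] = max(vec.get("c:" + parent, 0.0), weight)
    return vec


query = "10th street new york jaguar show"
spanish = "Exposición de jaguares Nueva York"
for flag in (False, True):
    sim = cosine(features(query, flag), features(spanish, flag))
    print("with concepts" if flag else "words only", round(sim, 3))
```

With words alone the two texts share only “york”; with concept features they also share the jaguar, animal, and performance dimensions, so the similarity rises even across languages.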

(103)

Multilingual Tasks: Cross-Lingual Text Classification

Given: training documents with class labels

Goal: guess class labels for test documents in some other language

Result: better than plain machine translation. See de Melo & Siersdorfer 2007.

(104)

Sentence-Level Semantics

Capture the “who-did-what-to-whom”

Microsoft bought the patent from Nokia.

Nokia sold the patent to Microsoft.

The patent was acquired by Microsoft [from Nokia].

The patent was sold [by Nokia] to Microsoft.

Underlying frame: Commercial transfer

Buyer: Microsoft

Seller: Nokia

(105)

FrameBase.org

Bringing knowledge into a standard form based on natural language (FrameNet)

ESWC 2015 Best Student Paper Nominee

(106)

Relation Integration

X isAuthorOf Y

Y writtenBy X

X wrote Y

Y writtenInYear Z

ESWC 2015 Best Student Paper Nominee
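One simple way to picture relation integration is a lookup table that maps each source predicate to a canonical relation and records whether its arguments are inverted. The sketch below is only that picture, not how FrameBase works; the table, the canonical name `author`, and the example entities are hypothetical.

```python
# Hypothetical table: source predicate -> (canonical relation, swap arguments?).
CANONICAL = {
    "isAuthorOf": ("author", False),
    "wrote": ("author", False),
    "writtenBy": ("author", True),
}


def canonicalize(triple):
    """Map a (subject, predicate, object) triple onto the canonical relation."""
    subj, pred, obj = triple
    relation, swap = CANONICAL[pred]
    return (obj, relation, subj) if swap else (subj, relation, obj)


triples = [
    ("Tolstoy", "isAuthorOf", "War and Peace"),
    ("War and Peace", "writtenBy", "Tolstoy"),
    ("Tolstoy", "wrote", "War and Peace"),
]
merged = {canonicalize(t) for t in triples}
print(merged)  # {('Tolstoy', 'author', 'War and Peace')}
```

All three surface forms collapse into one fact. A predicate such as writtenInYear carries a different argument (a year), so it needs its own entry rather than a simple inversion.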

(107)

Relation Integration

Challenge: Modelling Differences

YAGO: isMarriedTo predicate

Freebase: Marriage Entity

(108)

Search Interfaces

“Which companies were created during the last century in Silicon Valley?”

YAGO2: WWW 2011 Best Demo Award

(109)

Answering Questions

IBM's Jeopardy!-winning Watson system

(110)

Answering Questions

IBM's Jeopardy!-winning Watson system

(111)

What Goes into Word Vectors?


(112)

What Goes into Word Vectors?

The Roman Empire was remarkably multicultural, with “a rather astonishing cohesive capacity” to create a sense of shared identity while encompassing diverse peoples within its political system over a long span of time.

(113)

What Goes into Word Vectors?

Context labels on the passage above: syntactic

(114)

What Goes into Word Vectors?

Context labels on the passage above: syntactic, semantic!

(115)

What Goes into Word Vectors?

Context labels on the passage above: semantic!, syntactic, syntactic?

(116)

What Goes into Word Vectors?

Context labels on the passage above: semantic!, syntactic, syntactic?, ?

(117)

What Goes into Word Vectors?

Context labels on the passage above: semantic!, syntactic, syntactic?, ?

Word2Vec Solution: Subsampling
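For reference, the subsampling rule described by Mikolov et al. (2013) discards each occurrence of a word $w$ with probability $1 - \sqrt{t / f(w)}$, where $f(w)$ is the word's relative frequency and $t$ is a small threshold. The sketch below applies that formula; the default threshold of 1e-5 and the helper names are assumptions, and released implementations use slightly different variants.

```python
import math
import random
from collections import Counter


def discard_probability(freq, threshold=1e-5):
    """P(discard) = 1 - sqrt(t / f(w)), clipped at 0."""
    return max(0.0, 1.0 - math.sqrt(threshold / freq))


def subsample(tokens, threshold=1e-5, seed=0):
    """Randomly drop frequent tokens before context windows are built."""
    rng = random.Random(seed)
    counts = Counter(tokens)
    total = len(tokens)
    return [tok for tok in tokens
            if rng.random() >= discard_probability(counts[tok] / total, threshold)]


print(round(discard_probability(0.05), 3))  # very frequent word: 0.986
print(discard_probability(1e-6))            # rare word is always kept: 0.0
```

Function words such as “the” or “of” are mostly removed, which reduces the share of purely syntactic contexts without selecting for semantically salient ones.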

(118)

Word2Vec Approach

Take everything we can get

Image: Alexandre Duret-Lutz https://www.flickr.com/photos/gadl/110845690/

(119)

Our Proposal: Extract the Most Valuable Parts

(120)

Our Proposal: Extract the Most Valuable Parts

…Greek and Roman mythology...

semantic!

Look for semantically salient contexts in text!

(121)

Two Worlds

Jiaqiang Chen and Gerard de Melo 2015

Distributional Semantics:

(122)

Proposed Research Program: Joint Training

Joint Training

Better Word Embeddings

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(123)

Proposed Research Program: Joint Training

Joint Training

Better Word Embeddings

Use parallel threads

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(124)

Preliminary Experiments: Joint Training

Recently lots of related work: e.g. Faruqui et al., Hill & Korhonen, Wang et al., Johansson & Nieto Piña

(125)

Preliminary Experiments: Joint Training

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(126)

Preliminary Experiments: Joint Training

Use negative sampling

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(127)

Preliminary Experiments: Information Extraction

Variant 1: Definition Extraction

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(128)

Preliminary Experiments: Information Extraction

Variant 1: Definition Extraction

Definitions

befuddle: to becloud and confuse as with liquor

befuddled: dazed by alcoholic drink

befuddled: confused and vague used especially of thinking

beg: to ask earnestly for, to entreat or supplicate for, to beseech

Source: GCIDE

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(129)

Preliminary Experiments: Information Extraction

Variant 1: Definition Extraction

Synonyms

effectual: effectual efficacious effective

effectuality: effectiveness effectivity effectualness

efficacious: effectual

efficaciousness: efficacy

Source: GCIDE

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(130)

Preliminary Experiments: Information Extraction

Variant 2: List Extraction

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(131)

Preliminary Experiments: Information Extraction

Variant 2: List Extraction

● Look for repeated occurrences of commas

● Short units of roughly equal length

● noun phrases, adjectives

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(132)

Preliminary Experiments: Information Extraction

Variant 2: List Extraction

● Look for repeated occurrences of commas

● Short units of roughly equal length

● noun phrases, adjectives

● Also: Hearst patterns, e.g. “cities such as New York, London, ...”

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop
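The list heuristics above can be approximated with regular expressions. The sketch below is a simplified stand-in, not the extraction code used in the paper: it covers only the comma heuristic, a word-count limit for "short units", and one Hearst pattern ("such as"). The noun-phrase and adjective filter is omitted because it would need a part-of-speech tagger, and all function names and thresholds are assumptions.

```python
import re

# One classic Hearst pattern: "<class> such as <item>, <item>, ..."
SUCH_AS = re.compile(r"(\w+) such as ([^.;:]+)")


def split_units(span):
    """Split on commas and on a final 'and'/'or', returning trimmed units."""
    parts = re.split(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+", span)
    return [p.strip() for p in parts if p.strip()]


def hearst_such_as(sentence):
    """Return (class word, list items) for 'X such as A, B, ...', else None."""
    match = SUCH_AS.search(sentence)
    if not match:
        return None
    return match.group(1), split_units(match.group(2))


def comma_list(sentence, min_units=3, max_words=2):
    """Return the short units of a text that looks like a comma-separated list."""
    units = split_units(sentence.rstrip("."))
    short = [u for u in units if len(u.split()) <= max_words]
    return short if len(short) >= min_units else []


print(hearst_such_as("He visited cities such as New York, London, and Paris."))
print(comma_list("player, captain, manager, director, vice-chairman"))
```

Words that co-occur in such a list are then treated as semantically related contexts of one another, independent of their distance in running text.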

(133)

Preliminary Experiments: Information Extraction

Variant 2: List Extraction

Extracted Lists

player captain manager director vice-chairman

group race culture religion organisation person person Italian Mexican Chinese Creole French

Self-Portraits Portraits iris Still-Lives with Sunflowers view from the Asylum Works after Millet Vineyards

ballscrews leadscrews worm gear screwjacks linear actuator

Cleveland Essex Lincolnshire Northamptonshire Nottinghamshire Thames Valley South Wales

ant.py dimdriver.py dimdriverdatafile.py dimdriverdatasetdef.py dimexception.py dimmaker.py dimoperators.py dimparser.py dimrex.py dimension.py

Jiaqiang Chen and Gerard de Melo 2015

(134)

Preliminary Experiments: Setup

Wikipedia 2010

Normalize to lower case and remove special characters

Contains 1,205,009,010 words

Select words appearing at least 50 times

Vocabulary size 220,521

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(135)

Preliminary Experiments: Setup

Wikipedia 2010

Normalize to lower case and remove special characters

Contains 1,205,009,010 words

Select words appearing at least 50 times

Vocabulary size 220,521

Vector dim. 300

Balance Components simply by controlling starting learning rates: 0.05 for CBOW, varying rates for extracted information

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop
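The preprocessing steps in this setup can be sketched in a few lines. The exact definition of "special characters" is not given on the slide, so the regular expression below (keep only a-z, 0-9, and spaces) is an assumption, as are the function names and the toy corpus.

```python
import re
from collections import Counter


def normalize(line):
    """Lower-case the text and replace everything except a-z, 0-9 and spaces."""
    return re.sub(r"[^a-z0-9 ]+", " ", line.lower())


def build_vocabulary(lines, min_count=50):
    """Count tokens over all lines and keep those seen at least min_count times."""
    counts = Counter(tok for line in lines for tok in normalize(line).split())
    return {tok: count for tok, count in counts.items() if count >= min_count}


# Toy corpus: one sentence repeated 60 times plus a single rare-word line.
corpus = ["The Roman Empire was remarkably multicultural."] * 60
corpus.append("A rare word: petronia!")
vocab = build_vocabulary(corpus, min_count=50)
print(sorted(vocab))
# ['empire', 'multicultural', 'remarkably', 'roman', 'the', 'was']
```

Applied to the full Wikipedia 2010 text, a threshold of 50 occurrences is what reduces the raw token stream to the reported vocabulary size.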

(136)

Preliminary Experiments: Results on WS353

Positive effect from 0.001 until around 0.04

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(137)

Preliminary Experiments: Example

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(138)

Preliminary Experiments: Example

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Award at NAACL 2015 Vector Space Modeling Workshop

(139)

Outline

Large-Scale Knowledge Graphs

Semantics in Action

(140)

History Repeating?

Diagram labels: SMT, NMT, Phrase-Based SMT, Hierarchical Phrases, WSD, MEANT etc.

(141)

Well-Known Issues


(142)

Future: Learning Common-Sense

Source: The New Yorker

(143)

Learning Common-Sense

WebChild

AAAI 2014, WSDM 2014, AAAI 2011

(144)

Lexical Intensity Orderings

Diagram labels: hot, warm, fiery, scorching, ordered with “<” from weak to strong

TACL 2013

(145)

Knowlywood: Human Activities

CIKM 2015

(146)

Extension to Relationships

(147)

Extension to Relationships

Plot labels: petronia, sparrow, bird, parched, arid, dry

Image: http://www.wikihow.com/Read-a-Book-to-a-Baby-or-Infant#/Image:Read-a-Book-to-a-Baby-or-Infant-Step-5.jpg

(148)

Extension to Relationships

Plot labels: petronia, sparrow, bird, parched, arid, dry

Image: http://www.wikihow.com/Read-a-Book-to-a-Baby-or-Infant#/Image:Read-a-Book-to-a-Baby-or-Infant-Step-5.jpg

Should account for relationships (incl. affordances, causality, etc.)

(149)

Extension to Relationships

Assume that she is learning just from text

(150)

Information Extraction from Text

1. Gather large amounts of Patterns

2. Use Web-Scale Data (Google N-Grams, derived from 10^12 words of text)

Hearst-style Bootstrapping with large numbers of seeds

(151)

Extension to Relationships

Commonsense word relationships extracted from Google 1T n-grams

24 relations bootstrapped via ConceptNet → 1,158,141 triples

(152)

Extension to Relationships

earring hasProperty gorgeous

concept definedAs theory

sonar partOf submarine

predator desires food

Commonsense word relationships extracted from Google 1T n-grams

24 relations bootstrapped via ConceptNet → 1,158,141 triples

(153)

Extension to Relationships

(154)

Extension to Relationships

(155)

Extension to Relationships

(156)

Summary

Large-Scale Knowledge Graphs

► Universal WordNet/MENTA: large multilingual taxonomy

► Etymological WordNet

Semantics in Action, e.g.

► Lexvo.org

► Question Answering with YAGO

Future Perspectives

► Vector Representations

► Common-Sense for NLU

More Information: www.demelo.org gdm@demelo.org

Gerard de Melo
