large scale dataset

A Large-Scale Dataset of Popular Open Source Projects

... build large-scale datasets of selected, high quality, real project ...a large-scale dataset, of 4349 projects in 11 general-purpose programming languages gathered from Github ...

7

How well do Computers Solve Math Word Problems? Large Scale Dataset Construction and Evaluation

... both scale and diversity. In this paper, we build a large-scale dataset which is more than 9 times the size of previous ones, and contains many more problem ...the dataset are semi- ...

10

DiscoFuse: A Large Scale Dataset for Discourse Based Sentence Fusion

... Sentence fusion is the task of joining several independent sentences into a single coherent text. Current datasets for sentence fusion are small and insufficient for training modern neural models. In this paper, we ...

13

A SURVEY ON PRIVACY PRESERVATION TECHNIQUES FOR DATA CLUSTERING K MEANS OVER LARGE SCALE DATASET

... huge scale information handling ideal models is MapReduce where, it has been widely looked into and successively received for huge information applications as of late ...

5

KPTimes: A Large Scale Dataset for Keyphrase Generation on News Documents

... Keyphrases are single or multi-word lexical units that best summarise a document (Evans and Zhai, 1996). As such, they are of great importance for indexing, categorising and browsing digital libraries (Witten et al., ...

6

FEVER: a Large scale Dataset for Fact Extraction and VERification

... The pipeline presented and evaluated in the previous section is one possible approach to the task proposed in our dataset, but we envisage different ones to be equally valid and possibly better performing. ...

11

BIGPATENT: A Large Scale Dataset for Abstractive and Coherent Summarization

... Finally, we examine the entity recurrence pattern which captures how many entities, first occurring in the t-th sentence, are repeated in subsequent (t + i)-th sentences. Table 3 (right) shows that, on average, 2.3 ...

10

Improving Open Domain Dialogue Systems via Multi Turn Incomplete Utterance Restoration

... versation dataset from internet communities, and each of the conversations contains at least six ut- ...and large-scale dataset with 200K annotated ...a dataset offers a new way of ...

10

Structured Two-Stream Attention Network for Video Question Answering

... Our STA model achieves state-of-the-art performance on a large-scale dataset: TGIF-QA dataset. To summarize, our major contributions include: 1) We propose a new architecture, Structured ...

8

Countering Position Bias in Instructor Interventions in MOOC Discussion Forums

... a large scale dataset, we conclusively show that instructor interventions exhibit strong position bias, as measured by the position where the thread appeared on the user interface at the time of ...

8

Fast distributed video deduplication via locality-sensitive hashing with similarity ranking

... The exponentially growing amount of video data being produced has led to tremendous challenges for video deduplication technology. Nowadays, many different deduplication approaches are being rapidly developed, but they ...

11

JuICe: A Large Scale Distantly Supervised Dataset for Open Domain Context based Code Generation

... We define two code generation related tasks using JuICe: (1) generating API sequences, and (2) full code generation. The first task is more relaxed and aims to assist users by generating a sequence of all function ...

11

Large Scale Data Handling in Biology

... Later on, the number of readers may be increased. In non-academic institutions the entire architecture is mostly based on external vendor’s informatics applications, so it should be flexible and scalable to fit a variety ...

55

Marine and coastal ecosystem services on the science–policy–practice nexus: challenges and opportunities from 11 European case studies

... The fourth set of challenges was linked to data and methodological gaps in MCES assessments. These entailed (1) lack of empirical or modeled data, particularly geo-referenced socio-economic data (A12; B12; D12; E12; ...

18

Convolutional aggregation of local evidence for large pose face alignment

... Cats&Dogs dataset is a subset of the Oxford-IIIT-Pet dataset [19] which contains a rich variety of cats/dogs breeds, making the dataset particularly ...Our dataset contains 1511 images of ...

12

Size, shape and spatial arrangement of mega-scale glacial lineations from a large and diverse dataset

... The ice – bed interface is a key control on fast ice flow, with associated impacts on ice sheet mass balance and sea level. This is where mega-scale glacial lineations are formed and their study is likely to ...

17

MultiWOZ - A Large Scale Multi Domain Wizard of Oz Dataset for Task Oriented Dialogue Modelling

... As more and more speech oriented applications are commercially deployed, the necessity of building an entirely data-driven conversational agent becomes more apparent. Various corpora were gathered to enable data-driven ...

11

Spider: A Large Scale Human Labeled Dataset for Complex and Cross Domain Semantic Parsing and Text to SQL Task

... text-to-SQL dataset that contains both databases with multiple tables in different domains and complex SQL queries. It tests the ability of a system to generalize to not only new SQL queries and database schemas ...

11

DART: A Large Dataset of Dialectal Arabic Tweets

... new large manually-annotated multi-dialect dataset of Arabic tweets that is publicly ...(DART) dataset has about 25K tweets that are annotated via crowdsourcing and it is well-balanced over five main ...

5

Fusion Architecture of Database for Large and Diverse Dataset

... In past decades the data model used for a standard storage system is a relational data model. Predominantly it uses table and record structure to store data. Recent advancements in the nature and behavior of data in ...

10
