[PDF] Top 20 OCR Post Processing for Low Density Languages

OCR Post Processing for Low Density Languages

... Most correction methods are not suitable for low density languages as they rely on lexicons. Goshtasby and Ehrich (1988) present a lexicon-free method based on probabilistic relaxation labeling. ...

OCR and post correction of historical Finnish texts

... for OCR are: fonts differ in different materials, lack of orthographic standard (same words spelled differently), material quality (some documents can have deformations) and a lexicon of known historical ...

Generating a Training Corpus for OCR Post Correction Using Encoder Decoder Model

... specific, low-resource ...to processing texts with a limited degree of OCR corruption, ...older OCR engines can prove too ...general-purpose OCR post-correction tool should be ...

Upcycle Your OCR: Reusing OCRs for Post OCR Text Correction in Romanised Sanskrit

... an OCR based solution for digitising Romanised ...a Post-OCR text correction approach and is devoid of any OCR-specific feature ...Google OCR on the Sahaśranāma ...for OCR ...

Challenges in Speech Recognition and Translation of High Value Low Density Polysynthetic Languages

... these languages might not conform to established ...syntactic processing of polysynthetic languages pose specific challenges due to the blur between the more usual morphology-syntax distinction ...

Morphological Neural Pre and Post Processing for Slavic Languages

... As regards the evaluation of the whole translation process, results appear not so easy to evaluate. If we take Polish as an example (but the other languages had similar behaviour) we see that pure BLEU ...

A Comparative Study of Extremely Low Resource Transliteration of the World’s Languages

... extremely low-resource nature of the data (on the order of a few hundred training examples), the task proved to be quite ...and post-processing perform comparably on ...

Borrowing Language Resources for Development of Automatic Speech Recognition for Low- and Middle-Density Languages

... those languages having larger biphoneme ...Two languages may be said to be proximate if they are closely related linguistically or if the populations of speakers intermingle or otherwise habituate ...

Low resource Post Processing of Noisy OCR Output for Historical Corpus Digitisation

... Automated tools handle most of the corpus without human intervention, but also identify scenarios where automated accuracy tends to be lowest. In these difficult cases, the automatic tools restrict the problem to a ...

CEA LIST: Processing Low Resource Languages for CoNLL 2018

... In order to understand the impact of the parameters on the architecture, we performed some random search optimization on hyperparameters, for a subset of languages (Russian, Hebrew, French, Uyghur, Vietnamese, ...

Multi modular domain tailored OCR post correction

... of OCR is highly dependent on the quality of the printed source ...of OCR systems is not good enough to serve as a basis for Digital Humanities (DH) ...be post-corrected in a time-consuming and ...

Remote Elicitation of Inflectional Paradigms to Seed Morphological Analysis in Low Resource Languages

... world’s languages, no structured, complete inflectional paradigms in a machine-readable format are available for human language technology (HLT) ap- ...15 languages, with materials ready for eliciting ...

Acronym recognition and processing in 22 languages

... 22 languages and produced various types of ...different languages, we have to bear in mind that these statistics are biased to some extent by the choices we have ...some languages may more frequently ...

Semantic Processing of Compounds in Indian Languages

... Languages at IIT Bombay have developed a tool for automatic extraction of Multi Word Expressions from a corpus that uses minimum linguistic tools such as morphological analysers, and POS taggers. The candidates ...

Pre Processing Images of Public Signage for OCR Conversion

... For processing the dataset images, we have used OpenCV, an open-source library with state-of-the-art computer vision capabilities [3], allowing for quick implementation of our ...

Align Me: A framework to generate Parallel Corpus Using OCRs and Bilingual Dictionaries

... To show the effectiveness of ’Active Learning’ in the alignment task, we have used ’Word Level Error’ than ’Sentence Level Error’. Even if a single word of a sentence have a mis-alignment, all the other words of that ...

A KNN Improved Art Network Approach for Handwritten Character Recognition under Noise

... In this paper, an effective hybrid approach is presented to perform the digital character recognition for noisy image. This recognition system is defined using KNN improved Art Network Approach. In this section, ...

Transfer Learning across Low Resource, Related Languages for Neural Machine Translation

... A common strategy to improve learning of low-resource languages is to use resources from related languages (Nakov and Ng, 2009). However, adapting these resources is not trivial. NMT offers some ...

OCR based Image Processing with Audio Output for Visually Challenged People

... 4) Tesseract: Tesseract is an Open Source OCR engine which is useful in text detection. It is the first engine to provide this type of image processing. First process is Adaptive Thresholding in which the ...

Are post-treatment low-density lipoprotein subclass pattern analyses potentially misleading?

... A practical, clinical challenge regarding advanced lipid testing is that the use of lipoprotein pattern analysis for pre-treatment diagnostic purposes may have very different clinical implications than the use of ...
