Gupta et al., 2013 - Google Patents
Improving mt system using extracted parallel fragments of text from comparable corpora
- Document ID
- 17886930087185393434
- Author
- Gupta R
- Pal S
- Bandyopadhyay S
- Publication year
- 2013
- Publication venue
- Proceedings of the Sixth Workshop on Building and Using Comparable Corpora
Snippet
In this article, we present an automated approach to extracting English-Bengali parallel fragments of text from comparable corpora created using Wikipedia documents. Our approach exploits the multilingualism of Wikipedia. The most important fact is that this …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2872—Rule based translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
- G06F17/277—Lexical analysis, e.g. tokenisation, collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/289—Use of machine translation, e.g. multi-lingual retrieval, server side translation for client devices, real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2863—Processing of non-latin text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2795—Thesaurus; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Ma | | Champollion: A Robust Parallel Text Sentence Aligner. |
| US8548794B2 (en) | | Statistical noun phrase translation |
| CN104750687B (en) | | Improve method and device, machine translation method and the device of bilingualism corpora |
| Gupta et al. | | Improving mt system using extracted parallel fragments of text from comparable corpora |
| Bertoldi | | Phrase-based statistical machine translation with pivot languages |
| US9311299B1 (en) | | Weakly supervised part-of-speech tagging with coupled token and type constraints |
| Pal et al. | | A hybrid word alignment model for phrase-based statistical machine translation |
| Shen et al. | | Effective use of linguistic and contextual information for statistical machine translation |
| Udupa et al. | | “They Are Out There, If You Know Where to Look”: Mining Transliterations of OOV Query Terms for Cross-Language Information Retrieval |
| Paulik et al. | | Sentence segmentation and punctuation recovery for spoken language translation |
| Mrinalini et al. | | Pause-based phrase extraction and effective OOV handling for low-resource machine translation systems |
| Oflazer | | Statistical machine translation into a morphologically complex language |
| Besacier et al. | | The lig english to french machine translation system for iwslt 2012 |
| Mara | | English-Wolaytta Machine Translation using Statistical Approach |
| Talman et al. | | The University of Helsinki submissions to the WMT19 news translation task |
| Shen et al. | | The JHU workshop 2006 IWSLT system |
| Afli et al. | | Building and using multimodal comparable corpora for machine translation |
| Khemakhem et al. | | Integrating morpho-syntactic features in English-Arabic statistical machine translation |
| Mediani et al. | | The kit english-french translation systems for iwslt 2011 |
| Phuoc et al. | | Building a bidirectional english-vietnamese statistical machine translation system by using moses |
| Hatori et al. | | Japanese pronunciation prediction as phrasal statistical machine translation |
| Pal et al. | | How sentiment analysis can help machine translation |
| Fattah et al. | | Stemming to improve translation lexicon creation form bitexts |
| Afli et al. | | Multimodal comparable corpora for machine translation |
| Srivastava et al. | | Segmenting long sentence pairs to improve word alignment in english-hindi parallel corpora |