Comparing word2vec and GloVe for automatic measurement of MWE compositionality
Pickard, 2020
- Document ID
- 4936820352761078028
- Author
- Pickard T
- Publication year
- 2020
- Publication venue
- Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons
Snippet
This paper explores the use of word2vec and GloVe embeddings for unsupervised measurement of the semantic compositionality of MWE candidates. Through comparison with several human-annotated reference sets, we find word2vec to be substantively superior …
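The snippet does not say how compositionality is scored, so the following is only a minimal sketch of one common approach, not necessarily the paper's method: compare the embedding of the whole expression with the sum of its constituents' embeddings using cosine similarity. The function names `cosine` and `compositionality` and the three-dimensional toy vectors are hypothetical and hand-made for illustration; in practice the vectors would come from a trained word2vec or GloVe model in which each MWE candidate is treated as a single token.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def compositionality(mwe_vec, part_vecs):
    """Score an MWE by comparing its own vector with the sum of its parts' vectors.

    Assumption: additive composition; other composition functions are possible.
    """
    composed = [sum(dims) for dims in zip(*part_vecs)]
    return cosine(mwe_vec, composed)


# Hypothetical toy vectors, NOT taken from any trained model.
toy = {
    "climate": [0.9, 0.1, 0.0],
    "change": [0.1, 0.9, 0.0],
    "climate_change": [0.7, 0.7, 0.1],  # close to the sum of its parts
    "red": [0.9, 0.0, 0.1],
    "herring": [0.0, 0.9, 0.1],
    "red_herring": [0.0, 0.1, 0.9],     # far from the sum of its parts
}

score_literal = compositionality(toy["climate_change"], [toy["climate"], toy["change"]])
score_idiom = compositionality(toy["red_herring"], [toy["red"], toy["herring"]])
print(f"climate change: {score_literal:.3f}")
print(f"red herring:    {score_idiom:.3f}")
```

With these toy vectors the literal expression scores higher than the idiomatic one. Scores like these could then be rank-correlated with human compositionality ratings, which is one way to carry out the kind of comparison with human-annotated reference sets that the snippet mentions.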
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/3066—Query translation
- G06F17/30669—Translation of the query language, e.g. Chinese to English
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30634—Querying
- G06F17/30657—Query processing
- G06F17/30675—Query execution
- G06F17/30684—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2765—Recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/28—Processing or translating of natural language
- G06F17/2809—Data driven translation
- G06F17/2827—Example based machine translation; Alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
- G06F17/2715—Statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/30707—Clustering or classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F17/30705—Clustering or classification
- G06F17/3071—Clustering or classification including class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G06F17/30864—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
- G06F17/30867—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| Tran et al. | | JAIST: Combining multiple features for answer selection in community question answering |
| Asghar et al. | | Creating sentiment lexicon for sentiment analysis in Urdu: The case of a resource‐poor language |
| Utiyama et al. | | Reliable measures for aligning Japanese-English news articles and sentences |
| US8380489B1 (en) | | System, methods, and data structure for quantitative assessment of symbolic associations in natural language |
| CN103514213B (en) | | Term extraction method and device |
| Sharjeel et al. | | COUNTER: corpus of Urdu news text reuse |
| Pickard | | Comparing word2vec and GloVe for automatic measurement of MWE compositionality |
| CN110991181B (en) | | Method and apparatus for enhancing labeled samples |
| Awajan | | Keyword extraction from arabic documents using term equivalence classes |
| Kanan et al. | | Extracting named entities using named entity recognizer and generating topics using latent dirichlet allocation algorithm for arabic news articles |
| Awajan | | Semantic similarity based approach for reducing Arabic texts dimensionality |
| Alemneh et al. | | Dictionary based amharic sentiment lexicon generation |
| Dung | | Natural language understanding |
| Hakkani-Tur et al. | | Statistical sentence extraction for information distillation |
| Saqib et al. | | Semi supervised method for detection of ambiguous word and creation of sense: Using WordNet |
| Ali et al. | | Word embedding based new corpus for low-resourced language: Sindhi |
| Atlam et al. | | A new approach for Arabic text classification using Arabic field‐association terms |
| Al-Arfaj et al. | | Arabic NLP tools for ontology construction from Arabic text: An overview |
| Gayen et al. | | Automatic identification of Bengali noun-noun compounds using random forest |
| Ion | | PEXACC: A Parallel Sentence Mining Algorithm from Comparable Corpora. |
| Bao et al. | | Exploring attentive Siamese LSTM for low-resource text plagiarism detection |
| Chen et al. | | Chinese named entity abbreviation generation using first-order logic |
| Abd Rahim et al. | | A Summarisation Tool for Hotel Reviews |
| Lee et al. | | Automatic Generation of Vocabulary Lists with Multiword Expressions |
| Aljuaid | | Arabic-English corpus for cross-language textual similarity detection |