
Pickard, 2020 - Google Patents

Comparing word2vec and GloVe for automatic measurement of MWE compositionality

Pickard, 2020

Document ID
4936820352761078028
Author
Pickard T
Publication year
2020
Publication venue
Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons

Snippet

This paper explores the use of word2vec and GloVe embeddings for unsupervised measurement of the semantic compositionality of MWE candidates. Through comparison with several human-annotated reference sets, we find word2vec to be substantively superior …
Continue reading at eprints.whiterose.ac.uk (PDF) (other versions)
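
The snippet does not say how compositionality is scored. As a minimal sketch only, assuming one common approach (cosine similarity between the embedding of the whole MWE and the sum of its constituent word embeddings), the idea could look like the following. The function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def compositionality(phrase_vec, constituent_vecs):
    """Assumed score: similarity between the MWE vector and the
    element-wise sum of its constituent word vectors."""
    dim = len(phrase_vec)
    composed = [sum(vec[i] for vec in constituent_vecs) for i in range(dim)]
    return cosine(phrase_vec, composed)


# Toy 3-dimensional vectors, invented purely for illustration.
# In practice these would come from a trained word2vec or GloVe model
# in which each MWE candidate was treated as a single token.
toy = {
    "swimming": [0.9, 0.1, 0.0],
    "pool": [0.8, 0.2, 0.1],
    "swimming_pool": [0.85, 0.15, 0.05],
    "red": [0.1, 0.9, 0.0],
    "herring": [0.0, 0.8, 0.3],
    "red_herring": [0.7, 0.0, 0.7],
}

literal = compositionality(toy["swimming_pool"], [toy["swimming"], toy["pool"]])
idiomatic = compositionality(toy["red_herring"], [toy["red"], toy["herring"]])
print(f"swimming pool: {literal:.3f}")
print(f"red herring:   {idiomatic:.3f}")
```

Under this assumption, a higher score suggests a more compositional expression. Scores of this kind would then be compared against human-annotated compositionality ratings, for example with a rank correlation.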

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/3066 Query translation
    • G06F17/30669 Translation of the query language, e.g. Chinese to English
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/30675 Query execution
    • G06F17/30684 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G06F17/2827 Example based machine translation; Alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G06F17/2715 Statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30705 Clustering or classification
    • G06F17/30707 Clustering or classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30705 Clustering or classification
    • G06F17/3071 Clustering or classification including class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers
    • G06F17/30864 Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
    • G06F17/30867 Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores

Similar Documents

Publication Publication Date Title
Tran et al. JAIST: Combining multiple features for answer selection in community question answering
Asghar et al. Creating sentiment lexicon for sentiment analysis in Urdu: The case of a resource‐poor language
Utiyama et al. Reliable measures for aligning Japanese-English news articles and sentences
US8380489B1 (en) System, methods, and data structure for quantitative assessment of symbolic associations in natural language
CN103514213B (en) Term extraction method and device
Sharjeel et al. COUNTER: corpus of Urdu news text reuse
Pickard Comparing word2vec and GloVe for automatic measurement of MWE compositionality
CN110991181B (en) Method and apparatus for enhancing labeled samples
Awajan Keyword extraction from arabic documents using term equivalence classes
Kanan et al. Extracting named entities using named entity recognizer and generating topics using latent dirichlet allocation algorithm for arabic news articles
Awajan Semantic similarity based approach for reducing Arabic texts dimensionality
Alemneh et al. Dictionary based amharic sentiment lexicon generation
Dung Natural language understanding
Hakkani-Tur et al. Statistical sentence extraction for information distillation
Saqib et al. Semi supervised method for detection of ambiguous word and creation of sense: Using WordNet
Ali et al. Word embedding based new corpus for low-resourced language: Sindhi
Atlam et al. A new approach for Arabic text classification using Arabic field‐association terms
Al-Arfaj et al. Arabic NLP tools for ontology construction from Arabic text: An overview
Gayen et al. Automatic identification of Bengali noun-noun compounds using random forest
Ion PEXACC: A Parallel Sentence Mining Algorithm from Comparable Corpora.
Bao et al. Exploring attentive Siamese LSTM for low-resource text plagiarism detection
Chen et al. Chinese named entity abbreviation generation using first-order logic
Abd Rahim et al. A Summarisation Tool for Hotel Reviews
Lee et al. Automatic Generation of Vocabulary Lists with Multiword Expressions
Aljuaid Arabic-English corpus for cross-language textual similarity detection