Stars
Scrapy, a fast high-level web crawling & scraping framework for Python.
The fastai book, published as Jupyter Notebooks
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
An open source library for deep learning end-to-end dialog systems and chatbots.
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
Aspect Based Sentiment Analysis, PyTorch Implementations.
Source code and dataset for ACL 2019 paper "ERNIE: Enhanced Language Representation with Informative Entities"
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
PyTorch Library for Active Learning to accompany Human-in-the-Loop Machine Learning book
ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the u…
CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer…
Supplementary Materials for "Quantitative Social Science: An Introduction"
Quote extraction for modular journalism (JournalismAI collab 2021)
🕵🏽‍♀️ Identifying the author behind The New York Times op-ed from inside the Trump White House.
Applying NLP transfer learning techniques to predict Tweet stance toward a topic
Detect the sentiment captured in short pieces of text
NAACL 2019 (Oral): Code for "Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings"
Course materials: POIR 613 - Computational Social Science - USC Fall 2019
Unsupervised method for extracting quotation-speaker pairs from large news corpora.
Code and data for the WSDM '21 paper "Quotebank: A Corpus of Quotations from a Decade of News"
Usage examples for ARC resources
Utilities for retrieving whitehouse.gov transcripts and matching news quotes to them
Text as Data Course Taught at Yale University, November 15 2019
A Dataset for Direct Quotation Extraction and Attribution in News Articles.