Stars
Quote extraction for modular journalism (JournalismAI collab 2021)
ConvoKit is a toolkit for extracting conversational features and analyzing social phenomena in conversations. It includes several large conversational datasets along with scripts exemplifying the u…
Utilities for retrieving whitehouse.gov transcripts and matching news quotes to them
Work on reading/analyzing presidential addresses and media coverage, at MPI-SWS 2014
Code and data for the WSDM '21 paper "Quotebank: A Corpus of Quotations from a Decade of News"
A Dataset for Direct Quotation Extraction and Attribution in News Articles.
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
Usage examples for ARC resources
NAACL 2019 (Oral): Code for "Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings"
The fastai book, published as Jupyter Notebooks
PyTorch Library for Active Learning to accompany Human-in-the-Loop Machine Learning book
Scrapy, a fast high-level web crawling & scraping framework for Python.
A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
Detect the sentiment captured in short pieces of text
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
An open source library for deep learning end-to-end dialog systems and chatbots.
chsuong / CNTK
Forked from microsoft/CNTK
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
Course materials: POIR 613 - Computational Social Science - USC Fall 2019
Supplementary Materials for "Quantitative Social Science: An Introduction"
Applying NLP transfer learning techniques to predict Tweet stance toward a topic
Text as Data course taught at Yale University, November 15, 2019
Source code and dataset for ACL 2019 paper "ERNIE: Enhanced Language Representation with Informative Entities"