Stars
A Survey on Data Selection for Language Models
Implementation of TSDS: Data Selection for Task-Specific Model Finetuning
This repository contains the code to reproduce the experiments for Rapidash, an efficient system for detecting violations of denial constraints.
A benchmark of data-centric tasks from across the machine learning lifecycle.
Code repository for the EMNLP 2021 paper 'Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods'
Large scale graph learning on a single machine.
Picket is a system that safeguards against data corruptions during both training and deployment of machine learning models over tabular data.
A knowledge base construction engine for richly formatted data
A Keras Implementation of Fast-Neural-Style