Training Data Attribution (TDA): Examining Its Adoption & Use Cases

Cheng, Deric; Bae, Juhan; Bullock, Justin; Kristofferson, David

arXiv:2501.12642 (cs)

[Submitted on 22 Jan 2025]

Abstract: This report investigates Training Data Attribution (TDA) and its potential importance to, and tractability for, reducing extreme risks from AI. First, we discuss how plausible it is, and how much effort it would take, to bring existing TDA research from its current state to an efficient and accurate tool for TDA inference that can be run on frontier-scale LLMs. Next, we discuss the research benefits AI labs could expect from using such TDA tooling. Then, we discuss a key outstanding bottleneck that would limit public access to such tooling: AI labs' willingness to disclose their training data. We suggest ways AI labs may work around these limitations, and discuss the willingness of governments to mandate such access. Assuming that AI labs willingly provide access to TDA inference, we then discuss the high-level societal benefits that might follow, and list and discuss a series of policies and systems that TDA may enable. Finally, we present an evaluation of TDA's potential impact on mitigating large-scale risks from AI systems.

Subjects:	Computers and Society (cs.CY)
Cite as:	arXiv:2501.12642 [cs.CY]
	(or arXiv:2501.12642v1 [cs.CY] for this version)
	https://doi.org/10.48550/arXiv.2501.12642

Submission history

From: Deric Cheng
[v1] Wed, 22 Jan 2025 05:03:51 UTC (2,652 KB)
