-
Exploring the limits of strong membership inference attacks on large language models
Jamie Hayes, Ilia Shumailov, Christopher A. Choquette-Choo, Matthew Jagielski, George Kaissis, Milad Nasr, Sahra Ghalebikesabi, Meenatchi Sundaram Mutu Selva Annamalai, Niloofar Mireshghallah, Igor Shilov, Matthieu Meeus, Yves-Alexandre de Montjoye, Katherine Lee, Franziska Boenisch, Adam Dziedzic, A. Feder Cooper
arXiv preprint arXiv:2505.18773 (2025)
-
If open source is to win, it must go public
Joshua Tan, Nicholas Vincent, Katherine Elkins, Magnus Sahlgren
arXiv preprint arXiv:2507.09296 (2025)
-
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens
Jiacheng Liu, Taylor Blanton, Yanai Elazar, Sewon Min, YenSung Chen, Arnavi Chheda-Kothary, Huy Tran, Byron Bischoff, Eric Marsh, Michael Schmitz, et al.
arXiv preprint arXiv:2504.07096 (2025)
-
LLM Dataset Inference: Did you train on my dataset?
Pratyush Maini, Hengrui Jia, Nicolas Papernot, Adam Dziedzic
arXiv preprint arXiv:2406.06443 (2024)
-
Poisoning Web-Scale Training Datasets is Practical
Nicholas Carlini, Matthew Jagielski, Christopher A. Choquette-Choo, Daniel Paleka, Will Pearce, Hyrum Anderson, Andreas Terzis, Kurt Thomas, Florian Tramèr
(2024)
-
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert, Meg Tong, Monte MacDiarmid, Tamera Lanham, Daniel M. Ziegler, Tim Maxwell, Newton Cheng, Adam Jermyn, Amanda Askell, Ansh Radhakrishnan, Cem Anil, David Duvenaud, Deep Ganguli, Fazl Barez, Jack Clark, Kamal Ndousse, Kshitij Sachan, Michael Sellitto, Mrinank Sharma, Nova DasSarma, Roger Grosse, Shauna Kravec, Yuntao Bai, Zachary Witten, Marina Favaro, Jan Brauner, Holden Karnofsky, Paul Christiano, Samuel R. Bowman, Logan Graham, Jared Kaplan, Sören Mindermann, Ryan Greenblatt, Buck Shlegeris, Nicholas Schiefer, Ethan Perez
(2024)
-
What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions
arXiv preprint arXiv:2405.13954 (2024)
-
Quantifying Memorization Across Neural Language Models
Nicholas Carlini, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramèr, Chiyuan Zhang
International Conference on Learning Representations (2023)
-
The Dimensions of Data Labor: A Road Map for Researchers, Activists, and Policymakers to Empower Data Producers
Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (2023)
-
Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher, Robert Geirhos, Shashank Shekhar, Surya Ganguli, Ari Morcos
Advances in Neural Information Processing Systems (2022)
-
Training Data Influence Analysis and Estimation: A Survey
Zayd Hammoudeh, Daniel Lowd
arXiv preprint arXiv:2212.04612 (2022)
-
Beta Shapley: A Unified and Noise-Reduced Data Valuation Framework for Machine Learning
arXiv preprint arXiv:2110.14049 (2021)
-
Data Leverage: A Framework for Empowering the Public in its Relationship with Technology Companies
Nicholas Vincent, Hanlin Li, Nicole Tilly, Stevie Chancellor, Brent Hecht
Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (2021)
-
Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom B. Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, Nicolas Papernot
Proceedings of USENIX Security Symposium (2021)
-
Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor
Chau Tran, Kaylea Champion, Andrea Forte, Benjamin Mako Hill, Rachel Greenstadt
2020 IEEE Symposium on Security and Privacy (SP) (2020)
-
Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
arXiv preprint arXiv:2005.14165 (2020)
-
Data Shapley: Equitable Valuation of Data for Machine Learning
International Conference on Machine Learning (2019)
-
Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations
Ziad Obermeyer, Brian Powers, Christine Vogeli, Sendhil Mullainathan
Science (2019)
-
Fairness and Abstraction in Sociotechnical Systems
Andrew D. Selbst, Danah Boyd, Sorelle A. Friedler, Suresh Venkatasubramanian, Janet Vertesi
Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAccT) (2019)
-
Mapping the Potential and Pitfalls of "Data Dividends" as a Means of Sharing the Profits of Artificial Intelligence
Nicholas Vincent, Yichun Li, Renee Zha, Brent Hecht
arXiv preprint arXiv:1912.00757 (2019)
-
On the Accuracy of Influence Functions for Measuring Group Effects
Pang Wei Koh, Kai-Siang Ang, Hubert H. K. Teo, Percy Liang
Advances in Neural Information Processing Systems (2019)
-
Privacy, anonymity, and perceived risk in open collaboration: A study of service providers
Nora McDonald, Benjamin Mako Hill, Rachel Greenstadt, Andrea Forte
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (2019)
-
Reconciling modern machine-learning practice and the classical bias–variance trade-off
Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal
Proceedings of the National Academy of Sciences (2019)
-
Towards Efficient Data Valuation Based on the Shapley Value
Ruoxi Jia, Dah-Yuan Dao, Boxin Wang, Frances Ann Hubis, Nick Hynes, Neslihan M. Gurel, Carl J. Spanos
International Conference on Artificial Intelligence and Statistics (2019)
-
A Reductions Approach to Fair Classification
Alekh Agarwal, Alina Beygelzimer, Miroslav Dudik, John Langford, Hanna Wallach
International Conference on Machine Learning (2018)
-
Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification
Joy Buolamwini, Timnit Gebru
Proceedings of the Conference on Fairness, Accountability and Transparency (FAT*) (2018)
-
Should We Treat Data as Labor? Moving Beyond 'Free'
Imanol Arrieta-Ibarra, Leonard Goff, Diego Jimenez-Hernandez, Jaron Lanier, E. Glen Weyl
AEA Papers and Proceedings (2018)
-
Deep learning scaling is predictable, empirically
Joel Hestness, Sharan Narang, Newsha Ardalani, Gregory Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou
arXiv preprint arXiv:1712.00409 (2017)
-
Big Data's Disparate Impact
Solon Barocas, Andrew D. Selbst
California Law Review (2016)
-
The Algorithmic Foundations of Differential Privacy
Cynthia Dwork, Aaron Roth
Foundations and Trends in Theoretical Computer Science (2014)
-
Robust De-anonymization of Large Sparse Datasets
Arvind Narayanan, Vitaly Shmatikov
Proceedings of the IEEE Symposium on Security and Privacy (2008)
-
Simple Demographics Often Identify People Uniquely
Latanya Sweeney
Carnegie Mellon University, Data Privacy Working Paper (2000)