AI Conference Reading List

Reading List: Data Governance in the Age of Generative AI

The Sources of LLM Data

In the Wake of Generative AI, Industry-Led Standards for Data Scraping Are a Must

Center for Data Innovation

Generative AI Is Scraping Your Data. So, Now What?

Dark Reading

Revealed: The Authors Whose Pirated Books Are Powering Generative AI

The Atlantic

AI Researchers Uncover Ethical, Legal Risks to Using Popular Data Sets

The Washington Post

AI2 Dolma: 3 Trillion Token Open Corpus for Language Model Pretraining 

Medium

 

The Continuum of Closed and Open LLMs and their Implications for Data Governance

Open-Sourcing Highly Capable Foundation Models

Centre for the Governance of AI

Will Open Source AI Shift Power from ‘Big Tech’? It Depends.

Tech Policy Press

GitHub And Others Call For More Open-Source Support in EU AI Law

The Verge

Opening up ChatGPT: Tracking Openness, Transparency, And Accountability in Instruction-Tuned Text Generators

Association for Computing Machinery

The Open-Source AI Boom is Built on Big Tech’s Handouts. How Long Will it Last?

MIT Technology Review

How to Promote Responsible Open Foundation Models

Stanford University

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

Cornell University

The Foundation Model Transparency Index

Center for Research on Foundation Models

Evaluating LLMs is a minefield

Princeton University

What the executive order means for openness in AI

AI Snake Oil

Open Sourcing The AI Revolution

Demos UK

 

 

Data Openness and Society

The Paradox of Open

Open Future

AI is Tearing Wikipedia Apart

Vice

ChatGPT Stole Your Work. So What Are You Going to Do?

WIRED

My Books Were Used to Train Meta’s Generative AI. Good.

The Atlantic

Right to be Forgotten in The Era of Large Language Models: Implications, Challenges, And Solutions

Cornell University

Are Large Language Models a Threat to Digital Public Goods? Evidence from Activity on Stack Overflow

Cornell University

The Privacy-Bias Trade-Off

Stanford University

Workers Could be the Ones to Regulate AI

Financial Times

  

  

What are Governments Doing to Close the Data Governance Gaps?

A law for foundation models: the EU AI Act can improve regulation for fairer competition

OECD.AI

France bets big on open-source AI

Politico

 

AI Safety Summit: Introduction

GOV.UK

China Bets on Open-Source Technologies to Boost Domestic Innovation

Merics

New Ideas for Shared Data Governance

Data Dysphoria: The Governance Challenge Posed by Large Learning Models

Social Science Research Network

AI_Commons

Open Future

Stewarding The Sum of All Knowledge in The Age of AI

Open Future

Core Considerations for Exploring AI Systems as Digital Public Goods

Digital Public Goods

Generative AI And The Digital Commons

Cornell University

Datasheets for Datasets

Cornell University

 

Anthropic Thinks ‘Constitutional AI’ is The Best Way to Train Models

TechCrunch

Regulating ChatGPT And Other Large Generative AI Models

Cornell University

“Did ChatGPT Really Say That?”: Provenance in The Age of Generative AI

Harvard University Library Innovation Lab

New Synthetic Data Techniques Could Change The Way AI Models are Trained

Semafor

 

Speaking in Tongues: Teaching Local Languages to Machines

Digital Impact Alliance

Data Governance in the Age of Large-Scale Data-Driven Language Technology

Cornell University

Data Governance in the Age of Generative AI

Adam Zable - Koch Research Fellow

Adam Zable

Director of Emerging Technologies – Digital Trade & Data Governance Hub