MIT MAS.S68!

Instructors

Doug Beeferman

Co-Instructors are listed in alphabetical order, and their specific contributions are as follows:

Doug co-initiated the class, co-developed the syllabus and lecture materials, gave two programming workshops, and assisted with homework development and project guidance.
Suyash co-initiated the class, led homework design and grading, and co-developed lecture materials.
Hang co-initiated the class, developed the course website, and co-developed the syllabus, lecture materials, and homeworks.
Jad co-developed the syllabus and lecture materials, developed and managed the course project, and contributed to the design of homeworks.
Shayne co-developed the syllabus, lectures, and homeworks, and spearheaded the course content on language model evaluation as well as the public release of the course content.
Hope co-initiated and spearheaded the class, coordinated the activities of the other co-instructors, and co-developed the syllabus and lecture materials.
Deb provided general guidance and support for development of the class.

Schedule

The current class schedule is below (subject to change):

Date	Description	Course Materials
Feb 8	Part 1: Background on LLMs [Slides] Introduction and motivation Class structure and logistics Language modeling overview Definitions A brief history of LMs LLM fundamentals Overview of ways to train, tune, and prompt LLMs Fine-tuning, zero-shot prompts, few-shot prompts, chain-of-thought prompts Prompt tuning Examples of LLM prompts Part 2: Get your hands dirty with ChatGPT [Slides] Create a prompting task in groups.	Required Readings: Percy Liang's introduction to LLMs
Feb 15	Part 1: Evaluating models [Slides] How can we best evaluate these models for accuracy, fairness, bias, robustness, and other factors? Speaker: Rishi Bommasani (Stanford) Title: Holistically Evaluating Language Models on the Path to Evaluating Foundation Models Part 2: LLMs in Applications [Slides] People are increasingly interacting with human-facing tools that incorporate LLMs, like ChatGPT, writing assistants, and character generators. How might we go about evaluating these systems and their impacts on people? In this session we will consider 10 recent commercial and research applications of LLMs. Students will be asked to come prepared to critique the designs of one of these applications along different dimensions that we will describe in week 1.	Required Readings: Holistic Evaluation of Language Models (HELM) Recommended Readings: On the Opportunities and Risks of Foundation Models Discovering Language Model Behaviors with Model-Written Evaluations All That’s ‘Human’ Is Not Gold: Evaluating Human Evaluation of Generated Text Beyond Accuracy: Behavioral Testing of NLP Models with CheckList How to do human evaluation: A brief introduction to user studies in NLP Dynabench: Rethinking Benchmarking in NLP Word Embeddings Quantify 100 years of Gender and Ethnic Stereotypes
Feb 22	Part 1: Using LLMs for Consensus Across Preferences [Slides] Speaker: Michiel Bakker (DeepMind) Title: Fine-tuning Language Models to Find Agreement among Humans with Diverse Preferences Part 2: Project Pitch [Slides] Students present their project idea and form teams.	Required Readings: Fine-tuning Language Models to Find Agreement among Humans with Diverse Preferences Engaging Politically Diverse Audiences on Social Media
Mar 1	Part 1: Emergent Abilities of LLMs [Slides] This talk will cover broad intuitions about how large language models work. First, we will begin by examining some examples of what language models can learn by reading the internet. Second, we will consider why language models have gained traction recently and what new abilities they have that were not present in the past. Third, we will cover how language models can perform complex reasoning tasks. Finally, the talk will discuss how language models can have an improved user interface via instruction following. Speaker: Jason Wei (OpenAI) Title: Emergence in Large Language Models Part 2: NLP Evaluation Methods and Red Teaming [Slides]	Required Readings: Emergent Abilities of Large Language Models Chain of Thought Prompting Elicits Reasoning in Large Language Models Scaling Instruction-Finetuned Language Models Recommended Readings: Dissociating Language and Thought in Large Language Models: A Cognitive Perspective Discovering Latent Knowledge in Language Models Without Supervision Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation Learning to Summarize with Human Feedback
Mar 8	Part 1: Evaluating Human-model Interactions [Slides] Speaker: Mina Lee (Stanford) Title: Designing and Evaluating Language Models for Human Interaction Part 2: Human Experiments and Evaluation Methods [Slides]	Required Readings: Evaluating Human-Language Model Interaction CoAuthor: Designing a Human-AI Collaborative Writing Dataset for Exploring Language Model Capabilities Recommended Readings: Power to the People? Opportunities and Challenges for Participatory AI
Mar 15	Media Lab Research Panel Speaker: Ziv Epstein (PhD Student at MIT Media Lab, Human Dynamics) Title: Social Science Methods for Understanding Generative AI Speaker: Matt Groh (PhD Student at MIT Media Lab, Affective Computing) Title: Deepfake Detection Speaker: Trudy Painter (UROP/MEng at MIT Media Lab, Viral Communications) Title: Latent Lab: Generative ML as an Exploration Partner Speaker: Belén Saldias Fuentes (PhD Student at MIT Media Lab, MIT Center for Constructive Communication) Title: Community-aligned Content Moderation with Rationale Generation Speaker: Hang Jiang (PhD Student at MIT Media Lab, MIT Center for Constructive Communication) Title: CommunityLM: Probing Partisan Worldviews from Language Models	Related Readings: Ziv Epstein: Who Gets Credit for AI-Generated Art? Deceptive AI Systems That Give Explanations Are Just as Convincing as Honest AI Systems in Human-Machine Decision Making Matthew Groh: Deepfake Detection by Human Crowds, Machines, and Machine-informed Crowds Human Detection of Political Deepfakes across Transcripts, Audio, and Video Trudy Painter: Latent Lab Belén Saldías Fuentes: Human-AI Collaboration for Content Curation @ Reddit Hang Jiang: CommunityLM: Probing Partisan Worldviews from Language Models Relevant work from Eric Chu (CCC): Language Models Trained on Media Diets Can Predict Public Opinion
Mar 22	Part 1: AI-Mediated Communication [Slides] This talk will discuss the phenomenon of AI-Mediated Communication (AI-MC) and its potential impact on human communication outcomes, language use, and interpersonal trust. The speaker outlines early experimental findings showing that AI involvement can shift written content and opinions, change message ownership, impact blame assignment, and affect trust evaluations, highlighting the need for new approaches to the development and deployment of these technologies. Speaker: Mor Naaman (Cornell Tech) Title: "My AI must have been broken": Understanding our Future of AI-Mediated Communication Part 2: Discussion of public policies on AI-generated content. [Slides]	Required Readings: Human Heuristics for AI-Generated Language Are Flawed AI-Mediated Communication: How the Perception that Profile Text was Written by AI Affects Trustworthiness Recommended Readings: Interacting with Opinionated Language Models Changes Users’ Views Artificial Intelligence Can Persuade Humans on Political Issues
Mar 29	Break
Apr 5	Part 1: LLMs as Simulated Agents [Slides] Speaker: John Horton (MIT) Title: Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus? Part 2: Discussion of the call for a 6-month AI moratorium: "Pause Giant AI Experiments: An Open Letter". [Slides]	Required Readings: Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus? Recommended Readings: Language Models as Agent Models Out of One, Many: Using Language Models to Simulate Human Samples Quantifying the Narrative Flow of Imagined versus Autobiographical Stories Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies Generative Agents: Interactive Simulacra of Human Behavior Can AI Language Models Replace Human Participants? Social Simulacra: Creating Populated Prototypes for Social Computing Systems Want To Reduce Labeling Cost? GPT-3 Can Help
Apr 12	Societal Impacts of LLMs [Slides]	Required Readings: Anatomy of an AI System On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Lessons from the GPT-4chan Controversy GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models Recommended Readings: Foundation Models and Fair Use Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models Ethical and Social Risks of Harm from Language Models GPT-4 Chan Controversy Evaluating Verifiability in Generative Search Engines “Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI What Happens When ChatGPT Starts to Feed on Its Own Writing?
Apr 19	Risks and Tools for Transparency [Slides]	Required Readings: Auditing Large Language Models: A Three-layered Approach Using Algorithm Audits to Understand AI Google denies Bard was trained with ChatGPT data Assessing the Risks of Language Model “Deepfakes” to Democracy Recommended Readings: How ChatGPT Hijacks Democracy How generative AI impacts democratic engagement A Watermark for Large Language Models Extracting Training Data from Large Language Models Locating and Editing Factual Associations in GPT Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned Datasheets for Datasets
Apr 26	Final project presentations I
May 3	Final project presentations II
May 10	No Class (work on final papers)
May 17	Project Submission Deadline

MAS.S68: Generative AI for Constructive Communication
Evaluation and New Research Methods

Spring 2023
