CSCI 5541, NLP

Fall 2023, Tuesday and Thursday, 11:15am to 12:30pm, Mechanical Engineering 108

Course Information

Natural Language Processing (NLP) is an interdisciplinary field that is based on theories in linguistics, cognitive science, and social science. The main focus of NLP is building computational models for applications such as machine translation and dialogue systems that can then interact with real users. Research and development in NLP therefore also includes considering important issues related to real-world AI systems, such as bias, controllability, interpretability, and ethics. This course will cover a broad range of topics related to NLP, from theories to computational models and applications to data annotation and evaluation. Students will read papers on those topics, create an annotated dataset, and implement algorithms for applications they are interested in. There will be a semester-long class project where you collect your own dataset, ensure it is accurate, develop a model using existing computing tools, evaluate the system, and consider its ethical and societal impacts. Grades are based on the course project, participation, and assignments.

8980 vs 5980 vs 5541: Some lectures will be shared across the three classes, but each class has a different focus. 5980 (NLP with Deep Learning) focuses more on the "processing" parts of NLP, particularly with deep learning methods; students will gain an introduction to cutting-edge techniques in deep learning for NLP. 8980 (Intro to NLP Research) covers broad aspects of NLP research as an interdisciplinary problem, including theory grounding, data annotation, error analysis, and applications to different fields. 5541 (NLP, current course) is an introductory class that covers basic NLP techniques with applications such as question answering, dialogue, and machine translation.

All class material will be posted on the class page. We will use Canvas for homework submissions and grading, and Slack for discussion and Q&A.

Instructor: Dongyeop Kang (a.k.a. DK)
Class meets: Tuesday and Thursday, 11:15am to 12:30pm, Mechanical Engineering 108
TAs: Zae Myung Kim (kim01756@umn.edu); Shirley Anugrah Hayati (hayat023@umn.edu)
Office hours: DK: Friday, 4pm - 4:30pm in Shepherd 259; Zae: Monday, 5pm - 5:30 PM via Zoom; Shirley: Wednesday, 10am-10:30 PM via Google Meet
Class page: https://dykang.github.io/classes/csci5541/F23
Slack: csci5541f23.slack.com/
Canvas: canvas.umn.edu/courses/391352

Grading and Late Policy

Grading

50% Homework (HW1 and HW2 individual; HW3 and HW4 team)
30% Project (team)
10% Reading Assignment (individual)
10% Class Participation

Late policy for deliverables

Each student is granted 2 free late days to use on homework over the duration of the semester. After all free late days are used up, the penalty is 25% for each additional late day. However, projects submitted late after all late days have been used will receive no credit.
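As a rough illustration of how the homework late policy plays out, the Python sketch below computes the penalty for a single homework. The function name `late_penalty`, the choice to spend free late days automatically before any penalty applies, and the cap at 100% are assumptions made for this example, not stated course policy; ask the instructor or TAs about edge cases.

```python
def late_penalty(days_late, free_days_left=2, rate=0.25):
    """Return (penalty fraction, free late days remaining) for one homework.

    Assumptions (not stated in the policy): free late days are used first,
    and the total penalty is capped at 100% of the homework's points.
    """
    used = min(days_late, free_days_left)   # free late days spent on this homework
    penalized = days_late - used            # late days that incur the 25% penalty
    penalty = min(1.0, rate * penalized)    # cap at full loss of credit
    return penalty, free_days_left - used


# Three days late with both free days still available:
# two days are free, one day is penalized.
print(late_penalty(3))  # (0.25, 0)
```

Under these assumptions, a student with no free days left who submits two days late would lose 50% of that homework's points.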

Schedule

We will cover basic NLP representations to build text classifiers, P(y|x), and language models, P(x), with some advanced topics. You will develop your own NLP systems during the semester-long project. Please pay attention to due dates and reading assignments.
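To make the two quantities concrete, here is a minimal Python sketch on a toy corpus: a Naive Bayes classifier that estimates P(y|x) and a unigram language model that estimates P(x). The corpus, the function names `classify_proba` and `unigram_lm_prob`, and the use of add-one smoothing are all illustrative assumptions, not the models used in the homework.

```python
from collections import Counter

# Toy labeled corpus (hypothetical data, for illustration only).
train = [
    ("good great fun", "pos"),
    ("great acting", "pos"),
    ("bad boring", "neg"),
    ("boring plot bad acting", "neg"),
]
vocab = {w for text, _ in train for w in text.split()}


def classify_proba(text):
    """Estimate P(y|x) with Naive Bayes and add-one smoothing."""
    label_counts = Counter(y for _, y in train)
    word_counts = {y: Counter() for y in label_counts}
    for t, y in train:
        word_counts[y].update(t.split())
    scores = {}
    for y in label_counts:
        p = label_counts[y] / len(train)            # prior P(y)
        total = sum(word_counts[y].values())
        for w in text.split():                      # likelihood P(w|y)
            p *= (word_counts[y][w] + 1) / (total + len(vocab))
        scores[y] = p
    z = sum(scores.values())                        # normalize over labels
    return {y: s / z for y, s in scores.items()}


def unigram_lm_prob(text):
    """Estimate P(x) as a product of add-one-smoothed unigram probabilities."""
    counts = Counter(w for t, _ in train for w in t.split())
    total = sum(counts.values())
    p = 1.0
    for w in text.split():
        p *= (counts[w] + 1) / (total + len(vocab))
    return p


print(classify_proba("great fun"))      # distribution over labels given the text
print(unigram_lm_prob("great acting"))  # probability of the text itself
```

The classifier scores labels given a fixed input, while the language model scores the input itself; out-of-vocabulary words are only handled crudely by the smoothing in this sketch.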

Date	Topic	Readings
Sep 5	Class Overview
Sep 7	Intro to NLP
Sep 12	Text Classification (1) (updated); Tutorial on Scikit-Learn programming (1) (Zae); HW0 out	Determining the sentiment of opinions; From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series; Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
Sep 14	Text Classification (2) (updated); Tutorial on PyTorch programming (2) (Zae); HW0 due	Beyond Accuracy: Behavioral Testing of NLP Models with CheckList; Does BERT Learn as Humans Perceive? Understanding Linguistic Styles through Lexica; Style is NOT a single variable: Case Studies for Cross-Style Language Understanding
Sep 19	Finetuning a Classifier (Shirley); Tutorial on HuggingFace's Transformers (Shirley); HW1 out	Text classifier with NLTK and Scikit-Learn
Sep 21	Lexical Semantics	FrameNet II: Extended Theory and Practice and FrameNet Project; OntoNotes: The 90% Solution; WordNet
Sep 26	Distributional Semantics and Word Vectors (1)	From Frequency to Meaning: Vector Space Models of Semantics; Efficient Estimation of Word Representations in Vector Space; Gensim's word2vec tutorial
Sep 28	Distributional Semantics and Word Vectors (2); HW1 due	Chapter 3 of Jurafsky and Martin; A Neural Probabilistic Language Model; Long Short-Term Memory
Oct 3	Language Models (1): Ngram LM, Neural LM; HW2 out	Linguistic Regularities in Continuous Space Word Representations; GloVe: Global Vectors for Word Representation; Retrofitting Word Vectors to Semantic Lexicons
Oct 5	Language Models (2): RNNs, Search Algorithms	The Curious Case of Neural Text Degeneration; Mutual Information and Diverse Decoding Improve Neural Machine Translation
Oct 10	Language Models (3): Search in Training, Evaluation	Sequence Level Training with Recurrent Neural Networks; An Actor-Critic Algorithm for Sequence Prediction; Training language models to follow instructions with human feedback
Oct 12	Project Guideline; HW2 due (Oct 15, Sunday)
Oct 17	Contextualized Word Embeddings; HW3 out; Project Team Formation Due	Deep contextualized word representations; BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding; A Primer in BERTology: What we know about how BERT works
Oct 19	Deep Dive on Transformer (1)	Attention is All you Need; Tutorial on Illustrated Transformer; Language Models are Unsupervised Multitask Learners
Oct 24	Deep Dive on Transformer (2)	Language Models are Few-Shot Learners; Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Oct 26	Project Proposal Pitch and Discussion (1); HW3 due (Oct 29, Sunday)	Slides Deck for Group A; A list of teams for Group A will be announced.
Oct 31	Project Proposal Pitch and Discussion (2)	Slides Deck for Group B; A list of teams for Group B will be announced.
Nov 2	Pretraining; Project proposal due	Scaling Laws for Neural Language Models; On the Opportunities and Risks of Foundation Models; On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?
Nov 7	Prompting (1); HW4 out	Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing; Calibrate Before Use: Improving Few-Shot Performance of Language Models; Prefix-Tuning: Optimizing Continuous Prompts for Generation
Nov 9	Prompting (2)	Training language models to follow instructions with human feedback; Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Nov 14	Data Annotation	Annotation Artifacts in Natural Language Inference Data; Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics; Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information; ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks
Nov 16	Ethics in AI (Shirley)	The Ethics of Artificial Intelligence; On Calibration of Modern Neural Networks; Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings; Does BERT Learn as Humans Perceive? Understanding Linguistic Styles through Lexica; "Why Should I Trust You?": Explaining the Predictions of Any Classifier; Differential Privacy
Nov 21	Advanced Topic: Instructing and augmenting LLMs (Zae); Project Midterm Office-Hour Due; HW4 due (Nov 21, Tuesday)	Training language models to follow instructions with human feedback; Augmented Language Models: a Survey; Toolformer: Language Models Can Teach Themselves to Use Tools; Internet-augmented language models through few-shot prompting for open-domain question answering
Nov 23	No Class (Thanksgiving)
Nov 28	Advanced Topics
Nov 30	Final Project Poster: Session A	Evaluating modern Vision Language Model Zero-shot Performance on the TQA dataset, Flashcard Generator (Jashwin Acharya, Nick Schnabel, Ahmed Shahkhan); Aligning Large Language Models through Efficient Reinforcement Learning, Language Model Alignment (Siliang Zeng); Decoding Questions: A Comparative Study of NLP Models on QA Datasets, Semanticons (Gayathri Balaji, Joshua Jose, Naga Hemachand Chinta, Vaishnavi Venkatasubramanian); Evaluating Resume Efficacy and Optimal Features Using LLM, NLP Vision (Caleb Wiebolt, Ross Volkov, Ben Davidson); Grammar Checking on Generative Text Project Proposal, TBD (Ryan Oak, Cole Pastor, Max Meyer); Title Generation for Fictional Stories, Title Fight (Jacob Malin, Tony Diep, Oscar Wiestling, Cody Cayetano); AIdentification: Using ChatGPT for Author Attribution, VJK (Vivek Kethineni, Junhan Wu, Kate Pappas, Jackson Kary); Evaluating Large Language Model Performance on Subjective Metrics of Text Generation, Word Wizards (Ashwin Wariar, Dominic Deiman, Mitch Gansemer, Sean Mccarty)
Dec 5	Final Project Poster: Session B	Detecting Sensationalized Headlines in News Articles, Clickbait Analysis (Ishaan Gupta, Ishan Shetty, Max Gieseke); Critical Analysis of Hate Speech Detection Models, Golish Project (Jonathan Paraschou, Sammer Hassan, Christian Golish); Analysis of Large Language Model for Numerical Reasoning, LM Bros (Harry Hong, Jong Inn Park, Jooyong Lee); Between the Lines: Decoding Sarcasm in Headlines, Verbavores (Hamed Hagi, Daniel Swarts, Tianhong Zhang); Can Intermediate Reasoning Chains Rationalize Better in MultiModals?, Pilot (Mani Deep Cherukuri, Shunichi Sawamura, Shashank Sharma, Nicole Vu); An Adversarial Dataset for Fine-Tuning a Language Model, The Adversaries (Sam Penders, Jianing Wen, Benjamin Withey); ChatGPT: Logical Genius or Educated Guesser, Transformative Attentors (Ke-Chin Chen, Rohan Shanbhag, Marco Tabacman); Journal to Wiki Text Style transfer: Simplifying the medical literature for broader comprehension, Word Nerds (Alex Jonas, Annie Lam, Matthew L. Senjem, Luis Silva)
Dec 7	No Class (EMNLP); Project Final Report Due (Dec 8)

Homework Details (50%)

Collaboration is required (maximum of 4 people). Direct questions to the TAs, and please use the shared Slack channels (e.g., #hw1) to share them with others. The use of outside resources (books, research papers, websites, etc.) or collaboration (students, professors, ChatGPT, etc.) must be explicitly acknowledged. The deadline is midnight (11:59 PM) on the due date. Check out the notes to students. Check out the homework descriptions and Canvas links below:

HW0: Building an MLP-based text classifier with PyTorch (0 points, Individual, due: Sep 17, Sunday; originally Sep 15, Friday)
HW1: Finetuning a text classifier using HuggingFace (15 points, Individual, due: Oct 1, Sunday; originally Sep 29, Friday)
HW2: Building ngram language models (LMs) from scratch (15 points, Individual, due: Oct 15, Sunday; originally Oct 13, Friday)
HW3: Generating and evaluating text from pretrained LMs (10 points, Team, due: Oct 29, Sunday; originally Oct 24, Tuesday)
HW4: Prompting with large language models (LLMs) (10 points, Team, due: Nov 21, Tuesday; originally Nov 17, Friday, then Nov 19, Sunday)

Project Details (30%)

Please carefully read the project description first.

Every group member (maximum of 4 people) should submit their report, link to code (or a zipped code), and presentation slides/poster on Canvas before the deadline. Your project will be evaluated on the following criteria (check out the links to Canvas):

Proposal report (5 points, due: Nov 3)
Midterm office hour participation (5 points, due: Nov 21)
Poster presentation (5 points, due: Nov 29)
Final report (15 points, due: Dec 8)

For both proposal and final reports, please use the official ACL style templates (Overleaf or links). Your final project report will be evaluated based on this rubric. Note that your report and slides will be publicly shared. A course project should be one of the following types:

Critical analysis of existing model/dataset (default project),
New research results judged suitable for acceptance to an NLP or ML workshop,
Collection of your own dataset on new problems or adversarial datasets that can fool the existing systems,
An in-depth literature survey on emerging topics,
Interactive demonstration (e.g., Chrome Extension, Flask) or visualization of existing systems,
New open-source repository or dataset with a high impact on the community

You can find some of the previous years' project reports and posters below:

Simulating Everyone's Voice: Exploring ChatGPTs Ability to Simulate Human Annotators, CSCI 5541 S23
Vision & Language-guided Generalized Object Grasping, CSCI 5541 S23
Who is speaking? Discriminating Artificial and Human-Generated Text with A Natural Language Processing Approach, CSCI 5541 S23
Generalizability of FLAN-T5 Model Using Composite Task Prompting, CSCI 5541 S23
Comparing the Effectiveness of Fine-tuning vs. One-Shot Learning on the Kidz Bopification Task, CSCI 5541 S23
Exploring Hallucination in LLMs: A Study of GPT-3.5 and GPT-4 to Enhance Fact-Based Results, CSCI 5541 S23
Generating Controllable Long-dialogue with Coherence, CSCI 5980 F22
Understanding Narrative Transportation in Fantasy Fanfiction, CSCI 8980 S22
Exploring Episodic Memory through Cross-modal representations, CSCI 8980 S22

Reading Details (10%)

For each reading assignment, you will choose one paper from the reading list of the lectures before the deadline and submit a short (1-page) summary to Canvas, including the following information:

Paper title
An overview of the paper with novel contributions and major findings
Weaknesses of the proposed method
Ideas for potential improvements and general thoughts

The deadlines and Canvas links are as follows:

Reading assignment #1 (2.5 points, due: Sep 19)
Reading assignment #2 (2.5 points, due: Oct 24)
Reading assignment #3 (2.5 points, due: Nov 9)
Reading assignment #4 (2.5 points, due: Nov 28)

Class Participation (10%)

Your class participation is thoroughly evaluated. Put your profile picture on Canvas and Slack so we can match you for the final evaluation. The following metrics will be used to grade your participation:

Participation and discussion in class
Discussions in Slack and during Office Hours for both instructors and TAs
Discussion and Q&A during the presentation of the project proposal and poster

Prerequisites

Required: CSCI 2041 Advanced Programming Principles

Recommended: CSCI 5521 Introduction to Machine Learning or any other course that covers fundamental machine learning algorithms.

Furthermore, this course assumes:

Good coding ability, corresponding to at least a third- or fourth-year undergraduate CS major. Assignments will be in Python.
Background in basic probability, linear algebra, and calculus.

Notes to students

Academic Integrity

Assignments and project reports for the class must represent individual effort unless group work is explicitly allowed. Verbal collaboration on your assignments or class projects with your classmates and instructor is acceptable. However, everything you turn in must be your own work, and you must note the names of anyone you collaborated with on each problem and cite resources that you used to learn about the problem. If you have any doubts about whether a particular action may be construed as cheating, ask the instructor for clarification before you do it. Cheating in this course will result in a grade of F for the course, and University policies will be followed.

Students with Disabilities

If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and the Disability Resource Center (DRC).

COVID-19

All students are expected to abide by campus policies regarding COVID-19 including masking and vaccination requirements. This is an in-person class with daily in-person activities, but we may consider a hybrid or online option. If you're feeling sick, stay at home and catch up with the course materials instead of coming to class!

Book

No textbook is required, but the following books are the primary references:

Jurafsky and Martin, Speech and Language Processing, 3rd edition [online]
Jacob Eisenstein. Natural Language Processing

Resources

The course materials are inspired by the slides of Chris Manning at Stanford, David Bamman at UC Berkeley, and Graham Neubig at CMU.