CSCI 8980-06 Intro to NLP

Spring 2022, Tuesday and Thursday, 4:00pm to 5:15pm, Keller Hall 2-260

Course Information

Natural Language Processing (NLP) is an interdisciplinary field that is based on theories in linguistics and cognitive/social science. The main focus of NLP is building computational models for applications such as machine translation and dialogue systems that can then interact with real users. Research and development in NLP therefore also includes considering important issues related to real-world AI, such as bias, ethics, controllability, and interpretability. This course will cover a broad range of topics related to NLP, from theories to computational models to data annotation and evaluation, leading to in-depth discussions with students. Students will read papers on those topics, create linguistically annotated data, and implement algorithms on applications they are interested in. Note that I will teach "NLP with Deep Learning" in Fall 2022 for those who are interested in computational aspects of NLP.

There will be a semester-long class project where you collect your own dataset, ensure it is accurate, develop a model using existing computing tools, evaluate the system, and consider its ethical and societal impacts. In every class, I will give a 30-minute lecture, and students will lead a discussion on the reading assignment for the rest of the session. Grades are based on the course project, participation, and assignments.

All class material will be posted on Canvas and on the class page. We will use Canvas for homework submissions and grading, and Slack for discussion and QA. Please use Slack channels rather than personal emails or messages to ask questions. This helps other students who may have the same question. Personal emails may not be answered. If you cannot make it to office hours, please use Slack to make an appointment.

Dongyeop Kang (a.k.a. DK)
Class meets
Tuesday and Thursday, 4:00pm to 5:15pm in Keller Hall 2-260
Office hours
Friday, 3:00pm to 4:30pm in Shepherd 259

Class page


We will cover basic models and representations, applications, and advanced topics.
Please pay attention to due dates and project presentations. You can use DK's office hours for project discussion. 🍬 marks an optional reading.

Date Topic Readings (schedule)
Jan 18
Class Overview [slides]
HW1 out (Paper Presentation)
Jan 20
Text Classification [slides]
🍬Text classifier with NLTK and Scikit-Learn
Jan 25
Topic Modeling [slides]
HW2 out (Paper Replication)
🍬Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77-84.
🍬K-Means Clustering with scikit-learn
Jan 27
Language Models [slides]
Project consultation (Office Hour)
Feb 1
Lexical Semantics [slides]
Project Description out [slides]
  • Ruppenhofer, J., Ellsworth, M., Schwarzer-Petruck, M., Johnson, C. R., & Scheffczyk, J. (2016). FrameNet II: Extended Theory and Practice. International Computer Science Institute, and FrameNet Project
  • Hovy, E., Marcus, M., Palmer, M., Ramshaw, L., & Weischedel, R. (2006, June). OntoNotes: The 90% Solution. In Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers (pp. 57-60).
🍬Word Senses and WordNet
Feb 3
Distributional Semantics [slides]
Project consultation (Office Hour)
🍬Gensim's word2vec tutorial
Feb 8
Contextualized Word Embeddings [slides]
🍬Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., ... & Liu, P. J. (2019). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
🍬Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., ... & Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach
🍬Smith, N. A. (2019). Contextual Word Representations: A Contextual Introduction
🍬 Fine-tuning tutorial on HuggingFace
Feb 10
Discourse [slides]
HW2 due, Feb 10 11:59pm
Project consultation (Office Hour)
Feb 15
Machine Translation [slides]
🍬 Liu, Y., Gu, J., Goyal, N., Li, X., Edunov, S., Ghazvininejad, M., ... & Zettlemoyer, L. (2020). Multilingual Denoising Pre-training for Neural Machine Translation. Transactions of the Association for Computational Linguistics, 8, 726-742.
🍬 Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural Machine Translation by Jointly Learning to Align and Translate
Feb 17
Question Answering and Reasoning [slides]
HW3 out (Error Analysis)
Project Proposal Due, Feb 17 11:59pm
Feb 22
Dialogue [slides]
🍬 Rashkin, H., Smith, E. M., Li, M., & Boureau, Y. L. (2018). Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset
🍬 Lewis, M., Yarats, D., Dauphin, Y. N., Parikh, D., & Batra, D. (2017). Deal or No Deal? End-to-End Learning for Negotiation Dialogues
🍬 Kang, D., Balakrishnan, A., Shah, P., Crook, P., Boureau, Y. L., & Weston, J. (2019). Recommendation as a Communication Game: Self-Supervised Bot-Play for Goal-oriented Dialogue
🍬 Dialogue system development frameworks: ParlAI and ConvKit
Feb 24
Summarization [slides]
🍬 Rush, A. M., Chopra, S., & Weston, J. (2015). A Neural Attention Model for Abstractive Sentence Summarization
Mar 1
No class
Mar 3
Styles [slides]
Mar 8 No class: Spring Break
Mar 10 No class: Spring Break
HW3 Due, Mar 11 11:59pm (extended)
Mar 15
Mid-way Project Presentation (Group A)
Mar 17
Mid-way Project Presentation (Group B)
Mar 22
Generation [slides]
Kathleen McKeown's keynote speech at ACL 2020, Rewriting the Past: Assessing the Field through the Lens of Language Generation
Mar 22
Coreference and IE
🍬 NLP concepts with spaCy
🍬 NeuralCoref 4.0
Mar 24
Dataset Annotation [slides]
Mar 29
Hypothesis testing and Evaluation [slides]
Student discussion
Mar 31
Social NLP
Guest lecture by Anjalie Field (CMU)
[slides] [recording]
Apr 5
Biases and ethics
Guest lecture by Dr. Jieyu Zhao (UMD) [slides] [recording]
Apr 7
Robust and Adversarial NLP
Guest lecture by Eric Wallace (UC Berkeley) [recording]
Apr 12
Guest lecture by Dr. Sumanth Dathathri (Google DeepMind) [recording]
HW4 out (Data annotation)
Apr 14
Data-centric NLP
Guest lecture by Dr. Swabha Swayamdipta (AI2/USC)
[recording will not be available]
Apr 19
Language Grounding to Vision and Robotics
Guest lecture by Dr. Yonatan Bisk (CMU) [recording will not be available]
Apr 21
Final project presentation (A)
Apr 26
Final project presentation (B)
Apr 28
Final project presentation (C)
HW4 Due, May 5, 11:59pm
Project report Due, May 5 11:59pm
Important but non-covered topics:
Human-in-the-loop and Interactive NLP

Grading and Late Policy


  • 40% Homeworks (total four homeworks)
  • 50% Final Project
  • 10% (potential bonus) Class Participation
    • Active participation in class discussions and project presentations

Late policy for deliverables

Each student will be granted 3 late days to use for homeworks over the duration of the semester. After all free late days are used up, the penalty is 25% for each additional late day. However, projects submitted late after all late days have been used will receive no credit.
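As a worked example of the homework late-day arithmetic, here is a minimal sketch. The function name `homework_penalty` is hypothetical, and the sketch assumes the 25% is deducted from full credit and capped at 100%; the policy above does not spell this out, so please ask DK if your case is unclear. It covers homeworks only, not the project.

```python
def homework_penalty(days_late, late_days_left=3, per_day=0.25):
    """Return (fraction of credit deducted, free late days remaining).

    Assumption: the penalty is a fraction of full credit, capped at 1.0.
    """
    free_used = min(days_late, late_days_left)   # free late days absorb lateness first
    penalized_days = days_late - free_used       # remaining days are penalized
    penalty = min(1.0, per_day * penalized_days)
    return penalty, late_days_left - free_used


# Two days late with all three free days left: no penalty, one free day remains.
print(homework_penalty(2))
# Five days late: three free days are used, two days are penalized.
print(homework_penalty(5))
```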

Homework Details (40%)

HW1: Paper Presentation (10%)

Please check the list of papers in the Readings tab in the schedule and place your name on two papers on this sheet. Presenters are limited to two per class, so do not assign yourself if there are already two presenters, except for Jan 27 (the first two papers on Jan 27 will be presented on Jan 25).

You are responsible for presenting the papers in class and leading the discussion. During every class, two students present for 20 minutes each, including QA and discussion. Give an overview of the paper (10 minutes), then raise three points for discussion (10 minutes), such as limitations of the proposed method, future directions, links to other similar papers, etc.

Please upload your slides here before the class. It is possible to borrow slides from authors, but you must have a deep understanding of the work and provide potential discussion points. The filename of your slides should be 0120_{Paper Title}_{Your Name}_{first,second}.{pptx,pdf}. Sometimes more than two comparative or incremental papers are assigned in one bullet point; in that case, you must compare them, and you will receive bonus points (2%). In some classes, like Jan 25, there are no specific papers to read, so we discuss papers from the next lecture's topics.

HW2: Paper Replication (10%)

Due: Feb 10 11:59pm

You will get a taste of NLP leaderboard culture in this homework. You need to choose one of the following NLP tasks and replicate/reimplement the model. I strongly recommend that you use existing code written by authors who appear on the Papers-with-Code leaderboard, or use some basic Transformer models implemented in HuggingFace libraries. Do not spend too much time replicating the code. Instead, run existing code on your target dataset, ensure you use the same evaluation metrics as the paper, and compare your results to the paper's.
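To make "the same evaluation metrics as the paper" concrete, the following is a minimal sketch of two common classification metrics, accuracy and macro-averaged F1, written with the standard library only. The function names and the toy labels are illustrative assumptions; your paper may report a different metric, in which case use the one it defines.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over labels seen in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1   # predicted label p, but it was wrong
            fn[g] += 1   # missed the gold label g
    f1_scores = []
    for label in sorted(set(gold) | set(pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels for illustration only (not real results).
gold = ["pos", "neg", "pos", "neg"]
pred = ["pos", "neg", "neg", "neg"]
print(f"accuracy={accuracy(gold, pred):.3f}  macro-F1={macro_f1(gold, pred):.3f}")
```

When you compare against the paper, report the metric on the same data split the paper used, since a number computed on a different split is not comparable.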

Note that failing to correctly cite any tool or paper you consulted will be treated as cheating. This homework will serve as the foundation for homeworks 3 and 4, and possibly your project.

Choose one of the following models and datasets. If you would like to choose other tasks and datasets, please talk to DK by Jan 28.

Tasks Datasets
Sentiment classification
Natural Language Inference
Commonsense Reasoning
Dialogue, Summarization, and Style Transfer
  • GYAFC (leaderboard, paper)
QA and Visual QA
  • VQA 2.0 (leaderboard, paper)
Semantic Evaluation

    Please upload your code and report to Canvas by Feb 10 11:59pm.

    • Code: a zipped file containing your training/inference scripts.
    • Report: 2-3 pages, including model description with references, link to the original code you referred to, evaluation metrics, performance comparison with other models in the leaderboard, training/inference time, sweeping of hyperparameters (e.g., learning rate, dropout rate), and other details of the experiment.


    HW3: Error Analysis (10%)

    Due: Mar 11 11:59pm

    You will now analyze the errors of the model you implemented in the previous homework. HW3 consists of four steps, each with a bonus point, so please be creative and try other analysis techniques.

    • Step #1: collect and featurize errors
    • First, store the samples that your HW2 system predicted incorrectly (no more than 500) in a Google spreadsheet. For each sample, store the following information in separate columns:
      • Input text
      • Ground-truth label
      • Predicted label with confidence score (i.e., softmax output from your classifier to the ground-truth label)
      • (bonus points) other metadata or linguistic features extracted with spaCy or other tools
        E.g., length of the sentence, POS tags, named entities, sentiment score, etc.
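Step #1 can be sketched as follows. This is a minimal illustration, assuming you already have, for each evaluation sample, the input text, the gold label index, and the classifier's raw logits; the function and column names are hypothetical. It keeps only misclassified samples, records the softmax probability assigned to the ground-truth label, and adds a simple token-count feature. The resulting CSV text can be imported into a spreadsheet.

```python
import csv
import io
import math

FIELDS = ["input_text", "gold_label", "predicted_label", "gold_confidence", "num_tokens"]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def collect_errors(examples, label_names, max_rows=500):
    """examples: iterable of (text, gold_index, logits). Returns error rows."""
    rows = []
    for text, gold, logits in examples:
        probs = softmax(logits)
        pred = max(range(len(probs)), key=probs.__getitem__)
        if pred == gold:
            continue  # keep only incorrectly predicted samples
        rows.append({
            "input_text": text,
            "gold_label": label_names[gold],
            "predicted_label": label_names[pred],
            "gold_confidence": round(probs[gold], 4),  # probability of the gold label
            "num_tokens": len(text.split()),           # whitespace tokens, a crude length feature
        })
        if len(rows) >= max_rows:
            break
    return rows


def to_csv(rows):
    """Serialize error rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Toy inputs for illustration only.
examples = [("great movie", 1, [0.1, 2.0]), ("not bad at all", 1, [1.5, 0.5])]
print(to_csv(collect_errors(examples, ["negative", "positive"])))
```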

    • Step #2: label error types and fixes
    • Go through each row and manually label it with the following categories:
      • Types/Causes of errors, e.g., incorrect annotation and over-generalization
      • Potential solutions to fix the cause, e.g., more training samples and some rules
      Rank your annotations by frequency and show two tables: the distribution of error types and the distribution of solutions.
      (Bonus point) Be creative in thinking of new error types and potential solutions.
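The frequency ranking in Step #2 can be sketched with the standard library. The function name `distribution` and the toy annotations are hypothetical; run the same function once on your error-type column and once on your solution column to get the two tables.

```python
from collections import Counter


def distribution(values):
    """Return (label, count, share) tuples sorted by descending count."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(name, n, n / total) for name, n in counts.most_common()]


# Toy annotations for illustration only.
error_types = ["incorrect annotation", "over-generalization", "over-generalization", "negation"]

for name, count, share in distribution(error_types):
    print(f"{name:25s} {count:3d} {share:6.1%}")
```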

    • Step #3: visualize errors
    • Visualize the errors and the correctly predicted samples in a 2-dimensional semantic space and explore how they are distributed overall. Semantic space:
      • Take vector representations of correct and incorrect samples from the classifier’s output (HuggingFace's model output class)
      • Project them into reduced dimensions using PCA or t-SNE (paper, code) (e.g., 768 dimensions -> 2 dimensions)
      (Bonus point) Dataset map space:
      • Project the samples in dataset map space (paper1 and paper2),
      • where the x-axis is the confidence score for the ground-truth label and the y-axis is the variance of the classifier’s prediction over the training epochs
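For the dataset map bonus, the coordinates can be sketched as below. This assumes you logged, for every sample, the probability your classifier assigned to the ground-truth label at the end of each training epoch; the function name and input layout are hypothetical. Following the description above, x is the mean confidence and y is the variance across epochs. If the papers you follow define variability as a standard deviation, swapping `pvariance` for `pstdev` is a one-line change.

```python
from statistics import fmean, pvariance


def dataset_map_coords(gold_probs_by_epoch):
    """Map sample id -> (mean confidence, variance over epochs).

    gold_probs_by_epoch: {sample_id: [p(gold) at epoch 1, epoch 2, ...]}
    """
    return {
        sample_id: (fmean(probs), pvariance(probs))
        for sample_id, probs in gold_probs_by_epoch.items()
    }


# Toy probabilities for illustration only.
logged = {
    "easy_sample": [0.9, 0.9, 0.9],   # consistently confident
    "ambiguous_sample": [0.2, 0.8],   # prediction fluctuates across epochs
}
for sample_id, (x, y) in dataset_map_coords(logged).items():
    print(f"{sample_id}: confidence={x:.2f}, variance={y:.3f}")
```

The resulting (x, y) pairs can be passed to any scatter-plot tool, with correct and incorrect samples drawn in different colors.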

    • Step #4: analyze
    • Summarize the important findings from the previous steps of analysis. Please discuss the limitations of the model used in your HW2 and potential future directions to address the errors.
      (Bonus point) Try a different out-of-distribution dataset on the same task
      • e.g., apply a sentiment classifier trained on SST2 to movie reviews
      • e.g., apply an entailment classifier trained on MNLI to medical text

    Please upload your annotated spreadsheet and report to Canvas by Mar 11 11:59pm.

    • Spreadsheet: maximum 500 error samples with annotated errors, extracted features, and labeled types/fixes.
    • Report: maximum 4 pages, including distribution of features/types/fixes, visualizations, and in-depth analysis and discussion.


    HW4: Data Annotation (10%)

    Due: May 5, 11:59pm
    In this assignment, you will learn how data annotation works in NLP research and how important it is in NLP model development. You will form a group of three or four people, collect 300 adversarial samples on a target task that can fool the system you built in homework 2, have each team member annotate them, calculate inter-annotator agreement (IAA), and write a short report on your experience.
    Please read this description carefully.
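As an illustration of the IAA calculation, here is a minimal sketch of Fleiss' kappa, one common agreement measure when three or more annotators each label every sample. Choosing Fleiss' kappa is an assumption of this sketch; the assignment description may ask for a different measure, so check it first. The function name and toy labels are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """ratings: list of items, each a list of labels, one per annotator.

    Every item must have the same number of annotators (at least two).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of annotations")

    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of annotator pairs that agree on this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )

    observed = agreement_sum / n_items
    total = n_items * n_raters
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        # Only one label was ever used; kappa is undefined. Returning 1.0 is
        # a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy annotations from three annotators, for illustration only.
toy = [["A", "A", "A"], ["B", "B", "B"], ["A", "A", "B"]]
print(f"Fleiss' kappa = {fleiss_kappa(toy):.3f}")
```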

    Project Details (50%)

    The class project is meant for a group of students (2-3) to experience the full pipeline of NLP research, from data annotation to model development to experiments and error analysis to visualization to discussion of limitations and ethical issues. Please read the project description slides.

    A course project should be one of the following types:

    • New research results judged suitable for acceptance to a top NLP or ML conference like ACL/EMNLP/NeurIPS/ICLR,
    • Evaluation and critical analysis of existing work on a new dataset,
    • An in-depth literature survey, or
    • New open-source repository or dataset with a high impact on the community

    Your project will be evaluated on the following criteria:

    • Proposal and literature review (10%), Due: Feb 17, 11:59pm
      • Maximum 3 pages
    • Midterm presentation (10%), Mar 15 and 17
      • 10-min presentation and 5-min QA
      • Check out the presentation schedule
      • Upload your slides here before the class
      • Expected content to be presented:
        • Specific feedback you would like to get from the audience
        • Motivation
        • Problem definition
        • Novel contribution compared to prior work
        • Proposed methods
        • Initial results
        • Plan for the second half of the semester
    • Final presentation (10%), April 21, 26, and 28
      • 15-min presentation and 10-min QA
      • Check out the final presentation schedule
      • Upload your slides here before your presentation
      • Expected content to be presented:
        • Motivation, problem definition, and novel contribution compared to prior work
        • Proposed methods with "motivational examples"
        • Experimental setups and final results
        • Discussion on limitations, ethical issues, etc
        • Conclusion and future directions
    • Final report and code (20%), Due: May 5, 11:59pm
      • Maximum 8 pages
      • Rubric for evaluation

    Every group member should submit their report, link to code, and presentation slides on Canvas before the deadline. For both proposal and final reports, please use official ACL style templates (Overleaf or links). Note that your report and slides would be publicly shared on this page.


    CSCI 5521 Introduction to Machine Learning or any other course that covers fundamental machine learning algorithms.

    Furthermore, this course assumes:

    • Good coding ability, corresponding to at least a third or fourth-year undergraduate CS major. Assignments will be in Python.
    • Background in basic probability, linear algebra, and calculus.

    Notes to students

    Academic Integrity

    Assignments and project reports for the class must represent individual effort unless group work is explicitly allowed. Verbal collaboration on your assignments or class projects with your classmates and instructor is acceptable. However, everything you turn in must be your own work, and you must note the names of anyone you collaborated with on each problem and cite the resources that you used to learn about the problem. If you have any doubts about whether a particular action may be construed as cheating, ask the instructor for clarification before you do it. Cheating in this course will result in a grade of F for the course, and University policies will be followed.

    Students with Disabilities

    If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and the Disability Resources Center (DRC).


    All students are expected to abide by campus policies regarding COVID-19 including masking and vaccination requirements. This is an in-person class with daily in-person activities, but we may consider a hybrid or online option. If you're feeling sick, stay at home and catch up with the course materials instead of coming to class!


    No textbook is required, but the following books are the primary references:
    • Jurafsky and Martin, Speech and Language Processing, 3rd edition [online]
    • Jacob Eisenstein. Natural Language Processing
    The course materials are inspired by the slides of Dan Jurafsky at Stanford, David Bamman at UC Berkeley, and Noah Smith at University of Washington.