Course Description

Hands-on introduction to computational methods in empirical linguistic analysis and natural language processing. Topics include language modeling, text classification, linguistic annotation, computational semantics, and machine translation. Students will implement and apply computational models to real linguistic datasets, and conclude the course with a final project.


Learning Objectives

  • Understanding of key issues in several core areas of research in computational linguistics.
  • Ability to implement a set of foundational algorithms for linguistic understanding from scratch.
  • Experience with empirical evaluation of computational models, including error analysis.
  • Practical familiarity with software packages for language processing in Python.


Schedule

Week Dates Content Materials
1 3/30 Introduction
what is computational linguistics?, course overview and policies, Quest login

For Reference

Relevant Readings

Slides [1]

Assignment 0: Setup and Intro [due 4/3]

2 4/4
4/6
Regular Expressions and Edit Distance

regexes, edit distance, dynamic programming

Before Class

In Class (as time allows)

Relevant Readings

Slides [1, 2]

Assignment 1: Edit Distance [due 4/12]

3 4/11
4/13
Text Normalization and Language Modeling

tokenization, lemmatization and stemming, n-grams, perplexity, maximum likelihood estimation, smoothing

Before Class

In Class (for reference)

Relevant Readings

Slides [1, 2]

Assignment 2: Ngram LM [due 4/19]

4 4/18
4/20
Classification Foundations

supervised learning, naive bayes, perceptron, generative vs. discriminative models, data splits, evaluation metrics

Before Class

Relevant Readings

Slides [1]

Assignment 3: Classification [due 4/28]

5 4/25
4/27
Linguistic Structure and Annotation

NLP libraries, POS tagging, parsing, NER, crowdsourcing, annotator agreement, ethical and practical concerns

Software

Relevant Readings

Slides [1]

Midterm Self-Evaluation [due 5/3]

6 5/2
5/4
Computational Semantics 1

association metrics, word sense disambiguation, semantic resources

Before Class

Relevant Readings

Slides [1]

Assignment 4: Bias Audit [due 5/12]

7 5/9
5/11
Computational Semantics 2

vector space semantics and embedding models, similarity metrics

Before Class

Relevant Readings

Slides [1]

Assignment 5: Semantic Similarity [due 5/22]

8 5/16
5/18
Neural Models

feed-forward networks, recurrent networks, neural language models

Before Class

Relevant Readings

Slides [1]

9 5/23
5/25
Transformers

BERT, contextual embeddings, pre-training, fine-tuning, HuggingFace

For Reference

Slides [1, 2]

10(a) 5/30 Topic Models

unsupervised learning, graphical models basics, latent dirichlet allocation, gibbs sampling, k-means clustering, expectation-maximization algorithms

Optional Reading/Viewing

Relevant Readings

Slides [1 (Blei, slides 18-41)]

10(b) 6/1 What's Next?

properties of contemporary LLMs, application areas, empirical and ethical concerns

Relevant Readings

Slides [1]

Final Project or Assignment 6: Topic Modeling [due 6/5]

Final Self-Evaluation [due 6/5]


Materials

All course materials are available for free online. We will refer primarily to these two textbooks:

When readings are assigned, I encourage you to skim them before class for basic understanding rather than detail. When doing the assignments, refer back to them for the details.

Some of the course materials are drawn from or inspired by relevant courses at other institutions, which also serve as excellent resources and points of reference, including:


Structure

Lectures, Readings, and Videos

This course is largely lecture-based, but will incorporate components of a "partially flipped" course, where occasionally we will do some reading or watching of videos before class, and then spend time in class working on something together. All lectures will be recorded via Panopto and available in Canvas. In-class attendance is strongly encouraged but not mandatory.


Assignments

The coursework will be structured around assignments to be completed individually, roughly one per week (with a few breaks), aimed at giving you hands-on practice with the material from that week.

Each assignment has a "core" component which represents the basic material I would like everyone in the class to complete. This portion of the assignment has an autograder on Quest that helps you check the correctness of your work. The teaching staff and I will record whether your work passes the autograder and spot-check this part of the assignment.

The core component of each assignment is only considered complete when it passes the autograder; if you are struggling to achieve this, post on Ed or come to office hours to figure it out.

Each assignment also has an "extensions" component, which provides a list of possible ways the basic assignment could be extended in various directions. You are also always free to invent your own extensions. This component is optional but encouraged. The teaching staff and I will look through this component of your assignments to provide qualitative feedback.

We can be flexible with deadlines if circumstances arise, but it's important to stay on top of these assignments because we will be moving quickly from one topic to another. I'll discuss this more in class, but please note that we take no responsibility for reviewing assignments turned in more than one week late.


Group Work and Peer Evaluation

A few times throughout the quarter we will include time for various in-class group work activities as well as peer evaluation of one another's assignments.


Final Project

This course includes a final project component. I'm very open as to what this could be. Basically it's an opportunity for you to take a self-directed approach to learning more about some topic in this field. Midway through the quarter I'll ask you for ideas on what you might do, and I am always glad to consult on any questions you might have. In terms of structure, here are a few possibilities:

  • Develop and carry out an independent project applying methods from this class
  • Use techniques learned in class to advance your existing research
  • Carry out a detailed linguistic error analysis on the outputs of an NLP system or systems
  • Replicate a paper in the field (Rob will provide some examples of good papers for this)
  • Write up a literature survey on a topic in the field

Regardless, the requirement is to present a writeup in ACL 2020 format as well as your code (if any). At minimum, if your project involves substantial coding I expect a 2-4 page writeup explaining what you did; if your project is only written (e.g. lit survey or error analysis), I expect 6-8 pages.

Group projects of up to three members are allowed; however, I will expect the effort involved to scale roughly linearly with the number of group members. If you work in a group, you must include a paragraph at the end of your writeup explaining who did what.

Alternatively, we will also provide a "default assignment" which can be done individually in lieu of a final project. I encourage you to take advantage of the opportunity to do a self-directed final project for this course, but the default assignment is a reasonable alternative.


Evaluation

Not a fan of grades, to be honest. Research has shown that traditional numerical/letter grades decrease intrinsic motivation and joy for learning, can undermine performance, and are potentially riddled with implicit bias. For more reading on this topic:

Therefore, grades go against my central goal for this course: getting you excited about and engaged with the wonderful world of computational linguistics. I am much more interested in helping you get what you want out of the course through qualitative evaluation for your benefit. This will largely come in the form of written and in-person feedback on your work from the teaching team and me, as well as peer evaluation from your classmates.

In the interest of maintaining a healthy working relationship with the registrar, however, I will submit final grades at the end of the quarter. Below are the forms of evaluation we'll do and how much they'll contribute to what I end up submitting.


Self-evaluation (50%)

You know at least as well as I do how the course is going for you, so we'll have two self-evaluations, at the middle and end of the class. In the first week I'll send out a survey in which I ask you to explain your goals for the course; then for each self-evaluation I'll ask you to reflect on your process and progress towards those goals, your participation in the course, and ultimately to give yourself a grade and explain your reasoning.

In doing these evaluations, here are the kinds of questions I'll ask you to consider:

  • Did you turn in your assignments on time? (Note: for me, turning in assignments late with a good reason is equivalent to turning them in on time.)
  • Did you attend (or watch later) in-class lectures as much as you could? (Note: again, for me, missing class with a good reason is equivalent to attending.)
  • Did you keep up with readings, videos, and in-class activities?
  • Did you spend any allotted time in breakout rooms on work for this class?
  • Did your assignments run all the way through, and pass any tests?
  • Did you reach out for help when you needed it? (Note: doing this is positive!!!)
  • Did you collaborate with others to contribute to our classroom community (in breakout rooms, by helping on Ed, or outside of class)?
  • Did you challenge yourself, or did you do the minimum?

I hope to simply take your self-evaluation grades at face value, although if your self-evaluation disagrees significantly with my perception (in either direction) I may contact you or ask you to meet with me to hash out why our impressions differ.

Lastly, I want to note two important baselines, so we're all on the same page. The minimum baseline to pass the course is to complete at least 4 out of 6 assignments and turn in the midterm and final self-evaluations. The minimum baseline to achieve strong performance (e.g., an "A") is to have all base assignments fully pass the autograder, submit a substantial final project (or complete the default final assignment and have it pass the autograder), and to challenge yourself appropriately. Note that both of these baselines are necessary rather than sufficient.


Effortful Completion (50%)

The teaching team and I, in turn, will be watching your process, providing structures for learning, and trying to help keep you on track. We will record whether (and when) your assignments pass the autograders and what work you did on extensions.

At the end of the quarter, I'll give you a holistic grade for effortful completion of the assignments, peer code review, participation (e.g. on Ed), and your final project. My evaluation is also very liable to be influenced by your self-evaluation and report of your process and progress.


Inclusion Statement

I am committed to creating an inclusive environment that actively values the diversity of backgrounds, identities, and experiences of everyone in the classroom. I welcome you to talk with me if you have any feedback or if there's anything I can do to better support you. If you'd prefer to contact me anonymously you can do so using the form at the bottom of my faculty webpage.

University-Requested Syllabus Inclusions

Academic Integrity Statement

Students in this course are required to comply with the policies found in the booklet, "Academic Integrity at Northwestern University: A Basic Guide". All papers submitted for credit in this course must be submitted electronically unless otherwise instructed by the professor. Your written work may be tested for plagiarized content. For details regarding academic integrity at Northwestern or to download the guide, visit: https://www.northwestern.edu/provost/policies/academic-integrity/index.html


Accessibility Statement

Northwestern University is committed to providing the most accessible learning environment possible for students with disabilities. Should you anticipate or experience disability-related barriers in the academic setting, please contact AccessibleNU to move forward with the university’s established accommodation process (e: accessiblenu@northwestern.edu; p: 847-467-5530). If you already have established accommodations with AccessibleNU, please let me know as soon as possible, preferably within the first two weeks of the term, so we can work together to implement your disability accommodations. Disability information, including academic accommodations, is confidential under the Family Educational Rights and Privacy Act.


COVID-19 Classroom Expectations Statement

Students, faculty, and staff must comply with University expectations regarding appropriate classroom behavior, including those outlined below and in the COVID-19 Code of Conduct. With respect to classroom procedures, this includes:

  • Policies regarding masking and social distancing evolve as the public health situation changes. Students are responsible for understanding and complying with current masking, testing, Symptom Tracking, and social distancing requirements.
  • In some classes, masking and/or social distancing may be required as a result of an Americans with Disabilities Act (ADA) accommodation for the instructor or a student in the class even when not generally required on campus. In such cases, the instructor will notify the class.
  • No food is allowed inside classrooms. Drinks are permitted, but please keep your face covering on and use a straw.
  • Faculty may assign seats in some classes to help facilitate contact tracing in the event that a student tests positive for COVID-19. Students must sit in their assigned seats.

If a student fails to comply with the COVID-19 Code of Conduct or other University expectations related to COVID-19, the instructor may ask the student to leave the class. The instructor is asked to report the incident to the Office of Community Standards for additional follow-up.


Exceptions to Class Modality

Class sessions for this course will occur in person. Individual students will not be granted permission to attend remotely except as the result of an Americans with Disabilities Act (ADA) accommodation as determined by AccessibleNU.

Maintaining the health of the community remains our priority. If you are experiencing any symptoms of COVID, do not attend class and update your Symptom Tracker application right away to connect with Northwestern’s Case Management Team for guidance on next steps. Also contact the instructor as soon as possible to arrange to complete coursework.

Students who experience a personal emergency should contact the instructor as soon as possible to arrange to complete coursework.

Should public health recommendations prevent in-person class from being held on a given day, the instructor or the university will notify students.


Course Recordings

This class or portions of this class will be recorded by the instructor for educational purposes and will be available to the class during the quarter. Your instructor will communicate how you can access the recordings. [RV: On Canvas in the Panopto section.] Portions of the course that contain images, questions or commentary/discussion by students will be edited out of any recordings that are saved beyond the current term.


Prohibition of Recording of Class Sessions by Students

Unauthorized student recording of classroom or other academic activities (including advising sessions or office hours) is prohibited. Unauthorized recording is unethical and may also be a violation of University policy and state law. Students requesting the use of assistive technology as an accommodation should contact AccessibleNU. Unauthorized use of classroom recordings – including distributing or posting them – is also prohibited. Under the University’s Copyright Policy, faculty own the copyright to instructional materials – including those resources created specifically for the purposes of instruction, such as syllabi, lectures and lecture notes, and presentations. Students cannot copy, reproduce, display, or distribute these materials. Students who engage in unauthorized recording, unauthorized use of a recording, or unauthorized distribution of instructional materials will be referred to the appropriate University office for follow-up.


Support for Wellness and Mental Health

Northwestern University is committed to supporting the wellness of our students. Student Affairs has multiple resources to support student wellness and mental health. If you are feeling distressed or overwhelmed, please reach out for help. Students can access confidential resources through the Counseling and Psychological Services (CAPS), Religious and Spiritual Life (RSL) and the Center for Awareness, Response and Education (CARE). Additional information on all of the resources mentioned above can be found here:

  • https://www.northwestern.edu/counseling/
  • https://www.northwestern.edu/religious-life/
  • https://www.northwestern.edu/care/