Course Description
Hands-on introduction to computational methods in empirical linguistic analysis and natural language processing.
Topics include language modeling, text classification, linguistic annotation, computational semantics, and machine
translation. Students will implement and apply computational models to real linguistic datasets, and conclude the course
with a final project.
Learning Objectives
- Understanding of key issues in several core areas of research in computational linguistics.
- Ability to implement a set of foundational algorithms for linguistic understanding from scratch.
- Experience with empirical evaluation of computational models, including error analysis.
- Practical familiarity with software packages for language processing in Python.
Schedule
Week | Dates | Content | Materials
1 | 1/4 | Introduction (what is computational linguistics?, course overview and policies, Quest login) | For Reference; Relevant Readings; Slides [1]; Assignment 0: Setup and Intro [due 1/8]
2 | 1/9, 1/11 | Regular Expressions and Edit Distance (regexes, edit distance, dynamic programming) | Before Class; In Class (as time allows); Relevant Readings; Slides [1, 2]; Assignment 1: Edit Distance [due 1/17]
3 | 1/16, 1/18 | Text Normalization and Language Modeling (tokenization, lemmatization and stemming, n-grams, perplexity, maximum likelihood estimation, smoothing) | Before Class; In Class (for reference); Relevant Readings; Slides [1, 2]; Assignment 2: Ngram LM [due 1/24]
4 | 1/23, 1/25 | Classification Foundations (supervised learning, naive bayes, perceptron, generative vs. discriminative models, data splits, evaluation metrics) | Before Class; Relevant Readings; Slides [1]; Assignment 3: Classification [due 2/2]
5 | 1/30, 2/1 | Linguistic Structure and Annotation (NLP libraries, POS tagging, parsing, NER, crowdsourcing, annotator agreement, ethical and practical concerns) | Software; Relevant Readings; Slides [1]; Midterm Self-Evaluation [due 2/9]
6 | 2/6, 2/8 | Computational Semantics 1 (association metrics, word sense disambiguation, semantic resources) | Before Class; Relevant Readings; Slides [1]; Assignment 4: Bias Audit [due 2/16]
7 | 2/13, 2/15 | Computational Semantics 2 (vector space semantics and embedding models, similarity metrics) | Before Class; Relevant Readings; Slides [1]; Assignment 5: Semantic Similarity [due 2/26]
8 | 2/20, 2/22 | Neural Models and Transformers (feed-forward networks, recurrent networks, neural language models, contextual embeddings, pre-training, fine-tuning) | Before Class; Relevant Readings; Slides [1, 2]
9 | 2/27, 2/29 | Research Projects in NLP, Topic Models (unsupervised learning, graphical models basics, latent dirichlet allocation, gibbs sampling, k-means clustering, expectation-maximization algorithms) | Optional Reading/Viewing; Relevant Readings; Slides [1, 2 (Blei, slides 18-41)]
10 | 3/7 | What's Next? (properties of contemporary LLMs, application areas, empirical and ethical concerns) | Relevant Readings; Slides [1]; Final Project or Assignment 6: Topic Modeling [due 3/13]; Final Self-Evaluation [due 3/13]
Materials
All course materials are available for free online. We will refer primarily to these two textbooks:
When readings are assigned before class, I encourage you to skim for basic understanding rather than detail. When doing
the assignments, refer back to the readings for the details.
Some of the course materials are drawn from or inspired by relevant courses at other institutions, which also serve as
excellent resources and points of reference, including:
Structure
Lectures, Readings, and Videos
This course is largely lecture-based, but will incorporate components of a "partially flipped" course, where occasionally we will do some
reading or watching of videos before class, and then spend time in class working on something together. All
lectures will be recorded via Panopto and available in Canvas. In-class attendance is strongly encouraged but not mandatory.
Assignments
The coursework will be structured around assignments to be completed individually, roughly one per week (with a few
breaks), aimed at giving you hands-on practice with the material from that week.
Each assignment has a "core" component which represents the basic material I would like everyone in the class to complete. This portion of the assignment has an autograder on Quest that helps you check the correctness of your work. The teaching staff and I will record whether your work passes the autograder and spot-check this part of the assignment.
The core component of each assignment is only considered complete when it passes the autograder; if you are struggling to achieve this, post on Ed or come to office hours to figure it out.
Each assignment also has an "extensions" component, which provides a list of possible ways the basic assignment could be extended in various directions. You are also always free to invent your own extensions. This component is optional but encouraged. The teaching staff and I will look through this component of your assignments to provide qualitative feedback.
We can be flexible with deadlines if circumstances arise, but
it's important to stay on top of these assignments because we will be moving quickly from one topic to another. I'll discuss this more in class, but please note that we take no responsibility for reviewing assignments turned in more than one week late.
Group Work and Peer Evaluation
A few times throughout the quarter we will include time for various in-class group work activities as well as peer evaluation of
one another's assignments.
Final Project
This course includes a final project component. I'm very open as to what this could be. Basically it's an
opportunity for you to take a self-directed approach to learning more about some topic in this field. Midway through the
quarter I'll ask you for ideas on what you might do, and am always glad to consult on any questions you might have.
In terms of structure, here are a few possibilities:
- Develop and carry out an independent project applying methods from this class
- Use techniques learned in class to advance your existing research
- Carry out a detailed linguistic error analysis on the outputs of an NLP system or systems
- Replicate a paper in the field (Rob will provide some examples of good papers for this)
- Write up a literature survey on a topic in the field
Regardless, the requirement is to present a writeup in ACL 2020 format as well
as your code (if any). At minimum, if your project involves substantial coding I expect a 2-4 page writeup explaining
what you did; if your project is only written (e.g. lit survey or error analysis), I expect 6-8 pages.
Group projects of up to three members are allowed; however, I will expect the effort involved to scale roughly linearly
with the number of group members. If you work in a group, you must include a paragraph at the end of your writeup
explaining who did what.
Alternatively, we will also provide a "default assignment" which can be done individually in lieu of a final project. I encourage you to take advantage of the opportunity to do a self-directed final project for this course, but the default assignment is a reasonable alternative.
Evaluation
Not a fan of grades, to be honest. Research has shown that traditional numerical/letter grades decrease intrinsic
motivation and joy for learning, can undermine performance, and are potentially riddled with implicit bias. For
more reading on this topic:
Therefore, grades go against my central goal for this course: getting you excited about and engaged with the wonderful
world of computational linguistics. I am much more interested in helping you get what you want out of the course
through qualitative evaluation for your benefit. This will largely come in the form of written and in-person feedback on
your work from the teaching team and me, as well as peer evaluation from your classmates.
In the interest of maintaining a healthy working relationship with the registrar, however, I will submit final grades
at the end of the quarter. Below are the forms of evaluation we'll do and how much they'll contribute to what I
end up submitting.
Self-evaluation (50%)
You know at least as well as I do how the course is going for you, so we'll have two self-evaluations, at the
middle and end of the class. In the first week I'll send out a survey in which I ask you to explain your goals for
the course; then for each self-evaluation I'll ask you to reflect on your process and progress towards those goals,
your participation in the course, and ultimately to give yourself a grade and explain your reasoning.
In doing these evaluations, here are the kinds of questions I'll ask you to consider:
- Did you turn in your assignments on time? (Note: for me, turning in assignments late with a good reason is
equivalent to turning them in on time.)
- Did you attend (or watch later) in-class lectures as much as you could? (Note: again, for me, missing class with a
good reason is equivalent to attending.)
- Did you keep up with readings, videos, and in-class activities?
- Did you spend any allotted time in breakout rooms on work for this class?
- Did your assignments run all the way through, and pass any tests?
- Did you reach out for help when you needed it? (Note: doing this is positive!!!)
- Did you collaborate with others to contribute to our classroom community (in breakout rooms, by helping on Ed, or
outside of class)?
- Did you challenge yourself, or did you do the minimum?
I hope to simply take your self-evaluation grades at face value, although if your self-evaluation disagrees
significantly with my perception (in either direction) I may contact you or ask you to meet with me to hash out why our impressions
differ.
Lastly, I want to note two important baselines, so we're all on the same page. The minimum baseline to pass the course is to complete at least 4 out of 6 assignments and turn in the midterm and final self-evaluations. The minimum baseline to achieve strong performance (e.g., an "A") is to have all base assignments fully pass the autograder, submit a substantial final project (or complete the default final assignment and have it pass the autograder), and to challenge yourself appropriately. Note that both of these baselines are necessary rather than sufficient.
Effortful Completion (50%)
The teaching team and I, in turn, will be watching your process, providing structures for learning, and trying to help keep you
on track. We will record whether (and when) your assignments pass the autograders and what work you did on extensions.
At the end of the quarter, I'll give you a holistic grade for effortful completion of the assignments, peer
code review, participation (e.g. on Ed), and your final project. My evaluation is also very likely to be influenced by
your self-evaluation and report of your process and progress.
Inclusion Statement
I am committed to creating an inclusive environment that actively values the diversity of backgrounds,
identities, and experiences of everyone in the classroom. I welcome you to talk with me if you have any feedback or if
there's anything I can do to better support you. If you'd prefer to contact me anonymously you can do so using the
form at the bottom of my faculty webpage.
University-Requested Syllabus Inclusions
Academic Integrity Statement
Students in this course are required to comply with the policies found in the booklet, "Academic Integrity at
Northwestern University: A Basic Guide". All papers submitted for credit in this course must be submitted
electronically unless otherwise instructed by the professor. Your written work may be tested for plagiarized content. For
details regarding academic integrity at Northwestern or to download the guide, visit:
https://www.northwestern.edu/provost/policies/academic-integrity/index.html
Accessibility Statement
Northwestern University is committed to providing the most accessible learning environment possible for students
with disabilities. Should you anticipate or experience disability-related barriers in the academic setting, please
contact AccessibleNU to move forward with the university’s established accommodation process (e:
accessiblenu@northwestern.edu; p: 847-467-5530). If you already have established accommodations with AccessibleNU, please
let me know as soon as possible, preferably within the first two weeks of the term, so we can work together to implement
your disability accommodations. Disability information, including academic accommodations, is confidential under the
Family Educational Rights and Privacy Act.
COVID-19 Classroom Expectations Statement
Students, faculty, and staff must comply with University expectations regarding appropriate classroom behavior,
including those outlined below and in the COVID-19 Code of Conduct. With respect to classroom procedures, this
includes:
- Policies regarding masking and social distancing evolve as the public health situation changes. Students are
responsible for understanding and complying with current masking, testing, Symptom Tracking, and social distancing
requirements.
- In some classes, masking and/or social distancing may be required as a result of an Americans with Disabilities Act
(ADA) accommodation for the instructor or a student in the class even when not generally required on campus. In such
cases, the instructor will notify the class.
- No food is allowed inside classrooms. Drinks are permitted, but please keep your face covering on and use a
straw.
- Faculty may assign seats in some classes to help facilitate contact tracing in the event that a student tests
positive for COVID-19. Students must sit in their assigned seats.
If a student fails to comply with the COVID-19 Code of Conduct or other University expectations related to COVID-19,
the instructor may ask the student to leave the class. The instructor is asked to report the incident to the Office of
Community Standards for additional follow-up.
Exceptions to Class Modality
Class sessions for this course will occur in person. Individual students will not be granted permission to attend
remotely except as the result of an Americans with Disabilities Act (ADA) accommodation as determined by
AccessibleNU.
Maintaining the health of the community remains our priority. If you are experiencing any symptoms of COVID do not
attend class and update your Symptom Tracker application right away to connect with Northwestern’s Case Management Team
for guidance on next steps. Also contact the instructor as soon as possible to arrange to complete coursework.
Students who experience a personal emergency should contact the instructor as soon as possible to arrange to complete
coursework.
Should public health recommendations prevent in-person class from being held on a given day, the instructor or the
university will notify students.
Course Recordings
This class or portions of this class will be recorded by the instructor for educational purposes and will be available to the
class during the quarter. Your instructor will communicate how you can access the recordings.
[RV: On Canvas in the Panopto section.] Portions of the course that contain images, questions or
commentary/discussion by students will be edited out of any recordings that are saved beyond the current term.
Prohibition of Recording of Class Sessions by Students
Unauthorized student recording of classroom or other academic activities (including advising sessions or office hours)
is prohibited. Unauthorized recording is unethical and may also be a violation of University policy and state law.
Students requesting the use of assistive technology as an accommodation should contact AccessibleNU. Unauthorized use of
classroom recordings – including distributing or posting them – is also prohibited. Under the University’s
Copyright Policy, faculty own the copyright to instructional materials – including those resources created specifically
for the purposes of instruction, such as syllabi, lectures and lecture notes, and presentations. Students cannot copy,
reproduce, display, or distribute these materials. Students who engage in unauthorized recording, unauthorized use of a
recording, or unauthorized distribution of instructional materials will be referred to the appropriate University office
for follow-up.
Support for Wellness and Mental Health
Northwestern University is committed to supporting the wellness of our students. Student Affairs has multiple
resources to support student wellness and mental health. If you are feeling distressed or overwhelmed, please reach out
for help. Students can access confidential resources through the Counseling and Psychological Services (CAPS), Religious
and Spiritual Life (RSL) and the Center for Awareness, Response and Education (CARE). Additional information on all of
the resources mentioned above can be found here:
- https://www.northwestern.edu/counseling/
- https://www.northwestern.edu/religious-life/
- https://www.northwestern.edu/care/