Natural Language Processing
Quick Facts and Resources
- Name: Natural Language Processing
- Listed As: CS-7650
- Credit Hours: 3
- Available to: CS students
- Description: Topics include lexical analysis, parsing, interpretation of sentences, semantic representation, organization of knowledge, inference mechanisms. Newer approaches combining statistical language processing and information retrieval techniques.
- Syllabus: Syllabus not found.
- Textbooks: No textbooks found.
Fall 2024 · posted 2024-12-09
Likely grade: A with a near-100% score. This course is super easy if you have taken ML in the past. It's on par with ML4T in terms of effort, but you get to learn the cutting edge in NLP. I took it over DL because I wanted a more applied course that taught me transformers. This class is basically DL applied to natural language processing, with the latest techniques and system architectures used in industry.
The homeworks don't require that much effort week to week, with the exception of HW5, which definitely takes 20 hours to finish. It doesn't help that the recitation was super confusing. The quizzes require you to go perhaps one level deeper to dig into a root cause, or one level up in terms of inference. They don't require you to make several logical leaps to come to the answer, and some are common sense.
The midterm was a paper review: they give you 2 papers to pick from and ask you to answer questions about the one you picked. The questions really test how deeply you understood the paper. I wrote very comprehensive answers and made a 12/12; the average was maybe a 10/12. It was easy to make a good score but really hard to make a perfect score. The final was a comprehensive exam over the entire course content, with emphasis on the second half, worth 12 points again. Both exams were open notes and open for a week. The midterm allowed multiple submissions; the final did not, but they give you a downloadable Word doc of it so you can work offline.
The lectures: Riedl's lectures are fantastic; he distills information in a way that even a toddler could understand. But the lectures from Meta that constitute the second half of the course are pure trash. There are 4 Meta researchers who teach most of the second half, and out of those only 1 or 2 are any good. I used a previous student's lecture notes (https://lowyx.com/posts/gt-nlp/) to study most of the second half, and parts of the first half as well.
Rating: 4 / 5 · Difficulty: 1 / 5 · Workload: 10 hours / week
Fall 2024 · posted 2024-12-06
This was my last course in the program, and it was the perfect finale in terms of format, level of commitment, and topic interest. I really wanted something fun and a little challenging as my last course, nothing that was complete fluff or that would make me pull my hair out (what's left of it).
On top of that, I preferably wanted a course with no traditional exam format (proctored, closed book) and of course, no dreaded group work. This one fit the bill perfectly.
Other reviews have touched on the makeup and mechanics of the course, so I won't reiterate those here. I will say that the midterm this semester was pretty awesome; it reminded me of the first part of EdTech, where you read a paper and answer a few questions. It was good in that it wasn't overly challenging, but it also gave us a good introduction to the topics covered in the papers offered.
I will echo some of the other sentiments about the homework projects. Projects 1 through 4 are pretty low-lift, filling in template code, and they focus more on the machine-learning aspect of setting up the appropriate classes for the model than on infrastructure work like data preprocessing. Time commitment for this part of the course was really low, just a few hours a week (excluding lectures, if you watch them).
These assignments ostensibly prepare you for the final project, so you can pretty easily implement the actual model, but the bulk of the challenge and the work on the final project (for me, anyway) was in doing all the data processing, constructing datasets, training/testing, etc. that you didn't need to do before. So if you're not too familiar with that kind of stuff in Python, give yourself plenty of time to figure it out before the due date. It was kind of a neat format for this project, too, because you didn't need to attain a certain accuracy or a certain number of correct outputs; you just had to construct the model, show that it was learning, and then answer questions about what your model did and how/why it did it.
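To make that concrete for anyone new to this part of PyTorch, here is a minimal sketch of the kind of dataset construction and batching the reviewer means. Everything in it (class name, shapes, the toy token ids) is an illustrative assumption, not the actual assignment code.

```python
# Minimal sketch of custom dataset plumbing: a PyTorch Dataset plus a
# DataLoader. All names and data here are illustrative placeholders.
import torch
from torch.utils.data import Dataset, DataLoader

class TextClassificationDataset(Dataset):
    """Wraps pre-tokenized examples (lists of token ids) and integer labels."""
    def __init__(self, token_ids, labels, pad_id=0, max_len=64):
        # Truncate/pad every example to a fixed length so batches stack cleanly.
        self.examples = [
            (ids[:max_len] + [pad_id] * max(0, max_len - len(ids)), label)
            for ids, label in zip(token_ids, labels)
        ]

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        ids, label = self.examples[idx]
        return torch.tensor(ids), torch.tensor(label)

# Toy data: two "sentences" already mapped to token ids.
train_ds = TextClassificationDataset([[5, 12, 7], [9, 3]], [1, 0])
train_loader = DataLoader(train_ds, batch_size=2, shuffle=True)

for batch_ids, batch_labels in train_loader:
    print(batch_ids.shape, batch_labels.shape)  # torch.Size([2, 64]) torch.Size([2])
```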
The only project recitation I watched was the final one and it was pretty helpful to clear up some of the fuzzy parts of the project for me.
The quizzes at the end of the modules are open-everything, so they were pretty easy. I think I missed one question all semester, and that was because it was weirdly worded and I misinterpreted the answer choices.
The final exam is essay-style, and I found it really interesting, fun, and a good learning experience, which is not how I'd describe closed-book proctored exams, so this was great for me. Not sure how I did on it yet, but I am not too worried.
So all in all, this was a great low-stress class for my last semester. The material was interesting enough to be fun to work on, with lots of opportunities to learn cool stuff for those of us who enjoy programming and NLP but aren't professionals.
If you're proficient in this topic, then it's probably a waste of time to take it unless you just want the credits. Pairing it with another class wouldn't be too big of a deal if you're relatively familiar with Python beforehand.
Oh, and you have a choice of whether to use a local or Colab environment. I used local for everything but the final, when a GPU was helpful. Adapting the notebooks for a local environment is not a big deal if you are familiar with the process. I bought 100 Colab compute credits and only used about 40 of them for the final project, so 100 would probably get you through the entire semester on Colab, but I hate their IDE so I stuck with local as much as possible.
Rating: 5 / 5 · Difficulty: 3 / 5 · Workload: 10 hours / week
Summer 2024 · posted 2024-08-08
This is an overall great, newly developed course. It's great to get away from the hideous old videos of most of the other OMSCS courses.
It's still small in numbers and hard to get into; however, it will grow soon, with more potential TAs graduating every semester.
The material is extensive, very extensive to be honest. In fact, it's too extensive, and you barely get tested on any of it. Even if only about half the videos were taught, you would still get a good education, and it would not even affect the assignments/exams.
The quizzes are mostly hard even though they are usually only 2-3 questions; they are tricky, so take your time. You only get 2 attempts at each.
The midterm and final exams are take-home, open-everything (except for your favourite GPT). They are written questions you need to answer. I found them OK overall, but they take time to complete.
The 6 homework assignments are the real meat here, taking up 70% of the grade. They are all coding, with some written questions in the final assignment, which counts for 20% of the overall grade. The first 5 homeworks are autograded on Gradescope, but some take quite a while to run.
My suggestion is to use Google Colab with a paid subscription; this is the most efficient way of working through these, since the starter code is written with that platform in mind. You can work in your own environment as well; it works OK for some assignments but is a pain to adapt for others.
The homework assignments do build up, from only a few lines of code in the first assignment to quite a sizeable amount in the last one. Be prepared for this. Another tip is to start HW5 (the last one) early. I left mine to the last minute along with the exam, since the deadline was on a Wednesday instead of the usual Sunday. I rushed through the final assignment and the exam, missing an A by a hairline!
A final note on the videos: the first half of the course, with videos made by the professor, is absolutely the best. The second half, where the Meta AI folks do the videos, is absolutely horrible: dry and for the most part pointless. I would not suggest you skip them, but prepare to be bored to death. I really hope these are redone in the future, as this is one of the best classes in OMSCS.
Rating: 4 / 5 · Difficulty: 3 / 5 · Workload: 12 hours / week
Spring 2024 · posted 2024-07-21
Heavy math, lots of coding. If you are new to deep learning or AI, the learning curve will be high. It works alongside other courses too. Don't trust some random SWE/MLE with 5+ years of experience who says it's an easy class.
NLP is now worth taking for all specializations because of the GPT hype. I don't trust cynical people who say it's just a stochastic parrot. Think about electric vehicles over the last 10 years: people made the same argument (that EVs weren't worth making and would be difficult to commercialize).
Assignments, projects, reading papers for the final project: it's not easy at ALL. Everything you learn from this course will benefit you strongly. Keep in mind to always try hard.
Rating: 4 / 5 · Difficulty: 4 / 5 · Workload: 20 hours / week
Spring 2024 · posted 2024-05-12
Professor Riedl's explanations in the lectures are among the best in OMSCS. That is about the only part of this course I liked; the bad part is the assignments.
The assignments took me less than 4 hours each, and the final mini-project around 10 hours. Quizzes are open book, as is the final exam.
There were weeks I did not touch anything for this class; the time commitment is that low. I spent less than 4 hours a week on this class, and I feel I did not gain deep knowledge of this field.
This class has to be redesigned, at least in its assignments, to be at the same rigor as DL, BD4H, or HDDA.
Rating: 3 / 5 · Difficulty: 1 / 5 · Workload: 4 hours / week
Fall 2023 · posted 2024-05-06
Note: this review is actually for Spring 2024.
This is an OK class. For background, I've been in the program since 2022, and this was taken along with 2 other classes while I took a semester off from work to do this full time. I'm in the ML track and was taking DL (7643) concurrently with this.
LECTURES: The lectures by Prof. Riedl are great. The only criticism is that they are a bit long. That said, you will always leave the lectures understanding the concept being taught. I credit this class for my really understanding LSTMs, Transformers, and some other DL things.
There are some Meta AI lectures. These are definitely better than the ones in DL but not great; if Mark's lectures are a 5, then these are a 3. The info is good for the most part, and they mostly get the point across.
Most of the lecture content in this course is focused on actual NLP. That means the BOW model, machine reading, embeddings, information retrieval, and more. This is a survey class, not a class about LLMs; you only spend a bit of time on them, really. If you want to learn about LLMs, take DL and then watch some of Andrej Karpathy's great YouTube videos.
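For anyone unfamiliar with the term, here is a quick illustration of the bag-of-words (BOW) idea the reviewer mentions; the sentences and vocabulary are made up.

```python
# Toy bag-of-words: represent each document as a vector of word counts
# aligned to a shared vocabulary. Order of words is deliberately discarded.
from collections import Counter

docs = ["the cat sat on the mat", "the dog sat"]
vocab = sorted({w for d in docs for w in d.split()})

def bow_vector(doc):
    counts = Counter(doc.split())
    return [counts[w] for w in vocab]  # Counter returns 0 for missing words

print(vocab)
for d in docs:
    print(bow_vector(d))
```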
Quizzes: There is a short quiz at the end of each section. These are a little challenging because the sections can be very long (2+ hrs) and the questions can be a bit detail-oriented. You get two tries on each and they are open-everything, so I can't complain here.
Homework: Oh boy... this was a big disappointment. Coming out of ML, and seeing what DL is doing with its homework, I was really left scratching my head. The assignments are all very, very easy Jupyter notebooks. Most of the time is spent debugging the testing framework you are given or trying to understand what you are actually required to do. None of them are super hard, but I found them very annoying.
My bigger frustration is that they do a poor job of reinforcing the lectures. In DL you have to implement a transformer from PyTorch primitives. The furthest you go here, in terms of DL, is implementing an LSTM. We don't even use the built-in PyTorch transformer module at any point. I'm not sure why an NLP class doesn't have you work with one of the most important NLP mechanisms out there right now.
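For reference, using the built-in module the reviewer is referring to takes only a few lines. This is a hedged sketch with arbitrary dimensions, not anything from the course.

```python
# Sketch of the built-in PyTorch transformer encoder modules; the d_model,
# head count, depth, and input sizes below are arbitrary illustrative choices.
import torch
import torch.nn as nn

d_model = 128
encoder_layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=8, dim_feedforward=512, batch_first=True
)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

# A batch of 4 "sentences", each a sequence of 16 token embeddings.
x = torch.randn(4, 16, d_model)
out = encoder(x)
print(out.shape)  # torch.Size([4, 16, 128]): same shape, contextualized
```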
As is, I also don't feel the homework does a great job of reinforcing what is in lecture unless you do the notebook while watching the lectures at the same time. They almost need to be thought of as a companion to the lectures; if that was the goal when creating them, that is reasonable.
PROJECT: The project was to implement an attention mechanism. I think the project would have been a great opportunity to either use the PyTorch transformer or build one from scratch; the lecture walk-through is good enough for that. Instead, we implement a mechanism from 2016 that doesn't really have a purpose in this day and age. The idea is that we get experience working with attention mechanisms, but that's really a small part of it.
I will say it is good that they force you to write your train and test loops from scratch. This aligns with the expectation that the prior notebooks have trained you for the project. I am just not convinced the project does a good job of reinforcing NLP, since it's so hard to get good results from this sort of network. I'd give this project a 2/5 in its current state.
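For context, a from-scratch train/eval loop of the kind the reviewer describes has roughly this shape; the model and data below are placeholders, not the actual project.

```python
# Skeleton of a from-scratch PyTorch train/eval loop on toy tensors.
# The tiny MLP and random data stand in for a real model and DataLoader.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

X, y = torch.randn(64, 10), torch.randint(0, 2, (64,))

for epoch in range(3):
    model.train()
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = loss_fn(model(X), y)    # forward pass + loss
    loss.backward()                # backprop
    optimizer.step()               # parameter update

    model.eval()
    with torch.no_grad():          # evaluation: no gradient tracking
        acc = (model(X).argmax(dim=1) == y).float().mean()
    print(f"epoch {epoch}: loss={loss.item():.3f} acc={acc.item():.3f}")
```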
Exams: The midterm was childish; it felt like something from high school. You are given a set of questions you must answer, in a Word-doc-like format, based on a paper you have read. The problem is that they really don't let you flex your muscles if you are used to writing quality papers like you would in ML4T, ML, or even DL to an extent. They also expected you to identify very specific things, even though you are ostensibly just giving a summary. This was the lowest point of the class for me. 1/5.
The final was open-everything, short-response. They ask potentially very deep systems-design questions but then expect you to answer in 1-2 paragraphs. Given how specific the midterm grading was, I found this to be impossible. Weirder still, half the questions seemed to focus on LLMs, which was strange since we spent very little time talking about them in the class. When we asked the TAs if we should incorporate material from outside the class, we were told it shouldn't be needed. I'm not sure how that works, since some questions had virtually no coverage in the course.
Now that said, none of the questions were hard. I just don't understand why we were being asked those questions given everything else we learned in the class.
Summary: So who is this class for? I think this is a good course if you are in the CS track and want to try something ML related that isn't brutal like DL and is not ML4T. This course does a great walkthrough of NNs and can take you from nothing to writing PyTorch code. That's pretty cool.
If you already know NNs, having taken DL or something like that, then some of the non-NN NLP stuff is interesting. I just don't think the treatment is rigorous enough to really learn a lot. I now know a bunch of concepts, but I don't feel I can strongly apply them. The NN side gets completely covered in the 4th section of DL; you will learn nothing new here related to NNs if you have taken DL. Again, you never use a transformer in this class.
The TAs are trying their best. I think this course could be super great if you have the lectures form the base and then have the students go out and extend them. The lectures can be tightened up and can assume a bit more prior knowledge; I could have skipped the whole module on Bayes, having seen it in AI4R, for example. Have this go from a 10 hr/week course to 15, clean up the notebooks and make the requirements clearer (why is GT allergic to type hints?), and give us a more modern project. Do that and I think you have a winner.
I got a mid A for what it's worth. Many weeks I completely forgot about this course.
Rating: 3 / 5 · Difficulty: 2 / 5 · Workload: 9 hours / week
Fall 2023 · posted 2024-03-16
This is my 7th course and by far the best. It strikes a good balance of theory/lectures, programming assignments, and paper reviews. It doesn't beat you to death with sadistic assignments, and it provides a shell where you focus on the concepts learned and see them working. I finally understood the intuition of the Transformer model and its variants. Prof. Riedl is stupendous, and I wish he had finished the entire set of modules.
He should come up with an advanced NLP course, since there is a lot of interest in how to quantize and fine-tune LLMs using HITL RLHF or DPO. I think fine-tuning itself could be a course, starting with the various sub-word, positional encoding (PE), and attention techniques that can be explored.
The only thing they could improve is better homework recitations by the TAs.
Rating: 5 / 5 · Difficulty: 3 / 5 · Workload: 10 hours / week
Fall 2023 · posted 2023-12-18
My 8th course in the program and one of my favorites so far. There are about 18 hours of lecture content. Riedl's lectures are very good; I actually went through all of his twice to try and fully absorb the content. A handful of the modules are done by guest lecturers from Meta and are much worse in comparison: some of them seem to be reading straight from a teleprompter, and most just do a terrible job of explaining things. Most of the content of the course focuses on neural networks, although there is some content on foundational math in the beginning. If you have a strong foundation in ML, I would expect this class to be not too hard. If you don't, you might have to do a little more work to catch up.
There are 6 homework assignments that entail filling out various functions in a Jupyter notebook. You are given a lot of supporting functionality, and they are graded based on the included tests. I'd say they ramp up in complexity pretty steadily. The first 4 are a breeze and can be done in an afternoon or two very easily. The 5th was somewhere in between. The 6th is a "project" that is still just a notebook with some instructions, but it's more open-ended about the approach you can use and doesn't have any auto-grading. We had an extra half a week for the project, but I spent as much time on it as I did on every other assignment combined, so start early. Everything is done with PyTorch, and you actually build models that do stuff, which makes the learning feel well applied. I feel like I am walking away with a toolkit to do more projects using what I learned--a first for me in this program.
There were two exams. They consisted of a series of short-answer questions, with unlimited time and open notes/internet. In order to make this somewhat challenging, they're structured to make you apply concepts from the course in more creative scenarios, not just regurgitate facts from the lectures (most of the time). I found it took me quite a bit of time to think through each question, but I learned in the process of completing the exam. It was challenging, but honestly, it's the most fair and effective exam structure I've ever encountered (and my philosophy is generally that exams are a waste of time).
Besides the bad lectures from Meta, my criticism of the class, at least this semester, is that the TAs are still figuring a lot of things out about how best to run the course. They were very slow on grading and very inconsistent in responding to questions on Ed. Some of the instructions in the homeworks were very confusing, and at times I felt the challenge was in interpreting what was written, not in actually doing the task itself. I expect some of this will resolve as time goes on and they improve the logistics of the course. They also are very strict about not releasing anything early, including lecture content, which is released on a weekly basis and includes short quizzes which must be done in the week they are released. This semester, we weren't allowed to download lecture videos or slides. I don't really understand why, but these things were a big inconvenience to me when I had some international travel during the semester.
My time estimate is an average across the semester. In the early part of the course, I would often spend 3-5 hours a week; during exams and toward the end, it was closer to 15-20.
Rating: 4 / 5 · Difficulty: 3 / 5 · Workload: 10 hours / week
Fall 2023 · posted 2023-11-29
I have liked this course so far. The first half or more of the course goes deeper into NLP concepts I had already seen in DL. The homework assignments are not too hard. The project is very interesting. NLP is a lot easier than ML and DL; I would recommend it as a summer class.
Rating: 4 / 5 · Difficulty: 3 / 5 · Workload: 10 hours / week
Fall 2023 · posted 2023-11-15
This course does a good job of explaining modern NLP topics, such as word embeddings, RNNs/LSTMs, attention, Transformers, and Key/Value stores. It also covers some info retrieval topics.
The early lectures of the class are very detailed and cover the topics very well. The explanations are stellar.
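As a taste of those topics, here is a minimal sketch of scaled dot-product attention, the core operation behind the Transformer material listed above; the tensor shapes are arbitrary illustrative choices.

```python
# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # (..., seq_q, seq_k)
    weights = F.softmax(scores, dim=-1)            # each row sums to 1
    return weights @ v                             # weighted sum of values

q = torch.randn(1, 5, 16)  # 5 query positions, dimension 16
k = torch.randn(1, 5, 16)
v = torch.randn(1, 5, 16)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 5, 16])
```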
The homeworks are simple and basic and can be done in a few hours. This class is so light it can easily be paired with others. The exams are open book, and you are given ample time to do them without Honorlock. The late policy is very flexible (5 free late days for the semester).
The downside of this course is the vague lectures from Meta towards the end, which just throw out a lot of terminology in survey fashion rather than explaining any concepts. The professor should re-record them with his own explanations (since he does a great job in the other lectures). Also, the course could easily have 3x more homework and still be quite doable; I sometimes ignored the class for weeks and still did well.
There are no topics on pretrained-model refinement or Reinforcement Learning from Human Feedback (RLHF).
Overall a good class, and I learned a lot. But I would have preferred more projects that delve into typical NLP tasks, rather than just a couple of basic ones.
Rating: 4 / 5 · Difficulty: 1 / 5 · Workload: 5 hours / week