GPU Hardware and Software

5.00 / 5 rating, 3.00 / 5 difficulty, 12.00 hrs / week

Quick Facts and Resources


Name
GPU Hardware and Software
Listed As
CS-7295
Credit Hours
3
Available to
CS students
Description
This course explores the software and hardware aspects of GPU development. Through hands-on projects, you'll gain basic CUDA programming skills, learn optimization techniques, and develop a solid understanding of GPU architecture. Additionally, you'll study compiler principles to comprehend software-related GPU issues and read research papers on hardware challenges. By the end, you'll have enhanced your knowledge of compilers, programming, and computer architecture for modern GPUs.
Syllabus
Syllabus
Textbooks
No textbooks found.
  • RwC4XfLWS/UqhqfB5w2qFA==, fall 2025

    The GPU HW/SW course is a well-balanced class that provides hands-on experience with both GPU programming and GPU microarchitecture, along with a bit of compiler-style dataflow analysis. The workload is moderate, the projects are well-scoped, and the TA team is exceptional, making it a strong elective for students interested in systems, architecture, or parallel programming.

    Projects

    The course is built around five projects, each highlighting a different part of the GPU stack.

    Project 1 – CUDA Matrix Multiply (Intro Project)

    This is a basic introduction to CUDA and the ICE cluster environment. The assignment is just a warmup to get familiar with the toolchain and cluster workflow. Anyone with basic C/C++ experience should complete it quickly.

    Project 2 – Bitonic sort in CUDA

    This is the toughest project, but also the most rewarding. You implement Bitonic sort in CUDA and optimize for performance. The optimization component adds some challenge, but the project is very doable and well-defined. Many students consider this the highlight of the course.
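
    For readers who have not seen the algorithm, here is a minimal CPU-side sketch of a bitonic sorting network in Python. It is not the course's CUDA solution or starter code; the function name `bitonic_sort` and the power-of-two length restriction are assumptions made for illustration. The inner `for i` loop is the part that would typically map to one GPU thread per element.

```python
def bitonic_sort(values):
    """Return an ascending copy of `values`; length must be a power of two."""
    data = list(values)
    n = len(data)
    if n & (n - 1):
        raise ValueError("length must be a power of two")

    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared elements
            # Every i is independent within one (k, j) step, which is why
            # this loop is a natural fit for one thread per element.
            for i in range(n):
                partner = i ^ j
                if partner > i:                 # handle each pair once
                    ascending = (i & k) == 0    # direction of this subsequence
                    if (data[i] > data[partner]) == ascending:
                        data[i], data[partner] = data[partner], data[i]
            j //= 2
        k *= 2
    return data


print(bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

    The compare pattern depends only on the indices, never on the data, so every step does the same work regardless of input order.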

    Projects 3 & 4 – GPU Hardware Simulation

    These were two separate assignments in which you modify a simplified GPU simulator to explore architectural concepts such as:

    - modeling GPU cores
    - adjusting pipeline/latency behavior
    - experimenting with warp scheduling strategies

    These projects aren’t conceptually difficult, but matching simulator output exactly can be tedious. Fortunately, full precision matching is not required; you receive 95% credit as long as your statistics fall within a small tolerance.
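
    To give a flavor of what a warp scheduling strategy looks like, here is a toy Python sketch of two commonly described policies, round-robin and greedy-then-oldest. The review does not say which policies the course simulator uses, and every name here (`round_robin`, `greedy_then_oldest`, `ready`, `arrival`) is hypothetical rather than taken from the simulator.

```python
def round_robin(ready, last):
    """Pick the next ready warp after `last`, wrapping around.

    ready: list of bools, one per warp. last: index of the warp issued
    most recently (use -1 before the first issue).
    """
    n = len(ready)
    for step in range(1, n + 1):
        warp = (last + step) % n
        if ready[warp]:
            return warp
    return None  # no warp can issue this cycle


def greedy_then_oldest(ready, last, arrival):
    """Keep issuing from `last` while it is ready, else take the oldest.

    arrival: list of arrival cycles per warp; a smaller value means older.
    """
    if last is not None and ready[last]:
        return last
    candidates = [w for w in range(len(ready)) if ready[w]]
    if not candidates:
        return None
    return min(candidates, key=lambda w: arrival[w])


# Warp 1 is stalled; round-robin moves on from warp 0 to warp 2.
print(round_robin([True, False, True, True], last=0))  # 2
```

    Round-robin spreads issue slots evenly across warps, while the greedy policy sticks with one warp until it stalls.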

    As of the current semester, these two have reportedly been merged into a single Project 3, and a new Project 4 has been introduced related to ML (attention mechanisms). I don’t have specific details on that new assignment, but historically the simulator projects have been very manageable.

    Project 5 – Dataflow analysis (Reaching Defs & Liveness)

    Despite the terminology, this is not a compiler-heavy project. You analyze a small set of instructions and compute reaching definitions and liveness information. Students without compiler backgrounds typically do fine. It’s systematic rather than difficult.
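
    As a rough illustration of how systematic this kind of analysis is, here is a small Python sketch of liveness analysis computed by fixed-point iteration over basic blocks. It is not the project's required format; the instruction encoding `(dest, [sources])`, the block names, and the function name `liveness` are assumptions made for this example.

```python
def liveness(blocks, successors):
    """Compute live-in and live-out variable sets per basic block.

    blocks: {name: [(dest_or_None, [source_vars]), ...]}
    successors: {name: [successor block names]}
    """
    use, define = {}, {}
    for name, instrs in blocks.items():
        used, defined = set(), set()
        for dest, srcs in instrs:
            # A read counts as a use only if the block has not already
            # written that variable earlier.
            used |= {s for s in srcs if s not in defined}
            if dest is not None:
                defined.add(dest)
        use[name], define[name] = used, defined

    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:  # repeat until no set grows any further
        changed = False
        for b in blocks:
            out = set()
            for s in successors.get(b, []):
                out |= live_in[s]
            new_in = use[b] | (out - define[b])
            if out != live_out[b] or new_in != live_in[b]:
                live_out[b], live_in[b] = out, new_in
                changed = True
    return live_in, live_out


# Hypothetical loop: B0 initializes i and n, B1 increments i and branches
# on (i, n), B2 copies i into r.
blocks = {
    "B0": [("i", []), ("n", [])],
    "B1": [("i", ["i"]), (None, ["i", "n"])],
    "B2": [("r", ["i"])],
}
successors = {"B0": ["B1"], "B1": ["B1", "B2"], "B2": []}
live_in, live_out = liveness(blocks, successors)
print(sorted(live_in["B1"]))  # ['i', 'n']
```

    Reaching definitions follows the same iterate-until-stable pattern, except that it runs forward through the blocks and tracks definitions rather than variables.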

    Lectures, Quizzes, and Exam

    The lectures are (very) short and focus directly on the material needed for the projects. They are not exhaustive or deeply theoretical, but they give you enough context to succeed.

    Quizzes are straightforward and (at least previously) open book. There is one final exam worth 10% of the grade. It was fair and consistent with the quizzes and lectures. To earn an A in the course, you need at least 90% overall and at least 40% on the final exam.

    TA Support

    The TA team is one of the strongest aspects of the course. The head TA (Scott) is exceptionally responsive, and the entire team is helpful, knowledgeable, and engaged. Ed support is fast and detailed, which significantly improves the project experience.

    Workload & Overall Difficulty

    Overall, the effort level is medium to medium‑low (likely medium now, given the new project, which probably increases the course load by 15-20%). The projects are interesting, the course is not stressful, and the pacing is comfortable. It’s a great “systems” elective that blends GPU software, architecture, and light analysis without overwhelming students.

    Final Verdict

    Highly recommended, especially for students interested in GPU programming, GPU architecture, and performance analysis and optimization. The course offers a meaningful hands-on experience with a manageable workload, excellent TA support, and projects that are both fun and practical.

    Rating: 5 / 5
    Difficulty: 3 / 5
    Workload: 12 hours / week