Course Schedule

Paper reading list and presenters

Jan. 27, Tue
Course Overview (Slides)
Chen
  1. How to Read a CS Research Paper by Philip Fong
  2. How to do research by Bill Freeman
  3. How to write a good paper by Bill Freeman
  4. Novelty in Science by Michael Black
  5. How to speak (video) by Patrick Winston
Jan. 29, Thu
Deep Learning Recap (Slides)
Chen
Feb. 3, Tue
The Unreasonable Effectiveness of Data (Slides)
Chen
Feb. 5, Thu
Visual Concepts (Slides)
Chen
Feb. 5, Thu
Due Presentation signup sheet
Feb. 10, Tue
Overview of Multimodal LLMs (Slides)
Zitian Tang
Feb. 12, Thu
Generative AI for Robot Learning (Slides)
Zilai Zeng
Feb. 19, Thu
Flow Matching and Normalizing Flows (Slides)
Dr. Calvin Luo
Feb. 19, Thu
MP Mini Project
Due on March 12
  1. Mini Project Handout
  2. Submission Form
Feb. 24, Tue
Teaching Video Models to Understand Physics Control (Slides)
Nate Gillman
Feb. 26, Thu
“Emergent” Abilities in Large Pre-trained Models (Slides)
Andrew, Daniel, Taj, and Woody
  1. Emergent Abilities of Large Language Models
  2. Are Emergent Abilities of Large Language Models a Mirage?
Feb. 26, Thu
FINAL Final Project Proposal
Due on March 10
  1. Submission Form
Mar. 3, Tue
Few-shot and In-context Learning (Slides)
Athulith, Benjamin, Kenneth, Vanessa, and Zheyu
  1. Reading Survey
  2. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
  3. Function Vectors in Large Language Models
Mar. 5, Thu
The World After Transformers (1) (Slides)
Armaan, Asher, Chaitanya, Jiayi, and Manan
  1. Reading Survey
  2. Vision Transformers Need Registers
  3. Efficiently Modeling Long Sequences with Structured State Spaces
Mar. 10, Tue
Quo Vadis, Computer Vision? (Slides)
Akash, Benjamin, Faisai, and Lyfey
  1. Reading Survey
  2. VGGT: Visual Geometry Grounded Transformer
  3. V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning
Mar. 12, Thu
Visual Understanding vs. Generation (Slides)
Alexander, Evan, Ruthwik, Om, and Yinghua
  1. Reading Survey
  2. Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion
  3. Diffusion Transformers with Representation Autoencoders
Mar. 17, Tue
Final Project Idea Pitch (1)
  1. Slide deck
Mar. 19, Thu
Final Project Idea Pitch (2)
  1. Slide deck
Mar. 31, Tue
Video Generation Meets the Laws of Physics (Slides)
Aashish, Gary, Xiaoyan, Xijie, and Yuqiao
  1. Reading Survey
  2. WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions
  3. Video models are zero-shot learners and reasoners
Apr. 2, Thu
World Models (Slides)
Eric, Ethan, Ioanna, Lihao, and Mark
  1. Reading Survey
  2. Genie: Generative Interactive Environments
  3. Mastering Diverse Domains through World Models
Apr. 3, Fri
INVITED Learning World Models and Agents for High-Cost Environments
Prof. Sherry Yang
Apr. 7, Tue
Videos, Language, and Robots (Slides)
Arin, Chandradithya, Enyan, Peiyan, and Peter
  1. Reading Survey
  2. Learning to Play Minecraft with Video PreTraining (VPT)
  3. π*0.6: a VLA That Learns From Experience
Apr. 9, Thu
INVITED Assessing Adaptive World Models in Machines with Novel Games
Lance Ying
Apr. 14, Tue
Abstract Reasoning with LLMs (Slides)
Apoorv Khandelwal
Apr. 16, Thu
The World After Transformers (2) (Slides)
Akul, Harshit, Heejeong, Ronit, and Shravya
  1. Reading Survey
  2. Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
  3. Nested Learning: The Illusion of Deep Learning Architectures
May 8, Fri
Final project presentations (Lubrano 1 to 4 pm) (Slides)
May 11, Mon
Due Project submission (Form)