Course Schedule
Paper reading list and presenters
- Jan. 31, Tue
- Course Overview (Slides)
- Chen Sun
- (Background) How to Read a CS Research Paper by Philip Fong
- (Background) How to do research by Bill Freeman
- (Background) How to write a good paper by Bill Freeman
- (Background) How to speak (video) by Patrick Winston
- Feb. 2, Thu
- Deep Learning Recap (Slides)
- Chen Sun
- (Background) Novelty in Science by Michael Black
- (Background) Everything is Connected: Graph Neural Networks
- Feb. 6, Mon
- Due Presentation signup sheet
- Feb. 7, Tue
- Learning with Various Supervision (Slides)
- Chen Sun
- (Background) How to grow a mind: Statistics, structure, and abstraction
- (Background) ICLR Debate with Leslie Kaelbling (video)
- (Background) Learning with not Enough Data by Lilian Weng (Part 1 / Part 2)
- Feb. 9, Thu
- The Bitter Lesson (Reading survey / Slides)
- Amina, Ilija, and Raymond
- Revisiting Unreasonable Effectiveness of Data in the Deep Learning Era
- Unbiased Look at Dataset Bias
- (Background) The Bitter Lesson
- (Background) The Unreasonable Effectiveness of Data
- (Background) Exploring Randomly Wired Neural Networks for Image Recognition
- (Background) NAS evaluation is frustratingly hard
- Feb. 14, Tue
- Semi-supervised Learning (Reading survey / Slides)
- Rosella, Patrick, Lingyu, and Michael
- Mean teachers are better role models
- MixMatch: A Holistic Approach to Semi-Supervised Learning
- (Presentation) Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning
- (Presentation) FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
- (Background) Semi-Supervised Classification with Graph Convolutional Networks
- (Background) Inductive Representation Learning on Large Graphs
- (Background) Transfer Learning in a Transductive Setting
- Feb. 16, Thu
- Transfer Learning (Reading survey / Slides)
- Wasiwasi, Abubakarr, Yiqing, and Jacob
- Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks
- Transfusion: Understanding Transfer Learning for Medical Imaging
- (Background) Big Transfer (BiT): General Visual Representation Learning
- (Background) Rethinking Pre-training and Self-training
- (Background) A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark
- (Background) Rethinking ImageNet Pre-training
- (Background) Natural Language Processing (almost) from Scratch
- Feb. 21, Tue
- University holiday, no class
- Feb. 23, Thu
- Few-shot and In-context Learning (Reading survey / Slides)
- Sheridan, Shreyas, and Zhuo
- Matching Networks for One Shot Learning
- Language Models are Few-Shot Learners
- (Background) Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
- (Background) Prototypical Networks for Few-shot Learning
- (Background) Learning to Learn (Blog)
- (Background) Rethinking Few-Shot Image Classification: a Good Embedding Is All You Need?
- (Background) Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs
- (Background) Flamingo: a Visual Language Model for Few-Shot Learning
- Feb. 28, Tue
- Multitask Learning (Reading survey / Slides)
- Noah, Alexander, Pinyuan
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (Sections 1, 2, and 4)
- A Generalist Agent
- (Background) Intelligence without representation
- (Background) Multitask Prompted Training Enables Zero-Shot Task Generalization
- (Background) Taskonomy: Disentangling Task Transfer Learning
- (Background) UberNet: Training a Universal Convolutional Neural Network
- (Background) Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
- Mar. 2, Thu
- Transformer and its variants (Reading survey / Slides)
- Daniel, Yuan, and David
- Swin Transformer
- Perceiver: General Perception with Iterative Attention
- (Background) Synthesizer: Rethinking Self-Attention for Transformer Models
- (Background) Long Range Arena: A Benchmark for Efficient Transformers
- (Background) MLP-Mixer: An all-MLP Architecture for Vision
- (Background) Linformer: Self-Attention with Linear Complexity
- (Background) On the Relationship between Self-Attention and Convolutional Layers
- Mar. 7, Tue
- Self-supervised and Multimodal Learning (Slides)
- Chen Sun
- (Background) Self-Supervised Representation Learning by Lilian Weng
- (Background) Contrastive Representation Learning by Lilian Weng
- (Background) Self-Supervised Learning by Andrew Zisserman
- Mar. 9, Thu
- Self-supervised Learning for NLP (Reading survey / Slides)
- Yang, Adrian, and Vignesh
- REALM: Retrieval-Augmented Language Model Pre-Training
- Discovering Latent Knowledge in Language Models Without Supervision
- (Background) Self Supervision Does Not Help Natural Language Supervision at Scale
- (Background) How does in-context learning work?
- (Background) SpanBERT: Improving Pre-training by Representing and Predicting Spans
- (Background) RoBERTa: A Robustly Optimized BERT Pretraining Approach
- (Background) Human Language Understanding & Reasoning
- (Background) Do Large Language Models Understand Us?
- Mar. 9, Thu
- Project Final project signup
- Due on 3/16
- Mar. 10, Fri
- Homework First mini project
- Due on 4/28
- Mar. 14, Tue
- Invited Computer Vision for Global Scale Biodiversity Monitoring
- Sara Beery
- Mar. 16, Thu
- Self-supervised Learning for Images and Videos (Reading survey / Slides)
- Arthur, Robert, and Siyang
- Dimensionality Reduction by Learning an Invariant Mapping
- Time-Contrastive Networks: Self-Supervised Learning from Video
- (Background) BEiT: BERT Pre-Training of Image Transformers
- (Background) Representation Learning with Contrastive Predictive Coding
- (Background) Masked Autoencoders Are Scalable Vision Learners
- (Background) Deep Clustering for Unsupervised Learning of Visual Features
- (Background) Bootstrap your own latent: A new approach to self-supervised Learning
- (Background) Learning image representations tied to ego-motion
- Mar. 21, Tue
- Invited The Future of Computer Vision via Foundation Models and Beyond
- Ce Liu
- Mar. 23, Thu
- Project proposal (Slides)
- Mar. 23, Thu
- Feedback Mid-semester Feedback Form
- Mar. 28, Tue
- Homework Second mini project
- Due on 4/28
- Mar. 28, Tue
- Spring break
- Mar. 30, Thu
- Spring break
- Apr. 4, Tue
- Reinforcement Learning (Slides)
- Chen Sun
- Apr. 6, Thu
- World Models (Reading survey / Slides)
- Ray, Alexander, and Paul
- World Models
- Learning Latent Dynamics for Planning from Pixels
- (Background) Mastering Diverse Domains through World Models
- (Background) Control-Aware Representations for Model-based Reinforcement Learning
- (Background) Shaping Belief States with Generative Environment Models for RL
- (Background) Model-Based Reinforcement Learning: Theory and Practice
- (Background) DayDreamer: World Models for Physical Robot Learning
- Apr. 11, Tue
- Generative Models (Slides)
- Calvin Luo
- (Background) Understanding Diffusion Models: A Unified Perspective
- Apr. 13, Thu
- RL from Human Feedback (Reading survey / Slides)
- Ziyi, Qi, and Christopher
- Deep Reinforcement Learning from Human Preferences
- Training language models to follow instructions with human feedback
- (Background) Why does ChatGPT constantly lie? by Noah Smith
- (Background) ChatGPT Is a Blurry JPEG of the Web by Ted Chiang
- (Background) Stanford CS224N
- (Background) Learning to summarize from human feedback
- (Background) Reinforcement Learning for Language Models
- Apr. 18, Tue
- Learning from Offline Demonstration (Reading survey / Slides)
- Anirudha, Zilai, and Akash
- Learning to Act by Watching Unlabeled Online Videos
- Offline Reinforcement Learning as One Big Sequence Modeling Problem
- (Background) Language Conditioned Imitation Learning over Unstructured Data
- (Background) Building Open-Ended Embodied Agents with Internet-Scale Knowledge
- (Background) Decision Transformer: Reinforcement Learning via Sequence Modeling
- (Background) Understanding the World Through Action
- (Background) Learning Latent Plans from Play
- Apr. 20, Thu
- 3D Generation (Reading survey / Slides)
- Nitya, Linghai, and Yuan
- DreamFusion: Text-to-3D using 2D Diffusion
- RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation
- (Background) Equivariant Diffusion for Molecule Generation in 3D
- (Background) Point-E: A System for Generating 3D Point Clouds from Complex Prompts
- (Background) Text-To-4D Dynamic Scene Generation
- (Background) Zero-Shot Text-Guided Object Generation with Dream Fields
- Apr. 25, Tue
- Compositionality (Reading survey / Slides)
- Lingze, Suraj, and Apoorv
- Learning to Compose Neural Networks for Question Answering
- Compositional Visual Generation with Composable Diffusion Models
- (Background) Measuring and Narrowing the Compositionality Gap in Language Models
- (Background) CREPE: Can Vision-Language Foundation Models Reason Compositionally?
- (Background) COGS: A Compositional Generalization Challenge Based on Semantic Interpretation
- (Background) Neurocompositional computing: From the Central Paradox of Cognition to a new generation of AI systems
- Apr. 27, Thu
- Model Interpretability (Reading survey / Slides)
- Michael, Haoyu, and Qinan
- Do Vision Transformers See Like Convolutional Neural Networks?
- Acquisition of Chess Knowledge in AlphaZero
- (Background) BERT rediscovers the classical NLP pipeline
- (Background) Concept Bottleneck Models
- (Background) Tracr: Compiled Transformers as a Laboratory for Interpretability
- (Background) Interpreting Neural Networks through the Polytope Lens
- (Background) Neural Networks and the Chomsky Hierarchy
- Apr. 28, Fri
- Due Last day to submit mini projects
- May 2, Tue
- Final project office hours
- May 4, Thu
- Final project office hours
- May 11, Thu
- Final project presentations (CIT 368, noon to 2:30pm) (Slides)
- May 12, Fri
- Due Project submission (Form)