Date | Topic | Presenter | Notes
8/22 | Introduction - Introduction to compilers, architecture and logistics | Charith (Slides) | Todo: Class Statistics Survey
8/24 | Compilers - Quick overview of Compiler Construction + Optimizations | Charith (Slides) |
8/29 | Compiler Optimizations - Anatomy of a Compiler Optimization Pass, DSLs, Domain Specific Optimizations | Charith (Slides) |
8/31 | DSLs + ML in Architecture - Continuation of discussion on DSLs and examples of ML in architecture | Charith (Slides) | Background Reading: A Survey of Machine Learning for Computer Architecture and Systems; Todo: Paper Selection Form (Due: Aug. 31st)
9/05 | Machine Learning Techniques - Quick overview of ML techniques: Neural Networks | Charith (Slides) |
9/07 | Machine Learning Techniques (Contd.) and Auto-tuning - Quick overview of ML techniques: Genetic Algorithms, Simulated Annealing, Sequential Decision Making; Introduction to Auto-tuning | Charith (Slides) | Background Reading: A Survey on Compiler Autotuning using Machine Learning (ACM CSUR 2018); A Taxonomy of ML for Systems Problems (IEEE Micro Sept/Oct 2020)
9/12 | Autotuning: Empirical Autotuning - Main Reading: Automatically Tuned Linear Algebra Software (SC 1998) | Devansh | Related Reading: A Fast Fourier Transform Compiler (PLDI 1999); Fast Automatic Generation of DSP Algorithms (ICCS 2001); The Design and Implementation of FFTW3 (IEEE 2005)
9/14 | Autotuning: Languages for Exposing Choices - Main Reading: PetaBricks: A Language and Compiler for Algorithmic Choice (PLDI 2009) | Muyan | Related Reading: A Framework for Adaptive Algorithm Selection in STAPL (PPoPP 2005); Halide: A Language and Compiler for Optimizing Parallelism, Locality, and Recomputation in Image Processing Pipelines (PLDI 2013); A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers (PACT 2021)
9/19 | Autotuning: Techniques - Main Reading: Bliss: Auto-tuning Complex Applications using a Pool of Diverse Lightweight Learning Models (PLDI 2021) | Chamika | Related Reading: Learning to Generate Fast Signal Processing Implementations (ICML 2001); Towards Better Understanding of Black-box Auto-tuning: A Comparative Analysis for Storage Systems (ATC 2018) - a systems paper with a good overview of techniques
9/21 | Autotuning: Frameworks - Main Reading: CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research (CGO 2022) | Jai | Related Reading: AutoTVM: Learning to Optimize Tensor Programs (NeurIPS 2018); OpenTuner: An Extensible Framework for Program Autotuning (PACT 2014)
9/26 | Autotuning: Scaling Up - Main Reading: GPTuneBand: Multi-task and Multi-fidelity Autotuning for Large-scale High Performance Computing Applications (SIAM 2022) | Ben | Related Reading: Portable Performance on Heterogeneous Architectures (ASPLOS 2013)
9/28 | Autotuning: Diverging Workloads - Main Reading: WACO: Learning Workload-Aware Co-optimization of the Format and Schedule of a Sparse Tensor Program (ASPLOS 2023) | Devansh | Related Reading: Autotuning Algorithmic Choice for Input Sensitivity (PLDI 2015); WISE: Predicting the Performance of Sparse Matrix Vector Multiplication with Machine Learning (PPoPP 2023)
10/03 | Autotuning: Increasing Efficiency - Main Reading: AdaTune: Adaptive Tensor Program Compilation Made Efficient (NeurIPS 2020) | Yuhao | Related Reading: SRTuner: Effective Compiler Optimization Customization by Exposing Synergistic Relations (CGO 2022)
10/05 | Data-driven Cost Models: Part 1 - Main Reading: TLP: A Deep Learning-Based Cost Model for Tensor Program Tuning (ASPLOS 2023) | Jun, Wyatt | Related Reading: Learning Execution through Neural Code Fusion (ICLR 2020)
10/10 | Data-driven Cost Models: Part 2 - Main Reading: A Learned Performance Model for Tensor Processing Units (MLSys 2021) | Noelle | Related Reading: A Deep Learning Based Cost Model for Automatic Code Optimization (MLSys 2021)
10/12 | Program Embeddings: Part 1 - Main Reading: CodeBERT: A Pre-Trained Model for Programming and Natural Languages (EMNLP 2020) | Muyan | Related Reading: Blended, Precise Semantic Program Embeddings (PLDI 2020); Learning and Evaluating Contextual Embedding of Source Code (ICML 2020)
10/17 | Program Embeddings: Part 2 - Main Reading: ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations (ICML 2021) | Jun | Related Reading: IR2Vec: LLVM IR Based Scalable Program Embeddings (TACO 2020)
10/19 | Learned Optimizations: Traditional Compiler Optimizations 1 - Main Reading: Compiler Auto-Vectorization with Imitation Learning (NeurIPS 2019) | Vir | Related Reading: NeuroVectorizer: End-to-End Vectorization with Deep Reinforcement Learning (CGO 2020); Meta Optimization: Improving Compiler Heuristics (PLDI 2003)
10/24 | Learned Optimizations: Traditional Compiler Optimizations 2 - Main Reading: End-to-End Deep Learning of Optimization Heuristics (PACT 2017) | Wyatt, Sanjana | Related Reading: AutoPhase: Juggling HLS Phase Orderings in Random Forests with Deep Reinforcement Learning (MLSys 2020)
10/26 | Learned Optimizations: DSLs Part 1 - Main Reading: Learning to Optimize Halide with Tree Search and Random Programs (SIGGRAPH 2019) | Vir, Jai | Related Reading: Value Learning for Throughput Optimization of Deep Neural Networks (MLSys 2021)
10/31 | Learned Optimizations: DSLs Part 2 - Main Reading: Ansor: Generating High-Performance Tensor Programs for Deep Learning (OSDI 2020) | Chamika | Related Reading: The Case for Learned Index Structures (SIGMOD 2018) - databases
11/02 | Learned Optimizations: Tensor Programs - Main Reading: Device Placement Optimization with Reinforcement Learning (ICML 2017) | Sanjana | Related Reading: Transferable Graph Optimizers for ML Compilers (NeurIPS 2020)
11/07 | Guest Lecture | |
11/09 | No Class | |
11/14 | Architecture Design Space Exploration: Part 1 - Main Reading: HyperMapper: A Practical Design Space Exploration Framework (MASCOTS 2019) | Chase, Yuhao | Related Reading: A Full-stack Accelerator Search Technique for Vision Applications; Mind Mappings: Enabling Efficient Algorithm-Accelerator Mapping Space Search (ASPLOS 2021)
11/16 | Architecture Design Space Exploration: Part 2 - Main Reading: A Graph Placement Methodology for Fast Chip Design (Nature 2021) | Ben | Related Reading: Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning (MICRO 2021); Cohmeleon: Learning-Based Orchestration of Accelerator Coherence in Heterogeneous SoCs (MICRO 2021)
11/21 | Break | |
11/23 | Break | |
11/28 | Learned Architecture Simulation - Main Reading: DiffTune: Optimizing CPU Simulator Parameters with Learned Differentiable Surrogates (MICRO 2020) | Noelle | Related Reading: SimNet: Computer Architecture Simulation using Machine Learning
11/30 | Learned Systems: Caches - Main Reading: Applying Deep Learning to the Cache Replacement Problem (MICRO 2019) | Chase | Related Reading: Learning Memory Access Patterns (ICML 2018)
12/05 | Student Presentations | |