Sunday Evening: Conference Reception



Monday, June 1

8:30 AM – 8:50 AM: Opening Remarks

8:50 AM – 9:50 AM: Keynote

9:50 AM – 10:20 AM: Coffee Break

10:20 AM – 12:00 PM

Data Compression Accelerator on IBM POWER9 and z15 Processors
Bulent Abali (IBM Research); Bart Blaner, John Reilly, Matthias Klein, Ashutosh Mishra, Craig Agricola (IBM Systems); Bedri Sendir (IBM Cloud); Alper Buyuktosunoglu (IBM Research); Christian Jacobi, Bill Starke, Haren Myneni, Charlie Wang (IBM Systems)

High-Performance Deep-Learning Coprocessor Integrated into x86 SoC with Server-Class CPUs
Glenn Henry, Parviz Palangpour, Michael Thomson (Centaur Technology); J Scott Gardner (Advantage Engineering LLC); Bryce Arden, Kimble Houck, Jonathan Johnson, Kyle O'Brien, Scott Petersen, Benjamin Seroussi, Tyler Walker (Centaur Technology)

The IBM z15 High Frequency Mainframe Branch Predictor
Adam Collura, Anthony Saporito, James Bonanno, Brian R. Prasky, Narasimha Adiga, Matthias Heizmann (IBM)

Evolution of the Samsung Exynos CPU Microarchitecture
Brian Grayson (Samsung Austin Research Center/SiFive); Jeff Rupley, Gerald Zuraski, Jr., Eric Quinnell; Daniel A. Jiménez (Samsung Austin Research Center/Texas A&M); Tarun Nakra (Samsung Austin Research Center/AMD); Paul Kitchin (Samsung Austin Research Center/Nuvia); Ryan Hensley (Samsung Austin Research Center/Goodie); Edward Brekelbaum (Samsung Austin Research Center/SiFive); Vikas Sinha (Samsung Austin Research Center/Nuvia); Ankit Ghiya (Samsung Austin Research Center/ARM)

Xuantie-910: A Commercial Multi-Core 12-Stage Pipeline Out-of-Order 64-bit High Performance RISC-V Processor with Vector Extension
Chen Chen, Xiaoyan Xiang, Chang Liu, Yunhai Shang, Ren Guo, Dongqi Liu, Yimin Lu, Ziyi Hao, Jiahui Luo, Zhijian Chen, Chunqiang Li, Yu Pu, Jianyi Meng (Alibaba Cloud); Yuan Xie (Alibaba Group); Xiaoning Qi (Alibaba Cloud)

12:00 PM – 1:30 PM: Lunch

1:30 PM – 3:30 PM

Divide and Conquer Frontend Bottleneck
Ali Ansari (Sharif); Pejman Lotfi-Kamran (IPM); Hamid Sarbazi-Azad (Sharif)

Focused Value Prediction
Sumeet Bandishte, Jayesh Gaur, Zeev Sperber, Lihu Rappoport, Adi Yoaz, Sreenivas Subramoney (Intel)

Auto-Predication of Critical Branches
Adarsh Chauhan, Jayesh Gaur, Zeev Sperber, Franck Sala, Lihu Rappoport, Adi Yoaz, Sreenivas Subramoney (Intel)

Slipstream Processors Revisited: Exploiting Branch Sets
Vinesh Srinivasan (NC State); Rangeen Basu Roy Chowdhury (Intel); Eric Rotenberg (NC State)

Bouquet of Instruction Pointers: Instruction Pointer Classifier-based Spatial Hardware Prefetching
Samuel Pakalapati (Intel/BITS Pilani); Biswabandan Panda (IIT Kanpur)

CryoCore: A Fast and Dense Processor Architecture for Cryogenic Computing
Il-Kwon Byun, Dongmoon Min, Gyu-Hyeon Lee, Seongmin Na, Jangwoo Kim (SNU)

Genesis: A Hardware Acceleration Framework for Genomic Data Analysis
Tae Jun Ham (SNU); David Bruns-Smith, Brendan Sweeney (UC Berkeley); Yejin Lee, Seong Hoon Seo, U Gyeong Song (SNU); Young H. Oh (Sungkyunkwan); Krste Asanovic (UC Berkeley); Jae W. Lee (SNU); Lisa Wu Wills (Duke)

DSAGEN: Synthesizing Programmable Spatial Accelerators
Jian Weng, Sihao Liu, Zhengrong Wang, Vidushi Dadu, Tony Nowatzki (UCLA)

Bonsai: High-Performance Adaptive Merge Tree Sorting
Nikola Samardzic, Weikang Qiao, Vaibhav Aggarwal, Mau-Chung Frank Chang, Jason Cong (UCLA)

SOFF: An OpenCL High-Level Synthesis Framework for FPGAs
Gangwon Jo, Heehoon Kim, Jeesoo Lee, Jaejin Lee (SNU)

Gorgon: Accelerating Machine Learning from Relational Data
Matthew Vilim, Alex Rucker, Yaqi Zhang, Sophia Liu, Kunle Olukotun (Stanford)

A Specialized Architecture for Object Serialization with Applications to Big Data Analytics
Jaeyoung Jang (Sungkyunkwan); Sung Jun Jung, Sunmin Jeong, Jun Heo, Hoon Shin, Tae Jun Ham, Jae W. Lee (SNU)

3:30 PM – 4:00 PM: Coffee Break

4:00 PM – 6:00 PM

SpinalFlow: An Architecture and Dataflow Tailored for Spiking Neural Networks
Surya Narayanan, Karl Taht, Rajeev Balasubramonian, Edouard Giacomin, Pierre-Emmanuel Gaillardon (Utah)

NEBULA: A Neuromorphic Spin-Based Ultra-Low Power Architecture for SNNs and ANNs
Sonali Singh, Anup Sarma, Nicholas Jao (Penn State); Ashutosh Pattnaik (Penn State/Arm); Sen Lu, Kezhou Yang, Abhronil Sengupta, Vijaykrishnan Narayanan, Chita Das (Penn State)

uGEMM: Unary Computing Architecture for GEMM Applications
Di Wu, Jingjie Li, Ruokai Yin (Wisconsin); Hsuan Hsiao (Toronto); Younghyun Kim, Joshua San Miguel (Wisconsin)

Hardware-Software Co-Design for Brain-Computer Interfaces
Ioannis Karageorgos, Karthik Sriram (Yale); Jan Vesely, Michael Wu (Rutgers); Marc Powell, David Borton (Brown); Rajit Manohar, Abhishek Bhattacharjee (Yale)

GraphABCD: Scaling Out Graph Analytics with Asynchronous Block Coordinate Descent
Yifan Yang (Tsinghua/MIT); Zhaoshi Li, Yangdong Deng, Zhiwei Liu, Shouyi Yin, Shaojun Wei, Leibo Liu (Tsinghua)

GaaS-X: Graph Analytics Accelerator Supporting Sparse Data Representation Using Crossbar Architectures
Nagadastagiri Reddy Challapalle, Sahithi Rampalli (Penn State); Linghao Song (Duke); Nandhini Chandramoorthy, Karthik Swaminathan (IBM Research); Jack Sampson (Penn State); Yiran Chen (Duke); Vijaykrishnan Narayanan (Penn State)

T4: Compiling Sequential Code for Effective Speculative Parallelization in Hardware
Victor A. Ying (MIT); Mark C. Jeffrey (Toronto); Daniel Sanchez (MIT)

Efficiently Supporting Dynamic Task-Parallelism on Heterogeneous Cache-Coherent Systems
Moyang Wang, Tuan Ta, Lin Cheng, Christopher Batten (Cornell)

Flick: Fast and Lightweight ISA-Crossing Call for Heterogeneous-ISA Environments
Shenghsun Cho, Han Chen, Sergey Madaminov, Michael Ferdman, Peter Milder (Stony Brook)

Printed Microprocessors
Nathaniel Bleier, Muhammad Husnain Mubarik (UIUC); Farhan Rasheed, Jasmin Aghassi-Hagmann, Mehdi B. Tahoori (KIT); Rakesh Kumar (UIUC)

SysScale: Utilizing Holistic Multi-domain DVFS to Improve the Energy Efficiency of Mobile Processors
Jawad Haj-Yihia, Mohammed Alser (ETH Zürich); Nandita Vijaykumar (CMU/ETH Zürich); Jeremie Kim, Giray Yaglikci (ETH Zürich); Efraim Rotem (Intel); Onur Mutlu (ETH Zürich/CMU)

Déjà View: Spatio-Temporal Compute Reuse for Energy-Efficient 360° VR Video Streaming
Shulin Zhao, Haibo Zhang, Sandeepa Bhuyan, Cyan Subhra Mishra, Ziyu Ying, Mahmut Taylan Kandemir, Anand Sivasubramaniam, Chita Das (Penn State)

6:00 PM – 6:30 PM: Break

6:30 PM – 8:00 PM: TCCA/SIGARCH Business Meeting



Tuesday, June 2

8:30 AM – 9:30 AM: Keynote

9:30 AM – 10:00 AM: Coffee Break

10:00 AM – 12:00 PM

MLPerf Inference: A Benchmarking Methodology for Machine Learning Inference Systems
Vijay Janapa Reddi (Harvard/UT Austin); Christine Cheng (Intel); David Kanter (Real World Technologies); Peter Mattson (Google); Guenther Schmuelling (Microsoft); Carole-Jean Wu (ASU/Facebook); Brian Anderson (Google); Maximilien Breughe (NVIDIA); Mark Charlebois, William Chou (Qualcomm); Ramesh Chukka (Intel); Cody Coleman (Stanford); Sam Davis (Myrtle); Pan Deng (Tencent); Greg Diamos (Landing AI); Jared Duke (Google); Dave Fick (Mythic); J. Scott Gardner (Advantage Engineering); Itay Hubara (Habana Labs); Sachin Idgunji (NVIDIA); Thomas B. Jablin (Google); Jeff Jiao (Alibaba T-Head); Tom St. John (Tesla); Pankaj Kanwar (Google); David Lee (MediaTek); Jeffery Liao (Synopsys); Anton Lokhmotov (Dividiti); Francisco Massa (Facebook); Peng Meng (Tencent); Paulius Micikevicius (NVIDIA); Colin Osborne (Arm); Gennady Pekhimenko (Toronto); Arun Tejusve Raghunath Rajan (Intel); Dilip Sequeira (NVIDIA); Ashish Sirasao (Xilinx); Fei Sun (Alibaba); Hanlin Tang (Intel); Michael Thomson (Centaur Technology); Frank Wei (Alibaba Cloud); Ephrem Wu (Xilinx); Lingjie Xu (Alibaba Cloud); Koichi Yamada (Intel); Bing Yu (MediaTek); George Yuan (NVIDIA); Aaron Zhong (Alibaba T-Head); Peizhao Zhang (Facebook); Yuchen Zhou (General Motors)

Mocktails: Capturing the Memory Behaviour of Proprietary Mobile Architectures
Mario Badr (Toronto); Carlo Delconte (Arm); Isak Edo (Toronto); Radhika Jagtap, Matteo Andreozzi (Arm); Natalie Enright Jerger (Toronto)

Accel-Sim: An Extensible Simulation Framework for Validated GPU Modeling
Mahmoud Khairy (Purdue); Tor Aamodt (UBC); Timothy Rogers, Zhesheng Shen (Purdue)

HyperTRIO: Hyper-tenant TRanslation of I/O Addresses
Alexey Lavrov, David Wentzlaff (Princeton)

BabelFish: Fusing Address Translations Across the Stack for Containers
Dimitrios Skarlatos (UIUC); Umur Darbaz, Bhargava Gopireddy (NVIDIA/UIUC); Nam Sung Kim (Samsung/UIUC); Josep Torrellas (UIUC)

Enhancing and Exploiting Contiguity for Fast Memory Virtualization
Chloe Alverti, Stratos Psomadakis, Vasileios Karakostas (NTU Athens); Jayneel Gandhi (VMware Research); Konstantinos Nikas, Georgios Goumas, Nectarios Koziris (NTU Athens)

Revisiting RowHammer: An Experimental Analysis of Modern Devices and Mitigation Techniques
Jeremie S. Kim (CMU/ETH Zürich); Minesh Patel, Abdullah Giray Yaglikci, Hasan Hassan, Roknoddin Azizi, Lois Orosa (ETH Zürich); Onur Mutlu (ETH Zürich/CMU)

The Open-Yet-Folded Bitline: A Flexible DRAM Subarray Architecture Enabling Dynamic Performance-Capacity Tradeoff
Haocong Luo, Taha Shahroodi, Abdullah Giray Yaglikci, Hasan Hassan, Minesh Patel, Lois Orosa, Jisung Park, Onur Mutlu (ETH Zürich)

Architecting Noisy Intermediate-Scale Trapped Ion Quantum Computers
Prakash Murali (Princeton); Dripto M. Debroy, Kenneth R. Brown (Duke); Margaret R. Martonosi (Princeton)

AccQOC: Accelerating Quantum Optimal Control Based Pulse Generation
Jinglei Cheng, Haoqing Deng, Xuehai Qian (USC)

NISQ+: Boosting Computational Power of Quantum Computers by Approximating Quantum Error Correction
Adam Holmes, Mohammad Reza Jokar (UChicago); Ghasem Pasandi (USC); Yongshan Ding (UChicago); Massoud Pedram (USC); Fred Chong (UChicago)

SQUARE: Strategic Quantum Ancilla Reuse for Modular Quantum Programs via Cost-Effective Uncomputation
Yongshan Ding, Xin-Chuan Wu, Adam Holmes, Ash Wiseth, Diana Franklin (UChicago); Margaret Martonosi (Princeton); Fred Chong (UChicago)

12:00 PM – 2:00 PM: Awards Lunch

2:00 PM – 4:20 PM

HOOP: Efficient Hardware-Assisted Out-of-Place Update for Non-Volatile Memory
Miao Cai (Nanjing); Chance Coats, Jian Huang (UIUC)

Lelantus: Fine-Granularity Copy-On-Write Operations for Secure Non-Volatile Memories
Jian Zhou, Amro Awad, Jun Wang (UCF)

MorLog: Morphable Hardware Logging for Atomic Persistence in Non-Volatile Main Memory
Xueliang Wei, Dan Feng, Wei Tong, Jingning Liu, Liuqing Ye (HUST)

Tvarak: Software-Managed Hardware Offload for DAX NVM Storage Redundancy
Rajat Kateja, Nathan Beckmann, Gregory R. Ganger (CMU)

Relaxed Persist Ordering Using Strand Persistency
Vaibhav Gogte (Michigan); William Wang, Stephan Diestelhorst (ARM); Peter M. Chen, Satish Narayanasamy, Thomas F. Wenisch (Michigan)

Hardware-Based Domain Virtualization for Intra-Process Isolation of Persistent Memory Objects
Yuanchao Xu (NC State); Chencheng Ye (HUST); Yan Solihin (UCF); Xipeng Shen (NC State)

Check-In: In-Storage Checkpointing for Key-Value Store System Leveraging Flash-Based SSDs
Joohyeong Yoon, Won Seob Jeong, Won Woo Ro (Yonsei)

MuonTrap: Preventing Cross-Domain Spectre-Like Attacks by Capturing Speculative State
Sam Ainsworth, Timothy Jones (Cambridge)

Nested Enclave: Supporting Fine-Grained Hierarchical Isolation with SGX
Joongun Park, Naegyeong Kang, Taehoon Kim, Youngjin Kwon, Jaehyuk Huh (KAIST)

Speculative Data-Oblivious Execution: Efficient Elimination of Speculative Covert Channels
Jiyong Yu, Namrata Mantri, Josep Torrellas (UIUC); Adam Morrison (Tel Aviv); Christopher W. Fletcher (UIUC)

Packet Chasing: Observing Network Packets over a Cache Side-Channel
Mohammadkazem Taram (UCSD); Ashish Venkat (UVa); Dean Tullsen (UCSD)

Compact Leakage-Free Support for Integrity and Reliability
Meysam Taassori, Rajeev Balasubramonian (Utah); Siddhartha Chhabra, Alaa Alameldeen, Manjula Peddireddy, Rajat Agarwal (Intel); Ryan Stutsman (Utah)

DIVOT: A Novel Architecture Extending Hardware Trusted Computing Base Off CPU Chips and Beyond
Zhenyu Xu, Thomas Mauldin, Zheyi Yao, Shuyi Pei, Tao Wei, Qing Yang (URI)

CHEx86: Context-Sensitive Enforcement of Memory Safety via Microcode-Enabled Capabilities
Rasool Sharifi, Ashish Venkat (UVa)

4:20 PM – 4:40 PM: Break

4:40 PM: Excursion & Banquet



Wednesday, June 3

8:50 AM – 9:50 AM: Keynote

9:50 AM – 10:20 AM: Coffee Break

10:20 AM – 12:00 PM

Independent Forward Progress of Work-Groups
Alexandru Dutu (AMD Research); Matthew Sinclair (Wisconsin/AMD Research); Bradford M. Beckmann (AMD); David A. Wood (AMD Research/Wisconsin); Marcus Chow (UC Riverside/AMD Research)

ScoRD: A Scoped Race Detector for GPUs
Aditya K Kamath, Alvin George A, Arkaprava Basu (IISc Bangalore)

Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs
Esha Choukse (Microsoft); Michael Sullivan (NVIDIA); Mike O'Connor (NVIDIA/UT Austin); Mattan Erez (UT Austin); Jeff Pool, David Nellans, Stephen W. Keckler (NVIDIA)

ZnG: Architecting GPU Multi-Processors with New Flash for Scalable Data Analysis
Jie Zhang, Myoungsoo Jung (KAIST)

Commutative Data Reordering: A New Technique to Reduce Data Movement Energy on Sparse Inference Workloads
Ben Feinberg (Sandia); Benjamin C. Heyman, Darya Mikhailenko, Ryan Wong, An Ho, Engin Ipek (Rochester)

RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing
Liu Ke (Facebook/WUSTL); Udit Gupta (Harvard); Carole-Jean Wu (ASU/Facebook); Benjamin Cho (Facebook/UT Austin); Mark Hempstead (Facebook/Tufts); Brandon Reagen (Facebook); Xuan Zhang (Facebook/WUSTL); David Brooks (Harvard); Vikas Chandra, Utku Diril, Amin Firoozshahian, Bill Jia, Kim Hazelwood, Hsien-Hsin S. Lee, Meng Li, Bert Maher, Dheevatsa Mudigere, Maxim Naumov, Martin Schatz, Mikhail Smelyanskiy, Xiaodong Wang (Facebook)

iPIM: Programmable In-Memory Image Processing Accelerator Using Near-Bank Architecture
Peng Gu, Xinfeng Xie, Yufei Ding (UCSB); Guoyang Chen, Weifeng Zhang, Dimin Niu (Alibaba); Yuan Xie (UCSB)

Near Data Acceleration with Concurrent Host Access
Benjamin Cho, Yongkee Kwon, Sangkug Lym, Mattan Erez (UT Austin)

TIMELY: Pushing Data Movements and Interfaces in PIM Accelerators towards Local and in Time Domain
Weitao Li, Pengfei Xu (Rice); Yang Zhao (UCSB/Rice); Haitong Li (Stanford); Yuan Xie (UCSB); Yingyan Lin (Rice)

Hyper-AP: Enhancing Associative Processing Through a Full-Stack Optimization
Yue Zha (Wisconsin); Jing Li (Wisconsin/UPenn)

12:00 PM – 1:30 PM: Lunch

1:30 PM – 3:30 PM

TransForm: Formally Specifying Transistency Models and Synthesizing Enhanced Litmus Tests
Naorin Hossain (Princeton); Caroline Trippel (Stanford/Facebook); Margaret Martonosi (Princeton)

HieraGen: Automatically Generating Hierarchical Cache Coherence Protocols from Atomic Specifications
Nicolai Oswald, Vijay Nagarajan (Edinburgh); Daniel J. Sorin (Duke)

Tailored Page Sizes
Faruk Guvenilir (UT Austin/Microsoft); Yale Patt (UT Austin)

Perforated Page: Supporting Fragmented Memory Allocation for Large Pages
Chang Hyun Park (Uppsala); Sanghoon Cha, Bokyeong Kim, Youngjin Kwon (KAIST); David Black-Schaffer (Uppsala); Jaehyuk Huh (KAIST)

A Case for Hardware-Based Demand Paging
Gyusun Lee (Sungkyunkwan); Wenjing Jin (SNU); Wonsuk Song (Sungkyunkwan); Jeonghun Gong, Jonghyun Bae, Tae Jun Ham, Jae W. Lee (SNU); Jinkyu Jeong (Sungkyunkwan)

The Virtual Block Interface (VBI): A Flexible Alternative to Conventional Virtual Memory Frameworks
Nastaran Hajinazar (SFU/ETH Zürich); Pratyush Patel (Washington); Minesh Patel, Konstantinos Kanellopoulos (ETH Zürich); Saugata Ghose (CMU); Rachata Ausavarungnirun (KMUTNB); Geraldo Francisco de Oliveira Junior (ETH Zürich); Jonathan Appavoo (Boston); Vivek Seshadri (Microsoft Research India); Onur Mutlu (ETH Zürich)

A Simultaneous Multi-Neural Network Execution Processor Architecture
Eunjin Baek, Dongup Kwon, Jangwoo Kim (SNU)

SmartExchange: Trading Higher-Cost Memory Storage/Access for Lower-Cost Computation
Yang Zhao (Rice); Xiaohan Chen (Texas A&M); Yue Wang, Chaojian Li (Rice); Yuan Xie (UCSB); Zhangyang Wang (Texas A&M); Yingyan Lin (Rice)

Centaur: A Chiplet-Based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations
Ranggi Hwang, Taehun Kim, Youngeun Kwon, Minsoo Rhu (KAIST)

DeepRecSys: A System for Optimizing End-to-End At-Scale Neural Recommendation Inference
Udit Gupta (Facebook/Harvard); Samuel Hsia (Harvard); Vikram Saraph, Xiaodong Wang, Brandon Reagen (Facebook); Gu-Yeon Wei (Harvard); Hsien-Hsin S. Lee (Facebook); David Brooks (Facebook/Harvard); Carole-Jean Wu (ASU/Facebook)

An In-Network Architecture for Accelerating Shared-Memory Multiprocessor Collectives
Benjamin Klenk, Nan Jiang, Greg Thorson, Larry Dennison (NVIDIA)

DRQ: Dynamic Region-Based Quantization for Deep Neural Network Acceleration
Zhuoran Song, Bangqi Fu, Feiyang Wu, Zhaoming Jiang, Li Jiang, Naifeng Jing, Xiaoyao Liang (SJTU)

3:30 PM – 4:00 PM: Coffee Break

4:00 PM – 5:00 PM

Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads
Dennis Abts, Jonathan Ross, Jon Sparling, Mark Wong-VanHaren, Max Baker, Tom Hawkins, Andrew Bell, John Thompson, Teme Kahsai, Garrin Kimmell, Jennifer Hwang, Rebekah Leslie-Hurd, Michael Bye, Rogan Creswick, Matthew Boyd, Mahitha Venigalla, Evan Laforge, Jon Purdy, Utham Kamath, Dinesh Maheshwari, Michael Beidler, Geert Rosseel, Omar Ahmad, Gleb Gagarin, Rick Czekalski, Ashay Rane, Sahil Parmar (Groq Inc.)

The Nebula RPC-Optimized Architecture
Mark Sutherland, Siddharth Gupta, Babak Falsafi (EcoCloud, EPFL); Virendra Marathe (Oracle); Dionisios Pnevmatikatos (NTU Athens/FORTH); Alexandros Daglis (Georgia Tech)

Heat to Power: Thermal Energy Harvesting and Recycling for Warm Water-Cooled Datacenters
Xinhui Zhu, Weixiang Jiang, Fangming Liu, Qixia Zhang, Li Pan, Qiong Chen, Ziyang Jia (HUST)

Echo: Compiler-Based GPU Memory Footprint Reduction for LSTM RNN Training
Bojian Zheng (Toronto); Nandita Vijaykumar (CMU); Gennady Pekhimenko (Toronto)

JPEG-ACT: A Frequency-Domain Lossy DMA Engine for Training Convolutional Neural Networks
R. David Evans, Lu Fei Liu, Tor Aamodt (UBC)

5:00 PM – 5:15 PM: Closing Remarks