Viewing Conference Content

Papers are available in the program below. You can view conference presentations and live streams on Whova, either by downloading the app to your phone or tablet or by using the web interface. Note that Whova content is available only to registered conference attendees.


Day 1: Monday, June 1

9:55 AM – 10:00 AM (EDT/New York)

6:55 AM (PDT/San Francisco),
15:55 (CEST/Brussels),
21:55 (CST/Beijing)
Welcome Message from the General Co-chairs
José Martínez (Cornell); José Duato (UPV)

Welcome Message from the Program Chair
Lieven Eeckhout (Ghent)

Welcome Message from the Industry Track Chair
David Patterson (Google/UC Berkeley)

10:00 AM – 11:00 AM (EDT/New York): Keynote by Margaret Martonosi

7:00 AM (PDT/San Francisco),
16:00 (CEST/Brussels),
22:00 (CST/Beijing)

Abstract
The fields of computer and information science and engineering (CISE) are central to nearly all of society's needs, opportunities, and challenges. The US National Science Foundation (NSF) was created 70 years ago with a broad mission to promote the progress of science and to catalyze societal and economic benefits. NSF, largely through its CISE directorate, which has an annual budget of more than $1B, accounts for over 85% of federally funded, academic, fundamental computer science research in the US. My talk will give an overview of NSF/CISE research, education, and research infrastructure programs, and relate them to the technical and societal trends and topics that will impact their future trajectory. I will particularly highlight the opportunity areas most in need of the engagement and insights of ISCA researchers going forward.


Bio
Margaret Martonosi is the US National Science Foundation's (NSF) Assistant Director for Computer and Information Science and Engineering (CISE). With an annual budget of more than $1B, the CISE directorate at NSF has the mission to uphold the Nation's leadership in scientific discovery and engineering innovation through its support of fundamental research and education in computer and information science and engineering, as well as transformative advances in research cyberinfrastructure. While at NSF, Dr. Martonosi is on leave from Princeton University, where she is the Hugh Trumbull Adams '35 Professor of Computer Science.

Dr. Martonosi's research interests are in computer architecture and hardware-software interface issues in both classical and quantum computing systems. Her work has included the widely-used Wattch power modeling tool and the Princeton ZebraNet mobile sensor network project for the design and real-world deployment of zebra tracking collars in Kenya. Dr. Martonosi is a member of the American Academy of Arts and Sciences, and a Fellow of the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE). Her papers have received numerous long-term impact awards from the computer architecture and mobile computing communities. In addition, she has earned the 2019 SIGARCH Alan D. Berenbaum Distinguished Service Award, the 2018 IEEE Computer Society Technical Achievement Award, and the 2010 Princeton University Graduate Mentoring Award, among other honors.

11:00 AM – 12:00 PM (EDT/New York)

8:00 AM (PDT/San Francisco),
17:00 (CEST/Brussels),
23:00 (CST/Beijing)
Session Chair: David Patterson (Google / UC Berkeley)
11:00 AM – 11:12 AM EDT
Data Compression Accelerator on IBM POWER9 and z15 Processors
Bulent Abali (IBM Research); Bart Blaner, John Reilly, Matthias Klein, Ashutosh Mishra, Craig Agricola (IBM Systems); Bedri Sendir (IBM Cloud); Alper Buyuktosunoglu (IBM Research); Christian Jacobi, Bill Starke, Haren Myneni, Charlie Wang (IBM Systems)

11:12 AM – 11:24 AM EDT
High-Performance Deep-Learning Coprocessor Integrated into x86 SoC with Server-Class CPUs
Glenn Henry, Parviz Palangpour, Michael Thomson (Centaur Technology); J Scott Gardner (Advantage Engineering LLC); Bryce Arden, Kimble Houck, Jonathan Johnson, Kyle O'Brien, Scott Petersen, Benjamin Seroussi, Tyler Walker (Centaur Technology)

11:24 AM – 11:36 AM EDT
The IBM z15 High Frequency Mainframe Branch Predictor
Adam Collura, Anthony Saporito, James Bonanno, Brian R. Prasky, Narasimha Adiga, Matthias Heizmann (IBM)

11:36 AM – 11:48 AM EDT
Evolution of the Samsung Exynos CPU Microarchitecture
Brian Grayson (Samsung Austin Research Center/SiFive); Jeff Rupley, Gerald Zuraski, Jr., Eric Quinnell; Daniel A. Jiménez (Samsung Austin Research Center/Texas A&M); Tarun Nakra (Samsung Austin Research Center/AMD); Paul Kitchin (Samsung Austin Research Center/Nuvia); Ryan Hensley (Samsung Austin Research Center/Goodie); Edward Brekelbaum (Samsung Austin Research Center/SiFive); Vikas Sinha (Samsung Austin Research Center/Nuvia); Ankit Ghiya (Samsung Austin Research Center/ARM)

11:48 AM – 12:00 PM EDT
Xuantie-910: A Commercial Multi-Core 12-Stage Pipeline Out-of-Order 64-bit High Performance RISC-V Processor with Vector Extension
Chen Chen, Xiaoyan Xiang, Chang Liu, Yunhai Shang, Ren Guo, Dongqi Liu, Yimin Lu, Ziyi Hao, Jiahui Luo, Zhijian Chen, Chunqiang Li, Yu Pu, Jianyi Meng (Alibaba Cloud); Yuan Xie (Alibaba Group); Xiaoning Qi (Alibaba Cloud)

12:00 PM – 1:45 PM (EDT/New York)

9:00 AM (PDT/San Francisco),
18:00 (CEST/Brussels),
Tue 00:00 (CST/Beijing)
Session Chair: David Kaeli (Northeastern)
12:00 PM – 12:15 PM EDT
Divide and Conquer Frontend Bottleneck
Ali Ansari (Sharif); Pejman Lotfi-Kamran (IPM); Hamid Sarbazi-Azad (Sharif)

12:15 PM – 12:30 PM EDT
Focused Value Prediction
Sumeet Bandishte, Jayesh Gaur, Zeev Sperber, Lihu Rappoport, Adi Yoaz, Sreenivas Subramoney (Intel)

12:30 PM – 12:45 PM EDT
Auto-Predication of Critical Branches
Adarsh Chauhan, Jayesh Gaur, Zeev Sperber, Franck Sala, Lihu Rappoport, Adi Yoaz, Sreenivas Subramoney (Intel)

12:45 PM – 1:00 PM EDT
Slipstream Processors Revisited: Exploiting Branch Sets
Vinesh Srinivasan (NC State); Rangeen Basu Roy Chowdhury (Intel); Eric Rotenberg (NC State)

1:00 PM – 1:15 PM EDT
Bouquet of Instruction Pointers: Instruction Pointer Classifier-Based Spatial Hardware Prefetching
Samuel Pakalapati (Intel/BITS Pilani); Biswabandan Panda (IIT Kanpur)

1:15 PM – 1:30 PM EDT
MuonTrap: Preventing Cross-Domain Spectre-Like Attacks by Capturing Speculative State
Sam Ainsworth, Timothy Jones (Cambridge)

1:30 PM – 1:45 PM EDT
Think Fast: A Tensor Streaming Processor (TSP) for Accelerating Deep Learning Workloads
Dennis Abts, Jonathan Ross, Jon Sparling, Mark Wong-VanHaren, Max Baker, Tom Hawkins, Andrew Bell, John Thompson, Teme Kahsai, Garrin Kimmell, Jennifer Hwang, Rebekah Leslie-Hurd, Michael Bye, Rogan Creswick, Matthew Boyd, Mahitha Venigalla, Evan Laforge, Jon Purdy, Utham Kamath, Dinesh Maheshwari, Michael Beidler, Geert Rosseel, Omar Ahmad, Gleb Gagarin, Rick Czekalski, Ashay Rane, Sahil Parmar (Groq Inc.)
Session Chair: Krste Asanovic (UC Berkeley)
12:00 PM – 12:15 PM EDT
T4: Compiling Sequential Code for Effective Speculative Parallelization in Hardware
Victor A. Ying (MIT); Mark C. Jeffrey (Toronto); Daniel Sanchez (MIT)

12:15 PM – 12:30 PM EDT
Efficiently Supporting Dynamic Task-Parallelism on Heterogeneous Cache-Coherent Systems
Moyang Wang, Tuan Ta, Lin Cheng, Christopher Batten (Cornell)

12:30 PM – 12:45 PM EDT
Flick: Fast and Lightweight ISA-Crossing Call for Heterogeneous-ISA Environments
Shenghsun Cho, Han Chen, Sergey Madaminov, Michael Ferdman, Peter Milder (Stony Brook)

12:45 PM – 1:00 PM EDT
The NeBuLa RPC-Optimized Architecture
Mark Sutherland, Siddharth Gupta, Babak Falsafi (EcoCloud, EPFL); Virendra Marathe (Oracle); Dionisios Pnevmatikatos (NTU Athens/FORTH); Alexandros Daglis (Georgia Tech)

1:00 PM – 1:15 PM EDT
Printed Microprocessors
Nathaniel Bleier, Muhammad Husnain Mubarik (UIUC); Farhan Rasheed, Jasmin Aghassi-Hagmann, Mehdi B. Tahoori (KIT); Rakesh Kumar (UIUC)

1:15 PM – 1:30 PM EDT
SysScale: Exploiting Multi-Domain Dynamic Voltage and Frequency Scaling for Energy Efficient Mobile Processors
Jawad Haj-Yihia, Mohammed Alser (ETH Zürich); Nandita Vijaykumar (Intel/Toronto); Jeremie Kim, Giray Yaglikci (ETH Zürich); Efraim Rotem (Intel); Onur Mutlu (ETH Zürich/CMU)

1:30 PM – 1:45 PM EDT
Déjà View: Spatio-Temporal Compute Reuse for Energy-Efficient 360° VR Video Streaming
Shulin Zhao, Haibo Zhang, Sandeepa Bhuyan, Cyan Subhra Mishra, Ziyu Ying, Mahmut Taylan Kandemir, Anand Sivasubramaniam, Chita Das (Penn State)

1:45 PM – 2:15 PM (EDT/New York)

10:45 AM (PDT/San Francisco),
19:45 (CEST/Brussels),
Tue 01:45 (CST/Beijing)

This panel will discuss where security research in computer architecture is heading and how the field can have more impact going forward. For example: What are the major open problems and research directions in processor security? How can computer architects improve the security landscape? What is the role of hardware in improving system security? How can academic research communities work effectively with industry researchers?

The panel will include brief position statements by the participants, followed by a Q&A in which each panelist will respond to questions from the audience. The panelists will also have some questions prepared.

The last decade has seen a remarkable shift towards research on hardware accelerators — particularly domain-specific accelerators — both in academia and industry. The promise of accelerators is to shed the dead weight of traditional execution models and enable data orchestration, parallelism, and even arithmetic that is precisely tailored to application needs. Academic papers report multiple orders of magnitude improvement, and industry has seen a proliferation of hardware accelerators in software companies (TPU, Brainwave) and in numerous hardware startups.

Yet, due to a number of fundamental challenges, the future of this new wave of computing machines remains largely uncertain. For one, accelerators require unique hardware/software interfaces, so robust software stacks for them are time-consuming and difficult to create. This can make accelerator systems hard to program effectively in a way that keeps up with algorithm evolution. Accelerators are often narrow and not useful across broad domains, making them somewhat niche machines; to use them, algorithms must often be expressed inefficiently through limited interfaces. More broadly, many in the CS community outside of architecture simply give up on using today's accelerators (e.g., TPUs, GPU tensor cores) because of the complexities and uncertain benefits.

In this mini-panel, our panelists will discuss their views on these challenges and implications on the future of accelerator research. The panelists will also invite questions from the live audience.

8:00 PM – 9:45 PM (EDT/New York)

5:00 PM (PDT/San Francisco),
Tue 02:00 (CEST/Brussels),
Tue 08:00 (CST/Beijing)
Session Chair: Lizy K. John (UT Austin)
8:00 PM – 8:15 PM EDT
Genesis: A Hardware Acceleration Framework for Genomic Data Analysis
Tae Jun Ham (SNU); David Bruns-Smith, Brendan Sweeney (UC Berkeley); Yejin Lee, Seong Hoon Seo, U Gyeong Song (SNU); Young H. Oh (Sungkyunkwan); Krste Asanovic (UC Berkeley); Jae W. Lee (SNU); Lisa Wu Wills (Duke)

8:15 PM – 8:30 PM EDT
DSAGEN: Synthesizing Programmable Spatial Accelerators
Jian Weng, Sihao Liu, Zhengrong Wang, Vidushi Dadu (UCLA); Preyas Shah (SimpleMachines); Tony Nowatzki (UCLA)

8:30 PM – 8:45 PM EDT
Bonsai: High-Performance Adaptive Merge Tree Sorting
Nikola Samardzic, Weikang Qiao, Vaibhav Aggarwal, Mau-Chung Frank Chang, Jason Cong (UCLA)

8:45 PM – 9:00 PM EDT
SOFF: An OpenCL High-Level Synthesis Framework for FPGAs
Gangwon Jo, Heehoon Kim, Jeesoo Lee, Jaejin Lee (SNU)

9:00 PM – 9:15 PM EDT
Gorgon: Accelerating Machine Learning from Relational Data
Matthew Vilim, Alex Rucker, Yaqi Zhang, Sophia Liu, Kunle Olukotun (Stanford)

9:15 PM – 9:30 PM EDT
A Specialized Architecture for Object Serialization with Applications to Big Data Analytics
Jaeyoung Jang (Sungkyunkwan); Sung Jun Jung, Sunmin Jeong, Jun Heo, Hoon Shin, Tae Jun Ham, Jae W. Lee (SNU)

9:30 PM – 9:45 PM EDT
CryoCore: A Fast and Dense Processor Architecture for Cryogenic Computing
Il-Kwon Byun, Dongmoon Min, Gyu-Hyeon Lee, Seongmin Na, Jangwoo Kim (SNU)
Session Chair: Tim Sherwood (UCSB)
8:00 PM – 8:15 PM EDT
SpinalFlow: An Architecture and Dataflow Tailored for Spiking Neural Networks
Surya Narayanan, Karl Taht, Rajeev Balasubramonian, Edouard Giacomin, Pierre-Emmanuel Gaillardon (Utah)

8:15 PM – 8:30 PM EDT
NEBULA: A Neuromorphic Spin-Based Ultra-Low Power Architecture for SNNs and ANNs
Sonali Singh, Anup Sarma, Nicholas Jao (Penn State); Ashutosh Pattnaik (Penn State/Arm); Sen Lu, Kezhou Yang, Abhronil Sengupta, Vijaykrishnan Narayanan, Chita Das (Penn State)

8:30 PM – 8:45 PM EDT
uGEMM: Unary Computing Architecture for GEMM Applications
Di Wu, Jingjie Li, Ruokai Yin (Wisconsin); Hsuan Hsiao (Toronto); Younghyun Kim, Joshua San Miguel (Wisconsin)

8:45 PM – 9:00 PM EDT
Hardware-Software Co-Design for Brain-Computer Interfaces
Ioannis Karageorgos, Karthik Sriram (Yale); Jan Vesely, Michael Wu (Rutgers); Marc Powell, David Borton (Brown); Rajit Manohar, Abhishek Bhattacharjee (Yale)

9:00 PM – 9:15 PM EDT
Heat to Power: Thermal Energy Harvesting and Recycling for Warm Water-Cooled Datacenters
Xinhui Zhu, Weixiang Jiang, Fangming Liu, Qixia Zhang, Li Pan, Qiong Chen, Ziyang Jia (HUST)

9:15 PM – 9:30 PM EDT
GraphABCD: Scaling Out Graph Analytics with Asynchronous Block Coordinate Descent
Yifan Yang (Tsinghua/MIT); Zhaoshi Li, Yangdong Deng, Zhiwei Liu, Shouyi Yin, Shaojun Wei, Leibo Liu (Tsinghua)

9:30 PM – 9:45 PM EDT
GaaS-X: Graph Analytics Accelerator Supporting Sparse Data Representation Using Crossbar Architectures
Nagadastagiri Reddy Challapalle, Sahithi Rampalli (Penn State); Linghao Song (Duke); Nandhini Chandramoorthy, Karthik Swaminathan (IBM Research); Jack Sampson (Penn State); Yiran Chen (Duke); Vijaykrishnan Narayanan (Penn State)

9:45 PM – 10:15 PM (EDT/New York)

6:45 PM (PDT/San Francisco),
Tue 03:45 (CEST/Brussels),
Tue 09:45 (CST/Beijing)

This panel will discuss AI, ML, and how the architecture community interacts with those fields. We'll consider a number of questions relating to work going on in industry and academia, how the two differ, and how much algorithms and architecture will change what we need to be doing, such as:

  • Is academic architecture spending too much time chasing "the latest layer" (convolution, now recommenders) and not enough time looking for more general solutions?
  • It looks like FB and Google have bet big on ASICs, while Microsoft has chosen an FPGA path. Why is this, and how is it working out for you?
  • Why hasn't Google built a search accelerator?
  • It feels like neural network acceleration at the edge has mostly been neglected (sure, there are Edge TPUs). Why is that?

In the October 2019 edition of The Economist, an article about open-source computing stated: "Open-source software was a prerequisite for the smartphone boom that has taken place over the past decade. Open-source hardware, such as RISC-V, may lead to a similar expansion of other devices in the decade to come." From a technological perspective, what challenges and opportunities will architects encounter on the way to the goal envisioned by The Economist? In this mini-panel, the panelists will share their thoughts on these topics.


Day 2: Tuesday, June 2

10:00 AM – 11:00 AM (EDT/New York): Keynote by Valeria Bertacco

7:00 AM (PDT/San Francisco),
16:00 (CEST/Brussels),
22:00 (CST/Beijing)

Abstract
The pace of innovation in computing systems, and thus in the electronics industry, is being impacted by slowing trends in traditional device scaling and by skyrocketing costs of design and manufacturing. The high level of engineering specialization and the high cost barriers to entering the electronics market have shrunk the number of startups and companies that can participate in the design of novel computing systems. In this talk, I will argue that the key to reigniting innovation in computing systems design is to enable participation by a broader engineering community. To attain this goal, it is necessary to transform current design and creative processes with a meaningful level of automation in the front-end stages of the design process. In this context, domain-specific languages, or DSLs, enable broader participation in the building of efficient and effective applications. Coupled with a turnkey, end-to-end ecosystem of flexible hardware accelerators, which can serve as highly efficient compiler targets for applications, it becomes possible for small teams to realize big ideas. The talk will share perspectives from a large research center focused on these initiatives. The next step to catalyze this transformation will be to automate the design of new hardware accelerators, to quickly address emerging applications with domain-efficient computation.


Bio
Valeria Bertacco is Thurnau Professor of Electrical Engineering and Computer Science at the University of Michigan, and Adjunct Professor of Computer Engineering at the Addis Ababa Institute of Technology. Her research interests are in the area of computer design, with emphasis on specialized architecture solutions and design viability, in particular reliability, validation, and hardware-security assurance. She joined the University of Michigan in 2003, after working with the Advanced Technology Group of Synopsys, which she joined via the acquisition of Systems Science Inc. When the world is not engulfed in a pandemic, Valeria enjoys traveling the world to find new and exciting adventures and collaborations.

Valeria is the Director of the Applications Driving Architectures (ADA) Research Center, whose goal is to reignite computing systems design and innovation for the 2030s and 2040s through specialized heterogeneity, domain-specific language abstractions, and new silicon devices that show benefit to applications. The Center engages 21 faculty members and 130 students from 10 academic institutions in the United States. During her career, she has served as Program Chair of the Design Automation Conference (DAC), Track Chair of the Design Automation and Test in Europe (DATE) conference, and as Associate Editor of the IEEE Transactions on Computer-Aided Design. Valeria is the recipient of the IEEE CEDA Early Career Award, the NSF CAREER award, the Air Force Office of Scientific Research's Young Investigator award, and the IBM Faculty Award. From the University of Michigan, she has received the Vulcans Education Excellence Award, the Herbert Kopf Service Excellence Award, the Sarah Goddard Power Award for contributions to the betterment of women, the Rackham Faculty Recognition Award, and the Harold Johnson Diversity Service Award. Valeria is an ACM Distinguished Scientist and an IEEE Fellow.

11:00 AM – 12:00 PM (EDT/New York): SIGARCH/TCCA Business Meeting

8:00 AM (PDT/San Francisco),
17:00 (CEST/Brussels),
23:00 (CST/Beijing)

12:00 PM – 1:30 PM (EDT/New York)

9:00 AM (PDT/San Francisco),
18:00 (CEST/Brussels),
Wed 00:00 (CST/Beijing)
Session Chair: Jayneel Gandhi (VMware)
12:00 PM – 12:15 PM EDT
MLPerf Inference Benchmark
Vijay Janapa Reddi (Harvard/UT Austin); Christine Cheng (Intel); David Kanter (Real World Technologies); Peter Mattson (Google); Guenther Schmuelling (Microsoft); Carole-Jean Wu (ASU/Facebook); Brian Anderson (Google); Maximilien Breughe (NVIDIA); Mark Charlebois, William Chou (Qualcomm); Ramesh Chukka (Intel); Cody Coleman (Stanford); Sam Davis (Myrtle); Pan Deng (Tencent); Greg Diamos (Landing AI); Jared Duke (Google); Dave Fick (Mythic); J. Scott Gardner (Advantage Engineering); Itay Hubara (Habana Labs); Sachin Idgunji (NVIDIA); Thomas B. Jablin (Google); Jeff Jiao (Alibaba T-Head); Tom St. John (Tesla); Pankaj Kanwar (Google); David Lee (MediaTek); Jeffery Liao (Synopsys); Anton Lokhmotov (Dividiti); Francisco Massa (Facebook); Peng Meng (Tencent); Paulius Micikevicius (NVIDIA); Colin Osborne (Arm); Gennady Pekhimenko (Toronto); Arun Tejusve Raghunath Rajan (Intel); Dilip Sequeira (NVIDIA); Ashish Sirasao (Xilinx); Fei Sun (Alibaba); Hanlin Tang (Intel); Michael Thomson (Centaur Technology); Frank Wei (Alibaba Cloud); Ephrem Wu (Xilinx); Lingjie Xu (Alibaba Cloud); Koichi Yamada (Intel); Bing Yu (MediaTek); George Yuan (NVIDIA); Aaron Zhong (Alibaba T-Head); Peizhao Zhang (Facebook); Yuchen Zhou (General Motors)

12:15 PM – 12:30 PM EDT
Mocktails: Capturing the Memory Behaviour of Proprietary Mobile Architectures
Mario Badr (Toronto); Carlo Delconte (Arm); Isak Edo (Toronto); Radhika Jagtap, Matteo Andreozzi (Arm); Natalie Enright Jerger (Toronto)

12:30 PM – 12:45 PM EDT
Accel-Sim: An Extensible Simulation Framework for Validated GPU Modeling
Mahmoud Khairy, Zhesheng Shen (Purdue); Tor M. Aamodt (UBC); Timothy G. Rogers (Purdue)

12:45 PM – 1:00 PM EDT
HyperTRIO: Hyper-Tenant Translation of I/O Addresses
Alexey Lavrov, David Wentzlaff (Princeton)

1:00 PM – 1:15 PM EDT
BabelFish: Fusing Address Translations for Containers
Dimitrios Skarlatos (UIUC); Umur Darbaz, Bhargava Gopireddy (NVIDIA/UIUC); Nam Sung Kim (Samsung/UIUC); Josep Torrellas (UIUC)

1:15 PM – 1:30 PM EDT
Enhancing and Exploiting Contiguity for Fast Memory Virtualization
Chloe Alverti, Stratos Psomadakis, Vasileios Karakostas (NTU Athens); Jayneel Gandhi (VMware Research); Konstantinos Nikas, Georgios Goumas, Nectarios Koziris (NTU Athens)
Session Chair: Moin Qureshi (Georgia Tech)
12:00 PM – 12:15 PM EDT
Revisiting RowHammer: An Experimental Analysis of Modern Devices and Mitigation Techniques
Jeremie S. Kim (CMU/ETH Zürich); Minesh Patel, Abdullah Giray Yaglikci, Hasan Hassan, Roknoddin Azizi, Lois Orosa (ETH Zürich); Onur Mutlu (ETH Zürich/CMU)

12:15 PM – 12:30 PM EDT
CLR-DRAM: A Low-Cost DRAM Architecture Enabling Dynamic Capacity-Latency Trade-Off
Haocong Luo, Taha Shahroodi, Abdullah Giray Yaglikci, Hasan Hassan, Minesh Patel, Lois Orosa, Jisung Park, Onur Mutlu (ETH Zürich)

12:30 PM – 12:45 PM EDT
Architecting Noisy Intermediate-Scale Trapped Ion Quantum Computers
Prakash Murali (Princeton); Dripto M. Debroy, Kenneth R. Brown (Duke); Margaret R. Martonosi (Princeton)

12:45 PM – 1:00 PM EDT
AccQOC: Accelerating Quantum Optimal Control Based Pulse Generation
Jinglei Cheng, Haoqing Deng, Xuehai Qian (USC)

1:00 PM – 1:15 PM EDT
NISQ+: Boosting Computational Power of Quantum Computers by Approximating Quantum Error Correction
Adam Holmes, Mohammad Reza Jokar (UChicago); Ghasem Pasandi (USC); Yongshan Ding (UChicago); Massoud Pedram (USC); Fred Chong (UChicago)

1:15 PM – 1:30 PM EDT
SQUARE: Strategic Quantum Ancilla Reuse for Modular Quantum Programs via Cost-Effective Uncomputation
Yongshan Ding, Xin-Chuan Wu, Adam Holmes, Ash Wiseth, Diana Franklin (UChicago); Margaret Martonosi (Princeton); Fred Chong (UChicago)

1:30 PM – 2:00 PM (EDT/New York)

10:30 AM (PDT/San Francisco),
19:30 (CEST/Brussels),
Wed 01:00 (CST/Beijing)
The memory system is screaming for more attention. To cater to emerging demands from technology, workloads, and society, substantial new features are required in memory system hardware and memory management software, across a spectrum of devices/platforms. Several open questions remain:
  • Given the trend towards architectural specialization, what will memory systems look like five/ten years hence? Scratchpads? Shared or private? Feature-rich memories? 3D-stacked memory?
  • What memory features are worthwhile?
  • How are these features best implemented?
  • Are feature-rich memory ISAs required?
  • Should academics be concerned with industry cost/pricing incentives?
  • Should memory chips be left alone and should features be placed on DIMMs? Or vice-versa?
  • What memory security features are practical/worthwhile?
  • What is the level of programmability and virtualization that should be supported by new memory technologies?
  • What is an ideal virtual memory abstraction/implementation when dealing with massive memory capacity and several accelerators?

Single-thread performance was, is, and will remain the major challenge of the processor industry. Modern microprocessor design faces many issues: increasing data and instruction footprints, increasingly diverse workload characteristics, and diminishing returns from instruction-level parallelism are top concerns. What is on the horizon for core microarchitecture? Recent research advances in instruction and data prefetching, predictive cache replacement, value prediction, and other techniques continue to improve microarchitecture design. In this coffee-break panel, the panelists will discuss the future of microarchitecture in industry and research.

8:00 PM – 9:45 PM (EDT/New York)

5:00 PM (PDT/San Francisco),
Wed 02:00 (CEST/Brussels),
Wed 08:00 (CST/Beijing)
Session Chair: Jishen Zhao (UCSD)
8:00 PM – 8:15 PM EDT
HOOP: Efficient Hardware-Assisted Out-of-Place Update for Non-Volatile Memory
Miao Cai (Nanjing); Chance Coats, Jian Huang (UIUC)

8:15 PM – 8:30 PM EDT
Lelantus: Fine-Granularity Copy-on-Write Operations for Secure Non-Volatile Memories
Jian Zhou, Amro Awad, Jun Wang (UCF)

8:30 PM – 8:45 PM EDT
MorLog: Morphable Hardware Logging for Atomic Persistence in Non-Volatile Main Memory
Xueliang Wei, Dan Feng, Wei Tong, Jingning Liu, Liuqing Ye (HUST)

8:45 PM – 9:00 PM EDT
Tvarak: Software-Managed Hardware Offload for Redundancy in Direct-Access NVM Storage
Rajat Kateja, Nathan Beckmann, Gregory R. Ganger (CMU)

9:00 PM – 9:15 PM EDT
Relaxed Persist Ordering Using Strand Persistency
Vaibhav Gogte (Michigan); William Wang, Stephan Diestelhorst (ARM); Peter M. Chen, Satish Narayanasamy, Thomas F. Wenisch (Michigan)

9:15 PM – 9:30 PM EDT
Hardware-Based Domain Virtualization for Intra-Process Isolation of Persistent Memory Objects
Yuanchao Xu (NC State); Chencheng Ye (HUST); Yan Solihin (UCF); Xipeng Shen (NC State)

9:30 PM – 9:45 PM EDT
Check-In: In-Storage Checkpointing for Key-Value Store System Leveraging Flash-Based SSDs
Joohyeong Yoon, Won Seob Jeong, Won Woo Ro (Yonsei)
Session Chair: Jakub Szefer (Yale)
8:00 PM – 8:15 PM EDT
Speculative Data-Oblivious Execution: Mobilizing Safe Prediction for Safe and Efficient Speculative Execution
Jiyong Yu, Namrata Mantri, Josep Torrellas (UIUC); Adam Morrison (Tel Aviv); Christopher W. Fletcher (UIUC)

8:15 PM – 8:30 PM EDT
Packet Chasing: Observing Network Packets over a Cache Side-Channel
Mohammadkazem Taram (UCSD); Ashish Venkat (UVa); Dean Tullsen (UCSD)

8:30 PM – 8:45 PM EDT
Compact Leakage-Free Support for Integrity and Reliability
Meysam Taassori, Rajeev Balasubramonian (Utah); Siddhartha Chhabra, Alaa Alameldeen, Manjula Peddireddy, Rajat Agarwal (Intel); Ryan Stutsman (Utah)

8:45 PM – 9:00 PM EDT
A Bus Authentication and Anti-Probing Architecture Extending Hardware Trusted Computing Base Off CPU Chips and Beyond
Zhenyu Xu, Thomas Mauldin, Zheyi Yao, Shuyi Pei, Tao Wei, Qing Yang (URI)

9:00 PM – 9:15 PM EDT
CHEx86: Context-Sensitive Enforcement of Memory Safety via Microcode-Enabled Capabilities
Rasool Sharifi, Ashish Venkat (UVa)

9:15 PM – 9:30 PM EDT
Nested Enclave: Supporting Fine-Grained Hierarchical Isolation with SGX
Joongun Park, Naegyeong Kang, Taehoon Kim, Youngjin Kwon, Jaehyuk Huh (KAIST)

9:45 PM – 10:15 PM (EDT/New York)

6:45 PM (PDT/San Francisco),
Wed 03:45 (CEST/Brussels),
Wed 09:45 (CST/Beijing)

People increasingly rely on data center services to carry out their day-to-day tasks, ranging from business transactions and democratizing information to education, entertainment, and life experiences. The sheer amount of data moved through and computed on by networks, processors, and storage each day is staggering. This panel aims to spark a thought-provoking discussion on the latest trends and novel research directions in data center hardware and system architectures. The panelists plan to touch on several of the following topics:

  • How to carry out data center research in an academic environment?
  • Simulation-based research vs. real-world experiments?
  • Ecologically responsible data center design for mitigating carbon footprint?
  • How does one make sense of the increasing number of cloud hardware accelerators?
  • Does the end of Moore's Law matter when you can always scale out your data center?
  • Privacy and security strategies for data center operations beyond MVM acceleration

In this mini-panel, each panelist will present a short overview of the challenges and research directions for future data center architectures and services, and then we'll open the floor to questions from the audience.

The current state of quantum computing is often compared to classical computing in the 1950s, when the fundamental building blocks existed but the full-fledged system stack did not. Another camp argues that quantum technology is much closer to classical hardware of 1938, when it was unknown which technologies would prevail or how they would be integrated to construct useful machines. In either case, we may regard the development of classical computers decades ago as a roadmap for the development of quantum computers today. While quantum versions are inherently different and appear to pose greater challenges, knowledge gained from their classical counterparts can prove useful.

Given the challenges induced by quantum noise and ongoing attempts to demonstrate quantum supremacy/advantage, classical concepts for abstraction, quantitative analysis (using performance, accuracy, and other metrics), and programmability must be assessed for their adaptability to quantum systems. Who can help more with this than computer architects, and how? This question forms the focus of our panel.



Day 3: Wednesday, June 3

10:00 AM – 11:00 AM (EDT/New York): Keynote by Li-Shiuan Peh

7:00 AM (PDT/San Francisco),
16:00 (CEST/Brussels),
22:00 (CST/Beijing)

Abstract
Wearables will be as pervasive as phones. Yet, current wearables such as smartwatches and fitness trackers have fairly limited functionality. What lies ahead for wearables? In this talk, I will first take a look at the characteristics of wearable applications of the future, and discuss the implications on the architecture of next-generation wearable chips. I will then walk through recent research into wearable chip architectures and prototypes. Finally, the talk will round off with a glimpse into next-generation application scenarios for wearables.


Bio
Li-Shiuan Peh returned home to join the National University of Singapore as Provost's Chair Professor in the Department of Computer Science, with a courtesy appointment in the Department of Electrical and Computer Engineering, in September 2016. Previously, she was Professor of Electrical Engineering and Computer Science at MIT, where she had been on the faculty since 2009. She was also the Associate Director for Outreach of the Singapore-MIT Alliance for Research & Technology (SMART) from 2015 to 2016. Prior to MIT, she was on the faculty of Electrical Engineering at Princeton University from 2002. She graduated with a Ph.D. in Computer Science from Stanford University in 2001 and a B.S. in Computer Science from the National University of Singapore in 1995. Her research focuses on networked computing, from chips to systems. Several of her papers have received test-of-time, best-paper, and top-picks awards at computer architecture, design automation, and systems conferences. She is an IEEE Fellow.

11:00 AM – 12:00 PM (EDT/New York): Awards Ceremony

8:00 AM (PDT/San Francisco),
17:00 (CEST/Brussels),
23:00 (CST/Beijing)

12:00 PM – 1:30 PM (EDT/New York)

9:00 AM (PDT/San Francisco),
18:00 (CEST/Brussels),
Thu 00:00 (CST/Beijing)
Session Chair: Hyesoon Kim (Georgia Tech)
12:00 PM – 12:15 PM EDT
RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing
Liu Ke (Facebook/WUSTL); Udit Gupta (Facebook/Harvard); Benjamin Y. Cho (Facebook/UT Austin); David Brooks (Facebook/Harvard); Vikas Chandra, Utku Diril, Amin Firoozshahian, Kim Hazelwood, Bill Jia, Hsien-Hsin S. Lee, Meng Li, Bert Maher, Dheevatsa Mudigere, Maxim Naumov, Martin Schatz, Mikhail Smelyanskiy, Xiaodong Wang, Brandon Reagen (Facebook); Carole-Jean Wu (ASU/Facebook); Mark Hempstead (Facebook/Tufts); Xuan Zhang (Facebook/WUSTL)

12:15 PM – 12:30 PM EDT
iPIM: Programmable In-Memory Image Processing Accelerator Using Near-Bank Architecture
Peng Gu, Xinfeng Xie, Yufei Ding (UCSB); Guoyang Chen, Weifeng Zhang, Dimin Niu (Alibaba); Yuan Xie (UCSB)

12:30 PM – 12:45 PM EDT
Near Data Acceleration with Concurrent Host Access
Benjamin Cho, Yongkee Kwon, Sangkug Lym, Mattan Erez (UT Austin)

12:45 PM – 1:00 PM EDT
TIMELY: Pushing Data Movements and Interfaces in PIM Accelerators Towards Local and in Time Domain
Weitao Li, Pengfei Xu (Rice); Yang Zhao (UCSB/Rice); Haitong Li (Stanford); Yuan Xie (UCSB); Yingyan Lin (Rice)

1:00 PM – 1:15 PM EDT
Hyper-AP: Enhancing Associative Processing Through a Full-Stack Optimization
Yue Zha (UPenn); Jing Li (UPenn)

1:15 PM – 1:30 PM EDT
JPEG-ACT: Accelerating Deep Learning via Transform-Based Lossy Compression
R. David Evans, Lufei Liu, Tor M. Aamodt (UBC)

Session Chair: Abhishek Bhattacharjee (Yale)
12:00 PM – 12:15 PM EDT
TransForm: Formally Specifying Transistency Models and Synthesizing Enhanced Litmus Tests
Naorin Hossain (Princeton); Caroline Trippel (Stanford/Facebook); Margaret Martonosi (Princeton)

12:15 PM – 12:30 PM EDT
HieraGen: Automated Generation of Concurrent, Hierarchical Cache Coherence Protocols
Nicolai Oswald, Vijay Nagarajan (Edinburgh); Daniel J. Sorin (Duke)

12:30 PM – 12:45 PM EDT
Tailored Page Sizes
Faruk Guvenilir (UT Austin/Microsoft); Yale Patt (UT Austin)

12:45 PM – 1:00 PM EDT
Perforated Page: Supporting Fragmented Memory Allocation for Large Pages
Chang Hyun Park (Uppsala); Sanghoon Cha, Bokyeong Kim, Youngjin Kwon (KAIST); David Black-Schaffer (Uppsala); Jaehyuk Huh (KAIST)

1:00 PM – 1:15 PM EDT
The Virtual Block Interface: A Flexible Alternative to the Conventional Virtual Memory Framework
Nastaran Hajinazar (SFU/ETH Zürich); Pratyush Patel (Washington); Minesh Patel, Konstantinos Kanellopoulos (ETH Zürich); Saugata Ghose (CMU); Rachata Ausavarungnirun (KMUTNB); Geraldo Francisco de Oliveira Junior (ETH Zürich); Jonathan Appavoo (Boston); Vivek Seshadri (Microsoft Research India); Onur Mutlu (ETH Zürich)

1:15 PM – 1:30 PM EDT
Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs
Esha Choukse (Microsoft); Michael Sullivan (NVIDIA); Mike O'Connor (NVIDIA/UT Austin); Mattan Erez (UT Austin); Jeff Pool, David Nellans, Stephen W. Keckler (NVIDIA)

1:30 PM – 2:00 PM (EDT/New York)

10:30 AM (PDT/San Francisco),
19:30 (CEST/Brussels),
Thu 01:30 (CST/Beijing)
Memory-centric computing has evolved over the years. From the late 1960s to now, various near-memory processing architectures have been proposed in the form of logic units tightly coupled to separate memory structures. More recently, a growing body of work seeks to use or re-purpose memory structures to perform truly in-memory computation. Memory technology has also evolved to integrate logic more tightly with memory and to enable in-memory computation inside the memory arrays of DRAM, SRAM, and various emerging technologies. Proof-of-concept designs have demonstrated the benefits of such approaches, and several startups have been formed to pursue the potential opportunities. Still, near-memory and in-memory computing systems have yet to succeed in the commercial space. This panel aims to host thought-provoking discussions on the opportunities and challenges of near-memory and in-memory computing architectures. The panelists will debate issues ranging from research to production, as well as general adoption by software and programmers.

Non-volatile memories (NVRAMs) have been envisioned as a new tier of memory in computer systems, offering performance comparable to DRAM along with the durability of storage devices. Scalable server-grade NVRAM DIMMs recently became commercially available with the release of Intel's Optane DC Persistent Memory. Recognizing their value, server hardware and software suppliers have begun to exploit NVRAMs in their new-generation designs. This mini-panel will feature NVRAM experts from both industry and academia to discuss and share practical challenges and future directions of NVRAM research with the ISCA community.

8:00 PM – 9:30 PM (EDT/New York)

5:00 PM (PDT/San Francisco),
Thu 02:00 (CEST/Brussels),
Thu 08:00 (CST/Beijing)
Session Chair: Tushar Krishna (Georgia Tech)
8:00 PM – 8:15 PM EDT
A Multi-Neural Network Acceleration Architecture
Eunjin Baek, Dongup Kwon, Jangwoo Kim (SNU)

8:15 PM – 8:30 PM EDT
SmartExchange: Trading Higher-Cost Memory Storage/Access for Lower-Cost Computation
Yang Zhao (Rice); Xiaohan Chen (Texas A&M); Yue Wang, Chaojian Li (Rice); Yuan Xie (UCSB); Zhangyang Wang (Texas A&M); Yingyan Lin (Rice)

8:30 PM – 8:45 PM EDT
Centaur: A Chiplet-Based, Hybrid Sparse-Dense Accelerator for Personalized Recommendations
Ranggi Hwang, Taehun Kim, Youngeun Kwon, Minsoo Rhu (KAIST)

8:45 PM – 9:00 PM EDT
DeepRecSys: A System for Optimizing End-to-End At-Scale Neural Recommendation Inference
Udit Gupta (Facebook/Harvard); Samuel Hsia (Harvard); Vikram Saraph, Xiaodong Wang, Brandon Reagen (Facebook); Gu-Yeon Wei (Harvard); Hsien-Hsin S. Lee (Facebook); David Brooks (Facebook/Harvard); Carole-Jean Wu (ASU/Facebook)

9:00 PM – 9:15 PM EDT
An In-Network Architecture for Accelerating Shared-Memory Multiprocessor Collectives
Benjamin Klenk, Nan Jiang, Greg Thorson, Larry Dennison (NVIDIA)

9:15 PM – 9:30 PM EDT
DRQ: Dynamic Region-Based Quantization for Deep Neural Network Acceleration
Zhuoran Song, Bangqi Fu, Feiyang Wu, Zhaoming Jiang, Li Jiang, Naifeng Jing, Xiaoyao Liang (SJTU)

Session Chair: Minsoo Rhu (KAIST)
8:00 PM – 8:15 PM EDT
Independent Forward Progress of Work-Groups
Alexandru Dutu (AMD Research); Matthew Sinclair (Wisconsin/AMD Research); Bradford M. Beckmann (AMD); David A. Wood (AMD Research/Wisconsin); Marcus Chow (UC Riverside/AMD Research)

8:15 PM – 8:30 PM EDT
ScoRD: A Scoped Race Detector for GPUs
Aditya K Kamath, Alvin George A, Arkaprava Basu (IISc-Bangalore)

8:30 PM – 8:45 PM EDT
ZnG: Architecting GPU Multi-Processors with New Flash for Scalable Data Analysis
Jie Zhang, Myoungsoo Jung (KAIST)

8:45 PM – 9:00 PM EDT
Commutative Data Reordering: A New Technique to Reduce Data Movement Energy on Sparse Inference Workloads
Ben Feinberg (Sandia); Benjamin C. Heyman, Darya Mikhailenko, Ryan Wong, An Ho, Engin Ipek (Rochester)

9:00 PM – 9:15 PM EDT
Echo: Compiler-Based GPU Memory Footprint Reduction for LSTM RNN Training
Bojian Zheng (Toronto); Nandita Vijaykumar (Intel/Toronto); Gennady Pekhimenko (Toronto)

9:15 PM – 9:30 PM EDT
A Case for Hardware-Based Demand Paging
Gyusun Lee (Sungkyunkwan); Wenjing Jin (SNU); Wonsuk Song (Sungkyunkwan); Jeonghun Gong, Jonghyun Bae, Tae Jun Ham, Jae W. Lee (SNU); Jinkyu Jeong (Sungkyunkwan)

9:30 PM – 10:00 PM (EDT/New York)

6:30 PM (PDT/San Francisco),
Thu 03:30 (CEST/Brussels),
Thu 09:30 (CST/Beijing)
The demand for intelligent applications running on a diverse range of mobile, embedded, and IoT platforms, such as micro-robots, AR/VR headsets, and energy-harvesting smart sensors, shows no sign of slowing down. What are the unique traits of the intelligent applications on the horizon, and are today's computer systems and architectures ready for them? Are we simply designing architectures to support existing applications, and if so, how can we, as architects, help truly unlock software innovation and enable applications that today can only be imagined?
GPUs have become the accelerator of choice in a number of demanding application domains, ranging from traditional high-performance computing to emerging workloads such as machine learning and IoT. Still, the programmability of these devices leaves much to be desired: programmers need knowledge of the underlying hardware to achieve peak performance, and even then this performance tuning is not portable across device generations. Multi-core CPUs, by contrast, have achieved scalable performance by leveraging a common ISA and a shared-memory model. Is it time to consider coherent shared memory for CPU–GPU heterogeneous and multi-GPU platforms? If we are able to deliver such a model, what new applications might emerge?