AVS 65th International Symposium & Exhibition
    Manufacturing Science and Technology Group Tuesday Sessions
       Session MS+MI+RM-TuM

Invited Paper MS+MI+RM-TuM10
Computation Immersed in Memory: Integrating 3D vertical RRAM in the N3XT Architecture

Tuesday, October 23, 2018, 11:00 am, Room 202B

Session: IoT Session: Challenges of Neuromorphic Computing and Memristor Manufacturing (8:00-10:00 am)/Federal Funding Opportunities (11:40 am-12:20 pm)
Presenter: Weier Wan, Stanford University
Authors: W. Wan, Stanford University
W. Hwang, Stanford University
H. Li, Stanford University
T.F. Wu, Stanford University
Y.H. Malviya, Stanford University
M.M.S. Aly, Nanyang Technological University, Singapore
S. Mitra, Stanford University
H.-S.P. Wong, Stanford University

The rise of data-abundant computing, in which massive amounts of data are processed in applications such as machine learning, computer vision, and natural language processing, demands highly energy-efficient computing systems. However, the limited connectivity between separate logic and memory chips in conventional 2D systems means that the majority of program execution time and energy is spent on memory access. The Nano-Engineered Computing Systems Technology (N3XT) [1] approach overcomes these memory bottlenecks by monolithically integrating interleaved layers of memory and logic on the same chip and by leveraging nanoscale interlayer vias (ILVs) to provide ultra-dense connectivity between logic and memory.

Metal oxide resistive switching memory (RRAM) [2] offers non-volatility, good scalability, and compatibility with monolithic 3D integration, making it a good candidate for on-chip, high-capacity main memory and storage in the N3XT system. Our experimentally calibrated studies show that an N3XT system with RRAM as digital storage and carbon nanotube field-effect transistors (CNFETs) as logic devices could achieve a 2-3 orders of magnitude improvement in energy efficiency (measured as the product of execution time and energy) across a wide range of applications (e.g., PageRank, deep neural network inference) compared to a conventional 2D system. Such a 3D nanosystem has also been experimentally demonstrated, with RRAM, CNFETs, and CMOS monolithically integrated to perform in-situ ambient gas classification [3] and hyper-dimensional computing [4].
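
As a rough illustration of the energy-efficiency metric named above (the product of execution time and energy, often called the energy-delay product), the following Python sketch compares two systems. The function names and all numbers are hypothetical and chosen only for the arithmetic; they are not results from this work.

```python
def energy_delay_product(energy_joules: float, time_seconds: float) -> float:
    """Product of energy and execution time; lower is better."""
    return energy_joules * time_seconds


def edp_improvement(baseline: tuple, candidate: tuple) -> float:
    """Ratio of baseline EDP to candidate EDP.

    Each argument is an (energy_joules, time_seconds) pair.
    """
    return energy_delay_product(*baseline) / energy_delay_product(*candidate)


# Hypothetical values for illustration only (not measured data).
baseline_2d = (10.0, 2.0)     # 10 J consumed over 2 s
candidate_3d = (0.5, 0.125)   # 20x less energy, 16x less time

ratio = edp_improvement(baseline_2d, candidate_3d)
print(f"EDP improvement: {ratio:.0f}x")  # prints "EDP improvement: 320x"
```

Because the metric multiplies the two quantities, a 20x energy reduction combined with a 16x speedup already yields a 320x gain in this toy case, which is how simultaneous savings in time and energy can compound into multiple orders of magnitude.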

Besides offering substantial benefits for conventional digital systems, the monolithic integration of RRAM and logic devices also enables “in-memory computing”, where computation is performed in the memory itself without explicitly moving data between memory and logic. Various types of in-memory computing operations can be performed using RRAM arrays, including analog multiply-accumulate and bit-wise logical operations. We perform system-level modeling that covers program scheduling, communication and routing, and the design of the memory array and its peripheral circuits for these operations, in order to study their benefits and bottlenecks at the application level. In particular, we analyze in-memory vector-matrix multiplication for deep neural network inference and bit-wise operations in 3D vertical RRAM for hyper-dimensional computing. We show that, with algorithm-architecture co-design, RRAM-based in-memory computing could further improve energy and area efficiency compared to a digital implementation in a 3D monolithically integrated system.
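
The two kinds of in-memory operation mentioned above can be sketched functionally in Python. This is a minimal, idealized model and not the authors' system model: the crossbar function assumes ideal cells and wires (no wire resistance, device nonlinearity, or noise), and the choice of XOR binding, majority bundling, and Hamming distance as the bit-wise operations for hyper-dimensional computing is an assumption for illustration. All function names are hypothetical.

```python
from typing import List


def crossbar_vmm(voltages: List[float],
                 conductances: List[List[float]]) -> List[float]:
    """Idealized analog multiply-accumulate on a resistive crossbar.

    voltages[i] is applied to row i; conductances[i][j] is the cell at
    row i, column j. Ohm's law gives each cell current V_i * G_ij, and
    Kirchhoff's current law sums those currents along column j.
    """
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def bind(a: int, b: int) -> int:
    """Bit-wise XOR of two binary hypervectors packed into Python ints."""
    return a ^ b


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hypervectors differ."""
    return bin(a ^ b).count("1")


def majority(vectors: List[int], width: int) -> int:
    """Bit-wise majority vote over an odd number of hypervectors."""
    out = 0
    for bit in range(width):
        ones = sum((v >> bit) & 1 for v in vectors)
        if 2 * ones > len(vectors):
            out |= 1 << bit
    return out


# Hypothetical inputs in arbitrary units, for illustration only.
currents = crossbar_vmm([1.0, 0.5], [[2.0, 4.0], [8.0, 2.0]])
print(currents)                                   # [6.0, 5.0]
print(bin(bind(0b1100, 0b1010)))                  # 0b110
print(hamming_distance(0b1100, 0b1010))           # 2
print(bin(majority([0b1100, 0b1010, 0b1001], 4))) # 0b1000
```

In the crossbar model the stored conductances play the role of the weight matrix, so the multiply-accumulate happens where the data reside; the bit-wise functions show the kind of logic that could likewise be evaluated inside a memory array instead of in separate logic.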

[1] M.M.S. Aly et al., IEEE Computer, 2015. [2] H.-S.P. Wong et al., Proc. IEEE, 2012. [3] M.M. Shulaker et al., Nature, 2017. [4] T.F. Wu et al., ISSCC, 2018.