{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,7,12]],"date-time":"2025-07-12T01:14:37Z","timestamp":1752282877500,"version":"3.41.0"},"reference-count":75,"publisher":"Association for Computing Machinery (ACM)","issue":"4","license":[{"start":{"date-parts":[[2020,9,18]],"date-time":"2020-09-18T00:00:00Z","timestamp":1600387200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100014718","name":"National Science Foundation","doi-asserted-by":"publisher","award":["1629392"],"award-info":[{"award-number":["1629392"]}],"id":[{"id":"10.13039\/100014718","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["J. Emerg. Technol. Comput. Syst."],"published-print":{"date-parts":[[2020,10,31]]},"abstract":"<jats:p>Recurrent Neural Networks (RNNs) are an important class of neural networks designed to retain and incorporate context into current decisions. RNNs are particularly well suited for machine learning problems in which context is important, such as speech recognition and language translation.<\/jats:p>\n          <jats:p>This work presents RNNFast, a hardware accelerator for RNNs that leverages an emerging class of non-volatile memory called domain-wall memory (DWM). We show that DWM is very well suited for RNN acceleration due to its very high density and low read\/write energy. At the same time, the sequential nature of input\/weight processing of RNNs mitigates one of the downsides of DWM, which is the linear (rather than constant) data access time.<\/jats:p>\n          <jats:p>RNNFast is very efficient and highly scalable, with flexible mapping of logical neurons to RNN hardware blocks. 
The basic hardware primitive, the RNN processing element (PE), includes custom DWM-based multiplication, sigmoid and tanh units for high density and low energy. The accelerator is designed to minimize data movement by closely interleaving DWM storage and computation. We compare our design with a state-of-the-art GPGPU and find 21.8\u00d7 higher performance with 70\u00d7 lower energy.<\/jats:p>","DOI":"10.1145\/3399670","type":"journal-article","created":{"date-parts":[[2020,9,18]],"date-time":"2020-09-18T16:16:29Z","timestamp":1600445789000},"page":"1-27","update-policy":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":7,"title":["RNNFast"],"prefix":"10.1145","volume":"16","author":[{"given":"Mohammad Hossein","family":"Samavatian","sequence":"first","affiliation":[{"name":"The Ohio State University, Neil Ave., Columbus, OH, USA"}]},{"given":"Anys","family":"Bacha","sequence":"additional","affiliation":[{"name":"University of Michigan, Dearborn, MI, USA"}]},{"given":"Li","family":"Zhou","sequence":"additional","affiliation":[{"name":"The Ohio State University, Neil Ave., Columbus, OH, USA"}]},{"ORCID":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/orcid.org\/0000-0002-6474-2201","authenticated-orcid":false,"given":"Radu","family":"Teodorescu","sequence":"additional","affiliation":[{"name":"The Ohio State University, Neil Ave., Columbus, OH, USA"}]}],"member":"320","published-online":{"date-parts":[[2020,9,18]]},"reference":[{"volume-title":"Retrieved on","year":"2020","key":"e_1_2_1_1_1"},{"volume-title":"Retrieved on","year":"2020","key":"e_1_2_1_2_1"},{"volume-title":"Retrieved on","year":"2020","key":"e_1_2_1_3_1"},{"key":"e_1_2_1_4_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001138"},{"volume-title":"Proceedings of the 33rd International Conference on Machine Learning (ICML\u201916)","year":"2016","author":"Amodei 
Dario","key":"e_1_2_1_5_1"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1145\/3297858.3304049"},{"key":"e_1_2_1_7_1","first-page":"06064","article-title":"RESPARC: A reconfigurable and energy-efficient architecture with memristive crossbars for deep spiking neural networks","volume":"1702","author":"Ankit Aayush","year":"2017","journal-title":"Arxiv Preprint Arxiv"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1109\/IEDM.2011.6131604"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/3289602.3293989"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCAS.2019.8702471"},{"key":"e_1_2_1_11_1","doi-asserted-by":"crossref","unstructured":"Tianshi Chen Zidong Du Ninghui Sun Jia Wang Chengyong Wu Yunji Chen and Olivier Temam. 2014. DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning. In Architectural Support for Programming Languages and Operating Systems (ASPLOS\u201914). 269--284. DOI:https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/doi.org\/10.1145\/2541940.2541967","DOI":"10.1145\/2541940.2541967"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001177"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1109\/MICRO.2014.58"},{"key":"e_1_2_1_14_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001140"},{"volume-title":"Learning phrase representations using RNN encoder-decoder for statistical machine translation. 
CoRR abs\/1406.1078","year":"2014","author":"Cho Kyunghyun","key":"e_1_2_1_15_1"},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1145\/2934583.2934602"},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750389"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/ReConFig.2016.7857151"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1109\/ISCA.2018.00012"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.5555\/2769681.2769691"},{"volume-title":"IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP\u201913)","year":"2013","author":"Graves Alex","key":"e_1_2_1_21_1"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNNLS.2016.2582924"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASPDAC.2017.7858394"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1145\/3020078.3021745"},{"volume-title":"43rd ACM\/IEEE Annual International Symposium on Computer Architecture (ISCA\u201916)","year":"2016","author":"Han Song","key":"e_1_2_1_25_1"},{"volume-title":"Ng","year":"2014","author":"Hannun Awni 
Y.","key":"e_1_2_1_26_1"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1162\/neco.1997.9.8.1735"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2015.2474706"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1145\/2593069.2593161"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/3297858.3304038"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080246"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001178"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1145\/3297858.3304028"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICSICT.2018.8564918"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1109\/FCCM.2015.50"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2019.00028"},{"key":"e_1_2_1_37_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001164"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1145\/2694344.2694358"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001179"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/IJCNN.2016.7727299"},{"volume-title":"Proceedings of the NAECON 2018 IEEE National Aerospace and Electronics Conference. 
IEEE, 382--390","author":"Mealey Thomas","key":"e_1_2_1_41_1"},{"key":"e_1_2_1_42_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2015.2437283"},{"key":"e_1_2_1_43_1","doi-asserted-by":"publisher","DOI":"10.1145\/2627369.2627643"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNANO.2015.2391185"},{"volume-title":"Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (ACL\u201902)","year":"2002","author":"Papineni Kishore","key":"e_1_2_1_45_1"},{"volume-title":"Proceedings of the 44th Annual International Symposium on Computer Architecture (ISCA\u201917)","author":"Parashar Angshuman","key":"e_1_2_1_46_1"},{"volume-title":"Magnetic domain-wall racetrack memory. Science 320, 5873","year":"2008","author":"Parkin Stuart S. P.","key":"e_1_2_1_47_1"},{"volume-title":"SIGIR\u201998: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 275--281","author":"Ponte Jay M.","key":"e_1_2_1_48_1"},{"volume-title":"Proceedings of the 2015 Design, Automation Test in Europe Conference Exhibition (DATE\u201915)","year":"2015","author":"Ranjan A.","key":"e_1_2_1_49_1"},{"volume-title":"Jos\u00e9 Miguel Hern\u00e1ndez-Lobato, Gu-Yeon Wei, and David M. 
Brooks.","year":"2016","author":"Reagen Brandon","key":"e_1_2_1_50_1"},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1145\/3007787.3001139"},{"key":"e_1_2_1_52_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080221"},{"volume-title":"Stan","year":"2011","author":"Smullen Clinton W.","key":"e_1_2_1_53_1"},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2014.2360545"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1145\/2463209.2488799"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.1109\/SmartCloud.2018.00009"},{"key":"e_1_2_1_57_1","doi-asserted-by":"publisher","DOI":"10.1049\/ip-cdt:20030965"},{"volume-title":"What level of quality can neural machine translation attain on literary text? In Translation Quality Assessment","author":"Toral Antonio","key":"e_1_2_1_58_1"},{"key":"e_1_2_1_59_1","doi-asserted-by":"publisher","DOI":"10.1145\/3079856.3080244"},{"key":"e_1_2_1_60_1","doi-asserted-by":"publisher","DOI":"10.1109\/TC.2015.2506581"},{"key":"e_1_2_1_61_1","doi-asserted-by":"publisher","DOI":"10.7873\/DATE.2013.365"},{"volume-title":"Show and tell: Lessons learned from the 2015 MSCOCO image captioning challenge. 
CoRR abs\/1609.06647","year":"2016","author":"Vinyals Oriol","key":"e_1_2_1_62_1"},{"key":"e_1_2_1_63_1","first-page":"2911739","article-title":"E-LSTM: An efficient hardware architecture for long short-term memory","volume":"2019","author":"Wang M.","year":"2019","journal-title":"DOI:https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/doi.org\/10.1109\/JETCAS."},{"key":"e_1_2_1_64_1","doi-asserted-by":"publisher","DOI":"10.1145\/3174243.3174253"},{"key":"e_1_2_1_65_1","doi-asserted-by":"publisher","DOI":"10.1109\/HPCA.2019.00029"},{"key":"e_1_2_1_66_1","doi-asserted-by":"publisher","DOI":"10.1109\/TNANO.2015.2447531"},{"volume-title":"Automation & Test in Europe Conference & Exhibition (DATE\u201914)","year":"2014","author":"Wang Yuhao","key":"e_1_2_1_67_1"},{"key":"e_1_2_1_68_1","doi-asserted-by":"publisher","DOI":"10.1109\/TVLSI.2017.2717950"},{"volume-title":"Device-architecture co-optimization of STT-RAM based memory for low power embedded systems","author":"Xu Cong","key":"e_1_2_1_69_1"},{"key":"e_1_2_1_70_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASPDAC.2014.6742888"},{"key":"e_1_2_1_71_1","doi-asserted-by":"publisher","DOI":"10.1109\/ASPDAC.2015.7058988"},{"key":"e_1_2_1_72_1","doi-asserted-by":"publisher","DOI":"10.1145\/2749469.2750388"},{"key":"e_1_2_1_73_1","doi-asserted-by":"publisher","DOI":"10.1109\/CLUSTER.2017.45"},{"key":"e_1_2_1_74_1","doi-asserted-by":"publisher","DOI":"10.1109\/TCSI.2016.2529240"},{"key":"e_1_2_1_75_1","doi-asserted-by":"publisher","DOI":"10.1109\/FTFC.2013.6577771"}],"container-title":["ACM Journal on Emerging Technologies in Computing 
Systems"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/dl.acm.org\/doi\/10.1145\/3399670","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/dl.acm.org\/doi\/pdf\/10.1145\/3399670","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T22:38:13Z","timestamp":1750199893000},"score":1,"resource":{"primary":{"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/dl.acm.org\/doi\/10.1145\/3399670"}},"subtitle":["An Accelerator for Recurrent Neural Networks Using Domain-Wall Memory"],"short-title":[],"issued":{"date-parts":[[2020,9,18]]},"references-count":75,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2020,10,31]]}},"alternative-id":["10.1145\/3399670"],"URL":"https:\/\/summer-heart-0930.chufeiyun1688.workers.dev:443\/https\/doi.org\/10.1145\/3399670","relation":{},"ISSN":["1550-4832","1550-4840"],"issn-type":[{"type":"print","value":"1550-4832"},{"type":"electronic","value":"1550-4840"}],"subject":[],"published":{"date-parts":[[2020,9,18]]},"assertion":[{"value":"2019-05-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-05-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-09-18","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}