BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Seoul
X-LIC-LOCATION:Asia/Seoul
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:KST
DTSTART:18871231T000000
DTSTART:19881009T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230103T035311Z
LOCATION:Room 325-AB\, Level 3\, West Wing
DTSTART;TZID=Asia/Seoul:20221208T140000
DTEND;TZID=Asia/Seoul:20221208T153000
UID:siggraphasia_SIGGRAPH Asia 2022_sess169_papers_448@linklings.com
SUMMARY:ICARUS: A Specialized Architecture for Neural Radiance Fields Rendering
DESCRIPTION:Technical Communications, Technical Papers\n\nICARUS: A Specialized Architecture for Neural Radiance Fields Rendering\n\nRao, Yu, Wan, Zhou, Zheng...\n\nThe practical deployment of Neural Radiance Fields (NeRF) in rendering applications faces several challenges, with the most critical one being low rendering speed on even high-end graphic processing units (GPUs). In this paper, we present ICARUS, a specialized accelerator architecture tailored for NeRF rendering. Unlike GPUs using general purpose computing and memory architectures for NeRF, ICARUS executes the complete NeRF pipeline using dedicated plenoptic cores (PLCore) consisting of a positional encoding unit (PEU), a multi-layer perceptron (MLP) engine, and a volume rendering unit (VRU). A PLCore takes in positions & directions and renders the corresponding pixel colors without any intermediate data going off-chip for temporary storage and exchange, which can be time and power consuming. To implement the most expensive component of NeRF, i.e., the MLP, we transform the fully connected operations to approximated reconfigurable multiple constant multiplications (MCMs), where common subexpressions are shared across different multiplications to improve the computation efficiency. We build a prototype ICARUS using Synopsys HAPS-80 S104, a field programmable gate array (FPGA)-based prototyping system for large-scale integrated circuits and systems design. We evaluate the power-performance-area (PPA) of a PLCore using 40nm LP CMOS technology. Working at 400 MHz, a single PLCore occupies 16.5 mm^2 and consumes 282.8 mW, translating to 0.105 uJ/sample. The results are compared with those of GPU and tensor processing unit (TPU) implementations.\n\nRegistration Category: FULL ACCESS, ON-DEMAND ACCESS\n\nLanguage: ENGLISH\n\nFormat: IN-PERSON, ON-DEMAND
URL:https://sa2022.siggraph.org/en/full-program/?id=papers_448&sess=sess169
END:VEVENT
END:VCALENDAR