BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Seoul
X-LIC-LOCATION:Asia/Seoul
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:KST
DTSTART:18871231T000000
RDATE:19881009T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230103T035311Z
LOCATION:Room 325-AB\, Level 3\, West Wing
DTSTART;TZID=Asia/Seoul:20221208T140000
DTEND;TZID=Asia/Seoul:20221208T153000
UID:siggraphasia_SIGGRAPH Asia 2022_sess169_papers_448@linklings.com
SUMMARY:ICARUS: A Specialized Architecture for Neural Radiance Fields
  Rendering
DESCRIPTION:Technical Communications, Technical Papers\n\nICARUS: A Specia
 lized Architecture for Neural Radiance Fields Rendering\n\nRao, Yu, Wan, Z
 hou, Zheng...\n\nThe practical deployment of Neural Radiance Fields (NeRF)
  in rendering applications faces several challenges, with the most critica
 l one being low rendering speed on even high-end graphics processing units
  (GPUs). In this paper, we present ICARUS, a specialized accelerator
  architecture tailored for NeRF rendering. Unlike GPUs using
  general-purpose computing and memory architectures for NeRF, ICARUS
  executes the complete NeRF pipeline using dedicated plenoptic cores
  (PLCore) consisting of a positional encoding unit (PEU), a multi-layer
  perceptron (MLP) engine, and a volume rendering unit (VRU). A PLCore
  takes in positions and directions and renders the corresponding pixel
  colors without any intermediate data going off-chip for temporary
  storage and exchange, which can be time- and power-consuming. To
  implement the most expensive component of NeRF, i.e., the MLP, we
  transform the fully connected operations to approximated reconfigurable
  multiple constant multiplications (MCMs), where common subexpressions
  are shared across different multiplications to improve the computation
  efficiency. We build a prototype ICARUS using Synopsys HAPS-80 S104, a
  field-programmable gate array (FPGA)-based prototyping system for
  large-scale integrated circuits and systems design. We evaluate the
  power-performance-area (PPA) of a PLCore using 40 nm LP CMOS technology.
  Working at 400 MHz, a single PLCore occupies 16.5 mm^2 and consumes
  282.8 mW, translating to 0.105 uJ/sample. The results are compared with
  those of GPU and tensor processing unit (TPU) implementations.\n\n
 Registration Category: FULL ACCESS, ON-DEMAND ACCESS\n\n
 Language: ENGLISH\n\nFormat: IN-PERSON, ON-DEMAND
URL:https://sa2022.siggraph.org/en/full-program/?id=papers_448&sess=sess16
 9
END:VEVENT
END:VCALENDAR