BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Seoul
X-LIC-LOCATION:Asia/Seoul
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:KST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230103T035307Z
LOCATION:Auditorium\, Level 5\, West Wing
DTSTART;TZID=Asia/Seoul:20221206T100000
DTEND;TZID=Asia/Seoul:20221206T120000
UID:siggraphasia_SIGGRAPH Asia 2022_sess153_papers_203@linklings.com
SUMMARY:Text2Light: Zero-shot Text-driven HDR Panorama Generation
DESCRIPTION:Technical Papers\n\nText2Light: Zero-shot Text-driven HDR Panorama Generation\n\nChen, Wang, Liu\n\nHigh-quality HDRIs (High Dynamic Range Images), typically HDR panoramas, are one of the most popular ways to create photorealistic lighting and 360-degree reflections of 3D scenes in graphics. Given the difficulty of capturing HDRIs, a versatile and controllable generative model is highly desired, where layman users can intuitively control the generation process. However, existing state-of-the-art methods still struggle to synthesize high-quality panoramas for complex scenes. In this work, we propose a zero-shot text-driven framework, Text2Light, to generate 4K+ resolution HDRIs without paired training data. Given a free-form text as the description of the scene, we synthesize the corresponding HDRI with two dedicated steps: 1) text-driven panorama generation in low dynamic range (LDR) and low resolution (LR), and 2) super-resolution inverse tone mapping to scale up the LDR panorama both in resolution and dynamic range. Specifically, to achieve zero-shot text-driven panorama generation, we first build dual codebooks as the discrete representation for diverse environmental textures. Then, driven by the pre-trained Contrastive Language-Image Pre-training (CLIP) model, a text-conditioned global sampler learns to sample holistic semantics from the global codebook according to the input text. Furthermore, a structure-aware local sampler learns to synthesize LDR panoramas patch-by-patch, guided by holistic semantics. To achieve super-resolution inverse tone mapping, we derive a continuous representation of 360-degree imaging from the LDR panorama as a set of structured latent codes anchored to the sphere. This continuous representation enables a versatile module to upscale the resolution and dynamic range simultaneously. Extensive experiments demonstrate the superior capability of Text2Light in generating high-quality HDR panoramas. In addition, we show the feasibility of our work in realistic rendering and immersive VR.\n\nRegistration Category: FULL ACCESS, EXPERIENCE PLUS ACCESS, EXPERIENCE ACCESS, TRADE EXHIBITOR\n\nLanguage: ENGLISH\n\nFormat: IN-PERSON
URL:https://sa2022.siggraph.org/en/full-program/?id=papers_203&sess=sess153
END:VEVENT
END:VCALENDAR