BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Seoul
X-LIC-LOCATION:Asia/Seoul
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:KST
DTSTART:18871231T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230103T035311Z
LOCATION:Room 324\, Level 3\, West Wing
DTSTART;TZID=Asia/Seoul:20221208T153000
DTEND;TZID=Asia/Seoul:20221208T170000
UID:siggraphasia_SIGGRAPH Asia 2022_sess166_papers_214@linklings.com
SUMMARY:Stitch it in Time: GAN-Based Facial Editing of Real Videos
DESCRIPTION:Technical Communications\, Technical Papers\n\nStitch it in Time: GAN-Based Facial Editing of Real Videos\n\nTzaban\, Mokady\, Gal\, Bermano\, Cohen-Or\n\nThe ability of Generative Adversarial Networks to encode rich semantics within their latent space has been widely adopted for facial image editing. However\, replicating their success with videos has proven challenging. Applying StyleGAN editing over real videos introduces two main challenges: (i) StyleGAN operates over aligned crops. When editing videos\, these crops need to be pasted back into the frame\, resulting in a spatial inconsistency. (ii) Videos introduce a fundamental barrier to overcome - temporal coherency. To address the first challenge\, we propose a novel stitching-tuning procedure. The generator is carefully tuned to overcome the spatial artifacts at crop borders\, resulting in a smooth transition even when difficult backgrounds are involved. Turning to temporal coherence\, we propose that this challenge is largely artificial. The source video is already temporally coherent\, and deviations arise in part due to the careless treatment of individual components in the editing pipeline. We leverage the natural alignment of StyleGAN and the tendency of neural networks to learn low-frequency functions\, and demonstrate that they provide a strongly consistent prior. These components are combined in an end-to-end framework for semantic editing of facial videos. We compare our pipeline to the current state-of-the-art and demonstrate significant improvements. Our method produces various meaningful manipulations and maintains greater spatial and temporal consistency\, even in challenging talking-head videos with which current methods struggle. Our code will be made publicly available.\n\nRegistration Category: FULL ACCESS\, ON-DEMAND ACCESS\n\nLanguage: ENGLISH\n\nFormat: IN-PERSON\, ON-DEMAND
URL:https://sa2022.siggraph.org/en/full-program/?id=papers_214&sess=sess166
END:VEVENT
END:VCALENDAR