BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Seoul
X-LIC-LOCATION:Asia/Seoul
BEGIN:STANDARD
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:KST
DTSTART:19881009T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20230103T035307Z
LOCATION:Auditorium\, Level 5\, West Wing
DTSTART;TZID=Asia/Seoul:20221206T100000
DTEND;TZID=Asia/Seoul:20221206T120000
UID:siggraphasia_SIGGRAPH Asia 2022_sess153_papers_214@linklings.com
SUMMARY:Stitch it in Time: GAN-Based Facial Editing of Real Videos
DESCRIPTION:Technical Papers\n\nStitch it in Time: GAN-Based Facial Editing of Real Videos\n\nTzaban\, Mokady\, Gal\, Bermano\, Cohen-Or\n\nThe ability of Generative Adversarial Networks to encode rich semantics within their latent space has been widely adopted for facial image editing. However\, replicating their success with videos has proven challenging. Applying StyleGAN editing over real videos introduces two main challenges: (i) StyleGAN operates over aligned crops. When editing videos\, these crops need to be pasted back into the frame\, resulting in a spatial inconsistency. (ii) Videos introduce a fundamental barrier to overcome: temporal coherency. To address the first challenge\, we propose a novel stitching-tuning procedure. The generator is carefully tuned to overcome the spatial artifacts at crop borders\, resulting in a smooth transition even when difficult backgrounds are involved. Turning to temporal coherence\, we propose that this challenge is largely artificial. The source video is already temporally coherent\, and deviations arise in part due to the careless treatment of individual components in the editing pipeline. We leverage the natural alignment of StyleGAN and the tendency of neural networks to learn low-frequency functions\, and demonstrate that they provide a strongly consistent prior. These components are combined in an end-to-end framework for semantic editing of facial videos. We compare our pipeline to the current state-of-the-art and demonstrate significant improvements. Our method produces various meaningful manipulations and maintains greater spatial and temporal consistency\, even in challenging talking head videos that current methods struggle with. Our code will be made publicly available.\n\nRegistration Category: FULL ACCESS\, EXPERIENCE PLUS ACCESS\, EXPERIENCE ACCESS\, TRADE EXHIBITOR\n\nLanguage: ENGLISH\n\nFormat: IN-PERSON
URL:https://sa2022.siggraph.org/en/full-program/?id=papers_214&sess=sess153
END:VEVENT
END:VCALENDAR