UmeTrack: Unified multi-view end-to-end hand tracking for VR

Session: VR and Interaction
Description: Real-time tracking of 3D hand pose in world space is a challenging problem and plays an important role in VR interaction. Existing work in this space is limited to either producing root-relative (versus world-space) 3D pose or relying on multiple stages, such as heatmap generation and kinematic optimization, to obtain 3D pose. Moreover, the typical VR scenario, which involves multi-view tracking from wide-field-of-view (FOV) cameras, is seldom addressed by these methods. In this paper, we present a unified end-to-end differentiable framework for multi-view, multi-frame hand tracking that directly predicts 3D hand pose in world space. We demonstrate the benefits of end-to-end differentiability by extending our framework with downstream tasks such as jitter reduction and pinch prediction. To demonstrate the efficacy of our model, we further present a new large-scale egocentric hand pose dataset that consists of both real and synthetic data. Experiments show that our system handles various challenging interactive motions and has been successfully applied to real-time VR applications.
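
The abstract describes a single network that fuses features from multiple camera views and directly regresses 3D hand pose in world space, rather than producing root-relative pose or chaining separate heatmap and optimization stages. As a rough illustration of that idea only (this is not the authors' UmeTrack architecture; the layer sizes, names, camera-embedding scheme, and the simple view-averaging fusion below are all assumptions), a minimal PyTorch sketch might look like:

# Hypothetical sketch of a multi-view, world-space hand pose regressor.
# NOT the UmeTrack model; all design choices here are illustrative.
import torch
import torch.nn as nn

class MultiViewHandPoseNet(nn.Module):
    def __init__(self, num_keypoints: int = 21, feat_dim: int = 128):
        super().__init__()
        # Shared per-view image encoder (same weights for every camera).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Embed camera extrinsics so the fused feature carries the
        # geometry needed to predict pose in world (not camera) space.
        self.cam_embed = nn.Linear(16, feat_dim)  # flattened 4x4 world_from_cam
        # Regress world-space 3D keypoints from the fused feature.
        self.head = nn.Linear(feat_dim, num_keypoints * 3)
        self.num_keypoints = num_keypoints

    def forward(self, images, world_from_cam):
        # images: (B, V, 1, H, W); world_from_cam: (B, V, 4, 4)
        B, V = images.shape[:2]
        feats = self.encoder(images.flatten(0, 1))                  # (B*V, F)
        cams = self.cam_embed(world_from_cam.flatten(2).flatten(0, 1))
        fused = (feats + cams).view(B, V, -1).mean(dim=1)           # average over views
        return self.head(fused).view(B, self.num_keypoints, 3)     # (B, K, 3)

# Toy usage: two monochrome views per frame, identity extrinsics.
net = MultiViewHandPoseNet()
imgs = torch.randn(4, 2, 1, 96, 96)
extrinsics = torch.eye(4).expand(4, 2, 4, 4)
keypoints_world = net(imgs, extrinsics)  # (4, 21, 3), world-space coordinates

Because every stage in such a pipeline is an ordinary differentiable module, losses on downstream signals (for example, a pinch classifier head or a temporal-smoothness penalty against jitter) can be backpropagated through the entire network; this is the benefit of end-to-end differentiability that the abstract highlights.
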
Event Type: Technical Communications, Technical Papers
Time: Friday, 9 December 2022, 2:00pm - 3:30pm KST
Location: Room 325-AB, Level 3, West Wing




