Pose Animator – An open source tool to bring SVG characters to life in the browser via motion capture
By Shan Huang, Creative Technologist, Google Partner Innovation
The PoseNet and Facemesh (from Mediapipe) TensorFlow.js models made real time human perception in the browser possible through a simple webcam. As an animation enthusiast who struggles to master the complex art of character animation, I saw hope and was really excited to experiment using these models for interactive, body-controlled animation.
The result is Pose Animator, an open-source web animation tool that brings SVG characters to life with body detection results from webcam. This blog post covers the technical design of Pose Animator, as well as the steps for designers to create and animate their own characters.
Using FaceMesh and PoseNet with TensorFlow.js to animate full body character
The overall idea of Pose Animator is to take a 2D vector illustration and update its containing curves in real-time based on the recognition result from PoseNet and FaceMesh. To achieve this, Pose Animator borrows the idea of skeleton-based animation from computer graphics and applies it to vector characters.
- a surface used to draw the character, and
- a hierarchical set of interconnected bones used to animate the surface.
In Pose Animator, the surface is defined by the 2D vector paths in the input SVG files. For the bone structure, Pose Animator provides a predefined rig (bone hierarchy) representation, based on the key points from PoseNet and FaceMesh. This bone structure’s initial pose is specified in the input SVG file, along with the character illustration, while the real time bone positions are updated by the recognition result from ML models.
|Detection keypoints from PoseNet (blue) and FaceMesh (red)|
Check out these steps to create your own SVG character for Pose Animator.
|Animated bezier curves controlled by PoseNet and FaceMesh output/td>|
Rigging Flow Overview
The full rigging (skeleton binding) flow requires the following steps:
- Parse the input SVG file for the vector illustration and the predefined skeleton, both of which are in T-pose (initial pose).
- Iterate through every segment in vector paths to compute the weight influence and transformation from each bone using Linear Blend Skinning (explained later in this post).
- In real time, run FaceMesh and PoseNet on each input frame and use result keypoints to update the bone positions.
- Compute new positions of vector segments from the updated bone positions, bone weights and transformations.
There are other tools that provide similar puppeteering functionality, however, most of them only update asset bounding boxes and do not deform the actual geometry of characters with recognition key points. Also, few tools provide full body recognition and animation. By deforming individual curves, Pose Animator is good at capturing the nuances of facial and full body movement and hopefully provides more expressive animation.
The rig structure is designed according to the output key points from PoseNet and FaceMesh. PoseNet returns 17 key points for the full body, which is simple enough to directly include in the rig. FaceMesh however provides 486 keypoints, so I needed to be more selective about which ones to include. In the end I selected 73 key points from the FaceMesh output and together we have a full body rig of 90 keypoints and 78 bones as shown below:
|The 90 keypoints, 78 bones full body rig|
Every input SVG file is expected to contain this skeleton in default position. More specifically, Pose Animator will look for a group called ‘skeleton’ containing anchor elements named with the respective joint they represent. A sample rig SVG can be found here. Designers have the freedom to move the joints around in their design files to best embed the rig into the character. Pose Animator will compute skinning according to the default position in the SVG file, although extreme cases (e.g. very short leg / arm bones) may not be well supported by the rigging algorithm and may produce unnatural results.
|The illustration with embedded rig in design software (Adobe Illustrator)|
Linear Blend Skinning for vector paths
Pose Animator uses one of the most common rigging algorithms for deforming surfaces using skeletal structures – Linear Blend Skinning (LBS), which transforms a vertex on a surface by blending together its transformation controlled by each bone alone, weighted by each bone’s influence. In our case, a vertex refers to an anchor point on a vector path, and bones are defined by two connected keypoints in the above rig (e.g. the ‘leftWrist’ and ‘leftElbow’ keypoints define the bone ‘leftWrist-leftElbow’).
To put into math formula, the world space position of the vertex vi’ is computed as
– wi is the influence of bone i on vertex i,
– vi describes vertex i’s initial position,
– Tj describes the spatial transformation that aligns the initial pose of bone j with its current pose.
The influence of bones can be automatically generated or manually assigned through weight painting. Pose Animator currently only supports auto weight assignment. The raw influence of bone j on vertex i is calculated as:
There is one exception case. When the in control point, curve point, and out control point are collinear, we use the curve point weight for all three points to guarantee that they stay collinear when animated. This helps to preserve the smoothness of curves.
|Collinear curve handles share the same weight to stay collinear|
While LBS already gives us animated frames, there’s a noticeable amount of jittering introduced by FaceMesh and PoseNet raw output. To reduce the jitter and get smoother animation, we can use the confidence scores from prediction results to weigh each input frame unevenly, granting less influence to low-confidence frames.
Following this idea, Pose Animator computes the smoothed position of joint i at frame t as
The smoothed confidence score of frame i is computed as
Confidence score based clipping
In addition to interpolating joint positions with confidence scores, we also introduce a minimum threshold to decide if a path should be rendered at all.
The confidence score of a path is the averaged confidence score of its segment points, which in turn is the weighted average of the influence bones’ scores. The whole path is hidden for a particular frame when its score is below a certain threshold.
This is useful for hiding paths in low confidence areas, which are often body parts out of the camera view. Imagine an upper body shot: PoseNet will always return keypoint predictions for legs and hips though they will have low confidence scores. With this clamping mechanism we can make sure lower body parts are properly hidden instead of showing up as strangely distorted paths.
To mesh or not to mesh
The current rigging algorithm is heavily centered around 2D curves. This is because the 2D rig constructed from PoseNet and FaceMesh has a large range of motion and varying bone lengths – unlike animation in games where bones have relatively fixed length. I currently get smoother results from deforming bezier curves than deforming the triangulated mesh from input paths, because bezier curves preserve the curvature / straightness of input lines better.
I am keen to improve the rigging algorithm for meshes. Besides, I want to explore a more advanced rigging algorithm than Linear Blend Skinning, which has limitations such as volume thinning around the bent areas.
New editing features
Pose Animator delegates illustration editing to design softwares like Illustrator, which are powerful for editing vector graphics, but not tailored for animation / skinning requirements. I want to support more animation features through in-browser UI, including:
- Skinning weight painting tool, to enable tweaking individual weights on keypoints manually. This will provide more precision than auto weight assignment.
- Support raster images in the input SVG files, so artists may use photos / drawings in their design. Image bounding boxes can be easily represented as vector paths so it’s straightforward to compute its deformation using the current rigging algorithm.
Try it yourself!
Try out the live demos, where you can either play with existing characters, or add in your own SVG character and see them come to life.
I’m the most excited to see what kind of interactive animation the creative community will create. While the demos are human characters, Pose Animator will work for any 2D vector design, so you can go as abstract / avant-garde as you want to push its limits.
To create your own animatable illustration, please check out this guide! Don’t forget to share your creations with us using #PoseAnimator on social media. Feel free to reach out to me on twitter @yemount for any questions.
Alternatively if you want to view the source code directly, it is available to fork on github here. Happy hacking!
Related Google News:
- Scaling deep retrieval with TensorFlow Recommenders and Vertex AI Matching Engine May 1, 2023
- Unleash your Google Cloud data with ThoughtSpot, Looker, and BigQuery May 1, 2023
- Seeing the World: Vertex AI Vision Developer Toolkit May 1, 2023
- BBC: Keeping up with a busy news day with an end-to-end serverless architecture May 1, 2023
- Scalable electronic trading on Google Cloud: A business case with BidFX May 1, 2023
- Google Cloud and Equinix: Building Excellence in ML Operations (MLOps) May 1, 2023
- Effingo: the internal Google copy service moving data at scale May 1, 2023
- Evaluating the true cost or TCO of a database — and how Cloud Spanner compares May 1, 2023