Animation Blend Spaces without Triangulation

Created on Feb. 19, 2024, 6:55 p.m.

In my previous article I talked about how to compute the weighted average of a set of quaternions in a robust and accurate way. The most obvious application for this in animation and video games is in blending multiple animations together.

Most game engines provide some sort of blending functionality like this for sets of animations. Slerp or nlerp is usually used when blending two animations in a 1D parameter space, while in a 2D parameter space many engines triangulate the data points and then use the barycentric weights to take something like the weighted average of the three animations at each vertex. In a 3D parameter space (or greater) some engines also provide some kind of blend functionality (such as via tetrahedralization) or RBF interpolation, but many engines don't provide anything at all.
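
To make the triangulation approach concrete, here is a minimal sketch of computing barycentric blend weights for a query point against a single triangle of a 2D blend space (the float2 type and function name are just for illustration, not taken from any particular engine):

struct float2 { float x, y; };

// Barycentric blend weights of point p with respect to triangle (a, b, c).
// The weights sum to one, and are all positive when p is inside the triangle.
void barycentric_blend_weights(
    float& w0, float& w1, float& w2,
    float2 p, float2 a, float2 b, float2 c)
{
    float2 v0 = { b.x - a.x, b.y - a.y };
    float2 v1 = { c.x - a.x, c.y - a.y };
    float2 v2 = { p.x - a.x, p.y - a.y };
    
    float d00 = v0.x * v0.x + v0.y * v0.y;
    float d01 = v0.x * v1.x + v0.y * v1.y;
    float d11 = v1.x * v1.x + v1.y * v1.y;
    float d20 = v2.x * v0.x + v2.y * v0.y;
    float d21 = v2.x * v1.x + v2.y * v1.y;
    
    // This denominator approaches zero for thin or degenerate triangles
    float denom = d00 * d11 - d01 * d01;
    
    w1 = (d11 * d20 - d01 * d21) / denom;
    w2 = (d00 * d21 - d01 * d20) / denom;
    w0 = 1.0f - w1 - w2;
}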

Unfortunately, triangulating the space and using the barycentric coordinates as blend weights has a couple of big problems.

The first is that very thin triangles can cause the resulting animation to change extremely quickly as you move through the parameter space, so you need to be careful to place your animations at locations in the parameter space that don't produce degenerate or near-degenerate triangles.

Additionally, small tweaks to the data points can trigger an entirely different triangulation, meaning small adjustments can completely change how the result looks even when they don't introduce any degenerate triangles.

In addition to that, although Delaunay triangulation in 2D is fairly straightforward, triangulation in higher dimensions is a relatively complex algorithm which gets exponentially more computationally expensive the more dimensions you add. And even if you can run it in higher dimensions, beyond three dimensions inspecting the results visually becomes almost impossible, which basically rules it out since you can't easily validate that you are not getting thin or degenerate triangles (or in this case, tetrahedrons or simplices).

These two issues combined have an annoying knock-on effect: most of the parameters we wish to blend with respect to in these kinds of blend spaces can be extracted automatically from the animation data itself (for example, the character's speed, turning rate, or aiming direction), but doing so often does not work in practice, either because we need to place our data points carefully in the parameter space to avoid thin triangles and other degenerate cases in the triangulation, or because such parameters have more than two dimensions.

For this reason, it is often more practical to build up a kind of higher dimensional blend space out of consecutive 1D blends (which you could see as being like bilinear interpolation). This generally works well if you can do it, but it can require a lot of data, as it needs dense coverage of all the parameters at regular intervals for each parameter.
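
As a rough illustration, here is what the weights from two consecutive 1D blends look like over a 2x2 grid of animations - essentially bilinear interpolation weights (the layout and naming here are hypothetical):

// Blend weights from two consecutive 1D blends over a 2x2 grid of animations
// laid out as [a00, a10, a01, a11], where tx and ty are the normalized
// positions of the query point along each parameter axis (both in [0, 1]).
void bilinear_blend_weights(float weights[4], float tx, float ty)
{
    weights[0] = (1.0f - tx) * (1.0f - ty); // a00
    weights[1] =         tx  * (1.0f - ty); // a10
    weights[2] = (1.0f - tx) *         ty;  // a01
    weights[3] =         tx  *         ty;  // a11
}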

One of the first animation papers I studied during my PhD uses a method taken from another paper (which is over 20 years old now!) by Peter-Pike Sloan which avoids these issues, and allows you to find per-animation blend weights without relying on triangulation or consecutive 1D blends. This means we can easily have parameter spaces in any dimension we want, with parameters which are automatically extracted from the animation data itself, and we don't need to worry about degenerate triangles or small adjustments to the animation data or parameters causing different behavior.

It works like this: given some desired parameters, represented by a point in the parameter space \( \mathbf{p}_t \), and all the points in our parameter space associated with each of our \( n \) animations \( \mathbf{p}_0,\ \mathbf{p}_1,\ \ldots\ \mathbf{p}_n \), we start by computing the distances (in parameter space) to all these points, and stacking them into a vector:

\begin{align*} \mathbf{d}_t &= \begin{bmatrix} \| \mathbf{p}_t - \mathbf{p}_0 \| \\ \| \mathbf{p}_t - \mathbf{p}_1 \| \\ \vdots \\ \| \mathbf{p}_t - \mathbf{p}_n \| \end{bmatrix} \end{align*}

and then given this vector \( \mathbf{d}_t \) of distances we are going to put it through a matrix \( \mathbf{M} \) which is going to transform it into a vector of weight values \( \mathbf{w}_t \)...

\begin{align*} \mathbf{w}_t &= \mathbf{M}\ \mathbf{d}_t \end{align*}

where each value in the vector is the weight associated with each of our \( n \) animations in our set.

\begin{align*} \mathbf{w}_t &= \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_n \\ \end{bmatrix} \end{align*}

If we can find an appropriate matrix \( \mathbf{M} \) which performs this mapping in the way we want then it will be easy for us to compute the blend weights from the point in the parameter space.

But how do we find such an \( \mathbf{M} \)? Well, what we are going to do is set things up as a linear equation and solve for the \( \mathbf{M} \) which has the behavior we want. To do this we need to stack all of our examples of desired inputs as columns of a matrix to the right of \( \mathbf{M} \), and all the associated examples of desired outputs as columns of a matrix on the left.

So on the right we will have a matrix made up of the distance vectors computed at the parameters associated with each of the animations in the set, stacked into columns (in other words: the pair-wise distance matrix of our animation parameters):

\begin{align*} \mathbf{D} &= \begin{bmatrix} \mathbf{d}_0\ \mathbf{d}_1\ \ldots\ \mathbf{d}_n \end{bmatrix} \end{align*}

And on the left will be the weight vector we want to associate with the parameters for each of the animations in the set, stacked into columns.

\begin{align*} \mathbf{W} &= \begin{bmatrix} \mathbf{w}_0\ \mathbf{w}_1 \ldots\ \mathbf{w}_n \end{bmatrix} \end{align*}

In this case, we will say that we want the weight of the animation for the associated parameters to be one, and the weight of all the other animations to be zero.

\begin{align*} \mathbf{W} &= \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ \end{bmatrix} = \mathbf{I} \end{align*}

This gives us the following equation, where \( \mathbf{M} \) is the matrix we want to solve for:

\begin{align*} \mathbf{W} &= \mathbf{M}\ \mathbf{D} \end{align*}

But how do we actually solve this linear system? Algebraically it's simple: we solve this by multiplying by \( \mathbf{D}^{-1} \) on the right:

\begin{align*} \mathbf{W} &= \mathbf{M}\ \mathbf{D} \\ \mathbf{W}\ \mathbf{D}^{-1} &= \mathbf{M}\ \mathbf{D}\ \mathbf{D}^{-1} \\ \mathbf{W}\ \mathbf{D}^{-1} &= \mathbf{M} \end{align*}

But that brings up another question, how do we compute the inverse of \( \mathbf{D} \), or more to the point, how do we compute \( \mathbf{W}\ \mathbf{D}^{-1} \)?

The easiest way to compute this is to pull in some kind of linear algebra library such as Eigen which provides methods for such things out-of-the-box - and this is probably what most game developers do. However that's another dependency, which for a header-only library like Eigen can mean a big hit to compile times, and perhaps another 10MB or more added to the binary if you're not careful.
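
For reference, with a library like Eigen the whole solve collapses to a few lines. This is a sketch only - since \( \mathbf{W} \) is the identity and \( \mathbf{D} \) is symmetric, \( \mathbf{M} = \mathbf{W}\ \mathbf{D}^{-1} \) is simply the inverse of the distance matrix:

#include <Eigen/Dense>

// Sketch: M = W * D^-1, where W is the identity, so M is just D^-1.
Eigen::MatrixXf fit_blend_matrix_eigen(const Eigen::MatrixXf& D)
{
    Eigen::Index n = D.rows();
    return D.partialPivLu().solve(Eigen::MatrixXf::Identity(n, n));
}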

I know that linear algebra can be a bit scary, but before reaching for libraries, I think it's worth taking a look at the code listings in books like Numerical Recipes in C. These are really not nearly as scary (or as many lines of code) as you might think. Yes, understanding and transcribing the code is a bit tedious (and annoyingly all the code in the listings uses one-based indexing), but, for example, LU decomposition (one of, if not the, most efficient methods for matrix inversion) can be implemented in just two functions and about 100 lines of code.

First, the following function computes an LU decomposition in place, writing it into matrix, the row ordering into row_order, and the per-row scaling factors into the temporary buffer row_scale:

bool mat_lu_decompose_inplace(
    slice2d<float> matrix,     // Matrix to decompose in-place
    slice1d<int> row_order,    // Output row order
    slice1d<float> row_scale)  // Temp row scale
{
    assert(matrix.rows == matrix.cols);
    assert(matrix.rows == row_order.size);
    assert(matrix.rows == row_scale.size);
    
    int n = matrix.rows;
    
    // Compute scaling for each row
    
    for (int i = 0; i < n; i++)
    {
        float vmax = 0.0f;
        for (int j = 0; j < n; j++)
        {
            vmax = maxf(vmax, fabs(matrix(i, j)));
        }
        
        if (vmax == 0.0) { return false; }
        
        row_scale(i) = 1.0 / vmax;
    }
    
    // Loop over columns using Crout's method
    
    for (int j = 0; j < n; j++)
    {
        for (int i = 0; i < j; i++)
        {
            float sum = matrix(i, j);
            for (int k = 0; k < i; k++)
            {
                sum -= matrix(i, k) * matrix(k, j);
            }
            matrix(i, j) = sum;
        }
        
        // Search Largest Pivot
        
        float vmax = 0.0f;
        int imax = -1;
        for (int i = j; i < n; i++)
        {
            float sum = matrix(i, j);
            for (int k = 0; k < j; k++)
            {
                sum -= matrix(i, k) * matrix(k, j);
            }
            matrix(i, j) = sum;
            float val = row_scale(i) * fabs(sum);
            if (val >= vmax)
            {
                vmax = val;
                imax = i;
            }
        }
        
        if (vmax == 0.0) { return false; }
        
        // Interchange Rows
        
        if (j != imax)
        {
            for (int k = 0; k < n; k++)
            {
                float val = matrix(imax, k);
                matrix(imax, k) = matrix(j, k);
                matrix(j, k) = val;
            }
            row_scale(imax) = row_scale(j);
        }
        
        // Divide by Pivot
        
        row_order(j) = imax;
        
        if (matrix(j, j) == 0.0) { return false; }

        if (j != n - 1)
        {
            float val = 1.0 / matrix(j, j);
            for (int i = j + 1; i < n; i++)
            {
                matrix(i, j) *= val;
            }
        }
    }
    
    return true;
}

Then, given an LU decomposition, the following function can be used to compute the inverse of the decomposed matrix multiplied by the vector vector (in other words, it solves the corresponding linear system). It writes the result back into vector in-place.

void mat_lu_solve_inplace(
    slice1d<float> vector,
    const slice2d<float> decomp, 
    const slice1d<int> row_order)
{
    assert(decomp.rows == decomp.cols);
    assert(decomp.rows == row_order.size);
    assert(decomp.rows == vector.size);
    
    int n = decomp.rows;
    int ii = -1;
    
    // Forward Substitution
    
    for (int i = 0; i < n; i++)
    {
        float sum = vector(row_order(i));
        vector(row_order(i)) = vector(i);
        
        if (ii != -1)
        {
            for (int j = ii; j <= i-1; j++)
            {
                sum -= decomp(i, j) * vector(j);
            }
        }
        else if (sum)
        {
            ii = i;
        }
        
        vector(i) = sum;
    }
    
    // Backward Substitution
    
    for (int i = n - 1; i >= 0; i--)
    {
        float sum = vector(i);
        for (int j = i + 1; j < n; j++)
        {
            sum -= decomp(i, j) * vector(j);
        }
        vector(i) = sum / decomp(i, i);
    }
}

This is really all we need - we want to compute \( \mathbf{W}\ \mathbf{D}^{-1} \), not \( \mathbf{D}^{-1} \) itself, and we can do that by plugging the columns of \( \mathbf{W} \) into vector in the function above.

So to summarize: to get the matrix \( \mathbf{M} = \mathbf{W}\ \mathbf{D}^{-1} \), we need to apply LU decomposition to \( \mathbf{D} \), and then solve for each column of \( \mathbf{W} \) using the above method (which in this special case will just be columns of the identity matrix):

void fit_blend_matrix(
    slice2d<float> blend_matrix,
    const slice2d<float> animation_parameters)
{    
    int nanims = animation_parameters.rows;
    int nparams = animation_parameters.cols;
    
    // Compute Pairwise Distances

    array2d<float> distances(nanims, nanims);
    
    for (int i = 0; i < nanims; i++)
    {
        for (int j = 0; j < nanims; j++)
        {
            distances(i, j) = 0.0f;
            for (int k = 0; k < nparams; k++)
            {
                distances(i, j) += squaref(
                    animation_parameters(i, k) - 
                    animation_parameters(j, k));
            }
            distances(i, j) = sqrtf(distances(i, j));
        }
    }
    
    // Subtract epsilon from the diagonal - this helps the stability
    // of the decomposition and solve
    
    for (int i = 0; i < nanims; i++)
    {
        distances(i, i) -= 1e-4;
    }
    
    // Decompose in place
    
    array1d<int> row_order(nanims);
    array1d<float> row_scale(nanims);
    
    bool success = mat_lu_decompose_inplace(distances, row_order, row_scale);
    assert(success);
    
    // Write associated blend weights into blend matrix
    
    for (int i = 0; i < nanims; i++)
    {
        for (int j = 0; j < nanims; j++)
        {
            blend_matrix(i, j) = i == j ? 1.0f : 0.0f;
        }
    }
    
    // Solve for blend matrix in-place
    
    for (int i = 0; i < nanims; i++)
    {
        mat_lu_solve_inplace(blend_matrix(i), distances, row_order);
    }
}

(Note: the code here actually solves for rows instead of columns, but because \( \mathbf{D} \) is symmetric, and \( \mathbf{W} \) is just the identity, the result is the same.)
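
To see why, transpose both sides of the equation and use the fact that \( \mathbf{D} \) is symmetric:

\begin{align*} \mathbf{W} = \mathbf{M}\ \mathbf{D} \quad\Leftrightarrow\quad \mathbf{W}^T = \mathbf{D}^T\ \mathbf{M}^T = \mathbf{D}\ \mathbf{M}^T \end{align*}

So each row of \( \mathbf{M} \) can be found by solving the system against the corresponding row of \( \mathbf{W} \).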

Here, the output blend_matrix will contain our matrix \( \mathbf{M} \), animation_parameters contains the parameters of each animation stacked into rows, distances is our distance matrix \( \mathbf{D} \), and blend_matrix is also where we initially write our weights matrix \( \mathbf{W} \), since the solve is done in-place.

Now that we've computed \( \mathbf{M} \), given a new point in the parameter space \( \mathbf{p}_t \), we simply compute the distances to all the other points in the parameter space \( \mathbf{d}_t \), and multiply this vector of distances by \( \mathbf{M} \). This gives us a vector of weights \( \mathbf{w}_t \) to use in our weighted blend of animation data.

void compute_query_distances(
    slice1d<float> query_distances, 
    const slice2d<float> animation_parameters, 
    const slice1d<float> query_parameters)
{
    assert(query_distances.size == animation_parameters.rows);
  
    for (int i = 0; i < animation_parameters.rows; i++)
    {
        query_distances(i) = 0.0f;
        for (int j = 0; j < animation_parameters.cols; j++)
        {
            query_distances(i) += squaref(
                query_parameters(j) - animation_parameters(i, j));
        }
        query_distances(i) = sqrtf(query_distances(i));
    }
}

void compute_blend_weights(
    slice1d<float> blend_weights,
    const slice1d<float> query_distances,
    const slice2d<float> blend_matrix)
{
    mat_mul_vec(blend_weights, blend_matrix, query_distances);
}
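
The mat_mul_vec helper isn't listed in this article, but it's nothing more than a plain matrix-vector multiply. A minimal version, assuming the same slice types as the code above, might look like this:

void mat_mul_vec(
    slice1d<float> output,
    const slice2d<float> matrix,
    const slice1d<float> vector)
{
    assert(output.size == matrix.rows);
    assert(vector.size == matrix.cols);
    
    for (int i = 0; i < matrix.rows; i++)
    {
        output(i) = 0.0f;
        for (int j = 0; j < matrix.cols; j++)
        {
            output(i) += matrix(i, j) * vector(j);
        }
    }
}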

We will want to set any weights that are less than zero back to zero, and normalize the result so that all the weights sum to one. This ensures we always have a convex combination of weights for our interpolation.

void clamp_normalize_blend_weights(slice1d<float> blend_weights)
{
    float total_blend_weight = 0.0f;
    for (int i = 0; i < blend_weights.size; i++)
    {
        blend_weights(i) = maxf(blend_weights(i), 0.0f);
        total_blend_weight += blend_weights(i);
    }
    
    // Guard against division by zero in case every weight was clamped
    total_blend_weight = maxf(total_blend_weight, 1e-8f);
    
    for (int i = 0; i < blend_weights.size; i++)
    {
        blend_weights(i) /= total_blend_weight;
    }
}
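
Putting the runtime side together, evaluating the blend space for a new point in the parameter space then looks something like the following sketch (how you apply the final weights to your animation data is up to you):

void evaluate_blend_space(
    slice1d<float> blend_weights,
    slice1d<float> query_distances,        // temp buffer of size nanims
    const slice1d<float> query_parameters,
    const slice2d<float> animation_parameters,
    const slice2d<float> blend_matrix)
{
    // Distances from the query point to each animation's parameters
    compute_query_distances(query_distances, animation_parameters, query_parameters);
    
    // Map distances to weights using the pre-computed matrix M
    compute_blend_weights(blend_weights, query_distances, blend_matrix);
    
    // Clamp negative weights and re-normalize so they sum to one
    clamp_normalize_blend_weights(blend_weights);
    
    // blend_weights can now be used for a weighted blend of the animation
    // data, e.g. using the quaternion weighted average from the previous article.
}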

Here is a little comparison between triangulation and this method (called "interpolation" in the video). Here we use a 2D parameter space where one axis is the animation's average speed and the other is its average turning rate.

We can get a better idea of the difference by visualizing the blend weight associated with each animation in the space. Here you can see that this method produces blend weights which change much more smoothly, without the artifacts you get from thin triangles and barycentric interpolation.

Because this method is based on linear algebra, it works almost exactly the same no matter how many dimensions the parameter space has. Here is another example where I use sample points from a trajectory of future positions and directions as the parameters (like you might use in Motion Matching). Given the 2D location and direction of three future points, this creates a 12-dimensional parameter space. This would be totally impossible with triangulation, but works fine with the proposed method (although we cannot visualize the space so easily).

Assuming you don't have very many animations in your space, computing \( \mathbf{M} \) is actually pretty fast, meaning it can potentially be done every frame. That allows for dynamic blend spaces where the parameters change according to the state of the animations in the set.

For example, here, instead of the parameter of the blend being the average root velocity of the whole animation, I set it to the root velocity of the currently playing frame of each animation in the set.
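
As a sketch, a dynamic blend space like this just means re-filling the parameters and re-fitting \( \mathbf{M} \) each frame. Here sample_root_velocity is a hypothetical function that looks up the root velocity of a given animation at a given playback time, vec3 is an assumed vector type, and the parameter space is assumed to be the 2D planar root velocity:

void refit_dynamic_blend_space(
    slice2d<float> blend_matrix,
    slice2d<float> animation_parameters,
    const slice1d<float> animation_times)   // current playback time of each animation
{
    for (int i = 0; i < animation_parameters.rows; i++)
    {
        // Hypothetical: sample the root velocity of animation i at its
        // current playback time and use it as that animation's parameters
        vec3 velocity = sample_root_velocity(i, animation_times(i));
        animation_parameters(i, 0) = velocity.x;
        animation_parameters(i, 1) = velocity.z;
    }
    
    // Re-fit the blend matrix for the updated parameters
    fit_blend_matrix(blend_matrix, animation_parameters);
}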

This works for the trajectory-based parameterization too.

Re-triangulation is not expensive either, so it could also be done every frame if required, but because it isn't stable the results are jumpy:

Computationally this method can be more expensive than triangulation at runtime, simply because we are no longer limited to blending just three animations - potentially every animation in the set gets a non-zero weight. But this is somewhat the price you have to pay for that smoother interpolation.

There is one big potential problem with this method, which is revealed if we plot the point in the parameter space found by taking our predicted weights and doing the weighted sum of the parameters themselves (here shown in blue):

We can see that unlike with triangulation the weights produced by this method, when used as a weighted sum of our parameters, don't actually map exactly back to the point we asked for. And this reveals some other potential issues with it. When we move our target parameters far outside of where we have data we can see that the extrapolation behavior tends to spread out the weighting evenly across all animations, pulling the resulting animation toward the average.

We get a similar thing for the trajectory based parameterization:

Here the target trajectory is in green, while the actual trajectory we get (when using the predicted blend weights) is in blue.

This effect is both bad and good - good because it means this method extrapolates "safely" - it just goes toward the average when we ask for something far away from any of our data - bad because it introduces some pretty counter-intuitive effects. For example, if we ask for a speed faster than our fastest animation, it is likely we will actually get an animation which is slower than our fastest animation, as it will target something closer to the average.

One simple thing you can do (in 2D at least) is to use the triangulation to project onto the convex hull before computing the weights. This removes some of the weird behavior in terms of extrapolation but does not solve the issue of parameters not matching within the data distribution and does not scale to higher dimensions.
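
For the 2D case, a rough sketch of that projection might look like the following, reusing the float2 type from the earlier sketch and assuming hull_points holds the convex hull vertices in counter-clockwise order:

// Project a 2D query point onto the convex hull of the animation parameters
// before computing distances. Returns the point unchanged if it is inside.
float2 project_onto_convex_hull(float2 point, const slice1d<float2> hull_points)
{
    int n = hull_points.size;
    
    // Check if the point is already inside the hull
    bool inside = true;
    for (int i = 0; i < n; i++)
    {
        float2 a = hull_points(i);
        float2 b = hull_points((i + 1) % n);
        float cross = (b.x - a.x) * (point.y - a.y) - (b.y - a.y) * (point.x - a.x);
        if (cross < 0.0f) { inside = false; break; }
    }
    
    if (inside) { return point; }
    
    // Otherwise find the closest point on the hull boundary
    float best_dist = FLT_MAX;
    float2 best_point = point;
    for (int i = 0; i < n; i++)
    {
        float2 a = hull_points(i);
        float2 b = hull_points((i + 1) % n);
        float2 ab = { b.x - a.x, b.y - a.y };
        float2 ap = { point.x - a.x, point.y - a.y };
        
        float t = (ap.x * ab.x + ap.y * ab.y) / (ab.x * ab.x + ab.y * ab.y);
        t = t < 0.0f ? 0.0f : (t > 1.0f ? 1.0f : t);
        
        float2 closest = { a.x + t * ab.x, a.y + t * ab.y };
        float dist = squaref(point.x - closest.x) + squaref(point.y - closest.y);
        
        if (dist < best_dist) { best_dist = dist; best_point = closest; }
    }
    
    return best_point;
}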

Now... trust me when I say that I spent a really long time trying to come up with a simple solution to this issue which did not somehow involve Neural Networks, but unfortunately at some point I had to concede.

The thing is, for better or worse, this problem is actually a pretty good fit for Neural Networks, because we have a function we want to fit (the function mapping from parameters to weights), which we need to be fast to evaluate, and which we want to make subject to a number of relatively simple constraints:

  1. For each animation in our set, if we input its parameters, the output weights must be one for that animation and zero everywhere else.
  2. For any given input, the sum of the output weights must equal one.
  3. For any given input, the output weights must be positive.
  4. For any given input, doing the weighted sum of the animation parameters using the output weights must produce something close to the original input.

If we code up these constraints as loss functions, we can quite easily train a very small Neural Network to perform this task in less than 200 lines of PyTorch. The most important bit looks like this (full code is here):

# Sample random points in the parameter space
Prand = (Pmax - Pmin) * torch.rand(batchsize, nparams, dtype=torch.float32) + Pmin

# Compute the distances for those random points
Drand = torch.sqrt(torch.sum(torch.square(P[None,:] - Prand[:,None]), dim=-1))

# Concatenate to the points and distances from our animations
Pbatch = torch.cat([P, Prand], dim=0)
Dbatch = torch.cat([D, Drand], dim=0)

# Compute the weights by going through the neural network
W = (network((Dbatch - network_mean_in) / network_std_in) * 
    network_std_out + network_mean_out)

# Compute the normalized weights (i.e positive and sum to one)
Wnorm = torch.maximum(W, torch.zeros_like(W))
Wnorm = Wnorm / torch.maximum(
    Wnorm.sum(dim=-1)[...,None], 1e-4 * torch.ones_like(Wnorm[:,:1]))

# Loss to make weights at the animations 1 for that anim and 0 elsewhere
loss_data = torch.mean(torch.abs(W[:nanims] - torch.eye(nanims)))

# Loss to make the weights sum to one
loss_sum = 0.01 * torch.mean(torch.abs(W.sum(dim=-1) - 1))

# Loss to make the weights positive
loss_pos = 0.01 * torch.mean(torch.abs(torch.minimum(W, torch.zeros_like(W))))

# Loss to make the input match the output in parameter space
loss_proj = 0.1 * torch.mean(
    torch.sqrt(torch.sum(torch.square((Wnorm @ P) - Pbatch), dim=-1) + 1e-4))

# Add together losses
loss = loss_data + loss_sum + loss_pos + loss_proj

Just like before, the input to this network will be the distances to all the animations in the parameter space, and the output will be the weights. So it's a drop-in replacement to the matrix \( \mathbf{M} \) we used in the previous method.

The result is something that much more closely matches our expectations, and takes just a few microseconds to evaluate:

Interestingly, we can also control how the input projects onto the convex hull by changing the distance function used in constraint 4. In the above example I use the Euclidean distance between the original input and the reconstructed weighted sum of the animation parameters. But if I switch to the Manhattan distance...

loss_proj_l1 = 0.1 * torch.mean(
    torch.mean(torch.abs((Wnorm @ P) - Pbatch), dim=-1))

...we can see it will project in a way which minimizes the Manhattan distance instead:

This could be useful. For example, if we know one of our axes in parameter space represents an angle, we could make sure this axis uses the angular distance to get a projection which "wraps around" onto the other side when required.

This Neural Network based method works identically in higher dimensional parameter spaces, which we can see with our trajectory-based parameterization.

It might look like the result here is not really matching the input, but I think what is happening is that in higher dimensions it becomes increasingly unlikely for a point to lie completely inside the convex hull. That means we are almost always projecting onto the convex hull by at least some small amount, whatever our input is - which would explain why we always see a difference.

The big downside of the Neural Network based method is that in my experiments it takes at least 15 minutes to train, which is kind of a disaster for iteration times, and also means that it is more difficult to get working with dynamic blend spaces where the animation's parameters change on a frame-by-frame basis.

But given the simplicity of the task and the small size of the network, I imagine that with some tweaking it should be possible to get the training time down to under a minute (in which case this might be a more viable option), but if anyone knows a better solution to this problem which does not require as much pre-computation I would love to hear what it is!

I think it's also worth noting that everything I've discussed today doesn't really have anything to do with animation! These are just ways of getting a set of blend weights when you have some data points in some parameter space. What you actually blend with those weights is entirely up to you!

You can find the full source code for this article here - and there is a web-demo you can play with here.

So that's everything I have for you today. I hope you found this interesting and maybe it stirred up some other ideas for how we might do animation blending and implement blend spaces in new ways. And as usual, thanks for reading.
