Action recognition in motion-capture data

Motion capture data generally consists of context-free measurements of skeletal poses taken at discrete time intervals. It carries no information about which motions the data represent. In this project we aim to design an automated system that can compare new, previously unseen motion capture data with known motion templates, both classifying the new data and measuring how it differs from the matching template. This will allow the system to act as a virtual trainer, guiding an individual to perform actions that best mimic the templates. In this setting the templates are the target motions describing an exercise or technique.

We present a probabilistic approach to action recognition in motion capture data, to comparison of a motion with a template, and to guidance on how best to emulate that template, and we show how this can give good estimates of the intent behind a given action.

The proposed method can successfully recognize a particular type of exercise among a group of possible matches. It can also determine how two similar, but not identical, exercises match over time and give feedback on how one exercise should be corrected to better mimic the other.

The paper can be viewed here as a pdf.

The data and images mentioned in the paper can be downloaded here as a zip.


In physical therapy we generally have an exercise which is aimed at mobilizing the joints, tendons and muscles through predefined motions and stretches in a predefined order. This motion can be described verbally, through graphical illustrations or it can be demonstrated by a person experienced in the correct form of the exercises. Commonly the exercise is demonstrated and then the patient performs the same exercise while an expert observes and helps identify errors in how the exercise is performed and gives hints as to how the patient can correct these errors. If, however, an expert is not available, the patient receives no feedback and may end up performing the exercises in a less than optimal way. The exercise may even end up doing more harm than good.

In this project we investigate a method for comparing a patient's movements with previously recorded movements representing a correct exercise. Both movements will be in the form of motion capture data. This comparison will allow the system to tell the patient whether the exercise is correct or needs to be adjusted. If adjustments are required, the system should give the patient hints on how to correct the movements and perform the exercise more accurately.

Apart from serving as a guide to the correct form, the system can be used as a kind of training game in which the patient scores points for doing the exercise properly. For a stretching exercise, the optimal stretch could be increased slowly from one session to the next, so that the target keeps changing and the patient stays motivated to do otherwise repetitive and boring exercises. This seems like a reasonable expectation considering the multitude of home fitness games available on the market today.

It should be pointed out that this system in no way aims to compete with an actual instructor who can observe in great detail while physically manipulating the patient's joints to correct an intermediate pose. We aim only to improve on the current state of having no feedback whatsoever when an instructor is not available.

The methods described in this paper will be applicable to other forms of training as well, but since our primary focus is physical therapy, we place greater emphasis on going through the exercise motions than on arriving at some optimal end position. The path itself is the important part.

In this paper we decouple the registration of motion from the analysis of that motion. While motion capture is briefly described, it is not the subject of this text. The method is intended to be general, so that it could be used with any pre-recorded motion in the correct format. The focus will be on comparing newly recorded motion capture data with previously recorded template data.
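To make the idea of comparing a recording with a template more concrete, here is a minimal sketch of one common way to do it: build a matrix of frame-to-frame differences and find the lowest-cost monotone path through it. This is plain dynamic time warping, used here only as an illustration. It is not necessarily the probabilistic method of the paper. The frame representation (a tuple of joint values per frame) and the function names difference_matrix and best_path are assumptions made for this example.

```python
import math


def difference_matrix(template, recording):
    """Euclidean distance between every template frame and every recording frame.

    Each frame is assumed to be a tuple of joint values (hypothetical format).
    """
    return [[math.dist(t, r) for r in recording] for t in template]


def best_path(diff):
    """Return (total cost, path) of the cheapest monotone path through diff.

    The path starts at (0, 0), ends at the last cell, and each step moves
    one frame forward in the template, the recording, or both.
    """
    n, m = len(diff), len(diff[0])
    cost = [[math.inf] * m for _ in range(n)]
    cost[0][0] = diff[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            prev = min(
                cost[i - 1][j] if i > 0 else math.inf,
                cost[i][j - 1] if j > 0 else math.inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else math.inf,
            )
            cost[i][j] = diff[i][j] + prev

    # Walk back from the last cell, always taking the cheapest predecessor.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return cost[n - 1][m - 1], path


# Toy data: one joint value per frame; the recording lingers on the first pose.
template = [(0.0,), (1.0,), (2.0,)]
recording = [(0.0,), (0.0,), (1.0,), (2.0,)]
total, path = best_path(difference_matrix(template, recording))
print(total, path)
```

Each pair (i, j) on the returned path says that template frame i is matched with recording frame j. Under the same assumptions, the per-joint difference at each matched pair could serve as a starting point for corrective hints.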

A specific action recognized among three candidates. Note how the probability density is concentrated inside a single action.

A maximal likelihood (of match) path through a difference matrix.
Thomas Grønneløv,
Apr 12, 2011, 1:09 AM