Detecting motion in video images

FreezeFrame does not track the animal. A tracking algorithm finds the location of the animal’s center, whereas for fear conditioning we want to detect minute motions of the limbs and head that can occur while the animal sits in the same location. For that, a more sensitive global motion-detection algorithm is required.

Detecting motion in a video stream is, in principle, very simple. Compare successive images: if they differ, motion has occurred; if they are the same, the animal is freezing.

Figure: Difference image (with motion).

In a computer-based system, however, the images must first be digitized. Each pixel is assigned a number proportional to its brightness. Two successive images are then subtracted from each other, pixel by pixel. If the animal has not moved, the images are identical and the difference at every pixel is 0. If the animal has moved, the differences are nonzero in the area where motion occurred.
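The subtraction step can be sketched in a few lines of Python. This is a minimal illustration, not FreezeFrame's code: the function names subtract_frames and any_nonzero, and the tiny 3x3 frames, are hypothetical and chosen only for the example. Frames are assumed to be lists of rows of brightness values from 0 to 255.

```python
def subtract_frames(previous, current):
    """Subtract two digitized frames pixel by pixel.

    Each frame is a list of rows; each row is a list of brightness
    values (0-255). Returns a difference image of the same shape.
    """
    if len(previous) != len(current):
        raise ValueError("frames must have the same number of rows")
    difference = []
    for row_prev, row_cur in zip(previous, current):
        if len(row_prev) != len(row_cur):
            raise ValueError("rows must have the same length")
        difference.append([cur - prev for prev, cur in zip(row_prev, row_cur)])
    return difference


def any_nonzero(difference):
    """Idealized (noise-free) test: True if any pixel changed at all."""
    return any(value != 0 for row in difference for value in row)


# Hypothetical 3x3 frames: one bright pixel (200) moves one step to the right.
frame_a = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
frame_b = [[10, 10, 10], [10, 10, 200], [10, 10, 10]]

print(subtract_frames(frame_a, frame_a))  # all zeros: no motion
print(subtract_frames(frame_a, frame_b))  # nonzero only where the pixel moved
```

The any_nonzero test is only valid for perfectly clean images, which real video never provides.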

In practice, there are several problems. First, there is invariably noise or “snow” in the digital conversion process. The same brightness can be encoded as 143 one time and 145 the next. Second, the light level in the room changes imperceptibly from second to second as the power-line voltage, and with it the output of the lights, drifts up and down. Third, the automatic gain control in the camera and the auto-iris opening can vary a little from moment to moment as they “hunt” for the correct level. All of these processes generate differences between successive digitized images that are unrelated to motion.
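A small simulation shows how these artifacts produce nonzero differences even when nothing in the scene moves. This is a sketch under stated assumptions: the digitize function, its gain and noise_amplitude parameters, and the uniform 8x8 scene are all hypothetical. Noise is modeled as a random offset of a few brightness levels, and a lighting or gain change as a simple multiplier.

```python
import random


def digitize(scene, gain=1.0, noise_amplitude=2, rng=None):
    """Digitize a static scene with a gain factor and random conversion noise.

    gain stands in for lighting, gain-control and iris variation;
    noise_amplitude is the largest random offset added to each pixel.
    Both are simplifying assumptions for illustration.
    """
    if rng is None:
        rng = random.Random()
    frame = []
    for row in scene:
        frame.append([
            max(0, min(255, round(value * gain)
                       + rng.randint(-noise_amplitude, noise_amplitude)))
            for value in row
        ])
    return frame


def count_changed(frame_a, frame_b):
    """Count the pixels whose digitized values differ between two frames."""
    return sum(
        1
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
        if a != b
    )


rng = random.Random(0)                  # fixed seed so runs are repeatable
scene = [[144] * 8 for _ in range(8)]   # a motionless, evenly lit scene

first = digitize(scene, rng=rng)
second = digitize(scene, rng=rng)
print(count_changed(first, second))     # noise alone changes many pixels

steady = digitize(scene, noise_amplitude=0)
brighter = digitize(scene, gain=1.02, noise_amplitude=0)
print(count_changed(steady, brighter))  # a 2% gain change alters every pixel
```

In this model, a gain change of only 2% turns a brightness of 144 into 147, so every pixel in the frame differs even though the animal has not moved.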

One common way to distinguish real motion from artifactual image differences is a simple threshold: throw away all pixel differences below a certain value. But this method can mistakenly exclude smaller movements of the animal and, at the same time, be fooled by high-amplitude noise or camera gain changes. FreezeFrame instead uses a statistical approach that examines the entire distribution of pixel differences. The distributions generated by real motion and by noise or gain changes are very different in character, and so are easily distinguished using statistical methods. The result is a nearly flawless motion detector that performs almost identically to a human observer.
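The contrast between a fixed threshold and a look at the whole distribution can be illustrated as follows. The statistic used here is an assumption made for the example and is not FreezeFrame's actual algorithm, which is not described in this text. It takes the median of the differences as the center of the distribution, which absorbs a shift shared by all pixels, and the median absolute deviation as the noise width, then reports the fraction of pixels lying far outside that width. The names threshold_count and outlier_fraction, the factor k, and the synthetic data are all hypothetical. A gain change is simplified to a uniform shift across an evenly lit scene.

```python
from statistics import median


def threshold_count(diffs, threshold):
    """Simple threshold: count pixels whose |difference| exceeds threshold."""
    return sum(1 for d in diffs if abs(d) > threshold)


def outlier_fraction(diffs, k=6.0):
    """Fraction of pixels far from the bulk of the difference distribution.

    The median is the center (it absorbs a shift shared by all pixels);
    the median absolute deviation is the spread (the width of the noise).
    A pixel counts as an outlier if it lies more than k spreads away.
    """
    center = median(diffs)
    deviations = [abs(d - center) for d in diffs]
    spread = max(median(deviations), 1.0)  # floor of 1 level avoids a zero spread
    return sum(1 for dev in deviations if dev > k * spread) / len(diffs)


# Synthetic difference values for 1000 pixels (flattened difference images).
noise_only = [(-2, -1, 0, 1, 2)[i % 5] for i in range(1000)]  # conversion noise
gain_change = [d + 20 for d in noise_only]                    # all pixels shift
small_motion = noise_only[:970] + [12] * 30                   # a limb moves

for name, diffs in [("noise", noise_only),
                    ("gain change", gain_change),
                    ("small motion", small_motion)]:
    print(name, threshold_count(diffs, 15), outlier_fraction(diffs))
# noise 0 0.0
# gain change 1000 0.0
# small motion 0 0.03
```

With a threshold of 15, the simple method reports motion at every pixel during the gain change and misses the small movement entirely. The distribution-based statistic ignores the uniform shift and flags the 3% of pixels that stand apart from the noise.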