A group of scientists from Sweden has developed an artificial intelligence model called Dessie, designed to translate the body language of horses into a format that humans can understand. The system is built on machine learning and synthetic training images.
Image source: Helena Lopes / unsplash.com
During clinical examinations, veterinarians often rely on visual cues given by animals, but this is not always reliable: a horse may mask pain by shifting load onto another leg, redistributing its weight, or changing its posture. Its behavior may indicate orthopedic problems, behavioral disorders, or signs of injury. Traditional diagnostic tools, including X-rays and MRI, only show a problem after it has already developed. Dessie’s goal is to read a horse’s body language and spot signs of trouble early.
As it runs, the model converts flat images into 3D reconstructions in real time that capture the horse’s shape, pose, and movement. This is not just a visualization, but an attempt to translate the expressive language of the body. Dessie was built using factor-separation learning. Traditional models funnel all the information (pose, shape, background, lighting) into a single representation, which can confuse the AI and make it hard to focus on what matters most: the horse itself. Factor-separation learning handles each factor separately: shape is encoded by one component, pose by another, and background noise that is not relevant to the task is set aside.
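To make the idea concrete, here is a minimal sketch of a factor-separated encoder: a shared backbone feeds three separate heads, so shape, pose, and nuisance factors (background, lighting) each get their own latent vector. The class name, layer sizes, and latent dimensions are illustrative assumptions, not Dessie’s published architecture.

```python
# Minimal sketch of factor-separation (disentangled) learning.
# All names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class DisentangledEncoder(nn.Module):
    def __init__(self, feat_dim=512, shape_dim=32, pose_dim=96, nuisance_dim=64):
        super().__init__()
        # Shared backbone: a small CNN standing in for a real image encoder.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        # Separate heads: each factor lives in its own latent space.
        self.shape_head = nn.Linear(feat_dim, shape_dim)        # body shape
        self.pose_head = nn.Linear(feat_dim, pose_dim)          # articulated pose
        self.nuisance_head = nn.Linear(feat_dim, nuisance_dim)  # background, lighting

    def forward(self, image):
        feat = self.backbone(image)
        return {
            "shape": self.shape_head(feat),
            "pose": self.pose_head(feat),
            "nuisance": self.nuisance_head(feat),
        }

# Only the shape and pose latents would drive the 3D horse reconstruction;
# the nuisance latent absorbs everything the task should ignore.
model = DisentangledEncoder()
latents = model(torch.randn(1, 3, 256, 256))
print({k: v.shape for k, v in latents.items()})
```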
The 3D reconstructions generated by Dessie are not only highly detailed but also reliable. The AI helps researchers isolate movement patterns without being distracted by surrounding objects or differences in lighting. Dessie does not require high-end cameras or markers on the horse’s body; an ordinary camera and basic video footage are enough. The technology can therefore be used by staff at rural clinics who do not have access to expensive imaging equipment.
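As a rough illustration of that marker-free workflow, the sketch below reads ordinary video frame by frame and hands each frame to a single-image reconstructor. Here `reconstruct_horse` is a hypothetical placeholder for the model’s inference call, not an actual Dessie API; the point is that a plain video file is the only input.

```python
# Sketch of marker-free capture from ordinary video, under the assumption
# that a single-image reconstructor (here `reconstruct_horse`) is available.
import cv2

def analyze_video(path, reconstruct_horse):
    record = []  # one entry per frame: estimated shape and pose parameters
    cap = cv2.VideoCapture(path)
    frame_idx = 0
    while True:
        ok, frame_bgr = cap.read()
        if not ok:
            break
        frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
        result = reconstruct_horse(frame_rgb)  # e.g. {"shape": ..., "pose": ...}
        record.append({"frame": frame_idx, **result})
        frame_idx += 1
    cap.release()
    return record
```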
To train the AI, the researchers needed massive amounts of visual data. Since collecting real images of different horse breeds in different poses and lighting conditions is difficult, they developed a synthetic data generator called DessiePIPE. It can create an unlimited number of horse images using a 3D model and AI-generated textures based on the characteristics of different breeds. This allowed the team to teach Dessie the specifics of horse movement without having to study thousands of real animals: DessiePIPE renders horses walking, eating, rearing, or resting against a variety of backgrounds and lighting conditions. The system also creates pairs of images for comparison that differ in just one parameter, such as shape or pose, so the model learns to spot subtle differences. As a result, Dessie learned to recognize small changes in movement and became better at generalizing to new conditions.
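The pairing idea can be sketched as a toy generator that produces two parameter sets identical in every factor except one. The factor names, value ranges, and function layout below are assumptions for illustration and do not reflect DessiePIPE’s actual interface.

```python
# Toy sketch of the paired-sample idea: two scene descriptions that differ
# in exactly one factor (pose OR shape). Names and ranges are illustrative.
import random

POSES = ["walking", "eating", "rearing", "resting"]
BACKGROUNDS = ["pasture", "stable", "arena"]

def sample_factors(rng):
    # One synthetic "horse scene" described by separate factors.
    return {
        "shape": [rng.uniform(-1, 1) for _ in range(10)],  # breed/shape coefficients
        "pose": rng.choice(POSES),
        "background": rng.choice(BACKGROUNDS),
        "lighting": rng.uniform(0.2, 1.0),
    }

def make_pair(rng, changed_factor):
    # Copy every factor, then vary only the requested one.
    a = sample_factors(rng)
    b = dict(a)
    if changed_factor == "pose":
        b["pose"] = rng.choice([p for p in POSES if p != a["pose"]])
    elif changed_factor == "shape":
        b["shape"] = [rng.uniform(-1, 1) for _ in range(10)]
    return a, b

rng = random.Random(0)
a, b = make_pair(rng, changed_factor=rng.choice(["pose", "shape"]))
# A renderer would turn `a` and `b` into two images; training the model to
# tell which single factor differs encourages disentangled latent codes.
```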
Horses signal pain through subtle changes in gait and posture that only an experienced veterinarian can notice. Dessie translates these signals into objective 3D data, helping to identify problems early. It creates a digital record of the animal’s posture and movement that can be reviewed repeatedly, tracked over time, and shared with other clinics. Although Dessie was trained on synthetic data, the AI works effectively with real images: only 150 annotated real images were needed to fine-tune the system. This set was enough for Dessie to outperform state-of-the-art models in test tasks: when detecting keypoints such as joints and other important features, the system produced better results than MagicPony and Farm3D. Dessie also predicts body shape and movement more accurately, which matters for diagnosing lameness or muscle asymmetry. Its performance improved further with larger training datasets, thanks to the benefits of factor-separation learning.
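One plausible way such an adaptation could look is a short fine-tuning loop over the small annotated real set using a 2D keypoint loss. The model interface, dataset layout, and hyperparameters below are assumptions for illustration, not the authors’ actual training code.

```python
# Sketch of adapting a synthetically pre-trained model with a small set of
# annotated real images (~150 in the article) via a 2D keypoint loss.
# `pretrained_model` and the dataset format are assumed for illustration.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def finetune(pretrained_model, real_dataset, epochs=20, lr=1e-4):
    loader = DataLoader(real_dataset, batch_size=8, shuffle=True)
    optimizer = torch.optim.Adam(pretrained_model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    pretrained_model.train()
    for _ in range(epochs):
        for images, keypoints_gt in loader:   # keypoints_gt: (B, K, 2) joint locations
            pred = pretrained_model(images)   # assumed to return (B, K, 2) keypoints
            loss = criterion(pred, keypoints_gt)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return pretrained_model
```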
Dessie was created to analyze horses, but the system’s architecture is flexible enough to produce high-quality results for other, similar animals such as cows, zebras, and deer. The model successfully reconstructed them in 3D despite never being trained directly on these species. This opens up significant potential for wildlife conservation: the system can study rare species using only existing photographs and videos, without the need for invasive monitoring. Dessie has also handled artistic images well, including paintings and cartoons, from which it can build accurate 3D models.
The system does have its shortcomings, however. It works best when there is only one horse in the frame, and it struggles with unusual body shapes that were not in the training data. This is something the newer VAREN model, which supports a wider variety of shapes, should address. Overall, Dessie is easy to use: it analyzes a horse’s body language and translates it into objective data that humans can interpret, taking human-animal communication to a new level.