Using machine learning to detect signs of life in complex video data

The Challenge

To develop a machine learning algorithm capable of identifying potential signs of life, the research team began by labeling video footage sourced from real field water samples and lab-grown specimens. While the act of labeling was relatively simple, building and maintaining an in-house video annotation platform proved too demanding for the team’s limited engineering resources.

The Approach

They turned to Databrewery’s video annotation tools, which made it easy to track object movements over time and flag patterns that might indicate biological activity. The platform’s flexibility also allowed them to tailor the editor to their specific scientific use case without custom development or infrastructure overhead.

The Outcome

The team achieved a 5x speed improvement in customizing their labeling environment. What was originally budgeted to take a full week was completed in just one day. With the most labor-intensive video annotation tasks handled quickly, the team could shift focus back to training their detection models and advancing their search for life-supporting conditions.

Labeling

Earth’s oceans are filled with microscopic life, from bacteria to protozoa, and scientists believe similar organisms could exist elsewhere in the solar system. In recent years, water vapor and ice have been discovered on moons like Europa and Enceladus. A mission is now being prepared to explore these icy worlds further, and the spacecraft will carry instruments designed to collect video data from potential water samples using onboard microscopes. If that video reveals microbial movement, it could be the first direct evidence of extraterrestrial life.

Transmitting that data back to Earth, however, poses an enormous challenge. These moons are so distant that the cost of “downlinking” video footage is extremely high: less than 0.01% of the captured data can be returned, a bottleneck so severe that simply compressing and sending the footage is unworkable. For perspective, transmitting just 40 MB of compressed video could consume a third of the mission’s total science data return budget.
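To get a rough sense of that scale, the two figures quoted above can be combined into a back-of-the-envelope estimate. The numbers below are purely illustrative, derived only from the percentages in this article, not from mission specifications:

```python
# Back-of-the-envelope scale of the downlink bottleneck (illustrative only).
clip_size_mb = 40.0             # compressed clip size quoted above
budget_fraction = 1.0 / 3.0     # fraction of the science return budget it would consume
total_return_mb = clip_size_mb / budget_fraction
print(f"Implied total science data return budget: ~{total_return_mb:.0f} MB")

return_ratio = 0.0001           # "less than 0.01%" of captured data can be returned
captured_mb = total_return_mb / return_ratio
print(f"Implied raw data captured onboard: ~{captured_mb / 1e6:.1f} TB or more")
```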

To address this, the Machine Learning Instrument Autonomy (MLIA) team designed a machine learning model capable of detecting life-like motion onboard. The system identifies the video clips most likely to show biological activity, captures short segments, and pairs each with a summary of how the decision was made, allowing mission planners to prioritize which samples are sent home.
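The team’s actual onboard pipeline is not published here, but the selection step can be pictured as a simple budgeted triage: score each clip for life-like motion, keep the highest-scoring clips that fit within the downlink allowance, and attach a short human-readable rationale to each. The sketch below is a hypothetical illustration; the scoring values, field names, and budget are placeholders, not the MLIA team’s code:

```python
from dataclasses import dataclass

@dataclass
class Clip:
    clip_id: str
    size_mb: float
    motion_score: float   # placeholder: model's confidence that the motion is life-like
    rationale: str        # short explanation attached for mission planners

def select_for_downlink(clips: list[Clip], budget_mb: float) -> list[Clip]:
    """Greedily keep the most promising clips that fit within the downlink budget."""
    chosen, used = [], 0.0
    for clip in sorted(clips, key=lambda c: c.motion_score, reverse=True):
        if used + clip.size_mb <= budget_mb:
            chosen.append(clip)
            used += clip.size_mb
    return chosen

# Hypothetical example: three candidate clips and a 50 MB downlink allowance.
candidates = [
    Clip("sample_017", 40.0, 0.92, "sustained directed motion against background flow"),
    Clip("sample_042", 25.0, 0.61, "intermittent motion, possible Brownian drift"),
    Clip("sample_003", 10.0, 0.18, "no tracked objects exceeded drift threshold"),
]
for clip in select_for_downlink(candidates, budget_mb=50.0):
    print(clip.clip_id, clip.rationale)
```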

The model had to meet strict hardware and scientific requirements. It needed to run on spacecraft-grade processors (comparable to mobile CPUs), track motion over time, differentiate life-like behavior from random drift, and, crucially, explain its reasoning. Because deep learning methods are compute-intensive and difficult to interpret, the team turned to more traditional ML models, including decision trees and support vector machines (SVMs), to maintain transparency and efficiency.
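For a sense of what a lightweight, interpretable classifier of this kind can look like, here is a minimal sketch using scikit-learn. The motion features, labels, and training data are invented for illustration and are not the team’s actual feature set or model:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical hand-crafted motion features per tracked particle:
#   mean speed, straightness (net displacement / path length), turn-angle variance.
feature_names = ["mean_speed", "straightness", "turn_angle_var"]

# Toy training data: rows are tracked particles; 1 = life-like motion, 0 = drift/debris.
X = np.array([
    [12.0, 0.85, 0.10],
    [ 9.5, 0.78, 0.15],
    [ 0.8, 0.10, 0.90],
    [ 1.2, 0.15, 0.85],
    [11.0, 0.40, 0.30],
    [ 0.5, 0.05, 0.95],
])
y = np.array([1, 1, 0, 0, 1, 0])

# A shallow decision tree keeps the decision logic small enough for constrained
# hardware and simple enough for scientists to inspect rule by rule.
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(clf, feature_names=feature_names))

# Classify a new track: fast, straight motion is flagged as life-like.
print(clf.predict([[10.0, 0.80, 0.12]]))   # -> [1]
```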

“One of the hardest parts is teaching a model what life looks like,” said Priya Desai, an ML researcher on the project. “Some microbes swim with intent, others drift. You also have random debris in the water that looks organic to a computer. Humans can tell the difference, but a model has to learn those patterns from scratch.”

Equally important was building trust with the scientists who would rely on the model. “We’re working closely with them so they understand how we’re training it and what it can and can’t do,” Desai added. “It’s not enough to get results. The results have to make sense to the people using them.” To make the model reliable, scientists were involved at every step. They helped review outputs, confirmed whether the right organisms were being tracked, and informed how the model evolved over time.

Early on, the MLIA team struggled with building in-house tools for labeling. Their ML researchers were writing Java GUIs just to tag microscopy videos and manually sorting through sample data, a slow and inefficient process. Because the team works on time-sensitive research grants, they needed a faster, scalable solution that didn’t require custom development for every project.

They turned to Databrewery to handle both platform and labeling support. The team uploaded video footage from their digital holographic microscopes and used Databrewery’s intuitive video annotation tools to track objects across time. Labelers were quickly trained and could identify movement patterns that might suggest life, significantly reducing the burden on the core research team.

“Databrewery was simple to integrate and saved us a week of setup; we were labeling on day one,” said Desai. “We customized the editor to exactly what we needed, uploaded our data, and got moving right away.”

Databrewery also provided the experienced labeling workforce the team needed to meet tight deadlines. “Turnaround was fast, accuracy was high, and we had the flexibility to evolve our workflows as our models improved. That kind of support is rare, and it made a measurable difference in our progress,” she added.

Now, with their annotation pipeline streamlined and their models improving with each iteration, the MLIA team is one step closer to helping NASA determine where and how life might exist beyond Earth.