Project Overview
This project aims to detect and raise alerts on anomalies in critical or legacy control systems (e.g., cranes) without any direct connection to the system, relying purely on CCTV-based introspection. This provides an air gap that protects the system from cyber attacks. However, vision-based state estimation introduces significantly more noise, which makes it harder for the model to tell whether a deviation is a true anomaly.
Methodology
In this work, we propose AWARE, a platform-agnostic framework for reliable anomaly detection in the real world. AWARE comprises two major components:
- A world model that learns the dynamics of the system and predicts the next state given the history of states and actions. By comparing the real-time prediction error with a precomputed reference, the model flags large behavioral discrepancies (e.g., a collision or dangerous movement).
- A latent estimator that estimates dynamics parameters from a window of states and actions. This serves as a complementary signal for detecting subtle changes in system dynamics (e.g., a degraded motor or a shift in the payload's center of mass).
When such anomalies are detected, the model alerts the operator in real time, enabling proactive expert intervention and preventing further damage. AWARE also supports damage introspection: decoding the estimated latent helps pinpoint the source of the issue.
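The two signals described above can be illustrated with a minimal sketch. This is not AWARE's implementation: the function names, the mean-plus-k-sigma reference, and the z-score drift measure are all assumptions chosen for illustration, and the predicted states and latent estimates are assumed to come from already-trained models.

```python
import math
import statistics


def calibrate_threshold(nominal_errors, k=3.0):
    """Precompute a reference from prediction errors on nominal data.

    Assumption: the reference is mean + k * std of nominal errors.
    The source only says a "precomputed reference" is used.
    """
    mean = statistics.mean(nominal_errors)
    std = statistics.pstdev(nominal_errors)
    return mean + k * std


def prediction_error(predicted_state, observed_state):
    """Euclidean distance between the predicted and the observed next state."""
    return math.dist(predicted_state, observed_state)


def latent_drift(latent_estimate, nominal_mean, nominal_std):
    """Largest per-dimension z-score of the estimated latent (hypothetical measure)."""
    return max(
        abs(z - m) / s
        for z, m, s in zip(latent_estimate, nominal_mean, nominal_std)
    )


def is_anomalous(error, error_threshold, drift, drift_threshold=3.0):
    """Flag an anomaly if either signal exceeds its reference."""
    return error > error_threshold or drift > drift_threshold


# Example with made-up numbers: calibrate on nominal errors, then check one step.
threshold = calibrate_threshold([0.8, 1.0, 1.2, 0.9, 1.1])
error = prediction_error([10.0, 5.0], [13.0, 9.0])
drift = latent_drift([1.0, 2.0], [1.0, 1.0], [0.5, 0.5])
print(is_anomalous(error, threshold, drift))
```

Keeping the two checks separate mirrors the description above: the prediction error reacts to abrupt events, while the latent drift is meant to catch gradual changes that a single-step error may miss.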
Results
- Our recent results demonstrate a 77% detection success rate for payload collisions and motor degradation using high-noise, camera-based state estimation.
- The trained model demonstrates highly accurate motion prediction, with about 3 degrees of error over a 2-second horizon.
- Our method also achieves a 96% success rate in identifying which motor is impaired, using only camera-based state estimation.
Unlike modern vision-language models (VLMs), which typically require large-scale cloud compute and incur high latency, AWARE is designed to operate in real time, fully on device (NVIDIA Jetson Orin), while consuming less than 200 W of power.
Future Work
It would be interesting to evaluate how well AWARE transfers to a completely different platform, for example, a quadruped robot. Such a robot produces higher-dimensional observations that pose many challenges for state estimation, such as self-occlusion. In addition, a moving robot implies a dynamic nominal condition that changes with the interaction between the robot and its environment. It also remains unclear whether AWARE can still detect anomalies reliably when the locomotion policy internally incorporates a degree of self-correction.
This project is still ongoing, and more aspects of the method are yet to be published.