ACE: Autonomous Conformity Evaluation of Tensor Data by Means of Novel L1 Norm Principal Component Analysis
An autonomous machine (or a collective of autonomous machines) takes in and operates on a multitude of sensed data. Data quality assurance is therefore a necessary condition for the operational assurance of autonomous machines. The project carries out basic research and develops novel mathematical methods that measure the "conformity" of each data point with respect to all other collected data in a blind, unsupervised, artificially intelligent way. The developed data-conformity evaluation schemes will process any given data set represented as a multi-way array (also known as a tensor, the higher-order generalization of a matrix) and convert each data entry to a continuous zero-to-one "alert value", where zero indicates highly conforming data and one indicates highly non-conforming data.
"Non-conforming" sensed data may represent critical, actionable information, e.g., internal system failure, sensor system failure, external interference, or data manipulation. For instance, an unmanned aerial vehicle flight controller gathers measurements from different sensors (accelerometer, gyroscope, GPS, barometric pressure, etc.), fuses them to make state predictions (e.g., using an extended Kalman filter), and makes control decisions based on these predictions. Non-conforming sensor data may lead to wrong control decisions, so identifying such data as they are generated could potentially prevent a crash. Our basic research plan is to develop new theory and mathematical tools that expose such non-conforming values based on recursively refined calculations of L1-norm data subspaces. The flagged values will then be forwarded to a human or machine analyst/decision-maker for appropriate action.
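To make the idea of subspace-based alert values concrete, the following is a minimal sketch and not the project's actual method. It assumes a matrix (two-way) data set whose columns are data points, approximates a single L1-norm principal component with a simple fixed-point iteration (which may stop at a local optimum), and scores each column by its distance from that component. The function names `l1_principal_component` and `alert_values`, the min-max scaling to the zero-to-one range, and the per-column (rather than per-entry) scoring are all illustrative assumptions. The recursive refinement and the tensor case described above are not covered.

```python
import numpy as np


def l1_principal_component(X, n_iter=100, seed=0):
    """Approximate one L1-norm principal component of X (D x N, columns = samples).

    Fixed-point iteration that seeks a unit vector q maximizing ||X^T q||_1.
    It may converge to a local optimum; this is an illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(X.shape[0])
    q /= np.linalg.norm(q)
    for _ in range(n_iter):
        b = np.sign(X.T @ q)      # sign of each sample's projection onto q
        b[b == 0] = 1.0           # break ties so the sign vector has no zeros
        q_new = X @ b
        norm = np.linalg.norm(q_new)
        if norm == 0:             # degenerate case: keep the current estimate
            break
        q_new /= norm
        if np.allclose(q_new, q):  # fixed point reached
            break
        q = q_new
    return q


def alert_values(X, n_iter=100, seed=0):
    """Map each column of X to a value in [0, 1] (assumed min-max scaling).

    0 means closest to the estimated L1 subspace, 1 means farthest from it.
    """
    q = l1_principal_component(X, n_iter=n_iter, seed=seed)
    residual = X - np.outer(q, q @ X)        # remove the component along q
    dist = np.linalg.norm(residual, axis=0)  # per-sample distance to span(q)
    span = dist.max() - dist.min()
    if span == 0:
        return np.zeros_like(dist)
    return (dist - dist.min()) / span


if __name__ == "__main__":
    # 21 points on the line x = y, plus one point far off that line.
    t = np.linspace(-2.0, 2.0, 21)
    X = np.hstack([np.vstack([t, t]), np.array([[3.0], [-3.0]])])
    print(np.round(alert_values(X), 3))
```

In this toy example the last column lies far from the line shared by the other points, so it receives the largest alert value. A per-entry score for tensor data, as the project describes, would need a more elaborate scheme than this per-column distance.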
Identification of non-conforming data entries will enhance our ability to rapidly identify problems during (i) testing of new artificially intelligent autonomous technologies and (ii) deployment of field-ready artificially intelligent technologies. As a general example, non-conforming values used to train tree-based learning and classification algorithms can increase the number of decision-tree branches associated with certain data attributes/features and lead to inaccurate feature selection criteria. The proposed theoretical framework will offer a blind, unsupervised way to identify inappropriate or faulty data independently of the nature of the original data set.