The ML-HAS project aims to provide intelligence for a cloud management system. It supports automated fault prediction/detection, root cause analysis, and fault recovery by performing monitoring analytics and triggering actionable events.
ML-HAS gathers multi-context (situation) data from the managed cloud environments. This data is fed to various machine learning techniques for fault diagnosis (prediction/detection), producing results specific to the real-time system situation. The prediction/detection outcomes then trigger appropriate actions in cloud infrastructure management, such as running root cause analysis, raising an event, or scheduling a recovery strategy.
- A holistic view of computational center operations
- Ability to gather multi-context (situation) data from several data sources (metrics and logs) in the managed cloud environment
- Proactive monitoring, analytics, and recovery with machine learning-based analytic functions for fault prediction/detection, root cause analysis, and recovery
- A comprehensive architecture for developing and deploying ML-based analytics, from data collection, storage, and model management to actionable events for fault recovery
- Data collection and management (ingest, transform, and store) as needed for developing ML-based analysis functions (fault prediction/detection, root cause analysis)
- A framework for developing and deploying machine learning-based analytics functions for cloud management tasks, including ML-based time series forecasting where applicable
- Integrated system orchestration that closes the loop for the automated recovery function
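
The monitor, analyze, and act loop described above can be illustrated with a minimal sketch. This is not the ML-HAS implementation: the names `Event`, `ZScoreDetector`, and `run_loop` are hypothetical, and a simple rolling z-score stands in for the project's ML-based fault detection models. The sketch only shows how a detection result could be turned into an actionable event that is handed to a management callback.

```python
from collections import deque
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, Deque, Iterable, List, Optional


@dataclass
class Event:
    """Hypothetical actionable event emitted for an anomalous sample."""
    source: str
    value: float
    score: float
    action: str


class ZScoreDetector:
    """Stand-in detector (assumption): rolling z-score over recent samples."""

    def __init__(self, window: int = 20, threshold: float = 3.0,
                 min_samples: int = 5) -> None:
        self.history: Deque[float] = deque(maxlen=window)
        self.threshold = threshold
        self.min_samples = min_samples

    def score(self, value: float) -> Optional[float]:
        """Return the z-score of `value`, or None while warming up."""
        result = None
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma > 0:
                result = abs(value - mu) / sigma
        # Every sample is added to the window, including anomalous ones.
        self.history.append(value)
        return result


def run_loop(source: str,
             samples: Iterable[float],
             detector: ZScoreDetector,
             on_event: Callable[[Event], None]) -> List[Event]:
    """Score each sample and pass an Event to `on_event` when it exceeds the threshold."""
    events: List[Event] = []
    for value in samples:
        z = detector.score(value)
        if z is not None and z > detector.threshold:
            event = Event(source, value, z, action="raise_event")
            on_event(event)  # e.g. notify the orchestrator (hypothetical)
            events.append(event)
    return events


if __name__ == "__main__":
    # Illustrative CPU-utilization samples (made-up values) with one spike.
    cpu = [50.0, 51.0, 49.0, 50.0, 52.0, 50.0, 51.0, 95.0]
    run_loop("node-1/cpu", cpu, ZScoreDetector(), on_event=print)
```

In a real deployment the detector would be replaced by a trained model served through the analytics framework, and the callback would route the event to root cause analysis or a recovery workflow rather than printing it.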