Data Quality enhancement and monitoring framework for High Performance Systems leveraging Machine Learning techniques

Data originating from the source systems is not standardized, integrated, deduplicated, or cleansed, which leads to ineffective, non-actionable business indicators.
The data must be aggregated to identify the common master data sets. A “clean” dataset is then sourced and fed to the learning engine, and the generated model is applied to auto-cleanse records, suggest fixes, or run human-assisted (semi-automated) scenarios.
By doing this, we aim to achieve a significant reduction in data quality deviations: on-the-fly fixing of about 70% of the master data entities, as identified by Data Quality indicators, and dynamic recommendations to fix about 20% of the persistent data sets.
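
The workflow described above (learn from a clean master dataset, then auto-cleanse, suggest a fix, or hand off to a human) can be illustrated with a minimal sketch. It is not the framework's actual learning engine: plain string similarity from Python's standard `difflib` module stands in for the generated model, and the function names (`learn_reference`, `route`), the action labels, and both thresholds are assumptions made purely for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical cut-offs; a real deployment would tune these on labeled data.
AUTO_FIX_THRESHOLD = 0.90
SUGGEST_THRESHOLD = 0.70


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(value.lower().split())


def learn_reference(clean_values):
    """Stand-in for model training: map normalized form -> canonical value."""
    return {normalize(v): v for v in clean_values}


def route(value, reference):
    """Return (action, proposed_value, score) for one incoming value."""
    key = normalize(value)
    best_canonical, best_score = None, 0.0
    for ref_key, canonical in reference.items():
        score = SequenceMatcher(None, key, ref_key).ratio()
        if score > best_score:
            best_canonical, best_score = canonical, score
    if best_score >= AUTO_FIX_THRESHOLD:
        # High confidence: fix on the fly.
        return ("auto_cleanse", best_canonical, best_score)
    if best_score >= SUGGEST_THRESHOLD:
        # Medium confidence: recommend a fix for a human to confirm.
        return ("suggest", best_canonical, best_score)
    # Low confidence: leave the record for manual review.
    return ("manual_review", None, best_score)
```

For example, with `reference = learn_reference(["Acme Corporation", "Globex Inc"])`, calling `route("acme   corporation", reference)` yields the `auto_cleanse` action because the value differs from a canonical entry only in case and spacing. Splitting the outcome into three bands mirrors the auto-cleanse, suggest, and human-assisted scenarios named in the text.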