Abstract
In this paper, we present a new framework for processing high volumes of data generated by heterogeneous sources in different formats (text, image features, etc.). The framework consists of three phases. The first phase selects an appropriate data reduction technique, one that closely preserves the relevant information in the original data set. The second phase determines a suitable algorithm for applying the selected data reduction technique. The third phase integrates the reduced data sets and prepares them for use in different models (visualization, reports, decision making, and prediction). The framework is well suited to knowledge management in data-intensive applications.
Keywords: High-Performance Processing, Data Mining, Data Reduction.