toransahu/excel-implementation-of-regression-clustering

Excel Implementation of Regression Clustering

A major project completed in partial fulfillment of the requirements for the award of Bachelor of Technology in Computer Science & Engineering (April-May 2015).

By Abhishek Maheshwari and Toran Sahu, under the guidance of Dr. N. K. Nagwani (Assistant Professor), Department of Computer Sc. & Engineering, National Institute of Technology, Raipur - 492010, Chhattisgarh, India.

Abstract

Excel files now carry a large share of the datasets created for various uses, so extracting information from these files is an important task. To that end, this project applies regression clustering to Excel files. Predicting attribute values in these files is important for some activities. Regression techniques are widely used for prediction tasks, in which the relationship between the independent variable and the dependent variable is identified. The accuracy of regression-based prediction can be improved by using clustering along with regression: fitting a curve within each cluster gives a more accurate fit between the dependent and independent variables. The objective of the proposed work is to find the optimum number of clusters into which the original dataset should be divided so that the prediction error in estimating the dependent variable is minimized. The project consists of four major stages. First, the data is prepared by extracting it from the Excel file. Second, clustering is used to group similar data. Third, regression techniques are applied to these groups (clusters) to predict the value of the dependent variable within each cluster. Finally, the cluster count that yields the minimum estimated error is identified. The output (the clusters) is generated in Excel file format for further use.
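The clustering, regression, and cluster-count selection stages described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's actual implementation: it assumes a single independent variable, plain k-means on that variable, ordinary least-squares lines per cluster, and the sum of squared errors as the error measure. The function names (`kmeans_1d`, `fit_line`, `total_squared_error`, `best_cluster_count`) and the toy data are hypothetical, and reading from or writing to Excel is left out.

```python
from statistics import mean

def kmeans_1d(xs, k, iters=50):
    """Plain k-means on scalar values; returns one cluster label per value."""
    ordered = sorted(xs)
    # Deterministic start (an assumption): centroids at evenly spaced ranks.
    centroids = [ordered[(2 * i + 1) * len(xs) // (2 * k)] for i in range(k)]
    labels = [0] * len(xs)
    for _ in range(iters):
        # Assign each value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(x - centroids[c])) for x in xs]
        new = []
        for c in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == c]
            new.append(mean(members) if members else centroids[c])
        if new == centroids:
            break
        centroids = new
    return labels

def fit_line(points):
    """Least-squares fit of y = a + b*x; constant fit if x has no spread."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - b * mx, b

def total_squared_error(points, k):
    """Cluster on x, fit one line per cluster, and sum the squared residuals."""
    labels = kmeans_1d([x for x, _ in points], k)
    sse = 0.0
    for c in range(k):
        cluster = [p for p, lab in zip(points, labels) if lab == c]
        if cluster:
            a, b = fit_line(cluster)
            sse += sum((y - (a + b * x)) ** 2 for x, y in cluster)
    return sse

def best_cluster_count(points, max_k):
    """Try k = 1..max_k and return the k with the smallest total error."""
    errors = {k: total_squared_error(points, k) for k in range(1, max_k + 1)}
    return min(errors, key=errors.get), errors

# Toy data (hypothetical): two linear regimes that meet between x = 4 and x = 5.
points = [(x, 2 * x) for x in range(5)] + [(x, 50 - 3 * x) for x in range(5, 10)]
best_k, errors = best_cluster_count(points, 3)
```

On this toy data the two-cluster split matches the two linear regimes, so `best_k` is 2. Note that error measured on the same rows used for fitting tends to shrink as the cluster count grows, so a held-out set may give a fairer comparison on real data.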

For more details, please refer to the link.