This presentation, delivered by Sathishkumar V E of Sunway University, Malaysia, is a guide to preparing datasets for predictive modelling. It covers the fundamental differences between classification, regression, and clustering, and highlights the importance of selecting a dataset appropriate to each predictive task. It outlines key considerations in dataset preparation, including identifying dependent and independent variables, sourcing data, and analysing correlations. A case study on Seoul Bike Data illustrates the practical application of these concepts. The presentation also discusses benchmark datasets, the need for novelty in research, and work currently in progress, such as the Seoul Road Accident Dataset and groundwater level forecasting. It concludes with a list of dataset sources and references to related research.
This talk is a joint event of Ernet and the Department of Mathematics, IIT Madras.