Tabular Data Exploratory Data Analysis

Improve Your Exploratory Data Analysis for Tabular Data: Part 1

Bruce H. Cottman, Ph.D.
17 min read · Mar 1, 2023

I discuss why I went from five plot types to two in my preliminary EDA. For many years, I analyzed data using static plots; I also show my switch to interactive plots. As a result, I have fewer plots, a better understanding of the data, and more time to devote to developing a predictive solution. I have also created a GitHub repository for all the code in this blog. It contains three function tools you can use immediately in your EDA.

Tabular data can have more than one dimension for each independent variable. Photo by Pietro Jeng on Unsplash

Exploratory Data Analysis

The aim of any Exploratory Data Analysis is to transform raw data into data that a machine learning model can consume to predict a target variable as accurately as possible.

What is horrifying is that, in deploying a machine learning-based solution, preparing (and continually maintaining) the data fed into the model takes 60–90% of the total project time.

What interests me is how much space a journal article dedicates to a "new" model or to applying a modeling technique to a different domain, while little or none is spent describing the data preparation.

Little or no description of the data preparation techniques used is the root cause of the…


Written by Bruce H. Cottman, Ph.D.

I write my blog utilizing decades of experience in investment, programming, and data science.
