In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model may or may not be used, but EDA is primarily for seeing what the data can tell us beyond the formal modeling or hypothesis-testing task. Exploratory data analysis was promoted by John Tukey to encourage statisticians to explore the data and possibly formulate hypotheses that could lead to new data collection and experiments. EDA differs from initial data analysis (IDA) [1], which focuses more narrowly on checking the assumptions required for model fitting and hypothesis testing, handling missing values, and transforming variables as needed.

Tukey defined data analysis as: "Procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of mathematical statistics which apply to analyzing data."

A family of statistical-computing environments featured vastly improved dynamic visualization capabilities, which allowed statisticians to identify outliers, trends, and patterns in data that merited further study. Tukey's EDA was related to two other developments in statistical theory: robust statistics and nonparametric statistics, both of which tried to reduce the sensitivity of statistical inferences to errors in formulating statistical models.
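As a rough illustration of summarizing a data set's main characteristics and spotting outliers, the sketch below computes a few summary statistics and flags unusual points using only Python's standard library. The data values, the helper names `summarize` and `flag_outliers`, and the 1.5 × IQR cutoff are assumptions chosen for illustration; they do not come from the text above, and other cutoffs or quantile methods are equally defensible.

```python
import statistics


def summarize(values):
    """Return simple summary statistics for a list of numbers."""
    # Quartiles via linear interpolation between data points.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
    }


def flag_outliers(values, k=1.5):
    """Flag points outside [q1 - k*IQR, q3 + k*IQR] (a common rule of thumb)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements; the last value mimics a recording error.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 100.0]

summary = summarize(data)
outliers = flag_outliers(data)

print(summary)
print("flagged:", outliers)
```

In this made-up sample the single extreme value pulls the mean far above the bulk of the data, while the median stays near the typical measurements. That contrast is one simple way to see why robust summaries are less sensitive to errors in the data or in the assumed model.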
Article in Journal of the American Statistical Association 96, January.

Hoaglin D., Mosteller F., Tukey J.W., Understanding Robust and Exploratory Data Analysis, Wile.

