ML-Interview-Questions
- What is the EDA process?
Ans: Data understanding is a key step in the CRISP-DM framework. In any end-to-end data science project, you need to understand the underlying data properly before you start working with it. This is where EDA (exploratory data analysis) comes in: you dive deep into the dataset at hand to find patterns, correlations, and other relationships within it. This not only improves your understanding of the data but also helps with data cleaning by identifying irrelevant or highly correlated features.
Steps in EDA
Identification of important features. The first step is to identify the variables that matter for the problem at hand.
Univariate analysis, to understand each column in the data on its own.
Bivariate analysis, to see how the variables are related to each other. This helps us identify relationships between variables.
Data visualisation.
Handling missing values and outliers.
Transforming the original variables.
Deriving new variables.
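The steps above can be sketched in a few lines of pandas. This is only an illustration on a made-up dataset: the column names (`age`, `income`, `city`), the 1.5 * IQR outlier rule, the z-score transform, and the age bins are all assumptions chosen for the example, not part of the original notes.

```python
import pandas as pd

# Hypothetical toy dataset; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29, 44, 95],
    "income": [30, 42, 61, 68, 50, 35, 58, 64],
    "city": ["A", "B", "A", "C", "B", "A", "C", "B"],
})

# Univariate analysis: summarise each column on its own.
numeric_summary = df[["age", "income"]].describe()
city_counts = df["city"].value_counts()

# Bivariate analysis: Pearson correlation between the numeric columns.
corr = df[["age", "income"]].corr()

# Outliers: flag ages more than 1.5 * IQR beyond the quartiles
# (a common rule of thumb, assumed here for illustration).
q1, q3 = df["age"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["age"] < q1 - 1.5 * iqr) | (df["age"] > q3 + 1.5 * iqr)]

# Transform an original variable: standardise income to a z-score.
df["income_z"] = (df["income"] - df["income"].mean()) / df["income"].std()

# Derive a new variable: bucket age into groups (bin edges are arbitrary).
df["age_group"] = pd.cut(
    df["age"], bins=[0, 30, 50, 120], labels=["young", "mid", "senior"]
)

print(numeric_summary)
print(city_counts)
print(corr)
print(outliers)
```

For visualisation, `df.hist()` or a scatter plot of two columns would be a natural next step, but plotting needs matplotlib and is left out of this sketch.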
Important links for understanding the data
https://towardsdatascience.com/ways-to-detect-and-remove-the-outliers-404d16608dba
https://medium.com/@seema.singh/why-correlation-does-not-imply-causation-5b99790df07e
Data Un
2. Handling Missing Values
Impute missing values with the mean or median for numeric data and with the mode for categorical data.
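A minimal sketch of this kind of imputation with pandas, assuming a made-up DataFrame with hypothetical columns `age`, `income`, and `city`. Which statistic to use for each numeric column is a judgment call; here the median is used for `income` because it contains an extreme value, and the median is less sensitive to outliers than the mean.

```python
import pandas as pd

# Hypothetical data with gaps; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25.0, None, 47.0, 51.0, None],
    "income": [30.0, 42.0, None, 68.0, 1000.0],
    "city": ["A", "B", None, "A", "C"],
})

# Count the missing values per column before imputing.
missing_before = df.isna().sum()

filled = df.copy()
# Numeric column without extreme values: fill with the mean.
filled["age"] = filled["age"].fillna(filled["age"].mean())
# Numeric column with an extreme value: fill with the median.
filled["income"] = filled["income"].fillna(filled["income"].median())
# Categorical column: fill with the mode (first one if there are ties).
filled["city"] = filled["city"].fillna(filled["city"].mode().iloc[0])

print(missing_before)
print(filled)
```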