- Descriptive
- Exploratory
- Inferential
- Predictive
- Causal
- Mechanistic
About descriptive analysis
Goal: Describe a set of data
- The first kind of data analysis performed
- Commonly applied to census data
- The description and interpretation are different steps
- Descriptions usually cannot be generalized without additional statistical modeling
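
As a minimal sketch of what describing a set of data can look like, the snippet below computes a few summary statistics with Python's standard library. The `ages` values are made up for illustration and are not real census data.

```python
import statistics

# Hypothetical ages for a small group of respondents (illustrative values only).
ages = [23, 35, 31, 47, 52, 29, 41, 38]

# Describe the data; no interpretation or generalization happens here.
summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "stdev": statistics.stdev(ages),  # sample standard deviation
    "min": min(ages),
    "max": max(ages),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

These numbers describe only the eight values in hand. Saying anything about a wider population would need the additional modeling mentioned above.
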
About exploratory analysis
Goal: Find relationships you didn’t know about
- Exploratory models are good for discovering new connections
- They are also useful for defining future studies
- Exploratory analyses are usually not the final word
- Exploratory analysis alone should not be used for generalizing or predicting
- Correlation does not imply causation
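
One common exploratory step is checking how strongly two variables move together. The sketch below computes a Pearson correlation coefficient by hand. The helper `pearson_r` and the two data lists are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical measurements (illustrative values only).
hours_studied = [1, 2, 3, 4, 5, 6]
exam_score = [52, 55, 61, 64, 70, 74]

r = pearson_r(hours_studied, exam_score)
print(f"r = {r:.3f}")
```

A large value of `r` suggests a relationship worth studying further. On its own it does not show that one variable causes the other.
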
About inferential analysis
Goal: Use a relatively small sample of data to say something about a bigger population
- Inference is commonly the goal of statistical models
- Inference involves estimating both the quantity you care about and your uncertainty about your estimate
- Inference depends heavily on both the population and the sampling scheme
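
The two parts of inference, the estimate and its uncertainty, can be sketched as follows. The population here is simulated, the sampling scheme is a simple random sample, and the interval uses the normal approximation with the critical value 1.96. All of these are assumptions made for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population we pretend we cannot observe in full.
population = [random.gauss(170, 10) for _ in range(100_000)]

# Sampling scheme: simple random sample without replacement.
sample = random.sample(population, 200)

# The quantity we care about (the mean) and our uncertainty about it.
estimate = statistics.mean(sample)
std_error = statistics.stdev(sample) / math.sqrt(len(sample))

# Approximate 95% confidence interval (normal approximation).
ci_low = estimate - 1.96 * std_error
ci_high = estimate + 1.96 * std_error

print(f"estimate = {estimate:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

If the sample were drawn in a biased way, for example only from the largest values, the same formulas would still run but the interval would no longer say anything reliable about the population.
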
About predictive analysis
Goal: To use the data on some objects to predict values for another object
- If $X$ predicts $Y$, it does not mean that $X$ causes $Y$
- Accurate prediction depends heavily on measuring the right variables
- Although there are better and worse prediction models, more data and a simple model often work really well
- Prediction is very hard, especially about the future
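
A simple model of the kind mentioned above can be sketched as a least-squares line fitted on some objects and then used to predict a value for a new one. The helper `fit_line` and the size and price figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical training data: size of an object and its observed price.
sizes = [50, 60, 70, 80, 90]
prices = [152, 178, 213, 238, 271]

intercept, slope = fit_line(sizes, prices)

# Predict the price of a new object that was not in the training data.
new_size = 75
predicted = intercept + slope * new_size
print(f"predicted price for size {new_size}: {predicted:.1f}")
```

The fitted slope says that size is useful for predicting price in this data. It does not say that changing the size would change the price.
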
About causal analysis
Goal: To find out what happens to one variable when you make another variable change
- Usually randomized studies are required to identify causation
- There are approaches to inferring causation in non-randomized studies, but they are complicated and sensitive to assumptions
- Causal relationships are usually identified as average effects, but may not apply to every individual
- Causal models are usually the “gold standard” for data analysis
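
The role of randomization and the meaning of an average effect can be illustrated with a small simulation. Every number below (baseline outcome, effect size, sample size) is an assumption chosen for the sketch, not a result from any study.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

n = 2000
units = []
for _ in range(n):
    baseline = random.gauss(50, 5)    # outcome the unit would have untreated
    effect = random.gauss(2.0, 1.0)   # individual effect, varies by unit
    treated = random.random() < 0.5   # coin-flip (randomized) assignment
    outcome = baseline + effect if treated else baseline
    units.append((treated, outcome, effect))

treated_outcomes = [y for t, y, _ in units if t]
control_outcomes = [y for t, y, _ in units if not t]

# Difference in group means estimates the average treatment effect.
ate_estimate = statistics.mean(treated_outcomes) - statistics.mean(control_outcomes)

# Share of units whose own effect is negative despite a positive average.
share_negative = sum(1 for _, _, e in units if e < 0) / n

print(f"estimated average effect: {ate_estimate:.2f}")
print(f"share of units with a negative individual effect: {share_negative:.3f}")
```

Because assignment is random, the two groups are comparable on average, so the difference in means estimates the average effect. The simulated individual effects still differ from unit to unit, and a small share are negative.
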
About mechanistic analysis
Goal: Understand the exact changes in variables that lead to changes in other variables for individual objects
- Incredibly hard to infer, except in simple situations
- Usually modeled by a deterministic set of equations (physical/engineering science)
- Generally the random component of the data is measurement error
- If the equations are known but the parameters are not, they may be inferred with data analysis
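
The last point can be sketched with a known equation and an unknown parameter. The example assumes free fall, $d = \frac{1}{2} g t^2$, with simulated measurements whose only randomness is measurement error, and recovers $g$ by least squares through the origin. The choice of equation, the noise level, and the variable names are all assumptions made for illustration.

```python
import random

random.seed(2)  # fixed seed so the sketch is reproducible

TRUE_G = 9.81  # used only to simulate the measurements below

# Drop times in seconds: 0.2, 0.4, ..., 2.0.
times = [0.2 * i for i in range(1, 11)]

# Deterministic equation plus measurement error (standard deviation 0.05 m).
distances = [0.5 * TRUE_G * t**2 + random.gauss(0, 0.05) for t in times]

# Rewrite the model as d = g * u with u = t^2 / 2, then solve least squares
# through the origin: g_hat = sum(u * d) / sum(u^2).
u = [0.5 * t**2 for t in times]
g_hat = sum(ui * di for ui, di in zip(u, distances)) / sum(ui**2 for ui in u)

print(f"estimated g = {g_hat:.3f} m/s^2")
```
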