Types of questions



  • Descriptive
  • Exploratory
  • Inferential
  • Predictive
  • Causal
  • Mechanistic

About descriptive analysis

Goal: Describe a set of data

  • The first kind of data analysis performed
  • Commonly applied to census data
  • The description and interpretation are different steps
  • Descriptions usually cannot be generalized without additional statistical modeling
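
As a minimal sketch of a descriptive summary, the snippet below computes a few summary statistics with Python's standard library. The `ages` values are hypothetical, made up for illustration. The code only describes the data at hand; it makes no claim about any wider population.

```python
import statistics

# Hypothetical data: ages of every person in one small census block.
ages = [23, 35, 35, 41, 52, 67, 29, 35]

# Describe the data set itself; no generalization is attempted.
summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "min": min(ages),
    "max": max(ages),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Interpreting these numbers (for example, asking why the median is lower than the mean) is a separate step from computing them.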

About exploratory analysis

Goal: Find relationships you didn’t know about

  • Exploratory models are good for discovering new connections
  • They are also useful for defining future studies
  • Exploratory analyses are usually not the final say
  • Exploratory analysis alone should not be used for generalizing or predicting
  • Correlation does not imply causation
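
One common exploratory step is to scan all pairs of variables for strong correlations. The sketch below does this with a hand-written Pearson correlation. The variable names and values are hypothetical, and a high correlation found this way is only a lead for a future study, not evidence of causation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical measurements on six students.
data = {
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "exam_score": [52, 55, 61, 64, 70, 73],
    "shoe_size": [40, 38, 43, 39, 41, 42],
}

# Compute the correlation for every pair of variables.
names = list(data)
pairs = {}
for i, a in enumerate(names):
    for b in names[i + 1:]:
        pairs[(a, b)] = pearson(data[a], data[b])

# List the pairs from strongest to weakest relationship.
for (a, b), r in sorted(pairs.items(), key=lambda kv: -abs(kv[1])):
    print(f"{a} vs {b}: r = {r:.3f}")
```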

About inferential analysis

Goal: Use a relatively small sample of data to say something about a bigger population

  • Inference is commonly the goal of statistical models
  • Inference involves estimating both the quantity you care about and your uncertainty about your estimate
  • Inference depends heavily on both the population and the sampling scheme
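
The two parts of inference, the estimate and its uncertainty, can be sketched as a sample mean with an approximate 95% confidence interval. The `sample` values are hypothetical, and the interval uses a normal approximation (`z = 1.96`), which is an assumption made here for simplicity; for a sample this small a t-based interval would be wider. The interval is also only meaningful if the sample was drawn at random from the population of interest.

```python
import math
import statistics

def mean_with_ci(sample, z=1.96):
    """Return (estimate, low, high) for the population mean.

    Uses a normal approximation; z=1.96 corresponds to roughly 95% coverage.
    """
    estimate = statistics.mean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return estimate, estimate - z * std_error, estimate + z * std_error

# Hypothetical random sample of heights (cm) from a much larger population.
sample = [170, 165, 180, 175, 160, 172, 168, 178, 174, 169]

estimate, low, high = mean_with_ci(sample)
print(f"estimated mean: {estimate:.1f} cm, 95% CI: ({low:.1f}, {high:.1f})")
```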

About predictive analysis

Goal: To use the data on some objects to predict values for another object

  • If $X$ predicts $Y$, it does not mean that $X$ causes $Y$
  • Accurate prediction depends heavily on measuring the right variables
  • Although there are better and worse prediction models, more data and a simple model often work really well
  • Prediction is very hard, especially about the future
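
As an illustration of "a simple model", the sketch below fits a least-squares line to some objects and predicts the value for a new one. The data and the helper names `fit_line` and `predict` are hypothetical. A good fit here says nothing about whether $X$ causes $Y$.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x):
    """Predicted y for a new object with measurement x."""
    return intercept + slope * x

# Hypothetical training data: objects where both X and Y were measured.
train_x = [1, 2, 3, 4, 5]
train_y = [3.1, 4.9, 7.2, 8.8, 11.0]

intercept, slope = fit_line(train_x, train_y)

# Predict Y for a new object where only X is known.
new_x = 6
prediction = predict(intercept, slope, new_x)
print(f"y = {intercept:.2f} + {slope:.2f} * x; prediction at x={new_x}: {prediction:.2f}")
```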

About causal analysis

Goal: To find out what happens to one variable when you make another variable change

  • Usually randomized studies are required to identify causation
  • There are approaches to inferring causation in non-randomized studies, but they are complicated and sensitive to assumptions
  • Causal relationships are usually identified as average effects, but may not apply to every individual
  • Causal models are usually the “gold standard” for data analysis
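
The role of randomization and the "average effect" caveat can be illustrated with a small simulation. Everything below is simulated under stated assumptions: each individual has a hypothetical baseline outcome and an individual treatment effect that averages 2.0 but varies from person to person, and treatment is assigned by a coin flip. This is a sketch of the idea, not a recipe for analyzing a real trial.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is reproducible

treated, control, individual_effects = [], [], []
for _ in range(10_000):
    baseline = rng.gauss(50.0, 1.0)   # outcome without treatment
    effect = rng.gauss(2.0, 1.0)      # treatment effect varies by individual
    individual_effects.append(effect)
    if rng.random() < 0.5:            # randomized (coin-flip) assignment
        treated.append(baseline + effect)
    else:
        control.append(baseline)

# With randomization, the difference in group means estimates the average effect.
ate_estimate = statistics.mean(treated) - statistics.mean(control)

# The average is positive, yet some individuals have a negative effect.
share_harmed = sum(e < 0 for e in individual_effects) / len(individual_effects)

print(f"estimated average effect: {ate_estimate:.2f}")
print(f"share of individuals with a negative effect: {share_harmed:.1%}")
```

The estimate recovers the average effect, while the second number shows that the average does not describe every individual.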

About mechanistic analysis

Goal: Understand the exact changes in variables that lead to changes in other variables for individual objects

  • Incredibly hard to infer, except in simple situations
  • Usually modeled by a deterministic set of equations (physical/engineering science)
  • Generally the random component of the data is measurement error
  • If the equations are known but the parameters are not, they may be inferred with data analysis
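
As a sketch of the last point, take a known deterministic equation, free fall $d = \frac{1}{2} g t^2$, and treat $g$ as the unknown parameter. The measurements below are simulated, with Gaussian measurement error as the only random component; the noise level and timing values are assumptions chosen for illustration. Because the model is linear in $g$, least squares through the origin gives $\hat{g} = 2 \sum_i d_i t_i^2 / \sum_i t_i^4$.

```python
import random

rng = random.Random(1)  # fixed seed so the simulation is reproducible

TRUE_G = 9.81      # m/s^2, used only to simulate the measurements
NOISE_SD = 0.05    # assumed measurement error in metres

# Simulated experiment: distance fallen measured at known times.
times = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
distances = [0.5 * TRUE_G * t ** 2 + rng.gauss(0.0, NOISE_SD) for t in times]

# Least-squares estimate of g for the model d = 0.5 * g * t^2.
numerator = sum(d * t ** 2 for d, t in zip(distances, times))
denominator = sum(t ** 4 for t in times)
g_hat = 2 * numerator / denominator

print(f"estimated g: {g_hat:.3f} m/s^2")
```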