Missing Data 3#

Handling Missing Data in Decision Trees: A Probabilistic Approach#

Key ideas#

A decision tree’s structure and notation
Review of imputation
- Predictive value imputation
  - mean, median or mode
  - assumes that features are independent
  - surrogate splits: partition the data using another feature when the split feature is missing
XGBoost
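
As a quick illustration of the mean, median, or mode imputation reviewed above, here is a minimal sketch using only the Python standard library. The `impute` helper and the `ages` column are hypothetical names made up for this example; they do not come from the paper.

```python
from statistics import mean, median, mode

def impute(column, strategy="mean"):
    """Replace None entries with a summary statistic of the observed values."""
    observed = [v for v in column if v is not None]
    summary = {"mean": mean, "median": median, "mode": mode}[strategy]
    fill = summary(observed)
    return [fill if v is None else v for v in column]

# Hypothetical feature column with two missing entries.
ages = [20, None, 30, 30, None, 70]

print(impute(ages, "mean"))    # missing entries become 37.5
print(impute(ages, "median"))  # missing entries become 30.0
print(impute(ages, "mode"))    # missing entries become 30
```

Each column is filled on its own, without looking at the other features, which is where the independence assumption noted above comes in.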

Expected Predictions:

impute all possible completions at once to avoid strong distributional assumptions
consistent for MCAR and MAR
expensive, but a density model can help reduce the cost
tractably compute the exact expected predictions
loss minimization
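
The idea of averaging over all completions can be sketched by brute force for a toy tree. This is only an illustration under loud assumptions: the `tree` function, the `P_ONE` marginals, and the choice of independent binary features are all made up here, and the enumeration is exponential in the number of missing features, which is exactly the cost the notes call "expensive". It is not the paper's tractable algorithm.

```python
from itertools import product

def tree(x):
    """A hypothetical depth-2 regression tree over binary features x[0], x[1]."""
    if x[0] == 1:
        return 1.0 if x[1] == 1 else 0.2
    return 0.0

# Assumed marginals P(x_i = 1). Features are treated as independent purely
# to keep the sketch short; any density model over the features could be used.
P_ONE = [0.5, 0.9]

def expected_prediction(x):
    """Average the tree output over every completion of the missing (None) entries."""
    missing = [i for i, v in enumerate(x) if v is None]
    total = 0.0
    for values in product([0, 1], repeat=len(missing)):
        completed = list(x)
        weight = 1.0
        for i, v in zip(missing, values):
            completed[i] = v
            weight *= P_ONE[i] if v == 1 else 1.0 - P_ONE[i]
        total += weight * tree(completed)
    return total

print(expected_prediction([1, None]))     # approximately 0.92
print(expected_prediction([None, None]))  # approximately 0.46
print(tree([1, 1]))                       # mode imputation of x[1] gives 1.0
```

With `x[1]` missing, mode imputation commits to a single completion and predicts 1.0, while the expected prediction weighs both branches and lands near 0.92.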

Experiments

for a single dataset, outperforms in general

Discussion#

generally easier
given results on a single dataset, how much do we trust this?
what advantage does this provide?
NP-hard

How to miss data? Reinforcement learning for environments with high observation cost#

Key points#

Reinforcement learning

cost associated with making accurate observations
goal directed
RL agent tries to

Problem setting:

\(o_t \sim p_0(o_t \mid s_t; \beta)\)
\(\beta\) is the accuracy of the observation
\(r\) is the old (original) reward
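
To make the problem setting above concrete, here is a rough sketch of an observation whose accuracy is controlled by \(\beta\) and a reward that charges for that accuracy. The Gaussian noise model, the `1 / beta` noise scale, the linear cost, and the names `observe`, `reward_with_cost`, and `cost_per_unit` are all assumptions chosen for illustration; the paper's actual functional forms may differ.

```python
import random

def observe(state, beta, rng):
    """Sample a noisy observation of a scalar state.

    Assumption: larger beta (higher accuracy) means a smaller noise
    standard deviation, here 1 / beta.
    """
    noise_std = 1.0 / beta
    return rng.gauss(state, noise_std)

def reward_with_cost(r, beta, cost_per_unit=0.1):
    """Combine the original reward r with a hypothetical linear accuracy cost."""
    return r - cost_per_unit * beta

rng = random.Random(0)
true_state = 0.0
for beta in (0.5, 10.0):
    errors = [abs(observe(true_state, beta, rng) - true_state) for _ in range(1000)]
    avg_error = sum(errors) / len(errors)
    print(beta, round(avg_error, 3), reward_with_cost(1.0, beta))
```

The loop shows the trade-off the agent faces: the more accurate setting yields a much smaller average observation error but a lower net reward.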

Scenario A:

observed angle vs

Big picture: manipulating how the data is collected

Discussion#

survivorship bias?
right/left imbalance in Figure 3
the simple pendulum example helped make up for missing background
figures

General#

Try writing out a missingness graph for a problem of choice, some scenario where you imagine there would be missing data, or an example dataset that you can find.
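
One possible starting point for this exercise is to write the graph down as a parent list and apply a rough rule of thumb to the missingness indicators. Everything in this sketch is hypothetical: the survey variables, the `R_` naming convention for indicators, and the `classify` helper. The rule is a simplification that assumes no latent common causes and that indicators have no children other than the observed proxies.

```python
# Hypothetical survey example: each key maps a node to its list of parents.
# "R_income" is the missingness indicator for "income".
parents = {
    "age": [],
    "income": ["age"],
    "R_income": ["age"],  # whether income is missing depends on age
}
fully_observed = {"age"}

def classify(parents, fully_observed):
    """Rough rule of thumb based on what the missingness indicators depend on."""
    indicator_parents = {
        p
        for node, ps in parents.items()
        if node.startswith("R_")
        for p in ps
    }
    if not indicator_parents:
        return "MCAR"  # missingness depends on nothing in the graph
    if indicator_parents <= fully_observed:
        return "MAR"   # missingness depends only on fully observed variables
    return "MNAR"      # missingness depends on a partially observed variable

print(classify(parents, fully_observed))  # MAR
```

Changing the parents of `R_income` to `["income"]` would make missingness depend on the value that is itself missing, which the rule labels MNAR.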
