2022-01-26#

Lead Scribe: Lily

Admin#

  • sorry about notes

  • private github repo –> Spring 2022

  • grading contract FYI

    • Further instructions will be given on ways to achieve an ‘A’ or ‘B’

      • To get a ‘B’ you will only need to complete the paper and presentation

      • To get an ‘A’ you will implement a project (translation)

    • Paper and presentation will be assigned

      • Paper –> CS Conference Style

      • Draft due: the last day before the presentation; it will be posted for the class to review

Opening Question#

What kind of data are you most interested in working with?

  • Class response:

    • GIS data

    • Linguistic data (tweets, reddit posts)

    • Numerical data (tabular)

    • Video/Image

    • Time series

    • EHR/Medical related data

    • NLP

    • tabular/survey

How to Read a Paper#

Model Based ML#

  • Discrete probabilities (distributions introduced in murder mystery chapter)

  • Bernoulli

  • Priors (a probabilistic guess about a random variable)

    • Useful for drawing strong inferences from little data

      • (e.g.) working on problems where not a lot of data is available

    • Assumptions, expressed in a probability distribution

  • Posterior

    • Inference given regularizer: Likelihood…

    • The most common posterior distribution we’ll work with: the probability of the parameters given the data

  • Point Estimate

    • These are the single values produced after training (e.g. weights)

    • The posterior mean is an example of a point estimate

  • Most of the probability distributions we’ll use belong to the exponential family

  • Conditional Probability

    • One for each value of the conditioning variable

    • (e.g.) Murder mystery –> the murderer variable can be Grey or Auburn; see the example marginal distribution in the murder mystery text

  • Marginal probability

  • Maximum Likelihood Estimation

    • Assume a distribution; our goal is to find its parameter (theta)

    • Maximize the likelihood: find the parameter value that gives the observed data the highest probability (the one that fits best)

  • Elicitation - an interdisciplinary field spanning statistics and psychology; the study of how to obtain an expert’s probability distribution for how likely an event is to occur.
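
To make the terms above concrete, here is a minimal sketch in Python, not anything shown in class. Part 1 uses a two-suspect setup like the murder mystery to show a conditional distribution (one per value of the conditioning variable), a marginal, and a posterior. Part 2 compares the maximum likelihood estimate for a Bernoulli parameter with the posterior mean under a Beta prior. All of the numbers, the Beta(2, 2) prior, and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction

# --- Part 1: conditional, marginal, and posterior for a discrete variable ---
# Illustrative (assumed) prior over who the murderer is.
prior = {"Grey": Fraction(3, 10), "Auburn": Fraction(7, 10)}

# Conditional probability: one distribution over the weapon
# for each value of the conditioning variable (the murderer).
weapon_given_murderer = {
    "Grey": {"revolver": Fraction(9, 10), "dagger": Fraction(1, 10)},
    "Auburn": {"revolver": Fraction(2, 10), "dagger": Fraction(8, 10)},
}


def marginal_weapon(weapon):
    """Marginal P(weapon): sum the joint P(murderer, weapon) over murderers."""
    return sum(prior[m] * weapon_given_murderer[m][weapon] for m in prior)


def posterior_murderer(weapon):
    """Posterior P(murderer | weapon) via Bayes' rule."""
    evidence = marginal_weapon(weapon)
    return {
        m: prior[m] * weapon_given_murderer[m][weapon] / evidence for m in prior
    }


# --- Part 2: Bernoulli parameter, MLE vs. posterior mean ---
def bernoulli_mle(data):
    """Maximum likelihood estimate of theta: the fraction of 1s observed."""
    return Fraction(sum(data), len(data))


def beta_posterior(data, a=1, b=1):
    """Beta(a, b) prior + Bernoulli data -> Beta(a + ones, b + zeros) posterior."""
    ones = sum(data)
    return a + ones, b + len(data) - ones


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution; a point estimate of theta."""
    return Fraction(a, a + b)


flips = [1, 1, 1]  # very little data: three successes, no failures
theta_mle = bernoulli_mle(flips)
a_post, b_post = beta_posterior(flips, a=2, b=2)  # assumed Beta(2, 2) prior
theta_mean = beta_mean(a_post, b_post)

print("P(revolver) =", marginal_weapon("revolver"))
print("P(murderer | revolver) =", posterior_murderer("revolver"))
print("MLE of theta =", theta_mle)
print("Posterior mean of theta =", theta_mean)
```

With only three observations, the maximum likelihood estimate commits fully to the data, while the posterior mean is pulled toward the prior. That is the sense in which a prior helps when not much data is available.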

Prepare for next class#

  • Order of the weekly topics may change

  • Dr. Brown will present next week, but we’ll start rotating the following week

  • There are two readings; prepare and bring questions

Learning & Evaluation#

  • Read through the whole Learning and Evaluation page after I post a notification; there are some fixes to be made first

  • Bring Questions to class next week

  • Be ready to work on your grading contract

Reading#

  • The Scientific Method in the Science of Machine Learning

  • Value-laden Disciplinary Shifts in Machine Learning
