Latent Variable Models: GMM & Topic Models

Latent Variable Models: GMM & Topic Models#

Gaussian Mixture models#

  • key points:

    • model versus algorithm

  • algorithm:

    • initialize

    • Estep

    • Mstep

    • until convergence:

      • parameters stop changing, assignments stop changing

  • Covariance types:

    • covers weakness in kmeans

Topic Models#

  • corpora: collection of documents

  • text modeling, was classically binary matrices

    • also tf-idf

    • useful for dicsriminating docuemtns,

    • lacks meaning

  • pLSI: probabilistic, latent semantic indexing

    • reminiscnet of GMM

    • assumes exchangeability

    • mixutre components