Text analysis

  1. Text statistics. [slides] [nbviewer] [ipynb] [vldb.txt]
  2. Good-Turing smoothing. [nbviewer] [ipynb]
  3. Boolean retrieval model. Slides (pdf)
  4. Vector space model. [slides] [nbviewer] [ipynb]
  5. Word embedding. [nbviewer] [ipynb] [WS353 nbviewer] [WS353 ipynb] WS353-Sim.txt
  6. Classification: Naive Bayes. Slides (pdf), [nbviewer], [vldb_train], [icse_train], [vldb_test], [icse_test]
  7. Evaluation. slides. chapter 8
  8. Language modelling. slides. chapter 12
  9. Co-occurrence. [nbviewer] [ipynb]
  10. LSI (Latent Semantic Indexing) and SVD (Singular Value Decomposition). [slides] ipynb chapter 18 IIR web page
  11. Text processing basics. (slides)

Graph Analysis

  • Graph embedding [nbviewer] [ipynb]

Search Engine Construction

Crawling

Resources

Text Book

Other reference books:

Tools