Introduction to Information Retrieval – Book by Christopher Manning et al.

The link is here.

A classic textbook like this one is worth browsing to get a comprehensive view of the domain. Contents:

1. Boolean retrieval
extended boolean model versus ranked retrieval
2. The term vocabulary and postings lists
document delineation and character sequence decoding
tokenization
dropping common terms: stop words
normalization (equivalence classing of terms)
stemming and lemmatization
faster postings list intersection via skip pointers
positional postings and phrase queries
3. Dictionaries and tolerant retrieval
search structure of dictionaries
wildcard queries
k-gram indexes for wildcard queries
spelling correction
phonetic correction
4. Index construction
hardware basics
blocked sort-based indexing
single-pass in-memory indexing
distributed indexing
dynamic indexing
other types of indexes
5. Index compression
statistical properties of terms in information retrieval
dictionary compression
postings file compression
6. Scoring, term weighting and the vector space model
parametric and zone indexes
term frequency and weighting
the vector space model for scoring
variant tf-idf functions
sublinear tf scaling
maximum tf normalization
document and query weighting schemes
pivoted normalized document length
7. Computing scores in a complete search system
efficient scoring and ranking
components of an information retrieval system
vector space scoring and query operator interaction
8. Evaluation in information retrieval
9. Relevance feedback and query expansion
global methods for query reformulation
10. XML retrieval
11. Probabilistic information retrieval
12. Language models for information retrieval
13. Text classification and Naive Bayes
14. Vector space classification
15. Support vector machines and machine learning on documents
16. Flat clustering
17. Hierarchical clustering
18. Matrix decompositions and latent semantic indexing
19. Web search basics
20. Web crawling and indexes
21. Link analysis
Bibliography
