Table of Notation |
Preface |
Boolean retrieval / 1: |
An example information retrieval problem / 1.1: |
A first take at building an inverted index / 1.2: |
Processing Boolean queries / 1.3: |
The extended Boolean model versus ranked retrieval / 1.4: |
References and further reading / 1.5: |
The term vocabulary and postings lists / 2: |
Document delineation and character sequence decoding / 2.1: |
Determining the vocabulary of terms / 2.2: |
Faster postings list intersection via skip pointers / 2.3: |
Positional postings and phrase queries / 2.4: |
Dictionaries and tolerant retrieval / 3: |
Search structures for dictionaries / 3.1: |
Wildcard queries / 3.2: |
Spelling correction / 3.3: |
Phonetic correction / 3.4: |
Index construction / 4: |
Hardware basics / 4.1: |
Blocked sort-based indexing / 4.2: |
Single-pass in-memory indexing / 4.3: |
Distributed indexing / 4.4: |
Dynamic indexing / 4.5: |
Other types of indexes / 4.6: |
Index compression / 5: |
Statistical properties of terms in information retrieval / 5.1: |
Dictionary compression / 5.2: |
Postings file compression / 5.3: |
Scoring, term weighting, and the vector space model / 6: |
Parametric and zone indexes / 6.1: |
Term frequency and weighting / 6.2: |
The vector space model for scoring / 6.3: |
Variant tf-idf functions / 6.4: |
Computing scores in a complete search system / 7: |
Efficient scoring and ranking / 7.1: |
Components of an information retrieval system / 7.2: |
Vector space scoring and query operator interaction / 7.3: |
Evaluation in information retrieval / 8: |
Information retrieval system evaluation / 8.1: |
Standard test collections / 8.2: |
Evaluation of unranked retrieval sets / 8.3: |
Evaluation of ranked retrieval results / 8.4: |
Assessing relevance / 8.5: |
A broader perspective: System quality and user utility / 8.6: |
Results snippets / 8.7: |
Relevance feedback and query expansion / 9: |
Relevance feedback and pseudo relevance feedback / 9.1: |
Global methods for query reformulation / 9.2: |
XML retrieval / 10: |
Basic XML concepts / 10.1: |
Challenges in XML retrieval / 10.2: |
A vector space model for XML retrieval / 10.3: |
Evaluation of XML retrieval / 10.4: |
Text-centric versus data-centric XML retrieval / 10.5: |
Probabilistic information retrieval / 11: |
Review of basic probability theory / 11.1: |
The probability ranking principle / 11.2: |
The binary independence model / 11.3: |
An appraisal and some extensions / 11.4: |
Language models for information retrieval / 12: |
Language models / 12.1: |
The query likelihood model / 12.2: |
Language modeling versus other approaches in information retrieval / 12.3: |
Extended language modeling approaches / 12.4: |
Text classification and Naive Bayes / 13: |
The text classification problem / 13.1: |
Naive Bayes text classification / 13.2: |
The Bernoulli model / 13.3: |
Properties of Naive Bayes / 13.4: |
Feature selection / 13.5: |
Evaluation of text classification / 13.6: |
Vector space classification / 14: |
Document representations and measures of relatedness in vector spaces / 14.1: |
Rocchio classification / 14.2: |
k nearest neighbor / 14.3: |
Linear versus nonlinear classifiers / 14.4: |
Classification with more than two classes / 14.5: |
The bias-variance tradeoff / 14.6: |
Support vector machines and machine learning on documents / 15: |
Support vector machines: The linearly separable case / 15.1: |
Extensions to the support vector machine model / 15.2: |
Issues in the classification of text documents / 15.3: |
Machine-learning methods in ad hoc information retrieval / 15.4: |
Flat clustering / 16: |
Clustering in information retrieval / 16.1: |
Problem statement / 16.2: |
Evaluation of clustering / 16.3: |
K-means / 16.4: |
Model-based clustering / 16.5: |
Hierarchical clustering / 17: |
Hierarchical agglomerative clustering / 17.1: |
Single-link and complete-link clustering / 17.2: |
Group-average agglomerative clustering / 17.3: |
Centroid clustering / 17.4: |
Optimality of hierarchical agglomerative clustering / 17.5: |
Divisive clustering / 17.6: |
Cluster labeling / 17.7: |
Implementation notes / 17.8: |
Matrix decompositions and latent semantic indexing / 18: |
Linear algebra review / 18.1: |
Term-document matrices and singular value decompositions / 18.2: |
Low-rank approximations / 18.3: |
Latent semantic indexing / 18.4: |
Web search basics / 19: |
Background and history / 19.1: |
Web characteristics / 19.2: |
Advertising as the economic model / 19.3: |
The search user experience / 19.4: |
Index size and estimation / 19.5: |
Near-duplicates and shingling / 19.6: |
Web crawling and indexes / 20: |
Overview / 20.1: |
Crawling / 20.2: |
Distributing indexes / 20.3: |
Connectivity servers / 20.4: |
Link analysis / 21: |
The Web as a graph / 21.1: |
PageRank / 21.2: |
Hubs and authorities / 21.3: |
Bibliography |
Index |