List of Tables |
List of Figures |
Table of Notations |
Preface |
Road Map |
Preliminaries / I: |
Introduction / 1: |
Rationalist and Empiricist Approaches to Language / 1.1: |
Scientific Content / 1.2: |
Questions that linguistics should answer / 1.2.1: |
Non-categorical phenomena in language / 1.2.2: |
Language and cognition as probabilistic phenomena / 1.2.3: |
The Ambiguity of Language: Why NLP Is Difficult / 1.3: |
Dirty Hands / 1.4: |
Lexical resources / 1.4.1: |
Word counts / 1.4.2: |
Zipf's laws / 1.4.3: |
Collocations / 1.4.4: |
Concordances / 1.4.5: |
Further Reading / 1.5: |
Exercises / 1.6: |
Mathematical Foundations / 2: |
Elementary Probability Theory / 2.1: |
Probability spaces / 2.1.1: |
Conditional probability and independence / 2.1.2: |
Bayes' theorem / 2.1.3: |
Random variables / 2.1.4: |
Expectation and variance / 2.1.5: |
Notation / 2.1.6: |
Joint and conditional distributions / 2.1.7: |
Determining P / 2.1.8: |
Standard distributions / 2.1.9: |
Bayesian statistics / 2.1.10: |
Essential Information Theory / 2.2: |
Entropy / 2.2.1: |
Joint entropy and conditional entropy / 2.2.2: |
Mutual information / 2.2.3: |
The noisy channel model / 2.2.4: |
Relative entropy or Kullback-Leibler divergence / 2.2.5: |
The relation to language: Cross entropy / 2.2.6: |
The entropy of English / 2.2.7: |
Perplexity / 2.2.8: |
Linguistic Essentials / 3: |
Parts of Speech and Morphology / 3.1: |
Nouns and pronouns / 3.1.1: |
Words that accompany nouns: Determiners and adjectives / 3.1.2: |
Verbs / 3.1.3: |
Other parts of speech / 3.1.4: |
Phrase Structure / 3.2: |
Phrase structure grammars / 3.2.1: |
Dependency: Arguments and adjuncts / 3.2.2: |
X' theory / 3.2.3: |
Phrase structure ambiguity / 3.2.4: |
Semantics and Pragmatics / 3.3: |
Other Areas / 3.4: |
Corpus-Based Work / 4: |
Getting Set Up / 4.1: |
Computers / 4.1.1: |
Corpora / 4.1.2: |
Software / 4.1.3: |
Looking at Text / 4.2: |
Low-level formatting issues / 4.2.1: |
Tokenization: What is a word? / 4.2.2: |
Morphology / 4.2.3: |
Sentences / 4.2.4: |
Marked-up Data / 4.3: |
Markup schemes / 4.3.1: |
Grammatical tagging / 4.3.2: |
Words / II: |
Collocations / 5: |
Frequency / 5.1: |
Mean and Variance / 5.2: |
Hypothesis Testing / 5.3: |
The t test / 5.3.1: |
Hypothesis testing of differences / 5.3.2: |
Pearson's chi-square test / 5.3.3: |
Likelihood ratios / 5.3.4: |
Mutual Information / 5.4: |
The Notion of Collocation / 5.5: |
Statistical Inference: n-gram Models over Sparse Data / 6: |
Bins: Forming Equivalence Classes / 6.1: |
Reliability vs. discrimination / 6.1.1: |
n-gram models / 6.1.2: |
Building n-gram models / 6.1.3: |
Statistical Estimators / 6.2: |
Maximum Likelihood Estimation (MLE) / 6.2.1: |
Laplace's law, Lidstone's law and the Jeffreys-Perks law / 6.2.2: |
Held out estimation / 6.2.3: |
Cross-validation (deleted estimation) / 6.2.4: |
Good-Turing estimation / 6.2.5: |
Briefly noted / 6.2.6: |
Combining Estimators / 6.3: |
Simple linear interpolation / 6.3.1: |
Katz's backing-off / 6.3.2: |
General linear interpolation / 6.3.3: |
Language models for Austen / 6.3.4: |
Conclusions / 6.4: |
Word Sense Disambiguation / 7: |
Methodological Preliminaries / 7.1: |
Supervised and unsupervised learning / 7.1.1: |
Pseudowords / 7.1.2: |
Upper and lower bounds on performance / 7.1.3: |
Supervised Disambiguation / 7.2: |
Bayesian classification / 7.2.1: |
An information-theoretic approach / 7.2.2: |
Dictionary-Based Disambiguation / 7.3: |
Disambiguation based on sense definitions / 7.3.1: |
Thesaurus-based disambiguation / 7.3.2: |
Disambiguation based on translations in a second-language corpus / 7.3.3: |
One sense per discourse, one sense per collocation / 7.3.4: |
Unsupervised Disambiguation / 7.4: |
What Is a Word Sense? / 7.5: |
Lexical Acquisition / 8: |
Evaluation Measures / 8.1: |
Verb Subcategorization / 8.2: |
Attachment Ambiguity / 8.3: |
Hindle and Rooth (1993) / 8.3.1: |
General remarks on PP attachment / 8.3.2: |
Selectional Preferences / 8.4: |
Semantic Similarity / 8.5: |
Vector space measures / 8.5.1: |
Probabilistic measures / 8.5.2: |
The Role of Lexical Acquisition in Statistical NLP / 8.6: |
Grammar / III: |
Markov Models / 9: |
Hidden Markov Models / 9.2: |
Why use HMMs? / 9.2.1: |
General form of an HMM / 9.2.2: |
The Three Fundamental Questions for HMMs / 9.3: |
Finding the probability of an observation / 9.3.1: |
Finding the best state sequence / 9.3.2: |
The third problem: Parameter estimation / 9.3.3: |
HMMs: Implementation, Properties, and Variants / 9.4: |
Implementation / 9.4.1: |
Variants / 9.4.2: |
Multiple input observations / 9.4.3: |
Initialization of parameter values / 9.4.4: |
Part-of-Speech Tagging / 10: |
The Information Sources in Tagging / 10.1: |
Markov Model Taggers / 10.2: |
The probabilistic model / 10.2.1: |
The Viterbi algorithm / 10.2.2: |
Variations / 10.2.3: |
Hidden Markov Model Taggers / 10.3: |
Applying HMMs to POS tagging / 10.3.1: |
The effect of initialization on HMM training / 10.3.2: |
Transformation-Based Learning of Tags / 10.4: |
Transformations / 10.4.1: |
The learning algorithm / 10.4.2: |
Relation to other models / 10.4.3: |
Automata / 10.4.4: |
Summary / 10.4.5: |
Other Methods, Other Languages / 10.5: |
Other approaches to tagging / 10.5.1: |
Languages other than English / 10.5.2: |
Tagging Accuracy and Uses of Taggers / 10.6: |
Tagging accuracy / 10.6.1: |
Applications of tagging / 10.6.2: |
Probabilistic Context Free Grammars / 11: |
Some Features of PCFGs / 11.1: |
Questions for PCFGs / 11.2: |
The Probability of a String / 11.3: |
Using inside probabilities / 11.3.1: |
Using outside probabilities / 11.3.2: |
Finding the most likely parse for a sentence / 11.3.3: |
Training a PCFG / 11.3.4: |
Problems with the Inside-Outside Algorithm / 11.4: |
Probabilistic Parsing / 12: |
Some Concepts / 12.1: |
Parsing for disambiguation / 12.1.1: |
Treebanks / 12.1.2: |
Parsing models vs. language models / 12.1.3: |
Weakening the independence assumptions of PCFGs / 12.1.4: |
Tree probabilities and derivational probabilities / 12.1.5: |
There's more than one way to do it / 12.1.6: |
Phrase structure grammars and dependency grammars / 12.1.7: |
Evaluation / 12.1.8: |
Equivalent models / 12.1.9: |
Building parsers: Search methods / 12.1.10: |
Use of the geometric mean / 12.1.11: |
Some Approaches / 12.2: |
Non-lexicalized treebank grammars / 12.2.1: |
Lexicalized models using derivational histories / 12.2.2: |
Dependency-based models / 12.2.3: |
Discussion / 12.2.4: |
Applications and Techniques / IV: |
Statistical Alignment and Machine Translation / 13: |
Text Alignment / 13.1: |
Aligning sentences and paragraphs / 13.1.1: |
Length-based methods / 13.1.2: |
Offset alignment by signal processing techniques / 13.1.3: |
Lexical methods of sentence alignment / 13.1.4: |
Word Alignment / 13.2: |
Statistical Machine Translation / 13.3: |
Clustering / 14: |
Hierarchical Clustering / 14.1: |
Single-link and complete-link clustering / 14.1.1: |
Group-average agglomerative clustering / 14.1.2: |
An application: Improving a language model / 14.1.3: |
Top-down clustering / 14.1.4: |
Non-Hierarchical Clustering / 14.2: |
K-means / 14.2.1: |
The EM algorithm / 14.2.2: |
Topics in Information Retrieval / 15: |
Some Background on Information Retrieval / 15.1: |
Common design features of IR systems / 15.1.1: |
Evaluation measures / 15.1.2: |
The probability ranking principle (PRP) / 15.1.3: |
The Vector Space Model / 15.2: |
Vector similarity / 15.2.1: |
Term weighting / 15.2.2: |
Term Distribution Models / 15.3: |
The Poisson distribution / 15.3.1: |
The two-Poisson model / 15.3.2: |
The K mixture / 15.3.3: |
Inverse document frequency / 15.3.4: |
Residual inverse document frequency / 15.3.5: |
Usage of term distribution models / 15.3.6: |
Latent Semantic Indexing / 15.4: |
Least-squares methods / 15.4.1: |
Singular Value Decomposition / 15.4.2: |
Latent Semantic Indexing in IR / 15.4.3: |
Discourse Segmentation / 15.5: |
TextTiling / 15.5.1: |
Text Categorization / 16: |
Decision Trees / 16.1: |
Maximum Entropy Modeling / 16.2: |
Generalized iterative scaling / 16.2.1: |
Application to text categorization / 16.2.2: |
Perceptrons / 16.3: |
k Nearest Neighbor Classification / 16.4: |
Tiny Statistical Tables |
Bibliography |
Index |