Preface to the Second Edition
Preface to the First Edition
Acknowledgments

1 The Challenges of Dynamic Programming
1.1 A Dynamic Programming Example: A Shortest Path Problem
1.2 The Three Curses of Dimensionality
1.3 Some Real Applications
1.4 Problem Classes
1.5 The Many Dialects of Dynamic Programming
1.6 What Is New in This Book?
1.7 Pedagogy
1.8 Bibliographic Notes

2 Some Illustrative Models
2.1 Deterministic Problems
2.2 Stochastic Problems
2.3 Information Acquisition Problems
2.4 A Simple Modeling Framework for Dynamic Programs
2.5 Problems

3 Introduction to Markov Decision Processes
3.1 The Optimality Equations
3.2 Finite Horizon Problems
3.3 Infinite Horizon Problems
3.4 Value Iteration
3.5 Policy Iteration
3.6 Hybrid Value-Policy Iteration
3.7 Average Reward Dynamic Programming
3.8 The Linear Programming Method for Dynamic Programs
3.9 Monotone Policies*
3.10 Why Does It Work?**

4 Introduction to Approximate Dynamic Programming
4.1 The Three Curses of Dimensionality (Revisited)
4.2 The Basic Idea
4.3 Q-Learning and SARSA
4.4 Real-Time Dynamic Programming
4.5 Approximate Value Iteration
4.6 The Post-Decision State Variable
4.7 Low-Dimensional Representations of Value Functions
4.8 So Just What Is Approximate Dynamic Programming?
4.9 Experimental Issues
4.10 But Does It Work?

5 Modeling Dynamic Programs
5.1 Notational Style
5.2 Modeling Time
5.3 Modeling Resources
5.4 The States of Our System
5.5 Modeling Decisions
5.6 The Exogenous Information Process
5.7 The Transition Function
5.8 The Objective Function
5.9 A Measure-Theoretic View of Information**

6 Policies
6.1 Myopic Policies
6.2 Lookahead Policies
6.3 Policy Function Approximations
6.4 Value Function Approximations
6.5 Hybrid Strategies
6.6 Randomized Policies
6.7 How to Choose a Policy?

7 Policy Search
7.1 Background
7.2 Gradient Search
7.3 Direct Policy Search for Finite Alternatives
7.4 The Knowledge Gradient Algorithm for Discrete Alternatives
7.5 Simulation Optimization

8 Approximating Value Functions
8.1 Lookup Tables and Aggregation
8.2 Parametric Models
8.3 Regression Variations
8.4 Nonparametric Models
8.5 Approximations and the Curse of Dimensionality

9 Learning Value Function Approximations
9.1 Sampling the Value of a Policy
9.2 Stochastic Approximation Methods
9.3 Recursive Least Squares for Linear Models
9.4 Temporal Difference Learning with a Linear Model
9.5 Bellman's Equation Using a Linear Model
9.6 Analysis of TD(0), LSTD, and LSPE Using a Single State
9.7 Gradient-Based Methods for Approximate Value Iteration*
9.8 Least Squares Temporal Differencing with Kernel Regression*
9.9 Value Function Approximations Based on Bayesian Learning*
9.10 Why Does It Work?**

10 Optimizing While Learning
10.1 Overview of Algorithmic Strategies
10.2 Approximate Value Iteration and Q-Learning Using Lookup Tables
10.3 Statistical Bias in the Max Operator
10.4 Approximate Value Iteration and Q-Learning Using Linear Models
10.5 Approximate Policy Iteration
10.6 The Actor-Critic Paradigm
10.7 Policy Gradient Methods
10.8 The Linear Programming Method Using Basis Functions
10.9 Approximate Policy Iteration Using Kernel Regression*
10.10 Finite Horizon Approximations for Steady-State Applications

11 Adaptive Estimation and Stepsizes
11.1 Learning Algorithms and Stepsizes
11.2 Deterministic Stepsize Recipes
11.3 Stochastic Stepsizes
11.4 Optimal Stepsizes for Nonstationary Time Series
11.5 Optimal Stepsizes for Approximate Value Iteration
11.6 Convergence
11.7 Guidelines for Choosing Stepsize Formulas

12 Exploration Versus Exploitation
12.1 A Learning Exercise: The Nomadic Trucker
12.2 An Introduction to Learning
12.3 Heuristic Learning Policies
12.4 Gittins Indexes for Online Learning
12.5 The Knowledge Gradient Policy
12.6 Learning with a Physical State

13 Value Function Approximations for Resource Allocation Problems
13.1 Value Functions versus Gradients
13.2 Linear Approximations
13.3 Piecewise-Linear Approximations
13.4 Solving a Resource Allocation Problem Using Piecewise-Linear Functions
13.5 The SHAPE Algorithm
13.6 Regression Methods
13.7 Cutting Planes*

14 Dynamic Resource Allocation Problems
14.1 An Asset Acquisition Problem
14.2 The Blood Management Problem
14.3 A Portfolio Optimization Problem
14.4 A General Resource Allocation Problem
14.5 A Fleet Management Problem
14.6 A Driver Management Problem

15 Implementation Challenges
15.1 Will ADP Work for Your Problem?
15.2 Designing an ADP Algorithm for Complex Problems
15.3 Debugging an ADP Algorithm
15.4 Practical Issues
15.5 Modeling Your Problem
15.6 Online versus Offline Models
15.7 If It Works, Patent It!

Bibliography
Index