
## Convergence and Error Bounds for Universal Prediction of Nonbinary Sequences

Author: Marcus Hutter (2001)
Comments: 12 pages
Subj-class: Learning; Artificial Intelligence; Computational Complexity
ACM-class: F.2.3
Reference: Proceedings of the 12th European Conference on Machine Learning (ECML-2001) 239-250
Report-no: IDSIA-07-01 and cs.LG/0106036
Paper: LaTeX (44kb) - PostScript (226kb) - PDF (179kb) - Html/Gif
Slides: PostScript - PDF

Keywords: Bayesian sequence prediction; Solomonoff induction; Kolmogorov complexity; learning; universal probability; finite non-binary alphabet; convergence; error bounds; games of chance; partial and delayed prediction; classification.

Abstract: Solomonoff's uncomputable universal prediction scheme ξ allows one to predict the next symbol x_k of a sequence x_1...x_{k-1} for any Turing computable, but otherwise unknown, probabilistic environment µ. This scheme is generalized to arbitrary environmental classes, which, among other things, allows the construction of computable universal prediction schemes ξ. Convergence of ξ to µ in a conditional mean squared sense and with µ probability 1 is proven. It is shown that the average number of prediction errors made by the universal ξ scheme rapidly converges to that made by the best possible informed µ scheme. The schemes, theorems and proofs are given for a general finite alphabet, which results in additional complications as compared to the binary case. Several extensions of the presented theory and results are outlined. They include general loss functions and bounds, games of chance, infinite alphabet, partial and delayed prediction, classification, and more active systems.
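The mixture scheme the abstract describes can be illustrated with a small sketch. This is not the paper's code: it builds a Bayes-mixture predictor ξ over a hypothetical finite class M of i.i.d. environments on a 3-symbol alphabet (the paper's classes are far more general, e.g. all enumerable semimeasures). The true environment µ is a member of M, and ξ's next-symbol probabilities converge to µ's, as Theorem 1 asserts. All names (`CLASS_M`, `xi_predict`, the particular distributions) are illustrative assumptions.

```python
# Illustrative sketch only (not from the paper): Bayes mixture xi over a
# finite class M of i.i.d. environments on alphabet {0, 1, 2}.
import random

ALPHABET = [0, 1, 2]

# A hypothetical finite class M: each member is an i.i.d. distribution.
CLASS_M = [
    (0.6, 0.3, 0.1),
    (0.2, 0.5, 0.3),   # <- used below as the true environment mu
    (0.1, 0.1, 0.8),
]
PRIOR = [1.0 / len(CLASS_M)] * len(CLASS_M)   # uniform prior weights w_nu

def env_prob(theta, seq):
    """nu(x_1...x_n): probability an i.i.d. environment assigns to seq."""
    p = 1.0
    for x in seq:
        p *= theta[x]
    return p

def xi_predict(history):
    """xi(x | history) = sum_nu w_nu(history) * nu(x), where the posterior
    weight w_nu(history) is proportional to w_nu * nu(history)."""
    post = [w * env_prob(theta, history) for w, theta in zip(PRIOR, CLASS_M)]
    z = sum(post)
    post = [p / z for p in post]
    return [sum(post[i] * CLASS_M[i][x] for i in range(len(CLASS_M)))
            for x in ALPHABET]

random.seed(0)
mu = CLASS_M[1]
history = random.choices(ALPHABET, weights=mu, k=300)
pred = xi_predict(history)
# After enough observations the posterior concentrates on mu, so xi's
# next-symbol probabilities approach mu's.
print([round(p, 3) for p in pred])
```

The sketch multiplies raw probabilities, which underflows for long sequences; a serious implementation would accumulate log-probabilities instead. The convergence it demonstrates is exactly the ξ → µ statement of Theorem 1, restricted to a toy class.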

Reviewer: Resolves a 25-year-old open problem in a central prediction method for time series in the affirmative. Textbook quality -- definite accept.


1. Introduction
2. Setup
• Strings and Probability Distributions
• Universal Prior Probability Distribution
• Probability Classes
3. Convergence
• Upper Bound for the Relative Entropy
• Lower Bound for the Relative Entropy
• Convergence of ξ to µ
• Theorem 1 (Convergence)
4. Error Bounds
• Total Expected Numbers of Errors
• Error Bound
• Theorem 2 (Error bound)
• Proof of Theorem 2
5. Generalizations
• General Loss Function
• Games of Chance
• Infinite Alphabet
• Partial Prediction, Delayed Prediction, Classification
• More Active Systems
• Miscellaneous
6. Summary

### BibTeX Entry

@InProceedings{Hutter:01alpha,
author =       "Marcus Hutter",
number =       "IDSIA-07-01",
institution =  "Istituto Dalle Molle di Studi sull'Intelligenza Artificiale",
title =        "Convergence and Error Bounds for Universal Prediction of Nonbinary Sequences",
booktitle =    "Proceedings of the 12th European Conference on
Machine Learning (ECML-2001)",
editor =       "Luc De Raedt and Peter Flach",
publisher =    "Springer",
series =       "Lecture Notes in Artificial Intelligence (LNAI 2167)",
month =         sep,
year =         "2001",
pages =        "239--250",
ftp =          "ftp://ftp.idsia.ch/pub/techrep/IDSIA-07-01.ps.gz",
url =          "http://www.hutter1.net/ai/palpha.htm",
url2 =         "http://arxiv.org/abs/cs.LG/0106036",
keywords =     "Induction; Solomonoff, Bayesian, deterministic
prediction; Kolmogorov complexity; learning; loss function;
algorithmic information theory; universal probability",
abstract =     "Solomonoff's uncomputable universal prediction scheme $\xi$ allows
to predict the next symbol $x_k$ of a sequence $x_1...x_{k-1}$ for
any Turing computable, but otherwise unknown, probabilistic
environment $\mu$. This scheme will be generalized to arbitrary
environmental classes, which, among others, allows the
construction of computable universal prediction schemes $\xi$.
Convergence of $\xi$ to $\mu$ in a conditional mean squared sense
and with $\mu$ probability $1$ is proven. It is shown that the
average number of prediction errors made by the universal $\xi$
scheme rapidly converges to those made by the best possible
informed $\mu$ scheme. The schemes, theorems and proofs are given
for general finite alphabet, which results in additional
complications as compared to the binary case.
Several extensions of the presented theory and
results are outlined. They include general loss functions and
bounds, games of chance, infinite alphabet, partial and delayed
prediction, classification, and more active
systems.",
}
