Convergence and Error Bounds for Universal Prediction of Nonbinary Sequences


Author: Marcus Hutter (2001)
Comments: 12 pages
Subj-class: Learning; Artificial Intelligence; Computational Complexity
ACM-class: F.2.3
Reference: Proceedings of the 12th European Conference on Machine Learning (ECML-2001) 239-250
Report-no: IDSIA-07-01 and cs.LG/0106036
Paper: LaTeX (44kb)  -  PostScript (226kb)  -  PDF (179kb)  -  Html/Gif
Slides: PostScript - PDF

Keywords: Bayesian sequence prediction; Solomonoff induction; Kolmogorov complexity; learning; universal probability; finite non-binary alphabet; convergence; error bounds; games of chance; partial and delayed prediction; classification.

Abstract: Solomonoff's uncomputable universal prediction scheme ξ allows one to predict the next symbol x_k of a sequence x_1...x_{k-1} for any Turing computable, but otherwise unknown, probabilistic environment µ. This scheme will be generalized to arbitrary environmental classes, which, among other things, allows the construction of computable universal prediction schemes ξ. Convergence of ξ to µ in a conditional mean squared sense and with µ probability 1 is proven. It is shown that the average number of prediction errors made by the universal ξ scheme rapidly converges to that made by the best possible informed µ scheme. The schemes, theorems and proofs are given for a general finite alphabet, which results in additional complications as compared to the binary case. Several extensions of the presented theory and results are outlined. They include general loss functions and bounds, games of chance, infinite alphabet, partial and delayed prediction, classification, and more active systems.
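
To make the abstract's claims concrete, the following display sketches the standard Bayes-mixture setup behind ξ (reconstructed from the usual definitions, not quoted from the paper): ξ mixes the environments ν of a countable class M with prior weights w_ν, and the cumulative relative entropy D_n between µ and ξ bounds the summed squared differences of the conditional probabilities, which is the convergence statement of Theorem 1.

\[
  \xi(x_{1:n}) \;=\; \sum_{\nu\in\mathcal{M}} w_\nu\,\nu(x_{1:n}),
  \qquad w_\nu>0,\quad \sum_{\nu\in\mathcal{M}} w_\nu\le 1,
\]
\[
  \sum_{k=1}^{n}\mathbf{E}\!\left[\sum_{x'\in\mathcal{X}}
  \bigl(\mu(x'\mid x_{<k})-\xi(x'\mid x_{<k})\bigr)^{2}\right]
  \;\le\; D_n \;:=\; \mathbf{E}\!\left[\ln\frac{\mu(x_{1:n})}{\xi(x_{1:n})}\right]
  \;\le\; \ln w_\mu^{-1} \;<\;\infty.
\]

Since the left-hand side is bounded uniformly in n, the summands must tend to zero, i.e. ξ(·|x_{<k}) converges to µ(·|x_{<k}) in the conditional mean squared sense of the abstract; the error bound of Theorem 2 relates the excess number of prediction errors of ξ over the informed µ predictor to D_n in a similar way.

A minimal runnable illustration of the same construction for a small computable class (the alphabet, class members and prior weights below are made-up examples, not taken from the paper):

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical finite class M of i.i.d. categorical environments over a
    # 3-symbol alphabet, with uniform prior weights w; all numbers are made up.
    M = [np.array([0.7, 0.2, 0.1]),
         np.array([0.2, 0.5, 0.3]),
         np.array([1/3, 1/3, 1/3])]
    w = np.full(len(M), 1.0 / len(M))
    mu = M[1]                                  # the true but "unknown" environment

    for k in range(301):
        xi_pred = sum(wi * nu for wi, nu in zip(w, M))    # xi( . | x_<k): weighted mixture
        if k % 100 == 0:
            print(f"k={k:3d}  sum_x (mu - xi)^2 = {((mu - xi_pred) ** 2).sum():.5f}")
        x = rng.choice(3, p=mu)                           # observe next symbol drawn from mu
        w = w * np.array([nu[x] for nu in M])             # Bayes update of the weights
        w /= w.sum()

With the true environment contained in the class, the posterior weights concentrate on it and the printed squared distance between µ and ξ's prediction shrinks toward zero, mirroring the bound above.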

Reviewer: Resolves a 25-year open problem in a central prediction method for time series in the affirmative. Textbook quality; definite accept.


Table of Contents

  1. Introduction
  2. Setup
    • Strings and Probability Distributions
    • Universal Prior Probability Distribution
    • Probability Classes
  3. Convergence
    • Upper Bound for the Relative Entropy
    • Lower Bound for the Relative Entropy
    • Convergence of ξ to µ
    • Theorem 1 (Convergence)
  4. Error Bounds
    • Total Expected Numbers of Errors
    • Error Bound
    • Theorem 2 (Error bound)
    • Proof of Theorem 2
  5. Generalizations
    • General Loss Function
    • Games of Chance
    • Infinite Alphabet
    • Partial Prediction, Delayed Prediction, Classification
    • More Active Systems
    • Miscellaneous
  6. Summary

BibTeX Entry

@InProceedings{Hutter:01alpha,
  author =       "Marcus Hutter",
  number =       "IDSIA-07-01",
  institution =  "Istituto Dalle Molle di Studi sull'Intelligenza Artificiale",
  address =      "Manno(Lugano), CH",
  title =        "Convergence and Error Bounds for Universal Prediction of Nonbinary Sequences",
  booktitle =    "Proceedings of the 12th European Conference on
                  Machine Learning (ECML-2001)",
  editor =       "Luc De Raedt and Peter Flach",
  publisher =    "Springer",
  series =       "Lecture Notes in Artificial Intelligence (LNAI 2167)",
  month =         sep,
  year =         "2001",
  pages =        "239--250",
  ftp =          "ftp://ftp.idsia.ch/pub/techrep/IDSIA-07-01.ps.gz",
  url =          "http://www.hutter1.net/ai/palpha.htm",
  url2 =         "http://arxiv.org/abs/cs.LG/0106036",
  keywords =     "Induction; Solomonoff, Bayesian, deterministic
                  prediction; Kolmogorov complexity; learning; loss function;
                  algorithmic information theory; universal probability",
  abstract =     "Solomonoff's uncomputable universal prediction scheme $\xi$ allows one
                  to predict the next symbol $x_k$ of a sequence $x_1...x_{k-1}$ for
                  any Turing computable, but otherwise unknown, probabilistic
                  environment $\mu$. This scheme will be generalized to arbitrary
                  environmental classes, which, among others, allows the
                  construction of computable universal prediction schemes $\xi$.
                  Convergence of $\xi$ to $\mu$ in a conditional mean squared sense
                  and with $\mu$ probability $1$ is proven. It is shown that the
                  average number of prediction errors made by the universal $\xi$
                   scheme rapidly converges to that made by the best possible
                  informed $\mu$ scheme. The schemes, theorems and proofs are given
                   for a general finite alphabet, which results in additional
                  complications as compared to the binary case.
                  Several extensions of the presented theory and
                  results are outlined. They include general loss functions and
                  bounds, games of chance, infinite alphabet, partial and delayed
                  prediction, classification, and more active
                  systems.",
}
      