0
\eeq
dominates all $\nu\in\M$. This dominance is
necessary for the desired convergence $\xi\to\mu$ similarly to
(\ref{eukdist}). The question is what properties $\xi$ possesses.
The distinguishing property of $\M_{enum}^{semi}$ is that $\xi$ is
itself an element of $\M_{enum}^{semi}$. When concerned with
predictions, $\xi_\M\in\M$ is not by itself an important property,
but whether $\xi$ is computable in one of the senses of Definition
\ref{defCompFunc}. We define
\bqan
\M_1\geqm\M_2 & :\Leftrightarrow &
\mbox{there is an element of $\M_1$ that dominates all elements of
$\M_2$} \\
& :\Leftrightarrow &
\exists\rho\!\in\!\M_1\;\forall\nu\!\in\!\M_2\;\exists w_\nu\!>\!0
\;\forall x:\rho(x)\!\geq\!w_\nu\nu(x).
\eqan
$\geqm $ is transitive (but not necessarily reflexive) in the
sense that $\M_1 \geqm \M_2 \geqm \M_3$ implies $\M_1 \geqm \M_3$
and $\M_0 \supseteq \M_1 \geqm \M_2 \supseteq \M_3$ implies $\M_0
\geqm \M_3$.
%
For the computability concepts introduced in Section~\ref{secCC}
we have the following proper set inclusions
\beqn
\begin{array}{ccccccc}
\M_{rec}^{msr} & \subset & \M_{est}^{msr} & \equiv & \M_{enum}^{msr} & \subset & \M_{appr}^{msr} \\
\cap & & \cap & & \cap & & \cap \\
\M_{rec}^{semi} & \subset & \M_{est}^{semi} & \subset & \M_{enum}^{semi} & \subset & \M_{appr}^{semi}
\end{array}
\eeqn
%
where $\M^{msr}_c$ stands for the set of all probability measures
of appropriate computability type $c\in\{$rec=recursive, est=estimable, enum=enumerable,
appr=approximable$\}$, and similarly for semimeasures
$\M^{semi}_c$. From an enumeration of a measure $\rho$ one can
construct a co-enumeration by exploiting
$\rho(x_{1:n})=1-\sum_{y_{1:n}\neq x_{1:n}}\rho(y_{1:n})$. This
shows that every enumerable measure is also co-enumerable, hence
estimable, which proves the identity $\equiv$ above.
With this notation, Theorem~\ref{thUniM} implies
$\M_{enum}^{semi}\geqm\M_{enum}^{semi}$. Transitivity allows to
conclude, for instance, that
$\M_{appr}^{semi}\geqm\M_{rec}^{msr}$, i.e.\ that there is an
approximable semimeasure that dominates all recursive measures.
The standard ``diagonalization'' way of proving
$\M_1\ngeqm\M_2$ is to take an arbitrary
$\mu\in\M_1$ and ``increase'' it to $\rho$ such that
$\mu\ngeqm\rho$ and show that $\rho\in\M_2$.
There are $7\times 7$ combinations of (semi)measures $\M_1$ with
$\M_2$ for which $\M_1\geqm\M_2$ could be true or false. There are
four basic cases, explicated in the following theorem, from which
the other 49 combinations displayed in Table~\ref{tabUniSMsr}
follow by transitivity.
%------------------------------%
\ftheorem{thNoUniApp}{Universal (semi)measures}{
%------------------------------%
A semimeasure $\rho$ is said to be universal for $\M$ if it
multiplicatively dominates all elements of $\M$ in the sense
$\forall\nu\exists w_\nu>0:\rho(x)\geq w_\nu\nu(x)\forall x$. The
following holds true:
\begin{list}{}{\parsep=1ex}
\item[$o)$]
$\exists\rho:\{\rho\}\geqm\M$: For every countable set
of (semi)measures $\M$, there is a (semi)measure that dominates
all elements of $\M$.
\item[$i)$]
$\M_{enum}^{semi}\geqm\M_{enum}^{semi}$:
The class of enumerable semimeasures {\em contains}
a universal element.
\item[$ii)$]
$\M_{appr}^{msr}\geqm\M_{enum}^{semi}$:
There {\em is} an approximable measure that dominates all enumerable
semimeasures.
\item[$iii)$]
$\M_{est}^{semi}\ngeqm\M_{rec}^{msr}$: There is
{\em no} estimable semimeasure that dominates all recursive
measures.
\item[$iv)$]
$\M_{appr}^{semi}\ngeqm\M_{appr}^{msr}$: There is
{\em no} approximable semimeasure that dominates all approximable
measures.
\end{list}
}%------------------------------%
\begin{table}[thb]
\ftablex{tabUniSMsr}{Existence of universal (semi)measures}{%
The entry in row $r$ and column $c$ indicates whether there is an
$r$-able (semi)measure $\rho$ dominating the set $\M$ that contains all
$c$-able (semi)measures, where $r,c\in\{$recurs, estimat, enumer,
approxim$\}$. Enumerable measures are estimable. This is the
reason why the enum.\ row and column in case of measures are
missing. The superscript indicates from which part of Theorem
\ref{thNoUniApp} the answer follows. For the bold face entries
directly, for the others using transitivity of $\geqm $.
\begin{center}
\begin{tabular}{|c|c||c|c|c|c||c|c|c|}\hline
$\nwarrow$ & $\M$ & \multicolumn{4}{c||}{semimeasure} & \multicolumn{3}{c|}{measure}\\ \hline
$\rho$&$\searrow$& rec. & est. & enum. & appr. & rec. & est. & appr. \\ \hline\hline
s & rec. & no$^{iii}$ & no$^{iii}$ & no$^{iii}$ & no$^{iv}$ & no$^{iii}$ & no$^{iii}$ & no$^{iv}$ \\ \cline{2-9}
e & est. & no$^{iii}$ & no$^{iii}$ & no$^{iii}$ & no$^{iv}$ & {\bf no}$^{\bf iii}$& no$^{iii}$ & no$^{iv}$ \\ \cline{2-9}
m & enum. & yes$^{i}$ & yes$^{i}$ & {\bf yes}$^{\bf i}$ & no$^{iv}$ & yes$^{i}$ & yes$^{i}$ & no$^{iv}$ \\ \cline{2-9}
i &appr. & yes$^{i}$ & yes$^{i}$ & yes$^{i}$ & no$^{iv}$ & yes$^{i}$ & yes$^{i}$ & {\bf no}$^{\bf iv}$\\ \hline\hline
m & rec. & no$^{iii}$ & no$^{iii}$ & no$^{iii}$ & no$^{iv}$ & no$^{iii}$ & no$^{iii}$ & no$^{iv}$ \\ \cline{2-9}
s & est. & no$^{iii}$ & no$^{iii}$ & no$^{iii}$ & no$^{iv}$ & no$^{iii}$ & no$^{iii}$ & no$^{iv}$ \\ \cline{2-9}
r &appr. & yes$^{ii}$ & yes$^{ii}$ & {\bf yes}$^{\bf ii}$& no$^{iv}$ & yes$^{ii}$ & yes$^{ii}$ & no$^{iv}$ \\ \hline
\end{tabular}
\end{center}}
\end{table}
\noindent If we ask for a universal (semi)measure that at least
satisfies the weakest form of computability, namely being
approximable, we see that the largest dominated set among the 7
sets defined above is the set of enumerable semimeasures. This is
the reason why $\M_{enum}^{semi}$ plays a special role. On the
other hand, $\M_{enum}^{semi}$ is not the largest set dominated by
an approximable semimeasure, and indeed no such largest set
exists. One may, hence, ask for ``natural'' larger sets $\M$. One
such set, namely the set of cumulatively enumerable semimeasures
$\M_{\text{CEM}}$, has recently been discovered by Schmidhuber
\cite{Schmidhuber:00toe,Schmidhuber:02gtm}, for which even
$\xi_{\text{CEM}}\in\M_{\text{CEM}}$ holds.
\noindent Theorem~\ref{thNoUniApp} also holds for {\em discrete
(semi)measures} $P$ defined as follows:
%------------------------------%
\fdefinition{defDSemi}{Discrete (semi)measures}{
%------------------------------%
$P(x)$ denotes the probability of $x\in\SetN$. We call
$P:\SetN\to[0,1]$ a discrete (semi)measure if $\sum_{x\in\SetN}
P(x)\stackrel{(<)}=1$.
}%------------------------------%
%
Theorem~\ref{thNoUniApp}
$(i)$ is Levin's major result \cite[Thm.4.3.1 \& Thm.4.5.1]{Li:97}, %
and $(ii)$ is due to Solomonoff \cite{Solomonoff:78}. %
The proof of $\M_{rec}^{semi}\ngeqm\M_{rec}^{semi}$ in
\cite[p249]{Li:97} contains minor errors and is not extensible to
$(iii)$, and the proof in \cite[p276]{Li:97} only applies to
infinite alphabet and not to the binary/finite case considered
here. $\M_{est}^{semi}\ngeqm\M_{est}^{semi}$
is mentioned in \cite{Zvonkin:70} without proof.
%
A direct proof of $(iv)$ can be found in \cite{Hutter:04uaibook}.
%
Here, we reduce $(iv)$ to $(iii)$ by exploiting the following
elementary fact (well-known for integer-valued functions, see
e.g.\ \cite[p634]{Simpson:77}):
%------------------------------%
\flemma{lemOracle}{Approximable = $H$-estimable}{
%------------------------------%
A function is approximable iff it is estimable with the help of
the halting oracle.
}%------------------------------%
%------------------------------%
\paradot{Proof}
%------------------------------%
With $H$-computable we mean, computable with the help of the
halting oracle, or equivalently, computable under extra input of
the halting sequence $h=h_{1:\infty}\in\B^\infty$, where $h_n=1$
$:\Leftrightarrow$ $U(n)$ halts.
Assume $f$ is approximable, i.e.\ $\forall\eps\exists y,m:
R(m,y,\eps)$, where relation $R(m,y,\eps):=[\forall n\geq
m:|f_n(x)-y|<\eps]$ and recursive $f_n\to f$. Fix $\eps>0$.
Search (dovetail) for $m\in\SetN$ and $y$ ($\in\odt\eps\SetZ$ is
sufficient) such that $R(m,y,\eps)=$true. $R$ is
co-enumerable, hence $H$-decidable, hence $y$ can be $H$-computed,
hence $f$ is $H$-estimable, since $f(x)=y\pm O(\eps)$.
Now assume that $f$ is $H$-estimable, i.e.\ $\exists T\in$TM
$\forall\eps,x:|T(x,\eps,h)-f(x)|<\eps$. Since $h$ is
co-enumerable, $T$ and hence $f$ are approximable. More formally,
let $h_n^t=1$ $:\Leftrightarrow$ $U(n)$ halts within $t$ steps.
Then $g(x,\eps) := T(x,\eps,h) = T(x,\eps,\lim_{t\to\infty}h^t) =
\lim_{t\to\infty}T(x,\eps,h^t)$ is approximable, where the
exchange of limits holds, since $T$ only reads $n_{x\eps}<\infty$
bits of $h$ and $h_{1:n_{x\eps}}=h^t_{1:n_{x\eps}}$ for
sufficiently large $t$. \qed
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Proof of Theorem~\ref{thNoUniApp}}\label{secProof}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
We first prove the theorem for discrete (semi)measures $P$ (Definition
\ref{defDSemi}), since it contains the essential ideas in a
cleaner form. We then present the proof for continuous
(semi)measures $\mu$ (Definition~\ref{defSemi}). We present proofs
for binary alphabet $\X=\B$ only. The proofs naturally generalize from
binary to arbitrary finite alphabet. $\arg\min_x f(x)$ is the $x$
that minimizes $f(x)$. Ties are broken in an arbitrary but
computable way (e.g.\ by taking the smallest $x$).
%------------------------------%
\paradot{Proof (discrete case)}\\%
%------------------------------%
\paranodot{(o)} $Q(x):=\sum_{P\in\M}w_P P(x)$
with $w_P>0$ obviously dominates all $P\in\M$ (with constant
$w_P$). With $\sum_P w_P=1$ and all $P$ being discrete
(semi)measures also $Q$ is a discrete (semi)measure.
\paranodot{(i)} See \cite[Thm.4.3.1]{Li:97}.
\paranodot{(ii)} Let $P$ be the universal element in
$\M_{enum}^{semi}$ and $\alpha:=\sum_x P(x)$. We normalize $P$ by
$Q(x):={1\over\alpha}P(x)$. Since $\alpha\leq 1$ we have $Q(x)\geq
P(x)$. Hence $Q\geq P\geqm\M_{enum}^{semi}$. As a
ratio between two enumerable functions, $Q$ is still approximable,
hence $\M_{appr}^{msr}\geqm\M_{enum}^{semi}$.
\paranodot{(iii)}
Let $P\in\M_{rec}^{semi}$. We partition $\SetN$ into chunks
$I_n:=\{2^{n-1},...,2^n-1\}$ ($n\geq 1$) of increasing size. With
$x_n:=\arg\min_{x\in I_n}P(x)$ we define $Q(x_n):={1\over
n(n+1)}\forall n$ and $Q(x):=0$ for all other $x$. Exploiting that
a minimum is smaller than an average and that $\mu$ is a
semimeasure, we get
\beqn
P(x_n)=\min_{x\in I_n}P(x)\leq{1\over|I_n|}\sum_{x\in
I_n}P(x)\leq{1\over|I_n|}={1\over 2^{n-1}}= {n(n+1)\over
2^{n-1}} Q(x_n)
\eeqn
Since ${n(n+1)\over 2^{n-1}}\to 0$ for $n\to\infty$, $P$ cannot
dominate $Q$ ($P\ngeqm Q$). With $P$ also $Q$
is recursive. Since $P$ was an arbitrary recursive semimeasure
and $Q$ is a recursive measure ($\sum Q(x)=\sum[{1\over
n(n+1)}]=\sum[{1\over n}-{1\over n+1}]=1$) this implies
$\M_{rec}^{semi}\ngeqm\M_{rec}^{msr}$.
Assume now that there is an estimable semimeasure
$S\geqm\M_{rec}^{msr}$. We construct a recursive semimeasure
$P\geqm S$ as follows. Choose an initial $\eps>0$ and finitely
compute an $\eps$-approximation $\hat S$ of $S(x)$. If $\hat
S>2\eps$ define $P(x):=\odt\hat S$, else halve $\eps$ and repeat
the process. Since $S(x)>0$ (otherwise it could not dominate,
e.g.\ $T(x):={1\over x(x+1)}\in\M_{rec}^{msr}$) the loop
terminates after finite time. So $P$ is recursive. Inserting $\hat
S=2P(x)$ and $\eps<\odt\hat S=P(x)$ into $|S(x)-\hat S|<\eps$ we
get $|S(x)-2P(x)|)}=\rho(x0)+\rho(x1)$, $x\in\B^*$, is
respected. On the other hand, the chunking $I_n:=\B^n$ is more
natural here.
\paranodot{(o)} $\rho(x):=\sum_{\nu\in\M}w_\nu \nu(x)$ with $w_\nu>0$
obviously dominates all $\nu\in\M$ (with domination constant
$w_\nu$). With $\sum_\nu w_\nu=1$ and all $\nu$ being
(semi)measures also $\rho$ is a (semi)measure.
\paranodot{(i)} See \cite[Thm.4.5.1]{Li:97}.
\paranodot{(ii)} Let $\xi$ be a universal element in $\M_{enum}^{semi}$.
We define \cite{Solomonoff:78}
\beqn
\xi_{norm}(x_{1:n}) \;:=\;
\prod_{t=1}^n{\xi(x_{1:t}) \over \xi(x_{0
\;\Rightarrow\; 0_{1:\infty}\;\mbox{is a
$\mu$-random sequence},
\\
\mu_2(0_{1:n}) &\!=\!& \prod_{t=1}^n(1\!-\!\odt
t^{-2})\stackrel{n\to\infty}\longrightarrow c_2=0.358...>0
\;\Rightarrow\; \xi(0_{1:n})
\to w_1c_1+w_2c_2=:c_\xi>0
\\
\xi(0_{0\,\forall\th$.
In the following let $\mu=\mu_{\th_0}\in\M$ be the true environment.
\beq\label{MLmuMr}
\omega=x_{1:\infty} \mbox{ is } \mu/\xi\mbox{-random}
\quad\Leftrightarrow\quad
\exists c_\omega : {\xi(x_{1:n})\leq c_\omega\!\cdot\!\mu_{\th_0}(x_{1:n})}
\;\forall n
\eeq
For binary alphabet it is sufficient to establish whether
$\xi(1|x_{1:n}) \toinfty{n} \th_0\equiv\mu(1|x_{1:n})$ for
$\mu/\xi$-random $x_{1:\infty}$ in order to decide
$\xi(x_n|x_{0$.
This, together with (\ref{MLenbnd2}) implies $\e^{n\eps}\leq c$ for
infinitely many $n$ which is impossible. Hence, the assumption
$\hat\th_n\not\to\th_0$ was wrong.
Now, $\hat\th_n\to\th_0$ implies that for arbitrary
$\th\neq\th_0$, $\th\in\Theta$ and for sufficiently large $n$
there exists $\delta_\th>0$ such that $D(\hat\th_n||\th)\geq 2\delta_\th$
(since $D(\th_0||\th)\neq 0)$ and $D(\hat\th_n||\th_0)\leq\delta_\th$.
This implies
\beqn\label{MLwto0}
w_n^\th \;\leq\; {w_\th\over w_{\th_0}}
\e^{n[D(\hat\th_n||\th_0)\!-\!D(\hat\th_n||\th)]}
\;\leq\; {w_\th\over w_{\th_0}} \e^{-n\delta_\th}
\;\toinfty{n}\; 0,
\eeqn
where we have used (\ref{MLpw}) and (\ref{MLmuRatio}) in the first
inequality and the second inequality holds for sufficiently large
$n$. Hence $\sum_{\th\neq\th_0} w_n^\th\to 0$ by (\ref{MLsumconv})
and $w_n^{\th_0}\to 1$ by normalization (\ref{MLpw}), which finally gives
\beqn
\xi(1|x_{1:n})=w_n^{\th_0} \mu_{\th_0}(1|x_{1:n}) +
\sum_{\th\neq\th_0}w_n^\th \mu_\th(1|x_{1:n}) \;\toinfty{n}
\mu_{\th_0}(1|x_{1:n}).
\eeqn
%------------------------------%
{\bf (ii)} We first consider the case $\Theta=\{\th_0,\th_1\}$:
Let us choose $\bar\th$ ($=\ln({1-\th_0\over
1-\th_1})/\ln({\th_1\over\th_0}{1-\th_0\over 1-\th_1})
\not\in\Theta$) in the (KL) middle of $\th_0$ and $\th_1$ such
that
\beq\label{MLMid}
D(\bar\th||\th_0)=D(\bar\th||\th_1), \qquad
0 < \th_0 < \bar\th < \th_1 < 1,
\eeq
\beqn
\mbox{and choose $x_{1:\infty}$ such that $\hat\th_n:={n_1\over n}$
satisfies $|\hat\th_n-\bar\th|\leq{1\over n}
\quad(\Rightarrow\;\hat\th_n\toinfty{n}\bar\th)$}
\eeqn
We will show that $x_{1:\infty}$
is $\mu_{\th_0}/\xi$-random {\em and} $\mu_{\th_1}/\xi$-random.
Obviously no $\xi$ can converge to $\th_0$
{\em and} $\th_1$, thus proving $\M$-non-convergence.
($x_{1:\infty}$ is obviously not $\mu_{\th_{0/1}}$ M.L.-random,
since the relative frequency $\hat\th_n\not\to\th_{0/1}$.
$x_{1:\infty}$ is not even $\mu_{\bar\th}$ M.L.-random, since
$\hat\th_n$ converges too fast ($\sim\odn$). $x_{1:\infty}$ is
indeed very regular, whereas ${n_1\over n}$ of a truly
$\mu_{\bar\th}$ M.L.-random sequence has fluctuations of the order
$1/\sqrt n$. The fast convergence is necessary for
doubly $\mu/\xi$-randomness.
%
The reason that $x_{1:\infty}$ is $\mu/\xi$-random, but not M.L.-random is
that $\mu/\xi$-randomness is a weaker concept than M.L.-randomness for
$\M\subset\M_{enum}^{semi}$. Only regularities characterized by
$\nu\in\M$ are recognized by $\mu/\xi$-randomness.)
In the following we assume that $n$ is sufficiently large
such that $\th_0\leq\hat\th_n\leq\th_1$. We need
\beq\label{MLDD}
|D(\hat\th||\th)-D(\bar\th||\th)| \leq c|\hat\th-\bar\th|
\quad\forall\,\th,\hat\th,\bar\th\in[\th_0,\th_1]
\qmbox{with} \textstyle c:=\ln\!{\th_1(1-\th_0)\over\th_0(1-\th_1)} < \infty
\eeq
which follows for $\hat\th\geq\bar\th$ (similarly
$\hat\th\leq\bar\th$) from
\beqn
D(\hat\th||\th)-D(\bar\th||\th) = \int_{\bar\th}^{\hat\th}
[{\textstyle\ln{\th'\over\th}-\ln{1-\th'\over 1-\th}}]d\th'
\leq \int_{\bar\th}^{\hat\th}
[{\textstyle\ln{\th_1\over\th_0}-\ln{1-\th_1\over 1-\th_0}}]d\th'
= c\!\cdot\!(\hat\th-\bar\th)
\eeqn
where we have increased $\th'$ to $\th_1$ and decreased $\th$ to
$\th_0$ in the inequality. Using (\ref{MLDD}) in (\ref{MLmuRatio})
twice we get
\beq\label{MLmu01}
{\mu_{\th_1}(x_{1:n})\over\mu_{\th_0}(x_{1:n})}
=
\e^{n[D(\hat\th_n||\th_0)-D(\hat\th_n||\th_1)]}
\leq
\e^{n[D(\bar\th||\th_0)+c|\hat\th_n-\bar\th|-
D(\bar\th||\th_1)+c|\hat\th_n-\bar\th|]}
\leq
\e^{2c}
\eeq
where we have used (\ref{MLMid}) in the last inequality. Now,
(\ref{MLmu01}) and (\ref{MLpw}) lead to
\beq\label{MLwgeq0}
w_n^{\th_0}
= w_{\th_0}{\mu_{\th_0}(x_{1:n})\over\xi(x_{1:n})}
= [1+{w_{\th_1}\over w_{\th_0}}{\mu_{\th_1}(x_{1:n})\over\mu_{\th_0}(x_{1:n})}]^{-1}
\geq [1+{w_{\th_1}\over w_{\th_0}}\e^{2c}]^{-1}=:c_0>0,
\eeq
which shows that $x_{1:\infty}$ is $\mu_{\th_0}/\xi$-random by
(\ref{MLmuMr}). Exchanging $\th_0\leftrightarrow\th_1$ in
(\ref{MLmu01}) and (\ref{MLwgeq0}) we similarly get
$w_n^{\th_1}\geq c_1>0$, which implies (using
$w_n^{\th_0}+w_n^{\th_1}=1$)
\beq\label{MLnonconv2}
\xi(1|x_{1:n})=
\sum_{\th\in\{\th_0,\th_1\}}w_n^\th \mu_\th(1|x_{1:n})
= w_n^{\th_0}\!\cdot\!\th_0 + w_n^{\th_1}\!\cdot\!\th_1
\neq \th_0 = \mu_{\th_0}(1|x_{1:n}).
\eeq
This shows $\xi(1|x_{1:n}) \;\;\not\!\!\!\toinfty{n}
\mu(1|x_{1:n})$.
One can show that $\xi(1|x_{1:n})$ does not only not converge to
$\th_0$ (and $\th_1$), but that it does not converge at all. The
fast convergence demand $|\hat\th_n-\bar\th|\leq\odn$ on
$x_{1:\infty}$ can be weakened to
$\hat\th_n\leq\bar\th+O(\odn)\,\forall n$ and
$\hat\th_n\geq\bar\th-O(\odn)$ for infinitely many $n$, then
$x_{1:\infty}$ is still $\mu_{\th_0}/\xi$-random, and
$w_n^{\th_1}\geq c_1'>0$ for infinitely many $n$, which is
sufficient to prove $\xi\not\to\mu$.
We now consider general $\Theta$ with gap in the sense that there exist
$0<\th_0<\th_1<1$ with
$[\th_0,\th_1]\cap\Theta=\{\th_0,\th_1\}$: We show
that all $\th\neq\th_0,\th_1$ give asymptotically no contribution
to $\xi(1|x_{1:n})$, i.e.\ (\ref{MLnonconv2}) still applies. Let
$\th\in\Theta\setminus\{\th_0,\th_1\}$; all other definitions as
before. Then
$\delta_\th:=D(\bar\th||\th)-D(\bar\th||\th_{0/1})>0$, since
$\th$ is farther than $\th_{0/1}$ away from $\bar\th$
($|\th-\bar\th|>|\th_{0/1}-\bar\th|$). Similarly to (\ref{MLmu01}) with
$\th$ instead $\th_1$ we get
\beqn
{\mu_\th(x_{1:n})\over\mu_{\th_0}(x_{1:n})}
= \e^{n[D(\hat\th_n||\th_0)-D(\hat\th_n||\th)]}
\leq \e^{2c}\!\cdot\!
\e^{n[D(\bar\th||\th_0)-D(\bar\th||\th)]}
= \e^{2c}\e^{-n\delta_\th}
\toinfty{n} 0
\eeqn
Hence $w_n^\th\leq{w_\th\over w_{\th_0}}\e^{2c}\e^{-n\delta_\th}\to
0$ from (\ref{MLpw}) and
$\eps_n:=\sum_{\th\in\Theta\setminus\{\th_0,\th_1\}}
w_n^\th\mu_\th(1|x_{1:n})\toinfty{n} 0$ from (\ref{MLsumconv}).
Hence $
\xi(1|x_{1:n})
= w_n^{\th_0}\cdot\th_0 + w_n^{\th_1}\cdot\th_1 + \eps_n
\neq \th_0 = \mu_{\th_0}(1|x_{1:n})
$
for sufficiently large $n$, since $\eps_n\to 0$, $w_n^{\th_1}\geq c'_1>0$
and $\th_0\neq\th_1$.
\qed
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Conclusions}\label{secConc}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
For a hierarchy of four computability definitions, we completed
the classification of the existence of computable (semi)measures
dominating all computable (semi)measures. Dominance is an important
property of a prior, since it implies rapid convergence of the
corresponding posterior with probability one.
%
A strengthening would be convergence for all Martin-L{\"o}f (M.L.)
random sequences. This seems natural, since M.L.\ randomness can
be defined in terms of Solomonoff's prior $M$, so there is a close
connection.
%
Contrary to what was believed before, the question of posterior
convergence $M/\mu\to 1$ for all M.L.\ random sequences is still
open. Some exciting progress has been made recently in
\cite{Hutter:04mlconvx}, partially answering this question.
%
We introduced a new flexible notion of
$\mu/\xi$-randomness which contains Martin-L{\"of} randomness as a
special case. Though this notion may have a wider range of
application, the main purpose for its introduction was to show
that standard proof attempts of
$M/\mu\stackrel{M.L.}\longrightarrow 1$ based on dominance only
must fail. This follows from the derived result that the validity
of $\xi/\mu\to 1$ for $\mu/\xi$-random sequences depends on the
Bayes mixture $\xi$.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Bibliography %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{small}
\begin{thebibliography}{Hut03b}
\bibitem[Cha75]{Chaitin:75}
G.~J. Chaitin.
\newblock A theory of program size formally identical to information theory.
\newblock {\em Journal of the ACM}, 22(3):329--340, 1975.
\bibitem[Doo53]{Doob:53}
J.~L. Doob.
\newblock {\em Stochastic Processes}.
\newblock Wiley, New York, 1953.
\bibitem[G{\'a}c74]{Gacs:74}
P.~G{\'a}cs.
\newblock On the symmetry of algorithmic information.
\newblock {\em Soviet Mathematics Doklady}, 15:1477--1480, 1974.
\bibitem[HM04]{Hutter:04mlconvx}
M.~Hutter and An.~A. Muchnik.
\newblock Universal convergence of semimeasures on individual random sequences.
\newblock In {\em Proc. 15th International Conf. on Algorithmic Learning Theory
({ALT-2004})}, volume 3244 of {\em LNAI}, pages 234--248, Padova, 2004.
Springer, Berlin.
\bibitem[Hut01]{Hutter:01alpha}
M.~Hutter.
\newblock Convergence and error bounds for universal prediction of nonbinary
sequences.
\newblock In {\em Proc. 12th European Conf. on Machine Learning (ECML-2001)},
volume 2167 of {\em LNAI}, pages 239--250, Freiburg, 2001. Springer, Berlin.
\bibitem[Hut03a]{Hutter:03unipriors}
M.~Hutter.
\newblock On the existence and convergence of computable universal priors.
\newblock In {\em Proc. 14th International Conf. on Algorithmic Learning Theory
({ALT-2003})}, volume 2842 of {\em LNAI}, pages 298--312, Sapporo, 2003.
Springer, Berlin.
\bibitem[Hut03b]{Hutter:03unimdl}
M.~Hutter.
\newblock Sequence prediction based on monotone complexity.
\newblock In {\em Proc. 16th Annual Conf. on Learning Theory ({COLT-2003})},
volume 2777 of {\em LNAI}, pages 506--521, Washington, DC, 2003. Springer,
Berlin.
\bibitem[Hut04]{Hutter:04uaibook}
M.~Hutter.
\newblock {\em Universal Artificial Intelligence: Sequential Decisions based on
Algorithmic Probability}.
\newblock Springer, Berlin, 2004.
\newblock 300 pages, http://www.idsia.ch/$_{^{\sim}}$marcus/ai/uaibook.htm.
\bibitem[Kol65]{Kolmogorov:65}
A.~N. Kolmogorov.
\newblock Three approaches to the quantitative definition of information.
\newblock {\em Problems of Information and Transmission}, 1(1):1--7, 1965.
\bibitem[Lam87]{Lambalgen:87}
{M. van} Lambalgen.
\newblock {\em Random Sequences}.
\newblock PhD thesis, University of Amsterdam, 1987.
\bibitem[Lev73]{Levin:73random}
L.~A. Levin.
\newblock On the notion of a random sequence.
\newblock {\em Soviet Mathematics Doklady}, 14(5):1413--1416, 1973.
\bibitem[Lev74]{Levin:74}
L.~A. Levin.
\newblock Laws of information conservation (non-growth) and aspects of the
foundation of probability theory.
\newblock {\em Problems of Information Transmission}, 10(3):206--210, 1974.
\bibitem[LV97]{Li:97}
M.~Li and P.~M.~B. Vit\'anyi.
\newblock {\em An Introduction to {K}olmogorov Complexity and its
Applications}.
\newblock Springer, Berlin, 2nd edition, 1997.
\bibitem[Sch71]{Schnorr:71}
C.~P. Schnorr.
\newblock {\em Zuf{\"a}lligkeit und Wahrscheinlichkeit}.
\newblock Springer, Berlin, 1971.
\bibitem[Sch00]{Schmidhuber:00toe}
J.~Schmidhuber.
\newblock Algorithmic theories of everything.
\newblock Report IDSIA-20-00, quant-ph/0011122, {IDSIA}, Manno (Lugano),
Switzerland, 2000.
\bibitem[Sch02]{Schmidhuber:02gtm}
J.~Schmidhuber.
\newblock Hierarchies of generalized {Kolmogorov} complexities and
nonenumerable universal measures computable in the limit.
\newblock {\em International Journal of Foundations of Computer Science},
13(4):587--612, 2002.
\bibitem[Sim77]{Simpson:77}
S.~G. Simpson.
\newblock Degrees of unsolvability: A survey of results.
\newblock In J.~Barwise, editor, {\em Handbook of Mathematical Logic}, pages
631--652. North-Holland, Amsterdam, 1977.
\bibitem[Sol64]{Solomonoff:64}
R.~J. Solomonoff.
\newblock A formal theory of inductive inference: Parts 1 and 2.
\newblock {\em Information and Control}, 7:1--22 and 224--254, 1964.
\bibitem[Sol78]{Solomonoff:78}
R.~J. Solomonoff.
\newblock Complexity-based induction systems: Comparisons and convergence
theorems.
\newblock {\em IEEE Transaction on Information Theory}, IT-24:422--432, 1978.
\bibitem[VL00]{Vitanyi:00}
P.~M.~B. Vit\'anyi and M.~Li.
\newblock Minimum description length induction, {B}ayesianism, and {K}olmogorov
complexity.
\newblock {\em IEEE Transactions on Information Theory}, 46(2):446--464, 2000.
\bibitem[Vov87]{Vovk:87}
V.~G. Vovk.
\newblock On a randomness criterion.
\newblock {\em Soviet Mathematics Doklady}, 35(3):656--660, 1987.
\bibitem[Wan96]{Wang:96}
Y.~Wang.
\newblock {\em Randomness and Complexity}.
\newblock PhD thesis, Universit{\"a}t Heidelberg, 1996.
\bibitem[ZL70]{Zvonkin:70}
A.~K. Zvonkin and L.~A. Levin.
\newblock The complexity of finite objects and the development of the concepts
of information and randomness by means of the theory of algorithms.
\newblock {\em Russian Mathematical Surveys}, 25(6):83--124, 1970.
\end{thebibliography}
\end{small}
\end{document}
%---------------------End-of-UniPriorx.tex--------------------%