Human Knowledge Compression Contest
Frequently Asked Questions & Answers
Consider the following excerpt, in which 12 words have been replaced by the placeholders [X1] to [X12]:

In [X1], HTML and XML documents, the logical constructs known as character data and [X2] [X3] consist of sequences of characters, in which each [X4] can manifest directly (representing itself), or can be [X5] by a series of characters called a [X6] reference, of which there are [X7] types: a numeric character reference and a character entity [X8]. This article lists the [X9] entity [X10] that are valid in [X11] and [X12] documents.

If you do not understand any English, there is little hope that you can fill in the missing words. You may assign a high probability that the missing words equal some of the other words, and a small probability to all other strings of letters. Working at the level of string pattern matching, you may conjecture [X11]=HTML and [X12]=XML, since they precede the word "documents", and similarly [X9]=character, preceding "entity". If you do understand English, you would probably also easily guess that [X4]=character, [X5]=represented, and [X7]=two, where [X4] requires understanding that characters are "in" sequences, [X5] requires an understanding of conjugation, and [X7] requires understanding the concept of counting. Guessing [X8]=reference probably requires some understanding of the contents of the article, which then easily implies [X10]=references. As a markup language expert, you "know" that [X2]=attribute, [X3]=values, and [X6]=character, and you may conjecture [X1]=SGML.

So clearly, the more you understand, the more you can delete from the text without loss (this idea is behind Cloze-style reading comprehension and led to the Billion Word Imputation challenge). If a program reaches the reconstruction capabilities of a human, it should be regarded as having the same understanding. (Sure, many will continue to define intelligence as what a machine can't do, but that will make them ultimately non-intelligent.) (Y cn ccmplsh lt wtht ndrstndng nd ntllgnc, bt mr wth).
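The purely pattern-matching level of guessing described above can be illustrated with a minimal Python sketch. It is not part of the contest or of any real compressor. It assumes the twelve blanks in the excerpt are numbered in order of appearance as [X1] to [X12] (labels chosen here for illustration), and the function names and the 5% mass reserved for unseen strings are likewise arbitrary assumptions. The sketch knows no English: it spreads probability over the visible words by frequency, and it guesses a blank as the word that elsewhere precedes the same next word.

```python
import re
from collections import Counter, defaultdict

# The excerpt, with [X1]..[X12] as hypothetical labels for the 12 blanks.
EXCERPT = (
    "In [X1], HTML and XML documents, the logical constructs known as "
    "character data and [X2] [X3] consist of sequences of characters, in "
    "which each [X4] can manifest directly (representing itself), or can be "
    "[X5] by a series of characters called a [X6] reference, of which there "
    "are [X7] types: a numeric character reference and a character entity "
    "[X8]. This article lists the [X9] entity [X10] that are valid in [X11] "
    "and [X12] documents."
)

# A token is either a blank label or a plain word; punctuation is dropped.
TOKEN = re.compile(r"\[X\d+\]|[A-Za-z]+")


def is_blank(token):
    return token.startswith("[X")


def word_probabilities(tokens, unseen_mass=0.05):
    """Spread 1 - unseen_mass over the visible words by frequency.

    unseen_mass (an arbitrary choice) is reserved for all other strings.
    """
    counts = Counter(t for t in tokens if not is_blank(t))
    total = sum(counts.values())
    return {w: (1 - unseen_mass) * c / total for w, c in counts.items()}


def guess_from_next_word(tokens):
    """Guess a blank as the word that elsewhere precedes the same next word."""
    preceding = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        if not is_blank(prev) and not is_blank(nxt):
            preceding[nxt][prev] += 1
    guesses = {}
    for prev, nxt in zip(tokens, tokens[1:]):
        if is_blank(prev) and nxt in preceding:
            guesses[prev] = preceding[nxt].most_common(1)[0][0]
    return guesses


tokens = TOKEN.findall(EXCERPT)
probs = word_probabilities(tokens)
guesses = guess_from_next_word(tokens)

if __name__ == "__main__":
    for blank, word in sorted(guesses.items()):
        print(blank, "->", word)
```

On this excerpt the sketch recovers [X9]=character and [X12]=XML, the same conjectures the text attributes to string pattern matching. Words such as "represented" or "two" never occur among the visible words, so a matcher of this kind cannot propose them at all; that gap is where understanding of the language starts to pay off.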
© 2000 by ... Marcus Hutter