Wednesday, August 13, 2008

A Complex Sleight of Hand

T-Stone has written in defense of Dawkins's idea that theists proposing Intelligent Design would need a God who is more complex than the universe. Important to this discussion is the following point T-Stone raises:

A 1,000 x 1,000 pixel grid of random pixels, on the other hand, isn't as pretty to look at as a rendering of the Mandelbrot set, but it is much more complex -- maximally complex, as it turns out (which is part of why it's not as appealing aesthetically as a fractal image!). It's counterintuitive to people who don't work with information theory and algorithmic complexity, but it's a fact of the domain: randomness is the theoretical maximum for measured complexity. You can't get any more complex than purely random. In a random grid of pixels, we cannot guess anything about any pixels at all. In a rendering of Sierpinski triangles, or the Mandelbrot or Julia set, as soon as we see one level of rendering, prior to any recursion, we know everything about the rest of the image, and can reproduce the fractal to any depth of detail without the original program.
Unfortunately for T-Stone, if he paid attention to what he has written here, he'd see that he has soundly refuted Dawkins. After all, if maximal randomness is equivalent to maximal complexity, then it is easy for me to write a program that will generate completely random output. In other words, it is easy for me—a person who is not maximally complex—to produce a program with output that is maximally complex. Thus, if we want to play T-Stone's game and use complexity in this sense, then Dawkins's argument must be surrendered.
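For concreteness, here is a minimal sketch of such a program (assuming a 1,000 x 1,000 grid of one-bit pixels; the grid size and file name are illustrative choices, and the bits are drawn from the operating system's entropy source rather than from any formula):

    import os

    WIDTH, HEIGHT = 1000, 1000
    n_bits = WIDTH * HEIGHT                 # 1,000,000 pixels, one bit each
    raw = os.urandom(n_bits // 8)           # 125,000 random bytes from the OS entropy pool

    # Expand the bytes into a string of '0'/'1' pixels and lay them out row by row.
    bits = "".join(format(byte, "08b") for byte in raw)
    rows = [bits[i * WIDTH:(i + 1) * WIDTH] for i in range(HEIGHT)]

    with open("random_grid.txt", "w") as f:
        f.write("\n".join(rows))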

If I can make a program that is more complex than I am, then God can create a universe that is more complex than He is.

FWIW, I disagree with T-Stone’s version of information and complexity. And despite what his post would lead you to believe, the idea that “maximal randomness = maximal complexity” is not true for all information theories. And in fact, if I were to use T-Stone’s definition of complexity then I would ask him to explain not why there is so much complexity in the universe, but rather why there is so little complexity. If complexity = randomness, then it doesn’t take a rocket scientist to realize that there’s a lot of the universe that is not random, and therefore there is a lot of this universe that is not complex. Under his information theory, randomness is the default. We do not need to explain random data. We do need to explain structured and ordered data. Therefore, we do not need to explain complexity; we need to explain non-complexity.

T-Stone is just giving a sleight of hand here. It would be like a mathematician saying "a > b" and having T-Stone say, "The greater than sign is inverted with the less than sign, therefore 'a > b' means 'a is less than b'."

But as soon as he engages in his sleight of hand, we respond: "If the greater than sign is inverted with the less than sign, then 'a > b' is no longer true, rather 'a < b' is."

Inverting the operator without inverting the operands does not refute the original expression.

14 comments:

  1. I see T-Stone has responded, but given that his opening statement ("He thinks there's something fishy with the idea of random strings being more complex than repetitive or structured strings") is obviously NOT what I was referring to, and given that the other posters out there seem more interested in psychoanalysis and in pretending we're mentally unstable, there seems little point in responding. I'm not going to waste my time chasing after cage-stage atheists.

  2. I will add one thing for people who are interested. T-Stone's example of a 1,000 x 1,000 random pixel grid is only "maximally complex" if there are no repetitions at all. That is, there must be 1,000,000 pixels each with their own unique color. If there are any repetitions, compression can occur.

    For example, suppose that the random pixels are simply 1 or 0, that is the pixel is either on or off (i.e. "white" or "black"). Now suppose that one section of the string shows:

    0100010011101011

    We can easily compress this by saying "00" = A and "01" = B and "10" = C and "11" = D. Thus, the above string of numbers can be compressed to:

    BABADCCD

    This is a simplistic examination. In fact, given that we have 26 letters to choose from, there is no reason to stick with two-bit sections. We could just as easily encode four-bit sections, which requires only 16 letters.
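    For concreteness, here is a minimal sketch of the two-bit substitution just described (the helper names are mine; whether this counts as compression is taken up below):

        PAIRS = {"00": "A", "01": "B", "10": "C", "11": "D"}
        INVERSE = {letter: pair for pair, letter in PAIRS.items()}

        def encode_pairs(bit_string):
            # Assumes an even-length string of '0' and '1' characters.
            return "".join(PAIRS[bit_string[i:i + 2]] for i in range(0, len(bit_string), 2))

        def decode_pairs(letters):
            return "".join(INVERSE[letter] for letter in letters)

        assert encode_pairs("0100010011101011") == "BABADCCD"
        assert decode_pairs("BABADCCD") == "0100010011101011"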

    Of course, this level of compression immediately raises the question: is it really less complex to have various symbols represent sequences of other symbols? That is, while the string "BABADCCD" is shorter than the string "0100010011101011", the program needed to convert "BABADCCD" back is more complex than simply reading "0100010011101011".

    This also brings to mind the question: if complexity in information is reduced due to compression where certain symbols stand for a certain combination of other symbols, then we can theoretically define a symbol as a long combination of other symbols. In other words, we can say:

    "*" = "0100010011101011"

    Now we can transmit all of the above with a simple character: "*"

    As we continue in this vein, the symbol code for all the possible configurations of a 1,000 x 1,000 grid will obviously be immense; but the output will be greatly condensed. No matter which configuration comes about, it is represented by one symbol.

    But this raises further issues beyond simply the complexity of the information itself. The compression program (which, in order to be useful, must be able to decompress data as well) can be infinitely more complex than the output; and vice versa. But if we're merely looking at the information present and not considering the compression algorithms at all, then "*" is obviously less complex than "BABADCCD", which is itself less complex than "0100010011101011".

    Some people want to stop thinking at that level and not consider any further. Others don't. You can decide whether it's important or not.

  3. Your grasp of information theory is lacking, and you really ought to take a course in it before you post more about it. There are not competing "information theories"; the field is a well-established branch of mathematics with established and agreed-upon rules.

    Most of your statements are flat-out wrong. For example:

    >>>If there are any repetitions, compression can occur.

    This is definitely not true. There will always be repetitions in any string of sufficient length; nonetheless, you cannot create a compressor that will successfully compress random numbers. This is because you compress by replacing repeated sequences with shorter sequences and then adding those shorter sequences to a symbol table. (The symbol table's size must be counted as part of the compressed string's size; otherwise, all you've done is destroy the original string, because you can't uncompress it.)

    For a truly random string, the symbol table grows as much as the main string shortens, and your total length (in bits) has not decreased.

    You can only successfully compress when some sequences occur more frequently than others, so that you encode the frequent ones with short strings and the infrequent ones with long strings, and your net size of message + symbol table decreases. It doesn't work on random strings.
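    A quick way to see this empirically is to run any general-purpose compressor over random versus structured data. Here is a sketch using Python's standard-library zlib (the exact byte counts vary from run to run, but the pattern does not):

        import os
        import zlib

        random_data = os.urandom(100_000)       # 100,000 random bytes
        repetitive_data = b"01" * 50_000        # 100,000 highly regular bytes

        print(len(zlib.compress(random_data, 9)))      # typically slightly *larger* than 100,000
        print(len(zlib.compress(repetitive_data, 9)))  # a few hundred bytes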

    >>>But if we're merely looking at the information present and not considering the compression algorithms at all,

    But you can't do that. The amount of information (= complexity) in a string is defined as the length of the smallest algorithm that can produce that string. That may not jibe with your intuitive feeling about what "complexity" means, but that is nonetheless the mathematical definition as far as information science is concerned.

    Your statement is analogous to saying "let's look at how hot something is without considering the temperature at all".

    It's not your opinion or Peter Pike's opinion of what that word means ... the word has a strict and rigorous definition for mathematicians and information scientists.

    >>>then "*" is obviously less complex than "BABADCCD" which is itself less complex than "0100010011101011".

    They are absolutely not less complex. By your definitions, those three strings have identical complexity: 16 bits. They are merely specified in codes that carry, respectively, 16 bits per symbol, 2 bits per symbol, and 1 bit per symbol.

    You are confusing compression with translation. You don't compress something by putting it in a different language - going from binary to your "ABCD" language is not compression.

    Respelling it, still in binary, such that you can recreate the original string - and making your algorithm so that it will work on any string you give to it - that's compression.

    Please, please go take a course in information theory before continuing this debate. You're trying to speak a language full of terms you don't understand.

  4. Idahoev said:
    ---
    There will always be repetitions in any string of sufficient length
    ---

    Exactly my point.

    You further said:
    ---
    ...nonetheless you cannot create a compressor that will successfully compress random numbers. This is because you compress by replacing repeated sequences with shorter sequences, and then adding those shorter sequences to a symbol table. (The symbol table's size must be considered part of the compressed string's size, otherwise, all you've done is destroy the original string because you can't uncompress.)
    ---

    And that's where you're wrong.

    Consider the example once again. If you only have two digits (a 1 and a 0) for a string 1,000,000 bits long, you'll find a lot of repetitions, even if the 1s and 0s are generated randomly. And those repetitions are frequent enough that you can include a symbol table with your data.

    In a "pure" random sample, you'd have approximately the same number of "00", "01", "10", and "11" sequences. Thus, you could replace them all with the four symbols "A", "B", "C", and "D". Your string (which was 1,000,000 bits long) is now 500,000 bits long. Are you seriously claiming that the symbol table will be 500,000 bits in size?

    You said:
    ---
    But you can't do that. The amount of information (=complexity) in a string is defined as the smallest algorithm which can produce that string.
    ---

    Which is exactly my point. Dawkins (and by extension, T-Stone) only has the universe, which would be the "output" of the whole algorithm. They have no idea how complex or how simple something needs to be to produce that output. They don't even think about that. Information theory can say nothing about God because we don't have sufficient knowledge available to us. What it can say is whether natural processes can account for the information we see present.

    Thanks for venting your views. Now perhaps you could actually read what I wrote before responding.

  5. Peter Pike said,

    "FWIW, I disagree with T-Stone’s version of information and complexity. And despite what his post would lead you to believe, the idea that “maximal randomness = maximal complexity” is not true for all information theories."

    Oh really? Then what exactly did you mean when you wrote this? Please explain:

    "FWIW, I disagree with T-Stone’s version of information and complexity. And despite what his post would lead you to believe, the idea that “maximal randomness = maximal complexity” is not true for all information theories."

  6. Nihilist,

    First, I assumed what T-Stone meant by "maximal randomness" was the impossibility of the string to be compressed. Given that assumption, then my statements about randomness come into play.

    Again, consider the example of the 1,000,000 bits. Random data will tend to have an equal number of "00", "01", "10", and "11" sequences. When you have 1,000,000 bits, that means you'll have roughly 250,000 of each. (As anyone who's rolled dice knows, these exact percentages rarely happen in the real world.)

    You can compress those to halve the length of the string. Even including the symbol table will not increase the string length back to its original size.

    Therefore, it follows that a "maximally random" string can only be a set length, and if it cannot be compressed it must have specific rules in place as to what is in the string. Thus, the "random" strings are not so random after all if they are to be "maximally complex."

    Now there are random aspects to it. I'm not denying that. Rather, as the string gets longer, in order to maintain its maximal complexity it must lose its randomness.

    Let's use lower case letters for an example. Suppose our string starts with:

    "jq"

    The next letter can be completely random. It can even be another "j." But if it is another "j" then a "q" CANNOT follow, else it can be compressed (say with a "J").

    Now short strings can indeed run into problems with the symbol table, but even if we say the individual portions of the symbol table are twice the size of each bit of the string, then because there are only 26 lower-case letters in the alphabet, random letters will produce any given letter pair with probability (1/26) x (1/26), or 1/676. Thus, even a text with only, say, 2,000 randomly generated characters will provide a sufficient number of repeats for you to be able to compress the string. And that's even if you're using an arbitrary table and not tailor-making a compression program (like you would for language). You know that the pairing of "jq" will occur roughly once every 676 characters, so that substitution will reduce the original length of the (sufficiently long) string even if that's the only substitution you make.
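    Here is a quick sketch of that 1/676 arithmetic (the 2,000-character length is the one used above; the observed count will vary from run to run):

        import random
        import string

        text = "".join(random.choice(string.ascii_lowercase) for _ in range(2000))
        expected = (len(text) - 1) / (26 * 26)      # about 2.96 occurrences of any fixed pair
        observed = text.count("jq")
        print(round(expected, 2), observed)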

    You can increase the complexity of the symbol table to be any finite value greater than the length of the string, and it will still be possible to compress that string if the string is longer. Only if you claim that the complexity of the symbol table is infinitely large will you be unable to compress the string in this manner.

    Therefore, a "maximally random" that is "maximally complex" can only have a finite number of repetitions (depending on the number of symbols the string itself has and the symbol table that would accompany it). Even adding in the symbol table's information, there is still a finite length that a "maximally random" string can be before it must obey rules (i.e., no longer be random) and still maintain it's "maximal complexity."

    Finally, consider the odds that a string of random characters would reach its maximal complexity by purely random means. The vast majority of random strings (unless you keep them really short) will have those repetitions enabling compression. It is far more likely that a random process will generate those strings than that one will generate a "maximally complex" string.

    It is far easier for an intelligent being to create a maximally complex string than for a maximally complex string to occur randomly.

    But again, I predicated this on the belief that T-Stone accepted "maximal complexity" as the inability to compress the string.

    And FWIW, I didn't invent this theory, but I don't remember where I read it either. If you must have a name for it, you can try to Google it. I've written it out for you so I don't feel the need to spend any more time on it.

  7. BTW, just to clear up one other thing that may cause confusion, since Idahoev said:
    ---
    You can only successfully compress when some sequences occur more frequently than others, so that you encode the frequent ones with short strings and the infrequent ones with long strings, and your net size of message + symbol table decreases. It doesn't work on random strings.
    ---

    This is not accurate at all. And because I like T-Stone so much, I'll use the Wiki article to demonstrate it.

    Wiki says:
    ---
    ...[T]he following string:

    25.888888888

    ...can be compressed as:

    25.[9]8

    Interpreted as, "twenty five point 9 eights", the original string is perfectly recreated, just written in a smaller form.
    ---

    In the same way, random data (especially binary data that only has 2 digits to use) will form "runs" that can be compressed in this manner. Suppose you get the sequence: 1000000000.

    That can be compressed (as per T-Stone's beloved Wiki!) as: 1[9]0.

    So even ignoring everything else with the symbol table, Wiki agrees with me that random data can be compressed so long as there are "runs" in it. And with "pure" random data...there will be runs.
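    For concreteness, here is a minimal run-length sketch of the idea in the Wiki example (my own encoding of runs as value/length pairs, not the article's exact notation):

        from itertools import groupby

        def rle_encode(s):
            # Each maximal run of a repeated character becomes (character, run length).
            return [(ch, len(list(group))) for ch, group in groupby(s)]

        def rle_decode(runs):
            return "".join(ch * count for ch, count in runs)

        runs = rle_encode("1000000000")
        print(runs)                         # [('1', 1), ('0', 9)] -- a 1, then nine 0s
        assert rle_decode(runs) == "1000000000"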

  8. Oh yes, and one last thing: the claim that the symbol table would be just as long as the string. This is also false because there is no reason to send "00 = A" for every instance of "00" when a definition for the decompression program is sufficient. That is, you define the compression once and then compress for each instance. The decompression program finds the definition and decompresses it. There's no need to send the symbol table for each instance of what you're replacing.

  9. As I was dropping off to sleep last night, I think I discovered where the misunderstanding of my position arises for folks like Idahoev. Idahoev would be correct if we were forced to use the same symbol "alphabet" to compress data as we used to create it. While this restriction is necessary to get computers to communicate with each other, it's an artificial restriction when it comes to information as a whole.

    As a quick example, suppose that Adam writes a program and compiles it so it's in binary form. He wants to send it to Bill, but Bill lacks a compiler, an internet connection, and a portable storage device. He does, however, have the ability to type 1s and 0s into a program on his computer and have it save it as a binary file.

    Adam wants to send the program to Bill so he can have it too, but when he prints out the binary file (as 1s and 0s) it is much too long to mail inexpensively. Therefore, he compresses it using the compression technique I already showed (i.e. "00" = "A", etc.), and drops that compressed form in the mail. Bill gets it, converts it back to binary, inputs it into his computer and has the program.

    Now this technique is impossible for computers (although it may be possible for a quantum computer) since they can only function in binary. However, binary is very inefficient for humans, which is why we don't speak binary. The English alphabet serves as a meta-alphabet for the binary alphabet, and you can compress the binary alphabet in the English meta-alphabet (and then transmit it in the meta-alphabet before converting back to the binary alphabet).

    If you keep that in mind (the difference between the alphabet and the meta-alphabet) it should help you understand my original point. Even if random strings cannot be compressed using the alphabet, some of them can still be compressed using a meta-alphabet. However, my argument is that there are still some strings that remain that are incompressible in both the alphabet and the meta-alphabet, and those would be the correct definition of "maximally complex." And because those strings must obey certain rules, the "maximally random" string is not the "maximally complex" string.

    Hope that helps to explain it.

  10. Peter, I think some confusion may lie with how you are using the symbols "A", "B", etc. to take the place of "00" and "01", etc. The symbols we see are actually 2-D arrays of pixels and so use a very large number of bits (pixels either on or off). In your last analogy, when the computer's electrical signals are printed out, the number of bits used to transmit the information is actually increased by a huge amount in order to print the 1 and 0 symbols. Because the bit sequences for each symbol are not random, but highly regular, the symbols can indeed be compressed to bit sequences for other symbols.

    I think Idahoev was considering "1" and "0" conceptually to be referring to single bits rather than the symbols we use when writing them down. In the context of us actually seeing single bits with our own eyes, the actual bit stream would be a series of on and off pixels which would look like a mottled gray image on the screen (or paper). You would have to look at it with a magnifying glass to see the bits, which are just dots at this point, clearly.

  11. >> "00" = A and "01" = B and "10" = C and "11" = D.

    And how do you transmit those symbols? Let us suppose that you do it in the easy, obvious way: as ASCII codes. This means that eight bits are being used to send each two-bit sequence, and your "compression" has quadrupled the size of the data.

    As each symbol encodes a data string of two bits, I suppose you could encode each symbol into a two-bit sequence and send that instead, such that for "A" you send "00"; for "B", "01"; and so forth. That would give you the maximal compression, but (assuming random data), you can't compress it any further than that, because there's no redundancy in the data. You need to look at the number of bits, not the number of characters you write it in.
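    Spelled out as bit counts (assuming each letter is sent as a standard 8-bit ASCII byte):

        original = "0100010011101011"       # 16 symbols, 1 bit each
        letters  = "BABADCCD"               #  8 symbols

        bits_original = len(original) * 1   # 16 bits of payload
        bits_as_ascii = len(letters) * 8    # 64 bits if each letter is an ASCII byte
        bits_as_2bit  = len(letters) * 2    # 16 bits if A-D are sent as 00, 01, 10, 11

        print(bits_original, bits_as_ascii, bits_as_2bit)   # 16 64 16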

  12. "In other words, it is easy for me—a person who is not maximally complex—to produce a program with output that is maximally complex."

    You should try that, using a Turing machine. You will find in fact that it is very, very difficult. That you disagree with the mathematical facts doesn't make them any less factual.

  13. "In the same way, random data (especially binary data that only has 2 digits to use) will form "runs" that can be compressed in this manner. Suppose you get the sequence: 1000000000.

    That can be compressed (as per T-Stone's beloved Wiki!) as: 1[9]0."

    The only reason that works is because you are quite inefficiently treating those binary values as 7-bit ASCII values.

    However, '[', ']', and '9' are not binary values. You would have to produce a unique binary pattern to represent this special case (it would take four bits just to represent the number 9) as well as a means for allowing that pattern to occur by itself elsewhere (as would happen in a truly random series of binary values) and not be interpreted as a signal for an expansion of a run. By the time you have done this you are not guaranteed compression on any given sequence of truly random data.

    Further complicating this issue is that since you have to escape your run sequence when it appears outside of a run, you will be adding information to the message every time your escape sequence shows up in a non-run context.

    In the end you may actually be ADDING to the length of the message, not compressing it, as your escape sequence may occur much more frequently than runs which are long enough to benefit from it and each time you escape it to say "no this isn't a run", you are adding more information.
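    A minimal sketch of that expansion effect on random bits (the 4-bit run-length field below is an illustrative choice of mine, not a claim about any particular format):

        import os

        def bits_from_bytes(data):
            return "".join(format(b, "08b") for b in data)

        def rle_cost_in_bits(bits, length_field=4):
            # Each run is charged 1 bit for the value plus a fixed-width length field;
            # runs longer than the field allows are split.
            max_run = 2 ** length_field - 1
            cost, i = 0, 0
            while i < len(bits):
                run = 1
                while i + run < len(bits) and bits[i + run] == bits[i] and run < max_run:
                    run += 1
                cost += 1 + length_field
                i += run
            return cost

        bits = bits_from_bytes(os.urandom(12_500))     # 100,000 random bits
        print(len(bits), rle_cost_in_bits(bits))       # the "compressed" form is typically ~2.5x larger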

  14. Idahoev would be correct if we were forced to use the same symbol "alphabet" to compress data as we used to create it.

    ... you are forced to use the same symbol alphabet. All quantities of data, from an information-theory standpoint, are measured in bits, which means what matters is how long they are when represented in binary. Your other alphabets are just shorthand codes that use several bits to represent other symbols. When you change your symbol code, all you are changing is the number of bits per symbol ... as I said in my previous comment. You aren't actually compressing data.

    Look ... if you think you can write a compressor that will successfully compress arbitrary random strings, I strongly encourage you to do so. You would overturn five decades of computer science and mathematics, and you could make yourself many thousands of dollars in the process, because there are plenty of unclaimed prizes offered to anyone who can prove it can be done.
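    For anyone wondering why this is a mathematical fact rather than an engineering limitation, here is the back-of-the-envelope pigeonhole count (my sketch):

        n = 20
        inputs = 2 ** n                                  # number of distinct n-bit strings
        shorter_outputs = sum(2 ** k for k in range(n))  # = 2**n - 1 strings shorter than n bits
        print(inputs, shorter_outputs, inputs > shorter_outputs)   # 1048576 1048575 True
        # So no lossless scheme can map every n-bit input to a strictly shorter output.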

    Good luck...
