A Few Million Virtual Monkeys Randomly Recreate Shakespeare
First time accepted submitter eljefe6a writes "On September 23 at 2:30 PST the A Million Amazonian Monkeys project successfully recreated A Lover's Complaint. This is the first time a work of Shakespeare has actually been randomly reproduced. It is one small step for a monkey, one giant leap for virtual primates everywhere. From the article: 'For this project, I used Hadoop, Amazon EC2, and Ubuntu Linux. Since I don’t have real monkeys, I have to create fake Amazonian Map Monkeys. The Map Monkeys create random data in ASCII between a and z. It uses Sean Luke’s Mersenne Twister to make sure I have fast, random, well behaved monkeys. Once the monkey’s output is mapped, it is passed to the reducer which runs the characters through a Bloom filter membership test. If the monkey output passes the membership test, the Shakespearean works are checked using a string comparison. If that passes, a genius monkey has written 9 characters of Shakespeare. The source material is all of Shakespeare’s works as taken from Project Gutenberg.'"
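For anyone trying to picture the pipeline in the summary, here is a minimal single-machine sketch in Python. It is not the author's Hadoop code. The names `BloomFilter`, `clean`, `build_index`, `monkey`, and `is_hit`, the filter size, and the SHA-256-based hashing are all assumptions made for illustration; only the 9-character chunk length, the a-z alphabet, and the "Bloom filter first, string comparison second" order come from the summary. (Python's `random` module also happens to be a Mersenne Twister.)

```python
import hashlib
import random
import string

CHUNK = 9  # chunk length quoted in the summary


def clean(text):
    """Keep only the letters a-z, lowercased (assumed preprocessing)."""
    return "".join(c for c in text.lower() if "a" <= c <= "z")


class BloomFilter:
    """Tiny illustrative Bloom filter; size and hash count are arbitrary."""

    def __init__(self, bits=1 << 16, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray(bits // 8)

    def _positions(self, item):
        # Derive several bit positions from salted SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, item):
        for p in self._positions(item):
            self.array[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.array[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def build_index(text, n=CHUNK):
    """Return (bloom, exact_set) for every n-letter chunk of the source text."""
    letters = clean(text)
    chunks = {letters[i:i + n] for i in range(len(letters) - n + 1)}
    bloom = BloomFilter()
    for chunk in chunks:
        bloom.add(chunk)
    return bloom, chunks


def monkey(rng, n=CHUNK):
    """One virtual monkey: n random letters between a and z."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(n))


def is_hit(candidate, bloom, chunks):
    # Cheap probabilistic test first; exact comparison only if it passes.
    return candidate in bloom and candidate in chunks
```

The Bloom filter can report false positives but never false negatives, which is why the exact comparison still has to run afterwards; the filter just keeps most random gibberish away from the expensive check.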
Frankly, that's cool (Score:2)
Re: (Score:3)
I wish I'd thought of it - and what a neat way to go about it.
So is it safe to say you're virtually impressed with the whole affair?
Re: (Score:2)
Cool? Success is a foregone conclusion and the results were written over a hundred years ago. Cool would be to have this system write something new.
Re: (Score:3)
It would be a foregone conclusion if monkeys were indeed typing randomly on a keyboard. But in practice they tend to favor certain keys, leading at best to a pseudo-random distribution of keystrokes. On top of that, many of the characters needed to produce the works require not just one keystroke, but a shift and a keystroke.
Consequently, simulating this with virtual monkeys is almost sure to come up with a result that differs substantively from using actual monkeys to do the project.
Re: (Score:2)
"No shit?"
Re: (Score:2, Insightful)
What a depressingly dull world you live in. By that thinking, all human endeavor is a waste of time because it is a foregone conclusion that we all die in the end.
Sometimes it is the journey that is more important than the final destination. This was not about making another copy of a work of literature, but the creation of a simulation of virtual monkeys.
Re: (Score:2)
The creation of a simulation of virtual monkeys isn't all that impressive, though. It could basically be a student project.
Your comparison is flawed. Plotting to destroy every human alive at noon GMT June 3rd, 2007 could be considered a waste of time since they all die in the end, because them all dying in the end is already a foregone conclusion. Making a great work of art is not a waste of time even if the appreciators all end up dead and the work forgotten forever, because the point wasn't for the app
Re:Frankly, that's cool (Score:5, Insightful)
I don't even understand it.....
He randomly generates 9 characters until he gets the 9 characters he wants. Then he repeats until he has the Shakespeare book he wanted? That's not how 'random' works. Why 9 characters? Why not 1?
I will have my computer randomly guess letters until an A comes up. Then until a B comes up. And then, at the end I'll have the ABCs! RANDOMLY!
Am I being retarded? Did I miss why this is cool?
Re:Frankly, that's cool (Score:5, Informative)
That didn't happen.
Re: (Score:2)
Exactly. The way I remember the 'challenge', it is that
if enough monkeys are given enough typewriters, one of them will eventually type out the complete works of Shakespeare.
Of course, in this case 'enough' is probably a number that won't fit in our universe (a googolplex at least).
I did think of it. (Score:5, Insightful)
I did think of it. I even registered a domain (see my URL and e-mail address). I planned on making a screensaver that would randomly generate stuff, and on convincing people to run it, à la SETI@home. Then college happened, then graduate school happened, then marriage happened, then baby happened... And then (once again), I read on Slashdot that someone else has done one of my ideas and made the front page.
But then again, literally as I'm reading this, my daughter is singing the Blue's Clues theme song next to me while my wife and I get ready to queue up for our nightly game of League of Legends... Sitting in the downstairs den/office that's full of years of gamer stuff that all represents the happy memories of those several years of college. That guy can have my monkeys. Good for him. I found something better. :)
Re: (Score:2)
Fair enough. But in my case, I found that gaming with my friends WAS what I wanted to do. :)
Re:I did think of it. (Score:5, Insightful)
One thing the Internet has taught me is that (nearly) all my ideas are non-unique. It's the execution that counts.
Re:I did think of it. (Score:5, Funny)
I almost meant to say that exact same thing!
Re: (Score:3)
I was meaning to say that exact same thing - almost.
Re: (Score:2)
ideas are non-unique. It's the execution that counts.
Really? Haven't you been paying attention to what's happening with software and business practice patents?
Re: (Score:2)
Dude, ignore all the no-life haters replying to this post who clearly don't get it.
Congrats on finding the Meaning of Life - personal happiness.
Re: (Score:3)
No! I'm saving them for that sour wine I'll never make!
Re: (Score:2)
Nah, I got into something much more useless than simulating an infinite number of monkeys at typewriters while I was there: Virtual Reality. :(
Oh well, at least it got me a job!
I thought that was virtually impossible (Score:2)
...and wouldn't it be easier to let them evolve and then one of them can BE Shakespeare 2.0?
It is in fact virtually impossible (Score:5, Insightful)
What's happening here (if I understand the writeup) is that the monkeys are typing random letter combinations until they hit a small phrase that happens to be in Shakespeare. Then that phrase is marked as done.
Let n be the size in characters of the target phrase. If n=1, then the complete works of Shakespeare are obtained as soon as each letter of the alphabet has been typed at least once. You could do this in a few seconds on your computer keyboard. If n=2, then the complete works are obtained as soon as all the possible pairs of letters have been typed. The experiment in TFA has n=9, I think.
As n grows larger, the time until completion grows exponentially. Once his experiment is done, the case n=10 should take roughly 26 times as long (ignoring punctuation, capitals, and diacritical marks). Alternatively, it would require a cloud roughly 26 times bigger to do it in the same amount of time.
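To put rough numbers on that growth, here is a small Python sketch. The function names `chunk_space` and `draws_until_covered` and the sample phrase are made up for illustration, and the simulation assumes uniformly random lowercase letters, as in the summary.

```python
import random
import string


def chunk_space(n):
    """Number of distinct n-letter strings over a-z."""
    return 26 ** n


def draws_until_covered(letters, n, rng):
    """Count random n-letter guesses until every n-letter chunk of
    `letters` has been guessed at least once."""
    needed = {letters[i:i + n] for i in range(len(letters) - n + 1)}
    draws = 0
    while needed:
        guess = "".join(rng.choice(string.ascii_lowercase) for _ in range(n))
        needed.discard(guess)
        draws += 1
    return draws


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (1, 2, 3):
        print(n, chunk_space(n), draws_until_covered("tobeornottobe", n, rng))
    # 26**9 is 5,429,503,678,976 possible 9-letter chunks;
    # each extra letter multiplies the search space by 26.
    print(chunk_space(9), chunk_space(10) // chunk_space(9))
```

The simulated counts for n = 1, 2, 3 depend on the seed, but they climb by roughly the same factor of 26 per step that the exact `chunk_space` values do.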
Re:It is in fact virtually impossible (Score:4, Insightful)
Exactly. Breaking down the problem of "randomly finding thousands of characters in the right order" to "randomly finding 9 characters in the right order" is bullshit, because this requires information about the order of all the 9-character-blobs you find.
In other news: I compressed a Gigabyte down to 2 bits. You just have to know the order of the bits!
Stupid article. Stupid submitter. Stupid waste of energy. That's the 21st century for you. Idiocracy at its best.
Re:It is in fact virtually impossible (Score:4, Insightful)
You could prove that for a work of Shakespeare of length N, the number of "monkeys" required to solve the problem in the same amount of time grows by a factor of 26^(N-9). Or, as it relates to the proverb, the solution to the equation has the time required to create a work of Shakespeare as infinite and the number of monkeys required to solve it in that time as infinite.
Of course, that solution didn't require programming the monkeys. But it is extrapolatable out to an entire work.
Re: (Score:2)
This experiment, while fun, isn't exactly the infinite monkey experiment.
Of course it is not. It is impossible to simulate an infinite number of monkeys working for an infinite amount of time. Some concession has to be made to the fact that we have a finite amount of computing power.
Re: (Score:2)
This sort of concession misses the point. The "infinite monkey theorem" is about how wildly unlikely things are not the same as impossible things. Therefore you cannot discount the possibility that a thing happened or will happen just because it is very improbable to happen, if it was or is going to be subjected to an arbitrarily high number of "chances" to happen.
This experiment breaks it down to brute-forcing a poor password, billions of times, instead of brute forcing a friggin' insane password, which
He virtually cornered the market (Score:2)
I'm virtually impressed, virtually speechless even! The man is a virtual genius.
Re: (Score:2)
Then they started hurling virtual feces
Re: (Score:2)
Then they started hurling virtual feces
So then it would be accurate to label his experiment a bunch of steaming monkey feces?
Re: (Score:2)
Virtually, yeah.
I seem to recall something about this... (Score:2)
Sequential wording (Score:2)
HRmm...... (Score:5, Insightful)
If I'm understanding this, this isn't as cool as it seems. It seems like his 'monkeys' are just randomly creating words, and he matches those words against any word used in Shakespeare. If he gets a match, he marks that one as done. So, at some point one monkey made the word "be" and all of a sudden green lights all over the place.
I think the original saying was how random and unique it would be for a solid set of strings to randomly create a whole piece of work _in one go_ . Not a word here, a word there, OMG 100% of Shakespeare words have been randomly created.
Re:HRmm...... (Score:5, Insightful)
Exactly. So if it's going to be done in this way, then why not break it down into INDIVIDUAL characters? Have a monkey generate a single letter, and see if that happens to match something in one of Shakespeare's works. I bet that algorithm would be even faster.
Re: (Score:3)
Yep. This whole 'experiment' reminds me of the Monty Python Great Actor skit:
Sir Edwin: Ah, well, I don't want you to get the impression it's just a question of the number of words... um... I mean, getting them in the right order is just as important. Old Peter Hall used to say to me, 'They're all there Eddie, now we've got to get them in the right order.'
Re: (Score:2)
Yeah, you got it right the first time. Despite the popularity of the idea that a million monkeys could randomly create the works of Shakespeare, it would take trillions of years for the monkeys to create the first few paragraphs. This is an obvious time-waster.
Re: (Score:2)
What if you allow misspellings and word substitutions? I've been able to read ebooks just fine with all the steganography they've been putting in (either that, or pretty much every ebook I've ever read has had terrible editing...). Surely that narrows the problem space a little bit.
Re: (Score:2)
Ok, but how about this...
The next letter in the target manuscript is 'F'... count the number of keystrokes until a monkey randomly types F. The next letter is 'l', count the number of keystrokes until an l is hit... and at the end multiply the numbers together...
That sounds fairly reasonable to me.
Re:HRmm...... (Score:5, Insightful)
It's not that a million monkeys could randomly create the works of Shakespeare. It is that an infinite number of monkeys could recreate all of the work in the known world, including Shakespeare. The thing about infinity is that it is really, really big. If the amount of resources thrown at a problem is truly infinite, all possible results just happen, no matter how improbable.
The point of the saying is how mind-meldingly large infinity is, and how bad our minds are at comprehending the ramifications. This is one.
Re: (Score:2)
I'm probably the only person on /. who will point out that none of Shakespeare's works were written in paragraphs.
Re: (Score:3)
You nailed it. The problem becomes more difficult as the number of characters and words increases, for the simple reason that you have to go further without a mistake having been made. If something like a Bogosort [wikipedia.org] takes O(n*n!), I shudder to think how long recreating the works of Shakespeare would take, but that's the very point of the expression: to express the unlikelihood of a random set of occurrences leading to an outcome.
Re: (Score:2)
I always thought the expression was saying the exact opposite. I've always heard it used in a similar manner as "even a broken clock is right twice a day". That not every genius-looking outcome required genius-level input. In other words, rather than being about unlikelihood, it is about inevitableness.
Re: (Score:2)
I've heard it both ways, now that you mention it. "Put some monkeys in front of enough typewriters..." vs. "That's like putting monkeys in front of typewriters and expecting..."
Re: (Score:2)
I was about to say that, but then I realized it's not. The difference being that bogosort deals with a fixed set of members to sort, whereas compiling the works of Shakespeare via random keystrokes does not. Or, put another way, the monkeys could repeat or fail to repeat any character any number of times, plus they need to order those potentially incorrect letters, whereas bogosort will always have the correct letters, and it's just a matter of ordering them.
An intuitive way to see that they're different is
Re: (Score:2)
The Slashdot title should have been, "Man completely misunderstands the Monkey Shakespeare Theorem."
I noticed that the linked website has comments off, so no one can help the author understand what the theorem really means.
Indeed (Score:2)
http://en.wikipedia.org/wiki/Infinite_monkey_theorem#Direct_proof [wikipedia.org]
Probabilities
Even if the observable universe were filled with monkeys the size of atoms typing from now until the heat death of the universe, their total probability to produce a single instance of Hamlet would still be many orders of magnitude less than one in 10^183,800.
Re: (Score:2)
Yeah, but if you have a million programmers typing randomly into their keyboards, eventually one might write a program to simulate the million monkeys experiment correctly.
So this is progress.
Re: (Score:2)
Yeah, it was definitely pretty weak. I was expecting to read about all the other works of literature of equal or lesser length to the Shakespeare one that the monkeys also produced. Including the screenplay for the Simpsons episode about this very subject (except with Dickens).
Recreating subsets of Shakespeare (Score:2)
So the virtual monkeys are recreating a subset of the work of Shakespeare not an entire work. And the Hadoop instance is splicing them together?
Re: (Score:2)
That's what it sounds like. Basically, it's a rigged system to get around the problem that it would take virtually an infinite amount of time to accomplish this if we were looking for a fully-complete work from one random string of characters.
Oblig. Simpsons (Score:5, Funny)
"It was the best of times, it was the BLURST of times! Stupid monkeys!" {strikes them with script...}
Point being? (Score:2)
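#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main(void) {
    srand((unsigned)time(NULL));   /* seed the generator from the clock */
    while (1)                      /* spin forever, printing on each match */
        if (rand() == 1234) puts("OMGOOSES!");
}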
Kinda a waste of CPU cycles...
Re: (Score:2)
Yeah, it isn't completely random, because of all the filtering done to ensure the randomly-generated words are part of the original text. Sure the words are generated randomly, but reducing them and checking for membership, and then checking to see if they're in the source kinda ruins the whole point.
bleh (Score:2)
Does anyone else think this is stupid? (Score:5, Insightful)
and that he missed the point of the expression?
Of course it will work: the Mersenne Twister will eventually cover the entire 9-letter space, and then he can search through it for the parts that match (yes, he is doing it concurrently, but that’s just an inefficient way of doing it). If he had the RAM and time he could eventually recreate every book possible.
The Wikipedia page explains it better: an infinite random string is bound to contain something that is perceived as useful. Of course the literal take [wikipedia.org] on the expression is the funniest.
Re: (Score:3)
Surely this is obvious.
1 million monkeys on typewriters coming up with 9 CHARACTERS of Shakespeare each is just a tad more likely than any monkey (from a team of 1 million) coming up with the ENTIRE WORK of Shakespeare.
I'm not really sure what this guy set out to prove.
Re: (Score:3)
Ya, and they certainly got a lot of help to recreate Shakespeare... like human help.
These monkeys were no ordinary monkeys either. First and foremost, they BEHAVE.
It's like he didn't even understand the expression as GP said, yet went out to demonstrate his misunderstanding literally. ... and that is what makes this story interesting :)
Re: (Score:2)
not only did the monkeys produce nothing but five pages consisting largely of the letter S, they started by attacking the keyboard with a stone, and continued by urinating and defecating on it.
Sounds to me like the monkeys produced five pages of Snakespear.
Re: (Score:2)
Yes. It's stupid and he should be embarrassed.
All he's done is get a bunch of "virtual monkeys" to recreate many 9-character works of Shakespeare.
Putting them in order wasn't done randomly... and it's the *order* of the words (or characters or bits or whatever arbitrary length of data you decide to use) that makes it a Shakespearian work!
I'll bet he chose "9" because it was the biggest he could make it without it "taking too long."
Re: (Score:2)
and that he missed the point of the expression?
Of course it will work: the Mersenne Twister will eventually cover the entire 9-letter space, and then he can search through it for the parts that match (yes, he is doing it concurrently, but that’s just an inefficient way of doing it). If he had the RAM and time he could eventually recreate every book possible.
The Wikipedia page explains it better: an infinite random string is bound to contain something that is perceived as useful. Of course the literal take [wikipedia.org] on the expression is the funniest.
Yes, it's missing the point. It's a neat trick and should pad his resume, but it's missing the point of the infinite number of monkeys. The worst thing about it is that it's potentially harmful. You can hear the class discussion on infinity now: someone who's seen the story on the news or read it on the net pipes up that some guy has proven it using the internet, sidetracking the discussion away from the concept of infinity.
Re: (Score:2)
Come on you know what it means.
My lack of punctuation on the other hand is pretty bad. Sorry to anyone who had to read a sentence again.
Given the process used, the title is misleading (Score:2)
I think that the goal is that one of the many monkeys types an entire work of Shakespeare, not that many monkeys each type a very small segment of Shakespeare mixed in with gibberish, and then the many very small segments of Shakespeare are cut from the surrounding gibberish and combined by a person of intelligence into a work of Shakespeare.
Why is this interesting? (Score:2)
Certainly this story must interest some people. To you, I ask this question: what makes this story interesting? To me it's a waste of energy that doesn't produce anything unexpected or particularly interesting. Compared to this, the Minecraft Enterprise-D [youtube.com] is useful--it's at least interesting.
(Note: I am a mathematician, so maybe I'm missing some of the novelty associated with random number generation and exponential growth.)
Re: (Score:2)
Certainly this story must interest some people. To you, I ask this question: what makes this story interesting? To me it's a waste of energy that doesn't produce anything unexpected or particularly interesting. Compared to this, the Minecraft Enterprise-D [youtube.com] is useful--it's at least interesting.
(Note: I am a mathematician, so maybe I'm missing some of the novelty associated with random number generation and exponential growth.)
I reckon most of us are of this persuasion...colour me unimpressed as well.
Re: (Score:2)
If he had successfully randomly achieved a Shakespeare play, [...] It would be like a flying saucer landing and informing someone that they won the galactic lottery.
It's far, far, far, far, far, far, far, far, far, (...), far more improbable than that. The text of Hamlet (see Project Gutenberg [gutenberg.org]) is around 180 KB long, so around 1.44 million bits. Being generous and lopping off half (since most of the characters aren't present), and then rounding down, let's say it's 500,000 bits. There are 2^500,000 possibilities; this is a number with around 150,000 decimal digits. It's comparable to the odds of winning a 1-in-a-million lottery 25 thousand times in a row.
Winning a gala
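The back-of-the-envelope figures in the comment above are easy to check. A minimal Python sketch follows; the helper names `decimal_digits_of_power_of_two` and `lottery_exponent` are invented here, and the 180 KB and 500,000-bit figures are simply the commenter's own estimates.

```python
import math


def decimal_digits_of_power_of_two(bits):
    """Number of decimal digits in 2**bits, via logarithms
    (avoids building the huge integer itself)."""
    return math.floor(bits * math.log10(2)) + 1


def lottery_exponent(wins, odds_exponent=6):
    """Winning a 1-in-10**odds_exponent lottery `wins` times in a row
    has probability 10**-(returned value)."""
    return wins * odds_exponent


if __name__ == "__main__":
    print(180_000 * 8)                              # bits in 180 KB of text
    print(decimal_digits_of_power_of_two(500_000))  # digits in 2**500000
    print(lottery_exponent(25_000))                 # exponent for 25,000 wins
```

So 2^500,000 has a little over 150,000 decimal digits, and 25,000 consecutive 1-in-a-million wins comes to 1 in 10^150,000, which matches the "comparable" claim.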
Re: (Score:2)
To be clear, I didn't mean to imply that you didn't understand the problem. In fact, clearly you did, since you cut it down to a manageable version. This story was submitted and posted by two different people, so at least 3 people (assuming no overlap) have found it interesting. I was just curious about why it interests those who find it interesting.
Please, enjoy your project. I hope the negativity hasn't disheartened you.
BS (Score:2)
Re: (Score:2)
I have a RNG that uses a large table of random characters, and returns them sequentially. Seed value 0. It can recreate the poem.
To populate the table I used a bunch of old text from some dead guy.
If an infinite number of slashdot readers... (Score:2)
All donate an infinite amount of cash, you can build an infinite computing platform to run the infinite monkey experiment!
Don't be so tough on the guy! (Score:2)
He kept a million virtual monkeys gainfully employed for X amount of time. The job field is hard out there for virtual monkeys, too, you know. Bunch of anti-virtual monkey people, you are! Hmmmph!
Not sure why this is impressive? (Score:2)
I did something relatively like this (albeit smaller scale) when I was an 8th grader in the early 80's, learning BASIC.
One of the first programs I actually wrote myself instead of laboriously copying out of a book or a magazine was "Monkeys" - this was a program that randomly generated letters and added them to the right end of a string, checked the string against a target value, and if it didn't fit, deleted the leftmost character from the string.
If a$="tobeornottobethatisthequestion" then b$="The monkeys hav
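The sliding-window idea described above can be sketched in a few lines of Python. This is a reconstruction from the description, not the original BASIC; the function name `monkey_until` and the keystroke cap are assumptions.

```python
import random
import string
from collections import deque


def monkey_until(target, rng, max_keystrokes=10_000_000):
    """Append random letters on the right, drop the leftmost once the
    window is full, and stop when the window equals `target`.
    Returns the number of keystrokes used, or None if the cap is hit."""
    window = deque(maxlen=len(target))  # deque discards the oldest letter itself
    for keystrokes in range(1, max_keystrokes + 1):
        window.append(rng.choice(string.ascii_lowercase))
        if "".join(window) == target:
            return keystrokes
    return None


if __name__ == "__main__":
    rng = random.Random(42)
    print(monkey_until("be", rng))   # a 2-letter target: hundreds of keystrokes
    print(monkey_until("tobe", rng)) # a 4-letter target: hundreds of thousands
```

Short targets finish quickly, but the 30-letter target quoted above would need on the order of 26^30 (roughly 10^42) keystrokes on average, so the cap is there to stop the loop from running forever.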
Seriously? How dumb is that? (Score:2)
9 character chunks? Should I spend a 1/2 hour writing a monkey class that emits 1 character chunks and simply monitor it for "Much Ado About Nothing"? I bet it could reproduce the work in less than a second with a single monkey.
9 characters of any particular Shakespearean work isn't slashworthy...
Now if he had a monkey emit the entire work that would be interesting in an examination of how long it took and how it occurred.
Commenters missing the greater context (Score:2)
The rebuttal is that people making that first argument don't understand the replication part of natural selection. Evolution doesn't say atoms randomly come together to form each person. First they formed useful proteins, and those genes got replicated. Repeat and add one level of complexity each time, keep repeating 4 billion years... and you finally make complex organisms.
Back to the ana
Re: (Score:2)
I guess he had to use virtual monkeys because all real monkeys have progressed to randomly downloading things from bit-torrent.
There's nothing random about their downloads. Hot steaming MMMMMonkey porn!
Re: (Score:3)
all real monkeys have progressed to randomly downloading things from bit-torrent.
Sadly, in my experience you're more likely to find them here:
http://www.microsoft.com/learning/en/us/certification/mcse.aspx [microsoft.com]
Re: (Score:2)
Couldn't be worse at it than the last couple ;)
Re: (Score:3)
Couldn't be worse at it than the last couple ;)
Clearly you haven't been watching the Republican debates... ;^)
Re:Huh? (Score:5, Insightful)
As a programmer of several "stupid computer tricks" myself (like a filesystem driver for mounting IRC!), I am very appreciative of the fast computers that let us simulate very complex systems very quickly. I understand that it is my responsibility, as a software engineer, to use that speed and memory efficiently to optimize the results of the simulation.
This project has generated better illustrative proof than ever before that randomness will eventually produce everything. This is often a difficult concept for non-mathematical people to accept, so a nice example is always welcome among those who seek to educate. It is also worth noting that this project is running on Hadoop, which is not yet considered stable. While monkeys type Shakespeare, they also find bugs, stress-test releases, and educate at least one programmer. After such a test, Hadoop is much more favorable as a platform for more "real computing work" projects, like processing medical records looking for previously-unknown medication side effects.
While on the subject of "real computing work", please note that all nontrivial computation is done by software, and that all software can run on a Turing machine as designed in 1937. Those hardware engineers are doing real electrical engineering work, making circuits run with less power and smaller size. Those chemical engineers are doing real chemistry work, making semiconductors that can switch faster and at lower voltage. The software engineers are doing real computing work, finding fast algorithms and optimizing processes.
Re: (Score:2, Informative)
This project has generated better illustrative proof than ever before that randomness will eventually produce everything.
This project proves no such thing. It has shown only that randomness can reproduce (duplicate) something that already existed. This project can never reproduce War and Peace in the original Russian, as the Cyrillic alphabet is not included. It demonstrates effectively that some people will see what they want to see.
Re:Huh? (Score:4, Funny)
Re: (Score:3)
This project has generated better illustrative proof than ever before that randomness will eventually produce everything. This is often a difficult concept for non-mathematical people to accept, so a nice example is always welcome among those who seek to educate.
Here's a simpler example:
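#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    while (1)
    {
        int x = rand() % 10;   /* x is always in the range 0..9 */
        if (x == 666) printf("Yes, everything!\n");   /* so this never fires */
    }
}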
Re:Huh? Not random! (Score:2)
Re: (Score:2)
"This project has generated better illustrative proof than ever before that randomness will eventually produce everything."
I don't think you understand what this means. Randomness in an ideal world will produce anything; randomness in the real world will not. There are all sorts of gotchas that make such a statement meaningless outside someone's imagination.
Re: (Score:2)
I think it started about the time they introduced the floppy dik drive.
Re: (Score:2)
...using a random generator like Mersenne twister wouldn't work...
Now, for the mathematicians in the room: What is the probability that the particular string of length 11621 that corresponds to a particular work of Shakespeare is one of the ~2^20000 possible sequences from the Mersenne twister?
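A rough counting argument gives a feel for the answer. The Python sketch below assumes a 26-letter alphabet, a fixed mapping from generator output to letters, and the standard MT19937 state size of 19937 bits; it also treats the target as if it were a uniformly random string, which Shakespeare is not, so read it as a heuristic rather than a proof. The function names are invented for illustration.

```python
import math

MT_STATE_BITS = 19937  # effective state size of the standard Mersenne Twister


def bits_to_specify(length, alphabet=26):
    """Bits needed to pick one particular string of `length` letters."""
    return length * math.log2(alphabet)


def log2_fraction_reachable(length, state_bits=MT_STATE_BITS, alphabet=26):
    """Upper bound (as a power of two) on the fraction of all possible
    strings that any generator state could start emitting."""
    return min(0.0, state_bits - bits_to_specify(length, alphabet))


if __name__ == "__main__":
    print(bits_to_specify(11621))          # about 54,624 bits
    print(log2_fraction_reachable(11621))  # about -34,687
```

Each stretch of output is fixed by the generator's state at its start, so there are at most about 2^19937 distinct stretches, against roughly 2^54624 possible strings of that length. Under these assumptions the chance that one specific 11621-letter text is among them is no better than about 2^-34687.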
Re: (Score:2)
Ah HA, Shakespeare used spaces, too.
Re: (Score:2)
I double-dog dare this guy to do this with the latest Harry Potter and try to sell the result. It would make for an interesting court case ("Your Honor, my client simply carved his own novel out of a mound of gibberish").
Re: (Score:2)
In theory, some of the instances of the work would be prefixed by "This is a work of Shakespeare:".
Of course, others would also be prefixed with "This is a work of $author:" where $author would be the name of every writer that ever existed, rendering it pointless.
Re: (Score:2)
Maybe his virtual monkeys are in virtual cages.
Re: (Score:3)
My thoughts exactly.
If he had the virtual monkeys type random sequences of 1 character each, he'd have found one of Shakespeare's much sooner.
The typical interpretation would require a single monkey to have typed the entire document.
Nice try, but no banana.