News that a computer passed the Turing test was quickly undermined once people began looking into the claim in detail. CBC's John Bowman collected a sample of the criticisms on Twitter.
News media, including the CBC, carried the story and the skepticism surrounding it. But on Twitter, programmers and tech journalists almost immediately began to question the claim on a number of fronts.
Eugene Goostman isn't a "supercomputer," but a computer program called a "chatbot," meant to emulate a person typing into an instant messaging service, they pointed out.
As for the claim that it "passed" the "Turing Test" "for the very first time," well, they found each part of that claim questionable.
The Turing test is based on a question and answer game, proposed by renowned British mathematician and codebreaker Alan Turing, to distinguish humans from computers.
Turing predicted in a 1950 paper that within 50 years, computers would play the game so well that an "average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning."
[. . .] There's even dispute over whether the test as the researchers set it up was really the "iconic Turing Test." The judges were told that "Eugene" was a 13-year-old boy from Odessa, Ukraine, and that English wasn't his first language.
So, right away, the bar was lowered.
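That 70 per cent figure is worth unpacking: it means a bot "passes" if judges mistake it for a human in more than 30 per cent of five-minute sessions. Here is a minimal sketch of that scoring rule in Python; the verdict data is hypothetical, invented only to show the arithmetic.

```python
# Scoring rule from Turing's 1950 prediction, as the organizers read it:
# a machine passes if the average interrogator makes the right
# identification no more than 70% of the time, i.e. if judges call the
# bot human in more than 30% of sessions. Verdicts below are hypothetical.

def passes_turing_criterion(verdicts, threshold=0.30):
    """verdicts: True where a judge mistook the bot for a human."""
    fooled = sum(verdicts) / len(verdicts)
    return fooled > threshold

# Hypothetical session results: 10 of 30 judges fooled (33%).
verdicts = [True] * 10 + [False] * 20
rate = sum(verdicts) / len(verdicts)
print(f"fooled {rate:.0%} of judges:",
      "pass" if passes_turing_criterion(verdicts) else "fail")
```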
Wired's Adam Mann takes it apart at leisure.
“There’s nothing in this example to be impressed by,” wrote computational cognitive scientist Joshua Tenenbaum of MIT in an email. He added that “it’s not clear that to meet that criterion you have to produce something better than a good chatbot, and have a little luck or other incidental factors on your side.”
Screenshots on the BBC’s article about the win show a transcript that doesn’t read like much more than a random sentence generator. When WIRED chatted with Goostman through his programmers’ Princeton website, the results felt something like an AIM chatbot circa 1999.
WIRED: Where are you from?
Goostman: A big Ukrainian city called Odessa on the shores of the Black Sea
WIRED: Oh, I’m from the Ukraine. Have you ever been there?
Goostman: ukraine? I’ve never there. But I do suspect that these crappy robots from the Great Robots Cabal will try to defeat this nice place too.
The version on the website could, of course, be different from the one used during the competition.
This particular chatbot almost passed a version of the Turing test two years ago, fooling judges approximately 29 percent of the time.
Fooling around 30 percent of the judges also doesn’t seem like a particularly high bar. While the group claims that no previous computer program has been able to reach this level, there have been numerous chatbots, some as far back as the 1960s, which were able to fool people for at least a short while. In a 1991 competition, a bot called PC Therapist was able to get five out of 10 judges to believe it was human. More recently, there have been fears that online chatbots could trick people into falling in love with them, stealing their personal information in the process. And a 2011 demonstration had a program named Cleverbot manage a Turing Test pass rate of nearly 60 percent.
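Those 1960s programs, and arguably their descendants quoted in the transcript above, work by matching keywords in the input and emitting canned replies. A minimal sketch of the technique follows; the patterns and responses are invented for illustration, and this is not Eugene's actual code.

```python
import random
import re

# An ELIZA-style chatbot in miniature: scan the input for keywords and
# return a canned reply, deflecting when nothing matches. The patterns
# and responses are invented for illustration.
RULES = [
    (re.compile(r"where are you from", re.I),
     ["A big Ukrainian city called Odessa on the shores of the Black Sea"]),
    (re.compile(r"\b(robot|computer|machine|bot)\b", re.I),
     ["Do I look like a robot to you?", "What a strange question!"]),
    (re.compile(r"\?$"),
     ["Why do you ask?", "I'd rather hear about you."]),
]
FALLBACKS = ["Interesting. Tell me more.",
             "Hmm, let's talk about something else."]

def reply(message: str) -> str:
    for pattern, responses in RULES:
        if pattern.search(message.strip()):
            return random.choice(responses)
    return random.choice(FALLBACKS)

for line in ["Where are you from?", "Are you a robot?", "Nice weather today."]:
    print(f"You: {line}\nBot: {reply(line)}")
```

Rule order does most of the work: specific patterns shadow the catch-all question deflection, which is roughly as much conversational state as bots of that era carried.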
Anders Sandberg, meanwhile, links to a blog post of his, "Eugene the Turing test-beating teenbot reveals more about humans than computers". He suggests that the appeal of the Turing test lies in human incapacity to discern actual intelligence.
Why do we fall for it so easily? It might simply be that we have evolved with an inbuilt folk psychology that makes us believe that agents think, are conscious, make moral decisions and have free will. Philosophers will happily argue that these things do not necessarily imply each other, but experiments show that people tend to think that if something is conscious it will be morally responsible (even if it is a deterministic robot).
It is hard to conceive of a human-like agent without consciousness but with moral agency, so we tend to ascribe agency and free will to anything that looks conscious. It might just be the presence of eyes, or an ability to talk back, or any other tricks of human-likeness.
So Eugene’s success in the Turing test may tell us more about how weak we humans are when it comes to detecting intelligence and agency in conversation than about how smart our machines are.