In 1950, British mathematician Alan Turing published a paper called Computing Machinery and Intelligence. The paper opens with the remarkable sentence, "I propose to consider the question 'Can machines think?'" Remember that back in 1950, there were only a few computers in the world, and they were used exclusively for mathematical and engineering purposes. In this paper, Turing describes the Imitation Game, which we now call the Turing Test for machine intelligence. The test is quite simple: an interrogator using a teletype converses, via a Q&A session, with two hidden entities. One is a person, and the other is a computer. If the interrogator guesses wrong, that is, identifies the computer as the human, then the computer has passed the Turing Test. Remember that Turing called this the Imitation Game: the computer is successfully imitating intelligence. We can leave philosophers to decide if the computer is actually intelligent (note: no group of philosophers will ever agree on this).
Now consider the maths of the Turing Test. If the interrogator simply guesses at random between human and computer, paying no attention to the merits of the Q&A session, they will be correct fifty per cent of the time, since there are only two options. So, in a large experiment using the Turing Test, interrogators need to identify the computer correctly significantly more than fifty per cent of the time to prove the AI has failed the test.
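To make "significantly more than fifty per cent" concrete, here is a minimal Python sketch of a one-sided exact binomial test against pure guessing. The function name `p_value_above_chance` and the sample counts are illustrative assumptions, not figures from any study; it simply asks how likely a given number of correct identifications would be if every interrogator were flipping a coin.

```python
from math import comb


def p_value_above_chance(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` right answers out of
    `trials` if every answer were a fair coin flip (p = 0.5)."""
    # Count the equally likely guess sequences with `correct` or more hits.
    tail = sum(comb(trials, k) for k in range(correct, trials + 1))
    # Each of the 2**trials sequences is equally likely under pure guessing.
    return tail / 2 ** trials


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    for correct, trials in [(52, 100), (600, 1000)]:
        p = p_value_above_chance(correct, trials)
        print(f"{correct}/{trials} correct: p-value = {p:.3g}")
```

A small p-value means the interrogators are doing better than guesswork, so the AI has been detected. A large one means the result is consistent with coin-flipping. Note that the same percentage can be unconvincing in a small experiment and decisive in a large one, which is why the size of the experiment matters.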
One such large experiment involving three large language models, including GPT-4 (the AI behind ChatGPT), has recently been published: Human or Not? A Gamified Approach to the Turing Test. Over 1.5 million participants spent two minutes chatting with either a person or an AI. The AI was prompted to make small spelling mistakes and to quit if the tester became aggressive. With this prompting, interrogators could only correctly guess whether they were talking to an AI system 60% of the time, a little better than random chance.
However, if the AI was prompted to be vulgar and use rude language, its success increased, and interrogators only identified the AI correctly 52.1% of the time, causing the authors to observe "that users associated impoliteness with human behaviour."
Turing himself set a low threshold for passing his eponymous test: "I believe that in 50 years' time, it will be possible to make computers play the imitation game so well that an average interrogator will have no more than 70% chance of making the right identification after 5 minutes of questioning." Well, it has been over seventy years, and AI has now reduced the chance of correct identification to 60%, and to little better than guesswork if the AI curses.
This is a historic milestone. Passing the Turing Test has been held up as a significant challenge for AI since Turing's paper was first published, akin to summiting Everest or splitting the atom. The philosophers (and theologians) will continue to argue about the nature of intelligence, consciousness and free will while computer scientists continue developing machines that imitate intelligence.