Chatbots Review: The Turing test IV – Different versions of the Turing Test and how they matched against each other.

There are at least three primary versions of the Turing test, two of which are offered in "Computing Machinery and Intelligence" and one that Saul Traiger describes as the "Standard Interpretation." While there is some debate regarding whether the "Standard Interpretation" is that described by Turing or, instead, based on a misreading of his paper, these three versions are not regarded as equivalent, and their strengths and weaknesses are distinct.

The Imitation Game
Turing's original game, as we have seen, described a simple party game involving three players. Player A is a man, player B is a woman and player C (who plays the role of the interrogator) is of either sex. In the Imitation Game, player C is unable to see either player A or player B, and can only communicate with them through written notes. By asking questions of player A and player B, player C tries to determine which of the two is the man and which is the woman. Player A's role is to trick the interrogator into making the wrong decision, while player B attempts to assist the interrogator in making the right one.

Turing then proposes that the role of player A be filled by a computer; Sterrett refers to this version as the "Original Imitation Game Test." The computer's task is thus to pretend to be a woman and to trick the interrogator into making an incorrect evaluation. The success of the computer is determined by comparing the outcome of the game when player A is a computer against the outcome when player A is a man. If, as Turing puts it, "the interrogator decide[s] wrongly as often when the game is played [with the computer] as he does when the game is played between a man and a woman", it may be argued that the computer is intelligent. Other commentators, in contrast to Sterrett's opinion, posit that Turing did not expect the machine to be designed to imitate a woman when compared against a human.
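The comparative criterion quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure given by Turing: the function names, the made-up round outcomes and the `tolerance` parameter are all invented for this example, and reading "as often" as "at least as often" is one possible interpretation.

```python
def wrong_rate(outcomes):
    """Fraction of rounds in which the interrogator decided wrongly.

    `outcomes` is a list of booleans; True means a wrong identification.
    """
    if not outcomes:
        raise ValueError("need at least one round")
    return sum(outcomes) / len(outcomes)


def passes_oig_criterion(machine_rounds, man_rounds, tolerance=0.0):
    """Compare how often the interrogator errs in the two conditions.

    Returns True if the interrogator is wrong at least as often when
    player A is the machine as when player A is a man.
    `tolerance` is a hypothetical allowance; the paper states no threshold.
    """
    return wrong_rate(machine_rounds) >= wrong_rate(man_rounds) - tolerance


# Made-up outcomes for illustration only (True = interrogator was wrong).
man_as_a = [True, False, False, True, False]      # wrong in 2 of 5 rounds
machine_as_a = [True, False, True, False, False]  # wrong in 2 of 5 rounds

print(passes_oig_criterion(machine_as_a, man_as_a))  # prints True
```

The point of the sketch is that the verdict depends on two frequencies measured over many rounds, not on whether the machine wins any single round.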

The second version appears later in Turing's 1950 paper. As with the Original Imitation Game Test, the role of player A is performed by a computer, the difference being that the role of player B is now to be performed by a man rather than a woman.
"Let us fix our attention on one particular digital computer C. Is it true that by modifying this computer to have an adequate storage, suitably increasing its speed of action, and providing it with an appropriate programme, C can be made to play satisfactorily the part of A in the imitation game, the part of B being taken by a man?"
In this version, both player A (the computer) and player B are trying to trick the interrogator into making an incorrect decision.

The standard interpretation
Common understanding has it that the purpose of the Turing Test is not specifically to determine whether a computer is able to fool an interrogator into believing that it is a human, but rather whether a computer could imitate a human. While there is some dispute whether this interpretation was intended by Turing — Sterrett believes that it was and thus conflates the second version with this one, while others, such as Traiger, do not — this has nevertheless led to what can be viewed as the "standard interpretation." In this version, player A is a computer and player B a person of either gender. The role of the interrogator is not to determine which is male and which is female, but which is a computer and which is a human.

Imitation Game vs. Standard Turing Test
There has arisen some controversy over which of the alternative formulations of the test Turing intended. Sterrett argues that two distinct tests can be extracted from his 1950 paper and that, pace Turing's remark, they are not equivalent. The test that employs the party game and compares frequencies of success is referred to as the "Original Imitation Game Test," whereas the test consisting of a human judge conversing with a human and a machine is referred to as the "Standard Turing Test." Note that Sterrett equates the latter with the "standard interpretation" rather than with the second version of the imitation game. Sterrett agrees that the Standard Turing Test (STT) has the problems that its critics cite but feels that, in contrast, the Original Imitation Game Test (OIG Test) so defined is immune to many of them, due to a crucial difference: unlike the STT, it does not make similarity to human performance the criterion, even though it employs human performance in setting a criterion for machine intelligence. A man can fail the OIG Test, but it is argued that it is a virtue of a test of intelligence that failure indicates a lack of resourcefulness: the OIG Test requires the resourcefulness associated with intelligence and not merely "simulation of human conversational behaviour." The general structure of the OIG Test could even be used with non-verbal versions of imitation games.

Still other writers have interpreted Turing as proposing that the imitation game itself is the test. They do not say how this reading accommodates Turing's statement that the test he proposed, using the party version of the imitation game, rests on a criterion of comparative frequency of success in that game rather than on a capacity to succeed at one round of it.

Saygin has suggested that the original game may be a way of proposing a less biased experimental design, as it hides the participation of the computer.

Should the interrogator know about the computer?
Turing never makes clear whether the interrogator in his tests is aware that one of the participants is a computer. To return to the Original Imitation Game, he states only that player A is to be replaced with a machine, not that player C is to be made aware of this replacement. When Colby, F. D. Hilf, S. Weber and A. D. Kramer tested PARRY, they did so by assuming that the interrogators did not need to know that one or more of those being interviewed was a computer during the interrogation. As Ayse Saygin and others have highlighted, this makes a big difference to the implementation and outcome of the test. Huma Shah and Kevin Warwick, who have organised practical Turing tests, argue that knowing or not knowing may make a difference to some judges' verdicts. Judges in the finals of the parallel-paired Turing tests staged in the 18th Loebner Prize were not explicitly told, though some did assume that each hidden pair contained one human and one machine. Spelling errors gave away the hidden humans; machines were identified by "speed of response" and lengthier utterances. In an experimental study of Gricean maxim violations that also used the Loebner transcripts, Ayse Saygin found significant differences between the responses of participants who knew and those who did not know that computers were involved.

Based on http://en.wikipedia.org/wiki/Turing_test licensed under the Creative Commons Attribution-Share-Alike License 3.0


Saturday, December 31, 2011
