Sir Francis Galton, a cousin of Charles Darwin, developed the first tests designed to measure intelligence. Galton, a naturalist and mathematician, was interested in individual differences. He invented the correlation coefficient (which plays such an important role in psychology) and developed the ideas behind fingerprinting and eugenics. Galton administered a battery of tests (measuring such variables as head size, reaction time, visual acuity, memory for visual forms, breathing capacity, and strength of hand grip) to over 9,000 visitors to the London Exhibition in 1884.

His somewhat strange collection of tests reflected his belief that superior intelligence was accompanied by superior physical vigour. He undoubtedly was disappointed to discover that eminent British scientists could not be distinguished from ordinary citizens on the basis of their head size, and that strength of grip was not much related to other measures of intelligence. Galton’s tests did not prove very useful.

The intelligence test as we know it today was formulated by the French psychologist Alfred Binet (1857-1911). The French government asked Binet to devise a test that would detect those children too slow intellectually to profit from regular schooling.

He assumed that intelligence should be measured by tasks requiring reasoning and problem solving, rather than perceptual-motor skills. In collaboration with Theodore Simon (1873-1961), another French psychologist, Binet published a scale in 1905, which he revised in 1908 and again in 1911. These Binet scales are the direct predecessors of contemporary intelligence tests.

Mental-Age Scale (Binet’s Method)

Binet assumed that a dull child was like a normal child but retarded in mental growth; he reasoned that the dull child would perform on tests like a normal child of younger age. Binet decided to scale intelligence as the kind of change that ordinarily comes with growing older. Accordingly, he devised a scale of units of mental age.

Average mental-age (MA) scores correspond to chronological age (CA), that is, to the age determined from the date of birth. A bright child’s MA is above his CA; a dull child has an MA below his CA. The mental-age scale is easily interpreted by teachers and others who deal with children differing in mental ability.

Item Selection: Because the intelligence test is designed to measure brightness rather than the results of special training, it must consist of items that do not assume any specific preparation. In other words, the intelligence test is designed to be an aptitude test rather than an achievement test, and it must be constructed accordingly.

There are two chief ways to find items for which success is uninfluenced by special training. One way is to choose novel items with which an untaught child has as good a chance to succeed as one who has been taught at home or in school. Figure 9.2 illustrates novel items. In this particular case the child is asked to choose figures that are alike, on the assumption that the designs are unfamiliar to all children.

The second way is to choose familiar items, with the assumption that all those for whom the test is designed have had the requisite prior experience to deal with the items. The following problem provides an example of a supposedly familiar item:

Mark F if the sentence is foolish; mark S if it is sensible.

S F Mrs. Smith has had no children, and I understood that the same was true of her mother.

This item is “fair” only for children who know the English language, who can read, and who understand all the words in the sentence. For such children, detection of the fallacy in the statement becomes a valid test of intellectual ability.

Many of the items on an intelligence test of the Binet type assume general familiarity. A vocabulary test, for example, appears in almost all the scales. Familiarity with the standard language of the test is necessarily assumed.

The intelligence test is in some respects a crude instrument, for its assumptions can never be strictly met. The language environment of one home is never exactly that of another, the reading matter available to the subjects differs, and the stress upon cognitive goals varies.

Even the novel items depend upon perceptual discriminations that may be acquired in one culture or subculture and not in another. Despite the difficulties, items can be chosen that work reasonably well. The items included in contemporary intelligence tests are those that have survived in practice after many others have been tried and found defective. It should be remembered, however, that intelligence tests have been validated by success in predicting school performance within a particular culture.

Item Testing: It is not enough to look at an item and to decide that it requires intelligence to answer it successfully. Some “tricky” or “clever” items turn out to be poor because of the successes or failures that occur through guessing. More pedestrian items, such as matters of common information, sometimes turn out to be most useful. These are “fair” items if all have had a chance to learn the answers.

How did Binet and those who came after him know when they had hit upon a good item? One method of testing an item is to study the changes in proportions of children answering it correctly at different ages. Unless older children are more successful than younger ones in answering the item, the item is unsatisfactory in a test based on the concept of mental growth.

A second method of testing an item is to find out whether the results for it correspond to the results on the test as a whole. This can be done by correlating success and failure on the item with the score made on the remaining items. If all items measure something in common, then every single item ought to contribute a score that correlates with the total score.

These two requirements for an acceptable item (increase in percentage passing with age and correlation with total score) reflect both validity and reliability. The first requirement is an indirect way of guaranteeing validity, being based on the inference that what we mean by intelligence should distinguish an older child from a younger one; the second requirement is a guarantee of reliability through internal consistency of the measures.
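The two item checks described above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names, the strict "must rise at every age step" rule, and the sample data are all hypothetical and do not come from any published test manual. The second check uses an ordinary Pearson correlation between success on one item (scored 1 or 0) and the score on the remaining items.

```python
import math


def proportion_passing_by_age(records):
    """Map each age to the proportion of children of that age who passed.

    `records` is a list of (age, passed) pairs for a single item.
    """
    totals, passes = {}, {}
    for age, passed in records:
        totals[age] = totals.get(age, 0) + 1
        passes[age] = passes.get(age, 0) + (1 if passed else 0)
    return {age: passes[age] / totals[age] for age in sorted(totals)}


def increases_with_age(proportions):
    """First check: older groups pass more often than younger groups.

    Requiring a strict rise at every step is an assumption of this sketch.
    """
    values = [proportions[age] for age in sorted(proportions)]
    return all(later > earlier for earlier, later in zip(values, values[1:]))


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def item_rest_correlation(responses, item_index):
    """Second check: correlate one item with the score on the remaining items.

    `responses` holds one list of 0/1 item scores per child.
    """
    item = [row[item_index] for row in responses]
    rest = [sum(row) - row[item_index] for row in responses]
    return pearson(item, rest)


# Hypothetical data, invented purely for illustration.
records = [(6, False), (6, False), (6, True),
           (7, True), (7, False), (7, True),
           (8, True), (8, True), (8, True)]
responses = [[1, 1, 1, 1],
             [1, 1, 1, 0],
             [1, 1, 0, 0],
             [1, 0, 0, 0],
             [0, 0, 0, 0]]

by_age = proportion_passing_by_age(records)
print(by_age, increases_with_age(by_age))
print(item_rest_correlation(responses, 1))
```

An item that fails the first check, or whose correlation with the rest of the test is near zero or negative, would be a candidate for removal under the reasoning given above.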

Contemporary Binet Tests

The test originally developed by Binet underwent several revisions in this country, the first by Goddard in 1911. For many years the best-known and most widely used revision was that made by Terman at Stanford University in 1916, commonly referred to as the Stanford-Binet. The test was revised in 1937, 1960, and 1972.

In the Binet tests, an item is age-graded at the level at which a substantial majority of the children pass it. The present Stanford-Binet has six items of varied content assigned to each year, each item when passed earning a score of two months of mental age.

The procedure of testing is first to establish the child’s basal mental age, the mental-age level at which he passes all items. Two months of mental age are then added for each item passed at higher age levels. Consider, for example, the child who passes all items at the mental-age level of six years.

If the child then passes two items at the seven-year level, four months are added; passing an additional item at the eight-year level adds two more months. This particular child will have an earned mental age of six years and six months, regardless of chronological age. The test allows for some unevenness in development, so that two children can earn the same mental age by passing different items on the test.
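The arithmetic of this example can be written out as a short sketch. The function names and the dictionary layout are hypothetical, chosen only for illustration; the two-months-per-item credit and the worked numbers are those given in the text.

```python
MONTHS_PER_ITEM = 2  # from the text: each item passed earns two months of mental age


def earned_mental_age_months(basal_age_years, items_passed_above_basal):
    """Basal age in months plus two months for every item passed above it.

    `items_passed_above_basal` maps an age level (in years) to the number of
    items passed at that level.
    """
    extra_items = sum(items_passed_above_basal.values())
    return basal_age_years * 12 + MONTHS_PER_ITEM * extra_items


def format_age(months):
    """Express a number of months as years and months."""
    years, remainder = divmod(months, 12)
    return f"{years} years {remainder} months"


# The child in the text: basal age six, two items at year seven, one at year eight.
ma = earned_mental_age_months(6, {7: 2, 8: 1})
print(format_age(ma))  # 6 years 6 months
```

A child who instead passed one item at year seven and two at year eight would earn the same mental age, which is the unevenness in development the text refers to.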

Intelligence Quotient (IQ)

Terman adopted a convenient index of brightness that was suggested by the German psychologist William Stern (1871-1938). This index is the intelligence quotient, commonly known by its initials IQ. It expresses intelligence as a ratio of mental age to chronological age:

$$\mathrm{IQ} = \frac{\text{Mental age (MA)}}{\text{Chronological age (CA)}} \times 100$$

The 100 is used as a multiplier to remove the decimal point and to make the IQ have a value of 100 when MA equals CA. It is evident that if the MA lags behind the CA, the resulting IQ will be less than 100; if the MA is above the CA, the IQ will be above 100.
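As a rough illustration of the ratio just described (mental age divided by chronological age, multiplied by 100), the following sketch computes the quotient with both ages expressed in months. The function name and the sample ages are assumptions made for the example.

```python
def ratio_iq(ma_months, ca_months):
    """Ratio IQ: mental age over chronological age, times 100."""
    if ca_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * ma_months / ca_months


# Hypothetical children, ages in months.
print(ratio_iq(96, 96))   # MA equals CA
print(ratio_iq(96, 120))  # MA lags behind CA
print(ratio_iq(120, 96))  # MA is above CA
```

Expressing both ages in the same unit matters; mixing years and months would distort the quotient.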

How is the IQ to be interpreted? The distribution of IQs follows the form of the curve found for many differences among individuals, such as differences in height; this is the bell-shaped “normal” distribution curve shown in Fig. 9.2. In this curve most cases cluster around a middle value, tapering off to a few at both extremes. The adjectives commonly used to describe the various IQ levels are given in Table 9.2.

In the 1960 and subsequent revisions of the Stanford-Binet, the authors introduced a method of computing the IQ from tables. The meaning of an IQ remains essentially the same as before, but the tables permit corrections to allow the IQ at any age to be interpreted somewhat more exactly. It is now arranged so that for each age the IQ averages 100 and has a standard deviation of 16.
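One way to picture a score that averages 100 with a standard deviation of 16 at every age is as a standard-score conversion. The sketch below is an assumption-laden illustration, not the published tables: it supposes that the mean and standard deviation of raw scores for the child's own age group are known, and the norm values used here are invented.

```python
IQ_MEAN = 100  # from the text: the IQ averages 100 at each age
IQ_SD = 16     # from the text: standard deviation of 16


def deviation_iq(raw_score, age_group_mean, age_group_sd):
    """Rescale a raw score relative to the norms for the child's age group.

    The age-group mean and standard deviation are hypothetical inputs; the
    actual test supplies the conversion through its tables.
    """
    if age_group_sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = (raw_score - age_group_mean) / age_group_sd
    return IQ_MEAN + IQ_SD * z


# Invented norms for one age group: mean raw score 50, standard deviation 10.
print(deviation_iq(50, 50, 10))  # at the age-group average
print(deviation_iq(60, 50, 10))  # one standard deviation above
print(deviation_iq(40, 50, 10))  # one standard deviation below
```

Under this kind of scaling, a given IQ marks the same relative standing within every age group.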

A modern IQ is merely a test score adjusted for the age of the person being tested. It is therefore no longer a “quotient” at all, but the expression IQ persists because of its familiarity and convenience.

Tests with More than One Scale

Tests following the pattern originated by Binet use a great assortment of items to test intelligence, and a pass or a fail on one kind of item is scored the same as a pass or a fail on another. But those who are skilled in the use and scoring of the tests learn much more than the final IQ.

They may note special strengths and weaknesses; tests of vocabulary, for example, may be passed at a higher level than tests of manipulating form boards. These observations lead to the conjecture that what is being measured is not one simple ability but a composite of abilities.

One way to obtain information on specific kinds of abilities, rather than a single mental-age score, is to separate the items into more than one group and to score the groups separately. The Wechsler Adult Intelligence Scale (described in Table 9.3) and the Wechsler Intelligence Scale for Children use items similar to those in the Binet test, but they divide the total test into two parts, a verbal scale and a performance scale, according to the content of the items.

A performance item is one that requires manipulation or arrangement of blocks, beads, pictures, or other materials in which both stimuli and responses are nonverbal. The separate scaling of the items within one test is convenient for diagnostic purposes. Figure 9.3 shows a test profile and how the test scores are summed to yield IQs.

Information: Questions tap a general range of information; e.g., “How many weeks in a year?”

Comprehension: Tests practical information and ability to evaluate past experience; e.g., “How would you find your way out, if lost in a forest?”

Arithmetic: Verbal problems testing arithmetic reasoning.

Similarities: Asks in what way certain objects or concepts (e.g., egg and seed) are similar; measures abstract thinking.

Digit span: Series of digits presented aurally (e.g., 7-5-6-3-8) are repeated in a forward or backward direction. Tests attention and rote memory.

Vocabulary: Tests word knowledge.

Digit symbol: A timed coding task in which numbers must be associated with marks of various shapes; tests speed of learning and writing.

Picture completion: The missing part of an incompletely drawn picture must be discovered and named; tests visual alertness and visual memory.

Block design: Pictured designs must be copied with blocks; tests ability to perceive and analyze patterns.