Mathematics Achievement of Chinese, Japanese, and American Children: Ten Years Later

Author(s): Stevenson, HW; Chen, C; Lee, SY | Abstract: A decade of heightened emphasis in the United States on mathematics and science education has had little influence on academic achievement or parental attitudes. American elementary school children in 1990 lagged behind their Chinese and Japanese peers to as great a degree as they did in 1980. Comparison of the performance of elementary and secondary school students between 1980 and 1990 reveals a decline from first to eleventh grade in the relative position of American students in mathematics. Parental satisfaction with American students' achievement and education remains high and standards remain low. Innate ability continues to be emphasized by Americans as a basis for achievement. American eleventh graders report more indications of stress than do their Chinese and Japanese counterparts.

The American educational system received greater attention and scrutiny in the 1980s than in any decade since the 1950s. President Bush and governors proposed an educational agenda for the nation, commissions were appointed, and boards of education and school systems throughout the country attempted to initiate reforms aimed at improving the academic achievement of American students. We are now at a point, 10 years after the reform movement began, where it is useful to ask whether these activities have resulted in any improvement in the performance of the students. In 1980, we initiated a comparative study of American, Japanese, and Chinese elementary school students in Minneapolis, Sendai (Japan), and Taipei (Taiwan). The results showed that Chinese and Japanese first and fifth graders greatly surpassed their American counterparts in mathematics and that Chinese children were more capable readers than the Americans (1).
The low levels of achievement found in Minneapolis are notable because Minnesota students rank high among the states in mathematics achievement, and Minnesota has the highest percentage in the nation of students graduating from high school (2, 3). When problems are found in Minnesota, more severe ones might be expected to occur in many other states. Four years after our original study, we returned to the same schools and followed up the first graders who were now fifth graders. No significant improvement occurred in the mathematics achievement of Minneapolis fifth graders during the 4 years, and cross-cultural differences were as great in 1984 as they had been in 1980 (4).
We began a new study in 1990. Once again we visited the schools included in the original study and tested a third sample of fifth graders. We also attempted to locate the first graders we had tested in 1980 (who were now eleventh graders) to trace their subsequent levels of academic achievement. To supplement the longitudinal sample, we tested over 1000 eleventh graders in each city. Data from the samples of students tested in 1980, 1984, and 1990 and 1991 form the basis of this article. These data allow us to assess possible changes in the performance of elementary school students

Outline of the Study
Our sample in 1980 included approximately 240 first graders and 240 fifth graders in each city. Nearly half of the students who were first graders in 1980 attended fifth grade at the same schools in 1984, the time of our first follow-up.
In 1990 and 1991, we attempted to follow up the first graders in our original study (now in eleventh grade). We located and obtained the cooperation of 212 students in Minneapolis, 169 in Taipei, and 93 in Sendai. Japanese eleventh graders were reluctant to participate in the study after school hours, primarily because they were preparing for the college entrance examinations they would take the following year. To place the eleventh-grade follow-up students in a broader context, we also studied cross-sectional samples totaling nearly 4000 students in Minneapolis, Taipei, and Sendai. These students attended 9, 18, and 8 different high schools, respectively. A larger number of schools was necessary to obtain a representative sample in Taipei because there are separate schools for boys and girls, and there are both regular and vocational high schools.
New samples of approximately 240 fifth graders in each city were also studied in 1990 (5). We interviewed over 85% of the mothers of the elementary school students and of the longitudinal sample of eleventh graders in each city. Over 73% of the fathers of the eleventh graders in the longitudinal sample filled out a questionnaire. We also interviewed the mathematics teachers of the eleventh graders. We constructed all tests, interviews, and questionnaires used in the study. The tests, based on detailed analyses of the textbooks used in each city, were highly reliable (6). By including only concepts and operations common to all three locations, we reduced the potential problem of differential exposure, as discussed recently by Westbury (7). The same mathematics and reading tests were given to the fifth graders in all three testing periods.
We also included a test of general information. This test provided a measure of knowledge that children did not usually learn in school but had acquired through their everyday experiences. The test for elementary school students included items such as "What are two things a plant needs in order to grow?" and "Why can't people live under water?" The eleventh-grade test included more difficult items, such as "Why has it become possible to make smaller computers in recent years?" and "What do we mean by inflation when we talk about a country's economy?" (8).
The interviews and questionnaires covered parental attitudes and beliefs and many aspects of children's lives at home and after school. We included some common questions in the interviews and questionnaires constructed for each grade level and for each aspect of the study.
[Figure caption, partly lost in extraction] The Japanese-American difference at first grade was 0.53 ± 0.09, and the Chinese-American difference was 0.73 ± 0.09. At fifth grade the respective differences were 0.84 ± 0.10 and 1.19 ± 0.10; at eleventh grade, 0.80 ± 0.10 and 0.87 ± 0.11.
[...] children in 1990 had scores as low as those of the average American child.

Academic Achievement
Reading ability also differed among students in the three cities. In 1980, Chinese children obtained the highest average score on reading vocabulary, and Japanese children received the lowest scores. In 1990, Japanese fifth graders had become the top performers, and American students received the lowest scores (Fig. 1B).
Eleventh graders. Of the three groups, American students continued to receive the lowest scores in mathematics at eleventh grade. Distributions of the scores of 3792 eleventh graders in the cross-sectional samples (Fig. 2) reveal a wide disparity between the average performance of the American and of the Chinese and Japanese students.
Only 14.5% of the Chinese and 8.0% of the Japanese students received scores below the average score of the American students. The distribution of scores in Taipei was bimodal. This was related to the type of school the students attended. The average score of students enrolled in vocational high schools was 18.3; for students enrolled in the regular academic high schools it was 30.7. Even the vocational high school students obtained a higher average score than the Minneapolis students' average of 13.4 points.
The relative status of students at eleventh grade can also be compared with what existed at first and fifth grades. For this purpose, we computed standard scores at each grade level for students in the longitudinal sample (9). The data (Fig. 3) offer no evidence of improvement in the status of the American students as they moved from first through eleventh grade (10).
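The standardization described in note 9 can be illustrated with a minimal sketch. It assumes that "weighted to yield equal proportions" means each city contributes an equal share to the pooled mean and standard deviation regardless of how many students were tested there, and that a population (not sample) standard deviation is used; the article specifies neither detail. The function name and the scores below are hypothetical and serve only as an illustration.

```python
import math

def standardize_equal_weight(scores_by_city):
    """Return z-scores per city from a pooled distribution in which
    every city carries the same total weight (an assumed reading of note 9)."""
    k = len(scores_by_city)
    # Each student's weight is 1 / (number of cities * students in that city),
    # so the weights within a city sum to 1/k and all weights sum to 1.
    weighted = [(x, 1.0 / (k * len(xs)))
                for xs in scores_by_city.values() for x in xs]
    mean = sum(w * x for x, w in weighted)
    variance = sum(w * (x - mean) ** 2 for x, w in weighted)
    sd = math.sqrt(variance)
    return {city: [(x - mean) / sd for x in xs]
            for city, xs in scores_by_city.items()}

# Made-up raw scores, NOT data from the study.
scores = {
    "Minneapolis": [10, 12, 14, 16],
    "Taipei": [18, 22],
    "Sendai": [16, 18, 20],
}
z = standardize_equal_weight(scores)
city_means = {city: sum(v) / len(v) for city, v in z.items()}
```

With equal weighting, the city means of the standard scores average to zero, so a difference between two city means can be read directly in pooled SD units, which is how the gaps in the text are expressed.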
If we analyze the data from all students tested at each grade level rather than from only those in the smaller longitudinal sample, we find that the achievement gap in mathematics increased between first and eleventh grades. At first grade, the average Chinese and Japanese scores differed from the American scores by 0.72 ± 0.09 and 0.53 ± 0.09 standard deviation (SD) computed in the manner indicated above. By eleventh grade the corresponding differences were 1.18 ± 0.04 and 0.92 ± 0.03 SD. The increased divergence of scores, 0.45 ± 0.10 for the Chinese-American comparison and 0.39 ± 0.09 for the Japanese-American comparison, was statistically significant.
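The arithmetic behind the increased divergence can be sketched as follows. The sketch assumes that each ± value is a standard error and that the first-grade and eleventh-grade estimates are independent, so the uncertainties combine in quadrature; the article does not state how its intervals were computed. The helper name is hypothetical.

```python
import math

def gap_change(early, early_se, late, late_se):
    """Change between two standardized differences and its standard error,
    assuming the two estimates are independent."""
    change = late - early
    se = math.sqrt(early_se ** 2 + late_se ** 2)
    return change, se

# Gaps relative to American students, in SD units, as reported in the text.
chinese = gap_change(0.72, 0.09, 1.18, 0.04)
japanese = gap_change(0.53, 0.09, 0.92, 0.03)

for label, (change, se) in (("Chinese-American", chinese),
                            ("Japanese-American", japanese)):
    print(f"{label}: {change:.2f} +/- {se:.2f} SD, ratio {change / se:.1f}")
```

Under these assumptions the combined uncertainties round to the published 0.10 and 0.09. The rounded inputs give 0.46 rather than the published 0.45 for the Chinese-American change, a discrepancy that is presumably due to rounding of the underlying values. Both changes are roughly four or more times their standard error, which is consistent with the statement that the divergence was statistically significant.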
Another question sometimes raised is whether average scores offer the most appropriate measure for comparing children's levels of achievement in different countries. For example, might not the scores of high achievers, say the top 10% of students in each location, be comparable? We sought to answer this question by comparing the scores of the top 10% of American fifth and eleventh graders with those of the top 10% of Chinese and Japanese students.
The performance of the top Minneapolis students was more similar to that of the average Taipei and Sendai students than it was to that of the top students in those cities. For example, at fifth grade the mean score of the American top achievers was 1.00 SD below the mean for the Asian top achievers but only 0.04 SD higher than the mean of all of the Asian students. The same effect occurred at eleventh grade: the top American students were 1.13 SD below the mean of the top Chinese and Japanese students and only 0.37 SD above the mean of all Asian students.
Generally, boys and girls were equally capable in mathematics during elementary school. This was true in all three cultures. At eleventh grade, however, scores of the boys in all three locations were significantly higher than those of the girls, and the gender differences were greater among Chinese and Japanese than among American students (11).
General information. The general information test offers an interesting contrast to the test of mathematics, where performance is highly dependent on academic instruction. Asian superiority was not evident in the general information scores (Fig. 4). Rather than diverge as grade level increased, as happened with the mathematics scores, the average scores of the American, Chinese, and Japanese students on the general information test became increasingly similar. At first grade, the average scores of the American students exceeded those of their Chinese peers by 0.72 ± 0.09 SD and of their Japanese peers by 0.43 ± 0.08 SD. By eleventh grade, the scores differed by only 0.13 ± 0.11 and 0.17 ± 0.13 SD, respectively. Early American superiority was also found in the performance of nearly 900 kindergarten children in the three cities (12). Scores of the American kindergarten children exceeded those of the Chinese children by 1.33 ± 0.07 SD and of the Japanese children by 0.69 ± 0.06 SD. Thus, throughout their schooling American students proved to be as capable as or even more capable than the Asian students when they were tested with items not based on the school curriculum.
We attribute the early superiority of the American children to the greater cognitive stimulation provided by their parents, who indicated that they read more frequently to their young children, took them on more excursions, and accompanied them to more cultural events than did the Chinese or Japanese parents (13). As American children grow older, parents appear to be less likely to provide the kinds of enriched out-of-school experiences that they did before the children entered first grade.

Parents' Satisfaction
One of the most dismaying findings in our 1980 study was the high level of satisfaction expressed by American mothers when they were asked about their children's academic achievement. Few Chinese and Japanese mothers, but over 40% of the American mothers, expressed high degrees of satisfaction with their children's academic performance. One reason for their uncritical attitude may have been their lack of information about the relative status of American children compared to their peers in other industrialized countries. Since 1980, however, the media has disseminated many reports of the weaknesses of American educational systems and of American students in international comparisons. We can ask, therefore, whether the publication of these critical reports may have led to diminished satisfaction by American mothers. It is evident (Fig. 5) that it did not. If anything, somewhat more American mothers said they were "very satisfied" with their children's performance in 1990 compared with 1980.
The continued satisfaction of American mothers is surprising because they seemed to be aware of the country's low status in comparative studies. For example, we told the mothers of the eleventh graders, "Several recent studies have compared students' school achievement in different countries. One recent study compared the performance in math of high school students from eight industrialized countries. Among the countries, where do you think (American, Taiwan, Japanese) high school students ranked in math?" American mothers estimated that American students' scores would fall between sixth and seventh place. Chinese and Japanese mothers were aware of their students' high status: they estimated that students in Taiwan and Japan would be between second and third place. In other words, Americans appeared to be aware that American education is in trouble but did not ascribe the phenomenon to their own children. One reason American parents may express such satisfaction is that they seldom receive clear, explicit information about their children's standing in academic subjects. In the United States, most elementary school teachers convey their evaluations of children through general phrases, such as "satisfactory" or "needs improvement" or through drawings of smiling or frowning faces. As students progress through school, however, the grading system becomes more informative and parents should have a clearer idea of how their children are performing. Even so, parents of the American eleventh graders remained very positive about their children's level of achievement. One-third of the American mothers said they were "very satisfied" with their child's academic achievement, a strong contrast with the 10% of the Chinese and 2% of the Japanese mothers who said they were "very satisfied." Fathers expressed similar attitudes: the percentages of fathers of eleventh graders who were "very satisfied" were 37% (Minneapolis), 8% (Taipei), and 6% (Sendai).
American students, as well as their parents, had a high regard for their own academic accomplishments. For example, we asked the eleventh graders and their mothers and fathers to rate the students' academic achievement and achievement in mathematics compared with corresponding achievements of other students of the same age. For all three sets of ratings (Fig. 6), the average ratings made by the American students and their parents were above those made by their Chinese and Japanese counterparts and were consistently higher than the score of "4," which was defined as that of average students.
We also asked the parents about how good a job they thought their children's schools were doing in educating their child. The American mothers of fifth graders were consistently more positive than the Chinese and Japanese mothers (Fig. 7). The positive attitude persisted to eleventh grade: 79% of the American mothers, but only 44% of the Chinese and 48% of the Japanese mothers, rated their children's schools as doing a "good" or "excellent" job. American fathers were also very positive: 77% said their eleventh grader's school was doing a "good" or "excellent" job, but Chinese and Japanese fathers were no more positive than the mothers: 51% of the Chinese and 40% of the Japanese fathers gave their children's schools the top ratings.
The children assimilated these positive attitudes. As early as fifth grade, American students expressed confidence that they were doing as well in school as their parents and teachers wanted. Asian students were less sure (14).
Mothers' attitudes about the academic curricula were equally stable over the decade. Despite the publicity about the need for reform, for upgrading standards, and for assigning more homework, mothers of fifth graders were no more enthusiastic about instituting changes in 1990 than they had been in 1980 or 1984. Over 80% of the American mothers in all three periods thought the level of difficulty of the curriculum was "just right." In 1990, less than 10% thought it was either too hard or too easy. Very few Japanese or Chinese mothers ever thought the curriculum was too easy, but from 10 to 30%, depending on the year and location, thought it was too difficult.
American mothers expressed little enthusiasm about increasing the amount of homework. The great majority said the amount was "just right." A little more than a quarter of the American mothers thought their fifth graders had too little homework, both in 1980 (27%) and in 1990 (28%). Nearly a third of the Japanese mothers and 10% of the Chinese mothers expressed this attitude. These attitudes existed even though students in Taipei and Sendai spend much more time on homework than the students in Minneapolis (15).
In short, Minneapolis parents continued to be as satisfied in 1990 as they had been in 1980 with their children's academic achievement, the quality of education provided by their children's schools, the curricula, and the amount of homework assigned. In view of the persistence of such positive attitudes and of the continued poor performance of American students, one must wonder about the degree of popular support that exists generally in the United States for extensive changes in elementary and secondary education.
Many American parents do not have high standards for their children's academic achievement. For example, we told the mothers, "Let's say there is a math test in which there are 100 points. The average score is 70. What score do you think your child would get?" We then asked, "What score would you be satisfied with?" Asian mothers were less easily satisfied than the American mothers. The tendency in all three locations, especially among mothers of fifth graders, was to expect that their child would receive an above-average score (Fig. 8). American mothers tended to give the highest estimates and Japanese mothers, the lowest. In responding to the second question, American mothers said they would be satisfied with the score they expected their child would receive (16). Chinese and Japanese mothers required a score higher than the expected score to be satisfied.
We found similar results when we asked the fathers and students these questions and when we asked about reading as well as mathematics. Americans consistently indicated that the score with which they would be satisfied was about the same as or lower than the score they expected their children or they would receive.

Ability versus Effort
In earlier reports, we presented evidence of cultural differences in the relative emphasis given to effort and ability in accounting for the academic achievement of elementary school children. Chinese and Japanese fifth graders and their mothers, following long-held tenets of Asian philosophy, stressed the importance of hard work as the route to success. American mothers, to a greater degree than Chinese and Japanese mothers, emphasized the importance of innate ability. We wondered whether this characteristic would remain applicable when students were in high school. Years of trying hard and not succeeding or of finding that high ability cannot compensate for failure to study might lead eleventh graders and their parents to modify their earlier beliefs. This did not prove to be the case. Eleventh graders in Japan and Taiwan remained strong adherents to the belief that hard work was of primary importance, and their American peers retained their strong belief in the importance of ability as a modifier of the effects of effort.
Several types of evidence support this conclusion. For example, we told the eleventh graders, "Here are some factors that may influence students' performance in mathematics: a good teacher, innate intelligence, home environment, and studying hard. Which do you think is the most important factor?" More Chinese and Japanese than American students thought studying hard was the most important factor (59 and 72% versus 27%, respectively). In contrast, more than twice as many American as Chinese or Japanese students chose "a good teacher" (54% versus 18 and 14%, respectively).
We asked the teachers of the eleventh graders the same question. Among Japanese teachers, 93% selected "studying hard." Only 26% of the American teachers chose this alternative. In contrast, the first choice of 41% of the American teachers but of only 7% of the Japanese teachers was innate intelligence. Chinese teachers' choices fell between those of the American and Japanese teachers.
In another set of questions, we asked the students how strongly they believed in such statements as "Everyone in my class has about the same natural ability in math." American students disagreed to a significantly greater degree than did Chinese or Japanese students (17). When mothers were asked this type of question, American mothers strongly disagreed; Chinese and Japanese mothers were more neutral (18).

Psychological Adjustment
Critics of the academic success of Chinese and Japanese students often suggest that their high levels of performance come at great psychological cost. This criticism is based, we suspect, on informal reports and old data; we know of no recent comparative studies supporting this belief. We sought such information in our study by asking the eleventh graders to tell us about how frequently within the past month they had experienced feelings of stress, depression, aggression, and somatic complaints of possible psychological origin, such as being unable to sleep. We also asked about the frequency with which they felt nervous when they took tests or when the teacher handed tests back.
Japanese, not American students, reported the lowest frequencies of occurrence of all these characteristics. American students reported the most frequent feelings of stress, academic anxiety, and aggression. The most common source of stress was school. It was mentioned by 70% of the American students, more than three times as often as four other major sources (peers, family, sports, and jobs). Chinese students expressed somewhat more frequent feelings of depression and somatic problems (19).
These data do not support the Western stereotype of Asian students as tense young persons driven by relentless pressures for academic excellence. Had they exhibited frequent indications of psychological disturbance, it would be reasonable to argue that high standards for academic achievement may occur at unacceptable costs. Rather, it was the American students who were more likely to express indications of distress. We believe this occurred because American students do not have a clear idea about the importance they should place on education. Chinese and Japanese students are expected to devote themselves primarily to their studies. American students, in contrast, are faced with many opposing demands. For example, more than three times as many American as Asian students have after-school jobs (74% versus 21%) and more than twice as many have "dates" (over 85% versus 37%). The motivation for economic independence and broad social experience, as well as the desire to engage in sports and to assist with family chores, make it difficult for American high school students to devote themselves wholeheartedly to their studies.
SCIENCE • VOL. 259 • 1 JANUARY 1993

Implications
Unless radical reforms can be instituted in American educational systems, it seems unlikely that American students will lead the world in mathematics by the end of this century, one of the well-publicized goals adopted by the governors and President Bush in 1990. American schools may currently be fulfilling the roles expected of them by the American public, but its expectations prove to be insufficient when judged by international standards: American students received significantly lower scores on a curriculum-based mathematics test than their Chinese and Japanese peers, there was no significant improvement in the scores of fifth graders on the same test given three times over a period of 10 years, and the status of American students relative to their Chinese and Japanese peers declined between the first and eleventh grades.
These differences cannot be attributed to differential sampling. Enrollment in school is nearly universal among first and fifth graders in all three locations, and the percentage of adolescents enrolled in high school is similar if vocational high schools are included, as was the case in our study. Other factors were also carefully controlled, such as the time in the school year the data were collected, testing conditions, and the relevance of the test items.
We conclude that the achievement gap is real, that it is persistent, and that it is unlikely to diminish until, among other things, there are marked changes in the attitudes and beliefs of American parents and students about education. American parents appeared to be no more likely in 1990 and 1991 than they were in 1980 to believe that there is an urgent need for educational reform. They did not seem to be incensed by the low levels of performance by American students. Rather, they appeared to be pleased with their children's academic achievement, to be satisfied with the job their children's schools were doing, and to believe that children's innate abilities guide their course of progress through school. Attitudes and beliefs are difficult to change. But the likelihood of improving the nation's competitive position through better education depends, at least in part, on changing such optimistic but ultimately self-defeating views.

References and Notes
[...] Taiwan such rights of consent are vested in school authorities and teachers. The students and their parents were told that if there was anything they felt uncomfortable about answering, they could go on to the next question. The examiners, all native speakers and residents of each city, explained that the session could be terminated if the respondent wished.
6. The mathematics test given to first and fifth graders had 54 items arranged in order of difficulty. Some items required only computation; others required application of mathematical principles to word problems. We tested all children individually. Any child who failed to answer four successive items correctly was stopped.
7. [...] 18 (1992).
8. The general information test for the elementary school students contained 26 items; the eleventh-grade test contained 12 items. At first and fifth grades, students proceeded through the test until they missed four successive items. Eleventh graders responded to as many of the questions as possible within a 12-minute time limit. The reliability of the test ranged from 0.79 to 0.91 for first and fifth graders; at eleventh grade it was 0.67 (Sendai), 0.80 (Taipei), and 0.82 (Minneapolis).
9. We determined standard scores separately for each grade by combining data for students in all three locations in a single distribution. We then computed the standardized scores at each grade level using the mean and standard deviation of the weighted sample to yield equal proportions of students from each of the three locations.
10. Longitudinal comparisons across the 10-year span assume that the eleventh-grade sample of students was not biased. This assumption proved to be valid for the Chinese and Japanese students. Comparisons of the first-grade scores in mathematics, reading, and a cognitive test composed of ten subtests revealed no significant differences between eleventh graders included in the longitudinal sample and those who were not. For the American students, the average first-grade scores of the 29 students we were unable to include in the follow-up sample were significantly lower on all three tests than were those of the 212 follow-up students. On the mathematics test, for example, the respective scores were