DOCUMENT DE TREBALL XREAP2011-18

Does grade retention affect achievement? Some evidence from PISA

J. Ignacio García-Pérez (University Pablo de Olavide)
Marisa Hidalgo-Hidalgo (University Pablo de Olavide)
J. Antonio Robles-Zurita (University Pablo de Olavide)

November 9, 2011

Abstract

Grade retention practices are at the forefront of the educational debate. In this paper, we use PISA 2009 data for Spain to measure the effect of grade retention on students' achievement. One important problem when analyzing this question is that school outcomes and the propensity to repeat a grade are likely to be determined simultaneously. We address this problem by estimating a Switching Regression Model. We find that grade retention has a negative impact on educational outcomes, but we confirm the importance of endogenous selection, which makes observed differences between repeaters and non-repeaters appear 14.6% lower than they actually are. The effect on PISA scores of repeating is much smaller (-10% of non-repeaters' average) than the counterfactual reduction that non-repeaters would suffer had they been retained as repeaters (-24% of their average). Furthermore, those who repeated a grade during primary education suffered more than those who repeated a grade of secondary school, although the effect of repeating at both times is, as expected, much larger.

Keywords: Grade retention, educational scores, PISA
JEL Classification: D63, I28, J24.

We are grateful to Giorgio Brunello, Antonio Cabrales, Rajashri Chakrabarti, M. Dolores Collado and Juan J. Dolado for their useful comments and discussion. We also thank seminar participants at the 2010 XREAP Workshop on the Economics of Education. Financial support from the Spanish Ministry of Education and Science (SEJ 2007-67734) and Junta de Andalucía (SEJ-2905, SEJ-426) is gratefully acknowledged.
Address for correspondence: Marisa Hidalgo-Hidalgo, Department of Economics, Universidad Pablo de Olavide, E-41013, Sevilla, Spain. E-mail: mhidalgo@upo.es
Author affiliations: University Pablo de Olavide, Seville (all three authors).

1 Introduction

In most countries, students are promoted from one grade to the next on the basis of their academic performance. The PISA 2009 Report shows considerable variation in grade retention rates across OECD countries, with the grade retention rate defined as the percentage of 15-year-old pupils who are not in their country's reference grade. The report shows that the Netherlands, Austria and Portugal have relatively high rates of grade retention (with up to 50% of pupils having repeated one year or more) whereas countries such as Denmark, Sweden, Japan, Norway and the UK have no grade retention at all (see Belot and Vandenberghe, 2011). Spain belongs to the first group, with about 40% grade retention on average. These disparities may be due to differences in policies, with some countries allowing students to be promoted to higher grades regardless of their performance and others conditioning promotion on students' educational achievements.

The recent interest in academic performance differences across countries as a result of increased international competition has brought the retention policy to the forefront of the educational debate. The PISA 2009 Report shows important differences in this dimension. In particular, countries such as Spain, Portugal, Italy, France and Greece are clearly below average among the OECD countries and did not show any improvement with respect to the 2000, 2003 and 2006 reports.1 Moreover, even within a single country, scores vary widely across regions.
For example, in Spain, the average math score in southern regions (e.g., the Canary Islands and Andalusia) is between 34 and 61 points below the OECD average, whereas the average math score in some northern regions (e.g., Castile Leon or Navarre) is about 18 points above the OECD average.

However, students' poor performance on international tests is not the only concern of policy makers and academics. Increasing drop-out rates (see OECD, 2009) are also a major worry.2 Among a number of policies devised to help reduce school drop-out rates and improve academic performance, we focus on grade retention regulation in this paper. Our objective is to estimate the grade retention effect on educational outcomes for the whole Spanish sample and for each of the Spanish regions with an enlarged sample.

1 The average PISA 2009 test scores of Spanish students in math, reading and science are 480, 484 and 488, respectively, which are 13, 12 and 13 points below the respective OECD means and, obviously, much lower than the scores in the best-performing countries, the Republic of Korea and Finland, where students score above 530 points in all disciplines.
2 See Dearden et al. (2006) for an analysis of policies aimed at reducing drop-out rates in the UK.

There is a great deal of controversy regarding the practice of grade retention. The proponents of retention argue that it may reinforce a student's knowledge, with potential benefits for his or her subsequent outcomes. Additional exposure to teaching, especially in early grades, may make a student more likely to pursue higher levels of education. Indeed, repetition may also improve the quality of the match between the school and the student if his development makes him more apt to succeed in a certain grade at a later age. The main argument in favor of grade retention is that it provides incentives to increase effort, making it an efficient mechanism to reallocate students.
However, this efficiency may come at a cost because retained students take longer to pass through the educational system. The critics of retention argue that it does not lead to improvements in school achievement and, instead, harms those low-achieving students who are most at risk of failure. They base their opinion on a large body of research on education and pedagogy that documents the negative effects of retention, particularly in terms of reducing the high school completion rate.3

The challenge in identifying the effect of grade failure on subsequent school outcomes lies in the fact that latent school outcomes (i.e., those that would be observed in the absence of grade failure) and the propensity to repeat a grade are likely to be determined simultaneously. Characteristics of the student (ability or motivation), the socioeconomic background and the school are likely to affect grade retention and attainment simultaneously. Such correlations are likely to lead to overestimation of the impact of grade failure on subsequent outcomes and to compromise the identification of a causal effect of retention on scores. In addition, note that most tests that evaluate students' knowledge in some particular discipline may not be appropriate for studying grade retention. Because repeaters are enrolled in lower grades, they have completed a less advanced curriculum and thus have a lower expected score.

A growing body of literature examines the relationship between grade retention and educational outcomes. Some studies provide quasi-experimental evidence of the effects of grade retention. For example, Manacorda (2008) exploits a discontinuity induced by a rule establishing automatic grade retention for pupils missing more than 25 days of school during a single academic year and shows that grade retention leads to a substantial increase in the drop-out rate and lower educational attainment 4 or 5 years later.
Jacob and Lefgren (2004) find no consistent differences in the performance of retained versus promoted students in the short run. However, Jacob and Lefgren (2009), who study the long-run effects of retention on high school completion, find positive effects of grade retention on educational attainment for low-achieving third graders but no significant effect for sixth graders. They use a regression discontinuity design strategy based on promotional decisions tied to performance on standardized tests in the Chicago public schools.

3 Some studies have found that retention is associated with increased drop-out rates (see Jimerson et al. (2002) and Roderick (1994), among others). However, as retention decisions are typically made by the teacher or school principal on the basis of a number of unobservable student characteristics (such as maturity or parental involvement), all of these studies are plagued by serious selection concerns.

Because PISA exams are aimed not at evaluating students' curricular knowledge but at assessing their general abilities, the score on this test is a more appropriate measure of the impact of grade retention on educational attainment. To circumvent the identification problem noted above, we suggest using a switching regression approach.4 Our main identification strategy is based on the fact that some variables may affect students' outcomes only through their effect on the probability of repeating a grade. Specifically, we use the student's quarter of birth as an instrumental variable. We argue that this variable affects the probability of grade retention but does not directly affect educational outcomes.

Several important results are found. First, if we consider grade retention as exogenous to the individual unobserved heterogeneity, the effect of grade repetition on Spanish students' PISA outcomes is a reduction of about 80 points out of an average math score of 480.
If we take into account the two different educational processes for repeater and non-repeater students, this figure does not change much. However, once endogeneity is properly controlled for, the retention effect is reduced considerably for repeaters. With our model, we are able to measure the predicted effect of grade retention on those who are actually retained: retention reduces their score by about 56 points. However, if we calculate the potential effect of grade retention on non-retained students, the estimated effect is above 125 points. That is, had they been retained as repeaters, their PISA outcomes would have been reduced by more than twice the observed reduction for repeaters.

Estimation of different types of repetition effects at the primary and secondary levels yields some interesting findings. Those students who were held back during their primary education suffered a larger impact on their educational outcomes compared with those who were retained during secondary school. In contrast, for non-repeaters, repeating a grade of secondary school is estimated to have a larger effect. Moreover, we observe that repeating a grade in both primary and secondary school has a much larger negative effect compared with repeating only one grade in either primary or secondary school. We find that the grade retention effect varies substantially across regions: the retention effect among repeaters is larger in some northern regions (e.g., Castile Leon and La Rioja), which have the best educational outcomes, and is much smaller in the Balearic Islands (-38.5 points) and the Canary Islands (-46.8 points).

4 Many studies use this type of model to analyze different aspects of the labor market. See, among others, García-Pérez and Jimeno (2007), Carrasco (2001) and Prescott and Wilton (1992).

Finally, we decompose the observed difference between repeaters' and non-repeaters' scores into three different components: observed differences in characteristics, differences in the predicted effects of each of these observable characteristics (returns) and differences due to endogenous selection. We find that the observed differences between repeaters and non-repeaters in Spain are essentially explained by different returns to the observed individual, socioeconomic and school characteristics that explain educational outcomes. This component accounts for 89% of the total difference whereas the component due to differences in observed characteristics accounts for only 25%. What is more interesting is that endogenous selection makes observed differences appear 15% smaller than they actually are once we control for self-selection into the groups of repeaters and non-repeaters. Thus, without accounting for such endogeneity in the retaining status, differences between repeaters and non-repeaters would be overestimated. Interestingly, this bias is most important in Catalonia and the Basque Country, the two regions where the percentage of retained students is the lowest among all Spanish regions. Hence, these regions seem to be implementing a slightly different retention policy, although this policy is not reducing the differences between repeaters and non-repeaters. On the contrary, the smaller observed differences in these two regions are due largely to greater self-selection into the two student groups.

Our final result clearly demonstrates the importance of grade retention and the possibility that it is depressing the Spanish average. We perform a counterfactual exercise that shows that the Spanish average score would increase by about 25 points if grade retention were not considered.

The rest of the paper is structured as follows. Section 2 describes the Spanish education system and presents a descriptive analysis based on our PISA 2009 data.
Section 3 explains our methodology and identification strategy. Section 4 presents the results, and finally, Section 5 presents some important concluding comments.

2 Background and Data

We first briefly describe the Spanish education system. The school system is organized into three cycles: primary (grades 1-6), secondary (grades 7-10) and pre-college (grades 11-12). The first two cycles are compulsory (a student can choose to leave school at age 16). In 2009, the reference academic year in our study, the grade retention policy was as follows. At the primary and secondary levels, students could repeat a grade if their performance was deemed insufficient. More specifically, students were required to repeat a grade if they failed three or more subjects. Students can only repeat a grade once during their primary education. In secondary school, they can only repeat the same grade once, and they can only repeat grades twice in total.5 Rules on grade retention are the same in every region in Spain, and as can be observed in Table 1, retention practices are similar throughout all Spanish regions.

In this paper, we use the PISA 2009 sample for Spain, and in particular the data for the regions with enlarged samples.6 The PISA 2009 database provides individual-level information on demographics (e.g., gender, immigration status, month of birth), socioeconomic background (parental education), school-level variables and achievement test scores. We use the math test score as our dependent variable here, as it shows the most variation between retained and non-retained students.7 Every student in the sample was born in 1993 (i.e., they were 15 years old when they took the PISA exams). In Spain, all students born in the same calendar year must enter school in the same academic year, with the 10th grade being the reference grade for 15-year-old students.
Thus, we will call "non-repeater" students those enrolled in grade 10 and "repeater" students those enrolled in lower grades (8th or 9th).8 The total Spanish sample comprises 25,887 students, of whom 8,209 are repeaters. Regional sample sizes are similar to each other, at approximately 1,500 students per region, with the exception of the Basque Country, which includes data from almost 5,000 students.9

Table 1 presents summary statistics for "non-repeater" students (columns 3 and 4) and "repeater" students (columns 5 and 6). We observe that the Canary Islands and Andalusia have the highest percentages of repeaters, at 45.5% and 42.9%, respectively. In addition, the same two regions present the lowest mean test scores for both repeaters and non-repeaters.10 However, the best-performing regions, Castile Leon and Aragon, do not have the lowest percentage of repeaters (we analyze this result in more detail below).

5 The prevailing educational law in 2009 was the 2006 Organic Educational Law (LOE). For more statistics and details on the Spanish educational system, visit http://www.educacion.gob.es/ievaluacion/publicaciones/indicadores-educativos/SistemaEstatal.html.
6 The regions with a representative sample are Andalusia, Aragon, Asturias, Balearic Islands, Canary Islands, Cantabria, Castile Leon, Catalonia, Galicia, La Rioja, Murcia, Madrid, Navarre, Basque Country and Ceuta-Melilla. We refer to the three regions for which no representative sample is available (Extremadura, Castilla-Mancha and Valencia) as "the rest of Spain".
7 The PISA program assesses students' performance in three disciplines: science, math and reading. The PISA 2009 edition focused on reading. Following the OECD's recommended methodology, we use the 5 plausible values and 80 replicate weights described in the PISA Technical Report to calculate each student's educational outcome and the standard errors of the estimated coefficients.
8 This definition is based on questions 1 and 3 of the PISA Student Questionnaire.
9 The PISA sample has a stratified two-stage design. First, schools with 15-year-old students are selected, and second, within each school, individual students are selected. See PISA 2009 Technical Report (2011).
10 Ceuta and Melilla, which participate jointly in PISA, are the cities with the poorest performance (e.g., their average math score is 417). However, because they are small relative to the rest of Spain, we include them in our econometric analysis but do not comment on them when reporting some of our results.

Regarding individual variables, we observe that the proportion of repeaters is higher among males than among females (41.1% and 31.9%, respectively). Nevertheless, females achieve lower test scores than males do. In addition, the proportion of repeaters is larger among younger students (those born in the 3rd and 4th quarters of 1993). The proportion of repeaters is higher among immigrants compared with native students. Finally, the percentage of repeaters increases with the frequency of PC game use and with decreasing computer use.

The socioeconomic variables have the expected relation with grade retention. That is, the number of repeaters is higher among students with low-educated parents.11 We can also observe that the proportion of repeaters decreases with the number of books at home. In addition, the percentage of repeaters is higher among those students whose parents (especially the mother) do not live at home.

Finally, regarding the school-level variables, we find that the number of repeaters is higher in schools with more than 50% female students compared with other schools. School type (ownership) also affects the proportion of repeaters: whereas only 19.5% and 25.6% of students in private schools repeat a grade (in independent and government-dependent schools, respectively), 43.6% of students in public schools do so.12 We also consider parents' pressure on the school and differentiate between schools with a majority of parents demanding very high academic standards and schools with only a minority of parents doing so (or no parents at all). We observe here that the proportion of repeaters is lower in schools where parents exert significant pressure. Class size is also crucial for grade retention. This variable is categorized into two groups based on the median class size of 21 students. Interestingly, the percentage of repeaters is larger in those schools with smaller class size.13

11 A father's education is "high" if he has a secondary or higher education degree and "low" if he has a primary or lower education degree. The same categories hold for mothers' education.
12 Regarding school ownership, we distinguish between public, government-dependent private (i.e., those with a percentage of public funding above 50%) and independent private (i.e., those with a percentage of public funding less than or equal to 50%).
13 There is no clear empirical evidence on the impact of class size. Angrist and Lavy (1999) find that reducing class size induces a significant and substantial increase in test scores. However, Hanushek (1998) finds no significant impact of class size reduction on scores. Lazear (2001) argues that there is no consensus in the literature because class size is a choice variable: schools adapt class size to students' type and behavior.

Table 2 shows the distributions of the explanatory variables across regions. With respect to individual-level variables, we observe some differences regarding the percentages of immigrants: whereas in Madrid and the Balearic Islands more than 15% of students are immigrants, fewer than 6% are immigrants in Andalusia and Galicia. Students from Catalonia use computers more often than students in any other region. However, regional differences in socioeconomic variables are larger. For example, there is a 22.3-percentage-point gap between Madrid, the region with the highest percentage of highly educated mothers, and Andalusia, the region with the lowest percentage. The same gap in fathers' education is 21 percentage points. The region with the fewest students whose parents are highly educated is Andalusia, at only 44.7%. The region with the most parents who are highly educated is Cantabria, at 64%. As we can see in this table, Andalusia has the lowest percentage of students belonging to a household with more than 200 books at home, at only 17.9%, whereas this percentage is 33% in Madrid. The Canary Islands have the highest percentage of students whose mother or father does not live at home (4% and 15.2%, respectively).

Finally, we also observe important differences in the distributions of school-level variables across regions. For example, Spanish regions differ greatly in the percentages of students attending each type of school. The regions with the highest percentages of students in public schools are the Canary Islands and Murcia (between 75% and 80%), whereas in the Basque Country, Catalonia and Madrid this rate is much lower (between 42% and 60%). However, the Basque Country and Catalonia differ significantly in the percentages of students attending private schools: whereas, in the Basque Country, 58% of students attend government-dependent private schools, this percentage in Catalonia is only 21%. The regions with the highest percentage of parents exerting pressure are Catalonia and Madrid (84.5% and 55%, respectively). Finally, Murcia has the largest class size: 66% of students attend classes with 21 students or more.
In Asturias, only 30.2% of students attend such large classes.

2.1 Grade retention and scores

In this section, we provide a preliminary analysis of the relationship between grade retention and PISA test scores. Table 3 shows the mean and several percentiles of the distributions of PISA scores for non-repeater and repeater students. The observed average difference between the two groups is substantial: more than 100 points, not only at the mean level, but also at the three percentiles shown. We distinguish three subgroups within repeater students depending on when they repeated (primary and/or secondary school). Table 3 also displays the means and percentiles for these three types of repeaters: those students who repeated only in primary school (Repeaters_P), those who repeated in both primary and secondary school (Repeaters_PS) and those who repeated only at the secondary level (Repeaters_S).14 As Table 3 indicates, the worst performers are those who repeated a grade at both educational levels.

[Figure 1: Math score histogram by grade retention in subgroups of repeaters. Panels: (a) Non-Repeaters vs Repeaters; (b) Non-Repeaters vs Repeaters_P; (c) Non-Repeaters vs Repeaters_PS; (d) Non-Repeaters vs Repeaters_S. Horizontal axis: Outcomes (0 to 800); vertical axis: Density. Source: PISA 2009.]

Figure 1 displays the histogram of PISA 2009 math scores for each group of students. There is heterogeneity within the complete distribution of scores for both repeaters and non-repeaters. However, what is really interesting is that the distribution of PISA scores for repeaters overlaps that of non-repeaters.
Hence, there are repeater students in our sample who score better than some non-repeaters, most likely because of the effects of observed or unobserved determinants of their performance. Finally, the distributions of scores for repeaters only at the primary or secondary level seem to be more spread out than the distribution of scores for repeaters at both the primary and secondary levels.

14 This definition is based on question 7 of the PISA Student Questionnaire. Note that there is a slight difference between the number of repeaters according to the general definition above (that is, based on questions 1 and 3 of the PISA Student Questionnaire) and the total number of repeaters obtained by adding Repeaters_P, Repeaters_PS and Repeaters_S. We assume this difference to be due to measurement error.

Figure 2 below offers some more evidence about the relationship between grade retention and math score. In this case, we aggregate data at the regional level and compare the percentages of repeaters and their average math scores. We see a negative relationship between these two variables, which is consistent with the descriptive statistics above. The negative slope in panel (a) shows that, in general, those regions with better performance also have fewer repeaters. However, this relationship is not deterministic (e.g., Catalonia has a lower average math score and a lower percentage of repeaters than Castile Leon). In panels (b), (c) and (d), the percentages of repeaters in the three subgroups are plotted.
The negative relationship between scores and percentages of repeaters remains, in particular for those students who repeated a grade only at the secondary level.15

[Figure 2: Relation between percentage of repeaters and average math score across Spanish regions. Panels: (a) Repeaters; (b) Repeaters_P; (c) Repeaters_PS; (d) Repeaters_S. Horizontal axis: Average score (420 to 520); vertical axis: Percentage of repeaters (0 to 0.5); points are labeled by region. Source: PISA 2009.]

15 Notice that the percentage of repeaters is the sum of the percentages of repeaters in each subgroup (primary only, primary and secondary, secondary only). Thus, the slope in panel (a) is the sum of the slopes in (b), (c) and (d).

3 Methodology

3.1 The empirical model

In this paper, we study the effect of grade retention on test scores. Prior studies have attempted to study this effect by estimating the following basic model:

$$ y_i = \alpha I_i + X_i \beta + u_i + \varepsilon_i, \qquad (1) $$

where $y_i$ is student achievement, $X_i$ is a vector of individual, socioeconomic and school variables and $I_i$ is a binary variable that takes the value one if the student is retained and zero otherwise; $u_i$ represents unobserved student ability and $\varepsilon_i$ is the error term.

Several comments can be made here. First, observe that general tests that evaluate students' knowledge in some particular discipline may not be appropriate for studying grade retention. As repeaters are enrolled in lower grades, they have completed a less advanced curriculum and thus have a lower expected score.
In this sense, we believe that the PISA test is a proper one, as it does not aim to evaluate students’curricular knowledge but their general abilities.16 Second, note that if students are selected into retention on the basis of factors that are unobservable and that in‡ uence educational outcomes (e.g., parental e¤ort or a course-speci…c curriculum), then the estimation of is likely to be biased. Observe that being a repeater is due to low scores in previous years. Hence, di¤erences between repeaters and non-repeaters are not only due to grade retention. Indeed, repeaters may have di¤erent characteristics that in‡ uence their own educational attainment. More speci…cally, our initial hypothesis is that students who do not pass are those with the worst learning characteristics. To the extent that these characteristics are unobservable, estimated di¤erences in educational outcome between repeaters and non-repeaters may be biased under OLS. The typical approach to dealing with this endogeneity problem is using instrumental variables techniques. Note that this approach implies imposing equal e¤ects on the rest of the regressors in the educational outcome equations (see Equations (4) and (5) below) for both repeaters and non-repeaters. However, we believe that there must be other di¤erences between these two groups besides a change in the levels PISA assesses the extent to which students near the end of their compulsory education have acquired some of the knowledge and skills that are essential for full participation in modern societies. PISA seeks not only to assess whether students can reproduce knowledge but also to examine how well they can extrapolate from what they have learned and apply it in unfamiliar settings both in and outside of school (see PISA 2009 Report). 16 11 of such outcomes. 
To address this issue, we estimate a switching regression model (SRM). The SRM allows unbiased estimation of the model coefficients, controls for endogenous selection into repeaters and non-repeaters, and allows the variables included in the model to have different effects for each group. As usual, we estimate this model by maximum likelihood.17 We specify the probability of repeating as a function of student characteristics. This probability acts as the selection equation in the switching regression model for repeaters' and non-repeaters' scores. In this model, the selection mechanism is described through a latent variable denoted by $I_i^*$, with the following process:

$$I_i^* = \gamma Z_i + e_i, \tag{2}$$

where $Z_i$ is a vector of specific explanatory variables that describes the determinants of the selection process, $\gamma$ is the corresponding vector of unknown parameters, and $e_i$ is the random component of the selection equation, which includes unobservable variables that could be correlated with the observable and unobservable characteristics in the educational outcomes equations below. However, we only observe the realization of this latent variable, $I_i$, as follows:

$$I_i = \begin{cases} 1 & \text{iff } I_i^* > 0, \\ 0 & \text{iff } I_i^* \le 0; \end{cases} \tag{3}$$

that is, $I_i$ is an indicator variable that equals 1 if the student repeats and 0 otherwise. Furthermore, as explained above, we will consider a different equation for each group of students: repeaters, $y_{Ri}$, and non-repeaters, $y_{NRi}$:

$$y_{Ri} = X_i \beta_R + u_{Ri}, \tag{4}$$
$$y_{NRi} = X_i \beta_{NR} + u_{NRi}. \tag{5}$$

17 As the error term of each student's score equation is correlated with the error term of the selection equation, the estimation of the educational outcome equations by OLS would be inconsistent. Furthermore, full maximum likelihood is more efficient than the two-step estimation method proposed by Heckman (1979).

We will refer to the previous two equations as the educational outcomes equations. We allow for endogeneity in the selection equation by assuming that $e_i$, $u_{Ri}$ and $u_{NRi}$ have a trivariate normal distribution with mean zero and the following covariance matrix:

$$\Omega = \begin{bmatrix} \sigma_R^2 & \sigma_{R,NR} & \sigma_{e,R} \\ & \sigma_{NR}^2 & \sigma_{e,NR} \\ & & \sigma_e^2 \end{bmatrix}, \tag{6}$$

where $\sigma_e^2$ denotes the variance of the error term in the selection equation (2) and $\sigma_R^2$ and $\sigma_{NR}^2$ are the variances of the error terms in the educational outcome equations (4) and (5), respectively. Finally, we denote by $\sigma_{e,R}$ and $\sigma_{e,NR}$ the covariances between $u_{Ri}$ and $e_i$ and between $u_{NRi}$ and $e_i$, respectively. These terms capture the correlation between the probability of grade retention and the educational attainment of repeaters and non-repeaters, respectively. The interpretation of these terms is as follows. If, for example, $\sigma_{e,R} < 0$, then there exists a negative relationship between the unobserved variables that make a student more likely to repeat and the unobserved characteristics that increase a repeating student's test score. That is, those factors that make a student more likely to fail also make a repeater obtain a worse test score. On the contrary, if $\sigma_{e,R} > 0$, then what makes a student more likely to repeat also makes a repeater have a better educational result.18 Finally, if $\sigma_{e,R} = 0$, then there is no correlation between the errors of the selection equation and the educational attainment of repeaters. The interpretation of $\sigma_{e,NR}$ is similar.

18 For example, if the experience of repeating makes the student's subsequent effort increase, we may observe a higher PISA score among repeaters compared with the counterfactual of what would have happened had the student not repeated.

We denote by $\rho_j$, for $j = R, NR$, the correlation coefficient between $e_i$ and $u_{Ri}$ (for $j = R$) or $u_{NRi}$ (for $j = NR$). These two coefficients are jointly estimated with the rest of the parameters in the model, and their interpretation is analogous to that of $\sigma_{e,R}$ and $\sigma_{e,NR}$. Hence, given the assumption about the distribution of error terms, the log-likelihood function of the equations system (4) and (5) to maximize is:

$$\ln L = \sum_i \left\{ I_i \left[ \ln \Phi\!\left( \frac{\gamma Z_i + \rho_R u_{Ri}/\sigma_R}{\sqrt{1-\rho_R^2}} \right) + \ln\!\left( \frac{\phi(u_{Ri}/\sigma_R)}{\sigma_R} \right) \right] + (1 - I_i) \left[ \ln\!\left( 1 - \Phi\!\left( \frac{\gamma Z_i + \rho_{NR} u_{NRi}/\sigma_{NR}}{\sqrt{1-\rho_{NR}^2}} \right) \right) + \ln\!\left( \frac{\phi(u_{NRi}/\sigma_{NR})}{\sigma_{NR}} \right) \right] \right\}, \tag{7}$$

where $\Phi(\cdot)$ is the cumulative distribution function of the selection process conditional on educational scores, and $\phi(\cdot)$ is the density function of educational scores.

Now observe that we can obtain unconditional and conditional educational score predictions. The unconditional educational score is defined as the average predicted score for students with average unobserved characteristics, $\mu^1$ and $\mu^0$ for repeaters and non-repeaters, respectively. That is:

$$\mu^1 = \bar X^1 \beta_R, \tag{8}$$
$$\mu^0 = \bar X^0 \beta_{NR}, \tag{9}$$

where $\bar X^1$ and $\bar X^0$ denote the average observed characteristics for repeater and non-repeater students, respectively. The conditional scores, $\hat y_R^1$ and $\hat y_{NR}^0$, represent the mean predicted score for each student type, that is, from Equations (2) to (9):

$$\hat y_R^1 = E(y_R \mid I = 1) = E(y_R \mid I^* > 0) = \mu^1 + \sigma_{e,R} \frac{\phi(z\gamma)}{\Phi(z\gamma)}, \tag{10}$$
$$\hat y_{NR}^0 = E(y_{NR} \mid I = 0) = E(y_{NR} \mid I^* \le 0) = \mu^0 - \sigma_{e,NR} \frac{\phi(z\gamma)}{1 - \Phi(z\gamma)}. \tag{11}$$

We will use these two expressions to break down the educational gap between repeater and non-repeater students. Using Equations (8) to (11), we can decompose this gap as follows:

$$\hat y_{NR}^0 - \hat y_R^1 = (\bar X^0 - \bar X^1)\beta_{NR} + (\beta_{NR} - \beta_R)\bar X^1 - \left[ \sigma_{e,NR} \frac{\phi(z\gamma)}{1 - \Phi(z\gamma)} + \sigma_{e,R} \frac{\phi(z\gamma)}{\Phi(z\gamma)} \right]. \tag{12}$$

The first term on the right-hand side of the equation above corresponds to the observed differences in characteristics, the second term measures differences in the predicted effect of each of these observable characteristics (returns) and the third term corresponds to differences due to endogenous selection. In Section 4.2 below, we estimate each of these components.
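To make the structure of the log-likelihood in Equation (7) concrete, the following is a minimal Python sketch; it is not the estimation code used in this paper. The function name `srm_loglik`, the symbol `gamma` for the selection coefficients, the tuple layout of `toy_data` and all numeric values are illustrative assumptions (the two correlations are simply set to 0.31 and -0.22, the values reported in Section 4.2). Only the formula itself follows (7).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # cdf is the standard normal CDF, pdf its density


def srm_loglik(data, gamma, beta_r, beta_nr, sigma_r, sigma_nr, rho_r, rho_nr):
    """Switching regression log-likelihood, following Equation (7).

    data holds tuples (repeated, score, x, z): repeated is 1 for repeaters
    and 0 otherwise; x and z are regressor lists for the outcome and
    selection equations. All argument names are illustrative.
    """
    total = 0.0
    for repeated, score, x, z in data:
        index = sum(g * zk for g, zk in zip(gamma, z))  # selection index
        if repeated:
            beta, sigma, rho = beta_r, sigma_r, rho_r
        else:
            beta, sigma, rho = beta_nr, sigma_nr, rho_nr
        resid = score - sum(b * xk for b, xk in zip(beta, x))
        # Selection index conditional on the observed outcome residual.
        eta = (index + rho * resid / sigma) / math.sqrt(1.0 - rho ** 2)
        prob_repeat = STD_NORMAL.cdf(eta)
        regime_prob = prob_repeat if repeated else 1.0 - prob_repeat
        density = STD_NORMAL.pdf(resid / sigma) / sigma
        total += math.log(regime_prob) + math.log(density)
    return total


# Hypothetical toy data: (repeated, score, x, z). These are not PISA records.
toy_data = [
    (1, 430.0, [1.0, 0.0], [1.0, 1.0]),
    (0, 525.0, [1.0, 1.0], [1.0, 0.0]),
    (0, 505.0, [1.0, 0.0], [1.0, 0.0]),
]
params = dict(gamma=[-0.4, 0.5], beta_r=[420.0, 15.0], beta_nr=[500.0, 20.0],
              sigma_r=60.0, sigma_nr=70.0, rho_r=0.31, rho_nr=-0.22)
loglik = srm_loglik(toy_data, **params)
print(round(loglik, 3))
```

In practice this function would be passed (with a sign change) to a numerical optimizer. When both correlations are zero, the expression collapses to a probit log-likelihood plus two separate normal regression log-likelihoods, which is the case in which estimating (4) and (5) separately would be appropriate.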
In addition, we may compute the following conditional scores:

$$\hat y_R^0 = E(y_R \mid I = 0) = E(y_R \mid I^* \le 0) = \bar X^0 \beta_R - \sigma_{e,R} \frac{\phi(z\gamma)}{1 - \Phi(z\gamma)}, \tag{13}$$
$$\hat y_{NR}^1 = E(y_{NR} \mid I = 1) = E(y_{NR} \mid I^* > 0) = \bar X^1 \beta_{NR} + \sigma_{e,NR} \frac{\phi(z\gamma)}{\Phi(z\gamma)}. \tag{14}$$

These counterfactuals allow us to compute the grade retention effect for repeaters, $GRE^1$, and for non-repeaters, $GRE^0$, as follows:

$$GRE^1 = \hat y_R^1 - \hat y_{NR}^1, \tag{15}$$
$$GRE^0 = \hat y_R^0 - \hat y_{NR}^0. \tag{16}$$

In Section 4, we estimate the grade retention effect as measured by the previous expressions to understand the effects of self-selection into repeaters and non-repeaters.

3.2 Identification

Our model will be identified once we allow for different regressors in each equation of the switching model (see Maddala (1988)). Identification is also possible due to the assumptions about the joint normal distribution of the three error terms ($u_{Ri}$, $u_{NRi}$ and $e_i$). Nonetheless, we also identify the model by considering instrumental variables. The assumption now is that these instruments have an impact on the propensity of grade retention, but they do not directly affect a student's PISA score. Hence, our specification will allow identification of the model by introducing variables in the selection equation (2) that are significant for explaining the probability of repeating but are mostly uncorrelated with the student's scores (Equations (4) and (5)).

Following the existing literature, we choose students' quarter of birth as an instrument. The quarter of birth is generally assumed to have an impact on pre-primary and primary test scores and thus on grade retention. Bedard and Dhuey (2006), among others, show that the relative age of a child in his or her class does not have a significant long-term impact by itself, but that most of the effect of relative age comes from programs such as grade retention and selection of pupils into different grades. Indeed, during the very first days of school, relative age is quite important because the oldest students may be much more mature than the youngest ones.
As relative maturity is likely to be an important determinant of achievement during the early grades, it may play a crucial role in the decision of grade retention.

The remainder of this section is aimed at showing data supporting this instrument. We base our argument on two kinds of analyses. First, we show that unconditional analyses in Table 1 and Figure 3 below give us some reason to choose this variable for the identification of the model. Second, we report conditional analyses that clear up any doubts about the appropriateness of our instrument.

As we can observe in Table 1, the average test score for non-repeater students does not vary significantly with the quarter of birth (it ranges from 518 for those born in Q3 to 522 for those born in Q4). The same can be said for repeater students. However, the probability of repeating varies greatly with the student's quarter of birth (ranging from 30% for those born in Q1 to 43% for those born in Q4). These two results are required for an appropriate instrument. Figures 3(a) and 3(b) below show the average math score by quarter of birth, with Figure 3(b) differentiating between repeaters and non-repeaters. Figure 3(c) displays the percentage of repeaters by quarter of birth. As can be observed in Figure 3(a), the quarter of birth affects math scores if we do not differentiate according to retention status. Those students who were born in the fourth quarter scored more than 10 points lower than students who were born in the first quarter. However, once we distinguish between repeaters and non-repeaters (see Figure 3(b)), the quarter of birth shows almost no effect on scores. Finally, as can be observed in Figure 3(c), the quarter of birth has an important effect on the propensity to repeat a grade.
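The composition argument above can be checked with a back-of-the-envelope calculation. The sketch below takes the quarter-of-birth rows of Table 1 and recombines the group averages into a pooled average per quarter. It assumes that the percentages in Table 1 can be used directly as within-quarter weights, which ignores any sampling weights behind Figure 3(a); the variable and function names are illustrative.

```python
# Quarter-of-birth rows of Table 1 (PISA 2009, Spain): percentage of repeaters
# and average math scores of non-repeaters and repeaters.
table1_quarters = {
    "Q1": {"pct_rep": 30.5, "score_nonrep": 520.9, "score_rep": 421.5},
    "Q2": {"pct_rep": 33.8, "score_nonrep": 521.9, "score_rep": 417.0},
    "Q3": {"pct_rep": 38.1, "score_nonrep": 518.0, "score_rep": 419.2},
    "Q4": {"pct_rep": 43.3, "score_nonrep": 522.3, "score_rep": 418.5},
}


def pooled_score(row):
    """Average score of all students in a quarter, weighting by retention share."""
    share_rep = row["pct_rep"] / 100.0
    return share_rep * row["score_rep"] + (1.0 - share_rep) * row["score_nonrep"]


pooled = {quarter: pooled_score(row) for quarter, row in table1_quarters.items()}
gap_q1_q4 = pooled["Q1"] - pooled["Q4"]

# Largest between-quarter difference within each group, for comparison.
nonrep_scores = [row["score_nonrep"] for row in table1_quarters.values()]
rep_scores = [row["score_rep"] for row in table1_quarters.values()]
nonrep_range = max(nonrep_scores) - min(nonrep_scores)
rep_range = max(rep_scores) - min(rep_scores)

for quarter, score in pooled.items():
    print(quarter, round(score, 1))
print("Q1 minus Q4, pooled:", round(gap_q1_q4, 1))
```

Under this assumption, the pooled Q1 to Q4 gap exceeds 10 points while the within-group ranges stay below 5 points, so the pooled gap is driven almost entirely by the changing share of repeaters across quarters.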
Figure 3: Instrument: Quarter of birth. Panels: (a) average math scores by quarter of birth, all students; (b) average math scores by quarter of birth, repeaters and non-repeaters; (c) percentage of repeaters by quarter of birth. Source: PISA 2009.

To further check the robustness of our instrument, we perform an additional conditional analysis. We estimate the impact of the instrumental variable on students' scores once we control for all explanatory variables in our empirical model, together with an indicator of whether the student is a repeater. Quarter of birth is introduced through two dummy variables that allow us to estimate the effect of being born in the third and fourth quarters with respect to a reference student born in the first or second quarter. Coefficient t-tests indicate that neither of these dummies is a significant predictor of PISA test scores once we consider in the same equation whether the student is a repeater. The coefficient and standard deviation are -2.02 and 2.19, respectively, for students born in the 3rd quarter and 0.74 and 2.35, respectively, for students born in the 4th quarter. Hence, we can conclude that quarter of birth can be used as an instrument, enabling us to identify our structural model.

4 The results

In this section, we first comment on the results regarding the explanatory variables of the probability of grade retention. Second, we elaborate on the impact of grade retention on PISA test scores, show the educational outcome equations' estimation and decompose differences between repeaters and non-repeaters.

4.1 The probability of grade retention

With respect to the regional variables, the coefficients of the Probit and SRM models are very similar, although some differences emerge in, for example, the significant effect of regions such as Galicia, Catalonia or the Basque Country when compared with the Canary Islands in the SRM.19

Regarding the individual variables, we find that our instrumental variable proposed above is a significant predictor of the probability of repeating. Observe that the probability of grade retention increases with the student's quarter of birth. Most socioeconomic and school variables in the selection equation are significant. The probability of repeating is negatively related to being female, the frequency of computer use for homework, parental education, the number of books at home and attending a government-dependent private school. A high probability of repeating is also related to being an immigrant, playing PC games very often, having a parent who does not live at home or going to a school with a majority of girls. Regarding class size, we find that increasing class size has a positive impact on the probability of being promoted, but this positive effect diminishes with class size. In particular, we find that the optimum class size in terms of minimizing the probability of repeating is about 30 students. Above that figure, the probability of repeating increases. Finally, observe that the coefficients of the Probit and the SRM models are very similar.
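An optimum class size of this kind arises when class size enters the selection equation with both a linear and a squared term. The short sketch below shows the arithmetic under that assumption; the quadratic form, the function name and the two coefficients are hypothetical and are chosen only so that the turning point lands near the 30 students mentioned above. They are not the estimates in Table 4.

```python
def turning_point(linear_coef, squared_coef):
    """Class size at which the effect a*s + b*s**2 changes direction."""
    if squared_coef == 0:
        raise ValueError("a squared term is needed for a turning point")
    return -linear_coef / (2.0 * squared_coef)


# Hypothetical coefficients on class size and its square in the index that
# drives the probability of repeating (illustrative values only).
HYP_LINEAR = -0.060
HYP_SQUARED = 0.001

optimum = turning_point(HYP_LINEAR, HYP_SQUARED)
# A positive squared coefficient means the index is convex, so the turning
# point is the class size that minimizes the probability of repeating.
is_minimum = HYP_SQUARED > 0
print(round(optimum, 1), is_minimum)
```

With a negative linear and a positive squared coefficient, larger classes first lower and then raise the probability of repeating, which is the pattern described in the text.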
Moreover, the negative impacts on the probability of repeating of the frequency of computer use for homework and of attending a private government-dependent school, and the positive effect of being born in the 4th quarter, having a mother who does not live at home or attending a school with a majority of girls become stronger once we control for the endogeneity of grade retention. In contrast, being born in the 3rd quarter and attending a private independent school become a bit less significant.

19 The reference student in both equations is a male from the Canary Islands, a native of Spain, born in the first or second quarter of the year, with low frequency of using a computer for homework and games, whose mother and father are low educated and living at home, with fewer than 26 books at home. Regarding the school variables, the reference student attends a public school with a minority of boys and low parental pressure.

4.2 The effect of grade retention on scores

The main objective of our study is to estimate the effect of grade retention on educational attainment. Our estimation strategy (SRM) allows us to estimate two different grade retention effects: one for repeaters and another for non-repeaters. This model takes into account that these two groups of students may have different unobserved characteristics that may bias the estimation if they are not correctly controlled for. In this model, we assume that the educational production functions for the two groups of students differ. In addition to the SRM, we estimate two models to compare the estimation of the grade retention effect when endogeneity of selection into repeaters and non-repeaters is not considered: OLS(a) and OLS(b). The former consists of OLS estimation of two different educational outcome equations, as in (4) and (5), but without controlling for selection bias. The latter is an OLS estimation of just one educational outcome equation, which is the same for both groups of students as in (1).
Finally, we have also estimated an IV model that controls for endogeneity based on the OLS(b) specification.

Before focusing on grade retention effects, we report the main results regarding the explanatory variables in the models. Table 5 shows the effects of individual, family and school variables on educational outcomes according to SRM, OLS(a), OLS(b) and IV. Our SRM results show that high educational achievement in math is found among males, natives, those who frequently use a computer for homework, those with highly educated parents, those with a large number of books at home and those attending a school where a majority of the students are girls. We find that class size has a positive effect on math scores, but with decreasing returns. Specifically, we find that the optimum class size to maximize students' math scores is 25 students for non-repeaters and 22 for repeaters. This finding is consistent with the existing literature on class size (see footnote 12 above). We also find some important differences in the impact of the explanatory variables on educational achievement between repeaters and non-repeaters. For example, being an immigrant is much less favorable for non-repeaters than for repeaters. The impact of the number of books at home is also much larger for non-repeaters.

In Table 5, we also compare the SRM estimation with these three models and find similar results. However, we find some important differential effects for variables such as immigrant status, parents' educational status or class size. For example, the optimal class size in both OLS models is about 24 students, for both repeaters and non-repeaters, whereas the SRM, as emphasized above, predicts a smaller optimal class size for repeaters than for non-repeaters. The different results found using these models are not surprising.
The SRM implies estimating two correlation coefficients between the unobservable factors that affect each of the two educational outcomes and the unobserved variables that affect the probability of repeating a grade. Specifically, we get a positive estimate of $\rho_R$ (0.31) and a negative (and significant) value for $\rho_{NR}$ (-0.22), as can be seen at the bottom of Table 5.20 The intuition behind $\rho_{NR} < 0$ could be that potential non-repeaters may have unobservable characteristics that make them perform better than potential repeaters when they are promoted. A consequence of this will be that the negative grade retention effect is bigger for non-repeaters than for repeaters. Thus, this result is capturing the impact of students' unobserved ability. Although not significant in this specification, we consider it useful to interpret the result of $\rho_R > 0$, which has emerged as significant in some of the models explained in footnote 20. This sign may mean that repeaters have unobservable characteristics that also make them perform better in case they must repeat a grade. As a result, the negative effect of grade retention will be lower for a repeater than for a non-repeater. The intuition behind this result can be found in parental interest and students' effort. Namely, those students who must repeat recruit greater support from their parents, improving educational attainment.

To estimate the grade retention effect, we use several models. Table 6 shows the results. First, OLS(b) estimates a unique and linear grade retention effect on PISA scores (coefficient in equation (1)) without allowing for endogenous selection into repeater and non-repeater groups. The estimated effect of repeating is equal to -80.4 points (see column 4 in Table 6). If we control for the endogeneity of repeating in the outcome equation but assume a unique education process for repeaters and non-repeaters (IV), we find that repetition diminishes PISA scores by 73.4 points (see column 5 in Table 6).
Next, we relax the hypothesis of one educational equation and instead assume that repeaters and non-repeaters have a different education production process (OLS(a)), finding that the grade retention effect for repeaters (-78.9) is slightly lower than that for non-repeaters (-85.7). Finally, we comment on the SRM results, which differ greatly from those described above. A huge negative impact of grade retention on scores is estimated for both repeaters and non-repeaters. Nonetheless, the retention effect for the latter (-125.1) is more than double the repetition impact for the former (-56.2). This result is a direct consequence of the correlation between the unobservable factors in the selection and the outcome equations.

20 The results regarding $\rho_{NR}$ and $\rho_R$ are robust to other sets of instrumental variables. For example, we explore the validity of adding two instruments to the quarter of birth instrument: whether the student's mother and/or father does not live at home and the frequency of playing computer games. We claim here that a student's father or mother may not live at home because of a previous parental death or divorce, which, according to the existing literature, does not negatively affect teenagers' cognitive skills (see Sanz de Galdeano and Vuri (2007)), such as the skills measured in PISA scores. However, parental death or divorce may affect a student's probability of repeating a grade by the time it occurs. Finally, we consider computer games as another instrument, as this instrument is not significant in explaining the PISA test scores (see Table 5), but it has a huge impact on the propensity of repeating a grade (see Table 1 and Table 4).

To confirm this connection, suppose that two students, one with the average characteristics of a repeater and another one with the average characteristics of a non-repeater, do not repeat a grade.
Then, the negative sign of $\rho_{NR}$ implies that the average non-repeater has better unobservable factors (e.g., ability), which allows her to perform better than the average repeater. Now suppose that these two students have been retained; then the positive sign of $\rho_R$ implies that the student with the repeater characteristics has unobservable factors (e.g., parental effort or support) that lead her to achieve better results than the student with non-repeater characteristics. Thus, both $\rho_{NR} < 0$ and $\rho_R > 0$ imply that a repeater student loses little as a result of being retained in comparison with a non-repeater.

Columns 6-8 of Table 6 show that retention effect estimates depend on the type of repetition to which a student has been subjected. The effect is highly negative regardless of whether repetition occurs at the primary or secondary level. However, two interesting features arise. First, we estimate the impact of repeating at both educational levels, primary and secondary, to be much larger than the effect of repeating only once, at either the primary or secondary level. These differences account for more than 37 and 27 points for repeaters and non-repeaters, respectively. As we find that repeating one grade has a negative impact on scores, this result is not surprising. Second, our results show that the effect of repeating at the secondary level is slightly different from that of repeating at the primary level. In particular, secondary repeaters lose 10 fewer points than primary repeaters because of the repetition. Nonetheless, if a non-repeater were subjected to grade retention, the impact on his or her score would be 9 points larger if the retention happened during secondary school.

Table 7 shows the grade retention effect for students in every region in our sample. Similarly to the pooled sample, in every region the effect of grade retention on scores is highly negative and larger for non-repeaters than for repeaters.
However, we note some interesting differences across regions: the Balearic Islands, with a grade retention effect of -39 and -109 points for repeaters and non-repeaters, respectively, shows a markedly different result from the best-performing regions, such as La Rioja, whose corresponding figures are -67 and -139, and from the pooled figures for all of Spain (-56 and -125 points for repeaters and non-repeaters, respectively; that is, 10% and 24% of the non-repeating outcome in each case).

Finally, we estimate the three different components in Equation (12) to decompose the estimated differences between repeaters and non-repeaters. Table 7 reports the percentage of observed differences between repeaters and non-repeaters due to each of these three components. This exercise is done for the whole sample and for each separate region. The results are shown in the bottom panel of Table 7. If we analyze this decomposition for the national sample, we can see that the majority of the differences, 89%, is due to differences in the coefficients, that is, in the predicted effect of the observable characteristics for each student group. Hence, repeaters obtain a worse score than non-repeaters mainly because the observable variables have a stronger impact on outcomes for non-repeaters. With respect to differences in observable characteristics, we find that they account for about one-fourth of the observed differences. What is more interesting in the context of our analysis is the negative sign of the endogenous selection component (-15%). This finding means that, if endogenous selection were not allowed for, existing differences between repeaters and non-repeaters would be overestimated.
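The three components of Equation (12) can be sketched in a few lines of Python. This is an illustration of the accounting identity only, not the code behind Table 7: the function `decompose_gap`, the argument names and every input value are hypothetical, and a single selection index is used in both inverse Mills ratio terms, as Equation (12) is written.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def decompose_gap(x_nonrep, x_rep, beta_rep, beta_nonrep,
                  cov_e_rep, cov_e_nonrep, index):
    """Split the non-repeater minus repeater gap into the three terms of (12)."""
    pdf, cdf = STD_NORMAL.pdf(index), STD_NORMAL.cdf(index)
    mills_rep = pdf / cdf              # selection correction for I = 1
    mills_nonrep = pdf / (1.0 - cdf)   # selection correction for I = 0
    characteristics = dot([a - b for a, b in zip(x_nonrep, x_rep)], beta_nonrep)
    returns = dot([bn - br for bn, br in zip(beta_nonrep, beta_rep)], x_rep)
    selection = -(cov_e_nonrep * mills_nonrep + cov_e_rep * mills_rep)
    return {"characteristics": characteristics, "returns": returns,
            "selection": selection}


# Hypothetical inputs: an intercept plus one regressor. Not the paper's estimates.
inputs = dict(x_nonrep=[1.0, 0.6], x_rep=[1.0, 0.3],
              beta_rep=[400.0, 40.0], beta_nonrep=[480.0, 60.0],
              cov_e_rep=10.0, cov_e_nonrep=-15.0, index=-0.3)
terms = decompose_gap(**inputs)
gap = sum(terms.values())
shares = {name: 100.0 * value / gap for name, value in terms.items()}
print(terms)
print(shares)
```

The shares always add up to 100% of the gap between the two conditional scores, so a negative selection share, as in the national sample, must be offset by the other two components summing to more than 100%.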
As shown in Table 7, this pattern is the same in every region, although Catalonia and the Basque Country are the two regions whose coefficients of observable characteristics explain the most, almost 100% of the differences, and also the ones with the highest figures for endogenous selection (-26% and -24%, respectively). This finding is interesting given that precisely these two regions have the lowest grade retention rates among Spanish regions. We can conclude from this analysis that their low retention rates are not reducing the degree of self-selection into the repeater and non-repeater student groups. On the contrary, those regions with fewer repeaters seem to have more selection into both groups of students, indicating that it is even more important to control for such selection issues to properly measure the retention effect in these two regions and to compare them with other regions in Spain.

To conclude, we pose the following question: how would the mean score for the whole sample change if there were no grade retention policy at all in Spain? We construct a counterfactual based on SRM estimations to handle this issue. The actual average math score, $\bar y^A$, is computed as the weighted average of actual repeaters' and non-repeaters' scores, that is:

$$\bar y^A = E(y_R \mid I = 1)\, P_R + E(y_{NR} \mid I = 0)\, P_{NR}, \tag{17}$$

where $P_R$ and $P_{NR}$ denote the percentages of repeaters and non-repeaters in the sample. Assume now that there is no grade retention policy in place. Then, we can compute a counterfactual for the average math score, the predicted average math score $\bar y^P$, where we introduce the expected score for repeaters had they not repeated, that is:

$$\bar y^P = E(y_{NR} \mid I = 1)\, P_R + E(y_{NR} \mid I = 0)\, P_{NR}. \tag{18}$$

The results for Spain are $\bar y^A = 491.5$ and $\bar y^P = 515.2$. That is, by eliminating the grade retention policy, the PISA score would be about 24 points higher.
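The two averages in Equations (17) and (18) differ only in the score assigned to repeaters, so their difference equals the share of repeaters times the score that repeaters lose by being retained. The sketch below illustrates this with hypothetical round numbers and then applies the same identity to the reported figures. The function name and the example inputs are assumptions, and the implied share of repeaters is a back-of-the-envelope value, not a figure reported in the paper; it relies on reading the pooled effect for repeaters (-56.2) as the difference between the two conditional scores of repeaters.

```python
def average_scores(rep_actual, rep_if_promoted, nonrep_actual, share_rep):
    """Actual and no-retention average scores, following Equations (17)-(18)."""
    share_nonrep = 1.0 - share_rep
    actual = rep_actual * share_rep + nonrep_actual * share_nonrep
    predicted = rep_if_promoted * share_rep + nonrep_actual * share_nonrep
    return actual, predicted


# Hypothetical example: 40% repeaters who would score 50 points more if promoted.
actual, predicted = average_scores(rep_actual=400.0, rep_if_promoted=450.0,
                                   nonrep_actual=500.0, share_rep=0.4)
example_gain = predicted - actual  # equals 0.4 * 50

# Figures reported in the text for Spain.
Y_ACTUAL = 491.5        # actual average math score
Y_PREDICTED = 515.2     # predicted average without grade retention
GRE_REPEATERS = -56.2   # SRM grade retention effect for repeaters

gain = Y_PREDICTED - Y_ACTUAL
# Share of repeaters implied by gain = share * (-GRE); not a reported figure.
implied_share_rep = gain / -GRE_REPEATERS
print(round(example_gain, 1), round(gain, 1), round(implied_share_rep, 3))
```

The implied share of repeaters is a little above 40%, which is in line with the retention rate for Spain quoted in the Introduction.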
Note that we are implicitly assuming that, in the no-grade-retention scenario, students' behavior does not change (for both repeaters and non-repeaters); that is, they exert the same effort, and thus, their score does not diminish. However, we have no clear evidence about the effect of the grade retention policy on student effort. Indeed, it is difficult to believe that this effect could exceed the positive effect of removing the grade retention policy estimated in this study. Nonetheless, this is a strong assumption, and thus, we try to relax it.

Suppose that we could measure students' effort using students' duration of self-study at home.21 Then, we could try to determine the impact of this variable on educational outcomes and check whether a change in the duration of self-study would be enough to counteract the benefits we estimate. We are able to do this because students were asked about their self-study time in PISA 2006 (unfortunately, this variable is not present in PISA 2009). In Figure 4, we can see that scores increase only slightly with weekly hours of self-study. In fact, there is a certain amount of self-study above which educational outcomes do not vary or even worsen.22 If we assume this to be the effect of effort, it is difficult to believe that a change in student behavior could balance the benefits of removing the grade retention policy. Hence, we support the idea that if the grade retention policy were eliminated, the Spanish average score would increase, as measured above, for 15-year-old students.

5 Concluding Remarks

Our results show that grade retention has a substantial negative impact on educational outcomes as measured by the PISA program. In addition, we find that this negative effect is bigger for non-repeaters than for repeaters (-24% vs. -10% of the non-repeater average score). That is, had they been retained as repeaters, non-repeaters' PISA outcomes would have been reduced by more than twice the reduction observed for repeaters.
In other words, grade retention improves the quality of the match between the school and the student. Moreover, the type of grade repetition changes little: the impact is highly negative for both repeaters and non-repeaters, whether it occurs during primary or secondary education. Our results show that if a student was retained at the primary level, she will suffer a causal decrease in her performance, but this situation could be even worse if this student was subjected to a second grade retention in secondary school.

21 Observe that there are two implicit assumptions in this exercise that are worthy of mention: first, the underlying criterion to analyze the optimal grade retention policy is utilitarianism, and second, students' utility is linear in effort.

22 We think that the inverted-U shape in this graph could demonstrate that students who study frequently are those with more learning difficulties. As students' effort, measured by self-study time, may be endogenous to educational achievement, the impact of this factor should be estimated by the appropriate technique (i.e., Instrumental Variables). However, we will not discuss the impact of students' effort on achievement here as it is not the focus of the paper.

Figure 4: Math scores in PISA 2006 by hours of self study in math. Average math scores against weekly hours of self study in math (0 h, <2 h, 2-4 h, 4-6 h, >6 h), for repeaters and non-repeaters. Source: PISA 2006.

Finally, we decompose the observed difference between Spanish repeaters' and non-repeaters' scores. We find that the observed differences between these two groups are essentially explained by the different returns to the observed individual, family and school characteristics that explain educational outcomes. This component accounts for about 89% of the total difference whereas the component due to differences in such observed characteristics is only 25%.
What is more interesting is that endogenous selection makes observed differences appear 15% lower than they actually are. Thus, if such endogeneity in the retention status were not considered, differences between repeaters and non-repeaters would be overestimated. Interestingly, this bias is most important in Catalonia and the Basque Country, the two regions where grade retention is the lowest among all Spanish regions. Hence, these regions' slightly different retention policy does not seem to decrease the differences between repeaters and non-repeaters. On the contrary, the smaller observed differences in these two regions are due largely to increased self-selection into both student groups.

Several extensions of this work are possible. An important future study could perform a cost-benefit analysis regarding the grade retention policy. The cost of grade retention includes any additional years of schooling provided to students who are held back.23 Another interesting extension could be to study the long-run effects of a grade retention policy, for example, by considering its impact on drop-out rates, college attendance and job-market results.

Finally, we believe our results to be of special interest in the current debate on the economics of education and educational policies. First, note that the regional differences we observe may be due to differences in the management of the public educational services at the regional level, as we control for individual and socioeconomic variables. Thus, the worst-performing regions can learn from the best performers regarding the management of retention policies. Second, and more importantly, in a context of increasing interest in academic performance differences across countries (as the importance of human capital accumulation to growth becomes well known), it is important to evaluate the educational policies in place.
This is particularly true for those policies that are supposed to serve as a remedy for poor academic performance, as is the case for the grade retention policy.

23 For example, the average cost of schooling in Spain in 2007, in terms of government and family expenditures, was, at current prices, €4,870 and €6,508 per student at the primary and secondary level, respectively. These figures amount to a yearly cost of schooling of €811 (for six years) and €1,627 (for four years), respectively (see Instituto de Evaluación, 2010).

References

[1] Angrist, J. and Lavy, V. (1999) “Using Maimonides' Rule to Estimate the Effect of Class Size on Scholastic Achievement” Quarterly Journal of Economics 114(2): 533-575.
[2] Bedard, K. and Dhuey, E. (2006) “The Persistence of Early Childhood Maturity: International Evidence of Long-Run Age Effects” Quarterly Journal of Economics 121(4): 1437-1472.
[3] Belot, M. and Vandenberghe, V. (2011) “Evaluating the ‘Threat’ Effects of Grade Repetition” Discussion Papers (IRES - Institut de Recherches Economiques et Sociales) 2011-026.
[4] Carrasco, R. (2001) “Binary Choice with Binary Endogenous Regressors in Panel Data: Estimating the Effect of Fertility on Female Labor Participation” Journal of Business & Economic Statistics 19(4): 385-394.
[5] Dearden, L., Emmerson, C., Frayne and Meghir, C. (2006) “Education Subsidies and School Drop-Out Rates” CEE Discussion Papers 0053, Centre for the Economics of Education, LSE.
[6] García-Pérez, J. I. and Jimeno, J. J. (2007) “Public sector wage gaps in Spanish regions” The Manchester School 75(4): 501-531.
[7] Hanushek, E. (1998) “The Evidence on Class Size” Occasional Paper Number 98-1, W. Allen Wallis Institute of Political Economy, University of Rochester.
[8] Heckman, J. (1979) “Sample selection bias as a specification error” Econometrica 47(1): 153-161.
[9] Instituto de Evaluación (2010) “Sistema Estatal de Indicadores de Educación” Ministerio de Educación.
[10] Jacob, B. A. and Lefgren, L. (2004) “Remedial Education and Student Achievement: A Regression-Discontinuity Analysis” Review of Economics and Statistics LXXXVI(1): 226-244.
[11] Jacob, B. A. and Lefgren, L. (2009) “The effect of grade retention on high school completion” American Economic Journal: Applied Economics 1(3): 33-58.
[12] Jimerson, S., Ferguson, P., Whipple, A., Anderson, G. and Dalton, M. (2002) “Exploring the Association Between Grade Retention and Dropout: A Longitudinal Study Examining Socio-Emotional, Behavioral, and Achievement Characteristics of Retained Students” The California School Psychologist 7: 51-62.
[13] Lazear, E. P. (2001) “Educational Production” Quarterly Journal of Economics 116(3): 777-803.
[14] Maddala, G. (1988) Introduction to econometrics. New York: Macmillan.
[15] Manacorda, M. (2008) “The Cost of Grade Retention” CEP Discussion Papers 0878, Centre for Economic Performance, LSE, forthcoming Review of Economics and Statistics.
[16] OECD (2011) “PISA 2009. Technical Report”.
[17] OECD (2010) “PISA 2009 Results: What Students Know and Can Do - Student Performance in Reading, Mathematics and Science (Volume I)”. http://dx.doi.org/10.1787/9789264091450-en
[18] OECD (2009) “Education at a glance 2009. OECD indicators”.
[19] Prescott, D. and Wilton, D. (1992) “The Determinants of Wage Changes in Indexed and Nonindexed Contracts: A Switching Model” Journal of Labor Economics 10(3): 331-355.
[20] Roderick, M. (1994) “Grade retention and school dropout: Investigating the association” American Educational Research Journal 31: 729-759.
[21] Sanz de Galdeano, A. and Vuri, D. (2007) “Parental Divorce and Students' Performance: Evidence from Longitudinal Data” Oxford Bulletin of Economics and Statistics 69(3): 321-338.

Table 1: Descriptive statistics.
Grade retention and PISA 2009 Math scores Variable REGIONS Andalusia Aragon Asturias Balearic Islands Canary Islands Cantabria Castile Leon Catalonia Galicia La Rioja Madrid Murcia Navarre Basque Country Ceuta y Melilla Rest of Spain N Non-repeaters Repeaters % score % score 57.1 60.5 68.9 59.6 54.5 63.8 64.5 76.7 62.5 60.9 61.7 63 71.8 77.6 52.6 60.7 58.9 68.1 69.5 66.2 61.9 56.7 66.9 33.5 60.0 71.3 66.1 54.3 503.7 548.7 529.4 503.4 474.4 533.7 551.6 517.3 526.2 551.3 537.8 513.4 542.7 533.5 471.2 521.7 535.2 507.7 520.9 521.9 518 522.3 523.5 483.8 520.3 523.1 519.4 533.5 42.9 39.5 31.1 40.4 45.5 36.2 35.5 23.3 37.5 39.1 38.3 37.0 28.2 22.4 47.4 39.3 41.1 31.9 30.5 33.8 38.1 43.3 33.1 66.5 40.0 28.7 33.9 45.7 405.8 439.6 414.2 407.2 387.6 425.7 446.7 424 427.4 429.2 429.8 417.5 431 427.1 356.5 423.5 431.8 401.8 421.5 417 419.2 418.5 423.9 400 421.6 418.5 420.1 421.3 1,416 1,514 1,536 1,463 1,448 1,516 1,515 1,381 1,585 1,288 1,453 1,321 1,504 4,768 1,370 809 INDIVIDUAL VARIABLES 13,141 Female 12,746 Born in 1st quarter 6,284 Born in 2nd quarter 6,558 Born in 3th quarter 6,705 Born in 4th quarter 6,340 Native 23,188 Immigrant 2,227 < every week PC use for homework 21,013 Every week PC use for homework 4,284 < almost every day play PC games 21,013 Almost every day play PC games 4,284 Male 27 Table 1 (cont.): Descriptive statistics. 
Grade retention and PISA Math scores

Variable                          Non-repeaters    Repeaters        N
                                  %       Score    %       Score
SOCIO-ECONOMIC VARIABLES
Mother low education              53.3    505.8    46.7    413.1    8,796
Mother high education             71.6    529      28.4    428.1    16,180
Father low education              55.0    506      45.0    416.2    8,987
Father high education             71.5    530.3    28.5    426.6    15,369
0-25 books at home                38.8    471.1    61.2    390.2    5,331
26-200 books at home              66.3    518.5    33.7    434.3    13,153
>200 books at home                80.8    546.3    19.2    451.1    7,074
Mother lives at home              64.6    521.4    35.4    420.8    24,853
Mother does not live at home      35.0    495      65.0    401.4    567
Father lives at home              65.8    521.7    34.2    421.3    22,224
Father does not live at home      51.5    517.5    48.5    416.1    2,544
SCHOOL VARIABLES
Less than 50% girls               67.3    522.8    32.7    417.7    12,405
More than 50% girls               59.7    518.5    40.3    420.6    12,349
Private independent school        80.5    535.8    19.5    434.4    885
Private govern-depend. school     74.4    525.1    25.6    430.1    8,154
Public                            56.4    516.1    43.6    416.2    15,336
<=21 students per class           54.4    515.2    45.6    414.6    12,778
> 21 students per class           72.1    525.1    27.9    428.5    12,314

Table 2: Descriptive statistics. Regional distribution of explanatory variables (%)

Variables                        And    Ara    Ast    Bal    Cana   Cant   Cast   Cat
INDIVIDUAL VARIABLES
Females                          47.4   49.4   47.4   50     47.6   49.1   51     48.7
Born in 2nd quarter              23.5   23.9   27.1   23.5   22     25.1   26.1   24.4
Born in 3rd quarter              25.9   27.7   26.4   25.6   27.1   25.3   25.4   26.4
Born in 4th quarter              27.1   23.9   22.6   26     26.3   26.6   24.4   25.2
Immigrants                       5.8    12.2   5.2    15.3   11.7   7.1    5.3    11.2
Every week PC use for homework   37.9   37.3   37.1   48.1   44.7   40.5   33.9   60.2
Almost every day play PC games   19.6   17.9   22.2   20     15.9   15     16.1   18.8
SOCIO-ECONOMIC VARIABLES
Mother high education            46.8   65.8   69     58.5   53.7   68.7   66.6   62.9
Father high education            44.7   64.7   65.7   57.8   49.8   64     62.5   61.6
26-200 books at home             51.7   51.6   51.9   51.5   45.2   52.3   53.1   50.9
>200 books at home               17.9   31.3   30.3   26.2   13.8   27.2   35.3   29.4
Mother does not live at home     2.2    2.3    3      2.4    4      2.4    2.2    1.2
Father does not live at home     10.4   9.2    12.9   12.9   15.2   11.6   7.1    10.3
SCHOOL VARIABLES
More than 50% girls              47     57.7   56.2   34.6   61.4   39.9   55.2   54.4
Private independent school       1.4    4      2.1    4.5    0      3.5    9.2    15.1
Private govern-depend. school    24.1   26.4   30.6   29.8   18.2   35.3   23.7   24.1
> 21 students per class          60.3   47.6   30.2   48.1   54.2   35.3   48     63.1

Table 2 (cont.): Descriptive statistics. Regional distribution of explanatory variables (%)
Variables                        Gal    Rio    Mad    Mur    Nav    Basq   CyM    Rest   All
INDIVIDUAL VARIABLES
Females                          49.7   48.9   50     50.2   47.6   48.5   50.8   50.8   49.2
Born in 2nd quarter              25     27.7   26.8   25.5   25.8   26.5   23.1   25.1   24.9
Born in 3rd quarter              24.9   26.8   25.7   26.7   26.1   24.2   26.6   28     26.3
Born in 4th quarter              26.5   23.7   23.9   22.4   23.1   23.3   26.9   24.7   25.2
Immigrants                       4.2    13.1   16.3   12.5   12.7   4.7    10.7   9.2    9.5
Every week PC use for homework   26.6   41.8   37.4   35.9   44.4   45.3   47.2   36.9   40.9
Almost every day play PC games   16.8   16.3   16.7   17.6   15     14.1   20.8   17.4   17.8
SOCIO-ECONOMIC VARIABLES
Mother high education            59.5   64.8   69.1   52     71.8   77.5   47.8   59.1   59.7
Father high education            57.8   60.7   64.9   55.4   68.8   77     54.4   56.1   57.3
26-200 books at home             54     52.6   50.7   51.4   51     52.6   42.3   54.2   51.8
>200 books at home               26.5   28.4   30.3   20.3   29     33.4   15     25.1   25.5
Mother does not live at home     2.7    2.2    1.9    1.8    2      2      4.3    1.4    2
Father does not live at home     11.7   8.9    11.8   8.7    7.9    9.8    10.8   8.4    10.3
SCHOOL VARIABLES
More than 50% girls              53.6   55.1   46.9   54.8   55     41.9   49.3   66     53.4
Private independent school       6.3    0      7.3    2.4    2.6    0      2.9    4.3    5.2
Private govern-depend. school    25.5   32.7   32.1   22.7   34.7   57.7   17.6   17.5   25.7
> 21 students per class          44     58.6   62.5   66.1   56.7   33.7   65     52.3   55.4

Table 3: Distribution of Math scores

                 Number of observations   Mean     P25      P50      P75
Non-repeaters    17,678                   520.71   475.94   522.29   568.86
Repeaters        8,209                    418.93   369.15   420.17   470.95
Repeaters_P      1,071                    422.72   371.95   414.64   483.26
Repeaters_PS     1,406                    371.45   326.93   373.90   419.70
Repeaters_S      5,374                    442.25   398.04   445.79   487.31

Table 4: Selection equation estimation

Columns: SRM, PROBIT

REGIONS: Andalusia, Aragon, Asturias, Balearic Islands, Cantabria, Castile Leon, Catalonia, Galicia, La Rioja, Madrid, Murcia, Navarre, Basque Country, Ceuta y Melilla, Rest of Spain
Estimates (standard errors in brackets): 0.01 0.02 0.08 (0.06) (0.06) (0.06) (0.06) 0.07 0.38 (0.06) 0.33 (0.06) 0.03 (0.07) 0.01 (0.08) 0.11 (0.07) (0.07) 0.03 (0.07) (0.07) 0.01 0.08 0.4 (0.09) 0.44 (0.09) 0.11 (0.06) (0.061) 0.03 (0.06) 0.02 0.05 (0.06) (0.06) 0.03 (0.06) 0.04 0.12 (0.06) 0.13 (0.05) 0.32 (0.06) 0.32 (0.06) 0.44 (0.06) 0.4 (0.06) 0.06 0.01 (0.06) 0.12 (0.06) (0.07) (0.07) 0.04

INDIVIDUAL VARIABLES: Gender (female), Born in 3rd quarter, Born in 4th quarter, Immigrant, Every week use PC for homework, Almost every day playing PC games
Estimates (standard errors in brackets): 0.26 (0.03) 0.23 (0.03) 0.1 (0.04) (0.04) 0.13 0.19 0.6 (0.04) 0.28 0.59 (0.04) (0.06) (0.06) 0.33 (0.04) 0.25 (0.04) 0.21 (0.05) 0.22 (0.05)

Table 4 (cont.): Selection equation estimation
Columns: SRM, PROBIT

SOCIO-ECONOMIC VARIABLES: Mother high and father low education, Mother low and father high education, Mother and father high education, 26-200 books at home, >200 books at home, Mother is not at home, Father is not at home
Estimates (standard errors in brackets): 0.26 (0.05) 0.22 (0.05) 0.16 (0.05) 0.17 (0.05) 0.38 (0.05) 0.38 (0.05) 0.46 (0.05) 0.48 (0.05) 0.74 (0.07) 0.8 (0.07) 0.48 0.23 (0.12) (0.07) 0.39 0.24 (0.12) (0.07)

SCHOOL VARIABLES: Majority of girls in school, Private gov-dependent school, Private independent school, Class size, Class size^2, Constant
Estimates (standard errors in brackets): 0.1 (0.04) (0.047) 0.08 0.23 (0.04) 0.16 (0.04) 0.18 (0.077) 0.25 (0.08) 0.13 (0.02) 0.13 (0.02) 0.002 1.99 (0.0006) (0.23) 0.002 1.98 (0.0007) (0.24)

Loglikelihood (or pseudo): -1,944,306.3 (SRM), -10,628.9 (PROBIT)

Note 1: *, ** and *** mean that the coefficient is significant at 10%, 5% or 1%, respectively.
Note 2: Number of observations is 21,360. Standard errors in brackets.

Table 5: Effect of individual, socio-economic and school variables on Math score

Columns: Non-repeaters (SRM, OLS (a)); Repeaters (SRM, OLS (a)); All (OLS (b)); All (IV)

INDIVIDUAL VARIABLES: Gender (female), Immigrant, Every week use PC for homework, Almost every day playing PC games
SOCIO-ECONOMIC VARIABLES: Mother high father low educ., Mother low father high educ., Mother and father high educ., 26-200 books at home, >200 books at home, Mother is not at home, Father is not at home
SCHOOL VARIABLES: Majority of girls in school, Private gov-dependent school, Private independent school, Class size, Class size^2
Other rows: Constant, Retained, Correlation coefficients, R^2

Estimates (standard errors in brackets): (3.98) (4.11) (4.26) 27.73 (2.14) 29.43 (2.053) 34.64 (4.86) 31.03 (3.283) 30.02 (1.82) 29.48 (2.71) 26.8 (5.79) (2.29) (3.55) 21.75 (5.51) (1.971) (3.213) 4.68 (8.65) 12.19 (4.358) 16.66 (3.302) (1.612) (2.828) 17.97 (6.77) (3.26) 4.4 2.27 5.18 5.01 (5.54) 0.67 (3.215) 1.15 0.32 1.87 3.64 2.4 (5.79) 5.64 (4.893) 0.26 (3.42) 3.81 (3.935) (4.254) 2.16 10.77 (7.63) (7.51) (4.43) 13.56 5.82 (4.2) 6.03 (3.21) 6.49 (3.71) 4.68 3.46 3.26 (6.559) (4.415) (3.825) 3.98 (4.33) 4.42 17.15 38.85 63.43 (3.47) 14.58 34.86 57.93 (3.103) 1.37 6.65 12.42 33.44 54.75 6.5 (2.854) 13.27 34.51 56.38 (4.40) (3.52) (3.236) 27.34 35.59 4.56 (7.10) 33.46 46.28 (3.803) (2.648) (5.26) (4.34) (3.565) (11.83) (6.297) (3.438) (7.38) 21.28 (9.27) (4.09) 17.66 (8.863) (4.107) (13.24) (5.38) 0.57 (11.244) (4.429) 7.56 (9.27) (3.62) (7.612) (3.352) 0 1.77 6.39 3.55 3.09 2.43 0 (3.884) 0.6 (4.71) (6.05) (11.23) (2.10) 5.73 3.4 (4.592) (4.986) (11.057) 4.51 (3.425) (3.787) (8.373) 1.88 (3.57) (4.19) (8.58) 1.69 0.2 (3.9) (8.49) 1.61 (3.923) (8.518) 6.94 1.08 1.47 2.04 0.91 1.84 4.76 2.59 2.87 7.38 (1.51) 6.32 (1.56) 2.58 3.93 (1.192) 4.82 (0.91) 5.14 (1.43) 0.15 (0.04) 0.13 (0.037) 0.06 (0.05) 0.08 (0.033) 0.1 (0.024) 0.10 (0.03) 354.57 (18.81) 383.01 (17.639) 347.43 (15.77) 341.03 (12.034) 405.17 80.4 (9.852) 396.69 (33.00) 73.41 (26.97) (2.03) 0.22 (0.11) 0.31 (0.3) 0.21 0.17 0.41 0.46

Note 1: *, ** and *** mean that the coefficient is significant at 10%, 5% or 1%, respectively.
Note 2: Number of total observations is 21,360, with 14,969 non-repeaters (NR) and 6,391 repeaters (R), respectively.
Standard errors in brackets.
Note 3: Log pseudolikelihood (SRM) = -1,944,306.3.

Table 6: Grade retention effect on Math scores

                          Main model: REP                      Repeaters type
Predictions               SRM      OLS (a)  OLS (b)  IV        REP_S (SRM)  REP_P (SRM)  REP_PS (SRM)
y^1_R                     423.9    424.5                       442.7        425.1        377.1
y^1_NR                    480.2    503.4                       489.8        482.9        471.8
GRE_1 = y^1_R - y^1_NR    -56.2    -78.9    -80.4    -73.4     -47.1        -57.7        -94.7
y^0_R                     397.2    442.5                       417.6        426.9        391.1
y^0_NR                    522.4    528.2                       528.2        528.2        528.6
GRE_0 = y^0_R - y^0_NR    -125.1   -85.7    -80.4    -73.4     -110.5       -101.2       -137.4

Note 1: The effects of the four types of repetition are based on different estimations where I=0 if the student is a non-repeater and I=1 if the student is a repeater of each of the four types: Rep, Rep_S, Rep_P and Rep_PS, respectively.
Note 2: The numbers of observations are 14,969, 6,391, 3,718, 693 and 1,036 for Non-Repeaters, Repeaters, Repeaters_S, Repeaters_P and Repeaters_PS, respectively.
Note 3: The equations y_Ri and y_NRi are the same when estimating by OLS or IV.
Note 4: The estimated SRM correlation coefficient of the NR equation is not significant in the Rep_S and Rep_P estimations; for Rep_PS, however, it is negative (-0.26) and significant (t-student = 1.94), as in the main model estimation (-0.22, significant at 5%). In contrast, the estimated coefficient of the R equation is not significant in any of the models.

Table 7: Grade retention effect. Decomposition of differences between R and NR. By region.

SRM                                  Spain    And      Ara      Ast      Bal      Cana     Cant     Cast     Cat
GRE_1 = y^1_R - y^1_NR               -56.2    -56.7    -61.0    -58.2    -38.5    -46.8    -60.9    -66.9    -48.1
GRE_0 = y^0_R - y^0_NR               -125.1   -123.2   -129.9   -131.9   -108.8   -110.9   -130.8   -135.6   -121.3
Decomposition of y^0_NR - y^1_R (%)
Characteristics                      25.3     21.2     24.9     29.6     34.0     19.1     26.2     19.6     25.1
Coefficients                         89.3     91.5     87.4     86.9     80.1     94.0     88.0     95.6     101.4
Endogenous selec.                    -14.6    -12.6    -12.3    -16.6    -14.1    -13.2    -14.2    -15.1    -26.4

Table 7 (cont.): Grade retention effect. Decomposition of differences between R and NR. By region.
SRM                                  Gal      Rio      Mad      Mur      Nav      Basq     CyM      Rest
GRE_1 = y^1_R - y^1_NR               -58.3    -67.3    -59.2    -52.7    -55.3    -57.6    -57.4    -56.6
GRE_0 = y^0_R - y^0_NR               -125.6   -138.6   -129.9   -120.7   -130.1   -132.2   -126.9   -124.3
Decomposition of y^0_NR - y^1_R (%)
Characteristics                      19.4     27.3     27.9     21.6     32.5     24.1     29.0     23.4
Coefficients                         97.1     84.2     84.7     95.0     85.7     99.6     80.4     89.1
Endogenous selec.                    -16.5    -11.5    -12.6    -16.6    -18.3    -23.7    -9.5     -12.5
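As a reading aid for Table 6, each grade retention effect (GRE) in the SRM columns is the difference between the predicted score under repetition and the predicted score under promotion for the same group of students. The minimal Python sketch below recomputes those differences from the rounded predictions printed in the table. The dictionary layout and the names TABLE6_SRM, grade_retention_effect and recompute are illustrative assumptions, not the authors' code. Because the inputs are rounded to one decimal, a recomputed value can differ from the published one by 0.1.

```python
# Predicted Math scores copied from Table 6 (SRM columns), rounded as published.
# Each tuple: (score if repeating, score if not repeating, published effect).
# Names and layout are hypothetical, chosen only for this illustration.
TABLE6_SRM = {
    "REP": {
        "repeaters": (423.9, 480.2, -56.2),
        "non_repeaters": (397.2, 522.4, -125.1),
    },
    "REP_S": {
        "repeaters": (442.7, 489.8, -47.1),
        "non_repeaters": (417.6, 528.2, -110.5),
    },
    "REP_P": {
        "repeaters": (425.1, 482.9, -57.7),
        "non_repeaters": (426.9, 528.2, -101.2),
    },
    "REP_PS": {
        "repeaters": (377.1, 471.8, -94.7),
        "non_repeaters": (391.1, 528.6, -137.4),
    },
}


def grade_retention_effect(score_if_repeating, score_if_not_repeating):
    """Difference between the two predicted scores for the same group."""
    return score_if_repeating - score_if_not_repeating


def recompute(table):
    """Return {(model, group): (recomputed effect, published effect)}."""
    results = {}
    for model, groups in table.items():
        for group, (y_rep, y_norep, published) in groups.items():
            effect = round(grade_retention_effect(y_rep, y_norep), 1)
            results[(model, group)] = (effect, published)
    return results


if __name__ == "__main__":
    for (model, group), (effect, published) in recompute(TABLE6_SRM).items():
        print(f"{model:7s} {group:14s} recomputed={effect:7.1f} published={published:7.1f}")
```

The "repeaters" entries correspond to GRE_1 (the effect on students who actually repeated) and the "non_repeaters" entries to GRE_0 (the counterfactual effect on students who did not).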