Summative student evaluations of teaching (SETs) have been widely criticized, especially by teachers, for failing to measure teaching effectiveness accurately. Surveys have shown that a majority of teachers believe that raising standards or the level of course content would result in worse SETs for the teacher, and that students filling out SETs are biased with respect to instructors' personalities, looks, disabilities, gender, and ethnicity. The evidence cited by some of these critics indicates that factors other than effective teaching are more predictive of favorable ratings. To obtain favorable ratings, teachers may pitch their material to the weakest students, with the result that course content suffers. Quantitative fields also tend to receive lower student evaluations. Many critics of SETs have argued that they should not be used in decisions regarding faculty hiring, retention, promotion, and tenure. Some have suggested that using them for such purposes leads to the
dumbing down of educational standards. Others have said that the way SETs are typically used at most universities is demeaning to instructors and has a corrupting effect on students' attitudes toward their teachers and toward higher education in general. The economics of education literature and the economic education literature are especially critical. For example, Weinberg et al. (2009) find that SET scores in first-year economics courses at Ohio State University are positively related to the grades instructors assign but are unrelated to learning outcomes once grades are controlled for. Others have also found a positive relationship between grades and SET scores, but unlike Weinberg et al. (2009), they do not directly address the relationship between SET scores and learning outcomes. Krautmann and Sander (1999) find that the grades students expect to receive in a course are positively related to SET scores. Isely and Singh (2005) find that the difference between the grades students expect to receive and their cumulative GPA is the relevant variable for obtaining favorable course evaluations. Carrell and West (2010) use a data set from the U.S. Air Force Academy, where students are randomly assigned to course sections (reducing selection problems); they find that calculus students earned higher marks on common course examinations when taught by instructors with high SET scores, but performed worse in later courses requiring calculus. Hamermesh and Parker (2005) find that students at the
University of Texas at Austin gave attractive instructors higher SET scores than less attractive instructors. The authors note, however, that they cannot determine whether attractiveness itself makes an instructor more effective; it may be that students simply pay more attention to attractive instructors, which could in turn produce better learning outcomes. Meanwhile, a 2017 lawsuit was filed on grounds of xenophobic discrimination in course evaluations at the
University of Kansas, with Peter F. Lake, the director of
Stetson University's Center for Excellence in Higher Education Law and Policy, suggesting this is no isolated incident. The empirical economics literature contrasts sharply with the educational psychology literature, which generally argues that teaching evaluations are a legitimate means of assessing instructors and are unrelated to
grade inflation. However, as in the economics literature, researchers outside educational psychology have also reported negative findings on course evaluations. For example, some papers examining online course evaluations have found them to be heavily influenced by an instructor's attractiveness and willingness to award high grades in return for very little work. Another criticism of these assessment instruments is that the data they produce are difficult to interpret for purposes of self- or course improvement, given the number of variables that can affect evaluation scores. Cost is also a consideration: paper-based course evaluations can cost a university thousands of dollars over the years, whereas an electronic survey can be administered at minimal cost. A concern raised by instructors, however, is that response rates to online course evaluations are lower than those for paper-based in-class evaluations, which may make the results less valid. The situation is more complex than response rates alone would indicate: student–faculty engagement has been offered as an explanation for response patterns in cases where course level, instructor rank, and other variables lacked explanatory power.

==See also==