Some quick thoughts on the problems I (and many instructors) have with student evaluations...
TL;DR - Evaluations are worthless when measured from any angle and it's time to come up with something better.
The main problems I have are mathematical uncertainty, polarization bias/creep (e.g., the negativity effect), and the fact that many students simply aren't honest in their evaluations. Recent studies are finding that low evaluations correlate with instructor difficulty. On top of it all, the culture surrounding evaluations has changed since their introduction in the 1960s, and administrators haven't taken this into account. Whereas there were relatively few opportunities to evaluate things in the past, the internet has fostered an "evaluation culture" in which people are conditioned to anticipate the dopamine hit they receive when they affix a score or opinion to something, particularly when they know they will face no repercussions for that action. This, in turn, "rewards" students for attaching strong feelings to their reviews, which makes any attempt at objectivity nearly impossible.
For example, only two students evaluated me in a particular course (out of twelve) and here were their overall impressions:
- He was a great teacher, I wouldn't change anything he did.
- Dr. Arnold was one of the most unprofessional individuals I have encountered, during my scholastic career. He was demeaning, rude, obtuse and simply unprofessional in his interactions with me and my fellow students. His grading style was ludicrous and unpredictable and his methods to comment on papers was completely unprofessional and unhelpful. I know I'll pass the class but I could not be more disappointed in Dr. Arnold as a mentor.
- Very poor and confusing.
- He was perfect!
- Was great! He is a tough professor but very helpful. I have learned a lot in this course.
- Excellent.
- Dr. Arnold challenged me to do best the I could and gave me advice on how I could improve.
- 10 [sic. I think meaning 10 out of 10]
- He is an excellent mentor
And another (out of sixteen):
- The mentoring from Dr. Arnold was the best of my academic experience.
- I did not receive any mentoring in this class that will be profitable in my future. I can say that with all honesty. I can also say that this is the ONLY class I have taken through [school] that gave me this experience. All other professors were tremendous in my time here.
And in another, when asked to rate me "overall" from 1 to 5 (1 being "poor" and 5 being "excellent"), a class scored me (Nr=5 respondents; Nt=16 total in class):
- 1
- 0
- 0
- 0
- 4
From these types of scores and comments, it's pretty obvious that disgruntled students use polarization escalation to gain some form of power over the process or to enact some form of retribution. Students know that these scores are averaged, and they use their knowledge of math and their experience gaming ratings on the internet to slant the overall evaluation. This asymmetry makes the entire exercise statistically invalid: those who give high scores are probably not trying to manipulate the average, while those giving low ones are.
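The incentive to escalate is easy to see in a small class: a single strategic minimum score drags the mean much further than it could in a large class. A quick sketch (all numbers here are hypothetical, not taken from my actual evaluations):

```python
import statistics

honest = [5, 5, 4, 4]              # four satisfied respondents
with_strategic_one = honest + [1]  # one retaliatory minimum score added

# One low score out of five moves the class average by 0.7 points.
print(statistics.mean(honest))              # 4.5
print(statistics.mean(with_strategic_one))  # 3.8
```

The smaller the respondent pool, the more leverage a single extreme score has, which is exactly the situation in the classes above.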
Assume that the students who rated me poorly scored me a "1" (there was one for each of these classes) and that the students who rated me highly gave me a "5" (in the first class, with only two respondents, this is exactly what happened). We're now set up for some mathematical conundrums...
One solution might be to average the two, which (on the 1-5 scale) would put me at a 3. But the fallacy of the middle ground is at work here. If one student loves me and the other hates me about as much as Hitler, does the middle ground really reflect any truth about what kind of instructor I am? My guess is that it does not. Obviously I wasn't "rude," etc., toward all students, as the second student claims, so what to do?
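The averaging problem can be shown in a few lines. The mean of the hypothetical two-respondent class above is identical to the mean of a genuinely middling class, even though the two situations could not be more different; the spread, which the reported average throws away, is what actually distinguishes them:

```python
import statistics

polarized = [1, 5]   # one respondent "hates it," one "loves it"
middling  = [3, 3]   # what a genuinely average experience would look like

# The reported average is identical for both classes...
print(statistics.mean(polarized))   # 3
print(statistics.mean(middling))    # 3

# ...but the standard deviation reveals the polarization the mean hides.
print(statistics.stdev(polarized))  # about 2.83
print(statistics.stdev(middling))   # 0.0
```

Reporting a spread alongside the mean would at least flag polarized classes, though with two respondents neither number means much.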
On top of that, there's a problem with the "weight" given to each student's responses. By their own admission, two students wouldn't change anything I did, or perhaps no more than 5% of it (although from their language, 0%). The disgruntled students, on the other hand, would probably still find some things I was doing correctly and might change, at most, 60% of how I instructed. If that's the case, then the "1" score should carry less mathematical "weight" than the "5" scores do, but that weight can't be calculated from the information at hand.
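If such weights could be estimated (and, as noted, the evaluation forms collect nothing that would let us compute them), a weighted mean would look like this. The weights below are entirely invented for illustration, derived from the idea that a student who would change 60% of the course is only endorsing 40% of their own judgment of it:

```python
# Hypothetical scores and weights; weight = 1 - (fraction of the course
# the student says they would change). None of this is computable from
# real evaluation data -- that is the point.
scores  = [5.0, 5.0, 1.0]
weights = [1.0, 1.0, 0.4]

weighted_mean = sum(s * w for s, w in zip(scores, weights)) / sum(weights)
print(round(weighted_mean, 2))  # 4.33, versus an unweighted mean of 3.67
```

Under these (made-up) weights the class looks quite different than its raw average suggests, which is exactly the information the current instruments discard.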
I'll write about this more someday, but in any given class only about 25% of the total number of students respond. This creates problems because the number of respondents doesn't provide a sufficient cross section of the course to be statistically useful in any way.
In addition to these issues, many negative reviews seem to include half-truths or outright lies that don't explain the situation in an intellectually or situationally honest manner. Since students don't put their names on reviews and professors can't respond to them in any official manner, these falsifications are allowed to affect lives completely unchecked. Some examples:
- When following up on grading was told to please stop asking, I'll get it when I get it.
- "Gives 10% grade reductions for minor grammatical, citations, punctuation errors" (and wrote nothing more about this in the Eval).
- "It would have been great if Dr. Arnold had provided clear instructions for the modules rather than sending us revised instructions at a date later than my preference."
In another example, students were asked: "Did the mentor adequately respond to your questions?" The responses were:
- Yes.
- Yes.
- Yes and I appreciated it.
- Yes, he was always available to answer questions.
- The mentor adequately responded to my questions and ensured that I understood the module.
- Yes he did.
- No...often asked us to figure out the answers on our own or ask other students to get their opinions.
Here's another set of results:
Note the final two; it seems that attitude and the ability to stick with the course made a big difference here. It's difficult to give someone's complaint full credit when they won't assume good faith:
- Dr. Arnold is, without a doubt, the best mentor I've had in my time at Thomas Edison, and the best instructor I've had in my entire college career.
- I would take a course from Dr. Arnold again any day.
- While Dr. Arnold was a very strict professor in many regards, as I entered the last month of the course, his feedback began to make sense and the strict requirements took on a new light I began to see the big picture of our professional Capstone project.
- I have spent thousands of dollars with this institution. Being in this class with Bruce Arnold is the worst experience of my college career. So much so that I lost all desire to participate with any of my courses 1/2 way through the term....If more professors were like him at TESU, I would literally change schools.
The list of these kinds of comments goes on and on. There is no way for an instructor to counter them--they just sit there for all eternity in their unsubstantiated form. If you're going to bag on me in writing, then please have the ethics and honesty to fully disclose the circumstances.
These are only a few of the reasons why I and many other professors (particularly those who understand math and statistics) have problems using student evaluations to tell us what's going on in our classrooms (there are more).