Don’t judge me; sport’s struggles with things that cannot be timed or measured.

Deep into the Olympics and getting reacquainted with the elation and agony of competition, I started obsessing about measurement. Noah Lyles grabbing 100m gold by five thousandths of a second – Wow! I’m not sure that margin was even measurable three Olympics ago. Fixated, I wanted to think about events where distance, time and finishing order don’t count – specifically, the events that are judged.

Subjectively what is reality?
In many events, scoring and medals come from judges’ interpretations of skill execution and aesthetics – the latter largely subjective and socially and culturally determined. Any time a sport is subjectively judged, controversy is inevitable. Being curious, I did a bit of research on the validity and reliability of ‘judging’. It turns out that there has not been a lot of inquiry in that space. Really difficult to design a valid study, I guess. Of course, part of the research challenge is that decisions are hard to evaluate when we cannot be sure what the ‘correct decision’ was or is.

It’s not easy
We all have opinions on the abilities of referees, umpires and sports officials, but hopefully understand that theirs is a difficult position. We may be dazzled by seemingly impossible physical feats on screen that, even with multiple slow-mo replays, remain imprecise and difficult to grasp. Calls are made and scores are allotted based on what is seen and how it is perceived and interpreted. Judges and officials not only have to be highly conversant with the rules, they also need a calm demeanour, an ethic of fairness, great peripheral vision, and the ability to do what they do under duress. Even though multiple camera angles and technologies can virtually recreate what just took place, what we see is still a point of view open to interpretation, and we scream ‘unfair’. I think you have to be impressed with what referees, umpires and judges see in real time, and how accurate their observations are when made in the heat of competition, at high speed and under pressure to post a rapid score.

Bias is there
According to Allen et al. (2021), judging biases track back to the order of the performers, corruption and the relationship between the performers. ‘Top’ judges are sought after, and there is a tendency to lean on and flog that group to judge multiple competitions over lengthy sessions. While experience helps make good judges, that same experience exposes them to most competitors and thus fosters an unconscious bias towards and around those familiar athletes. Past performances and reputations translate into current expectations, and it is hard for a judge to discount those expectations (Bouwens et al., 2022). There is also evidence that judges tend to favour athletes from their own nation and from the nations of their co-judges (Osório, 2020), so judges are prone to unintended biases. Sure, sports have trialled taking the average of judges’ scores, or removing the high and low scores before averaging the rest – but somehow contentious results persist.
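For the curious, here is a rough Python sketch of that ‘drop the high and low, average the rest’ idea. The panel scores, the `trimmed_mean` name and the 0.3 ‘shared bias’ are all made up for illustration; no sport’s actual scoring rules are being reproduced here.

```python
def trimmed_mean(scores, drop=1):
    """Average the scores after discarding the `drop` highest and `drop` lowest."""
    if len(scores) <= 2 * drop:
        raise ValueError("not enough scores to trim")
    ordered = sorted(scores)
    kept = ordered[drop:len(ordered) - drop]
    return sum(kept) / len(kept)


# Hypothetical panel of five judges; the last judge is an outlier.
panel = [8.5, 8.6, 8.4, 8.5, 9.6]

plain = sum(panel) / len(panel)   # plain average, pulled up by the outlier
trimmed = trimmed_mean(panel)     # outlier (and the lowest score) discarded

# Assumed scenario: every judge leans 0.3 towards a familiar athlete.
biased_panel = [s + 0.3 for s in panel]
shift = trimmed_mean(biased_panel) - trimmed

print(f"plain average:   {plain:.2f}")
print(f"trimmed average: {trimmed:.2f}")
print(f"shift from a shared 0.3 lean: {shift:.2f}")
```

Trimming tames the one judge who is out of step, but a lean that the whole panel shares passes straight through. That may be one reason, among others, why the averaging tricks have not ended the arguments.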

Seeing different things
I like the idea that judges may be looking for different things when they assess aesthetic performances. A study on artistic swimming judges found that some viewed performances through a strongly aesthetic lens, whereas others were very kinematic and biomechanical in their analyses, searching for things like optimal joint angles and straightness. Maybe that kind of balance is needed in a judging panel.

Selling it
An interesting bit of research came out of surfing (Furley et al., 2020), where waves and riders are being judged. This study explored whether a surfer’s post-ride celebrations – how they sold their ride – influenced judging, and yes, they did. I get that body language and team celebrations after a routine or ride might be a not-so-subtle message to the judges that, hey, I was fabulous! Spectators can also play a part here. Some research has found that football referees are susceptible to crowd noise as a proximal cue to the severity of fouls. So athletes and supporters can be further cues that judges unconsciously sense.

Let’s be kind
Officials are under duress in many ways, and being on a world stage, where there is not only a large live audience but also a mass of television viewers, likely influences judging unconsciously. Simone Biles was, I think, expected to be dominant. I do not want to take a single thing away from her athletic genius, but unless she really messed up she was probably well on her way to gold before the competition even started. So just because an elite diver did not score well, it does not mean that the judges have gotten it wrong. It is near impossible to prove that they have erred! We can rate what we see, but we cannot objectively measure so many things in sport. And there is the beauty.

Well, this was good for my blood pressure as I hunkered down for another Olympic session…

Note: I wrote this piece a while back, before we had the experience of judging ‘Raygun’ or the post-Olympic controversy of Jordan Chiles’ medal. So I could have done better…

Articles referenced or for interest

  1. Allen, E., Fenton, A., & Parry, K. D. (2021). Computerised gymnastics judging scoring system implementation–an exploration of stakeholders’ perceptions. Science of Gymnastics Journal, 13(3), 357-370.
  2. Bouwens, J., Hofmann, C., & Lechner, C. (2022). Transparency and biases in subjective performance evaluation. Account. Transpar., 72, 1-42.
  3. Furley, P., Thrien, F., Klinge, J., & Dörr, J. (2020). Claims in surfing: The influence of emotional postperformance expressions on performance evaluations. Journal of Sport and Exercise Psychology, 42(1), 26-33.
  4. León-Prados, J. A., & Jemni, M. (2022). Reliability and agreement in technical and artistic scores during real-time judging in two European acrobatic gymnastic events. International Journal of Performance Analysis in Sport, 22(1), 132-148.
  5. Mack, M., Schmidt, M., & Heinen, T. (2021). The relationship between the perceived movement quality and the kinematic pattern of complex skills in gymnastics. Journal of human kinetics, 77(1), 5-13.
  6. Osório, A. (2020). Performance evaluation: subjectivity, bias and judgment style in sport. Group Decision and Negotiation, 29, 655-678.
  7. Ponciano, K. R., et al. (2021). Reliability in the evaluation of international and national judges in an artistic swimming routine. Revista Brasileira de Cineantropometria & Desempenho Humano, 23, e76587.