An Experiment in Computer Assisted Judging
Joe W. Jones, Portland, OR
Since before our marriage my wife, Adele, has been a student of botany. Her continued interest in flowers and plants rubbed off on me over the years. I have now expanded my interest beyond the point of appreciating only the beauty of the blossoms. For me this interest has been focused on rhododendrons through the subtle, and sometimes not so subtle, efforts of Lansing Bulgin, past president of the Portland Chapter. Ultimately this led to my becoming "hooked" and joining the ARS a relatively few years ago. I am letting you know this so you can appreciate the innocence, or is it naiveté, with which I approached the responsibilities of co-chairman, with Adele, of the Portland Chapter's early show for 1983.
Adele and I had attended shows for four or five years, but had only been involved in helping with plant sales. We had not even entered any trusses until the May, 1982, show. As a result I approached the task of show judging with no preconceived ideas from prior experience. I do, however, work with people who are experts in evaluation and assessment. As a result I have learned some of their concerns.
I don't know how judging is done at your shows, but the rules for the Portland early show state:
"Score Points for Judging a Cut Truss
1. Size according to variety ------- 25 points
2. Color -------------------------- 20 points
3. Form -------------------------- 20 points
4. Foliage ------------------------ 15 points
5. Substance --------------------- 10 points
6. Condition --------------------- 10 points"
They also indicate that "Each entry shall be judged against all other entries in its group or class..."
It may be my lack of experience, or one of my frailties, but I could not see how, without help, I could evaluate the many trusses in a class against six different criteria, assign points for each criterion, total the points, and compare the totals to select the trusses with the highest points as winners. But this is the computer age, so I looked for a computerized solution to what for me would be a formidable problem if I were to be a judge.
Fortunately I had available to me the resources necessary to start planning for an experiment in computer assisted judging. I had access to an HP 85, which is a self-contained microcomputer. It includes a small screen and a built-in printer with four-inch paper tape output. I also have a son, Brad, who is majoring in Computer Information Systems and who could program the computer for me. Brad coded the judging program in BASIC to my design specifications over Christmas vacation.
The program was designed so that each truss can be evaluated in turn for each of the six criteria. This is done by each judge assigning up to the maximum allowable points for the criterion under consideration. When all of the criteria for the truss have been evaluated, there is an opportunity to make corrections. After that, judging begins on the next truss. When all of the trusses in a class are judged, the computer is instructed to print a list of trusses in descending order of total points assigned. The top three are to be the ribbon winners unless otherwise disqualified by the judges.
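The original program was written in BASIC for the HP 85 and is not reproduced here, but the logic it describes can be sketched in Python. The criteria names and maximum points come from the show rules quoted above; the entry numbers and scores in the usage example are purely illustrative.

```python
# Criteria and maximum points, as given in the Portland early show rules.
CRITERIA = [
    ("Size according to variety", 25),
    ("Color", 20),
    ("Form", 20),
    ("Foliage", 15),
    ("Substance", 10),
    ("Condition", 10),
]

def score_truss(scores):
    """Validate one truss's scores (one per criterion) and return the total."""
    if len(scores) != len(CRITERIA):
        raise ValueError("one score per criterion is required")
    for (name, maximum), points in zip(CRITERIA, scores):
        if not 0 <= points <= maximum:
            raise ValueError(f"{name}: {points} exceeds maximum of {maximum}")
    return sum(scores)

def rank_class(entries):
    """Rank a class of entries {entry_id: scores} by total points, descending.

    The top three entries in the returned list would be the ribbon winners,
    unless otherwise disqualified by the judges.
    """
    totals = {entry: score_truss(scores) for entry, scores in entries.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# A hypothetical class of three entries (entry numbers are made up).
entries = {
    "12": [22, 18, 17, 13, 9, 9],
    "7":  [25, 19, 18, 14, 9, 10],
    "31": [20, 16, 15, 12, 8, 8],
}
for entry, total in rank_class(entries):
    print(entry, total)
```

The correction pass the article mentions would simply re-run `score_truss` for a truss before its class is ranked; a sort on the totals replaces any negotiation among the judges over the final order.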
My next task was to convince the chapter leadership that the experiment should be undertaken. I called Herb Spady, Chapter President, and discussed my plan with him. Herb had some initial reservations about acceptance of the experiment by the members and the time it might take, but gave his approval to proceed.
Having obtained approval, I then recruited three judges who were willing to participate in the experiment — Doris Jewett, Jim Brotherton and Ed Egan.
At our meeting to discuss the use of the computer in the judging process, the judges became more keenly aware of the judging criteria and the weights assigned to them. It was observed that with this process a dominant judge would be less likely to emerge on the team. It was believed the judges would not have to negotiate to reach consensus on the award winners, as the results would be computed automatically. Most importantly, they wondered whether the results would be essentially the same using this method as under the old method. At the conclusion of the meeting each judge agreed to become even more familiar with each of the criteria and the applicable points prior to the show.
The winter of 1983 was one of the warmest in years in the Pacific Northwest. There were no hard frosts in the Portland area. Some gardens had no frost at all. This had significant implications for our experiment. In prior years the Early Show averaged about 225 trusses entered. This show had 467 trusses entered. With one team of judges the task of judging all of the entries with the new process did not look promising.
Judges were able to begin about twenty minutes earlier than scheduled. As they proceeded through the entries, the judges discussed each of the criteria as applied to the truss under examination. Then each announced the points they believed appropriate, and these were entered into the computer. The order in which the judges announced their points was rotated to ensure no bias was created by one judge always announcing first. When all the trusses in a class were judged, the computer printed the results, which were then turned over to the recorders for marking and recording the winning entries.
As the judges proceeded, their pace picked up. Still, at the end of a half hour they had finished only three classes, with a total of 16 entries. I looked around the exhibit hall at all the remaining entries to be judged and, as agreed in advance, called a halt to the experiment. With so many entries, there was not going to be enough time to carry out the experiment for the whole show.
In our discussion after the judging was completed, the following conclusions were reached:
- Using the experimental process forced an evaluation of each of the criteria for each truss. As a result no truss could be inadvertently overlooked.
- The process ensured an application of the criteria weights as specified in the rules.
- The judges believed their judging after the experiment ended was improved because of their enhanced awareness of the criteria. For this reason they believed the process could be an excellent tool in helping to train new judges.
- They believed the process could have been most helpful in selecting the best-of-show winners from among the blue ribbon winners.
I still believe that judging teams experienced in the process may be able to use computer assistance to more quickly and thoroughly judge entries in shows. Perhaps someday I will have a chance to try another experiment — hopefully, on a smaller show.