To submit or not submit: The burden of evaluation on postgraduate medical trainees

Purpose: Academic centers utilize web-based surveillance systems to administer their evaluations, but little is known about their impact on the evaluation responsibilities delegated to medical residents. Method: Using a mixed-methods approach, we conducted a retrospective content analysis of the evaluation activities experienced by a cohort of 29 residents as they completed their training in general internal medicine from 2009 to 2012. These data were triangulated with group interviews conducted with current internal medicine residents in 2012-2013. Results: The internal medicine program electronically requested that its residents complete 8,614 evaluation reports on clinical faculty, curriculum, and junior trainees (345 requests annually per resident). Residents reported feeling overwhelmed by their ongoing evaluation workload, and admitted that their motivation to submit high-quality appraisals was dissipating. Residents perceived that their program valued certain evaluations more than others, and this was a major factor in their decision about whether or not they would eventually submit an appraisal. Feedback submitted on program evaluation-related appraisals was viewed as having the least value, and residents were significantly less likely to submit these evaluations. Conclusions: Although web-based surveillance systems are efficient in distributing thousands of evaluations, residency programs need to engage in ongoing vigilance of the unintended consequences associated with their use.
Zibrowski E, Crukley J, Malett J, Myers K. MedEdPublish. https://doi.org/10.15694/mep.2016.000058

Many North American academic centres, including our own, have introduced web-based surveillance networks to administer their evaluations of clinical faculty, trainees, and medical education programs (Civetta et al 2001; Deretchin et al 2002; D'Cunha et al 2003; Bennett et al 2004; Feldman & Triola 2004; Natt et al 2006). While it was anticipated that the shift from a hard-copy-dependent to an electronic platform would reap benefits including greater compliance, improved data quality, and increased access to evaluative information, the literature to date has focused on the implementation processes, rather than the outcomes, associated with these surveillance systems. We argue that meaningful assessment systems can evolve only through careful exploration of their facets and complexities, including the workflow experienced by their user groups and the impact of these activities on them. One such user group is medical residents, who are delegated the professional responsibility to serve as evaluators throughout their postgraduate education. While we have previously reported that residents struggle to meet their obligations as evaluators (Myers et al 2012), we know little about their actual evaluation workloads. Moreover, while 'survey fatigue' has been thought to negatively impact survey return rates (Porter et al 2004), the degree to which medical residents experience and react to these feelings, within the context of an electronic surveillance system, has not been previously explored. The purpose of this study was to explore the evaluation workflow and habits experienced by postgraduate trainees during residency training.

Methods
We conducted a mixed-methods study involving a content analysis of the evaluation activities undertaken by a historical cohort of 29 general internal medicine (GIM) residents. From 2009 to 2012, these residents completed a three-year residency program at a single academic centre in London, Canada. The unit of analysis was the individual, electronically delivered evaluation form that the residents received throughout each academic year by means of the electronic software program used by the department's Education Office. Using this web-based software, GIM residents received email prompts alerting them to the arrival of an evaluation form in their individual electronic account, along with a timed reminder that an appraisal was due. Evaluation requests sent to members of the historical cohort by other administrative units, including those from external rotations and those from the dean of postgraduate education, were not included in the sample. Text-based records relating to academic year, evaluation form name, and whether or not the form was submitted by the resident were exported into an Excel spreadsheet in August 2012. Submitted evaluation forms were not made available to the research team, and each resident's data were aggregated under a generic code. Blank copies of the forms were used to record their characteristics, including type of evaluation and number of items. The formatted dataset was then imported into SPSS for analysis (IBM SPSS Statistics for Mac, Version 22).
We supplemented the secondary evaluation data with information gathered from semi-structured group interviews with GIM trainees from August 2012 to January 2013. Although the residents who participated in the interviews were not members of the historical cohort, they were representative members of the very next group of trainees to undergo GIM training and were therefore exposed to the same evaluation practices as their immediate predecessors. Residents were sent an email inviting them to consider participation in the study if they had received at least one evaluation request from the Education Office's software program. Group interviews were scheduled by convenience and were conducted by a trained researcher who did not know the interviewees. The interview guide was informed by initial analyses of the secondary dataset and by our previous research in rater-based assessment (Myers et al 2012; Zibrowski et al 2011; Myers et al 2011). We audio-recorded the interviews, and these were transcribed verbatim into electronic documents. Interviewees received a $25 gift card as an honorarium. Over the course of the interviews we used an inductive, constant comparative approach in which the data were reviewed after each session by a minimum of two members of the research team. This allowed the interview guide to be edited to maximize capture of ideas emerging within and across the group discussions.
We summarized secondary evaluation records with mean and standard deviation and/or percent response, and minimum and maximum values. We calculated 95% confidence intervals around point estimates. To explore the potential relationship between evaluation form type and year in residency program (PGY), we used a chi-square test of association. The relationship between the total number of items probed by an evaluation and whether or not the form was submitted was estimated with a point-biserial correlation and its 95% confidence interval.
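
As an illustration of the last analysis step, the following Python sketch computes a point-biserial correlation (a Pearson correlation in which one variable is coded 0/1) and an approximate 95% confidence interval using the Fisher z-transformation. The study itself used SPSS and does not state how its interval was derived, so the Fisher method, the function names, and the toy data are assumptions made only for illustration. Applied to the reported coefficient of -.12 across 8,614 requests, this method yields an interval that rounds to the reported (-.14, -.10).

```python
import math


def point_biserial(binary, values):
    """Pearson correlation between a 0/1 variable and a numeric variable."""
    n = len(binary)
    mean_b = sum(binary) / n
    mean_v = sum(values) / n
    cov = sum((b - mean_b) * (v - mean_v) for b, v in zip(binary, values))
    var_b = sum((b - mean_b) ** 2 for b in binary)
    var_v = sum((v - mean_v) ** 2 for v in values)
    return cov / math.sqrt(var_b * var_v)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via the Fisher z-transformation.

    Assumption: the source does not say which CI method was used.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical toy data, NOT from the study: 1 = submitted, 0 = not submitted.
submitted = [1, 1, 1, 0, 1, 0, 1, 0]
n_items = [4, 8, 10, 30, 12, 45, 18, 25]
r_toy = point_biserial(submitted, n_items)

# Values reported in the Results: r_pb = -.12 across 8,614 evaluation requests.
low, high = fisher_ci(-0.12, 8614)
print(f"toy r_pb = {r_toy:.2f}; CI for reported r_pb: ({low:.2f}, {high:.2f})")
```

A negative coefficient here means that forms with more items were less likely to be submitted.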

Results
The Education Office sent a total of 8,614 evaluation requests to the 29 residents from July 1, 2009 to June 30, 2012. On average, each resident received 344 ± 126 forms [minimum = 148; maximum = 578; 95% CI (341.3, 346.7)]. These involved three major types of requests: a) program evaluation (academic session evaluation, OSCE evaluation, rotation evaluation): 5529/8614 (64.2%); b) residents' appraisal of a physician acting in a supervisory role (clinical teacher, consultant, senior resident): 3006/8614 (34.9%); and c) residents' appraisal of a junior trainee (mid-rotation clerkship evaluation, physical exam evaluation, medical dictation evaluation): 79/8614 (0.9%). The mean number of items per form was 18 (SD = 7; minimum = 4; maximum = 45). A statistically significant relationship was detected between postgraduate year of training and type of evaluation request, with more requests sent out during year one [χ²(4) = 172.8; p < .001]. The majority of these were for appraisals of clinical faculty and academic sessions (83%).
Seven residents (one PGY1, three PGY2, and three PGY3) were interviewed in the group sessions. All spoke with frustration about the volume of evaluation requests they continually received. "They can really accumulate and then you can feel overwhelmed by the different ones." (R1) "These tend to be tremendously onerous to the point where it's completely impractical to do them. Because you just do not have that much time." (R2) A couple of the residents complained that the electronic system was not user-friendly and that it complicated their access to the forms: "I could do it on my phone right away, like, if they just send me the form on my phone. But I have to go log in on a computer." (R6) "It's a horrible website." (R5) Moreover, some trainees commented on how the evaluation tasks competed for attention with other aspects of their lives. As this resident rationalized, "The reason I don't do them is that we get caught up in everyday things. You come from work, you're tired, you have a couple errands to do and then you know, you find the dinner if you can, and you're tired. You just want to go to bed." (R7)
Residents submitted a total of 7,428 (86.2%) of the forms they received. Individual submission rates varied from 100% (8/29 residents) to 3.4% (1/29 residents). When asked about their motivation to submit evaluations, the interviewed trainees admitted that it had declined since entering residency, and that this was affected, to some degree, by the volume of requests: "I mean having gone through these for a few years, I would spend less time on them than I used to." (R3) "I definitely let it go over a year or more. I knew I had them there. I didn't really forget them. I did keep putting it off and by nature I'm bit of a procrastinator anyways. I ended up getting to the point just where it looked pretty overwhelming." (R4)
That being said, some residents acknowledged that their program's evaluation system seemed to be set up to monitor and encourage residents' completion of certain forms, such as ratings of clinical faculty. "You will not get your evaluation if you do not complete theirs. So, those tend to get done a little more quickly for me." (R1) "Our program director says, 'you're not filling out your online evaluation form' and you're thinking, alright, alright, I guess, I'll go home and sort that out." (R5) A weak correlation was detected between the number of items probed and submission of an evaluation [rpb = -.12; p < .001; 95% CI (-.14 to -.10)]. Residents openly described their personal strategies for managing the volume of forms they received, including: "I do them right away. You just want to get it over with." (R2) "I think the highest number I got to was 115 in my box. Eventually I did do these over an afternoon slash evening. I think I did most of them or entered all of them at once." (R4) In discussing these strategies, residents admitted that even though an evaluation was technically submitted back to the GIM program, its quality might not be optimal. All of the residents who were interviewed described shortcuts they used to complete their evaluations as quickly as possible: "I can honestly say I've never elected to evaluate more than one person" (R3); "I'm just getting it done. Just getting the thing done and even if it's horrible, even if it's not great, it's quick and I answer down the middle." (R6)
A significant relationship was also found between type of evaluation form and whether or not it was submitted [χ²(2) = 427.6.5; p < .001]. Ninety percent of the forms that were not submitted were program evaluation-related, with the majority of these being requests for an appraisal of an academic session (994/1078). When asked how they decide whether or not to submit an evaluation, residents revealed that a major factor appeared to be the value they anticipated their program would place on the feedback. As this resident indicated, "My perception is that for faculty evaluation, there is more weight placed on that because it affects how those people are probably ultimately remunerated or promoted. Those things do get fed on and that I think, it does actually get implemented in some way down the road in terms of how people perform on service. But I think, when you're talking about the didactic lectures that are given there's been a tremendous amount of feedback over the years of how about these need to be improved and really there's been very minimal change." (R3) Moreover, some trainees felt that the web-based program was not the ideal venue for communicating constructive feedback because "I think there are better venues for that issue to arise rather than on some paper, some electronic document" (R5), and "If I really want to change something then I would go speak to the person that I needed to." (R6)
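
The proportions reported in this section can be re-derived from the counts given in the text. The short Python sketch below does that arithmetic; the variable names and the helper `pct` are ours and purely illustrative, and the count of unsubmitted forms is inferred as requests sent minus forms submitted rather than stated directly in the source.

```python
# Counts taken from the Results text.
requests_sent = 8614
forms_submitted = 7428
by_type = {
    "program evaluation": 5529,
    "supervisor appraisal": 3006,
    "junior trainee appraisal": 79,
}
program_eval_unsubmitted = 1078
academic_session_unsubmitted = 994

# Inferred, not stated in the source: forms that were never submitted.
forms_unsubmitted = requests_sent - forms_submitted


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


submission_rate = pct(forms_submitted, requests_sent)
type_shares = {name: pct(count, requests_sent) for name, count in by_type.items()}
program_eval_share_of_unsubmitted = pct(program_eval_unsubmitted, forms_unsubmitted)
academic_share_of_program_eval = pct(
    academic_session_unsubmitted, program_eval_unsubmitted
)

print(submission_rate, type_shares)
print(program_eval_share_of_unsubmitted, academic_share_of_program_eval)
```

Under these counts, program evaluation-related forms account for roughly nine in ten of the unsubmitted requests, in line with the "ninety percent" stated above.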

Discussion
This study is the first to explore, using naturalistic data, the evaluation workflow of a cohort of general internal medicine residents. Based on our examination of these data, we theorize that the expectations placed on residents as evaluators are undermining their ability to complete high-quality reports and, perhaps more importantly, are at odds with the overarching goal of the evaluation system, that is, continuous quality improvement of the residency program. We found that the evaluation software enabled the GIM training program to place considerable workloads on its trainees, as they were sent, on average, nearly 345 forms to complete per year. It should be noted that this estimate took into account only internal requests; it did not include those made annually by other administrative units and programs. Therefore, at minimum, they were asked to contemplate nearly 19,000 questions during their residency experience. While it could be argued that trainees are not required to answer every question posed to them on an evaluation, and therefore a given form should take only a few minutes to complete, fewer than one-third of the historical cohort submitted all of their forms. Moreover, the residents interviewed in this study expressed that the volume of their evaluation workload left them feeling overwhelmed and frustrated. Rather than feeling an enhanced sense of compliance, they sensed that their overall motivation to continually work on these forms was dissipating. They admitted to using shortcuts, which we have described previously (Myers et al 2012), to help just 'get it done', and in some cases the form was not submitted at all.
Porter and colleagues (2004) commented that even in situations where the number of surveys individuals are asked to complete is low, the process of sending multiple, overlapping requests can stimulate feelings of response burden or 'survey fatigue'. Although we were unable to aggregate the evaluation requests sent per month or rotation across the historical cohort, the total number sent to these trainees, their tendency to submit fewer forms as their training progressed, and the interviewees' remarks about accumulations in their e-accounts together suggest that residents commonly encounter situations where multiple forms are competing for their attention. As such, residency programs need to be vigilant about thresholds for requests made to residents, beyond which the reliability and validity of the appraisals they make may be compromised. Future research into the mental workload expended by postgraduate trainees as raters is warranted.
Griffin and Cook (2009) remarked that many higher education programs have problems 'closing the feedback loop' with their stakeholders, and often do not make explicit the link 
between feedback and institutional change. They posit that in order for stakeholders to remain engaged in, and to trust, the evaluation process, they must experience some outcomes of their feedback. The residents' reactions to, and the submission frequencies for, certain evaluations, like the ratings of academic half-days, raise the question of how well the program is responding to its data. We have previously reported that residents lack knowledge of the evaluation process and sense that their training program values some appraisals, like clinical teaching evaluations, more than others (Myers et al 2012). The results of this study further suggest that these perceptions influence their decisions about whether or not they will ultimately submit an evaluation. We have advocated that residency programs invest in educating their trainees on how they collect, store, and disseminate evaluative data to their clinical faculty and higher administration (Myers et al 2012). We build on this and suggest that efforts to enhance transparency also need to include delivering tangible evidence to postgraduate trainees of how their residency program will prioritize and enact meaningful improvements in response to their critiques.
Although web-based software can be very effective in distributing a surfeit of evaluation requests, the results of our study highlight the need for residency programs to engage in ongoing vigilance of the unintended consequences associated with its use. Although we conducted this study within a single academic center, the prevalence of electronic evaluation systems within medical education suggests that our results will likely resonate with other schools; further research could enhance the transferability of our findings. Future insight may also be gained by examining, at the level of the residency program, how programs manage, review, and evaluate these data in order to close the evaluative feedback loop.