START – introducing a novel assessment of consultant readiness in Paediatrics: the entry not the exit

The Royal College of Paediatrics and Child Health (RCPCH) developed a new end-of-training assessment, held for the first time in 2012, known as START: the Specialty Trainee Assessment of Readiness for Tenure as a consultant. It is a novel, formative, multi-scenario, OSCE-style, out-of-workplace assessment using unseen scenarios with generic, external assessors, undertaken in the trainee's penultimate training year. This paper describes the introduction and structure of this formative assessment. While many other colleges have summative exit exams, this assessment was designed from its inception to be formative, providing feedback on consultant-readiness skills rather than acting as a high-stakes hurdle towards the end of training. It was developed from the College's examinations question-setting group and, following two pilots in 2009 and 2010, the assessment evolved and the first live diet was held in November 2012.


Background
This paper describes the Royal College of Paediatrics and Child Health's (RCPCH) assessment towards the end of specialist paediatric training, known as START, an acronym of the Specialty Trainee Assessment of Readiness for Tenure as a consultant.

Assessments in Paediatric Specialty Training
Paediatric trainees in the UK currently undertake an indicative 8-year programme following successful application into specialist training, usually after the first two postgraduate years, known as Foundation Years. The first three years of specialist training are known as level 1, the next two as level 2 and the last three as level 3 training (Royal College of Paediatrics and Child Health, 2020a).
The RCPCH assessment guide describes the programme of assessment (Royal College of Paediatrics and Child Health, 2020b). An overview table of assessments details those required for progression in each training year, where progress is reviewed at the Annual Review of Competence Progression (ARCP) meeting. These include supervised learning events and workplace-based assessments, including feedback on written correspondence, observation of procedural skills and a number of specific assessments focussing on safeguarding, leadership, handover and acute care of patients.
There is a specific requirement for progression from level 1 to level 2: the College's Membership examination, consisting of 'written' theory papers, now undertaken by computer-based testing. These are usually completed in the first two specialty training (ST) years. The clinical membership examination would usually be expected to be successfully completed by the end of ST year 3. This is a high-stakes, summative, multi-station OSCE testing candidate performance in the areas of communication, history taking and clinical examination technique, and is normally required for progression into ST year 4 and level 2 working (Royal College of Paediatrics and Child Health, 2020c). It is a pass/fail assessment, so the full Membership, both the computer-based tests and the clinical exam, determines progression from the first to the second level of training.
The other progression point is at the end of training. Once training has been completed, a doctor is entered onto the Specialist Register and is given a Certificate of Completion of Training (CCT). This allows application for appointment to the Consultant grade, the most senior grade of doctor in the UK's National Health Service.

Development of the assessment
In 2007 the RCPCH reviewed its requirements for completion of paediatric training. Unlike some other colleges, which hold high-stakes, pass/fail, summative assessments (computer-based knowledge tests, situational judgement tests, or formal clinical viva voce examinations), there was no 'exit' examination in paediatric training. At that time, paediatricians leading training did not want trainees to face a high-stakes hurdle to become a paediatric consultant after eight years of specialist training, with the risk that they might not pass a one-off summative assessment. In that spirit, a formative assessment in the penultimate training year was proposed. The aim was to assess trainees in their final training period (known as 'level 3' training) in different scenarios across multiple domains in the style of an Objective Structured Clinical Examination (OSCE) (Harden and Gleeson, 1979). A multi-station, formative assessment taken in the seventh Specialty Training year, called the 'ST7 Assessment' (ST7A), was devised. Twelve eight-minute stations, mainly using a structured oral as the basis for a directed discussion covering a series of predetermined consultant-orientated scenarios, were assessed by consultants trained as assessors for this assessment, judging key competencies against the agreed standard expected of a newly appointed consultant. Two pilots ran in 2009 and 2010. Data generated from these, and from questionnaires to trainees and assessors, reported positive responses: trainees felt they had not been tested in these areas in other ways in training and welcomed the opportunity to 'think like a consultant' in preparation for consultant posts. Assessors also viewed the assessment favourably (McGraw, 2010).
Following these successful pilots, the General Medical Council (GMC) gave the RCPCH a mandate to include the assessment within the College's assessment strategy and the name of the assessment was changed to START.
Since the assessment is formative, a different lexicon of terminology was developed for discussing START, as opposed to that of the summative College Membership exams. This is detailed in Table 1.

Scenarios and circuit
The stations, known as 'scenarios', cover the following areas: case-based discussion; ward round and handover; logistics and organisation; safeguarding children; critical appraisal of literature; safe prescribing; ethics, consent and law; teaching; and conflict and risk management. At the time of writing, each trainee completes 12 scenarios: six specialty-specific and six general paediatric.
START scenarios are written around real-life clinical, managerial and logistical episodes which the trainee discusses with the assessor. The trainees are given a vignette and have four minutes to think through their approach. For the critical appraisal and prescribing scenarios they have a 45-minute block to prepare set tasks. In an OSCE format the trainees move through the 12 scenarios, holding an 8-minute discussion with an assessor in each one. Knowledge, while tacit, is not the sole determinant of the assessment of performance. Some of the scenarios allow the trainees to demonstrate higher-order skills from Miller's pyramid (Miller, 1990; Cheek, 2010), for example writing a prescription, undertaking a critical appraisal and the real-time micro-teach to medical students, which evolved after the early diets (Reece and Fertleman, 2015).

Feedback on performance
Trainees are graded on their intra-scenario performance during a professional conversation, in the style of Schön (1983), who believed that exploring specific experiences would help learners acquire 'knowing-in-action' if coached by expert practitioners. Assessors are generic rather than specialist, and grade trainees' performance in the scenario across six domains mapping to the GMC's Good Medical Practice (General Medical Council, 2013) as 'further development required', 'performed at expected standard' or 'performed well above the expected standard', recorded as the rating for each item. As well as this, a global rating of 'development needed', 'meets competence', 'above competence' or 'significant concern' is given. The 'significant concern' rating identifies very sub-standard performance in that scenario requiring specific attention. The benchmarking grid structure is shown in Supplementary File 1. Each assessor types feedback on the performance in each scenario directly into an electronic repository during the assessment. This is subsequently reviewed and released to trainees about six weeks after the assessment, following a grammatical, spelling and sense check. All 'significant concern' ratings are scrutinised by senior assessors, both during the assessment and at a review meeting afterwards (the START Executive Committee). The feedback is then available to the trainee and their educational supervisor, and informs a Personal Development Plan supporting targeted learning and training opportunities in the trainee's final training year, documented and evidenced in their learning e-portfolio. The value of START therefore hinges on valuable feedback and support from the trainee's educational supervisor. A document has been produced to enable educational supervisors to support trainees in making the most of the feedback. Access to relevant learning opportunities varies locally within Deaneries.
Performance and feedback at START are not the sole determinants of progression at the Annual Review of Competence Progression (ARCP), but one of the assessment tools alongside workplace-based assessments, multi-source feedback, reflection, trainers' reports and ePortfolio evidence. START is not used to inform a consultant appointment interview panel, as it is not designed for that purpose. After each sitting, the RCPCH surveys trainees and assessors.

Assessors
Assessors are Consultant Paediatricians and Fellows of the RCPCH who have applied to assess START; many are involved in assessment and have a particular interest in education and training. They attend a single day of training and are given a refresher on the day of the assessment itself. In later diets, assessors have received peer review of their in-assessment performance from supporting assessors. This is reviewed and returned to the individual consultant assessor to use in their own education portfolio.

Results
Performance data from all diets to date

Number of trainees and assessors
The numbers of trainees and assessors for each assessment are indicated in Table 2 below. Many assessors have assessed across many diets. The sitting in November 2017 was a two-circuit assessment held outside London; this extra assessment was held that year to ensure that all trainees approaching their final year could access an assessment.
Table 3 details the specialty mix of the START sessions to date.

Data Analysis
A psychometrician reviews all the data and produces a report after each diet, which is reviewed by the START Executive Committee. For the most part the report presents stacked bar charts showing percentages of the descriptor categories. While it is important not to regard the descriptors as a numerical scale, in order to make some statistical sense of the assessment, numerical scales were used to present global ratings, average global ratings, internal consistencies and assessor error bars. The global ratings for all scenarios were calculated by assigning the following numbers to the benchmarking standard scales: Significant concerns = 1, Development needed = 2, Meets competency = 3 and Above competence = 4. Item scores for each domain were calculated similarly: Further development required = 1, Performed at expected standard = 2 and Well above standard = 3.
Cronbach's alpha is calculated to provide a measure of the internal consistency of START; this is a measure of the reliability of the assessment and of whether the scenarios measure the same overarching construct. Separate alpha values were calculated for the global ratings and the item ratings. It is possible to determine the aggregated score of the six competency ratings per trainee per assessment. Alpha values are stable for the whole cohort for the 12 scenarios across diets, with means of α = 0.70 for the global ratings and α = 0.72 for the item ratings. Table 4 details the scores for the START sessions to date.
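The conversion of descriptors to scores and the internal-consistency calculation described above can be sketched as follows. This is a minimal illustration, not the College's actual analysis code: the function name and the sample ratings are hypothetical, while the descriptor-to-score mapping reflects the global rating scale given in the text.

```python
from statistics import pvariance

# Descriptor-to-score mapping for global ratings, as described above
GLOBAL_SCALE = {'Significant concerns': 1, 'Development needed': 2,
                'Meets competency': 3, 'Above competence': 4}

def cronbach_alpha(scores):
    """Cronbach's alpha for a trainees-by-scenarios table of numeric scores.

    alpha = k/(k-1) * (1 - sum of per-scenario variances / variance of totals)
    """
    k = len(scores[0])  # number of scenarios
    # Variance of each scenario's scores across trainees
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of each trainee's total score
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical global ratings for three trainees over four scenarios
ratings = [
    ['Meets competency', 'Above competence', 'Meets competency', 'Meets competency'],
    ['Development needed', 'Meets competency', 'Development needed', 'Meets competency'],
    ['Above competence', 'Above competence', 'Meets competency', 'Above competence'],
]
scores = [[GLOBAL_SCALE[r] for r in row] for row in ratings]
alpha = cronbach_alpha(scores)
```

A higher alpha indicates that trainees who score well in one scenario tend to score well in the others, i.e. that the scenarios tap a common construct; the broad topic coverage discussed later in this paper naturally limits how high alpha can climb.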

Examples of formative feedback
The trainees receive feedback that is not numerical in nature; they receive descriptors for global and item ratings as well as written feedback.
An exemplar of the formative feedback provided to trainees is included in Appendix 1.

Discussion
Over 12 diets the assessment has become embedded and, in the main, well regarded and understood, in comparison to the early days soon after its introduction (Minson, Brightwell and Long, 2012).
In van der Vleuten's (1996) utility model, each variable (reliability, validity, educational impact, acceptability and cost) is weighted according to the importance attached by the user in a particular assessment context, acknowledging the compromises necessary in certain areas of assessment. This model has been used to review START.

Reliability
van der Vleuten et al. (2010) suggest that structured and standardised instruments do not guarantee reliability and that subjective evaluations are acceptable. Global ratings reduce inter-rater reliability, but this is offset by a larger gain in inter-station reliability. START's trainee grading scheme would uphold that reliability. Global ratings are a more faithful reflection of expertise than a checklist (van der Vleuten and Schuwirth, 2005).
Cronbach's alpha as a measure of reliability is acceptable for an OSCE assessment of this nature. It will always be challenging for START to achieve high alpha values, due to the homogeneity of the trainees undertaking the assessment in terms of knowledge and skill (as START is placed at the end of their training programme), the relatively small number of scenarios, and the varying facets of clinical decision making and scenario thought processes being assessed; although these are all key skills for practising as a consultant (the overarching construct), the scenarios cover a broad range of topics.

Validity
In assessing the 'does' at the pinnacle of Miller's pyramid, global ratings, performance on rating scales and written narrative comments on positive and negative points of performance are appropriate (Miller, 1990; van der Vleuten et al., 2010). While usually applied to direct observations in situ, using such formative models in an out-of-workplace assessment is novel. As well as real-time prescribing, teaching and critical appraisal, START allows rehearsal of the professional conversation with a colleague. Some of the 'doing' scenarios are reported as more challenging. Some scenarios allow actual performance within the structured objective format, allowing task competency to be assessed.

Educational impact
There is no doubt that assessment drives learning (Schuwirth and van der Vleuten, 2004), and in that way more senior paediatric trainees make efforts to hone their critical appraisal and prescribing skills, as well as considering the other aspects of the scenario domains. However, the College does not advocate preparation for START as such; training itself should be enough. Now that the assessment is embedded, educational supervisors have more experience of supporting trainees through the aftermath of the assessment feedback and of interpreting the feedback into a useful Personal Development Plan for the final year. This constructive alignment (Biggs, 1996) maps their intra-assessment experience to a documented and evidenced outcome within their e-portfolio as they move into their final training year. Much of the subject matter has been shown to be helpful not only for consultant working once appointed, but also for the transition into the role, especially the consultant interview, which may probe a trainee's thinking on the way to appointability in that role (Reece and Foard, 2020, in press).

Cost
No assessment is without resource implications, but cost can be offset for the organisation by careful budget management and by the economy of assessing many trainees in one sitting. Multi-station assessments, like this one, are efficient in that respect.
Early on, trainees paid separately to sit START, but it is now offered as part of the cost of training, included in their annual training fee to the College.

Acceptability
Assessment needs to be acceptable to both students and faculty. START has survived its first six years and, in that time, 13 diets without mutiny from either. That is not to say there have not been challenges, some of which are discussed in the linked paper (Reece and Foard, 2020, in press). As one of the tools for determining progress and supporting learning in the workplace by giving direction, it has inherent value in the paediatric training assessment portfolio.
Much is made of the London-centric nature of the assessment, held in the RCGP Examination Centre in London (which challenges the notion that it is 'not an exam' but a formative assessment). The logistics of needing to assess a large number of trainees from some 20 sub-specialties mean that three concurrent circuits are run in two sessions over two days. The exception to this was the extra 'half diet' (two circuits only, over one day) undertaken in November 2017.
The smaller numbers allowed us to move that assessment to a venue in the Midlands, demonstrating a level of flexibility and, by moving away from central London, reducing travel logistics for some trainees.

Conclusion
The RCPCH has successfully conceived, piloted and introduced a novel assessment for senior paediatric trainees towards the end of specialty training, bringing externality at this stage of training. The formative nature of the assessment gives trainees areas of development to work on in their final year. The domains of the multi-scenario, OSCE-style assessment map to the domains of Good Medical Practice and map readily to the GMC's Generic Professional Capabilities (General Medical Council, 2017). As increasing numbers of trainees have taken the assessment, it has become embedded as a useful tool, providing trainees with feedback to help them develop further in the final training year in preparation for consultant readiness and supporting their transition.

Take Home Messages
A novel, mandatory, formative, multi-scenario, OSCE-style assessment has been successfully introduced in the penultimate year of paediatric specialty training.
Aspects of this assessment hold up well to a described utility model for assessment methods, including reliability, validity, educational impact, acceptability and cost.

Table 1 .
Terminology used in the START assessment compared to exams

Table 2 .
Number of trainees and assessors for START

Table 3 .
The mix of trainees from different specialities over each assessment

Table 4 .
Cronbach's alpha for the whole cohort (number of trainees given in brackets) for global and item ratings, having converted descriptors to scores as indicated above. *Extra cohort, therefore smaller n and only one day.
Notes on Contributors
Dr Ashley Reece is a Consultant Paediatrician and Medical Educator. He has been involved in the Royal College of Paediatrics and Child Health examinations and assessments for 15 years and was the first Chair of the START Assessment Board between 2012 and 2016. He is currently the College's Officer for Assessment. He successfully completed an MA in Medical Education in 2017.
Lucy Foard is a Psychometric Researcher at the Royal College of Paediatrics and Child Health. She has worked for the psychometric team within the College for 11 years, having previously held the roles of Psychometric Analyst and Psychometrician. She provides psychometric advice and guidance to other Royal Colleges and sat on the panel which developed guidelines for standard setting postgraduate examinations for the Academy of Medical Royal Colleges.

References
Reece, A. and Fertleman, C. (2015) G187(P) Aiming for the apex: realtime assessment of teaching using medical students in a compulsory, multi-station postgraduate assessment to assess the 'does' at the top of Miller's pyramid. Archives of Disease in Childhood. 100(Suppl 3), pp. A80-A80.
Reece, A. and Foard, L. (2020) START - evaluating a novel assessment of consultant readiness in paediatrics: the entry not the exit. Medical Teacher. 42(9), pp. 1027-1036.
Reece, A., et al. (2015) START - a novel assessment of consultant readiness for paediatric trainees in the UK. Proceedings of the Association for Medical Education in Europe Annual Conference. (September), p. 173. (Accessed: 29/01/2016).
Royal College of Paediatrics and Child Health (2020a) Training guide. (Accessed: 06 Dec 2020).
Royal College of Paediatrics and Child Health (2020b) Assessment guide. (Accessed: 06 Dec 2020).
Royal College of Paediatrics and Child Health (2020c) Examinations. (Accessed: 09 Aug 2021).
Schön, D. (1983) The reflective practitioner: how professionals think in action. New York: Basic Books.
Schuwirth, L. and van der Vleuten, C. (2004) Merging views on assessment. Medical Education. 38(12), pp. 1208-1210.
van der Vleuten, C. P. M. (1996) The assessment of professional competence: developments, research and practical implications. Advances in Health Sciences Education. 1(1), pp. 41-67.
van der Vleuten, C. P. M. and Schuwirth, L. W. (2005) Assessing professional competence: from methods to programmes. Medical Education. 39(3), pp. 309-317.
van der Vleuten, C. P. M., et al. (2010) The assessment of professional competence: building blocks for theory development. Best Practice and Research Clinical Obstetrics and Gynaecology. 24(6), pp. 703-719.
van der Vleuten, C. P. M. (2016) Revisiting 'Assessing professional competence: from methods to programmes'. Medical Education. 50(9), pp. 885-888.