Education data tool help

What do the results mean?

How do I read reports?

Main reports

The year-on-year report views show the results in a coloured grid.

Each cell contains the mean score and a colour that indicates whether the score is an outlier.

  • Red: a red outlier is a score in the bottom quartile (Q1) of the benchmark group, and the confidence interval does not overlap with that of the benchmark mean.
  • Pink: a score in the bottom quartile, but the confidence interval overlaps with that of the benchmark mean.
  • White: a score in between the top and bottom quartiles of the benchmark group.
  • Light green: a score in the top quartile (Q4), but the confidence interval overlaps with that of the benchmark mean.
  • Green: a green outlier is a score in the top quartile of the benchmark group, and the confidence interval does not overlap with that of the benchmark mean.
  • Grey: fewer than three results (n<3). We only report results which have three or more responses.
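The colour rules above can be sketched as a small function. This is an illustration only, not the tool's implementation; the handling of scores that fall exactly on a quartile cut-off is an assumption here.

```python
# Illustrative sketch of the colour key (not the tool's actual code).
# `overlaps_benchmark` means the report group's confidence interval
# overlaps that of the benchmark mean.
def cell_colour(n, score, q1, q3, overlaps_benchmark):
    if n < 3:
        return "grey"        # fewer than three responses: not reported
    if score < q1:           # bottom quartile of the benchmark group
        return "pink" if overlaps_benchmark else "red"
    if score > q3:           # top quartile of the benchmark group
        return "light green" if overlaps_benchmark else "green"
    return "white"           # between the top and bottom quartiles
```

For example, `cell_colour(10, 30, 40, 75, False)` returns `"red"`, while the same score with an overlapping confidence interval returns `"pink"`.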

Question item reports

Question item reports display as a vertical bar chart or table.

In the chart, the question text is shown at the top of the page and the n or n range (the number of doctors who answered the question) is shown below the chart.

You can hover over the bars to see percentages for each answer.

In the chart view you can select other questions that make up the indicator.

What are indicators and how are indicator scores calculated?

What are indicators?

In the survey doctors answer questions based on their experience of training. Questions are grouped by theme and we refer to these groupings as indicators. We use indicators to measure how doctors feel about specific areas of training.

How are indicator scores calculated?

We use each doctor's score to build reports. For example, to show results by site we average the scores of doctors at that site; to show results by specialty we average the scores of doctors in that specialty.

An example of how we average doctors' scores. The image shows three sample questions with their scores: question 1 scored 75, question 2 scored 100 and question 3 scored 50. The average score for this example is therefore (75 + 100 + 50) / 3 = 75.
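The worked example above can be reproduced in a few lines:

```python
# Averaging a doctor's question scores to get an indicator score,
# as in the worked example above.
scores = [75, 100, 50]                      # questions 1 to 3
indicator_score = sum(scores) / len(scores)
print(indicator_score)                      # 75.0
```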

What are benchmarks?

What are benchmark groups?

Results are calculated by comparing a report group to a benchmark group. Doctors' scores contribute to the score for the report groups they are in (specialty, site, training level, etc). The scores are also used to calculate the benchmark score.

The benchmark group is the group of respondents whose scores you are comparing your report group to. For example if your report group is general surgery your benchmark group would be all surgical specialties combined.

The report group is the group you are interested in looking at. For example you can look at a post specialty by trust/board report to see how General psychiatry at your trust compares to General psychiatry at other trusts. In this example the benchmark group would be all psychiatry.

In the example below, in the benchmark group of All psychiatry (including General psychiatry) six out of 12 doctors responded positively. This gives a benchmark score of 50. In the report group General psychiatry, two out of three doctors responded positively. This gives a mean score of 66.67 for the report group. Dr A's score contributes to both the report group and the benchmark score.
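The two scores in the example work out as follows, treating a score as the percentage of positive responses (as the example does):

```python
# Report group vs benchmark group scores from the example above.
benchmark_positive, benchmark_total = 6, 12   # All psychiatry (incl. General)
report_positive, report_total = 2, 3          # General psychiatry

benchmark_score = 100 * benchmark_positive / benchmark_total
report_score = 100 * report_positive / report_total
print(round(benchmark_score, 2), round(report_score, 2))  # 50.0 66.67
```

Note that the report group's responses are counted in both calculations, which is why Dr A's score contributes to both the report group and the benchmark score.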

The image shows icons representing the doctors and their responses.

Because different reports compare different groups, they use different benchmark groups. For example Dr A is in an F1 General psychiatry post in Big City mental health trust. The scores calculated from Dr A's answers will contribute to, and be compared against, different benchmark groups in the different reports.

A Venn diagram with three circles of people: 'all psychiatry posts', 'all foundation trainees' and 'all trainees in the UK'. Dr A sits at the intersection of all three circles.

How do we use benchmark and reporting groups?

Reports compare the doctor's reporting group (eg level or specialty) against a benchmark group.

Different reports use different benchmark groups. This is to make sure that comparisons of groups are fair.

For example if you filtered the post specialty by site report for general surgery at a specific hospital, the report would show you how that group compared with all surgery posts at all hospitals.

How to use the benchmark tables to find your benchmark group

  • Find the reporting and benchmark groups table for the report you are looking at
  • Look for your specialty, training level etc to see the benchmark for your reporting group
  • Use the benchmark groups table for more detail on what is included in the benchmark group.

Read the reporting and benchmark groups tables

What are outliers and how do we calculate them?

What are outliers?

Outliers are scores that are significantly higher or lower than the average score. In the education data tool they are shown as red or green flags.


How do we calculate outliers?

To calculate outliers, we first calculate the benchmark group scores. We sort all of the scores in the benchmark group into order and split them into quarters (quartiles).

We also calculate the mean score for the benchmark group (the national mean). The national mean is included in the data downloads.

Scores in the bottom (Q1) and top (Q4) quartiles generate outliers. These are shown as red (below) and green (above) flags.

A visual representation of the score outliers. The two lower scores fall within the lower quartile (25%) and the two higher scores within the upper quartile (75%).

What are confidence intervals?

Confidence intervals are the range of values within which, to a certain level of confidence (95% in the NTS), we expect the 'true' mean value to lie, accounting for random error. That is, for 95% of the confidence intervals calculated this way, the true mean lies within the range.

Hospital A: 100 doctors complete the survey, most of them agree that training is good. We have lots of data and a high level of agreement within the group. This means that we can be confident about the score and the confidence interval is small.

Hospital B: 3 doctors complete the survey, each of them gives a different response. We have a small amount of data and no agreement within the group. This means that we are less confident about the score and the confidence interval is large.

A diagram of hospital A and hospital B. Hospital A has lots of survey responses and a high level of agreement. It has a small confidence interval. Hospital B has few responses, with little agreement. It has a large confidence interval.
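As an illustration of why the two hospitals' intervals differ in width, here is a standard 95% confidence interval using the normal approximation. The exact method the survey uses is not described here, so this formula and the sample data are assumptions for illustration only:

```python
import math
import statistics

def ci_95(scores):
    """95% confidence interval for the mean, using the normal
    approximation (an illustration; the survey's exact method may differ)."""
    mean = statistics.mean(scores)
    half_width = 1.96 * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean - half_width, mean + half_width

hospital_a = [75] * 90 + [100] * 10   # many responses, high agreement
hospital_b = [0, 50, 100]             # few responses, no agreement

# Hospital A's interval is narrow; Hospital B's is much wider.
```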

Quartiles are also used to determine the flags. The first quartile is the value below which 25% of indicator scores lie and above which 75% lie. The third quartile is the value below which 75% of indicator scores lie and above which 25% lie.

We calculate the confidence intervals for the national mean and the report group. Finally, we compare your score and confidence interval to the benchmark score and confidence intervals. This shows us whether your score is a red or a green flag.

Scores in the bottom quartile whose confidence intervals span across the quartile boundary are flagged pink rather than red; likewise, scores in the top quartile whose intervals span the boundary are flagged light green rather than green.
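The quartile steps can be sketched as follows, using made-up benchmark scores and Python's `statistics.quantiles` to find the cut points. The flag rules here follow the figure descriptions (an interval that spans the quartile boundary downgrades red to pink, or green to light green); the data and intervals are invented for illustration:

```python
import statistics

# Made-up benchmark scores, sorted into order.
benchmark_scores = [33, 44, 55, 60, 62, 67, 70, 74, 78, 82, 85, 90]

# Quartile cut points (default "exclusive" method; the survey's
# exact quantile method is an assumption here).
q1, _median, q3 = statistics.quantiles(benchmark_scores, n=4)

def flag(score, ci_low, ci_high):
    """Flag a report group score given its confidence interval."""
    if score < q1:
        # Interval entirely below Q1 -> red; spanning the boundary -> pink.
        return "red" if ci_high < q1 else "pink"
    if score > q3:
        # Interval entirely above Q3 -> green; spanning it -> light green.
        return "green" if ci_low > q3 else "light green"
    return "white"
```

For instance, a score of 44 whose whole interval sits below the first quartile flags red, while the same score with an interval reaching above the cut point flags pink.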

Below are some examples of how the scoring system works.

Figure 1: Hospital A has scored 33 on the indicator. This falls in the first quartile with confidence, so it scores a red flag.

Figure 2: Hospital B has scored 44 on the indicator. This falls in the first quartile, but its confidence interval spans across the quartile boundary, so it scores a pink flag.

Figure 3: Hospital E has scored 67 on the indicator. It falls in neither the first nor the third quartile, so it scores a white flag.

Figure 4: Hospital H has scored 78 on the indicator. This falls in the third quartile, but its confidence interval spans across the quartile boundary, so it scores a light green flag.

Figure 5: Hospital H has scored 82 on the indicator. This falls in the third quartile with confidence, so it scores a green flag.

What happens next?

Deaneries/Health Education local offices

The results are used as a screening tool by deaneries/Health Education local offices to help them identify areas that might need further investigation.

Usually, survey results are triangulated with other sources of information to help ensure that resources are allocated to the right areas.

Deaneries/Health Education local offices and local education providers will review their regional and local results straight away and start planning their quality assurance and improvement activity for the next few months. Deaneries/Health Education local offices report back to us periodically through their dean's reports.

Royal colleges

Royal colleges use results from the programme specific questionnaires to monitor the delivery of curricula.

General Medical Council

We use survey data to help quality assure medical education and training across the UK.