Clinicians have struggled with the capricious nature of predicting surgical outcomes for hundreds of years. If you wander off the beaten track to the basement of the Louvre in Paris, you will come across a black diorite plinth inscribed in cuneiform script from the time of King Hammurabi of Babylon (Figure 1). As early as 1750 BC he was issuing edicts aimed at practising clinicians. The best known are:
‘If a surgeon operates on a free man and the man dies or goes blind then the surgeon should have his hand cut off’.
‘If a surgeon operates on a slave and the slave dies then it is the responsibility of the surgeon to replace the slave’.
At first sight it would appear that little has changed over the intervening four thousand years, but over the past thirty years there has been increasing awareness of the importance of clinical audit and clinical governance as tools for overall quality improvement. Although mortality alone is often used as a quality measure, a number of factors clearly influence the outcome of surgery. The quality and experience of the surgeon and of the anaesthetist, both in preparing the patient for surgery and in performing it, can have a significant effect on outcome. However, patients themselves often bring with them the major prognostic factor for subsequent outcome: their physiological fitness. This may be reflected in their chronic disease status or in the acute physiological disturbance caused by their acute illness. Finally, the procedure itself will have a major effect on surgical outcome.
All these variables are amenable to change. We can expand our clinical knowledge to encompass new procedures. We can contract our practice to those areas in which we can excel. We may be able to improve a patient’s chronic disease status, devise new methods of anaesthesia to minimise risk in particular patients, or amend a patient’s acute physiological disturbance. We can even alter the magnitude of our surgical intervention to a degree. It was probably these thoughts, rather than fear of lawyers and legislators, that led clinicians to look at methods for measuring and predicting the outcome of surgical intervention.
Let us look at some of the methodologies available for predicting and measuring surgical performance and examine the application of clinical audit and outcome measures to this field.
Whatever the contribution, there can be little doubt that regular clinical audit monitoring of process guidelines prevents performance slippage and will identify outliers at an early stage, provided the guidelines are up to date and widely available.
Although the latter has tried to introduce some form of risk adjustment for age, sex, social deprivation and co-morbidity, the methodology is far from accurate and confidence limits are wide. It remains to be seen whether the availability of such SMRs to the general public reassures them of the equality of care or produces patient flows from units with SMRs above 100 to those below 100, despite all units performing within 99 per cent confidence limits. As with many surgeons, the public at large do not always understand complex mathematical models but do understand the concept of good (SMR under 100) and bad (SMR over 100).
Perhaps differing models may provide the solution. Models such as ASA, which merely produce an assessment of high or low risk with various gradations between, clearly do not offer the solution. Neither do similar systems that apportion risk without a numerical individual patient outcome prediction. APACHE requires observation over a twenty-four-hour period, and the worst variables are applied to a mathematical formula which has extensive correction weightings for individual disease conditions. In comparison with the methods discussed previously, it produces an individual numerical patient prediction for mortality, but more variables are necessary and the mathematics can be complex, usually requiring significant hardware and software support. These problems have limited the application of APACHE in general surgery, where successful surgical intervention can have a major and immediate effect on physiological status.
In an attempt to overcome some of these difficulties, general surgeons during the late 1980s began to develop a methodology which would produce an individual patient prediction of both mortality and morbidity, utilising data which were regularly collected and easy to obtain. This led to the development of the POSSUM system (see Tables 6 and 7), first published in 1991, which has now become one of the best known and most widely applied methods for surgical audit. It has been validated in a wide range of surgical specialities including vascular surgery, colorectal surgery, thoracic surgery and general surgery. An orthopaedic POSSUM has recently been described and validated, in which the general equations are still utilised but there are minor modifications to the operative severity score assessment. A modification of the POSSUM system has been devised which is of particular use in individual patient prediction. The p-POSSUM (Portsmouth POSSUM) system has proved to be particularly popular in vascular surgery. The same variables are assessed but a linear rather than logistic model (Table 8) is used, making the mathematics easier to apply and easier to build into locally designed software.
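As an illustration of how such an individual prediction is produced, the sketch below applies the original POSSUM logistic equations to a physiological score (PS) and an operative severity score (OS). The coefficients are those commonly quoted in the published literature and are not taken from the tables of this article, so they should be checked against Table 8 before any use. The function name and the example patient are hypothetical.

```python
import math

# Coefficients (intercept, PS weight, OS weight) as commonly quoted in the
# published literature for the original POSSUM equations. They are NOT copied
# from this article's tables and should be verified before any real use.
POSSUM_COEFFICIENTS = {
    "mortality": (-7.04, 0.13, 0.16),
    "morbidity": (-5.91, 0.16, 0.19),
}


def possum_risk(outcome, physiological_score, operative_score):
    """Return predicted risk (0..1) from ln(R / (1 - R)) = a + b*PS + c*OS."""
    intercept, ps_weight, os_weight = POSSUM_COEFFICIENTS[outcome]
    logit = (intercept
             + ps_weight * physiological_score
             + os_weight * operative_score)
    # Invert the logit to recover the risk R.
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical patient: physiological score 20, operative severity score 15.
for outcome in POSSUM_COEFFICIENTS:
    print(f"predicted {outcome}: {possum_risk(outcome, 20, 15):.1%}")
```

Because the equation is logistic, the prediction describes the expected rate in a population of similar patients, which is why individual case review needs care at the extremes of risk.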
More recently, further refinements of the original POSSUM system have been described specifically for colorectal and oesophageal surgeons. Tekkis et al. have described both a CR-POSSUM for colorectal surgeons and an O-POSSUM (Table 9) for oesophagogastric surgeons. These have the advantage of reducing the variables required for prediction and improving the accuracy for these particular fields of surgery. O-POSSUM is, however, somewhat complex and requires knowledge of individual variable coefficients, similar to the APACHE systems. Unlike the original POSSUM equations, they have not yet been validated in units outside the UK. However, the original estimation data set was obtained from many differing sites across the UK and the variables and weightings are similar to those of the original POSSUM scoring system, so it is likely that their accuracy will be confirmed by other observers. All these adaptations, unlike the original POSSUM system, as yet have no morbidity predictive model, and cross-speciality comparison is, of course, not possible.
Systems such as POSSUM and APACHE, which produce such a prediction, have obvious advantages in this regard. Some authors have suggested that the p-POSSUM mathematical model has advantages in individual case review, and this may well be so in low-risk cases, as both the POSSUM and APACHE models are logistic equations based on populations of patients rather than individuals. Certainly the p-POSSUM and POSSUM systems are the ones recommended by the Royal Colleges of Surgeons of both England and Edinburgh and by NCEPOD, and are probably the methods of choice. The POSSUM system is the only system that produces a numerical prediction of morbidity across the surgical spectrum.
Clinical audit of adverse outcomes can be a particularly depressing affair. While it can be of great value to discuss cases where death occurs and predictive models indicate a risk of death of less than 20 per cent, the opposite end of the spectrum (risk greater than 80 per cent) often yields little audit gain except to discuss whether the operation was indeed indicated. Predictive models of these types can open up a new audit spectrum: patients whose risk exceeds a certain level (for example >50 per cent) but who survive. Audit of these cases can often identify best practice and produce changes in resuscitative protocols which lead to a sustained quality improvement. Such an approach has the added value of making clinical audit an uplifting rather than depressing experience.
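The audit spectrum described above can be sketched as a simple triage rule. The thresholds of 20, 80 and 50 per cent come from the text; the function name, the category labels and the example cases are hypothetical and only illustrate the idea.

```python
def audit_category(predicted_mortality, died,
                   low=0.20, high=0.80, survivor_threshold=0.50):
    """Place one case on the audit spectrum (risks given as fractions, 0..1)."""
    if died and predicted_mortality < low:
        return "unexpected death: review in detail"
    if died and predicted_mortality > high:
        return "expected death: was the operation indicated?"
    if not died and predicted_mortality > survivor_threshold:
        return "high-risk survivor: look for best practice"
    return "routine"


# Hypothetical cases: (predicted risk of death, whether the patient died).
cases = [(0.05, True), (0.90, True), (0.65, False), (0.10, False)]
for risk, died in cases:
    print(f"risk {risk:.0%}, died={died}: {audit_category(risk, died)}")
```

The third category is the one that tends to make audit meetings constructive, since it selects cases in which something went better than predicted.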
Over the past fifteen years, there has been increasing interest in the outcomes of individual units as well as of individual surgeons. If one simply applied mortality rates then, as any mathematician will point out, a radical policy of closing the worst-performing 5 per cent of units each year would, after 10 years, have closed 40 per cent of units and probably still not have improved overall care. Fortunately no country has chosen, to date, to take such a radical decision.
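The 40 per cent figure is consistent with closing the worst 5 per cent of the remaining units once a year for ten years, since 1 - 0.95^10 is approximately 0.40. That annual-closure reading is an assumption made here for illustration, and the function name is hypothetical.

```python
def fraction_closed(years, closure_rate=0.05):
    """Fraction of the original units closed after `years` annual rounds,
    assuming `closure_rate` of the remaining units is closed each year."""
    return 1.0 - (1.0 - closure_rate) ** years


print(f"{fraction_closed(10):.1%}")  # prints 40.1%
```

A ranking-based rule always finds a "worst 5 per cent", however good the remaining units are, which is why the closures never stop.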
Methods that assess individual patient variables would appear to offer the best methodology for assessing surgeon and anaesthetist performance. Table 10 illustrates the marked differences in outcome between surgeons with varying case mix. However, by applying the POSSUM system it is possible to predict the expected number of deaths; comparing this with the actual number yields a ratio (the observed to expected ratio, or O/E ratio) which potentially provides a true quality measure (see Tables 10 and 11). Commercially available software (the CRAB system, available from CRAB Clinical Informatics Ltd) now includes all available POSSUM algorithms and allows, for the first time, analysis of all POSSUM-related quality indicators, from audit aid to surgeon-, unit- and specialty-specific outcome measures.
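A minimal sketch of the O/E calculation follows. It assumes that expected deaths are taken as the sum of the individual predicted risks, which is one simple approach and not necessarily the analysis method used in published POSSUM work or in the CRAB software. The two surgeons and their figures are invented purely to show how case mix can reverse a crude comparison.

```python
def observed_to_expected(cases):
    """cases: iterable of (predicted_mortality, died) pairs for one surgeon or unit.

    Returns (observed deaths, expected deaths, O/E ratio). Expected deaths are
    assumed here to be the sum of the individual predicted risks.
    """
    cases = list(cases)
    expected = sum(risk for risk, _ in cases)
    observed = sum(1 for _, died in cases if died)
    if expected == 0:
        raise ValueError("expected deaths is zero; O/E ratio is undefined")
    return observed, expected, observed / expected


# Hypothetical data: surgeon A operates on low-risk patients, surgeon B on high-risk.
surgeon_a = [(0.02, True)] + [(0.02, False)] * 49
surgeon_b = [(0.25, True)] * 4 + [(0.25, False)] * 16

for name, cases in (("A", surgeon_a), ("B", surgeon_b)):
    observed, expected, ratio = observed_to_expected(cases)
    crude = observed / len(cases)
    print(f"surgeon {name}: crude mortality {crude:.0%}, "
          f"expected deaths {expected:.1f}, O/E {ratio:.2f}")
```

In this invented example surgeon B has ten times the crude mortality of surgeon A but the lower O/E ratio, which is the distinction that simple mortality rates cannot show.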
These techniques have now been widely validated. From personal observation, it would appear that when performance deteriorates, the major differences in unit performance are found in the management of patients whose risk lies between 10 and 80 per cent. Where O/E ratios are persistently above 1.00, examination of individual patient deaths and of the morbidity spectrum, compared with the spectra of similar clinicians or units, can often identify the cause of poor performance. Local complications and wound-related problems are often surgeon related. Respiratory and cardiac problems are often anaesthetist related. Renal and, to a lesser extent, respiratory problems are often related to the availability of appropriate high dependency facilities and the overall quality of nursing services. While these may be oversimplifications, from a personal perspective I have found them to be useful tools over the past ten years when assessing both my own and other units.
It would seem that, when assessing surgical performance, process and structure are best measured using classical clinical audit techniques. When assessing true outcome, be this mortality or morbidity, some form of refined risk adjustment is necessary to avoid the pitfalls of relying on simple mortality or morbidity rates.
Graham Paul Copeland is a Consultant General Surgeon at North Cheshire Hospitals NHS Trust with a special interest in biliary and breast disease. He is the inventor of the POSSUM surgical audit scoring system and the CRAB audit software system.