The Use of Controls in the Assessment of Clinical Evidence

The prize-winning essay from the B.M.A. Essay Competition for Provisionally Registered General Practitioners, 1961. The essay is abridged.

The incorporation of standards or controls into clinical experimentation has become, over the past decade, a widely accepted practice. So much so that there is a danger that the medical profession in general may become uncritical of the practical and, more important, the ethical problems involved. The subject should be under constant review. Are controls really necessary? How best can they be employed? How is the resulting evidence to be assessed? Under what circumstances can the use of controls be justified on ethical grounds?

THE NEED FOR CONTROLS
Progress in medicine depends upon experiment. Formerly the physician based his belief in the efficacy of his simples and mixtures upon either the dogma of his mentors, hallowed by time and seldom criticised, or his own observations upon individual patients. This has led to the perpetuation, even into this supposedly enlightened age, of many remedies whose true worth has never been accurately assessed. Although the majority of the remedies inherited from the 19th century and beyond have been discarded with the [...] of the pharmacopoeia in 19[?]5, the need for objective assessment of the drugs we prescribe has never been greater.
The reasons for this are twofold. The drug houses are pouring out every day a flood of new preparations, many of which are either pharmacologically identical or differ, one from the other, by only a small degree in action or in side effects. This is too much for the practitioner to sort out for himself. Even if he had the time and energy he might not by now have the inclination, for the extravagant claims made on behalf of so many products can breed a cynical indifference, and a [...] to well-tried medicines.
He needs therefore a guide for speedy reference. Though he may not be trained in statistics, if he knows that a drug has been assessed by a standardised scientific procedure, such as a controlled trial, his problems and his scepticism are much diminished.
The second reason is an economic one. That the best should speedily be selected from a multiple choice is very relevant to the cost of the Health Service in this country. Not only should the best be propagated, but the less effective products should just as quickly be discarded.
In diseases where no effective treatment exists, especially when they are either so trivial as to inconvenience the patient only a little, such as the common cold, or are highly malignant, such as acute leukaemia, there may be no necessity or indeed justification for the inclusion of a control group in a trial.

THE CONCEPT OF CONTROLS
The first clear definition of clinical controls has been attributed to Laplace in 1814: "To determine the best of several treatments it is sufficient to try each of them on the same number of patients, keeping all conditions constant ... the superiority of the most beneficial treatment will manifest itself the more, the greater the number of cases." This concept has gained widespread acceptance only during the past ten years. It continues to have its foes. Basically it involves the application of a statistical method to experimentation with human subjects in a manner previously applied only to animal studies. Planning the experiment is now recognised to be a major undertaking, necessitating from the start a clear grasp of the questions it is hoped to answer, and proceeding to detailed specification of the kind of subjects to be included, the treatments to be compared and the measurements to be taken. The points at which different types of trial are particularly vulnerable to bias are more widely appreciated, as is the value of such precautions as randomisation, "blind" comparisons and the inclusion of placebos. The number of successfully completed trials is growing and they serve as models for future work.

ERROR
With the exceptions already mentioned, the inclusion of controls is essential for every kind of clinical trial. Where no control is exercised other than the subjective response of the patient and the clinical impression of the physician, the scope for error is great.
Many investigations into observer variation have emphasised the need to substitute the experimental for the observational approach.
Professor Hill has pointed out, however, that "within the framework of a clinical trial designed to contrast one group with another there is nothing whatever to inhibit the highly gifted clinical observer from observing ..." The observational and experimental techniques are not mutually incompatible; indeed I believe one advantage of the development of controlled trials in Great Britain has been the need for, and the attempt to improve, observations of disease and disease processes.
"The emphasis placed on objective measurements," writes Oswald Savage, "has already resulted in more careful and accurate studies in these chronic diseases and has already produced new observations on the natural history of conditions such as arthritis."
The principle of therapeutic control is to provide for the group of patients who are to undergo a new and yet untried treatment a parallel group of cases similar in all respects, that is as regards all possible contributing causes except the one factor of treatment; or if the behaviour of the disease in a particular patient is to be observed before and after treatment, to make conditions the same in both periods except again for the factor of treatment. Some sources of error may be unsuspected until the control method brings them to light. They represent all conditions which may be important for the origin as well as for the further development of the disease. Sources of error, or "concurrent causes," are divided by Herdan into 'internal' and 'external' causes.
Among the internal causes are, firstly, the characteristics of the patient.
These include sex, age, nutrition and genetic constitution, all of which influence profoundly his response to disease and its therapy. Secondly, there are the characteristics of the disease itself. At different periods of time and in different subjects, particularly in chronic diseases, or in trials lasting months or years, the disease may vary in its stages, in its degree of [...], or in the virulence of an infecting organism. Delay in treatment will also allow evolution in the disease process, and its response to treatment.
Among external causes are, firstly, environmental factors such as financial conditions, family affairs, conditions of employment and various physical disturbances which may substantially affect the course of the disease. Causes may be introduced through the [...] itself. These include the effects of transition from home to hospital (the fact that a patient is admitted into a hospital can [...] new mental and physical [...]), as also the attitudes of doctors and nurses. For example, in organic disease such as colitis where psychical factors are known to play an important part, a tablet, if given with conviction and enthusiasm by the physician, may produce a perceptible clinical [...]. Thus in certain trials neither the doctor nor the patient should know whether he belongs to the trial or the control group, nor, in the case of the patient, that such dichotomy exists; hence the use of the 'double blind' method and dummy tablets.
It is not correct to assume that by introducing clinical controls into a trial we automatically eliminate error. However large a trial, and however [...] control and trial groups may be, owing to the operation of concurrent causes it may lack generality. That is to say that the results, although true for a particular time, place and trial, may not be generally true. The science of therapeutics involves the study of just such larger problems. Comparative therapeutics apart, it is probably true to say that the larger and more widespread a trial, the more reliable its conclusions. Examples of two such experiments are the M.R.C. trial of ACTH, cortisone and aspirin in acute rheumatic fever, which involved the co-operation of many centres in the United Kingdom and the United States, and secondly the trial of the Salk polio vaccines, which involved many thousands of patients.

THE CONSTRUCTION OF CONTROLLED TRIALS
According to the type of disease we are studying, so a particular type of trial is adopted and a particular statistical method applied to the analysis of the results. There are two broad categories: "within-patient" trials, where the patient is his own control, and "between-patient" trials, where control and trial are different patients.
It seems agreed by most authorities that in clinical trials, as in experimental pharmacology, when feasible, comparisons of treatments should be made within subjects rather than between subjects. If each subject receives all the treatments, variations in level of response from subject to subject cancel out when treatment averages are compared. The gain in precision is often striking (5).
A within-subject trial is a simple example of grouping. The general principle is to divide the patients into groups such that differences between groups represent important sources of variation that may inflate the experimental errors (that is, they are stratified according to known concurrent causes); then, if the experiment can be conducted so that each treatment is represented equally often in every group, differences between groups are automatically eliminated from the comparisons of the treatment averages.
There are various designs by which grouping may be applied. A basic form is the simple cross-over pattern where the patients, and the duration of the trial, are divided into two for the comparison of two drugs. For example, 100 patients are included for the testing of 2 drugs over a period of 6 months. For the first 3 months 50 patients, who may either be randomly selected or subgrouped according to some common characteristic, are given drug A and 50 patients drug B. For the second 3 months the treatments are reversed. The patients are their own controls.
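The cross-over allotment just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `crossover_allocation`, the patient numbering and the seed are hypothetical, and simple random selection is used rather than subgrouping by a common characteristic.

```python
import random

def crossover_allocation(patient_ids, seed=0):
    """Randomly split patients into two equal sequence groups.

    Half receive drug A in the first period and drug B in the second;
    the other half receive the drugs in the reverse order.
    """
    rng = random.Random(seed)          # seeded so the allotment is reproducible
    ids = list(patient_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    schedule = {}
    for pid in ids[:half]:
        schedule[pid] = ("A", "B")     # first 3 months A, second 3 months B
    for pid in ids[half:]:
        schedule[pid] = ("B", "A")     # first 3 months B, second 3 months A
    return schedule

# 100 hypothetical patients, as in the example in the text
schedule = crossover_allocation(range(1, 101), seed=1961)
first_period_a = sum(1 for seq in schedule.values() if seq[0] == "A")
print(first_period_a, len(schedule) - first_period_a)  # 50 50
```

Because every patient receives both drugs, each treatment average is built from the same 100 patients, which is what allows subject-to-subject differences to cancel out.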
The same principle can be applied in various more complex designs. An example of a trial may illustrate the method. The relative values of Phenmetrazine and Dexamphetamine in the management of obesity have recently been compared (6). A 'double blind' procedure was used. Three series of tablets of similar appearance were prepared: Phenmetrazine, Dexamphetamine and a control placebo, containing principally Lactose. A cross-over technique was employed on the Latin-square pattern.

1. A.B.C.
2. B.C.A.
3. C.A.B.

The patients were allotted to a particular treatment by a random sequence using a table of random numbers (Bradford Hill, 1955). Thus the two treatments under review were compared one with the other, and in addition, each patient was his own control. With the use of such control grouping far more information can be gained than simply the quantitative success of one treatment over another.
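A three-treatment Latin square of the kind used in this trial, and a random allotment of patients to its rows, might be sketched as follows. The helper names and the 30 patient numbers are hypothetical, and a seeded pseudo-random generator stands in for the printed table of random numbers; the original trial's exact procedure is not reproduced here.

```python
import random

def cyclic_latin_square(treatments):
    """Build a Latin square by cyclic rotation: every treatment appears
    exactly once in each row (sequence) and each column (period)."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)] for row in range(n)]

def allot_sequences(patient_ids, square, seed=0):
    """Shuffle the patients, then deal them out to the rows of the square
    in turn, so that each sequence receives an equal share."""
    rng = random.Random(seed)
    ids = list(patient_ids)
    rng.shuffle(ids)
    return {pid: tuple(square[i % len(square)]) for i, pid in enumerate(ids)}

square = cyclic_latin_square(["A", "B", "C"])
for row in square:
    print(".".join(row))
# A.B.C
# B.C.A
# C.A.B

allotment = allot_sequences(range(1, 31), square, seed=6)
```

Since each tablet appears once in every period, the effect of the period (first, second or third) can later be separated from the effect of the tablet.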
All patients throughout the trial were on a similar low caloric diet. As is always the case with the initiation of a reducing diet, regardless of the drugs given, the weight loss in all groups was greater during the first 6 weeks, due to initial dehydration. Since the precise effect of a treatment depended upon two things, namely the kind of tablet taken and whether it was administered in the first, second or third period, a more accurate result could be obtained if these two influences were disentangled. Grouping made this possible.
Thus an important concurrent cause could be eliminated, namely the evolution of the disease process at a particular time in the trial (in this case the regression of weight loss), and appropriate statistical correction terms could be applied. The conclusions drawn at the end of such a trial are therapeutically and statistically reliable.
In general, it is seldom possible to apply these within-subject methods to acute or rapidly progressive diseases. In fact, in clinical practice, as Doll has pointed out, the opportunities are few for using the patient as his own control because "when one drug has been tested the patient's condition is likely to be so altered, whether by the drug or by nature is immaterial, that nothing is to be gained by repeating the treatment with another drug" (7).
The majority of trials therefore have to be conducted by giving trial and control drugs to different patients, that is, as between-subject trials. The groups of patients should be similar in all respects except the treatment they receive.
In the past, controls were often retrospective, that is to say, attempts were made to compare the effects of a new drug on a group of patients over a period of time with the effects of a previous treatment and a different group of patients. This, as Armitage (8) and others have pointed out, is a practice which should be avoided because other factors may be operating.
For example, patients treated with anticoagulants for coronary thrombosis have been compared with 'control' groups treated prior to the introduction of anticoagulants in 1948. It has been shown that these two groups cannot be fairly compared. The very fact that a specific treatment for thrombosis was available meant that after 1948 a larger number of coronary thromboses of only moderate severity were admitted to hospital than previously. This alone was sufficient significantly to improve the survival rate of cases treated in hospital.
To avoid these and other difficulties in between-patient trials a variety of methods of random selection have been devised. Randomisation precludes the possibility of conscious or unconscious bias on the part of the clinician who is allotting patients to the trial.
Randomisation does not ensure that all groups are exactly equal; nothing can do this. It does ensure that they differ by an extent that is predictable and can be allowed for in the statistical analysis.
It may be thought preferable to take into account all the concurrent causes over which we have some control by methods of sub-grouping or stratification. This is particularly valuable where the numbers involved are small and where it is less likely that 'by chance' the two groups will be homogeneous. This sub-grouping can be incorporated into the system of alternates, known as compensating alternates, or can be combined with the random numbers method, especially if more than two sub-groups or strata are contemplated.
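One way of combining sub-grouping with random allotment is to randomise separately inside each stratum, in shuffled pairs. The sketch below is an assumption about how that might be done, not the procedure of any trial cited here; the function name, the strata labels "mild" and "severe" and the ten patients are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_allocation(patients, seed=0):
    """Allot patients to 'trial' or 'control' within each stratum.

    patients: list of (patient_id, stratum) in order of entry.
    Within a stratum, each successive pair of patients is given the two
    groups in random order, so the groups stay balanced inside every stratum.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for pid, stratum in patients:
        by_stratum[stratum].append(pid)

    allocation = {}
    for ids in by_stratum.values():
        for start in range(0, len(ids), 2):
            pair = ["trial", "control"]
            rng.shuffle(pair)                       # random order within the pair
            for pid, group in zip(ids[start:start + 2], pair):
                allocation[pid] = group
    return allocation

# Ten hypothetical patients: six 'mild' and four 'severe'
patients = [(i, "mild" if i <= 6 else "severe") for i in range(1, 11)]
allocation = stratified_allocation(patients, seed=3)
for pid, stratum in patients:
    print(pid, stratum, allocation[pid])
```

With small numbers this guarantees what simple randomisation can only make probable: that the trial and control groups contain the same proportion of each stratum.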
Two or more groups which at the commencement of a trial appear to be quite comparable may prove, on reassessment at the end, to differ in important respects. In the M.R.C. trial in 1955, some 500 children were divided into three groups to compare the value of ACTH, cortisone and aspirin in the treatment of acute rheumatism. It was found that most of the concurrent causes were strictly comparable, but a marked difference happened to be present between groups in the numbers presenting with chorea (ACTH, 5.6%; cortisone, 11.4%; aspirin, 15.5%) and with congestive heart failure (ACTH, 14.2%; cortisone, 9%; aspirin, 6%). This difference is statistically significant and could therefore influence results, and it illustrates the value of subgrouping (9). Doll points out, however, that it is seldom necessary to have more than a few subgroups, and that if it seems necessary to have a large number it suggests that the treatment is being tried on too heterogeneous a group of patients.
A statistical method which has been increasingly used in recent years is Sequential Analysis. The usual practice in trials is to postpone conclusions until final measurements from all subjects have been gathered. In a sequential trial, on the other hand, a continuous statistical analysis is made as the data from each subject come in. The trial is stopped as soon as the analysis indicates a clear-cut verdict of statistical significance.
This method may allow a reduction in the amount of experimentation by 10-50% as compared with a fixed-size trial of the same discriminating power.
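The idea of analysing the results as each patient comes in can be illustrated with one classical sequential procedure, Wald's sequential probability ratio test, applied to a series of paired preferences. This is a simplified one-sided sketch and is not claimed to be the design used in any trial mentioned in this essay; the function name, the alternative hypothesis `p1 = 0.75` and the error rates are assumptions chosen for illustration.

```python
import math

def sprt_preferences(preferences, p1=0.75, alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test on paired preferences.

    preferences: sequence of 1 (new treatment preferred) or 0 (standard preferred).
    H0: the new treatment is preferred half the time (p = 0.5).
    H1: the new treatment is preferred with probability p1.
    Returns (decision, number_of_pairs_examined).
    """
    upper = math.log((1 - beta) / alpha)   # crossing this boundary accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing this boundary accepts H0
    llr = 0.0                              # running log-likelihood ratio
    n = 0
    for n, x in enumerate(preferences, start=1):
        llr += math.log(p1 / 0.5) if x else math.log((1 - p1) / 0.5)
        if llr >= upper:
            return "stop: evidence favours the new treatment", n
        if llr <= lower:
            return "stop: no evidence of the stated advantage", n
    return "continue: no verdict yet", n

print(sprt_preferences([1] * 20))      # stops after 8 pairs
print(sprt_preferences([0] * 20))      # stops after 5 pairs
print(sprt_preferences([1, 0] * 5))    # no boundary crossed in 10 pairs
```

The saving in patients comes from the early stops: a clear-cut run of results ends the trial long before a fixed-size design would have finished recruiting.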
For its suitable application there are certain requirements:
(i) Patients enter the trial in sequence over a period of time.
(ii) Results should be quickly available.
(iii) There should be potent reason for wishing to stop the trial as soon as possible.
(iv) The primary object should be to perform a test of significance between new and standard treatments rather than a quantitative assessment (10).
The controls in sequential trials are the patients on the standard treatment. Sequential Analysis is the method used in the Stilboestrol-Oestriol trial in metastatic mammary carcinoma, which is currently being conducted in Edinburgh. It provides a good illustration of the technique.
It is designed to show whether there is any significant difference between therapy with a synthetic oestrogen, stilboestrol, and a naturally occurring one, oestriol. The former is the standard control. Patients are allotted to one or other group by a 'double blind' method. The tablets, similar in appearance, are distributed by the sealed envelope system using a table of randomised numbers and applied by a statistician.
In order that progress can be assessed with some accuracy the criteria of admission to the trial are clearly defined, principally definite skin recurrence and/or metastases clearly seen on X-ray.
Each patient is seen at monthly intervals. The nosographic criteria of progress are based on X-ray changes, measurements of skin lesions, including photographic records, and biochemical tests. They are assessed not by one but by a panel of doctors. As soon as it is clear that a patient is deteriorating despite the hormone therapy she is withdrawn from the trial.

THE STATISTICAL EVALUATION OF RESULTS
The construction of therapeutic trials implies a twofold control. The two control concepts may be distinguished by the names of 'clinical controls' and 'chance controls'.
Having arrived at the end of a trial with a quantitative difference in results between trial and control groups, the question then arises as to whether such differences can be regarded as significant. The alternative is that the differences observed are caused by fluctuations of the many causes which comprise 'chance'.
Hence just as important as the clinical control is the control of the observed differences, and this is effected by application of a calculated standard error or standard deviation.
There are three main types of fluctuations or variations of numerical quantities which obey the laws of chance:
1. Random sampling fluctuations of numbers and of relative frequencies (percentages).
2. Biological variations of counts and measurements.
3. Experimental and observational errors.
Once a difference has been observed between the findings in the control group and in the trial group, a statistical significance test must be applied. The appropriate test depends upon the particular nosographic criterion by which the effect of treatment is being assessed. For example, differences in relative frequencies or of percentages of outcome are assessed by the standard error of a percentage and by the chi-square test, and differences between average durations of a disease by the standard error of the mean.
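For a comparison of two percentages, the standard error and the chi-square test mentioned above can be worked through as follows. The counts are invented purely for illustration (30 of 50 trial patients improved against 20 of 50 controls); the function names are hypothetical, and the chi-square is the simple uncorrected form for a 2 x 2 table.

```python
import math

def percentage_difference_test(success_trial, n_trial, success_control, n_control):
    """Difference between two proportions, its standard error under the
    hypothesis of no real difference (pooled proportion), and their ratio."""
    p_t = success_trial / n_trial
    p_c = success_control / n_control
    pooled = (success_trial + success_control) / (n_trial + n_control)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_trial + 1 / n_control))
    return p_t - p_c, se, (p_t - p_c) / se

def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square for the 2 x 2 table
           improved  not improved
    trial      a          b
    control    c          d
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

diff, se, ratio = percentage_difference_test(30, 50, 20, 50)
chi2 = chi_square_2x2(30, 20, 20, 30)
print(round(diff, 3), round(se, 3), round(ratio, 3))  # 0.2 0.1 2.0
print(round(chi2, 3))                                 # 4.0
```

The two approaches agree: the difference is twice its standard error, and the chi-square of 4.0 (the square of that ratio) exceeds 3.84, the conventional 5% point for one degree of freedom.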
For most practical purposes, having established a distribution curve from one control group we can draw an arbitrary line at a distance of two standard deviations from the mean. If an observed value from the trial group lies outwith this line it has a 95% chance of being statistically significant. If it lies beyond thrice the standard deviation it has a 99.7% likelihood of being significant.
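The figures of 95% and 99.7% correspond to the proportion of a normal distribution lying within two and three standard deviations of the mean. Assuming the control measurements are normally distributed (an assumption not stated in the text), the proportions can be checked with the error function; `coverage_within` is a hypothetical helper name.

```python
import math

def coverage_within(k):
    """Proportion of a normal distribution lying within k standard
    deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * coverage_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```

Two standard deviations therefore enclose slightly more than 95% of values; the exact 95% limit lies at about 1.96 standard deviations, which is why "two" serves as a convenient round figure.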

ETHICAL PROBLEMS
The foregoing discussion has centred upon some of the problems which arise with the use of controls, their application and assessment. Considerations remain of far greater importance. They arise in clinical experimentation because of our belief, with Kant, that every human being, irrespective of his mental or social stature, should be regarded as equally important.
It has been said that many of the great men of history were great because they were bigots; that is to say they could only see one side of a problem, their own side. In the same way, anyone, whatever his sphere of activity, dedicated in the pursuit of a policy or line of investigation, may become blind to points of view differing from his own. Since this may be a quite subconscious process, such a person may bitterly resent the imputation that he is not fit to assess objectively the ethical implications of his own work.
Since the use of controls has assumed an integral part of clinical experiments it is very relevant, in discussing their use, to consider the ethics upon which human experimentation as a whole is based.
Can we justify experiments on humans at all? I have discussed earlier some of the material reasons why controlled trials are necessary in the propagation of new therapy. These trials cannot be confined to experimental animals. Findings in other species may have general or specific validity for man, but the ultimate establishment of such validity must rest in each instance upon direct observations upon man.
Shimkin has advanced important considerations for the justification of such experimentation. He points out that an unwillingness to experiment carries with it as much moral responsibility as active experimentation. He says, "... to do nothing, or to prevent others from doing anything, is itself a type of experiment ..." and goes on: "As much knowledge and as weighty reasons are required for one course of action as for the other, and it should be demonstrated that the proposed experiment is more dangerous or more painful than the known results of inaction" (11).
The Nuremberg Medical Trials resulted in the formulation in 1947 of the oft-quoted rules which serve as a guide, though not necessarily as an infallible credo, for the experimenter. Shimkin has reduced these rules to two primary principles:
1. "Investigators must be thoroughly trained in the scientific disciplines of the problem, must understand and appreciate the ethics involved, and must then be competent to undertake and carry out the experiment.
2. The human experimental subjects must agree to the procedure and must not be selected upon any basis such as race, religion, level of education or economic status. In other words, the investigators and their subjects are human beings with entirely equal, inalienable rights that supersede any considerations of science or general public welfare."
How can the use of patients as controls be reconciled with such principles? Guttentag has drawn a distinction between the 'physician-friend' and the 'physician-experimenter'. The former has a personal relationship with his patient, sharing his distress and wanting to assist him. "Objective experimentation to confirm or disprove some doubtful or suggested biological generalisation is foreign to this relationship. This would involve taking advantage of the patient's cry for help and of his insecurity" (12).
There is no doubt that among medical research workers in this country there is a tendency to regard patients as experimental objects, rather than human beings, and since this attitude is principally to be observed in the teaching hospitals it may well be spreading.
The responsibility for care of the patient, and for intensive investigation or experimental procedures on that patient, should be in the same hands, not necessarily the hands of an individual but preferably of a group. The more extensive and potentially dangerous the experiment, the more widely should the responsibility be divided. The nature of the doctor-patient relationship is such that nothing ought ever to be done to the patient except to his direct advantage, unless he gives his consent. As Fox has put it, "it is this consent that changes his status to that of volunteer, in [...] he becomes the legitimate object of experiment" (13). Greiner has listed the circumstances under which he considers that a compromise may fairly be reached.
"In the first place, the risk of drug toxicity must not exceed that run by thousands of patients under treatment. In the U.S. this means drugs and doses that have already been accepted by Government agencies.
"Secondly, each patient-participant shall have a medical condition which, in the ordinary course of events, would be treated by the test agent or another drug with similar action.
"Thirdly, the experimental design shall not permit deterioration of the medical condition.
"Finally, it is apparent that a number of trials would be impracticable if the patient understood exactly the nature of the experiment, or even that he were participating in an experiment at all. An ethical ideal may be too restricting when facing real-life problems." As Greiner succinctly puts it, "a workable standard appeals to me even more than lip-service to a remote principle, however perfect" ( ).
The very fact of an experiment being necessary implies that one treatment is inferior to the other, and therefore that one group of patients will derive more benefit than the other. As long as it is only known for certain to be true in retrospect, then the experiment can be entirely justified on ethical grounds.
While the trial group receive the new treatment the control group are given the best orthodox treatment. Crofton has said that since the control group receives the best treatment previously available it is the safest group from the patient's point of view to belong to. "He may be denied the still improved advantage of the new treatment but he avoids its possible side effects. And, of course, the new treatment may prove to be inferior to the old."
Finally, if we reject controlled clinical trials as ethically inadvisable, what are we to propose as the alternative? Without controls, trials are unreliable; without trials the distribution of drugs may be haphazard and potentially [...]. To quote from Hill, "in many settings the carefully designed controlled trial is far more ethical than the uncontrolled experimentation with unproven products to which patients are frequently exposed" ( 1 ).
It is not sufficient simply to take a passive role nor to obstruct those who are active in the development of new experimental methods. We have an obligation not merely to reap what has been sown, but to plough-under a little more untilled land for the next crop.
By C. Vaughan Ruckley, M.B., Ch.B.