Scientific research in homeopathy: to Prove or to Improve
Twenty years ago science seemed simple: If a few Randomized Controlled Trials (RCT) give significant results you have scientific proof that a method works. In 1991 the first meta-analysis of homeopathic trials showed that proof for homeopathy is as good as for conventional medicine. Coincidence or not, since then RCT is no longer a method without flaws. There are bad RCT and positive homeopathic RCT must be bad. Why?
The same people who demand proof for homeopathy are convinced that homeopathy cannot work.
This conviction is based on two misconceptions:
1. Homeopathic doctors expect the same results from a homeopathic potency of a medicinal substance as from a pharmaceutical dose. Vandenbroucke states: “Microbiologists know for sure that infinite dilutions of an antibiotic will never show any effect on bacterial growth”.
2. Medicines can only work as conventional medicines via chemical interactions. Vandenbroucke: “Accepting that infinite dilutions work would subvert more than conventional medicine; it wrecks a whole edifice of chemistry and physics”.
It is amazing how these convictions have become dogmas. No homeopathic doctor will expect any effect on bacterial growth from an infinitely diluted antibiotic. Does the world only consist of chemical interactions? Based on these convictions positive results of homeopathic research can only prove fraud by homeopathic doctors. The only good homeopathic RCT is a negative homeopathic RCT; it will be qualified as good more easily because quality judgments of RCT are quite subjective. And beware, any negative result can and will be used against us!
Is it likely that a homeopathic RCT will produce a negative result, even if the method works? Homeopathy is not a perfect method; we are uncertain about many entries in our materia medica and repertories. Furthermore, RCT is based on indications, and homeopathic medicines cannot be prescribed on indication alone.
Practice or experiment
RCT is an experiment where you want to optimize the possibility to discern verum from placebo. Any social circumstance can affect this possibility. An attractive experimental opportunity is a marathon; it is limited in time and the same for all participants. So we measure the effect of homeopathic Arnica or Rhus toxicodendron on muscle soreness after a marathon. But is this illness? We think that a homeopathic medicine stimulates the defense mechanisms by a stimulus that resembles illness. Marathon runners, however, are highly trained; what is there to stimulate? Many of us have the experience that we must stop a homeopathic medicine after some period because the complaints become worse again. Then the complaints subside; apparently the patient was over-stimulated. There have been four high quality trials testing Arnica for this indication and it appeared that muscle soreness was less in runners taking placebo (nearly significant). This could be proof for the possibility of over-stimulation by homeopathic medicines, but to our opponents it is proof that homeopathy does not work.
Another way to optimize the experiment is to measure the effect on something that can be easily measured like plantar warts. This indication was tested for the homeopathic medicines Thuja, Antimonium crudum and Nitricum acidum in a highly qualified RCT by Labrecque et al. The outcome was negative; does this mean that homeopathy does not work or that these three medicines are not the optimal choices? Needless to ask our opponents.
These two indications played an important role in the negative conclusion of the meta-analysis by Shang et al. The quality of the trials was good in epidemiological respect, but they reflect one of the problems of homeopathic research: there are no phase I and phase II studies paving the way toward a sure outcome, like in conventional medicine.
Many of us will answer to the examples above that this proves that we should use classical homeopathy. Unfortunately, there is no evidence for this at the moment. In the meta-analyses of Linde and Shang classical homeopathy did not perform better than other forms of homeopathy.
A clear example is Walach’s trial on chronic headaches, a very good trial with negative outcome. Walach explained that the discontinuation of conventional prophylactic medicines could have some effect in both verum and control groups, and that the main problem in classical homeopathy is the long duration of the consultations. This might cause an extra placebo effect, so that the verum effect is less visible, see Figure 1.
One of the main advantages of classical homeopathy is that we can systematically test hypotheses. Based on a number of symptoms we choose the most likely medicine. If it does not work, or works only partially, we re-analyze the case and prescribe another medicine. But does RCT provide us with enough time to test several medicines one after another? And how reliable are our repertories and materia medicas?
Preparations for RCT
The influence of the questionable quality of our instruments is shown by the five-year preparation for a positive RCT on ADHD. Four consecutive steps modifying the conventional homeopathic procedure improved the success rate of the first prescription from 21% to 54% and the effectiveness of the fifth prescription from 68% to 84%. The first step was to develop and test a questionnaire. Questions that did not lead to successful prescriptions were removed from the questionnaire. The next step was called ‘polarity analysis’. Many homeopathic medicines exhibit both poles of a symptom: there are thirsty Phosphorus patients and there are Phosphorus patients without thirst. But the ‘mean Phosphorus patient’ is thirsty. In Kent’s repertory Phosphorus is in bold type in the rubric ‘Thirst’ and in plain type in the rubric ‘Thirstless’. Does this mean that ‘thirstlessness’ pleads for Phosphorus? No, the explanation follows under the paragraph ‘Bayesian science’. Analyzing and calculating polarities improved the success rate. Furthermore, Q-potencies were used because they have less fluctuation in treatment effect. Figure 2 shows the influence of the consecutive steps on results of treatment.
Another step necessary to obtain significant results was to insert a screening phase before the actual trial in order to exclude non-responding patients from the trial.
Figure 2: Stepwise improvement of results by modifications of homeopathic diagnostic procedure:
These results suggest that every RCT should be prepared in this fashion. This could take several years. Such a preparation fulfills the same role as phase I and phase II studies in conventional medicine. But then there is still the problem that our materia medica and repertories are largely based on expert opinion instead of scientific research. This problem resembles conventional diagnostics: we know from expert opinion that night-sweat could indicate tuberculosis, but scientific assessment should indicate how strong the indication is. Here Bayesian science comes to help.
In 1763 the theorem of the Reverend Thomas Bayes was published posthumously; it describes the way we learn from experience. He showed how we can make valid predictions about the future from past experiences. Bayesian reasoning has since invaded every field of science, because it can produce valid conclusions about real-world phenomena. Now medicine can be assessed without placebo control and randomization, and science can be used for improving instead of proving homeopathy.
From experience to prognosis
The Bayesian principle is in fact quite simple: a diagnostic test is better the more frequently it is positive in people with the disease compared with other people. Hahnemann also made this observation and in the same fashion he concluded that rare and peculiar symptoms are the most valuable symptoms. The likelihood ratio (LR) expresses the relation between the prevalence of the symptom in the population with the illness and the population without the illness. In the homeopathic translation: the relation between the prevalence of the symptom in the population cured by a certain medicine and the rest of the treated population.
A symptom with a higher LR is more important. Peculiar symptoms have high LR because they are specific for just a few medicines. The Bayesian formula is as follows:
Posterior odds = LR x prior odds
If a symptom is as frequent in the population cured by a certain medicine as in the rest-population LR=1. Such a symptom gives no indication at all for this medicine. When a medicine is frequently prescribed, like Phosphorus, we will see a number of patients who are thirstless, but when this occurs as frequently as in the rest-population, LR=1 and the symptom is no indication for Phosphorus.
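The relation between LR, prior odds and posterior odds can be made concrete in a few lines of Python. This is only a minimal sketch: the function names and the example prevalences are illustrative assumptions, not data from any assessment.

```python
def likelihood_ratio(prev_in_medicine_pop, prev_in_rest_pop):
    """LR = prevalence of the symptom in the population cured by the medicine
    divided by its prevalence in the rest of the treated population."""
    return prev_in_medicine_pop / prev_in_rest_pop


def posterior_probability(prior_probability, lr):
    """Apply 'posterior odds = LR x prior odds' and convert back to a chance."""
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical prevalences, chosen only for illustration.
lr_uninformative = likelihood_ratio(0.10, 0.10)  # equally frequent in both groups
lr_peculiar = likelihood_ratio(0.20, 0.02)       # ten times more frequent

# Hypothetical prior chance of 5% that the medicine will cure.
print(posterior_probability(0.05, lr_uninformative))
print(posterior_probability(0.05, lr_peculiar))
```

With LR=1 the chance stays at the prior of 5%; with LR=10 the same prior rises to roughly 34%.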
The choice of a homeopathic medicine is usually not based on one fact (diagnosis). In Bayesian perspective we can describe the decision process of a homeopathic physician: if we add symptoms, our certainty about the curative effect of a medicine will grow; if our symptoms are better (e.g. if they are peculiar) our certainty will grow faster. Suppose that the chance that a homeopathic medicine cures is 1% if there are no symptoms; then our conviction that Rhus toxicodendron will be curative grows as follows with three subsequent symptoms (in this example odds are translated into chance by the computer):
This is a normal procedure in homeopathy. The patient visits the doctor because of joint pains. The homeopathic doctor then asks about circumstances that influence the complaint. If the patient tells him that the pain is ameliorated by motion, the homeopathic doctor thinks of Rhus toxicodendron as one of the possibilities. If further investigation shows that the patient has a definite desire for cold milk, his expectation that Rhus toxicodendron will help grows. The last symptom, aggravation from wet weather, is subsequently enough to prescribe this medicine.
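The stepwise growth of certainty described above can be sketched as repeated application of the Bayesian formula. The LR values below are hypothetical placeholders, not assessed values for Rhus toxicodendron. The sketch also assumes that the symptoms are independent of each other, which real symptoms may not be.

```python
def update(probability, lr):
    """One Bayesian step: chance -> odds, multiply by LR, odds -> chance."""
    odds = probability / (1.0 - probability)
    odds *= lr
    return odds / (1.0 + odds)


# Hypothetical LR values, for illustration only.
symptoms = [
    ("Pain ameliorated by motion", 4.0),
    ("Desire for cold milk", 6.0),
    ("Aggravation from wet weather", 5.0),
]

chance = 0.01  # prior chance of 1% with no symptoms, as in the text
history = [chance]
for name, lr in symptoms:
    chance = update(chance, lr)
    history.append(chance)
    print(f"{name}: LR={lr:g} -> chance {chance:.1%}")
```

With these assumed values the chance passes 50% only after the third symptom, which mirrors the description of the last symptom being enough to prescribe.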
A new repertory
Bayesian thinking is a perfect starting point for a scientific update of the homeopathic method. Homeopathic symptoms can be assessed as diagnostic instruments, like any other diagnostic test, e.g. ultrasonography. In conventional medicine we look for the relation between test and diagnosis; in homeopathy we look for the relation between symptom and cure.
Computers make it possible to collect an enormous amount of data about our prescriptions with a minimal amount of work. Then it is easy to evaluate which cases were successful and which symptoms led to these cases. And which did not!
Type and LR
A correct repertory would indicate the difference between the medicine-population and the rest of the population, expressed as LR. A homeopathic medicine should be present in the rubric when the population responding to that medicine shows the rubric-symptom clearly more often than the rest of the population, say more than one and a half times as often. Table 1 shows a possible schema for translating LR into typeface. Such a schema should be evaluated in due course for correspondence with actual practice.
Table 1: Possible translation of likelihood ratio into typeface
The intermediate values represented by Italics are just a rough estimation; further research must indicate optimal values.
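One way to read such a schema is as a simple lookup. The sketch below assumes cut-off values of 1.5, 3 and 6; these are inferred from the prevalences of 6%, 12% and 24% (against 4% in the whole population) used in the ‘Fear of death’ example that follows. Whether a value exactly on a cut-off counts is also an assumption.

```python
# Cut-offs are assumptions inferred from the 6% / 12% / 24% example in the text.
PLAIN_MIN_LR = 1.5
ITALICS_MIN_LR = 3.0
BOLD_MIN_LR = 6.0


def typeface_for_lr(lr):
    """Return the typeface for a repertory entry, or None if LR is too low."""
    if lr >= BOLD_MIN_LR:
        return "bold"
    if lr >= ITALICS_MIN_LR:
        return "italics"
    if lr >= PLAIN_MIN_LR:
        return "plain"
    return None


def minimum_prevalence(population_prevalence, min_lr):
    """Prevalence needed in the medicine-population to reach a given LR,
    approximating the rest-population by the whole population."""
    return population_prevalence * min_lr


for lr in (1.0, 2.0, 4.5, 8.0):
    print(lr, typeface_for_lr(lr))

# With a symptom prevalence of 4% in the whole population:
for cutoff in (PLAIN_MIN_LR, ITALICS_MIN_LR, BOLD_MIN_LR):
    print(cutoff, round(minimum_prevalence(0.04, cutoff), 2))
```

Keeping the cut-offs as named constants makes it easy to adjust them once further research indicates better values.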
An example: ‘Fear of death’
The Committee for Methods and Validation of the Dutch homeopathic doctors association started the first prospective assessment of six homeopathic symptoms in June 2004. The planned duration is three years. After two years we have 2266 evaluated prescriptions. One of the assessed symptoms is ‘Fear of death’. The rubric ‘Fear of death’ contains 103 medicines in Kent’s repertory. There are 103 patients in our assessment with this symptom. The prevalence of the symptom in the whole population is 4%. According to the translation of LR into type proposed above we expect a prevalence of 6% in the population cured by the medicine before entering the medicine in the rubric. If the prevalence in the medicine-population is 12% it can be entered in Italics, and if the prevalence is over 24% it should be entered in bold type.
The results are shown in Table 2. The expected prevalence is the prevalence we expect according to the existing entry in the repertory: Calcarea carbonica is in bold type, so we expect 24% (= 6 times 4%) of the Calcarea patients to have a fear of death. We used exact binomial calculations (one-tailed) to calculate P-values, and the normal approximation of the binomial distribution if (number of patients cured by the medicine) x (expected prevalence of symptom) > 5.
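The kind of calculation described above can be sketched with the Python standard library. This is an illustration under stated assumptions, not the committee's actual procedure: the patient numbers are hypothetical, the test direction (observing as few or fewer cases than expected) is one possible choice, and the continuity correction in the normal approximation is our own addition.

```python
import math


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lower_tail_p(k, n, expected_prevalence):
    """One-tailed P(X <= k) if the true prevalence equals the expected one.
    Uses the normal approximation when n * p > 5, otherwise the exact sum."""
    if n * expected_prevalence > 5:
        mean = n * expected_prevalence
        sd = math.sqrt(n * expected_prevalence * (1 - expected_prevalence))
        return normal_cdf((k + 0.5 - mean) / sd)  # continuity correction (assumption)
    return binom_cdf(k, n, expected_prevalence)


# Hypothetical example: 50 patients cured by a medicine entered in bold type
# (expected prevalence 24%), of whom only 4 showed the symptom.
n_cured, n_with_symptom, expected = 50, 4, 0.24
print("approximate:", lower_tail_p(n_with_symptom, n_cured, expected))
print("exact:      ", binom_cdf(n_with_symptom, n_cured, expected))
```

A small P-value here would suggest that the observed prevalence is lower than the existing entry implies.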
Table 2: Entries from the repertory-rubric ‘Fear of death’ compared with prospective assessment in 10 practices during two years; also LR for each medicine, confidence interval (CI) is given if LR is significant.
The entry of Anacardium should be upgraded, probably to bold type (p=0.938, the p-value in the table is the probability that LR>1.5). The bold entry of Calcarea carbonica is incorrect, plain type might still be correct (p=0.675), but even that is uncertain. The same goes for Phosphorus (p=0.699). Lycopodium still might be possible, but plain, not in Italics (p=0.599). Natrium muriaticum should not be in this rubric.
Interim results after two of the planned three years of investigation are now available. At this moment we have evaluated 56 LR values considering six repertory rubrics (‘Diarrhoea from anticipation’, ‘Fear of death’, ‘Herpes lips’, ‘Grinding teeth in sleep’, ‘Sensitive to injustice’ and ‘Loquacity’). Our results suggest the addition of 20 (35.7% of 56) entries to Kent’s original repertory and the rubric ‘Sensitive to injustice’ in the RADAR-Synthesis repertory, version 8.1.40. On the other hand 11 (19.6% of 56) of our outcome values suggest removal of existing entries in the repertory.
There are also retrospective data that give an indication of the LRs of homeopathic symptoms, as Van Wassenhoven showed. Such data should be handled with more care than prospective research, but they are more reliable than expert opinion. Retrospective data can also be obtained by analyzing successful cases. In the Netherlands homeopathic medicines are validated by analyzing a number of successful cases by different practitioners. One of the results was that ‘Loquacity’ is present in only 40% of all successful cases of Lachesis. So it makes no sense to withhold Lachesis from a patient because he or she is not loquacious. At the moment about 20 medicines have been validated; none of these medicines has its characteristic symptoms in more than 50% of all cases.
As long as the judges of homeopathic RCT are convinced of the impossibility of homeopathy we are in a witch-trial. If the results are negative the outcome is accepted, if the results are positive we must have committed fraud. As long as this situation persists we should not perform any RCT.
Even if the witch-hunt is over we should only do RCT after proper preparation. Our method has too many weaknesses for optimal results. We made some suggestions for such preparations.
In the long run we should revise our repertories by scientific research using Bayes’ method. We have to improve our method before we can convincingly prove it.
1 Kleijnen J, Knipschild P, Riet G ter. Clinical trials of homeopathy. BMJ 1991; 302:316-323
2 Vandenbroucke JP, Craen AJ. Alternative Medicine: A “Mirror Image” for Scientific Reasoning in Conventional Medicine. Ann. of Internal Medicine 2001;135(7): 507-13
3 Vickers AJ, Fisher P, Wyllie SE, Rees R: Homeopathic Arnica 30X Is Ineffective for Muscle Soreness After Long-Distance Running - A randomized, double-blind, Placebo-controlled trial. Clin J Pain 1998;14(3):227-231
4 Vickers AJ, Fisher P, Smith C, Wyllie SE, Lewith GT: Homoeopathy for delayed onset muscle soreness - A randomised double blind placebo controlled trial. Brit J Sports Med 1997;31:304-307
5 Jawara N, Lewith GT, Vickers AJ, Mullee MA, Smith C: Homoeopathic Arnica and Rhus toxicodendron for delayed onset muscle soreness - A pilot for a randomized, double-blind, placebo-controlled trial. Brit Hom J 1997;86(1):10-15
6 Tveiten D, Bruset S, Borchgrevink CF, Norseth J: Effects of the homeopathic remedy Arnica D30 on marathon runners: A randomized, double-blind study during the 1995 Oslo Marathon. Complement Ther Med 1998;6(2):71-74
7 Labrecque M, Audet D, Latulippe L, Drouin J: Homeopathic treatment of plantar warts. Can Med Assoc J 1992;146(10):1749-1753
8 Walach H, Haeusler W, Lowes T, Mussbach D, Schamell U, Springer W, Stritzl G, Gaus W, Haag G: Classical homeopathic treatment of chronic headaches. Cephalalgia 1997;17:119-126
9 Frei H, Everts R, Von Ammon K, et al. Homeopathic treatment of children with attention deficit hyperactivity disorder: a randomised, double blind placebo-controlled crossover trial. Eur J Ped 2005;164(12):758-767
10 Frei H, Ammon K von, Thurneysen A. Treatment of hyperactive children: Increased efficiency through modifications of homeopathic diagnostic procedure. Homeopathy 2006;95:163-170
11 Rutten ALB, Stolper CF, Lugten RF, Barthels RJ. Repertory and likelihood ratio: time for structural changes. Homeopathy 2004;93:120-124
12 Rutten ALB, Stolper CF, Lugten RF, Barthels RJ. A Bayesian perspective on the reliability of homeopathic repertories. Homeopathy 2006;95:88-93
13 Van Wassenhoven M. Towards an evidence-based repertory: clinical evaluation of Veratrum album. Homeopathy 2004;93(2):71-7.
14 Stolper CF, Rutten ALB, Lugten RFG, Barthels RJWMM. Materia medica validation and meta-analysis: A post-graduate course combining learning and research. Homeopathic Links 2004;17(3):186-188
Lex Rutten MD Breda - NL
Heiner Frei MD Pediatrician
Keywords: homeopathy, scientific research, RCT, fear of death, thuya, Arnica, rhus toxicodendron, Antimonium crudum, Nitricum acidum, dokterrutten, Labrecque, Shang, homeopathic research, Linde, Walach, Verum effect, reliability, ADHD, phosphorus, Bayesian science, Q-potencies, Homeopathic diagnostic procedure, Diarrhoea from anticipation, Herpes lips, Grinding teeth in sleep, Sensitive to injustice, Loquacity, Lachesis