The range and scientific value of randomized trials

Background: The randomized controlled trial (RCT) is the gold standard of scientific evidence for attributing clinical effects, in the sense of benefit and harm, to specific medical interventions. Numerous variants of the RCT design have been developed to address legitimate criticisms constructively and to adapt better to the challenges of dynamic clinical research contexts.

Method: The diversity and adaptability of randomized study designs are presented and explained on the basis of a selective review of the literature and selected examples.

Results: There are numerous ways to vary the RCT design in order to adapt it to specific research questions and clinical settings. These include cross-over studies, n = 1 studies, factorial RCT designs, and cluster-randomized studies. Furthermore, adaptive designs such as modern platform studies are available, as are pragmatic RCTs with simplified clinical questions and less restricted patient groups, which allow broad recruitment of patients in everyday clinical practice.

Conclusion: Only randomized controlled studies ensure, through the random allocation of study participants, that known and unknown patient characteristics that could interfere with or distort a fair comparison of two or more medical interventions are evenly distributed. The methodological variants and further developments of this study type are therefore important: even when innovation is highly dynamic, they allow the benefits and harms of medical methods and products to be assessed on the basis of robust evidence, for the protection of patients.

There is now a consensus that randomized controlled trials (RCTs) are the gold standard for evaluating intervention-outcome relationships. Many variants and special forms of RCTs have been developed to improve their informative value in particular clinical situations and to make a randomized design feasible even where it appears difficult from an organizational point of view. This article describes a number of such practical possibilities. It should be remembered that randomization means only the random allocation to the intervention groups. It must not be equated with placebo comparison or with blinding.

Designs

The classic and by far the most common form of a randomized controlled study is the parallel-group comparison of two or more interventions, in which allocation to the treatment groups is random. There are several ways to achieve random allocation; nowadays, computer-based methods using random numbers are generally used.
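As a minimal sketch of how computer-generated random numbers can produce an allocation list, the following Python example builds a permuted-block schedule. Block randomization is only one common option and is not described in this article; the function name, the block size and the seed are illustrative assumptions.

```python
import random


def permuted_block_schedule(n_patients, arms=("A", "B"), block_size=4, seed=None):
    """Return an allocation list built from randomly shuffled, balanced blocks.

    Hypothetical helper for illustration only; a real trial would generate
    and store the list centrally so that allocation stays concealed.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # seeded generator makes the list reproducible
    schedule = []
    while len(schedule) < n_patients:
        # Each block contains every arm equally often, in random order.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_patients]


schedule = permuted_block_schedule(12, seed=2017)
print(schedule)
```

Within every complete block the arms are balanced, so group sizes never drift far apart, while the order inside each block remains random.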

A very important element of an RCT is that, before a patient is included in the study, neither the people involved nor the patient may know which intervention group the patient will be assigned to. This procedure is called allocation concealment. Concealed allocation is best guaranteed by randomization via telephone or internet, as is done in modern studies (1). It ensures that patients are not selectively included in or excluded from the study on the basis of knowledge of which group they would later belong to.

In studies with a small number of cases, imbalances in certain patient characteristics can occur between the groups despite randomization (2). Theoretically, this is not a problem, since such imbalances would balance out over a large number of repetitions of the study. However, if essential prognostic factors are to be distributed almost equally between the groups within a given study, this can be supported by stratified randomization. If many factors are to be taken into account, this can be achieved through minimization (box, example 1) (3). Statistical allocation algorithms help to ensure that the prognostically important characteristics are distributed as evenly as possible between the treatment groups each time a patient is enrolled. A random component can also be integrated into these algorithms.
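The idea behind minimization can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of any specific trial: it weights all prognostic factors equally, measures imbalance as a simple sum of counts, and assigns the arm that minimizes imbalance with an assumed probability of 0.8 (the random component). The class name, the factor names and the probability are hypothetical.

```python
import random
from collections import defaultdict


class Minimizer:
    """Simplified minimization with a random component (illustrative only)."""

    def __init__(self, arms=("A", "B"), p_preferred=0.8, seed=None):
        self.arms = tuple(arms)
        self.p_preferred = p_preferred  # assumed probability of following the rule
        self.rng = random.Random(seed)
        # counts[arm][(factor, level)] = patients already in that arm with this level
        self.counts = {arm: defaultdict(int) for arm in self.arms}

    def imbalance_score(self, arm, patient):
        # Number of earlier patients in this arm who share the new patient's
        # factor levels, summed over all factors (equal weights assumed).
        return sum(self.counts[arm][(f, lvl)] for f, lvl in patient.items())

    def assign(self, patient):
        scores = {arm: self.imbalance_score(arm, patient) for arm in self.arms}
        best = min(scores.values())
        preferred = [a for a in self.arms if scores[a] == best]
        others = [a for a in self.arms if scores[a] != best]
        if others and self.rng.random() >= self.p_preferred:
            arm = self.rng.choice(others)      # random component
        else:
            arm = self.rng.choice(preferred)   # ties are broken at random
        for f, lvl in patient.items():
            self.counts[arm][(f, lvl)] += 1
        return arm


minimizer = Minimizer(seed=1)
print(minimizer.assign({"sex": "f", "age": "<65"}))
print(minimizer.assign({"sex": "f", "age": ">=65"}))
```

Setting `p_preferred=1.0` makes the allocation deterministic apart from ties, which is easier to predict and therefore weakens allocation concealment; this is one reason a random component is usually retained.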

Cross-over studies (box, example 2) are used to test interventions with short-term effects, especially drugs, in chronic diseases. Each study participant receives both drugs A and B, with the order randomized, i.e. either AB or BA. A washout phase is usually necessary between the two treatment phases so that the effects or side effects of the two drugs do not overlap. Cross-over studies offer the advantage that intra-individual comparisons are possible. For example, each patient can be asked during which treatment phase they felt better. Under certain circumstances, this can considerably reduce the required number of cases. However, the validity of such studies depends on a number of critical conditions. The most important one is obvious: at the beginning of the second intervention phase, a patient must be able to return to roughly the same initial state as at the beginning of the first. For a long time, asthma was therefore a major area of application for such studies. Conversely, cross-over studies are not suitable for chronically progressive diseases or for treatments that aim to cure or to prolong survival.

So-called n = 1 studies (also n-of-1 studies; box, example 3) can be viewed as a special case of cross-over studies. Here, a single patient receives the treatments, ideally blinded, in several treatment phases in random order. Comparing the phases indicates which therapy works best for that patient. In this way, various interventions can be examined individually in chronically ill patients. Since only a single patient is examined, the results can rarely be generalized, but they can help, for example, to identify the optimal treatment for individual patients in everyday general practice. In principle, several n = 1 studies can also be combined meta-analytically in order to allow generalizable statements where appropriate (4).

Factorial designs (box, example 4) combine, so to speak, two RCTs in one. They are an option when two interventions A and B that can also be used in combination (A + B) are to be investigated at the same time. In the simplest case, a 2 x 2 design, patients are randomized to one of four groups (A + B; A only; B only; neither A nor B). At the end of the study, patients treated with A are compared with those not treated with A, and patients treated with B with those not treated with B. In addition, the effects of the combination can be examined. A major advantage of factorial designs is the considerable reduction in the number of cases, since the same patients contribute to several questions (sub-studies). The two simple comparisons can become difficult to interpret if the two therapies influence each other to a relevant degree, i.e. if there is an attenuating or amplifying interaction.
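The two simple comparisons and the interaction in a 2 x 2 factorial design can be illustrated with a short Python sketch. The outcome values below are invented purely for illustration (lower values standing for a better outcome), and the function name is hypothetical; a real analysis would also include confidence intervals and a formal interaction test.

```python
from statistics import mean


def factorial_effects(outcomes):
    """Marginal effects of A and B and their interaction in a 2 x 2 design.

    outcomes maps (received_a, received_b) -> list of outcome values.
    """
    both = outcomes[(True, True)]
    a_only = outcomes[(True, False)]
    b_only = outcomes[(False, True)]
    neither = outcomes[(False, False)]

    # Comparison 1: everyone treated with A versus everyone not treated with A.
    effect_a = mean(both + a_only) - mean(b_only + neither)
    # Comparison 2: everyone treated with B versus everyone not treated with B.
    effect_b = mean(both + b_only) - mean(a_only + neither)
    # Interaction: effect of A when B is given minus effect of A when B is not given.
    interaction = (mean(both) - mean(b_only)) - (mean(a_only) - mean(neither))
    return effect_a, effect_b, interaction


# Invented example data with purely additive effects (no interaction).
example = {
    (True, True): [5.0, 5.0],
    (True, False): [8.0, 8.0],
    (False, True): [7.0, 7.0],
    (False, False): [10.0, 10.0],
}
print(factorial_effects(example))  # (-2.0, -3.0, 0.0)
```

Every patient contributes to both marginal comparisons, which is where the saving in the number of cases comes from. An interaction term clearly different from zero would signal that the marginal comparisons should be interpreted with caution.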

Cluster-randomized studies (box, example 5) are well suited when organizational changes or educational measures are to be investigated, or when for other reasons it is difficult or impossible to offer the interventions being compared side by side in one center and to randomize individual participants. For example, hygiene and prevention measures are examined by randomizing entire hospital departments, nursing homes or school classes. Cluster-randomized studies are also popular in general practice, where interventions are randomized to individual practices (5). Although the outcomes (e.g. avoidance of infections) are measured at the patient level, the cluster structure of the data, i.e. the dependency between patients (observation units) within a cluster, must be taken into account in the statistical analysis. In a cluster-randomized study, informing patients and the associated concealment of allocation at the patient level can be problematic (5, 6).
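One standard way to see why the dependency within clusters matters is the design effect, 1 + (m - 1) x ICC, where m is the cluster size and ICC the intracluster correlation coefficient. The Python sketch below applies this textbook formula under the simplifying assumption of equal cluster sizes; the function names and the example numbers are illustrative and not taken from the article.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation caused by clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def clusters_per_arm(n_individual_per_arm, cluster_size, icc):
    """Clusters needed per arm, given the sample size an individually
    randomized trial would need (hypothetical helper)."""
    inflated_n = n_individual_per_arm * design_effect(cluster_size, icc)
    return math.ceil(inflated_n / cluster_size)


# Illustrative numbers: 200 patients per arm under individual randomization,
# 20 patients per practice, ICC of 0.05.
print(design_effect(20, 0.05))          # about 1.95
print(clusters_per_arm(200, 20, 0.05))  # 20 practices per arm
```

Even a small ICC nearly doubles the required number of patients in this example, which is why an analysis that ignores clustering overstates the precision of the result.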

Adaptive designs (box, example 6) allow the study design to be modified during the course of the study. This primarily concerns the number of cases, which can be increased or decreased on the basis of interim analyses. This is particularly important if, at the start of the study, the possible treatment effect or certain assumptions needed for sample size planning (e.g. the expected variability) can be estimated only with great uncertainty. In such cases, an RCT may be planned far too large or far too small. An adaptive design allows an interim analysis to be carried out and the planned number of cases to be adjusted accordingly. There are other options for adapting RCTs, for example with regard to the outcomes or the patients to be included, but this always requires close cooperation with competent biostatisticians (7). A precise description in the study protocol is absolutely necessary when adaptive designs are used. Conversely, this means that unplanned interim analyses should be avoided unless they are indicated by safety concerns, as otherwise the study risks losing its informative value. However, even planned interim analyses that may lead to early termination of the study are not without problems, since effects can then no longer be estimated with the precision originally intended. In addition, stopping a study early because of large differences observed in an interim analysis can lead to biased estimates (8, 9). To be used efficiently, interim analyses in adaptive designs must be based on rather short-term endpoints. These are not infrequently surrogates, for example progression-free survival (PFS) in oncology.
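How an uncertain planning assumption affects the number of cases can be sketched with the usual normal-approximation formula for comparing two means, n per group = 2 (z_{1-alpha/2} + z_{1-beta})^2 (SD / delta)^2. The Python example below recomputes the sample size after an interim estimate of the standard deviation. All numbers are invented, the function name is hypothetical, and the sketch covers only this simple re-estimation step; it says nothing about controlling the type I error in more complex adaptations.

```python
import math
from statistics import NormalDist


def n_per_group(sd, delta, alpha=0.05, power=0.80):
    """Patients per group for a two-sided comparison of two means
    (normal approximation, equal group sizes)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2)


# Planning stage: assumed standard deviation of 10, relevant difference of 5.
planned = n_per_group(sd=10, delta=5)
# Interim analysis: the observed standard deviation turns out to be 14.
revised = n_per_group(sd=14, delta=5)
print(planned, revised)  # 63 124
```

In this invented example, underestimating the variability at the planning stage would have left the study with roughly half the patients it needs, which is the situation a pre-specified sample size adjustment is meant to catch.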

Platform studies represent a further development of adaptive designs (box, example 7). In platform studies, several experimental interventions are evaluated against a common control intervention and/or against each other under a master protocol (10). In contrast to factorial designs, however, they do not examine whether combinations act synergistically or whether their benefits attenuate each other. At interim analyses planned in advance, the allocation probabilities for the individual arms are adapted, individual arms are dropped entirely, or new ones (for example combinations of existing arms) are added. Platform studies are an efficient alternative in indications with short innovation cycles and shrinking target populations. They are also designed as combined phase 2/phase 3 studies and are then also referred to as multi-arm multi-stage (MAMS) RCTs. Umbrella and basket studies can also be subsumed under this heading (11). Both terms refer to tests of so-called targeted therapies in the context of personalized medicine in oncology. In an umbrella design, within one histopathological tumor entity (e.g. non-small cell lung cancer), subgroups are formed, for example by different driver mutations, and the therapeutic approaches directed against these mutations are compared with a common standard therapy. In a basket design, a common target is examined across different histopathological tumor entities. However, basket studies are (still) mostly carried out without a control group. A rationale for this is not really discernible (12).

To counter the sometimes well-founded objection that RCTs depict artificial scenarios, characterized for example by narrow inclusion and exclusion criteria and many follow-up examinations, pragmatic RCTs (box, example 8) have attracted great interest in recent years (13, 14). The pragmatic element is that the study is devoted to answering one practical question in a targeted and rapid way, free of the ballast of possible secondary questions. Restricting the study to a few, easily ascertainable inclusion and exclusion criteria allows broad recruitment of patients, including in everyday clinical practice. The focus on a few patient-relevant and easily recorded endpoints promotes both willingness to participate and practical relevance. Such studies can also be supported by registries (15, 16). The fact that concomitant measures and therapies are usually largely permitted supports practical relevance and acceptance. This targeted and cost-effective approach is extremely useful for many questions of health care delivery and, as many examples show, is also easy to implement. However, it has its price. First, the omission of strict specifications creates statistical "noise", which can lead to a significant increase in the number of patients required (17). The low level of standardization of processes and data collection can also lead to problems of implementation and interpretation. Second, dispensing with additional data prevents the pursuit of additional questions. For many medical scientists, such questions are what makes clinical studies interesting in the first place, although they can also make studies much harder to conduct.

Effort and efficiency of RCTs

The goal of reaching reliable causal conclusions about the effectiveness of medical interventions from a clinical study is achieved most efficiently with RCTs, provided that the same basic quality standards in the sense of Good Clinical Practice (GCP) apply to all study types. The effort involved, for example in drawing up study protocols, assuring the quality of the medical interventions under study, and collecting and validating data, including the reliable recording of adverse events, should not actually differ between study types. Randomization, however, is the simplest and most reliable way to form structurally equivalent groups for a scientifically fair comparison of interventions. Non-RCTs, by contrast, require the collection of a far larger number of characteristics and data in order to attempt statistical control of bias due to confounding (e.g. selection bias due to confounding by indication) in the analysis. In addition, non-RCTs often produce considerably more heterogeneous results (18), which necessitates larger sample sizes and thus increases the effort. These are also reasons why dispensing with randomization is no solution for comparing therapies in rare diseases (19).

In addition, from a higher-level perspective, RCTs make research and health care more efficient, because they are ultimately the only way to achieve the degree of reliability required, for example, for clinical guidelines. It was only after decades that the randomized WHI study was able to clarify the role of hormone replacement therapy for postmenopausal women (20). It is telling that, after evaluating non-RCT data, for example from patient registries, researchers conclude that RCTs are necessary to finally clarify the clinical benefit of interventions (21, 22).

RCTs from a meta-epidemiological perspective: do they make sense?

Meta-epidemiological comparisons of RCTs and non-RCTs (mostly observational studies) on the same clinical questions sometimes suggest equivalent results, and this is occasionally used as an argument against the supposedly high effort of conducting RCTs. However, even if it were proven that both types of study empirically achieve similar results on average, it would still be wise to choose the much more efficient RCT approach. Why is that?

The comparisons in the relevant methodological reviews lead to very heterogeneous results: some studies suggest that non-RCTs yield larger effect estimates, whereas in others non-RCTs yield smaller ones. If these reviews are pooled in a meta-review, as was done by Anglemyer and colleagues, which is actually inadmissible given the great heterogeneity, no relevant difference emerges (23). In addition, the difference between RCTs and non-RCTs becomes smaller the better and more sophisticated the non-RCTs are, i.e. the closer they come to RCTs in terms of data quality and confounder control (24). However, since this quality is rarely found in non-RCTs and is also very difficult to verify from publications, the results of conventional non-RCTs, with their unclear and possibly high risk of bias, cannot be regarded as equally valid as those of the standard RCT.

Ultimately, meta-epidemiological design comparisons do not provide clear answers. Even if a difference were found, it could be interpreted in different ways. On the one hand, it could be explained by biasing mechanisms or other quality deficits of the non-RCTs. On the other hand, it could be due to different settings and study populations in RCTs and non-RCTs, which in turn would mean a systematic bias in the comparison of study designs.

Conclusion

In order to arrive at reliable, causally interpretable statements on the benefits and harms of (medical) interventions, studies with non-randomized allocation require incomparably greater effort, since randomization provides control of confounding variables almost free of charge.

As shown, there are numerous options for carrying out RCTs in a targeted and valid manner. The necessary infrastructure is also available at universities in the form of the coordination centers for clinical studies (KKS). Developments such as platform and pragmatic studies show impressively that the RCT has repeatedly been adapted to relevant questions and to changed or very dynamic research conditions. RCTs are neither hostile to innovation (short innovation cycles are a popular counter-argument [25]) nor fundamentally at odds with the desire for “real world evidence” (26). Therefore, RCTs should not only be retained as the gold standard for clinical intervention studies and reliable proof of efficacy, but should also gain in importance in Germany through targeted research funding to answer patient-relevant questions.

Conflict of interest

The authors declare that they have no conflict of interest.

Manuscript dates
Received: April 6, 2017; revised version accepted: July 12, 2017

Address for the authors
Dr. med. Dipl.-Psych. Jörg Lauterberg

IQWiG - Institute for Quality and Efficiency in Health Care

Im Mediapark 8, 50670 Cologne

[email protected]

How to cite
Lange S, Sauerland S, Lauterberg J, Windeler J: The range and scientific value of randomized trials — part 24 of a series on evaluation of scientific publications. Dtsch Arztebl Int 2017; 114: 635-40.
DOI: 10.3238/arztebl.2017.0635

The German version of this article is available online:
www.aerzteblatt-international.de

References
1. Savovic J, Jones H, Altman D, et al.: Influence of reported study design characteristics on intervention effect estimates from randomized controlled trials: combined analysis of meta-epidemiological studies. Health Technol Assess 2012; 16: 1-82.
2. Blair E: Gold is not always good enough: the shortcomings of randomization when evaluating interventions in small heterogeneous samples. J Clin Epidemiol 2004; 57: 1219-22.
3. Altman DG, Bland JM: Treatment allocation by minimization. BMJ 2005; 330: 843.
4. Zucker DR, Ruthazer R, Schmid CH: Individual (N-of-1) trials can be combined to give population comparative treatment effect estimates: methodologic considerations. J Clin Epidemiol 2010; 63: 1312-23.
5. Chenot JF: Cluster Randomized Trials: An Important Method in General Medicine Research. Z Evid Fortbild Qual Gesundhwes 2009; 103: 475-80.
6. Kleist P: Study designs with incomplete disclosure to study participants. Schweiz Ärztezeitung 2010; 91: 994-7.
7. Food and Drug Administration: Adaptive design clinical trials for drugs and biologics - Draft guidance [online]. 02.2010. www.fda.gov/downloads/drugs/guidances/ucm201790.pdf (last accessed on 18 February 2017).
8. Bassler D, Montori VM, Briel M, et al.: Reflections on meta-analyses involving trials stopped early for benefit: is there a problem and if so, what is it? Stat Methods Med Res 2013; 22: 159-68.
9. Guyatt GH, Briel M, Glasziou P, Bassler D, Montori VM: Problems of stopping trials early. BMJ 2012; 344: e3863.
10. Berry SM, Connor JT, Lewis RJ: The platform trial: an efficient strategy for evaluating multiple treatments. JAMA 2015; 313: 1619-20.
11. Woodcock J, LaVange LM: Master protocols to study multiple therapies, multiple diseases, or both. N Engl J Med 2017; 377: 62-70.
12. Renfro LA, Sargent DJ: Statistical controversies in clinical research: basket trials, umbrella trials, and other master protocols: a review and examples. Ann Oncol 2017; 28: 34-43.
13. Tunis SR, Stryer DB, Clancy CM: Practical clinical trials: increasing the value of clinical research for decision making in clinical and health policy. JAMA 2003; 290: 1624-32.
15. Sacristan JA, Soto J, Galende I, Hylan TR: Randomized database studies: a new method to assess drugs' effectiveness? J Clin Epidemiol 1998; 51: 713-5.
16. Lagerqvist B, Frobert O, Olivecrona GK, et al.: Outcomes 1 year after thrombus aspiration for myocardial infarction. N Engl J Med 2014; 371: 1111-20.
17. Greenfield S, Kravitz R, Duan N, Kaplan SH: Heterogeneity of treatment effects: implications for guidelines, payment, and quality assessment. Am J Med 2007; 120: 3-9.
18. Ioannidis JP, Haidich AB, Pappa M, et al.: Comparison of evidence of treatment effects in randomized and nonrandomized studies. JAMA 2001; 286: 821-30.
20. Rossouw JE, Anderson GL, Prentice RL, et al.: Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results from the Women's Health Initiative randomized controlled trial. JAMA 2002; 288: 321-33.
21. Angus DC: Whether to intubate during cardiopulmonary resuscitation: conventional wisdom vs big data. JAMA 2017; 317: 477-8.
22. Sarno G, Lagerqvist B, Frobert O, et al.: Lower risk of stent thrombosis and restenosis with unrestricted use of 'new-generation' drug-eluting stents: a report from the nationwide Swedish Coronary Angiography and Angioplasty Registry (SCAAR). Eur Heart J 2012; 33: 606-13.
23. Anglemyer A, Horvath HT, Bero L: Healthcare outcomes assessed with observational study designs compared with those assessed in randomized trials. Cochrane Database Syst Rev 2014; 4: MR000034.
24. Furlan AD, Tomlinson G, Jadad AA, Bombardier C: Methodological quality and homogeneity influenced agreement between randomized trials and nonrandomized studies of the same intervention for back pain. J Clin Epidemiol 2008; 61: 209-31.
25. Federal Association of Medical Technology (BVMed): 5-point plan for assessing the benefits of medical technologies. Berlin: 2014. www.bvmed.de/de/versorgung/nutzenicherung/5-punkte-nutzenicherung (last accessed on 18 February 2017).
26. Sherman RE, Anderson SA, Dal Pan GJ, et al.: Real-world evidence - what is it and what can it tell us? N Engl J Med 2016; 375: 2293-7.
27. Treasure T, Fallowfield L, Lees B: Pulmonary metastasectomy in colorectal cancer: the PulMiCC trial. J Thorac Oncol 2010; 5: 203-6.
28. Surgical & Interventional Trials Unit (SITU) DoSIS, Faculty of Medical Sciences UCL: PulMiCC Newsletter Issue 001 [online]. 03.2015. www.ucl.ac.uk/surgical-interventional-trials-unit/documents/trials_doc/pulmicc_doc/pulmicc_open/PULMICC_news_docs/PulMiCC_Newsletter__Issue_001__March_2015_.pdf (last accessed on 18 February 2017).
29. Nenke MA, Haylock CL, Rankin W, et al.: Low-dose hydrocortisone replacement improves wellbeing and pain tolerance in chronic pain patients with opioid-induced hypocortisolemic responses. A pilot randomized, placebo-controlled trial. Psychoneuroendocrinology 2015; 56: 157-67.
30. Mitchell GK, Hardy JR, Nikles CJ, et al.: The effect of methylphenidate on fatigue in advanced cancer: an aggregated N-of-1 trial. J Pain Symptom Manage 2015; 50: 289-96.
31. Yusuf S, Lonn E, Pais P, et al.: Blood-pressure and cholesterol lowering in persons without cardiovascular disease. N Engl J Med 2016; 374: 2032-43.
32. Weltermann B, Kersting C, Viehmann A: Hypertension management in primary care. Dtsch Arztebl Int 2016; 113: 167-74.
33. Bhatt DL, Stone GW, Mahaffey KW, et al.: Effect of platelet inhibition with cangrelor during PCI on ischemic events. N Engl J Med 2013; 368: 1303-13.
34. James ND, Sydes MR, Clarke NW, et al.: Systemic therapy for advancing or metastatic prostate cancer (STAMPEDE): a multi-arm, multistage randomized controlled trial. BJU Int 2009; 103: 464-9.
35. James ND, Sydes MR, Mason MD, et al.: Celecoxib plus hormone therapy versus hormone therapy alone for hormone-sensitive prostate cancer: first results from the STAMPEDE multiarm, multistage, randomized controlled trial. Lancet Oncol 2012; 13: 549-58.
36. Sydes MR, Parmar MK, Mason MD, et al.: Flexible trial design in practice - stopping arms for lack-of-benefit and adding research arms mid-trial in STAMPEDE: a multi-arm multi-stage randomized controlled trial. Trials 2012; 13: 168.
37. Vestbo J, Leather D, Diar Bakerly N, et al.: Effectiveness of fluticasone furoate-vilanterol for COPD in clinical practice. N Engl J Med 2016; 375: 1253-60.

IQWiG - Institute for Quality and Efficiency in Health Care, Cologne:
PD Dr. med. Lange, PD Dr. med. Sauerland, Dr. med. Dipl.-Psych. Lauterberg, Prof. Dr. med. Windeler