Essentials of Clinical Research - part 2 doc

Introduction to Clinical Research and Study Designs
S.P. Glasser

more appropriate when studies are used to detect rare or late consequences of interventions.

Discussion

One should now be able to begin to understand the key differences, and therefore the limitations, of each study design, and the circumstances in which one design might be preferable to another. Let us, for example, consider exposure to electromagnetic energy (EME) and a cancer outcome (e.g. leukemia). With a cross-sectional study, a population is identified (the target population), cancer rates are determined, and exposure or lack of exposure to EME is ascertained in a sample. One then analyzes the exposure rates in subjects with cancer and in those who are cancer free. If the cancer rate is higher in those who were exposed, an association is implied. This would be a relatively inexpensive way to begin to look at the possible association of these variables, but the limitations should be obvious. For example, since there is no temporality in this type of design, and since, biologically, exposure to EME (if it did cause cancer) would likely have to occur over a long period of time, one could easily miss an association.

In summary, it should be evident that observational studies (e.g. cross-sectional, case-control, and cohort studies) have a major role in research. However, despite that role, von Elm et al. reported that important information was often missing or unclear in published observational studies, and this gap led to a guideline document for reporting observational studies: the STROBE statement (Strengthening the Reporting of Observational Studies in Epidemiology). The STROBE statement was modeled on CONSORT (the Consolidated Standards of Reporting Trials), which outlines the guidelines for reporting RCTs. The STROBE statement is a checklist of 22 items that are considered essential for good reporting of observational studies.9

References

1. Parker_Palmer. http://en.wikipedia.org/wiki/Parker_Palmer
2. Vickers AJ. Michael Jordan won’t accept the null hypothesis: notes on interpreting high P values. Medscape 2006; 7(1).
3. The Null Logic of Hypothesis Testing. http://www.shsu.edu/~icc_cmf/cj_787/research6.doc
4. Blackstone. Cited in The Null Logic of Hypothesis Testing. Bl Com C 27, margin page 358, ad finem. Available at: http://www.shsu.edu/~icc_cmf/cj_787/research6.doc
5. Connolly HM, Crary JL, McGoon MD, et al. Valvular heart disease associated with fenfluramine-phentermine. N Engl J Med Aug 28, 1997; 337(9):581–588.
6. Cited in Sartwell P and Nathanson N. Epidemiologic Reviews 1993.
7. Sacks FM, Pfeffer MA, Moye LA, et al. The effect of pravastatin on coronary events after myocardial infarction in patients with average cholesterol levels. Cholesterol and recurrent events trial investigators. N Engl J Med Oct 3, 1996; 335(14):1001–1009.
8. Doll R. Cohort studies: history of the method II. Retrospective cohort studies. Soz Praventivmed 2001; 46(3):152–160.
9. von Elm E, Altman DG, Egger M, Pocock SJ, Gotzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Ann Intern Med Oct 16, 2007; 147(8):573–577.

Chapter 3
Clinical Trials*
Stephen P Glasser

Abstract The spectrum of evidence imparted by the different clinical research designs ranges from ecological studies through observational epidemiological studies to randomized control trials (RCTs). This chapter addresses the definition of clinical research and the major aspects of clinical trials (e.g. ethics, randomization, masking, recruitment and retention of subjects enrolled in a clinical trial, patients/subjects lost to follow-up during the trial, etc.). Although this chapter focuses on the weaknesses of clinical trials, it is emphasized that the randomized, placebo-controlled, double-blind clinical trial is the design that yields the greatest level of scientific evidence.

A researcher is in a gondola of a balloon that loses lift and lands in the middle of a field near a road. Of course, it looks like the balloon landed in the middle of nowhere. As the researcher ponders appropriate courses of action, another person wanders by. The researcher asks, ‘Where am I?’ The other person responds, ‘You are in the gondola of a balloon in the middle of a field.’ The researcher comments, ‘You must design clinical trials.’ ‘Well, that’s amazing, how did you know?’ ‘Your answer was correct and precise and totally useless.’

Introduction

The spectrum of evidence imparted by the different clinical research designs ranges from ecological studies through observational epidemiological studies to randomized control trials (RCTs). The differences in clinical research designs and the different weights of evidence are exemplified by the post-menopausal hormone replacement therapy (HRT) controversy. Multiple observational epidemiological studies had shown that HRT was strongly associated with the reduction of atherosclerosis, myocardial infarction risk, and stroke risk.1-3 Subsequently, clinical trials suggested that HRT was not beneficial, and might even be harmful.4-6 This latter observation raises a number of questions, including: why can this paradox occur? what can contribute to this disagreement? and why do we believe these RCTs more than so many well-done observational studies?
* Over 50% of this chapter is taken from “Clinical trial design issues: at least 10 things you should look for in clinical trials”7 with permission of the publisher.
S.P. Glasser (ed.), Essentials of Clinical Research, © Springer Science + Business Media B.V. 2008

Before addressing the above questions, it is appropriate to point out that there is frequently confusion about the difference between clinical research and clinical trials. A clinical trial is one type of clinical research. A clinical trial is a type of experimental study undertaken to assess the response of an individual (or, in the case of group clinical trials, a population) to interventions introduced by an investigator. Clinical trials can be randomized or non-randomized; un-blinded, single-blinded, or double-blinded; comparator groups can be placebo, active controls, or no-treatment controls; and RCTs can have a variety of designs (e.g. parallel group, crossover, etc.). That being said, the RCT remains the ‘gold-standard’ study design, and its results are appropriately credited as yielding the highest level of scientific evidence (greatest likelihood of causation). However, recognition of the limitations of the RCT is also important so that results from RCTs are not blindly accepted. As Grimes and Schulz point out, in this era of increasing demands on a clinician’s time it is ‘difficult to stay abreast of the literature, much less read it critically. In our view, this has led to the somewhat uncritical acceptance of the results of a randomized clinical trial’.8 Also, Loscalzo has pointed out that ‘errors in clinical trial design and statistical assessment are, unfortunately, more common than a careful student of the art should accept’.9

What leads the RCT to the highest level of evidence, and what are the features of the RCT that render it so useful?
Arguably, one of the most important issues in clinical trials is having matched groups in the interventional and control arms, and this is best accomplished by randomization. That is, to the degree that the groups under study are different, results can be confounded, while when the groups are similar, confounding is reduced (see Chapter 17 for a discussion of confounding). It is true that when potential confounding variables are known, one can relatively easily adjust for them in the design or analysis phase of the study. For example, if smoking might confound the results of the success of treatment for hypertension, one can build into the design a stratification scheme that separates smokers from non-smokers before the intervention is administered, and in that way determine whether there are differential effects in the success of treatment (e.g. smokers and non-smokers are randomized equally to the intervention and control). Alternatively, one can adjust after data collection, in the analysis phase, by separating the smokers from the non-smokers and analyzing them separately in terms of the success of the intervention compared to the control. The real challenge of clinical research is not how to adjust for known confounders, but how to have matched (similar) groups in the intervention and control arms when potential confounders are not known. Optimal matching is accomplished with randomization, and this is why randomization is so important. More about randomization later, but in the meanwhile one can begin to ponder how un-matching might occur even in an RCT.

In addition to randomization, there are a number of important considerations regarding the conduct of a clinical trial, such as: is it ethical? what type of comparator group should be used? what type of design and analysis technique will be utilized? how many subjects are needed and how will they be recruited and retained? etc. Finally, there are issues unique to RCTs (e.g. intention-to-treat analysis, placebo control groups, randomization, equivalence testing) and issues common to all clinical research (e.g. ethical issues, blinding, selection of the control group, choice of the outcome/endpoint, trial duration, etc.) that must be considered. Each of these issues will be reviewed in this chapter (Table 3.1). To this end, both the positive and problematic areas of RCTs will be highlighted.

Table 3.1 Issues of importance for RCTs
Ethical considerations
Randomization
Eligibility criteria
Efficacy vs effectiveness
Compliance
Run-in periods
Recruitment and retention
Masking
Comparison groups
Placebo
‘Normals’
Analytical issues
ITT
Subgroup analysis
Losses to follow-up
Equivalence vs traditional testing
Outcome selection
Surrogate endpoints
Composite endpoints
Trial duration
Interpretation of results
Causal inference
The media

Ethical Issues

Consideration of ethical issues is key to the selection of the study design chosen for a given research question/hypothesis. For RCTs, ethical considerations can be particularly problematic, mostly (but by no means solely) as they relate to using a placebo control. A full discussion of the ethics of clinical research is beyond the scope of this book, and for further discussion one should review the references noted here.10-12 (There is also further discussion of this issue under the section entitled Traditional vs Equivalence Testing and Chapters and 7.) Opinions about when it is ethical to use placebo controls vary widely. For example, Rothman and Michaels are of the opinion that the use of placebo is in direct violation of the Nuremberg Code and the Declaration of Helsinki,12 while others would argue that placebo controls are ethical as long as withholding effective treatment leads to no serious harm and patients are fully informed. Most would agree that placebo is unethical if effective life-saving or life-prolonging therapy is
available or if it is likely that the placebo group could suffer serious harm. For ailments that are not likely to be of harm or cause severe discomfort, some would argue that placebo is justifiable.11 However, in the majority of scenarios, the use of a placebo control is not a clear-cut issue, and decisions need to be made on a case-by-case basis. One prevailing standard for when to study an intervention against placebo is that one has enough confidence in the intervention to be comfortable that the additional risk of exposing a subject to it is low relative to no therapy or the ‘standard’ treatment, but that there is sufficient doubt about the intervention that use of a placebo or active control (‘standard treatment’) is justified. This balance, commonly referred to as equipoise, can be difficult to come by and is likewise almost always controversial. Importantly, equipoise needs to be present not only for the field of study (i.e. there is agreement that there is not sufficient evidence of the superiority of alternative treatments), but also for individual investigators (permitting individual investigators to ethically assign their patients to treatment at random).

Another development in the continued efforts to protect patient safety is the Data Safety and Monitoring Board (DSMB; see Chapter 9). The DSMB is now almost universally used in any long-term intervention trial. First, a data and safety monitoring plan (DSMP) becomes part of the protocol, and then the DSMB meets at regular and at ‘as needed’ intervals during the study in order to address whether the study requires early discontinuation. As part of the DSMP, stopping rules for the RCT will have been delineated. Thus, if during the study either the intervention or control group demonstrates a worsening outcome, or the intervention group is showing a clear benefit, or adverse events are greater in one group vs the other (as
defined within the DSMP), the DSMB can recommend that the study be stopped. But the early stopping of studies can also be a problem. For example, a recent systematic review by Montori et al. asked what was known about the epidemiology and reporting quality of RCTs involving interventions stopped for early benefit.13 They concluded that prematurely stopped RCTs often fail to adequately report relevant information about the decision to stop early, and that one should view the results of trials that are stopped early with skepticism.13

Randomization

Arguably, it is randomization that results in the RCT yielding the highest level of scientific evidence (i.e. the greatest likelihood that the intervention is causally related to the outcome). Randomization is a method of treatment allocation in which study subjects are distributed at random (i.e. by chance). As a result, all randomized units (e.g. subjects) have the same and independent chance of being allocated to any of the treatment groups, and it is impossible to know in advance to which group a subject will be assigned. The introduction of randomization to clinical trials in the modern era can probably be credited to the 1948 trial of streptomycin for the treatment of tuberculosis (Fig 1.1).14 In this trial, 55 patients were randomized to either streptomycin with bed rest, or to treatment with bed rest alone (the standard treatment at that time). To quote from that paper, ‘determination of whether a patient would be treated by streptomycin and bed rest (S case) or bed rest alone (C case), was made by reference to a statistical series based on random sampling numbers drawn up for each sex at each center by Professor Bradford Hill; the details of the series were unknown to any of the investigators or to the co-ordinator and were contained in a set of sealed envelopes each bearing on the outside only the name of the hospital and a
number. After acceptance of a patient by the panel and before admission to the streptomycin centre, the appropriate numbered envelope was opened at the central office; the card inside told if the patient was to be an S or a C case, and this information was then given to the medical officer at the centre’. Bradford Hill was later knighted for his contributions to science, including the contribution of randomization.

With randomization, the allocation ratio (the number of units, i.e. subjects, randomized to the investigational arm versus the number randomized to the control arm) is usually 1:1. But a 1:1 ratio is not required, and there may be advantages to unequal allocation (e.g. 2:1 or even 3:1). The advantages of unequal allocation are that one exposes fewer patients to placebo and one gains more information regarding the safety of the intervention. The main disadvantage of higher allocation ratios is the loss of power.

There are three general types of randomization: simple, blocked, and stratified. Simple randomization can be likened to the toss of an unbiased coin (i.e. heads group A, tails group B). This is easy to implement but, particularly with small sample sizes, could result in substantial imbalance (for example, if one tosses a coin 10 times, it is not improbable that one could get a clearly unequal number of heads and tails; if one tosses the coin 1000 times, it is likely that the distribution of heads to tails would be close to 500 heads and 500 tails).

Fig 3.1 The relationship of confounders to outcome and how they are eliminated in a RCT. [Diagram: Confounder (SES), Risk Factor (Estrogen), CHD (CHD risk). In an RCT, those with and without the confounder are assigned to the risk factor at random. It no longer matters whether the confounder (SES) is related to CHD risk: because it is not related to the risk factor (estrogen), it cannot be a confounder.]

Blocked randomization (sometimes called permuted block randomization) is a technique common to multi-center studies.
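The imbalance risk of simple randomization noted above can be put into numbers with the binomial distribution. The following is a minimal sketch, not part of the original chapter: the function name `prob_imbalance_at_least` and the 7:3 and 700:300 splits are illustrative assumptions chosen only to contrast a small trial with a large one.

```python
from math import comb

def prob_imbalance_at_least(n_subjects: int, min_majority: int) -> float:
    """Exact probability that one arm receives at least `min_majority` of
    `n_subjects` under simple 1:1 randomization (a fair coin per subject).
    Hypothetical helper for illustration only."""
    if 2 * min_majority <= n_subjects:
        # One arm always holds at least half of the subjects.
        return 1.0
    upper_tail = sum(comb(n_subjects, k)
                     for k in range(min_majority, n_subjects + 1))
    # Either arm can be the larger one; the two tails do not overlap.
    return 2 * upper_tail / 2 ** n_subjects

# A 7:3 split (or worse) among 10 subjects is common.
print(prob_imbalance_at_least(10, 7))      # 0.34375
# The same proportional imbalance among 1000 subjects is vanishingly rare.
print(prob_imbalance_at_least(1000, 700))
```

Under these assumptions roughly one small trial in three would end up with a 7:3 split or worse, which is the motivation for the blocked and stratified schemes.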
Whereas the entire trial might intend to enroll 1000 patients, each center might contribute only 10 patients to the total. To prevent between-center bias (recall that each sample population has differences even if there is matching on known confounders), blocked randomization can be utilized. Blocked randomization means that randomization occurs within each center, ensuring that about half of the patients in each center will be randomized to the intervention and half to the control. If this approach were not used, one center might enroll 10 patients to the intervention and another center 10 patients to the control group. Recall that the main objective of randomization is to produce between-group comparability. If one knows prior to study implementation that there might be differences that are not equally distributed between groups (again, more likely with small sample sizes), stratified randomization can be used. For example, if age might be an important indicator of drug efficacy, one can randomize within strata of age groups (e.g. 50–59, 60–69, etc.).
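The three allocation schemes described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of any particular trial: the function names, the arm labels, and the block sizes of 2 and 4 are all hypothetical choices, and a real trial would also need allocation concealment.

```python
import random
from collections import defaultdict

ARMS = ("intervention", "control")

def simple_randomization(n, rng):
    """Independent fair 'coin toss' per subject; arm sizes may be unequal."""
    return [rng.choice(ARMS) for _ in range(n)]

def blocked_randomization(n, rng, block_size=4):
    """Permuted blocks: every complete block holds equal numbers per arm."""
    if block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    allocations = []
    while len(allocations) < n:
        block = list(ARMS) * (block_size // len(ARMS))
        rng.shuffle(block)  # random order within the block
        allocations.extend(block)
    return allocations[:n]  # a final partial block may be slightly unbalanced

def stratified_randomization(subject_strata, rng, block_size=4):
    """Blocked randomization run separately within each stratum.
    `subject_strata` maps a subject id to its stratum label (e.g. age band)."""
    by_stratum = defaultdict(list)
    for subject, stratum in subject_strata.items():
        by_stratum[stratum].append(subject)
    assignment = {}
    for subjects in by_stratum.values():
        arms = blocked_randomization(len(subjects), rng, block_size)
        assignment.update(zip(subjects, arms))
    return assignment

rng = random.Random(2008)  # fixed seed only so the sketch is reproducible

# One center contributing 10 patients, in blocks of 2: always a 5/5 split.
center = blocked_randomization(10, rng, block_size=2)
print(center.count("intervention"), center.count("control"))  # 5 5

# Age strata as in the text; subject ids are made up for illustration.
ages = {"s1": "50-59", "s2": "50-59", "s3": "60-69", "s4": "60-69",
        "s5": "50-59", "s6": "50-59", "s7": "60-69", "s8": "60-69"}
print(stratified_randomization(ages, rng, block_size=4))
```

Smaller blocks keep the arms balanced at every point during enrollment but make the next assignment easier to guess, which is one reason block size is a design decision rather than a fixed rule.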
Within each stratum, randomization can be simple or blocked. In review, simple randomization is the individual allocation of subjects into the intervention and control groups; block randomization creates small groups (blocks) in which there are equal numbers in each treatment arm, so that there are balanced numbers throughout a multi-center trial; and stratified randomization separates known confounders into strata so that they can no longer confound the study results. Again, randomization is likely the most important key to valid study results because (if the sample size is large enough) it distributes known, and more importantly unknown, confounders equally to the intervention and control groups.

Now, as to the problems associated with randomization. As previously discussed, the issue of confounders of relationships is inherent in all clinical research. A confounder is a factor that is associated with both the risk factor and the outcome, and leads to a false apparent association between the risk factor and outcome (see Fig 3.2). In observational studies, there are two alternative approaches to remove the effect of confounders:

● Most commonly used in case/control studies, one can match the case and control populations on the levels of potential confounders. Through this matching the investigator is assured that both those with a positive outcome (cases) and those with a negative outcome (controls) have similar levels of the confounder. Since, by definition, a confounder has to be associated with both the risk factor and the outcome, and since through matching the suspected confounder is not associated with the outcome, the factor cannot affect the observed differences in the outcome. For example, in a study of stroke, one may match age and race for stroke cases and community controls, with the result that both those with and without strokes will have similar distributions for these variables, and differences in associations with other potential predictors
are not likely to be confounded, for example, by higher rates in older or African American populations.

● In all types of observational epidemiological studies, one can statistically/mathematically ‘adjust’ for the confounders. Such an adjustment allows for the comparison between those with and without the risk factor at a ‘fixed level’ of the confounding factor. That is, the association between the exposure and the potential confounding factor is removed (those with and without the exposure are assessed at a common level of the confounder), and as such the potential confounder cannot bias the association between the exposure and the outcome. For example, in a longitudinal study assessing the potential impact of hypertension on stroke risk, the analysis can ‘adjust’ for race and other factors. This adjustment implies that those with and without the exposure (hypertension) are assessed as if race were not associated with both the exposure and outcome.

The major shortcoming of either of these approaches is that one must know what the potential confounders are in order to match or adjust for them, and it is the unknown confounders that represent the bigger problem. Another issue is that even if one suspects a confounder, one must be able to measure it appropriately. For example, a commonly addressed confounder is socio-economic status (usually a combination of education and income), but there is clearly disagreement about which measure or cutpoint is appropriate. The bottom line is that one can never perfectly measure all known confounders, and certainly one cannot measure or match for unknown confounders. As mentioned, the strength of the RCT is that randomization (performed properly and with a large enough sample size) balances both the known and unknown confounders between the interventional and control groups. But even with an RCT, randomization can be compromised, as will be discussed in some of the following chapters,
and by the following example from “Student’s” Collected Papers regarding the Lanarkshire Milk Experiment.15 “Student” (i.e. the great William Sealy Gosset) criticized the experiment for its loss of control over treatment assignment. As quoted:

Student’s “contributions to statistics, in spite of a unity of purpose, ranged over a wide field from spurious correlation to Spearman’s correlation coefficient. Always kindly and unassuming, he was capable of a generous rage, an instance of which is shown in his criticism of the Lanarkshire Milk Experiment. This was a nutritional experiment on a very large scale. For four months 5,000 school children received three-quarters of a pint of raw milk a day, 5,000 children the same quantity of pasteurized milk, and 10,000 other children were selected as controls. The experiment, in Gosset’s view, was inconclusive in determining whether pasteurized milk was superior in nutritional value to raw milk. This was due to failure to preserve the random selection of controls as originally planned. “In any particular school where there was any group to which these methods (i.e., of random selection) had given an undue proportion of well-fed or ill-nourished children, others were substituted to obtain a more level selection.” The teachers were kind-hearted and tended to select ill-nourished as feeders and well-nourished as controls. Student thought that among 20,000 children some 200–300 pairs of twins would be available, of which some 50 pairs would be identical (of the same sex) and half the remainder non-identical of the same sex. The 50 pairs of identicals would give more reliable results than the 20,000 dealt with in the experiment, and great expense would be saved. It may be wondered, however, whether Student’s suggestion would have proved free from snags. Mothers can be as kind-hearted as teachers, and if one of a pair of identical twins seemed to his mother to be putting on weight […]

Implications of Eligibility Criteria

In every study there are substantial gains in statistical power from focusing the intervention on a homogeneous patient population likely to respond to treatment, and from excluding patients who could introduce ‘noise’ by their inconsistent responses to treatment. Conversely, at the end of a trial there is a need to generalize the findings to a broad spectrum of patients who could potentially benefit from the superior treatment. These conflicting demands require balancing the inclusion/exclusion (eligibility) criteria so that the enrolled patients are as much alike as possible but, on the other hand, as diverse as possible, in order to be able to apply the results to the more general population (i.e. generalizability). Fig 3.2 outlines this balance. What is the correct way of achieving this balance? There really is no correct answer. There is always a tradeoff between homogeneity and generalizability, and each study has to address this given the availability of subjects, along with other considerations. This process of sampling represents one of the reasons that scientific inquiry requires reproducibility of results; that is, one study generally cannot be relied upon to portray ‘truth’, even if it is an RCT. The process of sampling embraces the concept of generalizability. The issue of generalizability is nicely portrayed in a video entitled ‘A Village of 100’.16 If one […]

Fig 3.2 The balance of conflicting issues involved with patient selection. [Figure: Implications of Eligibility Criteria. Homogeneity: a divergent subgroup of patients (i.e., “weird” patients) can distort findings for the majority; restriction of the population reduces “noise” and allows the study to be done in a smaller sample size; restrict the population to a homogeneous group. Generalizability: at the end of the study, it will be important to apply findings to the broad population of patients with the disease; it is questionable to generalize the findings to those excluded from the study; have broad inclusion criteria “welcoming” all. What is the correct answer? There is no correct answer!]

Table 3.5 Checklist for subgroup analyses

Design
■ Are the subgroups based on pre-randomisation characteristics?
■ What is the impact of patient misallocation on the subgroup analysis?
■ Is the intention-to-treat population being used in the subgroup analysis?
■ Were the subgroups planned a priori?
■ Were they planned in response to existing trial or biological data?
■ Was the expected direction of the subgroup effect stated a priori?
■ Was the trial designed to have adequate power for the proposed subgroup analysis?

Reporting
■ Is the total number of subgroup analyses undertaken declared?
■ Are relevant summary data, including event numbers and denominators, tabulated?
■ Are analyses decided on a priori clearly distinguished from those decided on a posteriori?

Statistical analysis
■ Are the statistical tests appropriate for the underlying hypotheses?
■ Are tests for heterogeneity (i.e., interaction) statistically significant?
■ Are there appropriate adjustments for multiple testing?

Interpretation
■ Is appropriate emphasis being placed on the primary outcome of the study?
■ Is the validity of the findings of the subgroup analysis discussed in the light of current biological knowledge and the findings from similar trials?
Cook DI, et al. Subgroup analysis in clinical trials. MJA 2004; 180: 289–291. © 2004 The Medical Journal of Australia. Reproduced with permission.

[…] subgroups appropriately defined (that is, be careful about subgroups that are based upon characteristics measured after randomization, e.g. adverse drug events may be more common as reasons for withdrawal from the active treatment arm, whereas lack of efficacy may be more common in the placebo arm); 2) were the subgroup analyses planned before the implementation of the study (in contrast to after study completion or during the conduct of the study); 3) does the study report include enough information to assess the validity of the analysis, e.g. the number of subgroup analyses; 4) do the statistical analyses use multiplicity and interaction testing; 5) were the results of subgroup analyses interpreted with caution; 6) is there replication of the subgroup analysis in another independent study; 7) was a dose-response relationship demonstrated; 8) was there reproducibility of the observation within individual sites; and 9) is there a biological explanation.

Traditional versus Equivalence Testing (Table 3.6)

Most clinical trials have been designed to assess whether there is a difference in the efficacy of two (or more) alternative treatment approaches (with placebo usually being the comparator treatment). There are reasons why placebo controls are preferable to active controls, not the least of which is the ability to distinguish an effective treatment from a less effective treatment. However, if a new treatment is considered to be equally effective but perhaps less expensive and/or invasive, or a placebo control is considered unethical, then the new treatment needs to be compared to an established therapy, and the new treatment would be considered preferable to the established therapy even if it is just as good as (not necessarily better than) the old.

Table 3.6 The types of RCTs and their relationship to hypothesis testing7
RCT type | Null hypothesis | Alternative hypothesis
Traditional | New = Old | New ≠ Old (i.e., New < Old or New > Old)
Equivalence | New < Old + δ (where δ is a “cushion,” that is, the new is at least δ worse than the old) | New ≥ Old + δ
Non-inferiority | New < Old | New = Old

The ethical issues surrounding the use of a placebo control, and the need to show a new treatment to be only ‘as good as’ (rather than better), have given rise to the recent interest in equivalence testing. With traditional (superiority) hypothesis testing, the null hypothesis states that ‘there is no difference between treatment groups’ (i.e. New = Old, or placebo, or standard therapy). Rejecting the null then allows one to state definitively whether one treatment is better (or worse) than another (i.e. New > or < Old). The disadvantage is that if, at the conclusion of an RCT, there is no evidence of a difference, one cannot state that the treatments are the same, or that one is as good as the other, only that the data are insufficient to show a difference. That is, the null hypothesis is never accepted; it is simply the case that it cannot be rejected. The appropriate statement when the null hypothesis is not rejected is ‘there is not sufficient evidence in these data to establish if a difference exists.’

Equivalence testing in essence ‘flips’ the traditional null and alternative hypotheses. Using this approach, the null hypothesis is that the new treatment is worse than the old treatment (i.e. New < Old); that is, rather than assuming that there is no difference, the null hypothesis is that a difference exists and the new treatment is inferior. Just as in traditional testing, the two actions available from the statistical test are 1) reject the null hypothesis, or 2) fail to reject the null hypothesis. However, with equivalence testing, rejecting the null hypothesis is making the statement that the new treatment is not worse than the old treatment, implying the alternative, that is, ‘that the new treatment is as good
as or better than the old’ (i.e. New ≥ Old). Hence, this approach allows a definitive conclusion that the new treatment is as good as the old. One caveat is the definition of ‘as good as,’ which is defined as being in the ‘neighborhood’, or having a difference that is so small that it is considered clinically unimportant (generally, event rates within +/− 2%; this is known as the equivalence or non-inferiority margin, usually indicated by the symbol δ). The need for this ‘neighborhood’ that is considered ‘as good as’ exposes the first shortcoming of equivalence testing: having to make a statement that ‘I reject the null hypothesis that the new treatment is worse than the old, and accept the alternative hypothesis that it is as good or better, and by that I mean that it is within at least 2% of the old’ (the qualifying clause, italicized in the original, is rarely included in the conclusions of a manuscript). A second disadvantage of equivalence testing is that no definitive statement can be made that there is evidence that the new treatment is worse. Just as in traditional testing, one never accepts the null hypothesis; one only fails to reject it. Hence, if the null is not rejected, all one can really say is that there is insufficient evidence in these data that the new treatment is as good as or better than the old treatment. Another problem with equivalence testing is that one has to rely on the effectiveness of the active control obtained in previous trials, and on the assumption that the active control would be equally effective under the conditions of the present trial.

An example of an equivalence trial is the Controlled ONset Verapamil INvestigation of Cardiovascular Endpoints study (CONVINCE), a trial that also raised some ethical issues that are different from those usually involved in RCTs.32 CONVINCE was a large double-blind clinical trial intended to assess the equivalence of verapamil and standard therapy in preventing cardiovascular disease-related events in
hypertensive patients. The results of the study indicated that the verapamil preparation was not equivalent to standard therapy because the upper bound of the 95% confidence limit (1.18) slightly exceeded the pre-specified boundary of 1.16 for equivalence. However, the study was stopped prematurely for commercial reasons. This not only hobbled the findings in terms of inadequate power, it also meant that participants who had been in the trial for years were subjected to a ‘breach of contract’. That is, they had subjected themselves to the risk of an RCT with no ultimate benefit. There was a good deal of criticism borne by the pharmaceutical company involved in the decision to discontinue the study early. Parenthetically, the company involved no longer exists.

Another variant of equivalence testing is non-inferiority testing. Here the question is again slightly different, in that one is asking whether the new intervention is simply not inferior to the comparator (i.e. New […]

[…] 50%) is by itself an inadequate benefit for drug approval. As stated in the commentary by Kelsen, ‘the critical question in the debate over the adequacy of response rate as a surrogate endpoint for survival is whether an objective response to treatment is merely associated with a better survival, or whether the tumor regression itself lengthens survival.’

It should be understood that there are differences among an intermediate endpoint, a correlate, and a surrogate endpoint, although an intermediate endpoint may serve as a surrogate. Examples of intermediate endpoints include such things as angina pectoris or hyperglycemic symptoms; i.e. these are not the ultimate outcome of interest (MI, death, etc.) but are of value to the patient should they be benefited by an intervention. Another example is from the earlier CHF literature, where exercise walking time was used as an intermediate endpoint as well as a surrogate marker (in lieu of survival). A number of drugs improved exercise walking time in the CHF patient; but
long-term studies proved that the same agents that improved walking time resulted in earlier death.

An example of surrogate ‘misadventure’ is characterized by a hypothetical scenario in which a new drug is used in pneumonia and is found to lower the patient’s white blood count (WBC, used here as a surrogate marker for improvement in the patient’s pneumonia). Subsequently, this ‘new drug’ is found to be cytotoxic to WBCs but obviously had little effect on the pneumonia. But perhaps the most glaring example of a surrogate ‘misadventure’ is represented by a real trial, the Cardiac Arrhythmia Suppression Trial (CAST).41 At the time of CAST, premature ventricular contractions (PVCs) were thought to be a good surrogate for ventricular tachycardia or ventricular fibrillation, and thereby for sudden cardiac death (SCD). It was determined that many anti-arrhythmic agents available at the time or being developed reduced PVCs, and it was assumed that they would benefit the real outcome of interest, SCD. CAST was proposed to test the hypothesis that these anti-arrhythmic agents did actually reduce SCD (in a post-MI population), and this study was surrounded by some furor about its ethics, since a placebo control was part of the study design (it was felt strongly by many that the study was unethical: since it was so likely that a reduction in PVCs led to a reduction in SCD, how could one justify a placebo arm?). In fact, it turned out that the anti-arrhythmic therapy not only failed to reduce SCD, but in some cases increased its frequency.

A final example occurred in 2007, when the Chairman of the FDA Advisory panel that reviewed the safety of rosiglitazone stated that the time has come to abandon surrogate endpoints for the approval of type 2 diabetes drugs. This resulted from the use of glycated hemoglobin as a surrogate for diabetes morbidity and mortality, as exemplified in the ADOPT (A Diabetes Outcome Progression Trial) study, where patients taking rosiglitazone had a greater
decrease in glycosylated hemoglobin than patients taking comparator drugs, yet the risks of CHF and cardiovascular ischemia were higher with rosiglitazone.42

Fig 3.7 Depicts a correlation (statistically significant) between a causal factor and a clinical event. However, while treatment impacted the intermediate (surrogate) event, it had no effect on the clinical event since it does not lie in the direct pathway. (Diagram labels: unknown causal factor; Rx; intermediate (surrogate) event; clinical event.)

Correlates may or may not be good surrogates. Recall ‘that a surrogate endpoint requires that the effect of the intervention on the surrogate end-point predicts the effect on the clinical outcome, a much stronger condition than correlation.’36 Another major point of confusion is that between statistical correlation and proof of causality, as demonstrated in Fig 3.7 and as discussed by Boissel et al.43

In summary, it should be understood that most (many) potential surrogate markers used in clinical research have been inadequately validated, and that the surrogate marker must fully (or nearly so) capture the effect of the intervention on the clinical outcome of interest. However, many if not most treatments have several effect pathways, and this may not be realized, particularly early in the research of a given intervention. Table 3.8 summarizes some of the issues that favor support in using a surrogate. Surrogate endpoints are most useful in phase […] and […] trials where ‘proof of concept’ or dose-response is being evaluated. One very important additional downside to the use of surrogate measures is a result of their strength, i.e. the ability to use smaller sample sizes and shorter trials in order to gain insight into the benefit of an intervention. This is because smaller and shorter-term studies result in the loss of important safety information.

Selection of Endpoints

Table 3.8 makes the point that for most clinical trials, one of the key considerations is the difference in events between
the investigational therapy and the control. It is this difference (along with the frequency of events) that drives the sample size and power of the study. From Table 3, one can compare the event rate in the control group with the intervention effect. Thus, if the rate in the control group of the event of interest is high (say 20%) and the treatment effect is 20% (i.e. an expected 50% reduction compared to control), a sample size of 266 patients would be necessary. Compare that to a control rate of 2% and a treatment effect of 10% (i.e. a reduction compared to control from 2% to 1.8%), where a sample size of 97,959 would be necessary.

Table 3.8 Support for surrogates

Biological plausibility
  Favors surrogate: Epi evidence extensive, consistent, quantitative; credible animal model; pathogenesis & drug mechanism understood; surrogate late in causal path
  Does not favor surrogate: Inconsistent epi; no animal model; pathogenesis unclear; mechanisms not studied; surrogate earlier in causal path
Success in clinical trials
  Favors surrogate: Effect on surrogate has predicted outcome with other drugs in class; and in several classes
  Does not favor surrogate: Inconsistent results across classes
Risk/benefit, public health considerations
  Favors surrogate: Serious or life-threatening illness and no alternative Rx; large safety database; short term use; difficult to study clinical end point
  Does not favor surrogate: Less serious disease; little safety data; long term use; easy to study clinical endpoint

Composite endpoints

Composite endpoints (rather than a single endpoint) are being increasingly used as effect sizes for most new interventions are becoming smaller. Effect sizes are becoming smaller because newer therapies need to be assessed when added to all clinically accepted therapies, and thus the chance for an incremental change is reduced. For example, when the first therapies for heart failure were introduced, they were basically added to diuretics and digitalis. Now, a new therapy for heart failure would have to show benefit in patients already receiving more
powerful diuretics, digitalis, angiotensin converting enzyme inhibitors and/or angiotensin receptor blockers, appropriately used beta adrenergic blocking agents, statins, etc. To increase the ‘yield’ of events, composite endpoints are utilized (a group of individual endpoints that together form a ‘single’ endpoint for that trial). Thus, the rationale for composite endpoints comes from basic considerations: statistical issues (sample size considerations due to the need for high event rates in the trial in order to keep the trial relatively small, of shorter duration, and less expensive), the pathophysiology of the disease process being studied, and the increasing need to evaluate an overall clinical benefit.

The risk associated with the use of composite endpoints is that the benefits ascribed to an intervention are assumed to relate to all the components of the composite. Consider the example of a composite endpoint that includes death, MI, and urgent revascularization. In choosing the components of the composite, one should not be driven by the least important variable just because it happens to be the most frequent (e.g. a composite of death, MI, and urgent revascularization would be a problem if revascularization turned out to be the main positive finding). Montori et al provided guidelines for interpreting composite endpoints, which included asking whether the individual components of composite endpoints were of similar importance, occurred with about the same frequency, had similar relative risk reductions, and had similar biologic mechanisms.44

Freemantle et al assessed the incidence and quality of reporting of composite endpoints in randomized trials and asked whether composite endpoints provide for greater precision but with greater uncertainty.45 Their conclusion was that the reporting of composite outcomes is generally inadequate and, as a result, they provided several recommendations regarding the use of composite endpoints, such as following the CONSORT
guidelines, interpreting the composite endpoint rather than parsing the individual endpoints, and defining the individual components of the composite as secondary outcomes. Their recommendations stemmed from their observations that many reports showed inappropriate attribution of the treatment effects to specific endpoints when only composite endpoints yielded significant results, the effect of dilution when individual endpoints might not all react in the same direction, and the effect of excessively influential endpoints that are not associated with irreversible harm.

In an accompanying editorial, Lauer and Topol list a number of key questions that should be considered when composite endpoints are reported or when an investigator is contemplating their use.46 First, are the end points themselves of clinical interest to patients and physicians, or are they surrogates? Second, how are non-fatal endpoints measured (e.g. is judgment involved in the end point ascertainment, or is it a hard end point)? Third, how many individual endpoints make up the composite, and how are they reported (ideally each component of the composite should be of equal clinical importance; in fact, this is rarely the case)? Finally, how are non-fatal events analyzed, that is, are they subject to competing risks? As they point out, patients who die cannot later experience a non-fatal event, so a treatment that increases the risk of death may appear to reduce the risk of non-fatal events.46

Kip et al47 reviewed the problems with the use of composite endpoints in cardiovascular studies. The term ‘major adverse cardiac events’, or MACE, is used frequently in cardiovascular studies, a term that was born with the percutaneous coronary intervention studies in the 1990s. Kip et al noted that MACE encompassed a variety of composite endpoints, the varying definitions of which could lead to different results and conclusions, leading them to the
recommendation that MACE should be avoided. Fig 3.8 from their article demonstrates this latter point rather well.

Fig 3.8 Adjusted hazard ratios for different definitions of major adverse cardiac events (MACE) comparing acute myocardial infarction (MI) versus nonacute MI patients (top) and patients with multilesion versus single-lesion percutaneous coronary intervention (bottom). Filled center circles depict the adjusted hazard ratios; filled circles at the left and right ends depict the lower and upper 95% confidence limits. Revasc = revascularization; ST = stent thrombosis; TVR = target vessel revascularization.

Trial Duration

An always critical decision in performing or reading about an RCT (or any study for that matter) is the specified duration of follow-up, and how that might influence a meaningful outcome. Many examples and potential problems exist in the literature, but basically, in interpreting the results of any study (positive or negative), the question should be asked: ‘what would have happened had a longer follow-up period been chosen?’ A recent example is the Canadian Implantable Defibrillator Study (CIDS).48 CIDS was an RCT comparing the effects of defibrillator implantation to amiodarone in preventing recurrent sudden cardiac death in 659 patients. At the end of the study (a mean of […] months), a 20% relative risk reduction occurred in all-cause mortality, and a 33% reduction occurred in arrhythmic mortality, when ICD therapy was compared with amiodarone (this latter reduction did not reach statistical significance). At one center, it was decided to continue the follow-up for an additional mean of 5.6 years in 120 patients who remained on their originally assigned intervention.49 All-cause mortality was then found to be increased in the amiodarone group.

The Myocardial Ischemia Reduction with Aggressive Cholesterol Lowering (MIRACL) trial is an example in which study duration could have been problematic (but probably wasn’t).50 The
central hypothesis of MIRACL was that early, rapid, and profound cholesterol-lowering therapy with atorvastatin could reduce early recurrent ischemic events in patients with unstable angina or acute non-Q wave infarction. Often with acute intervention studies, the primary outcome is assessed at 30 days after the sentinel event. From Fig 3.9 one can see that there was no difference in the primary outcome at 30 days. Fortunately, the study specified a 16-week follow-up, and a significant difference was seen at that time point. Had the study been stopped at 30 days, the ultimate benefit would not have been realized.

Fig 3.9 The results of MIRACL for the primary outcome (time to first occurrence of death from any cause, nonfatal MI, resuscitated cardiac arrest, or worsening angina with new objective evidence requiring urgent rehospitalization). Cumulative incidence over 16 weeks since randomization was 17.4% with placebo and 14.8% with atorvastatin (relative risk = 0.84, p = 0.048). What would have been the conclusion for the intervention if the pre-specified study endpoint was […] month?

Finally, an example comes from the often cited, controversial ALLHAT study, which demonstrated a greater incidence of new diabetes in the diuretic arm as assessed at the study end of […] years.51 The investigators pointed out that this increase in diabetes did not result in a statistically significant difference in adverse
outcomes when the diuretic arm was compared to the other treatment arms. Many experts have subsequently opined that the trial duration was too short to assess adverse outcomes from diabetes, and that had the study gone on longer, it is likely that a significant difference in adverse complications from diabetes would have occurred.

The Devil Lies in the Interpretation

It is interesting to consider, and important to reemphasize, that intelligent people can look at the same data and render differing interpretations. MRFIT is exemplary of this principle, in that it demonstrates how misinterpretation can have far-reaching effects. One of the conclusions from MRFIT was that reduction in cigarette smoking and cholesterol was effective, but ‘possibly an unfavorable response to antihypertensive drug therapy in certain but not all hypertensive subjects’ led to mixed benefits.22 This ‘possibly unfavorable response’ has since been at least questioned, if not proven to be false. Differences in interpretation were also seen in the alpha-tocopherol, beta carotene cancer study.23 To explain the lack of benefit and potential worsening of cancer risk in the treated patients, the authors opined that perhaps the wrong dose was used, or that the intervention period was too short, since ‘no known or described mechanisms and no evidence of serious toxic effects of this substance (beta carotene) in humans’ had been observed. This points out how one’s personal bias can influence one’s ‘shaping’ of the interpretation of a trial’s results. Finally, there are many examples of trials where an interpretation of the results is initially presented, only to find that after publication differing interpretations are rendered. Just consider the recent controversy over the interpretation of the ALLHAT results.51

Causal inference and the role of the media in reporting clinical research will be discussed in chapters 16 and 20.

Conclusions

While randomized clinical trials remain a ‘gold standard’, there
remain many aspects of trial design that must be considered before accepting a study’s results, even when the study design is an RCT. Starzl et al, in their article entitled ‘Randomized Trialomania? The Multicentre Liver Transplant Trials of Tacrolimus’, outline many of the roadblocks and pitfalls that can befall even the most conscientious clinical investigator.52 Ioannidis presents an even more somber view of clinical trials, and has stated ‘there is increasing concern that in modern research, false findings may be the majority or even the vast majority of published research claims. However, this should not be surprising. It can be proven that most claimed research findings are false.’53

One final note of caution revolves around reading or reporting only abstracts in decision making. As Toma et al noted, ‘not all research presented at scientific meetings is subsequently published, and even when it is, there may be inconsistencies between these results and what is ultimately printed.’54 They compared RCT abstracts presented at the American College of Cardiology sessions between 1999 and 2002 with subsequent full-length publications. Depending upon the type of presentation (e.g. late-breaking trials vs other trials), 69-79% were ultimately published; and discrepancies between meeting abstracts and publication results were common, even for the late-breaking trials.54

References

1. Grady D, Herrington D, Bittner V, et al. Cardiovascular disease outcomes during 6.8 years of hormone therapy: Heart and Estrogen/progestin Replacement Study follow-up (HERS II). Jama. Jul 3 2002;288(1):49-57.
2. Hulley S, Grady D, Bush T, et al. Randomized trial of estrogen plus progestin for secondary prevention of coronary heart disease in postmenopausal women. Heart and Estrogen/progestin Replacement Study (HERS) Research Group. Jama. Aug 19 1998;280(7):605-613.
3. Rossouw JE, Anderson GL, Prentice RL, et al. Risks and benefits of estrogen plus progestin in healthy postmenopausal women: principal results
from the Women’s Health Initiative randomized controlled trial. Jama. Jul 17 2002;288(3):321-333.
4. Grady D, Rubin SM, Petitti DB, et al. Hormone therapy to prevent disease and prolong life in postmenopausal women. Ann Intern Med. Dec 15 1992;117(12):1016-1037.
5. Stampfer MJ, Colditz GA. Estrogen replacement therapy and coronary heart disease: a quantitative assessment of the epidemiologic evidence. Prev Med. Jan 1991;20(1):47-63.
6. Sullivan JM, Vander Zwaag R, Hughes JP, et al. Estrogen replacement and coronary artery disease. Effect on survival in postmenopausal women. Arch Intern Med. Dec 1990;150(12):2557-2562.
7. Glasser SP, Howard G. Clinical trial design issues: at least 10 things you should look for in clinical trials. J Clin Pharmacol. Oct 2006;46(10):1106-1115.
8. Grimes DA, Schulz KF. An overview of clinical research: the lay of the land. Lancet. Jan 5 2002;359(9300):57-61.
9. Loscalzo J. Clinical trials in cardiovascular medicine in an era of marginal benefit, bias, and hyperbole. Circulation. Nov 15 2005;112(20):3026-3029.
10. Bienenfeld L, Frishman W, Glasser SP. The placebo effect in cardiovascular disease. Am Heart J. Dec 1996;132(6):1207-1221.
11. Clark PI, Leaverton PE. Scientific and ethical issues in the use of placebo controls in clinical trials. Annu Rev Public Health. 1994;15:19-38.
12. Rothman KJ, Michels KB. The continuing unethical use of placebo controls. N Engl J Med. Aug 11 1994;331(6):394-398.
13. Montori VM, Devereaux PJ, Adhikari NK, et al. Randomized trials stopped early for benefit: a systematic review. Jama. Nov 2 2005;294(17):2203-2209.
14. Medical Research Council. Streptomycin treatment of pulmonary tuberculosis. BMJ. 1948;ii:769-782.
15. Reviews of statistical and economic books, Student’s Collected Papers. J Royal Statistical Society. 1943;106:278-279.
16. A Village of 100 A Step Ahead.
17. Beneficial effect of carotid endarterectomy in symptomatic patients with high-grade carotid stenosis. North American Symptomatic Carotid Endarterectomy Trial Collaborators. N Engl J Med. Aug 15
1991;325(7):445-453.
18. Endarterectomy for asymptomatic carotid artery stenosis. Executive Committee for the Asymptomatic Carotid Atherosclerosis Study. Jama. May 10 1995;273(18):1421-1428.
19. Lang JM. The use of a run-in to enhance compliance. Stat Med. Jan-Feb 1990;9(1-2):87-93; discussion 93-85.
20. Shem S. The House of God: Palgrave Macmillan; 1978:280.
21. Smith DH, Neutel JM, Lacourciere Y, Kempthorne-Rawson J. Prospective, randomized, open-label, blinded-endpoint (PROBE) designed trials yield the same results as double-blind, placebo-controlled trials with respect to ABPM measurements. J Hypertens. Jul 2003;21(7):1291-1298.
22. Multiple risk factor intervention trial. Risk factor changes and mortality results. Multiple Risk Factor Intervention Trial Research Group. Jama. Sep 24 1982;248(12):1465-1477.
23. The effect of vitamin E and beta carotene on the incidence of lung cancer and other cancers in male smokers. The Alpha-Tocopherol, Beta Carotene Cancer Prevention Study Group. N Engl J Med. Apr 14 1994;330(15):1029-1035.
24. Hollis S, Campbell F. What is meant by intention to treat analysis?
Survey of published randomised controlled trials. Bmj. Sep 11 1999;319(7211):670-674.
25. Influence of adherence to treatment and response of cholesterol on mortality in the coronary drug project. N Engl J Med. Oct 30 1980;303(18):1038-1041.
26. Sulfinpyrazone in the prevention of sudden death after myocardial infarction. The Anturane Reinfarction Trial Research Group. N Engl J Med. Jan 31 1980;302(5):250-256.
27. Sackett DL, Gent M. Controversy in counting and attributing events in clinical trials. N Engl J Med. Dec 27 1979;301(26):1410-1412.
28. Howard G, Chambless LE, Kronmal RA. Assessing differences in clinical trials comparing surgical vs nonsurgical therapy: using common (statistical) sense. Jama. Nov 1997;278(17):1432-1436.
29. Assmann SF, Pocock SJ, Enos LE, Kasten LE. Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet. Mar 25 2000;355(9209):1064-1069.
30. Sleight P. Debate: Subgroup analyses in clinical trials: fun to look at - but don’t believe them! Curr Control Trials Cardiovasc Med. 2000;1(1):25-27.
31. Amarenco P, Goldstein LB, Szarek M, et al. Effects of intense low-density lipoprotein cholesterol reduction in patients with stroke or transient ischemic attack: the Stroke Prevention by Aggressive Reduction in Cholesterol Levels (SPARCL) trial. Stroke. Dec 2007;38(12):3198-3204.
32. Black HR, Elliott WJ, Grandits G, et al. Principal results of the Controlled Onset Verapamil Investigation of Cardiovascular End Points (CONVINCE) trial. Jama. Apr 23-30 2003;289(16):2073-2082.
33. Weir MR, Ferdinand KC, Flack JM, Jamerson KA, Daley W, Zelenkofske S. A noninferiority comparison of valsartan/hydrochlorothiazide combination versus amlodipine in black hypertensives. Hypertension. Sep 2005;46(3):508-513.
34. Kaul S, Diamond GA, Weintraub WS. Trials and tribulations of non-inferiority: the ximelagatran experience. J Am Coll Cardiol. Dec 2005;46(11):1986-1995.
35. Le Henanff A, Giraudeau B, Baron G, Ravaud P. Quality of reporting of noninferiority and equivalence
randomized trials. Jama. Mar 2006;295(10):1147-1151.
36. Fleming TR, DeMets DL. Surrogate end points in clinical trials: are we being misled? Ann Intern Med. Oct 1996;125(7):605-613.
37. Prentice RL. Surrogate endpoints in clinical trials: definition and operational criteria. Stat Med. Apr 1989;8(4):431-440.
38. Anand IS, Florea VG, Fisher L. Surrogate end points in heart failure. J Am Coll Cardiol. May 2002;39(9):1414-1421.
39. Kelsen DP. Surrogate endpoints in assessment of new drugs in colorectal cancer. Lancet. Jul 29 2000;356(9227):353-354.
40. Buyse M, Thirion P, Carlson RW, Burzykowski T, Molenberghs G, Piedbois P. Relation between tumour response to first-line chemotherapy and survival in advanced colorectal cancer: a meta-analysis. Meta-Analysis Group in Cancer. Lancet. Jul 29 2000;356(9227):373-378.
41. Greene HL, Roden DM, Katz RJ, Woosley RL, Salerno DM, Henthorn RW. The Cardiac Arrhythmia Suppression Trial: first CAST then CAST-II. J Am Coll Cardiol. Apr 1992;19(5):894-898.
42. FDA Adviser Questions Surrogate Endpoints for Diabetes Drug Approvals. Medpage Today; 2007.
43. Boissel JP, Collet JP, Moleur P, Haugh M. Surrogate endpoints: a basis for a rational approach. Eur J Clin Pharmacol. 1992;43(3):235-244.
44. Montori VM, Busse JW, Permanyer-Miralda G, Ferreira I, Guyatt GH. How should clinicians interpret results reflecting the effect of an intervention on composite endpoints: should I dump this lump? ACP J Club. Nov-Dec 2005;143(3):A8.
45. Freemantle N, Calvert M, Wood J, Eastaugh J, Griffin C. Composite outcomes in randomized trials: greater precision but with greater uncertainty?
Jama. May 21 2003;289(19):2554-2559.
46. Lauer MS, Topol EJ. Clinical trials: multiple treatments, multiple end points, and multiple lessons. Jama. May 21 2003;289(19):2575-2577.
47. Kip K, Hollabaugh K, Marroquin O, Williams D. The problem with composite endpoints in cardiovascular studies. J Am Coll Cardiol. 2008;51:701-707.
48. Connolly SJ, Gent M, Roberts RS, et al. Canadian implantable defibrillator study (CIDS): a randomized trial of the implantable cardioverter defibrillator against amiodarone. Circulation. Mar 21 2000;101(11):1297-1302.
49. Bokhari F, Newman D, Greene M, Korley V, Mangat I, Dorian P. Long-term comparison of the implantable cardioverter defibrillator versus amiodarone: eleven-year follow-up of a subset of patients in the Canadian Implantable Defibrillator Study (CIDS). Circulation. Jul 13 2004;110(2):112-116.
50. Schwartz GG, Olsson AG, Ezekowitz MD, et al. Effects of atorvastatin on early recurrent ischemic events in acute coronary syndromes: the MIRACL study: a randomized controlled trial. Jama. Apr 2001;285(13):1711-1718.
51. Major outcomes in high-risk hypertensive patients randomized to angiotensin-converting enzyme inhibitor or calcium channel blocker vs diuretic: The Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT). Jama. Dec 18 2002;288(23):2981-2997.
52. Starzl TE, Donner A, Eliasziw M, et al. Randomised trialomania? The multicentre liver transplant trials of tacrolimus. Lancet. Nov 18 1995;346(8986):1346-1350.
53. Ioannidis JPA. Why most published research findings are false. PLoS. 2005;2:0696-0701.
54. Toma M, McAlister FA, Bialy L, Adams D, Vandermeer B, Armstrong PW. Transition from meeting abstract to full-length journal article for randomized controlled trials. Jama. Mar 15 2006;295(11):1281-1287.
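
Worked example. Two numerical points made earlier in this chapter can be checked with a short calculation: the CONVINCE equivalence boundary (an upper 95% confidence limit of 1.18 against a pre-specified boundary of 1.16) and the sample-size comparison under Selection of Endpoints. The sketch below is only an illustration under stated assumptions, not the method used to produce the chapter's table: it assumes a two-sided alpha of 0.05, 90% power, a normal-approximation formula for comparing two proportions, and that the quoted sample sizes are per treatment arm; none of these settings is stated in the text. The function names are hypothetical. The first scenario is read as a fall in event rate from 20% to 10% (the 50% relative reduction mentioned), the second as a fall from 2% to 1.8%.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_group(p_control, p_treated, alpha=0.05, power=0.90):
    """Approximate patients needed per arm to detect p_control vs p_treated.

    Normal-approximation formula for a two-sided test of two proportions.
    alpha and power are assumed defaults; the chapter does not state them.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treated) / 2
    # Standard deviation terms under the null (pooled) and the alternative
    null_sd = sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = sqrt(p_control * (1 - p_control) + p_treated * (1 - p_treated))
    n = ((z_alpha * null_sd + z_beta * alt_sd) / (p_control - p_treated)) ** 2
    return ceil(n)


def upper_bound_within_margin(ci_upper, margin):
    """True when the upper confidence limit does not exceed the pre-specified margin."""
    return ci_upper <= margin


# High event rate: 20% in control, 10% on treatment (50% relative reduction)
high_rate = patients_per_group(0.20, 0.10)
# Low event rate: 2% in control, 1.8% on treatment (10% relative reduction)
low_rate = patients_per_group(0.02, 0.018)
# CONVINCE: upper 95% confidence limit 1.18 versus pre-specified boundary 1.16
convince_equivalent = upper_bound_within_margin(1.18, 1.16)

print(high_rate, low_rate, convince_equivalent)
```

Under these assumed settings the first scenario needs 266 patients per arm, matching the figure quoted in the text, and the second needs roughly 98,000, the same order as the quoted 97,959 though not identical to it. The CONVINCE check returns False, which is the sense in which the verapamil preparation was judged not equivalent. The contrast between the two sample sizes is the point of the passage: at low event rates and small treatment effects, the required trial grows by several hundred fold.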

Date posted: 14/08/2014, 11:20
