Statistical Methods for Survival Data Analysis, Third Edition, Part 3

random sample of persons of the same age, gender, occupation, and so on, the patients could be considered "cured." Cutler et al. (1957, 1959, 1960a, b, 1967) adopted Greenwood's idea of comparing the survival experience of cancer patients with that of the general population to ascertain (1) the ratio of observed to expected survival rates and (2) whether, in time, the mortality rate declines to a "normal" level.

The relative survival rate is defined as the ratio of the survival rate (probability of surviving one year) for a patient under study (observed rate) to that for someone in the general population of the same age, gender, and race (expected rate) over a specified period of time. To provide a more precise measure of the relationship between the observed and expected survival rates, Cutler et al. suggest computing the ratio for each individual follow-up year. A relative rate of 100% means that during a specific follow-up year the mortality rates in the patients and in the general population are equal. A relative rate of less than 100% means that the mortality rate in the patients is higher than that in the general population. Cutler et al. use the survival rates in the Connecticut and U.S. life tables for the general population.

Using the notation of Table 4.6, the survival rate observed at time $t_i$ is $p_i$. The expected survival rate can be computed as follows. Suppose that at time $t_i$ there are $n'_i$ individuals alive for whom age, gender, race, and time of observation are known. Let $p^*_{ij}$ be the survival rate of the $j$th individual from general population life tables (with corresponding age, gender, and race). The expected survival rate is

$$p^*_i = \frac{1}{n'_i} \sum_{j=1}^{n'_i} p^*_{ij} \tag{4.3.1}$$

Then the relative survival rate at time $t_i$ is defined by

$$r_i = \frac{p_i}{p^*_i} \tag{4.3.2}$$

Example 4.5, taken from Cutler et al. (1957), illustrates the interpretation of relative survival rates.
Example 4.5 A total of 9121 breast cancer cases were diagnosed in Connecticut hospitals from 1935 to 1953. The Connecticut life table for white females, 1939-1941, is used in calculating the expected survival rate. Table 4.8 gives the observed and expected survival rates as well as the relative survival rates. Figure 4.5a shows these data graphically: the survival curves for the breast cancer patients and the general population. The relative survival rates are plotted in Figure 4.5b. For this group of patients, the relative survival rates, although increasing during 13 successive years, are less than 100% throughout the 15 years of follow-up. During each of the 15 years, the breast cancer patient mortality rate is greater than that of the general population.

Table 4.8 Relative Survival Rates of Breast Cancer Patients in Connecticut, 1935-1953

Years after     Survival Rates (%)       Relative Survival
Diagnosis       Observed    Expected     Rate (%)
0-1             82.9        97.2         85
1-2             83.3        97.1         86
2-3             85.9        96.9         89
3-4             86.8        96.7         90
4-5             89.2        96.6         92
5-6             90.0        96.4         93
6-7             89.9        96.4         93
7-8             91.6        96.2         95
8-9             92.0        96.1         96
9-10            92.7        96.1         96
10-11           92.9        95.9         97
11-12           94.0        95.8         98
12-13           94.1        95.3         99
13-14           91.5        95.3         96
14-15           90.6        94.9         95

Source: Cutler et al. (1957).

Other measures describing the survival experience of cancer patients are the five-year survival rate and the corrected rate. The five-year survival rate is simply the cumulative proportion surviving at the end of the fifth year. For example, the five-year survival rate for the males with angina pectoris in Example 4.4 is 0.5193. The five-year survival rate is no longer a measure of treatment success for patients with many types of cancer since the survival of cancer patients has improved considerably in the last few decades. Berkson (1942) suggests using a corrected survival rate. This is the survival rate if the disease under study alone is the cause of death.
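The relative rates in Table 4.8 can be reproduced directly from the observed and expected columns. The following is a minimal Python sketch, not part of the original text; the function name relative_survival_rates is chosen here for illustration, and the inputs are the percentages printed in the table.

```python
# Observed and expected yearly survival rates (%) for follow-up years
# 0-1 through 14-15, copied from Table 4.8 (Cutler et al., 1957).
observed = [82.9, 83.3, 85.9, 86.8, 89.2, 90.0, 89.9, 91.6,
            92.0, 92.7, 92.9, 94.0, 94.1, 91.5, 90.6]
expected = [97.2, 97.1, 96.9, 96.7, 96.6, 96.4, 96.4, 96.2,
            96.1, 96.1, 95.9, 95.8, 95.3, 95.3, 94.9]


def relative_survival_rates(observed, expected):
    """Return 100 * p_i / p*_i for each follow-up year (illustrative helper)."""
    return [100.0 * p / p_star for p, p_star in zip(observed, expected)]


relative = relative_survival_rates(observed, expected)
for year, r in enumerate(relative):
    # Each line shows the follow-up interval and the rounded relative rate.
    print(f"{year}-{year + 1}: {r:.0f}%")
```

Rounded to whole percentages, the computed values agree with the last column of Table 4.8, and every one of them stays below 100%.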
In most survival studies, the proportion of patients surviving is usually determined without considering the cause of death, which might be unrelated to the specific illness. If $p_c$ denotes the survival rate when cancer alone is the cause of death, Berkson proposes that

$$p_c = \frac{p}{p_0} \tag{4.3.3}$$

where $p$ is the observed total survival rate in a group of cancer patients and $p_0$ is the survival rate for a group of the same age and gender in the general population. Rate $p_c$ may be computed at any time after the initiation of follow-up; it provides a measure of the proportion of patients that escaped a death from cancer up to that point. For example, if the observed five-year survival rate is 0.5 and the five-year survival rate of the general population is 0.9, the survival rate corrected for noncancer deaths is 0.5/0.9, or 0.56.

Figure 4.5 Survival rates of breast cancer patients in Connecticut, 1935-1953.

4.4 STANDARDIZED RATES AND RATIOS

Rates and ratios are often used in demography and epidemiology to describe the occurrence of a health-related event. For example, the standardized mortality (or morbidity) ratio (SMR) is frequently used in occupational epidemiology as a measure of risk, and the standardized death rate is commonly used in comparing mortality experiences of different populations or the same population at different times.

The concept of the SMR is very similar to that of the relative survival rate described above. It is defined as the ratio of the observed to the expected number of deaths and can be expressed as

$$\mathrm{SMR} = \frac{\text{observed number of deaths in study population}}{\text{expected number of deaths in study population}} \times 100 \tag{4.4.1}$$

where the expected number of deaths is the sum of the expected deaths from the same age, gender, and race groups in the general population. The standardized morbidity ratio can be calculated similarly by replacing the word deaths by disease cases in (4.4.1).
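Both the corrected survival rate and the SMR are simple ratios, which a short Python sketch can make concrete. The function names are chosen here for illustration, and the death counts passed to smr are hypothetical numbers, not data from the text.

```python
def corrected_survival_rate(p_observed, p_general):
    """Berkson's corrected rate: observed survival rate divided by the
    survival rate of a comparable group in the general population."""
    return p_observed / p_general


def smr(observed_deaths, expected_deaths):
    """Standardized mortality ratio, expressed as a percentage."""
    return 100.0 * observed_deaths / expected_deaths


# Worked example from the text: 0.5 observed, 0.9 in the general population.
print(round(corrected_survival_rate(0.5, 0.9), 2))

# Hypothetical counts for illustration only: 30 observed deaths, 24 expected.
print(smr(30, 24))
```

An SMR above 100 indicates more deaths in the study population than expected from the general-population rates, in the same way that a relative survival rate below 100% indicates excess mortality.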
If only new cases are of interest, we call the ratio the standardized incidence ratio (SIR).

The standardized death rate is only one of the many rates used to describe the health status of a population or to compare the health status of different populations. If the populations are similar with respect to demographic variables such as age, gender, or race, the crude rate, or ratio of the number of persons to whom the event under study occurred to the total number of persons in the population, can safely be used for comparison. The level of the crude rate is affected by demographic characteristics of the population for which the rate is computed. If populations have different demographic compositions, a comparison of the crude rates may be misleading.

Table 4.9 Population and Deaths of Sunny City and Happy City by Age

         Sunny City                                Happy City
                               Age-Specific                            Age-Specific
                               Rates                                   Rates
Age      Population   Deaths   (per 1000)         Population  Deaths   (per 1000)
<25      25,000       25       1.00               55,000      110      2.0
25-44    40,000       50       1.25               20,000      50       2.5
45-64    20,000       200      10.00              21,000      315      15.0
>=65     15,000       1,200    80.00              4,000       650      162.5
Total    100,000      1,475                       100,000     1,125

As an example, consider the two hypothetical populations, Sunny City and Happy City, in Table 4.9. The crude death rate of Sunny City is 1000(1475/100,000), or 14.75 per 1000. The crude death rate of Happy City is 1000(1125/100,000), or 11.25 per 1000, which is lower than that of Sunny City even though all age-specific rates in Happy City are higher. This is mainly because there is a large proportion of older people in Sunny City. A crude death rate of a population may be relatively high merely because the population has a high proportion of older people; it may be relatively low because the population has a high proportion of younger people. Thus, one should adjust the rate to eliminate the effects of age, gender, or other differences.
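The contrast between crude and age-specific rates in Table 4.9 can be checked with a few lines of Python. This is an illustrative sketch; the dictionary layout and function names are assumptions made here, and the counts are those listed in the table.

```python
# Population and deaths by age group (<25, 25-44, 45-64, 65 and over), Table 4.9.
sunny = {"population": [25_000, 40_000, 20_000, 15_000],
         "deaths": [25, 50, 200, 1_200]}
happy = {"population": [55_000, 20_000, 21_000, 4_000],
         "deaths": [110, 50, 315, 650]}


def crude_rate(city, per=1000):
    """Total deaths divided by total population, scaled to 'per' persons."""
    return per * sum(city["deaths"]) / sum(city["population"])


def age_specific_rates(city, per=1000):
    """Deaths divided by population within each age group."""
    return [per * d / n for d, n in zip(city["deaths"], city["population"])]


print(crude_rate(sunny), crude_rate(happy))
print(age_specific_rates(sunny))
print(age_specific_rates(happy))
```

Every age-specific rate is higher in Happy City, yet its crude rate is lower (about 11.25 versus 14.75 per 1000), which is the reversal the paragraph above attributes to the difference in age composition.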
The procedure of adjustment is called standardization and the rate obtained after standardization is called the standardized rate. The most frequently used methods for standardization are the direct method and the indirect method.

Direct Method

In this method a standard population is selected. The distribution across the groups with different values of the demographic characteristic (e.g., different age groups) must be known. Let $r_1, \ldots, r_k$, where $k$ is the number of groups, be the specific rates of the different groups for the population under study. Let $p_1, \ldots, p_k$ be the proportions of people in the $k$ groups for the standard population. The direct standardized rate is obtained by multiplying the specific rates $r_i$ by $p_i$ in each group and summing over the groups. The formula for the direct standardized rate is

$$R_{\text{direct}} = \sum_{i=1}^{k} r_i p_i \tag{4.4.2}$$

As an example, consider the data in Table 4.9. If we choose a standard population whose distribution is shown in the second column of Table 4.10, the direct standardized death rates for Sunny City and Happy City are, respectively, 9.37 and 17.84 per 1000. These standardized rates are more reliable than the crude rates for comparison purposes.

Indirect Method

If the specific rates $r_i$ of the population being studied are unknown, the direct method cannot be applied. In this case, it is possible to standardize the rate by an indirect method if the following are available:

1. The number of persons to whom the event being studied occurred ($D$) in the population. For example, if the death rate is being standardized, $D$ is the number of deaths.
2. The distribution across the various groups for the population being studied, denoted by $n_1, \ldots, n_k$.
3. The specific rates of the selected standard population, denoted by $s_1, \ldots, s_k$.
4. The crude rate of the standard population, denoted by $r$.
The formula for indirect standardization is

$$R_{\text{indirect}} = \frac{D}{\sum_{i=1}^{k} n_i s_i}\, r \tag{4.4.3}$$

The summation in (4.4.3) is the expected number of persons to whom the event occurred on the basis of the specific rates of the standard population. Thus, the indirect method adjusts the crude rate of the standard population by the ratio of the observed to expected number of persons to whom the event occurred in the population under study.

Table 4.11 presents an example for the death rates in the states of Oklahoma and Arizona in 1960 (data are from Grove and Hetzel, 1963). The U.S. population in 1960 is used as the standard population. The crude death rate of Oklahoma (9.7 per thousand) is higher than that of Arizona (7.8 per thousand). However, the indirect standardized rates show a reverse relationship (8.6 for Oklahoma and 9.6 for Arizona). This, again, is because of the differences in age distribution. There is a higher proportion of people below the age of 25 in Arizona and a higher proportion of people above the age of 54 in Oklahoma.

Table 4.10 Standardized Death Rates by Direct Method for Sunny City and Happy City

                                   Sunny City                          Happy City
                                   Age-Specific   Age-Standardized     Age-Specific   Age-Standardized
         Standard     Proportion,  Death Rates,   Death Rates,         Death Rates,   Death Rates,
Age      Population   p_i          r_i            p_i r_i              r_i            p_i r_i
<25      420,000      0.42         1.00           0.42                 2.0            0.84
25-44    280,000      0.28         1.25           0.35                 2.5            0.70
45-64    220,000      0.22         10.00          2.20                 15.0           3.30
>=65     80,000       0.08         80.00          6.40                 162.5          13.00
Total    1,000,000                                9.37 (R_direct)                     17.84 (R_direct)

Table 4.11 Standardized Death Rates by Indirect Method for Oklahoma and Arizona, 1960

         Standard Population
         (U.S. Population, 1960)   Oklahoma                    Arizona
         Age-Specific                           Expected                    Expected
         Death Rates,              Population,  Deaths,        Population,  Deaths,
Age      s_i                       n_i          n_i s_i        n_i          n_i s_i
<1       0.0270                    49,103       1,325.78       34,599       934.17
1-4      0.0011                    193,644      213.01         132,367      145.60
5-14     0.0005                    454,972      227.49         285,830      142.92
15-24    0.0011                    329,230      362.15         186,789      205.47
25-34    0.0015                    279,327      418.99         169,873      254.81
35-44    0.0030                    287,994      863.98         173,029      519.09
45-54    0.0076                    269,147      2,045.52       136,573      1,037.95
55-64    0.0174                    216,036      3,759.03       92,871       1,615.96
65-74    0.0382                    157,385      6,012.11       63,634       2,430.82
75-84    0.0875                    74,848       6,549.20       22,499       1,968.66
85+      0.1986                    16,598       3,296.36       4,092        812.67
Total                              2,328,284    25,074         1,302,161    10,068

Crude rates (per thousand)         9.5 (U.S.)   9.7 (Oklahoma)   7.8 (Arizona)
Observed deaths                                 22,584           10,157
Expected deaths*                                25,074           10,068
Standardized rate (per thousand)                (22,584/25,074) x 9.5 = 8.6    (10,157/10,068) x 9.5 = 9.6

Source: Data from Grove and Hetzel (1963).
* $\sum n_i s_i$.

Results for the adjusted rates depend on the standard population selected. Hence, this selection should be done carefully. When discussing death rate by age, Shryock et al. (1971) suggest that a population with an age distribution similar to those of the various populations under study be selected as a standard. If the death rates of two populations are being compared, it is best to use the average of the two distributions as a standard.

It should be remembered that specific rates are still the most accurate and essential indicators of the variations among populations. No matter which method is used, standardized rates are meaningful only when compared with similarly computed rates. Kitagawa (1964) also criticizes the standardized rate because if the specific rates vary in different ways between the two populations being compared, standardization will not indicate the differences and sometimes will even mask the differences.
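As a numerical check on the direct and indirect standardization formulas discussed in this section, the Python sketch below recomputes the standardized rates from the figures in Tables 4.10 and 4.11. The function and variable names are chosen here for illustration; only the input numbers come from the tables.

```python
def direct_standardized_rate(specific_rates, standard_proportions):
    """Direct method: sum of r_i * p_i over the groups."""
    return sum(r * p for r, p in zip(specific_rates, standard_proportions))


def indirect_standardized_rate(observed, group_sizes, standard_rates,
                               standard_crude_rate):
    """Indirect method: (D / sum of n_i * s_i) * r.

    Returns the standardized rate and the expected number of events.
    """
    expected = sum(n * s for n, s in zip(group_sizes, standard_rates))
    return observed / expected * standard_crude_rate, expected


# Direct method, Table 4.10: standard proportions and age-specific rates per 1000.
p = [0.42, 0.28, 0.22, 0.08]
sunny_direct = direct_standardized_rate([1.00, 1.25, 10.00, 80.00], p)
happy_direct = direct_standardized_rate([2.0, 2.5, 15.0, 162.5], p)

# Indirect method, Table 4.11: U.S. 1960 age-specific death rates (per person)
# and the state populations by age group.
us_rates = [0.0270, 0.0011, 0.0005, 0.0011, 0.0015, 0.0030,
            0.0076, 0.0174, 0.0382, 0.0875, 0.1986]
oklahoma_n = [49_103, 193_644, 454_972, 329_230, 279_327, 287_994,
              269_147, 216_036, 157_385, 74_848, 16_598]
arizona_n = [34_599, 132_367, 285_830, 186_789, 169_873, 173_029,
             136_573, 92_871, 63_634, 22_499, 4_092]
us_crude = 9.5  # U.S. crude death rate per thousand

ok_rate, ok_expected = indirect_standardized_rate(22_584, oklahoma_n, us_rates, us_crude)
az_rate, az_expected = indirect_standardized_rate(10_157, arizona_n, us_rates, us_crude)

print(round(sunny_direct, 2), round(happy_direct, 2))
print(round(ok_expected), round(ok_rate, 1))
print(round(az_expected), round(az_rate, 1))
```

The direct rates come out at 9.37 and 17.84 per 1000 for Sunny City and Happy City, and the indirect rates at 8.6 and 9.6 per thousand for Oklahoma and Arizona, matching the tables. Returning the expected count alongside the rate makes it easy to confirm the intermediate totals of 25,074 and 10,068 expected deaths.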
Nevertheless, if the specific rates are not available, if a single rate for a population is desired, or if the demographic compositions of the populations being compared are different, the standardized rate is useful.

Bibliographical Remarks

Kaplan and Meier's (1958) PL method is the most commonly used technique for estimating the survivorship function for samples of small and moderate size. However, with the aid of a computer, it is not difficult to use the method for large sample sizes. Berkson (1942), Berkson and Gage (1950), Cutler and Ederer (1958), and Gehan (1969) have written classic reports on life-table analysis. Peto et al. (1976) published an excellent review of some statistical methods related to clinical trials. The term life-table analysis that they use includes the PL method. Other references on life tables are, for example, Armitage (1971), Shryock et al. (1971), Kuzma (1967), Chiang (1968), Gross and Clark (1975), and Elandt-Johnson and Johnson (1980).

Relative survival rates and corrected survival rates have been used by Cutler and co-workers in a series of survival studies on cancer patients in Connecticut in the 1950s and 1960s (Cutler et al., 1957, 1959, 1960a, b, 1967; Ederer et al., 1961).

Discussions of SMR, standardized rates, and related topics can be found in many standard epidemiology textbooks: for example, Mausner and Kramer (1985), Kahn (1983), Kelsey et al. (1986), Shryock et al. (1971), Chiang (1961), and Mantel and Stark (1968).

EXERCISES

4.1 Consider the survival times of the 30 melanoma patients in Table 3.1.
(a) Compute and plot the PL estimates of the survivorship functions S(t) of the two treatment groups and check your results with Table 3.2 and Figure 3.1.
(b) Compute the variance of S(t) for every uncensored observation.
(c) Estimate the median survival times of the two groups.

4.2 Do the same as in Exercise 4.1 for the remission durations of the two treatment groups in Table 3.1.

4.3 Compute and plot the PL estimates of the tumor-free time distributions for the saturated fat and unsaturated fat diet groups in Table 3.4. Compare your results with Figure 3.4.

4.4 Consider the remission data of 42 patients with acute leukemia in Example 3.3.
(a) Compute and plot the PL estimates of S(t) at every time to relapse for the 6-MP and placebo groups.
(b) Compute the variances of S(10) in the 6-MP group and of S(3) in the placebo group.
(c) Estimate the median remission times of the two treatment groups.

4.5 (a) Compute the survival time for each patient in Exercise Table 3.1.
(b) Estimate and plot the overall survivorship function using the PL method. What is the median survival time?
(c) Divide the patients into two groups by gender. Compute and plot the PL estimates of the survivorship functions for each group. What is the median survival time for each?

4.6 Consider the skin test results in Exercise Table 3.1. For each of the five skin tests:
(a) Divide patients into two groups according to whether they had a positive reaction. Measurements less than 10 x 10 (5 x 5 for mumps) are considered negative.
(b) Estimate and plot the survivorship functions of the two groups.
(c) Can you tell from the plots if any skin tests might predict survival time?

4.7 Consider the data of patients with cancer of the ovary diagnosed in Connecticut from 1935 to 1944 (Cutler et al., 1960b).

Exercise Table 4.1

Time from    Number Lost     Number Withdrawn   Number    Number
Diagnosis    to Follow-up,   Alive,             Dying,    Entering,
(yr)         l_i             w_i                d_i       n'_i
0-5          18              0                  731       949
5-10         16              0                  52        200
10-15        8               67                 14        132
15-20        0               33                 10        43
Exercise Table 4.1 reproduces the data in life-table format. Provide a life table like Table 4.5. What do you find out?

4.8 Do a complete life-table analysis for the two sets of data given in Table 3.5. Plot the three survival functions.

4.9 Do a complete life-table analysis of the data given in Exercise Table 4.2. Plot the three survival functions.

4.10 Consider the survival times of the melanoma patients in Exercise Table 3.4. Do a complete life-table analysis of the survival time. Plot the three survival functions.

4.11 Consider the data given in Exercise Table 4.3. Compute the direct standardized death rate for the states of Oklahoma and Montana using the U.S. population of 1960 as the standard.

4.12 Given the population of Japan and Chile (Exercise Table 4.4), compute the indirect standardized death rate for the two countries using the U.S. death rate of 1960 in Table 4.11 as the standard.

Exercise Table 4.2 Survival Data of Female Patients with Angina Pectoris

Year After   Number Entering   Number Lost     Number
Diagnosis    Interval          to Follow-up    Dying
0-1          555               0               82
1-2          473               8               30
2-3          435               8               27
3-4          400               7               22
4-5          371               7               26
5-6          338               28              25
6-7          285               31              20
7-8          234               32              11
8-9          191               24              14
9-10         153               27              13
10-11        113               22              5
11-12        86                23              5
12-13        58                18              5
13-14        35                9               2
14-15        24                7               3
15+          14                11              3

Source: R. L. Parker et al., JAMA, 131(2), 95-100 (1946). Copyright 1946, American Medical Association.
Exercise Table 4.3

         U.S. Population,                 Oklahoma Average   Montana Average
         1960             Proportion,     Death Rate         Death Rate
Age      (thousands)      p_i             (per 1000), r_i    (per 1000), r_i
<1       4,112            0.023           25.5               25.8
1-4      16,209           0.091           1.2                1.2
5-14     35,465           0.198           0.5                0.5
15-24    24,020           0.134           1.2                1.6
25-34    22,818           0.127           1.6                1.8
35-44    24,081           0.134           2.9                3.1
45-54    20,486           0.114           6.9                7.5
55-64    15,572           0.087           14.8               16.3
65-74    10,997           0.061           32.4               37.3
75-84    4,634            0.026           79.0               87.3
85+      929              0.005           190.4              202.8
Total    179,323          1.000

Source: Grove and Hetzel (1963).

Exercise Table 4.4

                   Population (thousands)
Age                Japan        Chile
<1                 1,577        228
1-4                6,268        876
5-14               20,223       1,817
15-24              17,627       1,323
25-34              15,727       1,034
35-44              11,057       779
45-54              9,018        603
55-64              6,573        395
65-74              3,724        212
75-84              1,438        83
>=85               188          22
Total              93,419       7,374
Observed deaths    706,599      95,486

Source: Shryock et...
