Play-the-Winner rule and Markov chain adaptive designs

propose the following design, which is well known as the play-the-winner (PW) rule: A success on a particular treatment generates a future trial on the same treatment with a new patient. A failure on a treatment generates a future trial on the alternate treatment. Let $p_i = P\{\text{success} \mid \text{treatment } i\}$ be the success probability of a patient on treatment $i$, and let $q_i = 1 - p_i$, $i = 1, 2$. Then

$$\frac{N_{n1}}{n} \to \frac{q_2}{q_1 + q_2} \quad \text{a.s.}$$

and

$$n^{1/2}\left(\frac{N_{n1}}{n} - \frac{q_2}{q_1 + q_2}\right) \xrightarrow{D} N(0, \sigma_{PW}^2),$$

where

$$\sigma_{PW}^2 = q_1 q_2 (p_1 + p_2)/(q_1 + q_2)^3. \tag{1}$$

This rule was first discussed in Zelen [34].
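The limit and variance in (1) can be checked numerically. The following is a minimal simulation sketch, not part of the original text; the function names are hypothetical, and the fair-coin choice of the first treatment is an assumption, since the rule itself does not specify how the first patient is assigned.

```python
import random


def pw_limit(p1, p2):
    """Limiting proportion of patients on treatment 1: q2 / (q1 + q2)."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return q2 / (q1 + q2)


def pw_variance(p1, p2):
    """Asymptotic variance sigma_PW^2 from equation (1)."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return q1 * q2 * (p1 + p2) / (q1 + q2) ** 3


def simulate_pw(p1, p2, n, seed=0):
    """Run the PW rule for n patients and return N_{n1} / n."""
    rng = random.Random(seed)
    p = (p1, p2)
    arm = rng.randrange(2)  # assumption: first patient assigned by a fair coin
    n1 = 0
    for _ in range(n):
        if arm == 0:
            n1 += 1
        success = rng.random() < p[arm]
        if not success:
            arm = 1 - arm  # a failure switches to the alternate treatment
    return n1 / n


if __name__ == "__main__":
    print("simulated:", simulate_pw(0.7, 0.5, 200_000))
    print("limit:    ", pw_limit(0.7, 0.5))
    print("variance: ", pw_variance(0.7, 0.5))
```

For large $n$ the simulated proportion should lie within a few multiples of $\sigma_{PW}/\sqrt{n}$ of the limit.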

MC adaptive design. An extension of the PW rule is the multi-treatment Markov chain adaptive design. Consider a $K$-treatment clinical trial. Suppose that at stage $m$, the $m$-th patient is assigned to treatment $i$. Then the $(m+1)$-th patient will be assigned to treatment $j$ according to certain probabilities, which depend on the response of the $m$-th patient. Denote this probability by $h_{ij}(m)$. Write $H_n = (h_{ij}(n))_{i,j=1}^K$. Obviously, $H_n 1' = 1'$, where $1 = (1, \ldots, 1)$. Let $e_i$ be the vector whose $i$-th component is 1 and whose other components are 0, $i = 1, \ldots, K$. Then $P(X_{m+1} = e_j \mid X_m = e_i) = h_{ij}(m)$.

It follows that

$$E[X_{m+1} \mid \mathcal{F}_m] = X_m H_m,$$

where $\mathcal{F}_m = \sigma(X_1, \ldots, X_m)$. Obviously, $\{X_n; n \ge 1\}$ is a Markov chain with transition probability matrices $H_n$. If $H_n = H$ for all $n$, then $\{X_n; n \ge 1\}$ is called homogeneous. Otherwise, it is called non-homogeneous, and it is usually assumed that $H_n \to H$ a.s. for some non-random matrix $H$. It is obvious that $\lambda_1 = 1$ is an eigenvalue of $H$. Let $\lambda_2, \ldots, \lambda_K$ be the other $K - 1$ eigenvalues of $H$, and let $\lambda = \max\{\mathrm{Re}(\lambda_2), \ldots, \mathrm{Re}(\lambda_K)\}$. We assume that $\lambda < 1$. Notice that $|\lambda_i| \le 1$, $i = 1, \ldots, K$. The condition $\lambda < 1$ is not restrictive. For example, if $H$ is a regular transition probability matrix of a Markov chain, then $|\lambda_i| < 1$, $i = 2, \ldots, K$, and so $\lambda < 1$.

We have the following asymptotic properties (cf. Lin and Zhang [24] and Zhang [35]).

Theorem 2.1. Possibly in a richer underlying probability space in which there exists a $K$-dimensional standard Brownian motion $\{B_t\}$, we can redefine the sequence $\{X_n\}$, without changing its distribution, such that

$$N_n - nv - B_n \Sigma^{1/2}(I - \tilde{H})^{-1} = o(n^{1/2-\kappa}) + O\Big(\sum_{k=1}^{n} \|H_k - H\|\Big) \quad \text{a.s.,} \tag{2}$$

for some $\kappa > 0$, where $v = (v_1, \ldots, v_K)$ is the left eigenvector corresponding to the largest eigenvalue $\lambda_1 = 1$ of $H$ with $v_1 + \cdots + v_K = 1$, $\tilde{H} = H - 1'v$ and

$$\Sigma = \mathrm{diag}(v) - H'\mathrm{diag}(v)H.$$

In particular, if

$$\sum_{k=1}^{n} \|H_k - H\| = o(n^{1/2}) \quad \text{a.s.,} \tag{3}$$

then we have the asymptotic normality:

$$n^{1/2}(N_n/n - v) \xrightarrow{D} N\big(0, (I - \tilde{H}')^{-1}\Sigma(I - \tilde{H})^{-1}\big). \tag{4}$$

Remark 2.1. In the case $K = 2$, write $h_{11} = \alpha$ and $h_{22} = \beta$. Then it can be checked that

$$H = \begin{pmatrix} \alpha & 1-\alpha \\ 1-\beta & \beta \end{pmatrix}, \qquad v = \left(\frac{1-\beta}{2-\alpha-\beta}, \frac{1-\alpha}{2-\alpha-\beta}\right)$$

and

$$(I - \tilde{H}')^{-1}\Sigma(I - \tilde{H})^{-1} = \frac{(1-\alpha)(1-\beta)(\alpha+\beta)}{(2-\alpha-\beta)^3}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}.$$

For the PW rule, $\alpha = p_1$ and $\beta = p_2$. And so,

$$N_{n1} - nq_2/(q_1 + q_2) = -B_n\sigma_{PW} + o(n^{1/2-\kappa}) \quad \text{a.s.,}$$

where $B_t$ is a one-dimensional standard Brownian motion.
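The closed form in Remark 2.1 can be verified by evaluating the general covariance of (4) for a two-treatment chain. The sketch below is illustrative only; the helper names are hypothetical, and it uses plain nested lists rather than a linear-algebra library.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def asymptotic_cov_2arm(alpha, beta):
    """(I - H~')^{-1} Sigma (I - H~)^{-1} for K = 2, as in (4)."""
    H = [[alpha, 1 - alpha], [1 - beta, beta]]
    s = 2 - alpha - beta
    v = [(1 - beta) / s, (1 - alpha) / s]
    D = [[v[0], 0.0], [0.0, v[1]]]
    HtDH = matmul(matmul(transpose(H), D), H)
    sigma = [[D[i][j] - HtDH[i][j] for j in range(2)] for i in range(2)]
    # H~ = H - 1'v subtracts v_j from every entry of column j.
    i_minus_ht = [[(1.0 if i == j else 0.0) - (H[i][j] - v[j])
                   for j in range(2)] for i in range(2)]
    A = inv2(i_minus_ht)
    return matmul(matmul(transpose(A), sigma), A)


def closed_form_cov(alpha, beta):
    """Closed form stated in Remark 2.1."""
    c = (1 - alpha) * (1 - beta) * (alpha + beta) / (2 - alpha - beta) ** 3
    return [[c, -c], [-c, c]]


if __name__ == "__main__":
    print(asymptotic_cov_2arm(0.7, 0.5))
    print(closed_form_cov(0.7, 0.5))
```

With $\alpha = p_1$ and $\beta = p_2$ the $(1,1)$ entry is $\sigma_{PW}^2$ from (1).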

The following examples are for the multi-treatment case with dichotomous response (success and failure).

Example 2.1. (Up-and-down design) Durham and Flournoy [8] present a response-driven up-and-down design for dose-response studies using a random walk model. Given an increasing dose-response function $p_j = F(x_j) = P\{\text{``success''} \mid X_m = x_j\}$, $m = 1, \ldots, n$, $j = 1, \ldots, K$, and $X_i$ taking values in the set $\{x_k = x_1 + (k-1)\Delta, k = 1, \ldots, K\}$, Durham and Flournoy [8] fix the bias of a coin $b = \Gamma/(1-\Gamma)$, $0 < \Gamma < 1/2$. Given $X_m = x_j$, $X_{m+1}$ is determined using the following allocation scheme. If the response is a failure and the coin lands heads, assign $X_{m+1} = x_{j+1}$; if the response is a failure and the coin lands tails, assign $X_{m+1} = x_j$; and if the response is a success, assign $X_{m+1} = x_{j-1}$ (appropriate modifications are made at the boundaries $x_1$ and $x_K$). Then they show that the distribution of administered doses centers around the unknown quantile $\mu$, where $\Gamma = F(\mu)$. For this design,

$$H = \begin{pmatrix} (1-b)q_1 & bq_1 & 0 & \cdots & 0 & p_1 \\ p_2 & (1-b)q_2 & bq_2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ bq_K & 0 & 0 & \cdots & p_K & (1-b)q_K \end{pmatrix}.$$

So,

$$N_n - nv = B_n\Sigma^{1/2}(I - \tilde{H})^{-1} + o(n^{1/2-\kappa}) \quad \text{a.s.}$$

with $v = \pi$, where $\pi = (\pi_1, \ldots, \pi_K)$ is defined in Durham and Flournoy [8].
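As an illustration of Example 2.1, the transition matrix can be built and its stationary vector approximated by repeated left multiplication. This is a sketch under stated assumptions: the function names are hypothetical, the wrap-around boundary entries follow the matrix as displayed in the example, and power iteration is a generic numerical method, not the derivation used by Durham and Flournoy [8].

```python
def up_and_down_matrix(p, gamma):
    """Transition matrix of Example 2.1 for success probabilities p[0..K-1]
    and coin bias b = gamma / (1 - gamma)."""
    K = len(p)
    b = gamma / (1.0 - gamma)
    H = [[0.0] * K for _ in range(K)]
    for j in range(K):
        q = 1.0 - p[j]
        H[j][j] += (1.0 - b) * q        # failure, tails: stay
        H[j][(j + 1) % K] += b * q      # failure, heads: move up
        H[j][(j - 1) % K] += p[j]       # success: move down
    return H


def stationary(H, iters=10_000):
    """Approximate the left eigenvector v with vH = v by power iteration."""
    K = len(H)
    v = [1.0 / K] * K
    for _ in range(iters):
        v = [sum(v[i] * H[i][j] for i in range(K)) for j in range(K)]
    total = sum(v)
    return [x / total for x in v]


if __name__ == "__main__":
    H = up_and_down_matrix([0.1, 0.2, 0.3, 0.5, 0.7], gamma=0.3)
    print(stationary(H))
```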

When the responses $\{\xi_n\}$ are not identically distributed, we put $p_i(n) = P(\text{``success''} \mid X_{ni} = 1)$ and $q_i(n) = 1 - p_i(n)$, $i = 1, 2, \ldots, K$. Assume that $p_i(n) \to p_i$, $i = 1, 2, \ldots, K$. Denote $q_i = 1 - p_i$.

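Example 2.2. (PWC rule) Suppose that the $m$-th patient is assigned to the $i$-th treatment. If the response of the $m$-th patient is a success, then the $(m+1)$-th patient is assigned to the same treatment $i$. If the response of the $m$-th patient is a failure, then the $(m+1)$-th patient is assigned to treatment $i+1$, where treatment $K+1$ means treatment 1. This assignment scheme is called the cyclic play-the-winner (PWC) rule (Hoel and Sobel [14]). It is easily seen that

$$H_n = \begin{pmatrix} p_1(n) & q_1(n) & 0 & \cdots & 0 \\ 0 & p_2(n) & q_2(n) & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ q_K(n) & 0 & 0 & \cdots & p_K(n) \end{pmatrix} \to H = \begin{pmatrix} p_1 & q_1 & 0 & \cdots & 0 \\ 0 & p_2 & q_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ q_K & 0 & 0 & \cdots & p_K \end{pmatrix},$$

$$v = \left(\frac{1/q_1}{\sum_{j=1}^{K} 1/q_j}, \frac{1/q_2}{\sum_{j=1}^{K} 1/q_j}, \ldots, \frac{1/q_K}{\sum_{j=1}^{K} 1/q_j}\right) \tag{5}$$

and $\|H_n - H\| \le C\sum_{i=1}^{K} |p_i(n) - p_i|$. So, by Theorem 2.1, if $0 < p_i < 1$, $i = 1, \ldots, K$, then (2) is just

$$N_n - nv - B_n\Sigma^{1/2}(I - \tilde{H})^{-1} = o(n^{1/2-\kappa}) + O\Big(\sum_{k=1}^{n}\sum_{i=1}^{K} |p_i(k) - p_i|\Big) \quad \text{a.s.} \tag{6}$$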
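The allocation proportions under the PWC rule can be compared with the limit $v_i \propto 1/q_i$ by simulation. The sketch below is illustrative; the function names are hypothetical, responses are taken to be i.i.d. (so $p_i(n) = p_i$), and starting on treatment 1 is an assumption.

```python
import random


def pwc_limit(p):
    """Limiting proportions v_i = (1/q_i) / sum_j (1/q_j)."""
    w = [1.0 / (1.0 - pi) for pi in p]
    s = sum(w)
    return [x / s for x in w]


def simulate_pwc(p, n, seed=0):
    """Run the cyclic play-the-winner rule for n patients; return N_n / n."""
    rng = random.Random(seed)
    K = len(p)
    counts = [0] * K
    arm = 0  # assumption: the first patient receives treatment 1
    for _ in range(n):
        counts[arm] += 1
        if rng.random() >= p[arm]:   # failure: move to the next treatment
            arm = (arm + 1) % K      # treatment K + 1 means treatment 1
    return [c / n for c in counts]


if __name__ == "__main__":
    p = [0.8, 0.6, 0.4]
    print("simulated:", simulate_pwc(p, 200_000))
    print("limit:    ", pwc_limit(p))
```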

Example 2.3. Suppose that the $m$-th patient is assigned to treatment $i$. If the response of the $m$-th patient is a success, then the $(m+1)$-th patient is assigned to the same treatment $i$. If the response of the $m$-th patient is a failure, then the $(m+1)$-th patient is assigned to each of the other $K - 1$ treatments with probability $\frac{1}{K-1}$. It is easily seen that

$$H_n = \begin{pmatrix} p_1(n) & q_1(n)/(K-1) & \cdots & q_1(n)/(K-1) \\ q_2(n)/(K-1) & p_2(n) & \cdots & q_2(n)/(K-1) \\ \vdots & \vdots & \ddots & \vdots \\ q_K(n)/(K-1) & q_K(n)/(K-1) & \cdots & p_K(n) \end{pmatrix} \to H = \begin{pmatrix} p_1 & q_1/(K-1) & \cdots & q_1/(K-1) \\ q_2/(K-1) & p_2 & \cdots & q_2/(K-1) \\ \vdots & \vdots & \ddots & \vdots \\ q_K/(K-1) & q_K/(K-1) & \cdots & p_K \end{pmatrix}, \tag{7}$$

$v$ is the same as in (5) and $\|H_n - H\| \le C\sum_{i=1}^{K} |p_i(n) - p_i|$. So, (6) holds whenever $0 < p_i < 1$, $i = 1, \ldots, K$.
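That the vector $v$ of (5) is also stationary for the matrix $H$ in (7) follows because $v_i q_i$ is the same constant for every $i$. A short numerical check, with hypothetical function names and arbitrarily chosen success probabilities, might look like this:

```python
def equal_split_matrix(p):
    """Limit matrix H of (7): stay on success, otherwise move to one of the
    other K - 1 treatments with equal probability."""
    K = len(p)
    return [[p[i] if i == j else (1.0 - p[i]) / (K - 1) for j in range(K)]
            for i in range(K)]


def v_from_5(p):
    """Vector v of (5): proportional to 1 / q_i."""
    w = [1.0 / (1.0 - pi) for pi in p]
    s = sum(w)
    return [x / s for x in w]


def left_multiply(v, H):
    """Row vector times matrix."""
    K = len(v)
    return [sum(v[i] * H[i][j] for i in range(K)) for j in range(K)]


if __name__ == "__main__":
    p = [0.8, 0.6, 0.4, 0.3]
    v = v_from_5(p)
    vH = left_multiply(v, equal_split_matrix(p))
    print(max(abs(a - b) for a, b in zip(v, vH)))  # close to zero
```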

Remark 2.2. When K = 2, the designs in Examples 2.1-2.3 are all the PW rule.

MC adaptive design with random assignment probabilities. The PWC rule and the design in Example 2.3 tend to assign more patients to better treatments. However, when there is a failure on treatment $k$, it would be more reasonable to assign the next patient with higher probability to a better treatment among the other $K - 1$ treatments. Thus, the following adaptive design is considered.

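Example 2.4. Suppose the response sequence $\{\xi_n\}$ is an i.i.d. sequence, and so $p_k(n) = p_k$ for all $n$ and $k = 1, \ldots, K$. If the response of the $m$-th patient on treatment $k$ is a success, we assign the coming patient to the same treatment. If the response is a failure, we assign the $(m+1)$-th patient to treatment $j$ with probability

$$\frac{\hat{p}_{m-1,j}^{\alpha}}{M_{m-1,\alpha} - \hat{p}_{m-1,k}^{\alpha}}$$

for all $j \ne k$, where $0 \le \alpha < \infty$, $M_{m-1,\alpha} = \sum_{j=1}^{K} \hat{p}_{m-1,j}^{\alpha}$, $\hat{p}_{m-1,j} = \frac{S_{m-1,j} + 1}{N_{m-1,j} + 1}$, and $S_{m-1,j}$ denotes the number of successes of the $j$-th treatment in all the $N_{m-1,j}$ trials of the previous $m - 1$ stages, $j = 1, \ldots, K$. Write $\hat{p}_{m-1} = (\hat{p}_{m-1,1}, \ldots, \hat{p}_{m-1,K})$. In this case, $H_m = H(\hat{p}_{m-1})$, where $H(x) = (h_{k,j}(x), k, j = 1, \ldots, K)$ with $h_{k,k}(x) = p_k$ and $h_{k,j}(x) = q_k x_j^{\alpha} / \big(\sum_{l=1}^{K} x_l^{\alpha} - x_k^{\alpha}\big)$ for $k \ne j$. Also,

$$H_m \to H = H^{(\alpha)} := \begin{pmatrix} p_1 & \frac{p_2^{\alpha} q_1}{M_{\alpha} - p_1^{\alpha}} & \cdots & \frac{p_K^{\alpha} q_1}{M_{\alpha} - p_1^{\alpha}} \\ \frac{p_1^{\alpha} q_2}{M_{\alpha} - p_2^{\alpha}} & p_2 & \cdots & \frac{p_K^{\alpha} q_2}{M_{\alpha} - p_2^{\alpha}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{p_1^{\alpha} q_K}{M_{\alpha} - p_K^{\alpha}} & \frac{p_2^{\alpha} q_K}{M_{\alpha} - p_K^{\alpha}} & \cdots & p_K \end{pmatrix}, \tag{8}$$

where $M_{\alpha} = \sum_{j=1}^{K} p_j^{\alpha}$, and

$$v_i = v_i^{(\alpha)} = \frac{p_i^{\alpha}(M_{\alpha} - p_i^{\alpha})/q_i}{\sum_{j=1}^{K} p_j^{\alpha}(M_{\alpha} - p_j^{\alpha})/q_j}, \quad i = 1, \ldots, K. \tag{9}$$

If $\alpha = 0$, then $H^{(\alpha)}$ is the same as the one in (7) and $v^{(\alpha)}$ is the same as the one in (5). By comparing the values of the $v_i^{(\alpha)}$, one can find that the larger $\alpha$ is, the more patients will be assigned to a better treatment. So, the design in Example 2.4 assigns more patients to better treatments than the PWC rule does. Also, when there is a failure, the assignment is random in the design defined in this example. However, for this example, $\sum_{m=1}^{n} \|H_m - H\| = o(n^{1/2})$ does not hold. So, the asymptotic normality is not implied by Theorem 2.1. This leads us to consider the following design (cf. Zhang [36]).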
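The effect of $\alpha$ on the limiting allocation can be illustrated numerically. The sketch below builds the limit matrix of (8) and the vector of (9), checks that the vector is stationary, and prints the share of the best treatment for several values of $\alpha$. The function names and the example success probabilities are assumptions made for illustration.

```python
def h_alpha(p, alpha):
    """Limit matrix H^(alpha) of (8)."""
    K = len(p)
    pa = [x ** alpha for x in p]
    M = sum(pa)
    return [[p[k] if j == k else (1.0 - p[k]) * pa[j] / (M - pa[k])
             for j in range(K)] for k in range(K)]


def v_alpha(p, alpha):
    """Limiting allocation proportions v^(alpha) of (9)."""
    pa = [x ** alpha for x in p]
    M = sum(pa)
    w = [pa[i] * (M - pa[i]) / (1.0 - p[i]) for i in range(len(p))]
    s = sum(w)
    return [x / s for x in w]


def stationarity_gap(p, alpha):
    """max_j |(vH)_j - v_j|; should be numerically zero."""
    H, v = h_alpha(p, alpha), v_alpha(p, alpha)
    K = len(p)
    vH = [sum(v[i] * H[i][j] for i in range(K)) for j in range(K)]
    return max(abs(vH[j] - v[j]) for j in range(K))


if __name__ == "__main__":
    p = [0.8, 0.6, 0.4]  # treatment 1 is the best
    for alpha in (0, 1, 2, 4):
        print(alpha, v_alpha(p, alpha)[0], stationarity_gap(p, alpha))
```

For these probabilities the share of treatment 1 grows with $\alpha$, in line with the comparison described above; the check covers only this example, not the general claim.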

Adaptive Design 2.1. Suppose the response sequence $\{\xi_n\}$ is an i.i.d. sequence and each response $\xi_{m,i}$ is a random variable coming from a distribution family $\{P_{\theta_i}\}$. Without loss of generality, we assume that $\theta_i = E\xi_{m,i}$, $i = 1, \ldots, K$, and write $\theta = (\theta_1, \ldots, \theta_K)$. Suppose the previous $m - 1$ patients are assigned and the responses observed. Let

$$\hat{\theta}_{m-1,k} = \frac{\sum_{j=1}^{m-1} X_{j,k}\xi_{j,k} + \theta_{0,k}}{N_{m-1,k} + 1}, \quad k = 1, \ldots, K.$$

Here, $\theta_0 = (\theta_{0,1}, \ldots, \theta_{0,K})$ is a guessed value of $\theta$, or an estimate of $\theta$ from other early trials.

Now, if the $m$-th patient is assigned to treatment $i$ and the response observed, then we assign the $(m+1)$-th patient to treatment $j$ with probability $d_{ij}(\hat{\theta}_{m-1}, \xi_{m,i})$, $j = 1, \ldots, K$. That is,

$$P(X_{m+1,j} = 1 \mid \mathcal{F}_m, X_{m,i} = 1, \xi_{m,i}) = d_{ij}(\hat{\theta}_{m-1}, \xi_{m,i}), \tag{10}$$

$i, j = 1, \ldots, K$. Here $\mathcal{F}_n = \sigma(X_1, \ldots, X_n, \xi_1, \ldots, \xi_{n-1})$. We also let $h_{ij}(\hat{\theta}_{m-1}) = E[d_{ij}(\hat{\theta}_{m-1}, \xi_{m,i}) \mid \mathcal{F}_{m-1}]$ and write $H(\hat{\theta}_{m-1}) = (h_{ij}(\hat{\theta}_{m-1}))_{i,j=1}^{K}$. To ensure that each treatment is tested by enough patients, i.e., $N_{ni} \to \infty$ a.s., $i = 1, \ldots, K$, $d_{ij}(\hat{\theta}_{m-1}, \xi_{m,i})$ shall be replaced by $(1 - 1/m)\,d_{ij}(\hat{\theta}_{m-1}, \xi_{m,i}) + 1/(mK)$ if necessary.
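The modification at the end of the design mixes each row of assignment probabilities with the uniform distribution, so every treatment keeps probability at least $1/(mK)$ at stage $m$ while the row still sums to one. A minimal sketch, with a hypothetical function name and an arbitrary example row:

```python
def modified_probabilities(d_row, m):
    """Replace d_ij by (1 - 1/m) * d_ij + 1 / (m * K) for one row i.

    d_row holds the assignment probabilities d_i1, ..., d_iK for the next
    patient; m is the current stage (m >= 1).
    """
    K = len(d_row)
    return [(1.0 - 1.0 / m) * d + 1.0 / (m * K) for d in d_row]


if __name__ == "__main__":
    # A degenerate row that would never leave treatment 1 on its own.
    row = [1.0, 0.0, 0.0]
    for m in (1, 10, 100):
        print(m, modified_probabilities(row, m))
```

The weight of the uniform component decays like $1/m$, so the modification vanishes asymptotically while still being large enough to force every treatment to be visited.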

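Theorem 2.2. For the Adaptive Design 2.1, suppose $E\|\xi_n\|^{2+\delta} < \infty$ for some $\delta > 0$, $H(x) \to H =: H(\theta)$ as $x \to \theta$, and for some $\delta > 0$,

$$H(x) - H = \sum_{k=1}^{K} \frac{\partial H(x)}{\partial x_k}\Big|_{x=\theta}(x_k - \theta_k) + O(\|x - \theta\|^{1+\delta}) \quad \text{as } x \to \theta.$$

Let $\lambda_1 = 1, \lambda_2, \ldots, \lambda_K$ be the eigenvalues of $H$ and $v$ be the left eigenvector corresponding to $\lambda_1 = 1$ with $v1' = 1$. Write $\sigma_k^2 = \mathrm{Var}(\xi_{n,k})$, $k = 1, \ldots, K$. Define $\tilde{H} = H - 1'v$ and

$$\Sigma = \mathrm{diag}(v) - H'\mathrm{diag}(v)H, \qquad F_k = v\,\frac{\partial H(x)}{\partial x_k}\Big|_{x=\theta}, \qquad F = (F_1', \ldots, F_K')'. \tag{11}$$

If $|\lambda_i| < 1$, $i = 2, \ldots, K$, then possibly in a richer underlying probability space in which there exist two independent $K$-dimensional standard Brownian motions $\{B_t\}$ and $\{W_t\}$, we can redefine the sequence $\{X_n, \xi_n\}$, without changing its distribution, such that

$$\hat{\theta}_n - \theta = \frac{1}{n} W_n\,\mathrm{diag}(\sigma_1/\sqrt{v_1}, \ldots, \sigma_K/\sqrt{v_K}) + o(n^{-1/2-\kappa}) \quad \text{a.s.,} \tag{12}$$

$$N_n - nv = \Big[B_n\Sigma^{1/2} + \int_0^n \frac{W_x}{x}\,dx\;\mathrm{diag}(\sigma_1/\sqrt{v_1}, \ldots, \sigma_K/\sqrt{v_K})\,F\Big](I - \tilde{H})^{-1} + o(n^{1/2-\kappa}) \quad \text{a.s.,}$$

for some $\kappa > 0$. In particular,

$$n^{1/2}(N_n/n - v) \xrightarrow{D} N(0, \Lambda),$$

where

$$\Lambda = (I - \tilde{H}')^{-1}(\Sigma + 2F'\Sigma_{\xi}F)(I - \tilde{H})^{-1}, \qquad \Sigma_{\xi} = \mathrm{diag}(\sigma_1^2/v_1, \ldots, \sigma_K^2/v_K).$$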
