8 Parametric models for postnatal growth

Roland C. Hauspie
Free University of Brussels
and
Luciano Molinari
Kinderspital, Zürich

Methods in Human Growth Research, eds R. C. Hauspie, N. Cameron and L. Molinari. © Cambridge University Press 2004. Published by Cambridge University Press.

Why model growth data?

Growth can be considered as the process that makes children change in size and shape over time. The dynamics of growth are best understood from the analysis of longitudinal data, i.e. from serial measurements taken at regular intervals on the same subject. Table 8.1 gives an example of longitudinal growth data for height of a boy measured at birth and at each birthday thereafter up to the age of 18 years. Such data usually form the basis for estimating the underlying process of growth, which is assumed to be continuous. Recent analysis of frequent measurements of size (at daily or weekly intervals) with high-precision techniques (such as knemometry, where measurement error is about 0.1 mm) has shown that the growth process is, at microlevel, not as smooth as we usually assume (Hermanussen, 1998; Lampl, 1999). However, we may readily assume that the growth process is continuous when we are dealing with measurements taken at yearly intervals, or even 3- to 6-monthly intervals, using classical anthropometric techniques. Various mathematical models have been proposed to estimate such a smooth growth curve on the basis of a set of discrete measurements of growth of the same subject over time (Marubini and Milani, 1986; Hauspie, 1989, 1998; Simondon et al., 1992; Bogin, 1999).

Table 8.1 Attained height (in cm) and yearly increments in height (in cm/year) of a boy measured at birth and at each subsequent birthday up to the age of 18 years; these data are used in the examples of Figures 8.1 to 8.7

Attained height              Yearly increments
Age (years)  Height (cm)     Age (years)  Increment (cm/year)
 0            49.1            0.5         26.9
 1            76.0            1.5         13.4
 2            89.4            2.5          7.8
 3            97.2            3.5          6.3
 4           103.5            4.5          5.9
 5           109.4            5.5          7.0
 6           116.4            6.5          6.3
 7           122.7            7.5          5.3
 8           128.0            8.5          5.5
 9           133.5            9.5          4.7
10           138.2           10.5          5.6
11           143.8           11.5          4.9
12           148.7           12.5          6.3
13           155.0           13.5          9.5
14           164.5           14.5          6.2
15           170.7           15.5          3.3
16           174.0           16.5          1.0
17           175.0           17.5          0.0
18           175.0

The main goals of mathematical modelling of longitudinal growth data are:

• To estimate the continuous growth process from a set of discontinuous measures of growth in order to obtain a smooth graphical representation of the growth curve
• To estimate growth between measurement occasions (in that sense, curve fitting is an interpolation technique)
• To summarize the growth data by a limited number of constants or function parameters (therefore curve fitting is also a data reduction technique)
• To estimate particular milestones of the growth process (the so-called biological parameters) such as final size or age, size and velocity at take-off and at peak velocity, which characterize the shape of the growth curve and usually form the basis for further analysis
• To estimate a smooth velocity curve representing instantaneous velocity (i.e. by taking the mathematical first derivative of the fitted curve)
• To estimate the 'typical average' curve in the population, such as the mean-constant curve in the case of structural growth models

Fitting a growth model consists of finding the set of function parameters that yields the best-fitting curve. The best-fitting curve is usually estimated by the least-squares method, i.e. the method that yields the curve with the smallest value for the sum of squared residuals (deviations of the observations from the fitted curve). Other parameter estimation methods may be envisaged. A thorough discussion of this topic can be found in Chapter […].

Non-structural versus structural models

Broadly speaking, we can subdivide growth models into structural (or parametric) and non-structural models (Bock and Thissen, 1980). Non-structural models do not postulate a particular form of the growth curve. They provide smoothing
techniques suppressing measurement error and short-term variation. Typical examples are polynomials and cubic splines (Largo et al., 1978).

Non-structural models:

• Do not postulate a particular form of the growth curve
• Usually have a large number of parameters with no biological interpretation
• Do not tend to an asymptotic value
• Are usually unstable in the extremities of the data range
• Are easy to fit

Figure 8.1 shows the example of the fit of a 4th-degree polynomial (Figure 8.1a) and a 9th-degree polynomial (Figure 8.1b) to the yearly increments in height of the boy shown in Table 8.1. The dashed line shows the subject's yearly increments in height (i.e. the differences between height measurements one year apart) compared to the polynomial fits. It is obvious that the 4th-degree polynomial (with five parameters) does not adequately describe the yearly increments in height. The 9th-degree polynomial (with ten parameters) performs much better, but still cannot correctly describe the maximum increment in height. Polynomials can be considered inadequate to fit growth data over wide age ranges, but can be used to fit growth over reasonably short intervals (a few years).

Figure 8.1 Yearly increments in height of the boy whose growth data are shown in Table 8.1. (a) Fit of a 4th-degree polynomial, $y = a_0 + a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4$; (b) fit of a 9th-degree polynomial, $y = a_0 + a_1 t + a_2 t^2 + \dots + a_9 t^9$. y = increment, t = age in years, $a_0$ to $a_k$ the function parameters. Axes: velocity (cm/year) against age (years). (After Hauspie and Chrzastek-Spruch, 1999.)

The flexibility of polynomials can be much improved by approaches like smoothing splines, which consist of a series of lower-order polynomials (3rd-degree or cubic, for example) that are fitted over only a small range of ages, and which are connected by constraints of continuity, i.e. equality of the first and second derivative at the points of transition or 'knots'. These models give considerably better fits than higher-order polynomials and are better able to model local variations in the growth pattern such as the mid-growth spurt or the decrease in velocity prior to the adolescent growth spurt (Goldstein, 1984). The subjective element in this approach is the determination of the number and position of the knots in the fitting procedure. Other very useful developments of non-structural approaches in modelling growth are discussed in Chapter […].

Structural or parametric models:

• Imply a basic functional form of the growth model
• Usually have fewer parameters that allow some functional/biological interpretation
• Usually tend to an upper asymptote (final size)

Structural or parametric models sometimes impose too rigid a shape on the growth data, which may result in slight, but systematic, bias. They also require more sophisticated curve-fitting techniques, but most statistical and several graphical software packages nowadays offer the possibility of non-linear regression analysis of user-defined functions. The estimation algorithms may vary from one type of software to another, but they are all based on iterative numeric minimization techniques and require more or less rough guesses of the values of the function parameters to be estimated, the so-called starting values. Good starting values will often allow an iterative technique to converge to a solution more quickly. Reaching convergence means that the numeric minimization procedure has found a minimum on the multidimensional surface of the sum of squared deviations. However, it may occur that the process has encountered a local minimum and has not reached the true absolute minimum, which then leads to a misfit of the data. The occurrence of local minima in the estimation procedures is intrinsic to non-linear regression and does not depend on the minimization algorithm, but rather on the
functional form of the growth model. The risk of reaching convergence at a local minimum is greater if the starting values are badly chosen, the scatter in the data is large (usually not a problem in longitudinal data except for outliers due to erroneous measurements), and the range of the data is insufficient for the model at hand. A robust model is one that has few or no local minima near the real minimum and hence is not too sensitive to the choice of starting values.

To test for the presence of a false minimum, one can run the curve-fitting procedure with different sets of starting values of the parameters. If they all converge to nearly the same solution, then it is very likely that you have found the true minimum. If one set of starting values results in a substantially lower sum of squares, then you should take the corresponding parameter estimates as the new starting values and repeat the procedure, since it is likely that you are now nearer to the true minimum (see Chapter […] for a more extensive discussion of numerical minimization techniques).

Usually, the population means for the function parameters can serve as starting values for these numerical minimization techniques, although individual adjustments are sometimes required. When studying a new population, one can utilize starting values taken from the literature. Tables 8.2 and 8.3 provide sets of starting values for the models discussed in this chapter. They will not necessarily be suitable to fit all growth curves in a specific population, but they can be used in a preliminary analysis of specific data. A better set of starting values can then be obtained from the means of the successful fits.

Most structural growth models are monotonically increasing functions and are therefore designed in the first place to describe growth of skeletal dimensions for which, strictly speaking, we have only positive growth. For this reason, structural models are not suitable for traits such as body weight, body
mass index and skinfolds, for instance. The latter traits may show negative as well as positive growth, and the general shape of the growth pattern of those traits usually does not match the functional form of the models. Non-structural approaches are better suited to fit those traits. Most structural models designed to describe adolescent growth tend to an upper asymptote (final size) towards the end of the growth phase, and also allow for an adolescent spurt. Therefore, they are suitable for postcranial skeletal dimensions (length and width measurements of the body), but perform badly for measurements of the head and face, which have virtually no adolescent growth spurt.

Growth in infancy and childhood

A long time ago, Jenss and Bayley (1937) proposed a four-parameter non-linear model which satisfactorily fits growth data from birth to […] years. The formulation of the Jenss curve is as follows:

$$y = a + bt - e^{c + dt}$$

where y is size, t is age, and a, b, c and d are the four function parameters. The model has a linear component ($a + bt$), in which the parameter b determines the childhood growth velocity, and an exponential component ($e^{c + dt}$), determining the decreasing growth rate shortly after birth. The Jenss curve has been successfully applied by Deming and Washburn (1963), Manwani and Agarwal (1973), Berkey (1982), and several others. The model is suitable for describing growth of body length and of various dimensions of the head (typically head circumference) during infancy and early childhood.

Figure 8.2 Fit of the Jenss–Bayley model and the Count model to the data of growth in height from birth to 8 years of age (Table 8.1). Parameter estimates shown in the panels: Jenss–Bayley, a = 79.85, b = 0.5043, c = 3.427, d = −0.09687; Count, a = 48.88, b = 0.3781, c = 9.395. Axes: body length (cm) against age (months).

It has often been used to fit weight data as well, despite the
problems that may arise when growth in weight is not monotonically increasing or has an irregular pattern, which often occurs shortly after birth.

Another model that fits early childhood data fairly well is the three-parameter model proposed by Count (1942, 1943), slightly modified by Livshits et al. (2000) in order to allow the inclusion of birth data:

$$y = a + bt + c \ln(t + 1)$$

where y is size, t is age, and a, b and c are the three function parameters.

Figure 8.2 shows the height data of Table 8.1 for ages 0 to 8 years with Jenss–Bayley and Count curve fittings. For the purpose of fitting those two models, it is better to express the ages in months. The estimates of the parameters for the respective fits are shown in the figures. At first glance, both models seem to describe body length adequately during the first 8 years of postnatal life, although visual inspection of the graphs shows a slightly better fit of the Jenss–Bayley curve. The residual standard deviation (RSD) is 0.48 cm for the Jenss–Bayley curve and 0.93 cm for the Count curve. Both models are fairly robust to the choice of starting values. Table 8.2 gives a set of starting values for body length (in cm), body weight (in kg) and head circumference (in cm) when the age is expressed in months. The sex differences in growth are much smaller than the normal variations in growth during infancy, so that a single set of starting values suffices for both sexes.

Table 8.2 Starting values for fitting the Jenss–Bayley model and the Count model to growth of body length, weight and head circumference with numeric minimization algorithms (when age is expressed in months)

            Jenss–Bayley                                          Count
Parameter   Length (cm)  Weight (kg)  Head circumference (cm)     Length (cm)  Weight (kg)  Head circumference (cm)
a           83           8.5          50                          50           […]          35
b           0.5          0.2          0.03                        0.4          0.1          −0.03
c           3.5          1.7          2.6                         9.5          […]          4.5
d           −0.08        −0.3         −0.1

Berkey (1982) compared the reliability, efficiency, precision and goodness-of-fit of the Count and
Jenss–Bayley models and concluded that the latter model fitted the growth data better than Count's model, especially prior to 1 year of age.

Berkey and Reed (1973) have greatly enhanced the flexibility of the Count function by adding one or more deceleration terms. They proposed the following two functions:

Reed 1st order:

$$y = a + bt + c \ln(t) + \frac{d}{t}$$

Reed 2nd order:

$$y = a + bt + c \ln(t) + \frac{d_1}{t} + \frac{d_2}{t^2}$$

where y is size, t is age, and a, b, c, d (first order) and a, b, c, $d_1$, $d_2$ (second order) are the function parameters. The Reed models can accommodate one or more inflection points (depending on the number of reciprocal terms), allowing the description of one or more periods of growth acceleration and thus fitting a wider variety of both normal and abnormal growth patterns in early childhood. However, if birth is included, then chronological age since birth cannot be used, because $\ln(t)$ and $1/t$ are undefined at t = 0, and an alternative age scale has to be chosen. Berkey and Reed (1973) suggest the age transformation t = (months since birth + 9)/9, which assigns t = 0 at conception and t = 1 at birth. They showed that the four-parameter (first-order) Reed model provided significantly better overall fits than the Jenss–Bayley model, which also has four parameters. Moreover, because the Reed models are linear in their constants, they can be fitted by simpler statistical methods than the non-linear Jenss–Bayley curve. Simondon et al. (1992) made an interesting comparison of five growth models to fit weight data between birth and 13 months of age.

Growth at adolescence

Logistic and Gompertz functions

The first attempts to fit the adolescent growth cycle were made by using the logistic and the Gompertz function. These models are special cases of the generalized logistic model (Nelder, 1961), of which the differential equation integrates to:

$$y = K \left(1 + c\,e^{-bt}\right)^{1/(1-m)} \quad \text{for } m > […]$$

For m > 1, this curve has an S-shape with a lower and upper asymptote, equal to zero and K, and one point of inflection. Parameter b is a rate constant, determining the
spread of the curve along the time axis, while parameter c is an integration constant. For the purpose of fitting the adolescent growth cycle, the lower asymptote is shifted away from zero by adding a constant P.

For m = 2, the generalized logistic leads to the autocatalytic or logistic curve, which can, after reparameterization, be written in the form:

$$y = P + \frac{K}{1 + e^{a - bt}}$$

where y is size, t is age, and with P, K, $a = \ln(c)$ and b as stated above. In the logistic model, relative growth rate (growth velocity divided by size) declines linearly with size. Hence, the curve is symmetrical around its inflection point $y_I = P + K/2$ at $t_I = a/b$ (age at peak velocity), with maximal peak velocity given by $bK/4$.

For m = 1, the generalized logistic equation breaks down, but it can be shown that for m → 1 the model leads to the Gompertz curve, in which relative growth rate declines exponentially with time:

$$y = P + K e^{-e^{a - bt}}$$

where y is size and t is age. The Gompertz curve is asymmetrical around its point of inflection, $y_I = P + K/e \approx P + 0.37K$ at $t_I = a/b$ (age at maximal velocity), with maximal velocity given by $bK/e$. In both the logistic and the Gompertz function, the size reached at the inflection point is a fixed fraction of the amount of adolescent growth K (50% and approximately 37%, respectively).

The logistic and Gompertz functions were used to fit the adolescent growth data of several body dimensions (Deming, 1957; Marubini et al., 1971, 1972; Tanner et al., 1976). In a longitudinal study of 35 Belgian girls (Hauspie et al., 1980), it was shown that both models fit adolescent data well, with pooled residual variances of 0.45 cm² for the logistic and 0.61 cm² for the Gompertz function (total number of degrees of freedom 110). Nevertheless, Wilcoxon's signed rank test revealed significantly better fits with the logistic than with the Gompertz function (P
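
The closed-form milestones quoted above for the logistic and Gompertz curves (age, size and velocity at the inflection point) can be checked numerically. The following is a minimal sketch in Python, not taken from the chapter: the function names `logistic`, `gompertz`, `velocity` and `milestones` are hypothetical, and the parameter values are invented for illustration only and are not estimates from any data set.

```python
import math


def logistic(t, P, K, a, b):
    """Logistic curve y = P + K / (1 + exp(a - b*t))."""
    return P + K / (1.0 + math.exp(a - b * t))


def gompertz(t, P, K, a, b):
    """Gompertz curve y = P + K * exp(-exp(a - b*t))."""
    return P + K * math.exp(-math.exp(a - b * t))


def velocity(curve, t, params, h=1e-5):
    """Central-difference approximation of the instantaneous velocity dy/dt."""
    return (curve(t + h, *params) - curve(t - h, *params)) / (2.0 * h)


def milestones(model, P, K, a, b):
    """Closed-form values at the inflection point, as stated in the text."""
    age_at_peak = a / b  # same expression for both models
    if model == "logistic":
        return {"age_at_peak": age_at_peak,
                "size_at_peak": P + K / 2.0,
                "peak_velocity": b * K / 4.0}
    if model == "gompertz":
        return {"age_at_peak": age_at_peak,
                "size_at_peak": P + K / math.e,
                "peak_velocity": b * K / math.e}
    raise ValueError("model must be 'logistic' or 'gompertz'")


# Hypothetical parameter values (P in cm, K in cm, a dimensionless, b per year).
params = (130.0, 45.0, 14.0, 1.0)

for name, curve in (("logistic", logistic), ("gompertz", gompertz)):
    m = milestones(name, *params)
    t_peak = m["age_at_peak"]
    # Compare the closed-form milestones with values computed from the curve.
    print(name, m,
          "curve size:", curve(t_peak, *params),
          "numeric velocity:", velocity(curve, t_peak, params))
```

With the same P, K, a and b, both curves peak at the same age a/b, but the Gompertz curve reaches its peak at a smaller size (P + K/e rather than P + K/2) and with a higher velocity (bK/e rather than bK/4), which is the asymmetry described in the text.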