Nonlinear mixed-effects model
from Wikipedia

Nonlinear mixed-effects models constitute a class of statistical models generalizing linear mixed-effects models. Like linear mixed-effects models, they are particularly useful in settings where there are multiple measurements within the same statistical units or when there are dependencies between measurements on related statistical units. Nonlinear mixed-effects models are applied in many fields including medicine, public health, pharmacology, and ecology.[1][2]

Definition


While any statistical model that contains both fixed effects and random effects and is nonlinear in some of its parameters is an example of a nonlinear mixed-effects model, the most commonly used models are members of the class of nonlinear mixed-effects models for repeated measures[1]

$$y_{ij} = f(\phi_{ij}, v_{ij}) + \epsilon_{ij}, \quad i = 1, \ldots, M, \quad j = 1, \ldots, n_i,$$

where

  • $M$ is the number of groups/subjects,
  • $n_i$ is the number of observations for the $i$th group/subject,
  • $f$ is a real-valued differentiable function of a group-specific parameter vector $\phi_{ij}$ and a covariate vector $v_{ij}$,
  • $\phi_{ij}$ is modeled as a linear mixed-effects model $\phi_{ij} = A_{ij}\beta + B_{ij} b_i$, where $\beta$ is a vector of fixed effects, $b_i$ is a vector of random effects associated with group $i$, and $A_{ij}$ and $B_{ij}$ are design matrices, and
  • $\epsilon_{ij}$ is a random variable describing additive noise.
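To make the structure concrete, the following is a minimal simulation sketch, not taken from the cited references. It assumes a repeated-measures model of the form y_ij = f(phi_i, t_ij) + eps_ij with phi_i = beta + b_i, a three-parameter logistic curve for f, independent Gaussian random effects, and Gaussian noise. The function names and all numeric values are hypothetical.

```python
import numpy as np

def logistic_curve(phi, t):
    """Three-parameter logistic mean function f(phi, t) (illustrative choice)."""
    asymptote, midpoint, scale = phi
    return asymptote / (1.0 + np.exp(-(t - midpoint) / scale))

def simulate_nlme(beta, omega_sd, sigma, times, n_subjects, seed=0):
    """Simulate y_ij = f(beta + b_i, t_j) + eps_ij for each subject i."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    # Random effects b_i ~ N(0, diag(omega_sd^2)); a zero SD fixes that parameter.
    b = rng.normal(0.0, omega_sd, size=(n_subjects, beta.size))
    phi = beta + b  # subject-specific parameter vectors
    y = np.empty((n_subjects, len(times)))
    for i in range(n_subjects):
        noise = rng.normal(0.0, sigma, size=len(times))
        y[i] = logistic_curve(phi[i], times) + noise
    return phi, y

times = np.linspace(0.0, 10.0, 6)
phi, y = simulate_nlme(beta=[100.0, 5.0, 1.5], omega_sd=[10.0, 0.5, 0.0],
                       sigma=1.0, times=times, n_subjects=4)
print(y.round(1))
```

Here the asymptote and midpoint vary between subjects while the scale parameter is shared, which is one common way to limit the number of random effects.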

Estimation


When the model is nonlinear only in the fixed effects and the random effects are Gaussian, maximum-likelihood estimation can be done using nonlinear least squares methods, although asymptotic properties of estimators and test statistics may differ from the conventional general linear model. In the more general setting, several methods exist for maximum-likelihood or maximum a posteriori estimation in certain classes of nonlinear mixed-effects models, typically under the assumption of normally distributed random variables. A popular approach is the Lindstrom-Bates algorithm,[3] which iteratively optimizes a nonlinear problem, locally linearizes the model around this optimum, and then employs conventional methods from linear mixed-effects models to do maximum-likelihood estimation. Stochastic approximation of the expectation-maximization algorithm gives an alternative approach for maximum-likelihood estimation.[4]

Applications


Example: Disease progression modeling


Nonlinear mixed-effects models have been used for modeling progression of disease.[5] In progressive disease, the temporal patterns of progression on outcome variables may follow a nonlinear temporal shape that is similar between patients. However, the stage of disease of an individual may not be known or may be only partially known from what can be measured. Therefore, a latent time variable that describes individual disease stage (i.e. where the patient is along the nonlinear mean curve) can be included in the model.

Example: Modeling cognitive decline in Alzheimer's disease

Example of disease progression modeling of longitudinal ADAS-Cog scores using the progmod R package.[5]

Alzheimer's disease is characterized by a progressive cognitive deterioration. However, patients may differ widely in cognitive ability and reserve, so cognitive testing at a single time point can often only be used to coarsely group individuals in different stages of disease. Now suppose we have a set of longitudinal cognitive data $(y_{i1}, \ldots, y_{in_i})$ from $i = 1, \ldots, M$ individuals that are each categorized as having either normal cognition (CN), mild cognitive impairment (MCI) or dementia (DEM) at the baseline visit (time $t_{i1}$, corresponding to measurement $y_{i1}$). These longitudinal trajectories can be modeled using a nonlinear mixed-effects model that allows differences in disease state based on baseline categorization:

$$y_{ij} = f_{\tilde{\beta}}\left(t_{ij} + A_i^{MCI}\beta^{MCI} + A_i^{DEM}\beta^{DEM} + b_i\right) + \epsilon_{ij}, \quad i = 1, \ldots, M, \quad j = 1, \ldots, n_i,$$

where

  • $f_{\tilde{\beta}}$ is a function that models the mean time-profile of cognitive decline whose shape is determined by the parameters $\tilde{\beta}$,
  • $t_{ij}$ represents observation time (e.g. time since baseline in the study),
  • $A_i^{MCI}$ and $A_i^{DEM}$ are dummy variables that are 1 if individual $i$ has MCI or dementia at baseline and 0 otherwise,
  • $\beta^{MCI}$ and $\beta^{DEM}$ are parameters that model the difference in disease progression of the MCI and dementia groups relative to the cognitively normal,
  • $b_i$ is the difference in disease stage of individual $i$ relative to his/her baseline category, and
  • $\epsilon_{ij}$ is a random variable describing additive noise.

An example of such a model with an exponential mean function fitted to longitudinal measurements of the Alzheimer's Disease Assessment Scale-Cognitive Subscale (ADAS-Cog) is shown in the box. As shown, the inclusion of fixed effects of baseline categorization (MCI or dementia relative to normal cognition) and the random effect of individual continuous disease stage aligns the trajectories of cognitive deterioration to reveal a common pattern of cognitive decline.
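The alignment idea can be sketched in a few lines of Python. This is an illustration only and not the progmod implementation: it assumes that observation time is shifted by group-level offsets for MCI and dementia plus an individual random shift before being passed to an exponential mean function. All function names and numeric values are hypothetical.

```python
import numpy as np

def mean_adas_cog(disease_time, baseline=10.0, scale=5.0, rate=0.3):
    """Hypothetical exponential mean curve on the latent disease-time axis."""
    return baseline + scale * np.exp(rate * np.asarray(disease_time, dtype=float))

def expected_score(t, is_mci, is_dem, b_i, beta_mci=4.0, beta_dem=8.0):
    """Expected score at study time t for one individual.

    is_mci / is_dem are 0/1 baseline-category dummies, beta_mci / beta_dem are
    group-level time shifts, and b_i is the individual's random time shift.
    """
    disease_time = t + is_mci * beta_mci + is_dem * beta_dem + b_i
    return mean_adas_cog(disease_time)

# An MCI subject at study year 2 sits at the same point of the common curve
# as a cognitively normal subject would at year 6 (with beta_mci = 4).
print(expected_score(2.0, is_mci=1, is_dem=0, b_i=0.0))
print(expected_score(6.0, is_mci=0, is_dem=0, b_i=0.0))
```

In this sketch every individual follows the same curve, and only the position along the time axis differs, which is what lets the shifted trajectories line up on a common pattern.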

Example: Growth analysis

Estimation of a mean height curve for boys from the Berkeley Growth Study with and without warping. Warping model is fitted as a nonlinear mixed-effects model using the pavpop R package.[6]

Growth phenomena often follow nonlinear patterns (e.g. logistic growth, exponential growth, and hyperbolic growth). Factors such as nutrient deficiency may directly affect the measured outcome (e.g. organisms lacking nutrients end up smaller), but may also affect timing (e.g. organisms lacking nutrients grow at a slower pace). If a model fails to account for the differences in timing, the estimated population-level curves may smooth out finer details due to lack of synchronization between organisms. Nonlinear mixed-effects models enable simultaneous modeling of individual differences in growth outcomes and timing.

Example: Modeling human height


Models for estimating the mean curves of human height and weight as a function of age and the natural variation around the mean are used to create growth charts. The growth of children can, however, become desynchronized due to both genetic and environmental factors. For example, age at onset of puberty and its associated height spurt can vary by several years between adolescents. Therefore, cross-sectional studies may underestimate the magnitude of the pubertal height spurt because age is not synchronized with biological development. The differences in biological development can be modeled using random effects $\boldsymbol{w}_i$ that describe a mapping of observed age to a latent biological age using a so-called warping function $v(\cdot, \boldsymbol{w}_i)$. A simple nonlinear mixed-effects model with this structure is given by

$$y_{ij} = f_{\beta}\left(v(t_{ij}, \boldsymbol{w}_i)\right) + \epsilon_{ij}, \quad i = 1, \ldots, M, \quad j = 1, \ldots, n_i,$$

where

  • $f_{\beta}$ is a function that represents the height development of a typical child as a function of age. Its shape is determined by the parameters $\beta$,
  • $t_{ij}$ is the age of child $i$ corresponding to the height measurement $y_{ij}$,
  • $v(\cdot, \boldsymbol{w}_i)$ is a warping function that maps age to biological development in order to synchronize the children. Its shape is determined by the random effects $\boldsymbol{w}_i$,
  • $\epsilon_{ij}$ is a random variable describing additive variation (e.g. consistent differences in height between children and measurement noise).

Several methods and software packages exist for fitting such models. The so-called SITAR model[7] can fit such models using warping functions that are affine transformations of time (i.e. additive shifts in biological age and differences in rate of maturation). SITAR can be employed as a sophisticated data reduction tool, reducing growth trajectories to three patient-level effects summarising the variability in 'size', 'timing' and 'intensity' of growth across a sample. This can facilitate the assessment of the impact of predictor variables on overall growth trajectory, and has been used to quantify the impact of antenatal steroids on child growth.[8] An alternative approach is the so-called pavpop model[6] that can fit models with smoothly-varying warping functions, an example of which is shown in the box.
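As a rough illustration of an affine warping function, the sketch below maps observed age to a latent biological age through a shift and a rate, and adds a size offset. It is not the SITAR or pavpop implementation or their exact parameterization. The mean height curve, the parameter names, and all numbers are hypothetical.

```python
import numpy as np

def typical_height(bio_age):
    """Hypothetical mean height (cm): steady growth plus a logistic pubertal spurt."""
    bio_age = np.asarray(bio_age, dtype=float)
    return 80.0 + 5.5 * bio_age + 20.0 / (1.0 + np.exp(-(bio_age - 13.0)))

def affine_warp(age, shift, log_rate):
    """Map observed age to biological age: exp(log_rate) * (age - shift)."""
    return np.exp(log_rate) * (np.asarray(age, dtype=float) - shift)

def child_height(age, shift=0.0, log_rate=0.0, size_offset=0.0):
    """Height of one child: size offset plus the typical curve at warped age."""
    return size_offset + typical_height(affine_warp(age, shift, log_rate))

ages = np.arange(8.0, 19.0)
early = child_height(ages, shift=-1.0)   # matures one year early
late = child_height(ages, shift=1.0)     # matures one year late
cross_sectional_mean = 0.5 * (early + late)

# Largest yearly gain of the typical curve vs. the unsynchronized average.
print(np.diff(typical_height(ages)).max(), np.diff(cross_sectional_mean).max())
```

In this toy setup the average of the two desynchronized children shows a smaller peak yearly gain than the typical curve, which mirrors the underestimation of the pubertal spurt described above.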

Example: Population Pharmacokinetic/pharmacodynamic modeling

Basic pharmacokinetic processes affecting the fate of ingested substances. Nonlinear mixed-effects modeling can be used to estimate the population-level effects of these processes while also modeling the individual variation between subjects.

PK/PD models for describing exposure-response relationships such as the Emax model can be formulated as nonlinear mixed-effects models.[9] The mixed-model approach allows modeling of both population level and individual differences in effects that have a nonlinear effect on the observed outcomes, for example the rate at which a compound is being metabolized or distributed in the body.
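A minimal sketch of how an Emax model might be written with subject-level variability is shown below. It uses the standard Emax form E = E0 + Emax * C / (EC50 + C); the log-normal (multiplicative) random effects, the helper names, and the parameter values are assumptions made for illustration and are not taken from the cited reference.

```python
import numpy as np

def emax_response(conc, e0, emax, ec50):
    """Standard Emax exposure-response curve: e0 + emax * conc / (ec50 + conc)."""
    conc = np.asarray(conc, dtype=float)
    return e0 + emax * conc / (ec50 + conc)

def individual_parameters(population, eta):
    """Scale population values by exp(eta) so individual parameters stay positive.

    `population` maps parameter names to typical values; `eta` maps a subset of
    those names to subject-specific random effects (missing names get eta = 0).
    """
    return {name: value * np.exp(eta.get(name, 0.0))
            for name, value in population.items()}

population = {"e0": 2.0, "emax": 10.0, "ec50": 20.0}   # hypothetical values
subject = individual_parameters(population, {"ec50": 0.4})  # less sensitive subject

conc = np.array([0.0, 5.0, 20.0, 80.0])
print(emax_response(conc, **population))
print(emax_response(conc, **subject))
```

Because the random effect enters inside the nonlinear function, the subject's curve is not simply the population curve plus a constant, which is the situation the mixed-model approach is designed to handle.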

Example: COVID-19 epidemiological modeling

Extrapolated infection trajectories of 40 countries severely affected by COVID-19 and grand (population) average through May 14th

The framework of nonlinear mixed-effects models can be used to describe infection trajectories of subjects and to understand common features shared across the subjects. In epidemiological problems, subjects can be countries, states, counties, etc. This can be particularly useful for estimating the future trend of an epidemic at an early stage of a pandemic, when very little is known about the disease.[10]

Example: Prediction of oil production curve of shale oil wells at a new location with latent kriging

Prediction of oil production rate decline curve obtained by latent kriging. 324 training wells and two test wells in the Eagle Ford Shale Reservoir of South Texas (top left); A schematic example of a hydraulically fractured horizontal well (bottom left); Predicted curves at test wells via latent kriging method (right)

The eventual success of petroleum development projects depends to a large degree on well construction costs. For unconventional oil and gas reservoirs, very low permeability and a flow mechanism very different from that of conventional reservoirs mean that estimates of the well construction cost often contain high levels of uncertainty, and oil companies need to invest heavily in the drilling and completion phase of the wells. The overall recent commercial success rate of horizontal wells in the United States is known to be 65%, which implies that only about 2 out of 3 drilled wells will be commercially successful. For this reason, one of the crucial tasks of petroleum engineers is to quantify the uncertainty associated with oil or gas production from shale reservoirs and, further, to predict the approximate production behavior of a new well at a new location, given specific completion data, before actual drilling takes place, in order to avoid unnecessary well construction costs.

The framework of nonlinear mixed-effects models can be extended to account for spatial association by incorporating a geostatistical process, such as a Gaussian process, at the second stage of the model.[11] The model has the following components:

  • a function that models the mean time-profile of the log-scaled oil production rate, whose shape is determined by well-specific curve parameters. The function is obtained by taking the logarithm of the rate decline curve used in decline curve analysis,
  • covariates obtained from the completion process of the hydraulic fracturing and horizontal directional drilling for each well,
  • the spatial location (longitude, latitude) of each well,
  • Gaussian white noise with its own error variance (also called the nugget effect),
  • a Gaussian process with a Gaussian covariance function, and
  • a horseshoe shrinkage prior.

The Gaussian process regressions used at the latent level (the second stage) eventually produce kriging predictors for the curve parameters that dictate the shape of the mean curve at the data level (the first stage). Because the kriging techniques are employed at the latent level, this technique is called latent kriging. The right panels show the prediction results of the latent kriging method applied to the two test wells in the Eagle Ford Shale Reservoir of South Texas.

Bayesian nonlinear mixed-effects model

Bayesian research cycle using Bayesian nonlinear mixed effects model: (a) standard research cycle and (b) Bayesian-specific workflow.[12]

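The framework of Bayesian hierarchical modeling is frequently used in diverse applications. In particular, Bayesian nonlinear mixed-effects models have recently received significant attention. A basic version of the Bayesian nonlinear mixed-effects model is represented by the following three stages:

Stage 1: Individual-Level Model

$$y_{ij} = f(t_{ij}; \theta_{1i}, \theta_{2i}, \ldots, \theta_{Ki}) + \epsilon_{ij}, \quad \epsilon_{ij} \sim N(0, \sigma^2), \quad i = 1, \ldots, N, \quad j = 1, \ldots, M_i.$$

Stage 2: Population Model

$$\theta_{li} = \alpha_l + \sum_{b=1}^{P} \beta_{lb} x_{ib} + \eta_{li}, \quad \eta_{li} \sim N(0, \omega_l^2), \quad i = 1, \ldots, N, \quad l = 1, \ldots, K.$$

Stage 3: Prior

$$\sigma^2 \sim \pi(\sigma^2), \quad \alpha_l \sim \pi(\alpha_l), \quad (\beta_{l1}, \ldots, \beta_{lP}) \sim \pi(\beta_{l1}, \ldots, \beta_{lP}), \quad \omega_l^2 \sim \pi(\omega_l^2), \quad l = 1, \ldots, K.$$

Here, $y_{ij}$ denotes the continuous response of the $i$-th subject at the time point $t_{ij}$, and $x_{ib}$ is the $b$-th covariate of the $i$-th subject. Parameters involved in the model are written in Greek letters. $f(t; \theta_1, \ldots, \theta_K)$ is a known function parameterized by the $K$-dimensional vector $(\theta_1, \ldots, \theta_K)$. Typically, $f$ is a nonlinear function and describes the temporal trajectory of individuals. In the model, $\epsilon_{ij}$ and $\eta_{li}$ describe within-individual variability and between-individual variability, respectively. If Stage 3 (Prior) is not considered, the model reduces to a frequentist nonlinear mixed-effects model.

A central task in the application of Bayesian nonlinear mixed-effects models is to evaluate the posterior density:

$$\pi(\{\theta_{li}\}, \sigma^2, \{\alpha_l\}, \{\beta_{lb}\}, \{\omega_l^2\} \mid \{y_{ij}\}) \propto \pi(\{y_{ij}\} \mid \{\theta_{li}\}, \sigma^2) \, \pi(\{\theta_{li}\} \mid \{\alpha_l\}, \{\beta_{lb}\}, \{\omega_l^2\}) \, \pi(\sigma^2, \{\alpha_l\}, \{\beta_{lb}\}, \{\omega_l^2\}).$$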

The panel on the right displays the Bayesian research cycle using a Bayesian nonlinear mixed-effects model.[13] A research cycle using the Bayesian nonlinear mixed-effects model comprises two steps: (a) the standard research cycle and (b) the Bayesian-specific workflow. The standard research cycle involves reviewing the literature, defining a problem, and specifying the research question and hypothesis. The Bayesian-specific workflow comprises three sub-steps: (b)–(i) formalizing prior distributions based on background knowledge and prior elicitation; (b)–(ii) determining the likelihood function based on a nonlinear function $f$; and (b)–(iii) making a posterior inference. The resulting posterior inference can be used to start a new research cycle.

from Grokipedia
A nonlinear mixed-effects model (NLME or NLMEM) is a hierarchical statistical framework that combines nonlinear regression with random effects to analyze repeated measures or clustered data, where fixed effects capture population-average relationships and random effects model individual-specific deviations from those averages. These models are defined by a nonlinear mean structure for the conditional distribution of observations given random effects, typically assuming normally distributed random effects and possibly log-normal or other distributions for residuals, enabling the estimation of both population parameters and inter-individual variability from sparse or unbalanced datasets. They are particularly suited for scenarios where the response-predictor relationship involves nonlinear functions, such as exponential decay or logistic growth, and are estimated via maximum likelihood methods using approximations like first-order conditional estimation or Laplace integration to handle the integral over random effects.

The development of NLMEs originated in pharmacokinetics to address population-level inference from data with inter-subject variability, with foundational work by Lewis B. Sheiner introducing the approach in 1977 for estimating population pharmacokinetic parameters from routine data. This was expanded by Sheiner and Stuart L. Beal in a series of papers from 1980 to 1983, which proposed estimation methods for nonlinear random effects models under Michaelis-Menten, biexponential, and monoexponential kinetics, forming the basis for the widely used NONMEM software. A broader statistical formulation for repeated measures data was formalized by Mary J. Lindstrom and Douglas M. Bates in 1990, providing a general parametric framework and iterative estimation algorithms that influenced subsequent implementations in software like SAS PROC NLMIXED and R's nlme package.

NLMEs have become essential across disciplines for handling hierarchical data structures, including pharmacokinetics for drug concentration modeling, the modeling of population growth trajectories, and biomedical imaging for parameter estimation in PET studies. Their flexibility allows incorporation of covariates at fixed or random levels, supporting hypothesis testing on sources of variability and prediction of individual profiles, though complex structures remain challenging to specify and computationally intensive to fit. Modern extensions include semiparametric variants and Bayesian approaches to enhance robustness in high-dimensional or non-normal data settings.

Introduction

Definition and Motivation

A nonlinear mixed-effects model is a statistical framework that extends mixed-effects models to capture nonlinear relationships between predictors and response variables, incorporating both fixed effects, which represent population-level parameters, and random effects, which account for variability across individuals or groups. This approach allows for the modeling of hierarchical or clustered data where the mean response follows a nonlinear function of covariates and parameters.

The primary motivation for nonlinear mixed-effects models arises from the limitations of linear models in describing real-world phenomena exhibiting inherent nonlinearity, such as those seen in biological processes or sigmoidal dose-response curves in pharmacology. These models are particularly valuable for analyzing longitudinal data, repeated measures, or clustered observations, where inter-individual variability and nonlinear patterns occur naturally, enabling more accurate inference about population characteristics while accommodating sparse or unbalanced data structures common in fields like pharmacokinetics.

In standard notation, the model for the response variable $y_{ij}$, representing the $j$-th observation for the $i$-th individual, is given by

$$y_{ij} = f(\mathbf{x}_{ij}, \boldsymbol{\beta}, \mathbf{b}_i) + \epsilon_{ij},$$

where $f(\cdot)$ is a nonlinear function, $\mathbf{x}_{ij}$ are covariates, $\boldsymbol{\beta}$ denotes the vector of fixed effects, $\mathbf{b}_i$ is the vector of random effects for individual $i$, typically distributed as $\mathbf{b}_i \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Omega})$, and $\epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$ captures residual error.

These models originated in the 1970s within pharmacokinetics to estimate population-level kinetic parameters from sparse individual data, with Lewis B. Sheiner et al. introducing the population approach in 1977. This was expanded in the 1980s, including seminal contributions by Beal and Sheiner in their 1982 paper on estimation methods, which laid the groundwork for software like NONMEM.

Comparison to Linear Mixed-Effects Models

Linear mixed-effects models (LME) form the foundational framework for analyzing clustered or longitudinal data, assuming the response variable is a linear combination of fixed and random effects along with residual error. The standard formulation is $\mathbf{y}_i = \mathbf{X}_i \boldsymbol{\beta} + \mathbf{Z}_i \mathbf{b}_i + \boldsymbol{\epsilon}_i$, where $\mathbf{y}_i$ is the response vector for subject $i$, $\boldsymbol{\beta}$ represents fixed effects, $\mathbf{b}_i \sim N(\mathbf{0}, \mathbf{D})$ are subject-specific random effects, and $\boldsymbol{\epsilon}_i \sim N(\mathbf{0}, \mathbf{R}_i)$ is the within-subject error. This linearity in parameters enables efficient estimation for data with additive structures but limits applicability to scenarios where relationships deviate from straight-line trends.

Nonlinear mixed-effects models (NLME) extend LME by incorporating a nonlinear function for the mean response, typically expressed as $y_{ij} = f(\boldsymbol{\phi}_i, x_{ij}) + \epsilon_{ij}$, where $f$ is a nonlinear form (such as logistic or exponential), $x_{ij}$ is the covariate for observation $j$ of subject $i$, and $\boldsymbol{\phi}_i = \boldsymbol{\mu} + \boldsymbol{\eta}_i$ with $\boldsymbol{\eta}_i$ as random effects influencing parameters within $f$. Unlike LME, which restricts random effects to linear predictors, NLME allows these effects to modulate nonlinear parameters directly, providing a more general structure that subsumes LME as a special case when $f$ is linear.

NLME is warranted when data violate LME's linearity assumption, such as in processes exhibiting saturation or asymptotic convergence, whereas LME suffices for straightforward additive linear patterns and offers simpler computation. For instance, in longitudinal studies of biological growth, LME may inadequately capture curving trajectories, necessitating NLME for accurate representation. NLME confers advantages over LME by achieving superior fits to inherently nonlinear phenomena, such as tumor progression or pharmacokinetic profiles, which often display bounded or sigmoidal behaviors. Additionally, NLME accommodates heteroscedasticity and non-normal errors more flexibly through its parametric structure, enhancing robustness in heterogeneous populations. An illustrative transition involves growth modeling: LME might specify a straight-line form $y = \beta t$, but NLME uses an asymptotic equation like $y = \alpha (1 - e^{-\beta t})$ to reflect leveling-off, better aligning with empirical curves in developmental data.
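The statement that LME is a special case of NLME can be checked with a short worked example. As an illustrative assumption, take $f$ to be a straight line in $x_{ij}$ with a two-component parameter vector $\boldsymbol{\phi}_i = \boldsymbol{\mu} + \boldsymbol{\eta}_i$; the NLME form then collapses to the LME form with $\boldsymbol{\beta} = \boldsymbol{\mu}$, $\mathbf{b}_i = \boldsymbol{\eta}_i$, and identical design matrices for the fixed and random effects:

```latex
\begin{aligned}
f(\boldsymbol{\phi}_i, x_{ij}) &= \phi_{i1} + \phi_{i2}\, x_{ij},
  \qquad \boldsymbol{\phi}_i = \boldsymbol{\mu} + \boldsymbol{\eta}_i \\
y_{ij} &= (\mu_1 + \mu_2 x_{ij}) + (\eta_{i1} + \eta_{i2} x_{ij}) + \epsilon_{ij} \\
\mathbf{y}_i &= \mathbf{X}_i \boldsymbol{\mu} + \mathbf{Z}_i \boldsymbol{\eta}_i + \boldsymbol{\epsilon}_i,
  \qquad
  \mathbf{X}_i = \mathbf{Z}_i =
  \begin{pmatrix} 1 & x_{i1} \\ \vdots & \vdots \\ 1 & x_{i n_i} \end{pmatrix}
\end{aligned}
```

With the asymptotic curve $y = \alpha(1 - e^{-\beta t})$, by contrast, a random effect on $\beta$ sits inside the exponential, so no choice of $\mathbf{X}_i$ and $\mathbf{Z}_i$ reproduces the model in this linear form.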

Mathematical Foundations

Model Structure

The nonlinear mixed-effects (NLME) model provides a hierarchical framework for analyzing clustered or longitudinal data where the relationship between predictors and responses is nonlinear. At its core, the model consists of two levels: a within-subject level (Level 1) that describes individual observations and a between-subject level (Level 2) that accounts for inter-individual variability through random effects. This allows the model to capture both population-level trends and subject-specific deviations, making it particularly suitable for data arising from pharmacokinetics, growth curves, or other nonlinear processes. The within-subject model is typically expressed as

$$y_{ij} = f(\eta_{ij}, b_i) + \varepsilon_{ij},$$

where $y_{ij}$ is the response for the $j$-th observation on the $i$-th subject, $f(\cdot)$ is a nonlinear function, $\eta_{ij}$ incorporates covariates such as time or dose, $b_i$ represents subject-specific random effects, and $\varepsilon_{ij}$ is the residual error. The nonlinear function $f$ can take various forms depending on the application; for instance, in pharmacokinetic modeling, it often follows the Michaelis-Menten equation $f = \frac{V_{\max} \cdot \text{dose}}{K_m + \text{dose}}$, where $V_{\max}$ and $K_m$ are parameters describing the maximum velocity and the half-saturation constant, respectively. Alternatively, for growth data, an exponential form like $f = \alpha e^{\beta t}$ may be used, with $\alpha$ as the initial value and $\beta$ as the growth rate. Covariates enter through the fixed-effects parameters $\beta$, which define the population mean structure within $\eta_{ij}$ and $f$, while random effects $b_i$ allow individual deviations from this mean.

At the between-subject level, the random effects follow a multivariate normal distribution:

$$b_i \sim N(0, \Omega),$$

where $\Omega$ is the covariance matrix capturing the variances of, and correlations among, the random effects. The residual errors are generally assumed to be normally distributed as $\varepsilon_{ij} \sim N(0, \sigma^2)$, but extensions accommodate heteroscedasticity, such as $\sigma^2 = \sigma_i^2 \cdot g(y_{ij})$, where $g(\cdot)$ is a function (e.g., a power function) that scales variance with the response magnitude to better fit data with increasing variability. This hierarchical setup ensures that the model flexibly incorporates both fixed population parameters and random inter-subject heterogeneity while maintaining the nonlinear nature of the mean response.

Fixed and Random Effects

In nonlinear mixed-effects models, fixed effects, denoted as $\boldsymbol{\beta}$, represent population-level parameters that describe the typical response across all individuals or groups in the study. These parameters are estimated using the entire dataset and capture systematic influences, such as covariates or baseline characteristics, that apply uniformly to the population. For instance, in a growth curve model, a fixed effect $\beta_1$ might parameterize the baseline growth rate, providing an estimate of the typical trajectory without accounting for individual deviations.

Random effects, denoted as $\mathbf{b}_i$ for the $i$-th subject, model subject-specific deviations from the fixed effects, thereby accommodating inter-individual variability in the nonlinear response. These effects are typically assumed to follow a multivariate normal distribution, $\mathbf{b}_i \sim N(\mathbf{0}, \mathbf{\Omega})$, where $\mathbf{\Omega}$ is the covariance matrix of the random effects that quantifies the variances and correlations among them. By incorporating random effects, the model allows for heterogeneity, such as varying growth rates among individuals, while shrinking extreme estimates toward the population mean to improve inference, especially with sparse data per subject.

The parameterization of random effects in nonlinear mixed-effects models offers flexibility through additive or multiplicative (e.g., allometric) forms. In the additive parameterization, random effects are added directly to the fixed effects inside the nonlinear function, as in $f(\mathbf{x}_{ij}, \boldsymbol{\beta} + \mathbf{b}_i)$, which is suitable for symmetric variability around the population mean. Alternatively, the multiplicative form scales the fixed effects, often implemented as $\alpha_i = \alpha \exp(b_i)$ to ensure positivity, as in $f(\mathbf{x}_{ij}, \boldsymbol{\beta} \odot \exp(\mathbf{b}_i))$, which is common in pharmacokinetic or growth models to model proportional inter-individual differences. The choice depends on the scientific context, with multiplicative forms preferred when variability scales with the response magnitude.

The covariance matrix $\mathbf{\Omega}$ specifies the dependence structure among random effects, enabling modeling of dependencies such as those between intercepts and slopes. A diagonal $\mathbf{\Omega}$ assumes uncorrelated random effects, simplifying estimation but potentially overlooking covariation, as in independent deviations for baseline and rate parameters. In contrast, a full unstructured $\mathbf{\Omega}$ allows for correlations, capturing phenomena like faster-growing individuals having higher baselines, which enhances model fit for clustered or longitudinal data but increases computational demands and requires sufficient data to estimate off-diagonal elements reliably.

Identifiability challenges arise when random effects enter nonlinearly, as multiple combinations of fixed and random parameters may yield similar likelihoods, leading to overparameterization. Constraints, such as fixing certain variances to zero or imposing bounds on parameters, are often necessary to ensure unique estimates, particularly in models with few observations per subject or complex nonlinearities. Practical assessments, like evaluating shrinkage in empirical Bayes estimates, help detect unidentifiability, where high shrinkage (>30%) indicates poor separation of fixed and random components.
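The shrinkage check mentioned above can be sketched as follows. The text does not define shrinkage, so this assumes the definition commonly used in pharmacometrics, one minus the ratio of the standard deviation of the empirical Bayes estimates (EBEs) to the estimated random-effect standard deviation. The function name and the example values are hypothetical.

```python
import numpy as np

def eta_shrinkage(ebe, omega_sd):
    """Shrinkage of empirical Bayes estimates for one random effect.

    Assumed definition: 1 - SD(EBE) / omega_sd, where omega_sd is the
    model-estimated standard deviation of that random effect.
    """
    ebe = np.asarray(ebe, dtype=float)
    return 1.0 - np.std(ebe, ddof=1) / omega_sd

# Hypothetical EBEs for eight subjects and an estimated omega of 0.30.
ebe = np.array([0.05, -0.10, 0.12, -0.02, 0.08, -0.15, 0.01, 0.03])
shrinkage = eta_shrinkage(ebe, omega_sd=0.30)
print(f"shrinkage = {shrinkage:.0%}, above 30% threshold: {shrinkage > 0.30}")
```

A value near zero means the individual estimates spread out about as much as the model says they should, while a value near one means they have collapsed toward the population mean because the data per subject carry little information.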

Likelihood Function

In nonlinear mixed-effects models, parameter estimation relies on the marginal likelihood, which marginalizes over the random effects to facilitate population-level inference rather than individual-specific predictions. This approach treats the random effects as unobserved quantities, integrating them out to focus on the fixed effects and variance components. The marginal likelihood is defined as

$$L(\psi) = \int L(y \mid b; \beta) \, p(b \mid \Omega) \, db,$$

where $\psi = (\beta, \Omega, \sigma^2)$ collects the fixed-effects parameters $\beta$, the random-effects covariance matrix $\Omega$, and the residual variance $\sigma^2$; $y$ denotes the observed data; $L(y \mid b; \beta)$ is the conditional likelihood of the data given the random effects $b$; and $p(b \mid \Omega)$ is the density of the random effects, typically assumed multivariate normal with zero mean and covariance $\Omega$. Assuming independent normally distributed residuals with variance $\sigma^2$ and normally distributed random effects, the conditional likelihood for data from $N$ individuals is

$$L(y \mid b; \beta) = \prod_{i=1}^N \prod_{j=1}^{n_i} \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( -\frac{(y_{ij} - f(x_{ij}; \beta, b_i))^2}{2\sigma^2} \right).$$
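Because the integral over the random effects rarely has a closed form, it is usually approximated numerically. The sketch below is one simple possibility, not the method of any particular software package: it assumes a single scalar random effect per subject with standard deviation `omega` and approximates each subject's contribution with non-adaptive Gauss-Hermite quadrature. The function names, the exponential-decay mean function, and the data are hypothetical.

```python
import numpy as np

def subject_marginal_loglik(y_i, x_i, f, beta, omega, sigma, n_nodes=40):
    """Log of the integral of prod_j N(y_ij; f(x_ij, beta, b), sigma^2) * N(b; 0, omega^2) db."""
    z, w = np.polynomial.hermite.hermgauss(n_nodes)  # nodes/weights for weight exp(-z^2)
    b_nodes = np.sqrt(2.0) * omega * z               # change of variables b = sqrt(2) * omega * z
    log_terms = np.empty(n_nodes)
    for k, b in enumerate(b_nodes):
        resid = y_i - f(x_i, beta, b)
        # Conditional Gaussian log-likelihood of this subject's data given b.
        log_cond = (-0.5 * np.sum(resid ** 2) / sigma ** 2
                    - 0.5 * len(y_i) * np.log(2.0 * np.pi * sigma ** 2))
        log_terms[k] = log_cond + np.log(w[k])
    m = log_terms.max()                              # log-sum-exp for numerical stability
    return m + np.log(np.sum(np.exp(log_terms - m))) - 0.5 * np.log(np.pi)

def exp_decay(x, beta, b):
    """Hypothetical mean function: amplitude beta[0], subject-specific rate beta[1] * exp(b)."""
    return beta[0] * np.exp(-beta[1] * np.exp(b) * x)

x = np.array([0.5, 1.0, 2.0, 4.0])
data = [(np.array([7.9, 6.1, 3.8, 1.3]), x), (np.array([8.3, 7.0, 4.9, 2.6]), x)]
# Subjects are independent, so the marginal log-likelihood is a sum over subjects.
total = sum(subject_marginal_loglik(y_i, x_i, exp_decay, beta=(10.0, 0.4),
                                    omega=0.3, sigma=0.5) for y_i, x_i in data)
print(total)
```

Fixed-node quadrature of this kind can lose accuracy when each subject has many observations or the random-effect variance is large relative to the residual variance; adaptive quadrature or the Laplace approximation are common alternatives in that situation.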