## Abstract

Neuroendocrine systems control many of the most fundamental physiological processes, e.g., reproduction, growth, adaptations to stress, and metabolism. Each such system involves the hypothalamus, the pituitary, and a specific target gland or organ. In the quantification of the interactions among these components, biostatistical modeling has played an important role. In the present article, five key challenges to an understanding of the interactions of these systems are illustrated and discussed critically.

Physiological experiments beginning in the 1970s demonstrated that secretion of the majority of hormones occurs in a pulsatile manner. Understanding exactly how pulsatility arises, and how it is propagated in the patterns of downstream products, has proven to be very difficult. Early investigators recognized the existence of patterns of peaks and troughs in the concentration profiles of various hormones, and were able to determine that they were not simply the result of random assay variability. A major subsequent insight was that, to elucidate the significance of such patterns, one needed to determine the specific individual contributions of both secretion and elimination, inasmuch as different combinations of the two (e.g., a faster vs. a slower rate of elimination, as well as greater or lesser secretion) could produce vastly different patterns. Summary accounts of this historical evolution of hormone modeling are reviewed in Refs. 5, 8, 54, 66. Significant progress has been achieved in modeling hypothalamic-pituitary interactions, and, as a result, the challenges in quantification have become more precisely defined.

With the general physiologist as the reader, we present five key challenges to neuroendocrine modeling, while introducing (in general terms) biostatistical methods that have been developed in that context, and which are reflective of modern statistical approaches. Statistical modeling in the past had emphasized the explainability of the data, with the predictability of future data being downplayed. In today's machine-learning approach to modeling, predictability is the overriding theme; moreover, the number of parameters may be of the same order as (or larger than) the number of observations. What is viewed as more important than the number of parameters that are theoretically justified is that the problem be appropriately *regularized* (made precise below). The age-old adage that a small number of parameters is always better is not necessarily always true.

## The Neuroendocrine Systems of Reproduction, Stress, and Growth

To illustrate the core structure of neuroendocrine systems in broad terms, **FIGURE 1, TOP**, schematizes three fundamental neuroendocrine systems that are responsible for the regulation of reproduction (male, female reproductive hormone axes), physiological adaptations to stress (the stress axis), and body growth [the growth hormone (GH) axis]. For the reproductive hormone axes, for simplicity, we will focus primarily on the male system. Data from the three systems will be used as illustrations throughout the discussion. In general, the desired physiological structures comprise unobserved secretion rates, neurohoromone disappearance kinetics, time-evolving blood (serum) hormone concentrations, as well as implicit feedback and feed-forward regulation among the hormones. How to optimally recover (i.e., estimate) the elements of this overall structure, as discussed below, constitutes the basic statistical question.

The primary output of the male reproductive hormone axis is gonadal testosterone (Te), which promotes a male phenotype, along with the pituitary hormone, luteinizing hormone (LH), and upstream gonadotropin-releasing hormone (GnRH) acting as driving intermediaries and feedback targets. The active output of the stress-adaptive axis is cortisol, which is synthesized in the adrenal glands, with the pituitary hormone adrenocorticotropin hormone (ACTH) and its proximal regulators [arginine vasopressin, AVP, and corticotropin-releasing hormone (CRH)] serving as feedforward (stimulatory) and feedback-targeted intermediaries. Cortisol mediates much of the body's responsiveness to physiological stress and is a primary catabolic hormone. Biologically active outputs of the GH axis are the pituitary hormone, GH, and peripheral tissue insulin-like growth factor-1 (IGF-1), the latter synthesized primarily but not exclusively in the liver. Te, along with GH and IGF-1, are important anabolic hormones, whose reduction or loss may contribute to age-related frailty and disability (8, 62, 66).

The reproductive hormone axes are under central control of primary feedforward hypothalamic peptides, GnRH, and its proximal agonist, kisspeptin, and proximal antagonist, gonadotropin-inhibiting hormone (GnIH). In the stress-adaptive axis, the two hypothalamic peptides, CRH and AVP, act not only individually but also synergistically. In the growth axis, two hypothalamic peptides, growth hormone-releasing hormone (GHRH) and somatostatin (SRIH), act individually as well as in opposition. There are multiple interactions among the three axes, e.g., Te (after aromatization) stimulates GH via the estrogen receptor, which actively upregulates the expression of GHRH and also inhibits hepatic IGF-1 production.

The principal hypothalamic and pituitary hormones are peptides, released by exocytosis, in a pulsatile manner (the result of intermittent signaling by other biomolecules). Downstream adrenal and testicular steroids are released in a more nearly continuous manner, but, being driven by pituitary pulses, they inherit a pulsatile pattern. There are hence three key aspects to modeling a single hormone concentration profile: *1*) pulse detection; *2*) secretion (pulsatile and nonpulsatile, or nearly continuous basal secretion) and elimination (kinetics) estimation; and *3*) random error in the measurement and sampling systems, as well as potentially random variations in the hormone output process. After the formulation of the key challenges, we review the statistical approaches for these issues. There is a more than four-decade history of sundry proposed methods by which to estimate various facets of hormone dynamics. Many early methods were empirically contrived algorithms, motivated by heuristics, without underlying formalism (5, 7, 39, 42–46, 53). More recent methods are based on a statistical framework from which to test physiological hypotheses with varying degrees of specificity (1, 4, 26–29).

A key limiting issue in neuroendocrine modeling is the unobservability of regulatory hypothalamic neuropeptides (**FIGURE 1, TOP**), which, with current technology, remain largely unmeasurable (49, 66, 72). In particular, hypothalamic molecules are released into hypophyseal blood vessels connecting the hypothalamus and pituitary; the volume of the vessels is on the order of 0.1–0.5 ml in humans. What drives diffusion out of hypophyseal vessels into pituitary tissue is local neurohormone concentration and not mass, and because of the small blood volume, the concentration can be quite high, even though the mass is quite small. By the time hypothalamic molecules reach systemic blood (with a volume of 3.5–5 liters in a 70-kg adult), they have undergone a 10,000-fold dilution, including with peripherally (pancreas, intestine, or other tissues) secreted counterparts, thus impeding accurate assay of hypothalamic specific moieties (6, 49). Assuming that overall dynamics of the three neuroendocrine axes are driven by regulated hypothalamic signaling, their unobservability complicates quantitation of interactive feedback-dependent regulation. By way of illustration, **FIGURE 1, BOTTOM**, presents plots of hormone concentration profiles in peripheral blood from the three systems. For each of the three neuroendocrine systems, the hormone concentrations were all sampled every 10 min for 24 h. The first column depicts measurable LH and Te concentrations in young (blue) and older (red) men. The second column shows ACTH concentrations for a normal subject and a subject with Cushing's disease (a syndrome due to an ACTH-secreting tumor). The third column displays GH concentrations in a normal subject and a patient with acromegaly due to a GH-secreting pituitary tumor. The data in **FIGURE 1, BOTTOM**, are analyzed and illustrated throughout the paper.

## Challenges to Understanding Hypothalamo-Pituitary Interaction

Within the above context, key challenges to quantifying neurohormone dynamics in this field include the following.

*1*) The unobservability of the hypothalamic peptides has made it difficult to estimate the feedback and feedforward relationships that involve the unobserved secretion changes. A key physiological and mathematical question is whether it is possible, noninvasively, to predict the unobserved hypothalamic factor levels based on their downstream effects.

*2*) In all biological systems, there is inherent biological variation, e.g., variable and repeated desensitization, recovery and upregulation of receptor-based mechanisms, and potentially limiting boundaries such as saturation of enzymes. Unless one can build in allowances for these variations, it is very difficult to ascertain relationships, because any given relationship at one point in time could appear quite different at another. Modeling this level of randomness is a formidable challenge.

*3*) The three systems of **FIGURE 1** do not function independently, but rather their target-gland products feed back on the hypothalamic factors and pituitary cells in the different axes, as well as within the same (autologous) axis. Modeling a given axis in isolation would thus restrict explainability due to concomitant modulation by other axes.

*4*) All three of the above axes are affected, to various degrees, by circadian (nearly 24-h) rhythms, mediated via inputs from the pineal gland (melatonin) and the suprachiasmatic nucleus (SCN) of the hypothalamus. These rhythms are not simple sinusoidal functions but have more complex periodicities.

*5*) For certain hormones (e.g., GH), the manner of pulse onset is not necessarily that of initially rapid release but may rather be highly variable, requiring an allowance for significant pulse-by-pulse variability in the waveform of initial and continuing release. This complexity in the underlying modeled process requires that one accommodate a large number of parameters. Valid and precise estimation of such large parameter models is at the heart of modern statistics (and machine learning).

Given this background, we next describe, by example, the process by which one creates probability models and statistical methods to estimate the entities of interest in **FIGURE 1**, e.g., total daily mass of hormone secreted, the numbers, shapes, and locations of pulse times and their regularity, average mass secreted per pulse, the rates of elimination and a basal (baseline) rate of secretion, as well as measurement and hormone-system error (unexplained variability). Then, in the *Discussion* section, using this statistical framework, we are able to further define and respond to the above five challenges.

## A Model of Hormone Concentration

Statistical modeling in physiology (and science, generally) has historically been motivated by the paradigm: data = structure (or fit) + error. One cannot model complex data while making simple assumptions as to both the structure and the error, since what is left out of either one must be represented in the other. This introduces parameter dimensions and error correlations that markedly increase modeling difficulty. Hence, it is important to accurately model both the structure (fit term) of the process as well as inherent error. An application of these concepts and caveats to physiological processes evolving in time is the expression: *Y*(*t*) = *X*(*t*) + ε(*t*), 0 ≤ *t* ≤ *T*. Here, a critical achievement is to have the model (*X*) accurately describe the known biology (29, 36, 42, 47, 50, 57, 73, 74). In the simplest model of a system, there is only decay, and the solution is exponential. However, when there is both removal and addition simultaneously, the solution is expressed as a convolution (an integral or sum), one of the most fundamental constructions in science and engineering; reversing the process, determining the time-varying addition, having observed only the combination of simultaneous removal and addition, is known as deconvolution. In endocrinology, the equation that is central to the statistical modeling of any given blood-borne biomolecule is a time-evolving convolution. Let *Z*(*t*) denote the time-varying secretion rate of a given biomolecule into the blood at instantaneous time *t*; the units are mass per volume per unit time. Let *X*(*t*) denote the concentration (mass/volume) at time *t*, and, therefore, *X*(0) denote the concentration at *time 0*. At each time *t*, molecules added at a preceding time *s*, exit the blood, with a fraction [*E*(*t* − *s*)] remaining, where *E* represents an elimination function. The convolution is given as:
(1)

The momentary hormone concentration at any given time is therefore a balance of molecules being removed and new molecules being added. What the experimental physiologist observes is a time sampling of this process along with measurement error.

Molecules that exit the blood enter tissue fluid spaces, and then bind cellular receptors, are taken up by cells, exit the body via kidney or liver excretion, or return to the blood via capillaries or lymphatics. Moreover, there is advection and diffusion of molecules within the blood. (Advection is the spatial derivative, i.e., rate of blood flow, and diffusion is the random dispersion of molecules over time.) New additions of molecules to the circulation by basal secretion or pulsatile secretion repeatedly but briefly perturb equilibrium. The blood volume circulates through the body at 3–5 l/min, so that (roughly) blood passes (or cycles) through the entire circulation in approximately a minute. Hence, the mixing in the heart, lungs, and other organs occurs quite quickly, so that after just a few cycles there is a new equilibration. In this framework, a simple model of the time-varying concentration would be d*X*(*t*)/d*t* = −α*X*(*t*) + *Z*(*t*), which results in an exponential elimination function *E*(*t*) = *e*^{−αt}.

In Refs. 26, 36, we showed that a single exponential model tended to underestimate elimination and overestimate secretion. In general, for a large organism like the human, there needs to be two elimination rates (a bi-exponential): a fast rate and a slow rate. The fast rate α^{(1)} represents combined advection, diffusion, and mixing effects that the molecules undergo immediately upon secretion, when molecules diffuse, flow more or less linearly, and become well mixed. The slow rate α^{(2)} represents irreversible elimination from the blood by metabolic conversion, catabolism, or excretion outside the body. Thus the elimination function becomes *E*(*t*) = *ae*^{−α(1)t} + (1 − *a*)*e*^{−α(2)t}, where *a* represents the fractional amplitude of the fast term. The physiologist would often wish the rates of elimination α^{(1)} and α^{(2)} to be estimated. This is made easier if their fractions (*a* and 1 − *a*) are taken as literature-based population values. *Equation 1*, with this elimination function, can be shown to be equivalent to a two-component model, with the two different elimination rates corresponding to the two rate constants. Thus the general model for the concentrations is:
(2)

Viewed mathematically, each model is uniquely characterized by a parameter θ, the collection of parameters being the parameter set Θ. For each statistical parameter, there is a probability model *p*(A|θ), where the set *A* varies over all the possible data, describing how probable is each. Thus, *X*(*t*) in *Eqs. 1* and *2* now depends on a parameter θ. Then, what the physiologist observes are measured hormone concentrations with sampling and laboratory error (IID normal, mean zero, variance σ_{ε}^{2}):
(3)
(4)

## Overview of Statistical Methods Utilized in Endocrinology

The methods described in this section for modeling hormone concentration data provide a framework for broader statistical approaches that are more generally applicable. These are *1*) the likelihood approach, with maximum likelihood estimation (MLE); *2*) penalized likelihood and penalized MLE; *3*) Bayesian estimation, in which probability calculations are adapted to incorporate the addition of new observed data; and *4*) regularization formulations, where the parameters can be of very high dimension.

### Likelihood Approach

Once physiological data, e.g., *Y*_{1}, *Y*_{2}, . . . , *Y*_{n}, are obtained over time, then the mathematical question becomes which of the estimated models most likely produced the data? Beginning in the early 20th century, a general approach was developed, based on probability models, wherein the data are viewed as fixed, and the parameter θ varies. For clarity, given the observed data, the model-probability function is called the likelihood function:

The MLE is that value of θ(θ̂_{MLE}) at which the global statistical maximum occurs. Although optimization theory could be formulated by way of either maxima or minima, the convention (historically) is minimization. Hence, rather than maximizing *L*, one minimizes:
(5)

Standard errors of parameters θ̂_{MLE} can then be calculated, based on second derivatives of *l*.

### Penalized Likelihood

A penalty term can be appended to the *l* = −log *L*(θ*|data*) function (*Eq. 5*) to account for certain factors, which are otherwise difficult to incorporate as model parameters. Generally, these are simple functions of θ, such as the number of parameters. In this regard, if *n* is the number of observations in the data, two standard penalty terms are the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) to penalize the number of parameters (*k* = dimension of θ):
(6)

The AIC is more commonly used, in that the BIC is often overly restrictive (overly limiting parameters). In hormonal modeling (below), we have used the AIC to penalize the number of pulse times needed to describe the data.

### Bayesian Approach

The Bayesian approach presents a different interpretation of the models, distinct from that of the likelihood approach. In Bayesian constructs, knowledge of the model parameters, before and after acquiring the data, is described probabilistically. That is, before observing the data, the a priori knowledge is described by a prior probability density on Θ: p(θ), θ ∈ Θ. As in the likelihood approach, the probability models describe what might potentially be observed. Once the data are observed, Bayes' Rule tells one how to combine the likelihood function and the prior (before the fact) probabilities to acquire the posterior (after the fact) probabilities: (7)

On the general grounds of statistical decision theory, the Bayesian approach is optimal (in a certain sense). Its limitation to being broadly applied, in the past, was due to certain difficulties in its practical implementation, such as the computational burden in calculating the above normalizing constant (*C*), when the dimension of θ (*m*) was even slightly large or when the form of the posterior density did not fit into a very simple prespecified form.

In most realistic settings, the relationships among the parameter components described by the posterior density were complicated and did not lend themselves to analytical simplicity. These major barriers were overcome in the early 1980s with the development of the methods of *Stochastic Relaxation* and *Gibbs Sampling* (10, 11), and today *Markov Chain Monte Carlo* (MCMC) methods (9, 14). Contemporary algorithms of this sort are relatively simple to implement, and when run for a sufficient number of algorithmic iterations (time steps), produce one random (*m*-dimensional) observation from the posterior distribution of θ. Repeating this process a large number of times (often, by parallel processing) approximates the posterior distribution quite well. This powerful approach has become the most commonly applied statistical method; e.g., virtually all statistical genetic paradigms utilize this approach. A common calculation from the posterior density is the maximum a posteriori (MAP), which is the mode. Taking (minus) log of both sides of *Eq. 7* above, one sees somewhat of the relationship between the MAP and the MLE

### Statistical Estimation via Regularization

Machine learning and statistical learning began in the 1960s, with important applications to computer vision and image processing. Regularization theory has existed even longer as part of applied mathematics. It allows one to consider the parameter to be not finite dimensional but more general, e.g., an unknown curve.

An early contribution toward this approach was due to Tikonov (Tikonov Regularization), in which θ is a smooth function (of time or space), and ρ is the integral of its squared derivative (hence it penalizes functions that change too rapidly). The constant λ is the regularization parameter, which defines the balance between the data fit and the emphasis on regularization. The statistical methodology of cross-validation was originally invented for the purpose of estimating an appropriate value for λ, providing a strong framework for regularization procedures.

## Construction of the Likelihood Function for Hormone Data

### Modeling the Secretion Rate Z

In *Eq. 2*, even if the elimination rates were known, solving for *Z*(*t*), *t* ≥ 0, a procedure known as deconvolution, would still be an ill-posed nonunique-solutions problem unless there were complete specification of the secretion rate, *Z*(*t*). Physiologically relevant constraints on the form of *Z*(*t*) must be imposed for the resulting statistical methods to produce accurate estimates of secretion. This is the essence of the above regularization formulation. The simplest regularization imposition is to restrict the unknown function *Z*(*s*) to be defined by a finite number of parameters. In turn, the number of parameters may be allowed to grow slowly as a function of the number of observations, *n* (the method of sieves), although the manner in which to do so is still unresolved.

In such a finite-parameter approach, one could assume that there is a constant, continuous basal secretion rate, β_{0}, and a superimposed nonbasal, pulsatile secretion rate. There is a pulse time set, *T*_{m} = [*T*^{(1)}, *T*^{(2)}, . . . , *T*^{(m−1)}, *T*^{(m)}], whose times are not directly observed and must be estimated, wherein the number of pulse times (*m*) is unknown. At each pulse time *T*^{(k)}, a secretory burst occurs with mass, *M*^{(k)}. In vitro pituitary, adrenal, and testis perfusion data along with in vivo selective venous sampling data of the same organs show skewed secretion profiles, marked by an initially rapid increase, a maximum, and a subsequently slow decrease in instantaneous secretion rates (8, 54, 62, 66). The time course of the mass of hormone that is released is represented by a waveform (mass/unit volume/unit time), which is well modeled by the flexible three-parameter Gamma density, whose parameters are to be estimated from the data:

In this construction of pulsatility, the overall resulting secretion rate is then (8)

### Allowing for Circadian Rhythm Modulation of Secretion

An important question in hormone modeling is how to account for the circadian (or other longer-term) rhythm modulation underlying the data. In **FIGURE 1**, although not included for simplicity, circadian and nychthemeral (light/dark) rhythms influence unobserved GnRH, CRH, and GHRH secretion. In one approach to this issue, the secretion model is modified to allow day-night differences in the waveform, one for day (D) and one for night (N). Daytime (to be estimated) is defined by the interval [ϕ_{1}, ϕ_{2}]. If the secretory-burst waveform is given by a psi function at pulse time *T*^{(k)}, which depends on whether the pulse time occurs during the day or the night, then the day/night waveform model becomes:

Statistical methods exist for estimating circadian rhythms of this type, as presented in Refs. 15, 53.

### Constructing Putative Pulse Time Sets

Since the secretion rate function, *Z*, is predicated on burst-like hormone release occurring at certain (unobserved) pulse times, valid estimation of pulse times is strongly intertwined in the overall deconvolution problem. If one did not get the pulse times right, estimates of pulse waveform, mass, basal secretion, and elimination kinetics could be highly distorted. For a given hormone concentration profile, in the present approach by the authors, a particular pulse-time detection algorithm is applied (4, 36). To illustrate the analysis, **FIGURE 2, A AND B**, depict results of the algorithm for older male LH concentration time series (**FIGURE 2A**) and younger male Te concentration time series (**FIGURE 2B**). The detection algorithm is a nonlinear diffusion equation that selectively smoothes the time series, wherein the diffusion coefficient is inversely related to the degree of positivity (upward movement of the hormone concentrations) of the gradient. One begins with all local minima; those of rapid increase are smoothed minimally, whereas those of little increase are smoothed maximally. To visualize the outcomes of this technique, **FIGURE 2, A AND B, TOP**, display the concentrations, with all local minima as possible pulse onsets given as red asterisks. As the algorithm proceeds, a decreasing sequence of potential pulse-onset sets is constructed, reflecting the de facto degree of ease with which pulse times were individually smoothed away. The actual progression of points being removed is shown in the second plot. The putative pulse-time sets are constructed thereby as those that remain successively in the stepwise process: *T*_{m} = [*T*^{(1)}, . . . , *T*^{(m−1)}, *T*^{(m)}], *M*_{1} ≤ *m* ≤ *M*_{2}. The *M* = *M*_{2} − *M*_{1} sets (*T*_{m}) so obtained in this example are displayed as rows of the third subplot of **FIGURE 2, A AND B**. The succeeding plots give the pulse times that remain sequentially along with the concentrations as the algorithm proceeds. In **FIGURE 2**, two examples of the determination of a collection of putative pulse time sets were presented; such a construction is key to accurate recovery of the unobserved secretion rate. The mathematical details of the pulse detection algorithm are given in Refs. 4, 36. The concept was motivated by diffusion-equation models used in computer vision, wherein the goal is to identify boundaries of objects. In particular, a rapid intensity change across a visual boundary is very similar to the onset of a pulse. A comparison of multiple other earlier methods is presented in Ref. 40.

### Modeling the Secretory Masses

The model construction illustrated so far yields a large number of parameters (the number and locations of the pulse times, the masses, the waveform variables, basal secretion, and two elimination rates): 2*m* + 7. One reasonable idea for reducing this number is to view secretory-burst mass as the net result of slow accumulation of granules to be released via exocytosis (the result of the pulse onset), such that the accumulated mass is a linear function of the preceding interpulse interval length, [*T*^{(k)} − *T*^{(k − 1)}], and a random effect [*A*^{(k)}]:

Here, *m* random effects are assumed to be IID normal with mean zero and variance σ_{A}^{2}. At an intuitive level, a linear increase in mass accumulation over time (as granules accumulate for exocytosis) is one of several models that one might infer. There may be multiple biological variations that operate on or in addition to this constant accumulation rate due to granule depletion, secretory potentiation, and/or desensitization. The added random effect, as part of the secretory-burst mass, allows for such variations in burst size that are not explicitly modeled otherwise. Under this reasoning, the secretion model is thus:
(9)

Because the random effects (*A*) in the secretion rate (*Z*) enter the model linearly, the resulting likelihood for the observed concentrations (*Y*) is Gaussian. From *Eqs. 2*, *3*, and *9*, with minor algebraic rearrangements, one obtains the resulting penalized (−log) likelihood:
(10)
(11)

Notably, at this stage of the modeling, the finite-dimensional parameter θ in *Eq. 11* replaces the hard-to-implement infinite dimensional θ in *Eq. 4*.

### Penalized MLE of the Parameter θ

An Akaike Information Criterion (AIC) penalty term was appended in *Eq. 10* above, penalizing the number of pulse times, *m*. Additionally, the solution requires constrained optimization, in that any estimate, θ̂, must result in nonnegative *Z*; i.e., the conditional expectation of the secretion rate evaluated at θ̂ must be nonnegative: *E*θ̂[*Z*(*t*_{i}), *i* = 1, . . . , *n*|*Y*_{i}, *i* = 1, . . . , *n*] ≥ 0. The results of the penalized maximum likelihood estimation (penalized MLE) are θ̂, m̂, and the AIC-optimal pulse-time set *T*_{m̂} = [*T*^{(1)}, . . . , *T*^{(m̂−1)}, *T*^{(m̂)}].

One question that arises is whether one or more of the *M* − *m̂* removed pulse times should be added back to the set *T*_{m̂}, obtaining thereby a superior model? Because the pulse sets *T*_{m} were defined by successive removal of single time points, the pulse times added back could produce pulse-onset time sets different from, and possibly better than, the original *T*_{m} sets. Accordingly, the add-back procedure continues only so long as the potential add-back points enhance the fit based on the AIC penalty.

By way of model outcomes, **FIGURE 3, A AND B**, displays final results for penalized maximum likelihood estimation (PMLE) in the young male LH and Te (**FIGURE 3A**) and the older male LH and Te (FIGURE B) 24-h time series. The time axis identifies the resulting estimated pulse times. Here, for each concentration profile, and for each of its possible pulse time sets as optimized by add-back, one finds the best secretion and elimination parameter values, and penalizes the likelihood by the number of pulses therein allowed. Once the secretion rates are estimated, as in **FIGURE 3**, one can then calculate such entities of interest, as total daily basal and pulsatile secretion, and mass per pulse. (**FIGURE 3, C AND D**, are discussed below.) This model has been widely applied to modeling pituitary hormones (20, 22, 24, 30, 31, 64, 67).

### Estimating the Pulse Time Structure (Frequency and Regularity)

In the previous sections, pulse-onset times were estimated, but no inferences were drawn as to underlying structure of the pulsing mechanism. In the axes considered here, pulsing originates in the hypothalamus, the result of synchronization in the firing of a large network of neurons, all secreting the same neurotransmitter. Generally, such concerted firing occurs on the order of 10–40 times/day. The given neuropeptide is released in the hypophyseal vessels leading to the pituitary. The pituitary inherits this pulsing, which is the only direct evidence of hypothalamic pulsatility under clinical conditions. The frequency and regularity of pulsing can be fundamental to the resulting dynamics, e.g., in the premenopausal female GnRH-LH axis, increased GnRH pulse frequency is important for ovulation to occur; even though the mass of each pulse may be small, the rapid firing produces a large cumulative mass. In previous work, the interpulse intervals IPI_{k} = *T*^{(k+1)} − *T*^{(k)}, *k* = 1, . . . , *m* − 1 (where there are *m* pulse times) have been modeled as a Weibull renewal process. That is, the IPI_{k} is assumed to be IID from a Weibull distribution. There are two parameters in the Weibull (both positive values): pulse frequency λ and regularity γ. The larger the value of γ (above 1), the more regular the pulse times, eventually becoming equally spaced (and nonrandom). The Poisson process is a subcategory of this process, wherein γ = 1. As a general statement, hypothalamo-pituitary pulsing is more regular than a Poisson. Indeed, regularity similar to the Poisson may indicate a loss of physiological regulation as observed in older men for LH dynamics.

**FIGURE 4, A AND B**, illustrates the Weibull concept, showing the results of stepwise pulse detection (**FIGURE 4A**) and then penalized MLE estimation in the case of ACTH concentrations (**FIGURE 4B**) in a Cushing's disease subject. The first three subpanels on the *right* are the model fit to the concentrations, the estimated secretion rates, and the day and night ψ functions, which describe a circadian rhythm as modifying mass release. The fourth subpanel exhibits the corresponding estimate of the Weibull fit to the IPI_{k} (the asterisks on the *x*-axis). The statistical theory for modeling interpulse intervals via renewal processes is given in Refs. 4, 12, 26, 36, 40, 66, 69.

### A Bayesian Formulation of Hormone Kinetics and Secretion Estimation

The statistical formulations for estimating hormone kinetic and secretion were given in the previous sections as maximum likelihood (MLE) or penalized MLE methods, in which the penalty term applied to the number of allowable pulse times. An alternative approach to estimation, which has many favorable properties, is a Bayesian approach. In this formulation, the parameters are viewed as being selected by a probability distribution, hence as a random variable. Before the data are observed (before the experiment is performed), a prior probability distribution describes the a priori knowledge obtained potentially from other experiments and/or observations concerning the probability model (i.e., the model parameters). Once the data are observed, one can calculate the likelihood of a given model based on such prior knowledge and the observed data. Bayes Rule is a formula that describes how to combine the prior knowledge with the information inherent in the observed data, resulting in a posterior probability distribution, which is an updated probability assessment of the model parameters. **FIGURE 4C** illustrates the results of a Bayesian analysis of the Cushing's disease ACTH concentration time series, which differ from normal physiological ACTH release patterns. The posterior distribution is a distribution on the (multiple) parameters jointly, and thus is not easy to plot. The *left* subpanels display the marginal posterior distributions for the various parameters. If one randomly generates parameters (called sampling) from the posterior distribution to create secretion rates *Z*(*t*_{i}) and resulting concentrations *X*(*t*_{i}), *i* = 1, . . . ,*n* (*Eqs. 2* and *9*), then one can undertake sampling repeatedly from the posterior to create a whole collection of secretion curves *Z* and concentration curves *X*. This collection can be viewed capturing much of the secretion/elimination process probability. **FIGURE 4D** presents such plots for pulsatile ACTH data. In the Bayesian formulation, probability statements are made (assuming that the prior and likelihood model are appropriate) as distinguished from statements of confidence (e.g., confidence sets) in a likelihood-based approach. To achieve this end, Markov Chain Monte Carlo (MCMC) simulation has become one of the most effectively applied statistical methods, which are generalized in Refs. 4, 36. In **FIGURE 4**, the key steps of a Bayesian analysis are displayed, with the obtainment of the posterior distributions for the parameters being the overall objective, from which probability statements can then be made.

### An Estimation Approach That Allows for Pulse-By-Pulse Flexibility in Pulse Shape

In the previous sections, whether the statistical approach was likelihood-based (penalized MLE) or Bayesian, the model of hormone secretion assumed that there was one estimable and prototypical secretory-pulse shape (Ψ) for any given sampled hormone time series in any given individual under any given condition of physiological experimentation (or two shapes, as above, when separate versions for day and night are considered). The secretion rate over time within bursts was this one shape, Ψ, scaled larger or smaller by the amount of mass released in that pulse. For many hormones, this model works well, e.g., those proteins whose onset of release (exocytosis) occurs rapidly and then decays more slowly, consistent with a smoothly varying waveform. However, for pulses of hormones (e.g., GH) that often have a slow gradual or a stuttering onset, reproducible identification of potential pulses with allowance for the nonuniform pulse-by-pulse variation in onset or overall shape requires more flexible models.

Motivated by these issues, a more flexible model is currently under investigation: (12)

where the pulse shape ψ_{k} is now allowed to vary pulse-by-pulse. The masses, *M*^{(k)}, *k* = 1, . . . , *m*, are estimated as parameters. As one could expect, the number of parameters can become quite large, relative to the amount of data. If there are *m* (potential) pulse times, then there are, in addition, *m* masses, 3*m* pulse shape parameters, a basal rate, and an error standard deviation: 5*m* + 2 parameters. Fifty years ago, a large dimensional parameter space was often viewed as placing too much emphasis on explaining the data, as opposed to predictability of the model. Subsequent developments in machine learning and statistical learning have shown that the dimension is not necessarily the pivotal factor but, rather, attaining an appropriate degree of regularization, such that the allowable values of the parameters are constrained locally and globally. In **FIGURE 5**, this tentative model is applied to the two GH concentration profiles in **FIGURE 1**. One can visualize that the GH-normal concentration time series is quite different from the others in the figure (**FIGURE 1**).

To accommodate greater pulse-by-pulse flexibility, the identification of putative pulse times cannot be determined exclusively by the degree of rapid increase in concentration but also by other local and global information. For example, at a given local minimum, one might use the information of the change in height from that minimum to the next local maximum; the greater that value (which is global in nature), the stronger the evidence of the local minimum's being a pulse-onset time. Furthering the degree of an increase or a decrease, relative to preceding changes, can indicate whether (allowing for randomness) there must have been an increase in secretion. A pulse-detection procedure built on these and other criteria has been developed. Results from this enhanced-flexibility model are displayed in **FIGURE 5** for the GH profiles (**FIGURE 5, A AND B**, normal; **FIGURE 5, C AND D**, GH-pituitary tumor). Examples of the statistical modeling of GH also appear in Refs. 58–61, 68.

### Modeling Pituitary Feedforward on Target-Gland Secretion (Allowing for Hysteresis)

Whereas feedforward concepts are applicable to any of the axes of **FIGURE 1**, here we focus on the LH-Te relationship for simplicity. To begin to reconstruct feedforward by pituitary LH in each of the young and older males, one may apply the penalized likelihood estimation method to LH concentrations and to Te concentrations, individually. This step yields fitted (estimated or reconvolved) LH concentrations *Ŷ*_{L}_{,}_{i} = 1, . . . , *n*, which form the basis for the LH feedforward signal on Te secretion. Likewise, one obtains the estimated Te secretion rate *Ẑ*_{Te,i}, *i* = 1, . . . , *n*. The next objective is estimation of the dose-response mechanism of (function mediating) LH on Te secretion. Two considerations arise immediately in this context: *1*) there is a varying time delay between the onset of an LH pulse and the Te secretory response, and *2*) there may be variable loss of responsiveness of Te to the LH-feedforward signal. If one hopes to accurately recover the dose-response structure, both of these considerations must be addressed and resolved.

Utilizing the fitted (reconvolved) LH concentrations *Ŷ*_{L,i}, *i* = 1, . . . , *n*, the LH feedforward signal *F*_{L}(*t*) is constructed piecewise from one Te pulse onset time to the next; this allows one to account for the varying time lag between an LH pulse onset and a Te pulse onset. For each Te pulse time *T*_{Te}^{(k)}, the LH pulse *T*_{L}^{(j)} nearest within an allowable time, e.g., (−60, 10) min, is identified. If such an LH pulse exists, then it is shifted to the Te onset time point so that the two onset points are aligned. If no such pulse existed within the time interval, then a time lag of 40 min is applied in this example of LH concentrations, starting at the Te pulse onset time. One may also allow for the possibility that the LH pulse might slightly (10 min) follow the Te pulse, putatively via CNS outflow pathways to the testis. **FIGURE 3, A AND B, TOP**, displays the fitted LH concentrations, which, allowing for time delays, are interpreted as the LH feedforward signal on Te secretion.

The estimation then proceeds by using the calculated Te secretion rate *Ẑ*_{Te,i}(*i* = 1, . . . ,*n*), assumed to be *Z*_{Te}(*t*_{i}) + error. Random effects (*A*) in feedforward (LH) efficacy (maximal drive) are included to accommodate pulse-by-pulse variation. To allow for desensitization (the loss of slope responsiveness) to LH drive, a mid-pulse shift in the slope of the dose response is included. The shift is assumed to occur at time *M*_{LonTe} (to be estimated) following the Te pulse onset. For *M*_{LonTe} time, the response is via one response curve and, after that, for the remainder of the Te response via a slope-shifted curve. Hence, a recycling-like phenomenon occurs, with the system desensitizing within a pulse and then resetting between pulses to the initial curve before the next secretory pulse. In a similar manner, the short-term hyporesponsiveness observed within a LH-Te pulse pair can be modeled by way of a finite right shift of the estimated dose-response curve, viz., a potency shift. We denote this dynamic as hysteresis. Four models are considered: *model 1*, no hysteresis shift; *model 2*, shift in half-maximally effective stimulus concentration (LH potency); *model 3*, shift in dose-response slope (testes sensitivity); and *model 4*, shift in the asymptotic maximum (LH efficacy). For simplicity, we display only *models 1* and *4* below.

*Model 1*: no hysteresis:
(13)

*Model 4*: asymptotic maximum (LH efficacy):

**FIGURE 3, C AND D**, presents LH-Te feedforward estimations for all four of the above models for the young (**FIGURE 3C**) and older (**FIGURE 3D**). The solid blue line is a logistic curve, denoting the estimated initial (before downregulation) mean dose response across all the pulses considered over the 24-h profile; if there is downregulation, the mean dose-response, averaging over all pulses, is the solid red curve. Complementary to the mean values in any given time-series pair, we also present the individual pulse-by-pulse realizations of such curves, each with the allowed random effect in potency, sensitivity or efficacy. That is, for each pulse, the estimated dose-response curve, before downregulation (blue dashed curves) and after downregulation (red dashed curves, the hysteresis effect for *models 2–4*), are plotted. This model is intended to provide greater understanding of dynamic (time-evolving) pituitary feedforward on a target gland in health and disease, as applied to date to LH-Te (16, 19, 27, 32, 34, 35, 41, 48, 63, 70) and to ACTH-cortisol (3, 17, 21, 23, 25, 33, 41, 56).

## Discussion

Each of the five challenges listed at the beginning is important. The solution to *challenge 1*, estimation of unobserved signals operating within a physiological endocrine system, would have the most immediate, direct scientific effect. For certain nonhuman species (horse, sheep), hypothalamic peptides have been measured repeatedly over time, with the output being observed for all three nodes: hypothalamus, pituitary, and target gland (16, 17, 19). Because of the inherent biological variation in the timing and effects of the signaling by and on the different nodes, the relationships are not straightforward. However, recent statistical modeling efforts have detected reproducible overall patterns, with informative and novel physiological insights resulting. In the human, however, hypothalamic signaling is virtually never observed, and one must obtain information on brain-pituitary relationships by circumvention. One approach, when available, is the use of graded doses of highly selective competitive antagonists for the hypothalamic factor that acts on the specific given pituitary cell type. For example, the GnRH receptor on the pituitary gonadotropes is a G-protein-coupled receptor (GPCR). Ganirelix (GRX) is a selective pure GnRH-receptor antagonist with an especially long half-life. GnRH and GRX molecules compete for binding at the GnRH receptor. Under these conditions, pharmacologically defined as competitive antagonism, the GnRH-LH dose-response curve shifts to the right, i.e., there is a potency change. Increasing doses of GRX thus produce a progressive rightward shift of the curve. Exploiting the statistical methods in the preceding sections, and given LH concentrations measured at frequent intervals over multiple hours for each GRX level, one then can estimate LH secretion rates for each GRX dose. If one also observes GRX concentrations in the blood and injects and measures GnRH along with the reduced Te concentrations (due to reduced LH), then one can piece all the above together and make statistical statements about otherwise unobserved endogenous (not injected) GnRH feedforward on LH and Te's negative feedback on GnRH. This approach appeared in Ref. 19. Similarly, one can block the body's production of Te through the use of ketoconazole (an antifungal steroidogenic inhibitor drug) and add back fixed, graded iv infusions of crystalline testosterone. Information about GnRH feedforward on LH could be inferred from such studies. Thus this approach to recovering unobserved hypothalamic dynamics involves perturbation of steady state to multiple, new (temporary) steady states.

Hypothalamic signaling to the pituitary in the stress-adaptive ACTH-adrenal axis and in the GH-IGF-I axis is more complex than that in the male reproductive axis, and fewer precise antagonists are available as yet. Thus the present difficulty level is greater in these cases. Additionally, there are strong dynamic downregulating differences between the stress or GH axes and the GnRH-LH-Te axis. Specifically, constant, physiological-level infusion of GnRH shuts down GnRH-LH drive (this is the basis for various treatments for prostate hyperplasia and cancer), whereas constant supraphysiological infusions of CRH or AVP (stress axis) or GHRH (GH axis) result in high ACTH and GH secretion with sustained pituitary pulses. An additional issue with the stress axis is that both CRH and AVP stimulate ACTH secretion, and together exert a strongly synergistic effect on ACTH secretion. Hence, for a given ACTH pulse, one cannot determine a priori whether it was due to an uninfused CRH pulse, an uninfused AVP pulse, or both. A method has been developed to make such inferences based on infusing either CRH, AVP, both, or neither (four combinations) at a constant supraphysiological level (17, 38). For each combination, the entire system is perturbed to a (temporary) new steady state, for which the effects of secreted CRH and secreted AVP on ACTH secretion are determinable individually and together.

In general, if one does not account for the inherent biological variation that is part of every biological system (*challenge 2*), one cannot extract the general patterns of regularity. Neglect of inherent biological randomness in the model would create inferences/estimates in which such physiological dynamics are hidden or misrepresented. In model-deficient circumstances, dynamic adaptations within the axis over time, such as desensitization and potentiation, could force estimates that appear to contradict the observed time-evolving hormone pattern. In the preceding formulations of statistical models, the inclusion of estimable random effects in *Eqs. 9* and *13* is designed to allow for such inherent variations. The allowance for hysteresis-like behavior is also intended to accommodate the inherent biological variability. From a statistical perspective, in *Eq. 9* (the secretion model for a single hormone) and for the first two models of *Eq. 13* (no hysteresis and a two-efficacy model), estimable random effects enter in a linear manner, and statistical estimation methods can be developed in a straightforward manner. When the necessary random effects enter into the model nonlinearly (the two-potency and two-sensitivity models of *Eq. 7*), no standard approaches are available. The authors have developed several iterative methods, but there are a host of unanswered issues involved. How to allow for inherent biological variability in physiological systems generally, and neuroendocrine systems specifically, remains an enormous, currently incompletely answered challenge.

Estimating interactions among the multiple neuroendocrine systems (*challenge 3*) is perhaps the stated challenge that will be the last to be fulfilled, in part because of the cost and labor intensity of such studies. Until relatively recently, a significant amount of serum (∼0.5 ml) was needed to assay hormone concentrations, with the amount of blood that can safely be withdrawn over a 2-wk period being ∼1 liter. For accurate modeling estimates, one needs to sample at intervals that are significantly shorter than the hormone's half-life, otherwise significant amounts of hormone could be secreted and removed without detection. These limitations have resulted in a conventional pituitary (but not insulin or glucagon) protocol of sampling every 10 min for 24 h (145 samples/day). Sampling over a full day is important to have some assessment of circadian modulation, but multiple days will be important to understand longer-term effects. For the three neuroendocrine axes in **FIGURE 1**, if one wishes to ascertain their multiple interactions, one would need to assay LH, Te (in the male axis), ACTH, cortisol, and GH at a minimum. IGF-1, because of the large number of carrier (binding) proteins and its more stable pattern, has been difficult to model and is often not included as a consequence. The major difficulty, besides cost, is the number of unobserved hypothalamic factors (GnRH, CRH, AVP, SRIF, and GHRH, as a minimum) that govern the dynamics of these three systems.

Most neuroendocrine ensembles are affected by circadian (∼24 h) and ultradian (<24 h) rhythms (*challenge 4*). Recurring rhythms have effects at multiple levels, including receptor regulation, feedback intensity, second-messenger signaling, and gene transcription. Thus inclusion of circadian rhythm modulation in a model must be at the appropriate level. If one merely estimated a trigonometric function to approximate the concentration profile, the resultant model would likely alter the accuracy and meaningfulness of other estimates. Moreover, circadian patterns are not described by simple sinusoidal functions; one segment (day or night) is often longer or higher than the other. If one does not properly incorporate the underlying rhythms, the resulting calculated entities (half-lives, secreted mass, etc.) can be inaccurate. In the statistical methods described above, an allowance for circadian rhythm modulation on secretion was incorporated in two ways. First, it was incorporated by including two mass release waveforms *(*psi functions), one for day and one for night, with the decomposition of the 24 h into day and night being estimated. Second, waveform variability over 24 h was generalized by allowing each pulse to have a unique shape (*challenge 5*). Nonetheless, these propositions are not based explicitly on the biology of the underlying rhythms or their mechanisms of action. Accordingly, improved biologically framed models will be necessary to properly model feedforward and feedback neuroendocrine regulation.

In summary, we have suggested five key challenges to the quantification of regulated pulsatility of hypothalamo-pituitary hormones. Two main challenges deal with the unobservability of hypothalamic factors; potential solutions in those two examples are based on the specific anatomic and physiological structures of the GnRH-LH-Te axis and the CRH-AVP-ACTH-cortisol axis. At present, there is no universal approach to drawing inferences about unobserved hypothalamic signaling. Until there are significant developments in quantifying the time-varying concentrations of otherwise unmeasurable hypothalamic neuropeptides (e.g., GnRH, CRH, GHRH), the capacity to draw inferences about regulatory interactions within and among the various neuroendocrine axes depends on a novel interplay between experimental design and the prediction of the unobserved based on the observed.

In conclusion, further solutions to the complex challenges of evaluating dynamic physiological signaling ensembles will require innovative new models built on the underlying biological structure, including inherent biological variation due to such phenomena as amplification and desensitization, modulation by circadian rhythms, mutual interplay among various neuroendocrine axes, and time-varying pulse shape, pulse size, basal secretion, and (possibly) disappearance kinetics.

## Acknowledgments

We thank Jill Smith for support of manuscript preparation. Supported in part by National Institutes of Health Grants R01 AG-019695, R01 DK-073148, R01 AG-029362, and R01 AG-031763. The project described was supported by UL1 TR000135 from the National Center for Advancing Translational Sciences (NCATS). Contents are solely the responsibility of the authors and do not necessarily represent the official views of any federal institution.

## Footnotes

No conflicts of interest, financial or otherwise, are declared by the author(s).

- ©2016 Int. Union Physiol. Sci./Am. Physiol. Soc.