1 Introduction

Neuronal networks—and even individual neurons—are frequently so complex that it is difficult to predict how changing one or several cellular or synaptic parameters will affect the electrical activity of the cell or circuit. This complexity is reflected in the way neurons and neuronal networks are described mathematically, with systems of coupled, non-linear differential equations that are usually well beyond analytical solution. At the level of single neurons, non-linear systems theory can provide some insights into how neuron activity is shaped by individual neuronal parameters, such as the maximal conductance of a particular membrane current (1–4). However, non-linear systems theory requires significant simplification of the neuronal model in question, usually reducing the number of dynamic variables that describe the system from ten or more to just two, and thus does not adequately represent the complexity and dynamical richness of biological neurons. Furthermore, non-linear systems theory becomes increasingly inadequate as the complexity of a system grows, so its application cannot easily be generalized from the neuron to the network level.

The computational exploration method described here takes a different approach by leaving the full complexity of a neuron or network model intact and by systematically exploring the dependence of the system’s behavior on its parameters through a large number—often many millions—of computational simulations (see Fig. 1). Each of these simulations assumes a different set of neuron or network parameters and determines how the neural system behaves—and responds to inputs—for that particular combination of parameters. The simulation results are then saved in a database that is searchable by neuron or network parameter and by salient features of the circuit behavior, allowing mining of the data set under different aspects (5,6).

Fig. 1

Examples of parameter space exploration schemes. (A) A regular grid of simulations covering the maximal conductance space of a model neuron. Only two of eight dimensions are shown. Every dot symbolizes a simulation with a different combination of maximal conductances. (B) A quasi-logarithmic grid of simulations covering the synaptic conductance space of a network model. Note the logarithmic scale. Only two of ten dimensions are shown. LP and PY designate two neurons in the network.
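To make the combinatorial nature of such a grid concrete, the following Python sketch builds a small linear conductance grid and a quasi-logarithmic synaptic grid and enumerates every combination; all parameter names, units, ranges, and grid sizes are placeholders chosen for illustration rather than values from any published database.

```python
import itertools
import numpy as np

# Hypothetical parameter grids; names, units, and ranges are illustrative only.
# Maximal conductances on a regular linear grid (cf. Fig. 1A); synaptic
# strengths on a quasi-logarithmic grid that includes zero (cf. Fig. 1B).
g_Na  = np.linspace(0.0, 500.0, 6)                         # mS/cm^2, six values
g_CaT = np.linspace(0.0, 12.5, 6)                          # mS/cm^2, six values
g_syn = np.concatenate(([0.0], np.logspace(0.0, 3.0, 5)))  # nS: 0, 1, ~5.6, ~32, ~178, 1000

# Every combination of grid values defines one simulation in the database.
parameter_grid = list(itertools.product(g_Na, g_CaT, g_syn))
print(f"{len(parameter_grid)} parameter combinations to simulate")  # 6 * 6 * 6 = 216
```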

Beyond showing how changes in an individual parameter affect neuron or network behavior, model databases constructed in this manner have various useful applications. Perhaps the most obvious example is the identification of particular parameter combinations that reproduce a desired behavior of a neural system (5). Such functional parameter combinations have traditionally been identified by hand-tuning neuron or network models one parameter at a time (7–9), a process that requires significant experience and time and is not always successful. By searching a data set created through systematic exploration of a model’s parameter space, one can quickly identify parameter combinations that generate a desired behavior without the need for extensive hand-tuning.

Another advantage of model neuron or model network databases is that—rather than providing just one parameter set that results in a given behavior—they can be used to delineate entire regions of parameter space that generate neural system output in a desired range, for example, bursting neuron activity with a burst period and burst duration within experimentally observed bounds (5,6,10,11). Analysis of the shape and extent of such functional subspaces in different directions of parameter space can provide valuable information about the sensitivity of a neural system to changes in individual parameters or combinations of parameters. In addition, such analysis tells us which properties of a neural system need to be relatively tightly tuned to produce functional behavior and which can vary more widely (6).

2 Materials

2.1 Simulation Software

Model neuron or network databases can be constructed with any simulation software that is capable of stepping through a large number of different neuron or network parameter combinations in an automated fashion, of outputting and saving the desired activity measures for each parameter set, and of running on the available simulation hardware platform. Databases have in the past been constructed with custom-designed C++ programs (5,6) and with the NEURON (12) (see Chapter 6) and GENESIS (see Chapter 7) software packages (11,13), although using GENESIS under some circumstances requires re-starting the simulation program before each individual simulation because of program initialization issues.

Custom-made software, while requiring a larger initial programming investment, has the advantage of greater flexibility, which allows it to be optimized for simulation speed, a consideration that can be of crucial importance when large parameter spaces are to be explored. Furthermore, custom-made software can be programmed to include adaptive algorithms that minimize the simulation time per parameter set by monitoring the activity pattern generated by the simulated system during runtime and terminating each individual simulation as soon as initialization artifacts have decayed and a stable limit cycle is reached (see Note 1) (5).
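As an illustration of such an adaptive scheme, the Python sketch below runs a model in short epochs and stops as soon as a simple steadiness test is satisfied; the function simulate_epoch, the epoch length, and the numerical criteria are hypothetical placeholders rather than part of any published implementation.

```python
import numpy as np

def on_limit_cycle(spike_times, v_recent, cv_tol=1e-3):
    """Heuristic steady-state test: the membrane potential is essentially flat
    (silent model), or the most recent inter-spike intervals repeat with very
    little variation (periodic spiking). A production version would also test
    for repeating burst patterns."""
    if np.ptp(v_recent) < 0.1:                  # < 0.1 mV variation: silent
        return True
    if len(spike_times) >= 6:
        isi = np.diff(spike_times[-6:])         # last five inter-spike intervals
        return np.std(isi) < cv_tol * np.mean(isi)
    return False

def run_until_steady(simulate_epoch, max_epochs=300):
    """simulate_epoch() is a hypothetical function that advances the model by
    one 1-s epoch and returns (all spike times so far, last epoch's voltage)."""
    for epoch in range(max_epochs):
        spike_times, v_recent = simulate_epoch()
        if on_limit_cycle(np.asarray(spike_times), np.asarray(v_recent)):
            return epoch + 1                    # epochs needed to settle
    return max_epochs                           # never settled; flag for review
```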

2.2 Simulation Hardware

Database simulations of the scale described here—covering many millions of different parameter combinations—are usually executed on high-performance computer clusters. The inherently serial nature of the simulation task, however, does not require truly parallel computation, and smaller databases on the order of a few million different neurons or networks can be run with low priority in the background on the computers available in most laboratories. In fact, an earlier version of one of the first model neuron databases (5) was constructed in this manner, with the simulation of 1.7 million model neurons requiring a few weeks of background computation on 8–10 standard laboratory computers. A user-friendly client–server application that can be installed to run large numbers of NEURON simulations on any number of available laboratory computers as a screen saver is available at http://neuronpm.homeip.net. As computer power increases over the years, simulation of larger and larger databases will become more and more feasible and eventually routine.

2.3 Analysis Software

Analysis software for typical model neuron and model network databases needs to be able to detect events such as spikes and bursts and to automatically extract salient activity features such as spike or burst period, burst duration, number of spikes per burst, and relationships between the activity of different neurons in a network, such as delays between spikes in one cell relative to spikes in another or phase relationships of synchronous oscillations in multiple neurons. Custom-made analysis software is again at an advantage, because it allows the analysis to be incorporated into the simulation process and thus does not require the saving of large amounts of voltage trace information for offline analysis. As previously mentioned, online analysis of simulation results can lead to significant reductions in the simulation time per parameter set, because it can lead to early detection of stable limit cycle activity (see Note 1).
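A minimal Python sketch of this kind of online feature extraction is shown below; the spike threshold, the burst-separating interval, and the returned feature names are illustrative assumptions that would need to be adapted to the model at hand.

```python
import numpy as np

def extract_features(t, v, spike_thresh=0.0, burst_gap=0.2):
    """Classify a voltage trace as silent, tonic, or bursting and extract basic
    activity features. Spikes are upward crossings of spike_thresh (mV); spikes
    separated by less than burst_gap (s) are grouped into the same burst. Both
    thresholds are placeholders to be tuned to the model at hand."""
    t, v = np.asarray(t), np.asarray(v)
    up = np.where((v[:-1] < spike_thresh) & (v[1:] >= spike_thresh))[0]
    spikes = t[up]
    if spikes.size < 2:             # no (or one isolated) spike: treat as silent here
        return {"type": "silent", "resting_potential": float(np.median(v))}

    isi = np.diff(spikes)
    starts = np.concatenate(([0], np.where(isi > burst_gap)[0] + 1))
    bursts = np.split(spikes, starts[1:])
    if len(bursts) < 2 or max(len(b) for b in bursts) == 1:
        return {"type": "tonic", "spike_frequency": float(1.0 / np.mean(isi))}

    first_spikes = np.array([b[0] for b in bursts])
    durations = np.array([b[-1] - b[0] for b in bursts])
    period = float(np.mean(np.diff(first_spikes)))
    duration = float(np.mean(durations))
    return {"type": "bursting", "burst_period": period, "burst_duration": duration,
            "duty_cycle": duration / period,
            "spikes_per_burst": float(np.mean([len(b) for b in bursts]))}
```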

2.4 Database Administration Software

An important part of the database approach to complex dynamical systems is that a large amount of information about the activity of the model neuron or network in question is preserved for later analysis under various aspects, including questions not directly related to the initial purpose of the database. The types of information preserved in a database can include raw simulation results such as voltage traces or voltage maxima and minima (see Fig. 2) or snapshots of the model system’s dynamic variables for a point on the limit cycle (see Note 2). In addition, characteristics of the system activity that were extracted from the raw data, such as spike or burst frequencies, can also be stored in a database.

Fig. 2

Examples of the information saved for every parameter combination in a model neuron database. The dots indicate local voltage maxima and minima whose time and voltage values are saved. Additional information saved in the database includes duty cycle (burst duration divided by period), number of spikes per burst, area under the voltage trace per burst, resting potential (for silent models), spike frequency (for tonically spiking models), response to current injection, dynamic variable snapshot, and others.

Ideally, the software used for administering and searching a model database should allow the user to extract correlations between any combination of neuron or network parameter and activity feature. In practice, this is accomplished with custom-made C++- or MATLAB (The MathWorks, Natick, MA)-based program packages or with commercially available database programs such as Access (Microsoft, Redmond, WA).
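As one possible illustration of such a searchable store, the sketch below uses Python's built-in sqlite3 module (not one of the tools mentioned above); the table layout, column names, and query bounds are invented for the example.

```python
import sqlite3

# Column names, units, and bounds below are invented for the example.
con = sqlite3.connect("neuron_db.sqlite")
con.execute("""CREATE TABLE IF NOT EXISTS models (
                   g_Na REAL, g_CaT REAL, g_KCa REAL,       -- varied parameters
                   activity TEXT,                            -- classification result
                   burst_period REAL, burst_duration REAL,   -- extracted features
                   spikes_per_burst REAL)""")

def insert_model(row):
    """row: (g_Na, g_CaT, g_KCa, activity, burst_period, burst_duration, spikes_per_burst)"""
    con.execute("INSERT INTO models VALUES (?, ?, ?, ?, ?, ?, ?)", row)
    con.commit()

# Example query: bursters whose period and duration lie within an
# experimentally observed range, and the conductances that produce them.
functional = con.execute("""SELECT g_Na, g_CaT, g_KCa, burst_period
                            FROM models
                            WHERE activity = 'bursting'
                              AND burst_period   BETWEEN 1.0 AND 2.0
                              AND burst_duration BETWEEN 0.3 AND 0.8""").fetchall()
```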

2.5 Visualization Tools

Information contained in a model neuron or network database can be difficult to access and analyze because of the high dimensionality of the parameter space. This can be overcome with a recently developed visualization tool that makes use of a technique called “dimensional stacking” to display regularly spaced data points from a high-dimensional space in the form of a lower dimensional—typically 2D—projection that makes previously hidden structures emerge (14). Every entry in the database corresponds to one pixel in the 2D image, and the pixel color can be assigned to indicate any model characteristic of interest, such as activity type (spiking, bursting, or silent), discharge frequency, or spike amplitude. The example in Fig. 3 shows the models in a single neuron database (5)—or pixels—arranged in a 2D image in 6 × 6 = 36 major blocks, with each block corresponding to a different combination of the maximal conductances of the membrane currents I_CaT (vertical axis) and I_K(Ca) (horizontal axis), each of which can take six different values. Within each of these blocks, pixels are arranged in 36 smaller blocks according to their values of the I_CaS and I_Kd conductances, and so on through two additional levels of organization. As Fig. 3 indicates, 2D stacking displays can reveal how different types of neuronal or network activity are distributed in parameter space.

Fig. 3

Dimensional stacking visualization of a model neuron database that reveals the dependence of spike frequency on the underlying conductances. Scales at the top and left indicate the stacking order. The insets show voltage traces for selected conductance combinations and a spike frequency histogram of all tonic spikers in the database.
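The mapping from a model's position on the parameter grid to its pixel position in a dimensionally stacked image can be written as a simple mixed-radix encoding. The Python sketch below is a simplified illustration with hypothetical parameter names and an arbitrary fixed stacking order; it is not the published visualization tool.

```python
def stack_coordinate(indices, levels):
    """Mixed-radix encoding: the first index defines the coarsest blocks, each
    later index subdivides the previous block (cf. the nested blocks in Fig. 3)."""
    coord = 0
    for idx, n in zip(indices, levels):
        coord = coord * n + idx
    return coord

def pixel_position(grid_index, x_order, y_order, levels):
    """grid_index: dict mapping parameter name -> grid index (0..levels-1) for
    one model; x_order / y_order: parameter names in stacking order per axis."""
    x = stack_coordinate([grid_index[p] for p in x_order], [levels[p] for p in x_order])
    y = stack_coordinate([grid_index[p] for p in y_order], [levels[p] for p in y_order])
    return x, y

# Eight conductances with six values each, stacked four per axis, fill a
# 6^4 x 6^4 = 1296 x 1296 pixel image; parameter names here are placeholders.
levels = {p: 6 for p in ("g_KCa", "g_Kd", "g_A", "g_leak", "g_CaT", "g_CaS", "g_Na", "g_H")}
x, y = pixel_position({p: 0 for p in levels},
                      x_order=("g_KCa", "g_Kd", "g_A", "g_leak"),
                      y_order=("g_CaT", "g_CaS", "g_Na", "g_H"),
                      levels=levels)
```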

3 Methods

This section will describe the preparation steps necessary before a model database can be constructed and the steps involved in constructing and analyzing a database. It is assumed that a fully defined model neuron or network (collectively referred to as “neural model system”) is used as a starting point.

3.1 Preparations

  1.

    Decide which model parameters can vary in the database and over what ranges, and which parameters to keep fixed. Ideally, one would want to vary all parameters of a neural system, but because parameters are by definition varied independently during database construction, large numbers of varied parameters quickly lead to combinatorial explosion (see Note 3). The decision about which subset of parameters to vary will obviously be influenced by the type of scientific questions the database is supposed to address, but physiological constraints can also provide guidance. For example, in a mature biological neuron, the amounts of different ion channels in the neuronal membrane are likely to vary over larger ranges—because of channel turnover—than the total amount of membrane area, suggesting that varying the maximal conductances of different ion currents in a model may be more meaningful than varying the membrane capacitance. Another consideration is the amount of information available about different parameters. Some parameters may be known to be relatively tightly constrained from experimental observations, whereas others may be less well defined and may thus warrant exploration in the database. Known experimental ranges for neuron or network parameters can also be helpful in constraining the range over which a given parameter is varied.

  2.

    Decide how many different values to explore for each varied parameter and how to distribute the values over the explored range. This decision calls for a compromise between computational feasibility and sufficient resolution in parameter space (see Note 4). Most existing databases have, however, been able to capture salient features of parameter space with relatively few explored values per varied parameter, typically on the order of three to six (5,6,11). Whether these values are best distributed linearly or logarithmically (or otherwise) over the covered parameter range depends on the type of parameter that is being varied (see Fig. 1). Although maximal membrane conductances are typically distributed linearly during database construction, a logarithmic distribution may make more sense for synaptic strengths, because small changes in synapse strength can have a major impact when a synapse is weak, whereas even doubling the strength of an already strong synapse usually has little functional impact (15).

  3.

    Generate a database simulation program that steps through a list of parameter combinations, simulates the spontaneous activity for each combination and/or activity under current injection or other perturbations (see Note 5), detects salient features of that activity (such as spikes, bursts, voltage maxima and minima, etc.), classifies the activity based on these features (e.g., as silent, tonically spiking, bursting, or irregularly spiking for individual neurons or as rhythmic or non-rhythmic for a network), and saves the features, the classification result, and a snapshot of all dynamic variables in the model at the end of each simulation (see Note 2). An important consideration that can have a major influence on the amount of simulation time needed per parameter combination is the choice of initial values for the dynamic variables of the system (see Note 1). A frequently used approach is to select a relatively hyperpolarized starting voltage, for example −65 mV, for each neuron in the model and to initialize all activation and inactivation variables to the steady-state values consistent with that voltage (see Note 6 and the sketch following this list).

  4.

    Test the simulation program on a random subset of parameter combinations from the entire parameter range to be covered with the database. Determine how long it typically takes for a simulation to reach a stable limit cycle (see Fig. 4), whether the salient activity features are detected correctly, whether the activity is classified correctly, and whether there are unexpected types of activity whose proper detection and classification require modifications to the simulation program.

  5.

    Once the simulation program is finished, run it on a random subset of parameter combinations on the same type of processor that will run the entire database simulation to estimate how long simulation of all parameter combinations will take and how much data it will generate. If necessary, repeat steps 1 and 2 above to reduce the number of parameter combinations and thus the total simulation time, or repeat step 3 to reduce the amount of saved data per simulation.
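The following Python sketch illustrates the steady-state initialization mentioned in step 3, assuming (in)activation curves of Boltzmann form; the half-activation voltages and slopes are illustrative values only, and each current in a given model would contribute its own curves.

```python
import numpy as np

V0 = -65.0  # mV, relatively hyperpolarized starting potential

def boltzmann(V, V_half, k):
    """Generic sigmoid steady-state curve; V_half and k come from the model
    definition of each activation or inactivation variable."""
    return 1.0 / (1.0 + np.exp((V - V_half) / k))

# Illustrative values for a sodium-like current; a negative slope k gives an
# activation curve (near 0 at hyperpolarized V), a positive k an inactivation
# curve (near 1 at hyperpolarized V).
m0 = boltzmann(V0, V_half=-25.5, k=-5.29)          # activation initialized at V0
h0 = boltzmann(V0, V_half=-48.9, k=5.18)           # inactivation initialized at V0
initial_state = {"V": V0, "m_Na": m0, "h_Na": h0}  # extended to every current in the model
```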

Fig. 4

Models that take a long time to reach steady state after initialization. (A) A silent model neuron that takes over 30 min to reach its steady state. (B) An initially irregularly firing model neuron that suddenly enters a tonically spiking limit cycle 1 min after initialization. Horizontal bars indicate −49 mV in (A) and −50 mV in (B). Adapted from ref. 5.

3.2 Database Construction

  1.

    Run the simulation program on all available processors until all parameter combinations have been simulated. Because database construction inherently does not require communication between different processors, all parameter combinations can be covered by dividing the entire list of simulations among the processors at the start of the simulation and having each processor work independently (see the first sketch following this list). If varying numbers of processors (on a shared cluster) or processors of different speeds are used, efficient distribution of the simulations among the different processors can be achieved with a master list of simulations that is accessed by one processor at a time to acquire a short list of scheduled simulations that are then executed before that processor accesses the master list again.

  2.

    Upon termination of all simulations, check whether the output files contain all combinations of parameters. Database simulations frequently run for days or weeks, and small subsets of simulations and results can get lost during simulation. It is therefore necessary to crosscheck the simulation results with the initial list of parameter combinations to determine which simulations need to be executed again (see the second sketch following this list).

  3.

    Combine the simulation results from different processors in easily searchable files. Once all simulations for a database have been completed and missing simulations have been re-run, database entries are often distributed over a large number of files, such that finding individual entries can require long search times. Re-organizing all results in a small number of database files with a fixed order of entries can facilitate subsequent database analysis.

  4.

    Verify the classification results for a random subset of database models. Neural system models, especially for exotic parameter combinations, can sometimes show behavior that was not anticipated during the initial design of the classification algorithm. For example, the silent neuron in Fig. 4A can easily be mistaken for a tonically spiking neuron if classification is based on the intervals between voltage maxima alone (5). If a significant number of models were misclassified, the classification algorithm may have to be modified and run again on saved raw activity data, such as local voltage maxima and minima, to reclassify misclassified models.

  5.

    Perform offline data extraction steps. Database construction and analysis are exploratory methods, and it is not always clear a priori which types of activity features will be worth extracting and analyzing further. By design, the general nature of the raw data typically saved during database construction, such as voltage maxima and minima, allows for later extraction of many higher-level activity features (see Fig. 2). Examples are the extraction of burst durations, number of spikes per burst, inter-spike interval profile during a burst, phase relationships between neurons in a rhythmic network, and so on, which can all be determined offline from the saved voltage maxima and minima and stored in secondary database files.
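For step 1, the simplest division of labor among processors of comparable speed is a static split of the full simulation list, as in the Python sketch below; the master-list scheme described above is preferable on heterogeneous or shared clusters.

```python
def split_work(parameter_grid, n_processors):
    """Static division of the full simulation list among processors of roughly
    equal speed; each processor then works through its own sublist independently."""
    return [parameter_grid[i::n_processors] for i in range(n_processors)]

# Hypothetical usage: chunks = split_work(parameter_grid, 16)
```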
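For step 2, the crosscheck against the initial parameter list can be as simple as a set difference, as in the following sketch; the result-file names and CSV layout are hypothetical assumptions.

```python
import csv
import glob
import itertools

# Rebuild the master list from the same grids used to launch the simulations
# (grid values here are placeholders).
grids = {"g_Na": [0.0, 100.0, 200.0, 300.0, 400.0, 500.0],
         "g_CaT": [0.0, 2.5, 5.0, 7.5, 10.0, 12.5]}
scheduled = set(itertools.product(*grids.values()))

# Collect the combinations that actually appear in the result files, assuming
# one CSV file per processor with the varied parameters in the first columns
# (file naming and format are hypothetical).
completed = set()
for fname in glob.glob("results_proc*.csv"):
    with open(fname) as f:
        for row in csv.reader(f):
            completed.add(tuple(float(x) for x in row[:len(grids)]))

missing = scheduled - completed          # combinations that must be re-run
print(f"{len(missing)} simulations need to be repeated")
```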

3.3 Database Analysis

The analysis steps necessary for extracting higher-level information from the saved data largely depend on the types of scientific questions addressed with the database, such that there is no general set of steps to be followed. A few possible types of analysis are described below.

  1.

    Select parameter combinations that produce a desired range of behaviors. On the basis of the classification results and particular activity features, one can select a subset of models that exhibit a certain type of behavior, for example, bursting neurons with a burst period and duration within experimentally observed bounds (5). Such a subset of neuron or network models can then be analyzed for their distribution in parameter space, for example, by determining their 1D distributions over the values of one of the system parameters (see the first sketch following this list). If models with similar behavior show a wide distribution over the values of a given parameter, that parameter is unlikely to be a major determinant of that type of behavior, whereas a narrow distribution indicates that the behavior sensitively depends on the parameter in question (6,10).

  2.

    Locate a model neuron or network that best matches a given experimentally observed behavior (5). This alternative to time-consuming hand-tuning of a model requires the definition of a goodness-of-fit measure, which can be anything from a spike-count deviation per time window to the root-mean-square deviation in membrane potential to a complex combination of many different measures. By comparing the desired activity profile to the stored information for each parameter combination, one can then determine the goodness-of-fit for each parameter combination and locate the particular point in parameter space that best reproduces the desired behavior (see Note 4 and the second sketch following this list).

  3.

    Visualize database information through dimensional stacking. Figure 3 shows an example of dimensional stacking for a single neuron database (5) and illustrates the type of information that can be gained from this analysis. For example, the figure shows that the two distinct populations of fast and slow spikers are located in separate regions of conductance space, that almost all slow spikers lack I_CaT, and that fast spiking—as opposed to other types of activity—is favored by small amounts of I_K(Ca) in the neuronal membrane. An important issue with dimensional stacking is the stacking order, that is, which model parameters determine the highest level of image organization and which parameters determine order at lower levels. Although Fig. 3 clearly reveals the distributions of fast and slow spikers in conductance space, many other stacking orders for the same database would lead to an almost homogeneous distribution of fast and slow spikers and non-spikers in the 2D image. A stacking order optimization algorithm that overcomes this problem has recently been devised (14).
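For step 1, a 1D distribution over one parameter can be obtained directly from the stored classification results, as in this Python sketch; the database is assumed, for illustration, to be an iterable of dictionaries holding parameters and extracted features.

```python
import matplotlib.pyplot as plt

def parameter_distribution(database, parameter, activity="bursting", bins=6):
    """Histogram of one parameter's values among all models showing a given
    type of activity. A broad, flat distribution suggests the behavior is
    insensitive to this parameter; a narrow one suggests sensitive dependence
    (cf. refs. 6,10)."""
    values = [entry[parameter] for entry in database if entry.get("type") == activity]
    plt.hist(values, bins=bins)
    plt.xlabel(parameter)
    plt.ylabel("number of models")
    plt.show()
```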
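For step 2, the sketch below implements one possible goodness-of-fit measure, the squared relative deviation of burst period and burst duration from an experimental target, and returns the best-matching database entry; the measure and the assumed data layout are illustrative only.

```python
def goodness_of_fit(entry, target):
    """Example measure: squared relative deviation of the stored features named
    in `target` from their experimental target values; any other combination of
    stored features could be used instead."""
    return sum(((entry[k] - target[k]) / target[k]) ** 2 for k in target)

def best_match(database, target):
    """Return the database entry (parameters plus features) whose activity best
    reproduces the target behavior."""
    candidates = [e for e in database if all(k in e for k in target)]
    return min(candidates, key=lambda e: goodness_of_fit(e, target))

# Hypothetical usage, with the target given in seconds:
# target = {"burst_period": 1.3, "burst_duration": 0.45}
# best = best_match(database, target)
```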

4 Notes

  1.

    Different model neurons or model networks may take different amounts of time to reach a stable steady state after initialization of the dynamic variables (see Fig. 4). A simulation strategy that allows enough simulation time for each model to reach steady state while minimizing overall simulation time uses an adaptive algorithm to terminate simulation in each model as soon as a steady state is reached. This can be achieved by running simulations in brief epochs of, for example, 1-s duration and by searching the accumulated simulation results for periodic spike or burst patterns or for constant membrane potential after each epoch (5).

  2.

    Database analysis can lead to the finding that additional simulations are necessary for a subset or all of the parameter combinations tested. In this event, the simulation time needed for the initial state of the system to decay away can be saved if simulation can resume for each parameter combination where it left off during the initial database construction, namely on the limit cycle. It is therefore advisable to save a snapshot of each model, consisting of the values of all dynamic variables, at the end of each simulation (5,6).

  3.

    The number of simulations necessary for the exploration of a complex neural system can often be dramatically reduced by breaking up the system into known subsystems. For example, for a small neuronal network consisting of identified neurons whose behavior in synaptic isolation has been experimentally characterized, the model parameter spaces of the individual neurons can be explored first to locate parameter combinations that match the experimentally observed neuron behavior (6). The full network simulation then only has to be performed for combinations of these viable subsystems, not for the full network parameter space.

  4.

    High-dimensional model neuron or network databases are often constructed with fairly low parameter value resolution. However, once a region of interest in parameter space has been identified, that area can be re-sampled with higher resolution. For example, this can be done to investigate an area that produces a desired activity type in more detail or to search the neighborhood in parameter space of a model that best reproduces experimental data for parameter combinations that do even better.

  5.

    As for all numerical simulations, a decision has to be made as to the integration method to be used and the integration time step, but because of the large number of simulations involved in database construction, a good balance between computational speed (calling for a large time step) and accuracy (calling for a small time step) is more crucial than usual. Regardless of the integration method, a series of test simulations with a random subset of models from the database grid and various time steps should be run (a sketch of such a comparison follows these notes). On the basis of these simulations, the most efficient time step to choose is the largest one that, for the vast majority of models, leads to negligible deviations of the activity pattern from the “true” pattern obtained with a very small time step. The percentage of models affected by deviations and the extent of these deviations should be quantified to assess the validity of the database simulation results (5).

  6.

    Other initialization schemes include initializing all activation variables to 0 and all inactivation variables to 1 (corresponding to the steady state for a very hyperpolarized membrane potential (5)) or using the final dynamic variable values from the previous simulation to initialize the next simulation during database construction. This latter approach works best for simulation strategies that step through all database entries in a way that minimizes parameter changes from one simulation to the next, because similar model parameters are likely to lead to similar limit cycles.
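The time-step comparison described in Note 5 can be organized as in the following Python sketch, which assumes a hypothetical simulate(params, dt) function returning spike times and reports, for each candidate time step, the fraction of test models whose spike times deviate noticeably from a reference run with a very small time step.

```python
import numpy as np

def time_step_survey(simulate, test_params, dt_candidates, dt_reference=1e-5,
                     tol=1e-3):
    """For each candidate time step, report the fraction of test models whose
    spike times deviate noticeably from the reference run. `simulate(params, dt)`
    is a hypothetical function returning the model's spike times; `test_params`
    is a random subset of the database grid."""
    report = {}
    for dt in dt_candidates:
        deviating = 0
        for p in test_params:
            ref = np.asarray(simulate(p, dt_reference))
            test = np.asarray(simulate(p, dt))
            if len(ref) != len(test):                       # spike count changed
                deviating += 1
            elif len(ref) and np.max(np.abs(ref - test)) > tol:
                deviating += 1                              # spike times shifted
        report[dt] = deviating / len(test_params)
    return report
```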