Blog of Andrés Aravena

The Biologist Toolbox: Simulating Systems on the computer

01 April 2018

Once we have a description of a system, and a nice drawing to represent it, we can answer some interesting questions. One of the most common questions is what is the behavior of the system? In other words, we usually want to know what will happen?.

To answer this, we can use the computer. It is not hard to write a small program to simulate the system. The simulation will be an approximate answer, good enough to see the important characteristics of the system’s behavior.

In this document we will consider systems represented by graphs like in the last post “Drawing Systems”, and we will write our computer code using the R language.

The nodes have values

Figure 1. A model of the cell growth system.

For this example we will use the system represented by the graph of Figure 1.

In this graph each circle represents an item of the system. The label of the circle is the name of the item. In our example we have two circles, named cells and food. The quantity of each item is represented by a vector with the same name. In this case, the vectors are cells and food.

Each circle has also a value that can change through time, usually the quantity or concentration of the item in the system. This value has the same name as the circle label. In our example an item as the value cells and the other the value food.

In these kind of systems each item has a second value, corresponding to the change of the item’s quantity in each time-step. The name of this value starts with d_, followed by the item label. In our example these values are called d_cells and d_food.

The process are represented by boxes and are much simpler. They only have one constant value, which is a rate. The value name is made with the label of the box and the suffix _rate. So the box called replication has a constant named replication_rate.

In summary, each circle has two vectors, and each box a fixed number.

Discrete time

In the computer the time is represented by an integer that increases step by stepIn other words, we use discrete time.

. The units are arbitrary, so we can think of anything between microseconds and millennia, including years, days, hours, seconds, weeks, etc. We only assume that each time-step has the same duration.

For example in our case we can think that each time-step is one hour. Time 1 corresponds to the initial condition, that is, the beginning of the experiment. Time 2 is one hour later. Time 25 is one day later.

We can use any symbol to represent time. Most people uses letters i, j or k, since these are the letters traditionally used as index of vectors and matrices. In this example we are going to represent time with i.

The items of the system, represented by circles in the graph, have two values that change with time. We represent these variables by vectors, one element for each time-step. Thus, the number of cells at time i is cells[i], and its change in the same time is d_cells[i].

Finding the equations

To simulate the system we need to find the formulas. We start with the boxes, because they are the easy ones.

For each box in the graph we get a single term: the multiplication of the rate constant and each of the quantity variables of the circles connected by incoming arrows. If there are several incoming arrows from the same circle, then the variable is multiplied several times. The outgoing arrows are not important in this part.

In our example the formula for replication box is replication_rate * cells[i-i] * food[i-1]. Here we use the index i-1 because we do not know yet the value of cells[i] or food[i]. We will calculate them later. Metaphorically, at the begin of “today” we only know “yesterday”.

The formula for the circles is made with the sum of all the terms of boxes connected with incoming arrows, minus all the terms of boxes connected with outgoing arrows. This value is assigned to the change variable of the circle.

In our example the variable d_food[i] gets assigned the value -replication_rate * cells[i-i] * food[i-1], since the food circle has only one outgoing arrow. The cell circle has two incoming arrows from replication and one outgoing arrow to replication. Therefore the variable d_cells[i] gets the value

replication_rate * cells[i-i] * food[i-1] 
  + replication_rate * cells[i-i] * food[i-1] 
  - replication_rate * cells[i-i] * food[i-1]

which, after simplification, is replication_rate * cells[i-i] * food[i-1] resulting on a final result of one positive incoming arrow. In summary, the formulas for the change variables are:

d_food[i]  <- -replication_rate * cells[i-i] * food[i-1]
d_cells[i] <-  replication_rate * cells[i-i] * food[i-1]

Finally, the quantity variables have to be updated. Each quantity variable is the cumulative sum of the change variables. In our example we have

food[i]  <- food[i-1]  + d_food[i]
cells[i] <- cells[i-1] + d_cells[i]

Initial conditions

The last missing piece necessary for the simulation are the initial values of the circles’ variables. The value of cells[today] depends on the value of cells[yesterday], and we can only calculate that for time i greater than or equal to 2.

In our case we will use the variables cells_ini and food_ini to store the values corresponding to cells[1] and food[1]. For the change variables, we can assume that they are initially zero.

Complete code

Our simulation will be a function. The inputs are:

and the output will be a data frame with the vectors cells and food. Thus, we know that the code in R should be something like this:

cell_culture <- function(N, replication_rate, cells_init, food_ini) {
# create empty vectors
# initialize values  
  for(i in 2:N) {
    # update `d_cells` and `d_food`
    # update `cells` and `food` as a cumulative sum
  }
  return(data.frame(cells, food))
}

To create an empty vector in R we only need to know the vector sizeStrictly speaking, we do not need the exact vector size, but the program is faster if we use the correct number.

, which in this case is N. Therefore we ca write

# create empty vectors
cells <- d_cells <- rep(NA, N)
food  <- d_food  <- rep(NA, N)

Here we use a trick in R that allows us to assign the same value to two variables. The line

cells <- d_cells <- rep(NA, N)

is equivalent to

d_cells <- rep(NA, N)
cells <- rep(NA, N)

The first component of each vector needs an initial value. Our function receives cells_ini and food_ini as inputs. To make thing simple we assume that the change values start at zero, so we write

cells[1]   <- cells_init
food[1]    <- food_ini
d_cells[1] <- d_food[1] <- 0

Putting all together, we get

cell_culture <- function(N, replication_rate, cells_init, food_ini) {
  # create empty vectors
  cells <- d_cells <- rep(NA, N)
  food  <- d_food  <- rep(NA, N)
  # initialize values  
  cells[1]   <- cells_init
  food[1]    <- food_ini
  d_cells[1] <- d_food[1] <- 0
  for(i in 2:N) {
    # update `d_cells` and `d_food`
    d_cells[i] <- +replication_rate * cells[i-1] * food[i-1]
    d_food[i]  <- -replication_rate * cells[i-1] * food[i-1]
    # update `cells` and `food` as a cumulative sum
    cells[i] <- cells[i-1] + d_cells[i]
    food[i]  <- food[i-1]  + d_food[i]
  }
  return(data.frame(cells, food))
}

Other examples

This modeling and simulation technique is not limited to a particular field of science. The same rules apply to may other systems, small and big. That is why we can use examples from very different areas and still learn something useful for out own interest. Let’s see other examples

Water formation

%3
Figure 2. A model of the water formation reaction.

A chemical reaction combines hydrogen and oxygen to produce water. To simplify we assume that the reaction is this: \[2H+O\leftrightarrow H_2O\]

This system has 3 items: Hydrogen, Oxygen and Water, represented with the letters H, O and W. This reaction is reversible, so we represent it with two irreversible reactions at the same time: water formation (r_1) and water decomposition (r_2). We can represent this system with the graph in Figure 2. The R code to simulate this system is:

water_formation <- function(N, r1_rate=0.1,  r2_rate=0.1,
    H_ini=1, O_ini=1, W_ini=0) {
# first, create empty vectors to fill later
  W <- d_W <- rep(NA, N) # Water, quantity and change on each time
  H <- d_H <- rep(NA, N) # Hydrogen
  O <- d_O <- rep(NA, N) # Oxygen
# fill the initial quantities of water, hydrogen and oxygen
  W[1] <- W_ini
  H[1] <- H_ini
  O[1] <- O_ini
  d_W[1] <- d_H[1] <- d_O[1] <- 0 # the initial change is zero

  for(i in 2:N) {
    d_W[i] <-    r1_rate*H[i-1]*H[i-1]*O[i-1] - r2_rate*W[i-1]
    d_O[i] <-   -r1_rate*H[i-1]*H[i-1]*O[i-1] + r2_rate*W[i-1]
    d_H[i] <- -2*r1_rate*H[i-1]*H[i-1]*O[i-1] + 2*r2_rate*W[i-1]

    W[i] <- W[i-1] + d_W[i]
    O[i] <- O[i-1] + d_O[i]
    H[i] <- H[i-1] + d_H[i]
  }
  return(data.frame(W, H, O))
}

Predator-Prey population dynamics

cat_mouse
Figure 3. A model of the Lotka-Volterra system.

This is a system with several applications, known as “Lotka-Volterra system”. It was discovered and first analyzed by Alfred J. Lotka, and Vito Volterra, and appears again and again in ecology and population dynamics.

Here we have two items: cats and mice, and three processes: birth (of mice), catch (of mice, also reproduction of cats) and death (of cats). The system is represented by the graph in Figure 3 and simulated by the following R code:

cat_and_mouse <- function(N, birth_rate, catch_rate, death_rate) {
  mice <- d_mice <- rep(NA, N)
  cats <- d_cats <- rep(NA, N)
  mice[1] <- 1
  cats[1] <- 3
  d_mice[1] <- d_cats[1] <- 0

  for(i in 2:N){
    d_mice[i] <-  birth_rate*mice[i-1] - cats[i-1]*mice[i-1]*catch_rate
    d_cats[i] <- -death_rate*cats[i-1] + cats[i-1]*mice[i-1]*catch_rate
    mice[i] <- mice[i-1] + d_mice[i]
    cats[i]  <- cats[i-1] + d_cats[i]
  }
  return(data.frame(mice, cats))
}

Money in the bank

This one is an exercise. Consider the following system.

%3

What will be the money in the bank after 12 month if the interest rate is 1/12? What if the interest is 1/100 and the total time is 100 months?

References

Koch, Ina. “Petri Nets - A Mathematical Formalism to Analyze Chemical Reaction Networks.” Molecular Informatics 29, no. 12 (December 17, 2010): 838–43. doi:10.1002/minf.201000086.

Baez, John C., and Jacob Biamonte. “Quantum Techniques for Stochastic Mechanics,” September 17, 2012. http://arxiv.org/abs/1209.3632.