ETHZ/Modeling Basics

From 2007.igem.org


Latest revision as of 19:33, 26 October 2007



Modeling Basics

The functioning of our model depends mostly on protein concentrations. These proteins are produced within the E. coli cells from genes that we introduced. To understand the system, it is crucial to model gene expression, gene regulation, and the resulting gene product concentrations accurately.

This Wiki page is intended to present the basic mechanisms and assumptions that went into the mathematical description of our iGEM model.

Constitutive Protein Production

In the simplest case, proteins are produced through continuous transcription of genes. At the same time, proteins have a finite half-life, which means that they are degraded. This leads to the simple model of protein production and degradation shown in Fig. 1.

Fig. 1: A system constitutively producing protein P. The production rate is c^max and the degradation rate is d_P.

To find the concentration of protein P (as a function of time), the system of Fig. 1 can be written as an ordinary differential equation (ODE):

Basic eq01.png

This equation states that the rate of change of the protein concentration is the difference between protein production (c^max) and protein degradation (d_P[P]).
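
The equation image is not reproduced in this text. Reading the description above as d[P]/dt = c^max - d_P[P], the following minimal Python sketch integrates that ODE with the forward Euler method and compares the result with the closed-form solution. The function names, the step size, and the numerical values are illustrative assumptions and are not taken from the original model.

```python
import math


def simulate_constitutive(c_max, d_p, p0=0.0, t_end=10.0, dt=0.001):
    """Integrate d[P]/dt = c_max - d_p * [P] with the forward Euler method."""
    p = p0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        p += dt * (c_max - d_p * p)
    return p


def analytic_constitutive(c_max, d_p, p0, t):
    """Closed-form solution of the same ODE for an initial value p0."""
    p_ss = c_max / d_p  # steady state, where production balances degradation
    return p_ss + (p0 - p_ss) * math.exp(-d_p * t)


if __name__ == "__main__":
    c_max, d_p = 2.0, 0.5  # illustrative values in arbitrary units
    print("Euler:   ", simulate_constitutive(c_max, d_p, t_end=20.0))
    print("Analytic:", analytic_constitutive(c_max, d_p, 0.0, 20.0))
    print("Steady state c_max / d_p:", c_max / d_p)
```

Under this reading, the concentration relaxes to the steady state c^max/d_P, and the degradation rate is related to the protein half-life by t_1/2 = ln(2)/d_P.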

It is worth looking at protein production more closely. The production of protein P depends on the expression of a gene that codes for this protein. In the case of constant protein production, the gene can be modeled by a constitutive promoter and a coding region for the protein (Fig. 2).

Fig. 2: Model for the gene coding for protein P. The promoter is constitutively active, so the gene is continuously expressed.

Regulated Protein Production

Often, genes are not constitutively expressed; instead, their expression depends on the presence of other proteins R (i.e., transcription factors). These transcription factors can activate or inhibit the promoter of the gene in question. To model such a system, the promoter of the constitutive production system in Fig. 2 must be extended to take into account the presence of the regulatory protein R. Example systems for inhibition and activation of a promoter by R are given in Fig. 3 and Fig. 4, respectively.

Fig. 3: Regulated protein production where the regulatory protein has an inhibitory effect. That is, the higher the concentration of R, the lower the expression of protein P.
Fig. 4: Regulated protein production where the regulatory protein activates protein production. That is, the higher the concentration of R, the higher the expression of protein P.

Inhibition

To derive the equations describing the regulated transcription, we first need a model of how the transcription factor interacts with DNA. The simplest model assumes that the protein binds reversibly to the DNA:

Basic eq02.png

Note that the equation above involves n molecules of the transcription factor. For certain transcription factors, the number of proteins involved is indeed greater than 1. These are interesting cases that enable applications such as toggle switches.

When the transcription factor binds to DNA, it blocks the enzymes transcribing the gene. Thus, the higher the concentration of R, the lower the transcription rate of the gene. By controlling the concentration of the regulatory protein R, the expression of protein P can effectively be regulated.

To understand this process in more detail, we make the simplification that the binding of R to DNA is in equilibrium. That is, forward and backward reaction rates are identical, so we can write

Basic eq03.png

In the case of inhibition, the expression of the gene is proportional to the probability that the DNA is 'free' (i.e., there is no transcription factor bound to it). After some algebraic manipulation of the above equation, an expression for the 'free DNA' as a function of transcription factor concentration can be derived:

Basic eq04.png
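
The equation images are not reproduced in this text. As a sketch of the algebraic manipulation mentioned above, assume the binding reaction DNA + nR ⇌ DNA·R_n with a dissociation constant written as K_R^n, and assume that the free concentration of R is close to its total concentration. The symbols K_R, [DNA]_free, and [DNA]_t are notation chosen here for illustration and may differ from the original images. Under these assumptions the free-DNA fraction takes the familiar Hill form:

```latex
\begin{align}
K_R^{\,n} &= \frac{[\mathrm{DNA}]_{\text{free}}\,[R]^n}{[\mathrm{DNA}\cdot R_n]},
\qquad
[\mathrm{DNA}]_t = [\mathrm{DNA}]_{\text{free}} + [\mathrm{DNA}\cdot R_n] \\
[\mathrm{DNA}]_t &= [\mathrm{DNA}]_{\text{free}}
\left(1 + \left(\frac{[R]}{K_R}\right)^{n}\right) \\
\frac{[\mathrm{DNA}]_{\text{free}}}{[\mathrm{DNA}]_t}
&= \frac{1}{1 + \left([R]/K_R\right)^{n}}
\end{align}
```

The first line states the equilibrium condition and the conservation of total DNA; substituting the bound complex from the equilibrium condition into the conservation relation gives the second line, and dividing through gives the free fraction.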

Now, all elements are in place to write down an ODE for the concentration of protein P whose expression is regulated by the regulatory protein R:

Basic eq05.png

The transcription does not always take place at the maximum rate c^max, as was the case for the constitutively produced proteins, but is modulated by the concentration of protein R.
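
A minimal Python sketch of this modulation, assuming the free-DNA probability has the Hill form 1/(1 + ([R]/K_R)^n). The dissociation constant name k_r, the function names, and the example values are assumptions made for illustration; the original equation image may use different notation.

```python
def free_dna_fraction(r, k_r, n):
    """Probability that the promoter is not occupied by the inhibitor R."""
    return 1.0 / (1.0 + (r / k_r) ** n)


def dp_dt_inhibited(p, r, c_max, d_p, k_r, n):
    """Right-hand side of the ODE for P when R inhibits transcription."""
    return c_max * free_dna_fraction(r, k_r, n) - d_p * p


def steady_state_inhibited(r, c_max, d_p, k_r, n):
    """Concentration of P at which production and degradation balance."""
    return c_max * free_dna_fraction(r, k_r, n) / d_p


if __name__ == "__main__":
    # Illustrative values in arbitrary units: the steady state of P
    # falls as the inhibitor concentration rises.
    for r in (0.0, 0.5, 1.0, 2.0, 10.0):
        print(r, steady_state_inhibited(r, c_max=2.0, d_p=0.5, k_r=1.0, n=2))
```

With [R] = 0 the expression reduces to the constitutive case, and at [R] = K_R the production rate is half of c^max.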

Activation

To understand activation, we start with the same assumption: a transcription factor reversibly binding to DNA.

Basic eq02.png

Here, we are interested in the complex of DNA and transcription factor, because the transcription rate is proportional to the probability that a protein is bound to DNA.

Again assuming equilibrium conditions, we can derive an expression for the DNA-protein complex concentration as a function of protein concentration:

Basic eq06.png

Analogously, we can now write the whole differential equation for the protein concentration if transcription is activated by protein R:

Basic eq07.png
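
The activated case can be sketched in the same way. The code below assumes that the probability of finding the activator bound to DNA is ([R]/K_R)^n / (1 + ([R]/K_R)^n), which is the complement of the free-DNA probability under the same equilibrium assumption. Names and values are again illustrative assumptions.

```python
def bound_dna_fraction(r, k_r, n):
    """Probability that the promoter is occupied by the activator R."""
    x = (r / k_r) ** n
    return x / (1.0 + x)


def dp_dt_activated(p, r, c_max, d_p, k_r, n):
    """Right-hand side of the ODE for P when R activates transcription."""
    return c_max * bound_dna_fraction(r, k_r, n) - d_p * p


def steady_state_activated(r, c_max, d_p, k_r, n):
    """Concentration of P at which production and degradation balance."""
    return c_max * bound_dna_fraction(r, k_r, n) / d_p


if __name__ == "__main__":
    # Illustrative values in arbitrary units: the steady state of P
    # rises with the activator concentration and saturates at c_max / d_p.
    for r in (0.0, 0.5, 1.0, 3.0, 10.0):
        print(r, steady_state_activated(r, c_max=2.0, d_p=0.5, k_r=1.0, n=2))
```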

Basic Production

The equations so far assume perfect inhibition or activation. In the case of inhibition, this means that transcription of the gene for protein P is practically zero if the inhibitor concentration is high enough. In the case of activation, it means that there is no transcription in the absence of the activator protein. In reality, one observes some basic production despite high inhibitor concentrations or the absence of activator protein. The basic transcription is usually around 10-20% of the maximum transcription rate. The case of inhibition is illustrated in Fig. 5.
Fig. 5: Inhibition by the transcription factor is only effective between basic transcription and maximum transcription. That is, protein production cannot be shut off completely by inhibition.

Thus we have to introduce some 'basic transcription' that always takes place. For inhibition we have

Basic eq08.png

and for activation

Basic eq09.png

Note that the basic transcription rate is introduced through a leakiness factor a, which expresses it as a fraction of the maximum transcription rate c^max. Regulation of transcription by protein R is now only effective in the range between a·c^max and c^max.
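
One way to write production rates that stay within the range between a·c^max and c^max is to scale the regulated part by (1 - a) and add the leaky part a. The sketch below uses this form as an assumption; the original equation images are not available here and may arrange the terms differently. Function and parameter names are illustrative.

```python
def leaky_rate_inhibited(r, c_max, a, k_r, n):
    """Production rate under inhibition with basic (leaky) transcription.

    Equals c_max when r = 0 and approaches a * c_max for large r.
    """
    return c_max * (a + (1.0 - a) / (1.0 + (r / k_r) ** n))


def leaky_rate_activated(r, c_max, a, k_r, n):
    """Production rate under activation with basic (leaky) transcription.

    Equals a * c_max when r = 0 and approaches c_max for large r.
    """
    x = (r / k_r) ** n
    return c_max * (a + (1.0 - a) * x / (1.0 + x))


if __name__ == "__main__":
    # Illustrative values: a leakiness of 10% of the maximum rate.
    for r in (0.0, 1.0, 100.0):
        print(r,
              leaky_rate_inhibited(r, c_max=2.0, a=0.1, k_r=1.0, n=2),
              leaky_rate_activated(r, c_max=2.0, a=0.1, k_r=1.0, n=2))
```

Setting a = 0 recovers the perfect inhibition and activation cases described earlier.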

Inducer Molecules

A problem in biology is that regulatory proteins (such as protein R in the previous sections) cannot be used as system inputs directly. It is not possible to add such proteins to an assay to steer the behavior of a biological system because these proteins are big and do not diffuse through cell walls. Therefore, they cannot enter the cells.

To circumvent this limitation, one has to produce these proteins directly in the cell. But then, one needs a way to switch the functionality of these proteins on and off. This is where inducer molecules become useful. Inducer molecules are small molecules that can diffuse through cell walls freely. Furthermore, they are able to bind to the regulatory proteins and switch their functionality on or off.

Thus, we need a model to describe the binding of the inducer to the protein. We again assume that the inducer I binds reversibly to the protein R:

Basic eq10.png

Further, we again assume that the reaction is in equilibrium. We thus have:

Basic eq11.png

Depending on the species at hand, we are interested either in the protein-inducer complex concentration or in the concentration of 'free protein'. For some species (e.g., LuxR), the complex acts as an activator. For others, complex formation relieves repression (e.g., TetR).

It is again possible to derive formulas for both the concentration of free protein R and the complex concentration R-nI as functions of the total inducer concentration (for notational convenience, we write R* for the complex R-nI in the following equations).

Basic eq12.png


Basic eq13.png
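
The two equation images are not reproduced in this text. The sketch below assumes the same equilibrium treatment as for DNA binding: n inducer molecules bind reversibly to R with a dissociation constant written as K_I^n, and the unbound inducer concentration is approximated by the total inducer concentration. The resulting Hill-type split of the total protein [R]_t into free R and complex R* is an assumption consistent with the text, not a transcription of the original formulas.

```python
def inducer_binding(i_total, r_total, k_i, n):
    """Split total regulator R into free protein and inducer-bound complex R*.

    Assumes equilibrium binding of n inducer molecules per protein and
    that free inducer is approximately equal to total inducer.
    Returns the tuple (free_r, complex_r).
    """
    x = (i_total / k_i) ** n
    free_r = r_total / (1.0 + x)
    complex_r = r_total * x / (1.0 + x)
    return free_r, complex_r


if __name__ == "__main__":
    # Illustrative values in arbitrary units.
    for i_total in (0.0, 0.5, 1.0, 5.0):
        free_r, complex_r = inducer_binding(i_total, r_total=1.0, k_i=1.0, n=2)
        print(i_total, free_r, complex_r)
```

For a LuxR-like activator, the complex concentration would be passed on as the regulator concentration in the activation equations; for a TetR-like repressor, the free protein concentration would be used in the inhibition equations.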

Caveats

With this approach to modeling, the equations turn out to be both easy to understand and easy to simulate. But to arrive at this point, a certain number of assumptions must be made. This section points out possible problems with these assumptions and ideas to further improve the modeling.

  • Equilibrium: This is a very basic assumption that underlies all results in the above discussion. At the same time, it is also the least problematic assumption. This can be seen by comparing the massive machinery involved in transcription and translation of DNA/RNA with the simple reversible binding of molecules. It is plausible to assume that the latter indeed happens on a much shorter time scale than the former.
  • Excess of substrate: This assumption was implicitly made when we wrote down the equations of regulation and inducer binding. The basic idea is that the total inducer concentration is very close to the concentration of unbound inducer. More precisely, this condition is met when K_I >> [R]_t. As the uncertainty in the values of dissociation constants and steady-state concentrations is sometimes very large, it is difficult to say whether this assumption is justified.
  • Mechanism dependence: If the Hill cooperativity coefficient n is greater than one, the above formulas in fact become dependent on the mechanism of the binding process. In these cases, the formulas can be off by up to a factor of n. As the true mechanism of binding is often unknown, it is practically impossible to take this fact into account.
  • Low protein concentration: Whenever the concentration is very low (say, below 100 nM in the case of E. coli), the number of molecules per cell becomes small. In this situation, the assumptions behind the ODE modeling approach (e.g., a well-mixed compartment where each molecule can freely interact with other molecules) are no longer met and simulation results become inaccurate. Then, one would have to resort to stochastic simulations.
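
To make the last caveat concrete, the following sketch converts a molar concentration into an approximate number of molecules per cell. It assumes a cell volume of 1 femtoliter, a commonly quoted order of magnitude for E. coli that varies with strain and growth conditions; this value and the function name are assumptions, not figures from the model.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molecules_per_cell(conc_molar, cell_volume_l=1e-15):
    """Convert a molar concentration to molecules per cell.

    cell_volume_l defaults to 1 fL, an assumed rough E. coli volume.
    """
    return conc_molar * cell_volume_l * AVOGADRO


if __name__ == "__main__":
    for conc_nm in (1, 10, 100, 1000):
        print(conc_nm, "nM ->", molecules_per_cell(conc_nm * 1e-9), "molecules")
```

Under this assumed volume, 100 nM corresponds to roughly 60 molecules per cell, a regime in which random fluctuations in molecule numbers are no longer negligible.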