Abstract
In this notebook we will perform some of the basic operations in working with a genome-scale metabolic model (GEM). The vast majority of software that has been developed surrounding GEMs has been done in MATLAB, likely because this form of modeling has origins in engineering (specifically chemical engineering). Although well-suited for metabolic modeling, MATLAB is not open-source and therefore limits the accessibility of such software. Fortunately, the modeling community has implemented the MATLAB COnstrant-Based Reconstruction and Analysis (COBRA) Toolbox in Python, as COBRApy.
COBRApy is still relatively new and therefore lacks some of the functionality of its MATLAB counterparts, but the core utilities are available and quickly expanding. Here, we will demonstrate some of the basic functions and classes of the COBRApy package, which should also familiarize the user with the fundamentals of GEM structure and simulation.
Most of the commands and material covered in this tutorial can be found in the COBRApy Documentation, so we encourage you to reference the documentation if you encounter errors, warnings, or need further detail about something.
Contents
import cobra
import cobra.test
import os
Before jumping right into things, it is always nice to see what sort of default settings are in place. COBRApy has organized such defaults into a global configuration object, which can be viewed or adjusted as needed.
cobra_config = cobra.Configuration()
We can view a brief summary of the object
cobra_config
Attribute | Description | Value |
solver |
Mathematical optimization solver | glpk |
tolerance |
General solver tolerance (feasibility, integrality, etc.) | 1e-07 |
lower_bound |
Default reaction lower bound | -1000.0 |
upper_bound |
Default reaction upper bound | 1000.0 |
processes |
Number of parallel processes | 11 |
cache_directory |
Path for the model cache | /Users/rui.benfeitas/Library/Caches/cobrapy |
max_cache_size |
Maximum cache size in bytes | 104857600 |
cache_expiration |
Model cache expiration time in seconds (if any) | None |
As well as the default reaction flux bounds (min,max)
cobra_config.bounds
(-1000.0, 1000.0)
GEMs, as their name implies ("genome"-scale), are often quite large, containing thousands of reactions, metabolites, and genes. It is therefore best to begin working with a simplified model that is quick to load and use, and easy to conceptualize.
For this exercise, we will use the textbook
model that is provided with the COBRApy package. This model encompasses the core pathways of central carbon metabolism in E. coli. The cobra package ships with several test models in different formats
data_dir = cobra.test.data_dir
os.listdir(data_dir)[:10]
['textbook.xml.gz', 'invalid_annotation_format.json', 'iJO1366.xml.gz', 'JSON_with_inf_bounds.json', 'mini_fbc2.xml', 'mini.json', 'salmonella.genes', 'mini_fbc1.xml', 'mini_fbc2.xml.gz', 'invalid2.xml']
We will load the "textbook" model from the SBML (.xml) file
model = cobra.io.read_sbml_model(os.path.join(data_dir, "textbook.xml.gz"))
model
Name | e_coli_core |
Memory address | 0x017b762040 |
Number of metabolites | 72 |
Number of reactions | 95 |
Number of groups | 0 |
Objective expression | 1.0*Biomass_Ecoli_core - 1.0*Biomass_Ecoli_core_reverse_2cdba |
Compartments | cytosol, extracellular |
GEMs are often shared as SBML (Systems Biology Markup Language), an XML-based format commonly used to store GEMs. The aim of SBML is to serve as an open and standardized format to facilitate sharing of models and software. Feel free to open the textbook.xml
file in a text editor to get an idea of how it is formatted.
You can find plenty of SBML models at the Metabolic Atlas, and can contribute to their curation on their github. Let's list the first few reactions in the model, with their reaction equations
for x in model.reactions[:10]:
print("%s : %s" % (x.id, x.reaction))
ACALD : acald_c + coa_c + nad_c <=> accoa_c + h_c + nadh_c ACALDt : acald_e <=> acald_c ACKr : ac_c + atp_c <=> actp_c + adp_c ACONTa : cit_c <=> acon_C_c + h2o_c ACONTb : acon_C_c + h2o_c <=> icit_c ACt2r : ac_e + h_e <=> ac_c + h_c ADK1 : amp_c + atp_c <=> 2.0 adp_c AKGDH : akg_c + coa_c + nad_c --> co2_c + nadh_c + succoa_c AKGt2r : akg_e + h_e <=> akg_c + h_c ALCD2x : etoh_c + nad_c <=> acald_c + h_c + nadh_c
We can specify the target method to retrieve a specific reaction (e.g., AKGDH)
model.reactions.AKGDH
Reaction identifier | AKGDH |
Name | 2-Oxogluterate dehydrogenase |
Memory address | 0x017b7c7e50 |
Stoichiometry |
akg_c + coa_c + nad_c --> co2_c + nadh_c + succoa_c 2-Oxoglutarate + Coenzyme A + Nicotinamide adenine dinucleotide --> CO2 + Nicotinamide adenine dinucleotide - reduced + Succinyl-CoA |
GPR | b0726 and b0116 and b0727 |
Lower bound | 0.0 |
Upper bound | 1000.0 |
Here we can see some of the reaction-associated information that is stored in the model. The Reaction identifier
must be a unique string, and is typically a short abbreviation or code since there is also a more descriptive Name
field.
Question: Is this reaction irreversible or reversible? From what field(s) can this be determined?
Question: Is the reaction catalyzed by isozymes or an enzyme complex?
Let's list the first few metabolites in the model, along with their chemical formula
for x in model.metabolites[:10]:
print("%s : %s" % (x.id, x.formula))
13dpg_c : C3H4O10P2 2pg_c : C3H4O7P 3pg_c : C3H4O7P 6pgc_c : C6H10O10P 6pgl_c : C6H9O9P ac_c : C2H3O2 ac_e : C2H3O2 acald_c : C2H4O acald_e : C2H4O accoa_c : C23H34N7O17P3S
We can use the identifiers above to inspect the 3pg_e
metabolite in more detail
model.metabolites.get_by_id('3pg_c')
Metabolite identifier | 3pg_c |
Name | 3-Phospho-D-glycerate |
Memory address | 0x017b762fd0 |
Formula | C3H4O7P |
Compartment | c |
In 3 reaction(s) | PGM, PGK, Biomass_Ecoli_core |
Note that we cannot reference the metabolite using model.metabolites.3pg_c
because the metabolite ID begins with a number, which python doesn't like.
We can see the abbreviation of the compartment in which the metabolite exists, though c
by itself is not super informative.
Question: What is the name of the c
compartment, and what other compartments does the model have?
GEMs are never really "finished" because we continue to improve models based on known metabolism, finding errors or missing content, adding newly discovered reactions/genes/metabolites. It is therefore common for a user to need to add or remove content from the GEM.
For this example, we will add the aspartate aminotransferase reaction to enable the synthesis of aspartate:
L-glutamate + oxaloacetate <==> 2-oxoglutarate + L-aspartate
Let's create a template reaction and determine what information we need to provide
reaction = cobra.Reaction('ASPAMTR')
reaction
Reaction identifier | ASPAMTR |
Name | |
Memory address | 0x017b762400 |
Stoichiometry |
--> --> |
GPR | |
Lower bound | 0.0 |
Upper bound | 1000.0 |
To add it we'll need to add the name
reaction.name = 'aspartate aminotransferase'
We need to find the IDs of the metabolites involved in the reaction
met_patterns = ['glutamate', 'oxaloacetate', 'oxoglutarate', 'aspartate']
for met in model.metabolites:
if any([x in met.name.lower() for x in met_patterns]):
print("%s : %s" % (met.id, met.name))
akg_c : 2-Oxoglutarate akg_e : 2-Oxoglutarate glu__L_c : L-Glutamate glu__L_e : L-Glutamate oaa_c : Oxaloacetate
Two interesting observations:
2-Oxoglutarate
and L-Glutamate
For the first point, note that the _c
and _e
suffixes represent the compartment to which the metabolite belongs. Note that not all GEMs append the compartment information to the metabolite IDs, but it is quite common.
We can use the .compartments
method to see the model's compartments
model.compartments
{'c': 'cytosol', 'e': 'extracellular'}
We want to add our reaction to the cytosol (c
) compartment, so we will use the _c
form of the metabolites.
For the second point, we will need to add aspartate to the model.
Let's create the aspartate
metabolite
asp_c = cobra.Metabolite('asp_c')
asp_c # view its (missing) properties
Metabolite identifier | asp_c |
Name | |
Memory address | 0x017b8997c0 |
Formula | None |
Compartment | None |
In 0 reaction(s) |
And fill in some information about the new aspartate metabolite
asp_c.name = 'L-Aspartate'
asp_c.formula = 'C4H6NO4'
asp_c.compartment='c'
We will also need to specify all the stoichiometric coefficients for each metabolite in the new reaction
reaction.add_metabolites({
model.metabolites.glu__L_c: -1.0,
model.metabolites.oaa_c: -1.0,
model.metabolites.akg_c: 1.0,
asp_c: 1.0
})
reaction
Reaction identifier | ASPAMTR |
Name | aspartate aminotransferase |
Memory address | 0x017b762400 |
Stoichiometry |
glu__L_c + oaa_c --> akg_c + asp_c L-Glutamate + Oxaloacetate --> 2-Oxoglutarate + L-Aspartate |
GPR | |
Lower bound | 0.0 |
Upper bound | 1000.0 |
Is a reaction reversible by default?
reaction.reversibility
False
This reaction is reversible biochemically, but the model considers it as irreversible by default. Unfortunately, we cannot directly edit the "reversibility" field
reaction.reversibility = True
/Users/rui.benfeitas/opt/miniconda3/envs/env_merged_nets/lib/python3.9/site-packages/cobra/core/reaction.py:561: UserWarning: Setting reaction reversibility is ignored warn("Setting reaction reversibility is ignored")
But instead we need to change the lower bound of the reaction
reaction.lower_bound = -1000
reaction.reversibility # verify that the reversibilty has been updated
True
And now we can verify the arrow direction
reaction
Reaction identifier | ASPAMTR |
Name | aspartate aminotransferase |
Memory address | 0x017b762400 |
Stoichiometry |
glu__L_c + oaa_c <=> akg_c + asp_c L-Glutamate + Oxaloacetate <=> 2-Oxoglutarate + L-Aspartate |
GPR | |
Lower bound | -1000 |
Upper bound | 1000.0 |
The reaction will still function even if it doesn't have a GPR (GEMs contain many non-enzymatic reactions, after all), but this information is important to include since it can affect some analyses, such as gene deletion analysis or reporter metabolite analysis.
Let's specify the new gene in the GPR. Aspartate aminotrasferase is encoded by aspC (b0928) in E. coli
reaction.gene_reaction_rule = 'b0928'
reaction
Reaction identifier | ASPAMTR |
Name | aspartate aminotransferase |
Memory address | 0x017b762400 |
Stoichiometry |
glu__L_c + oaa_c <=> akg_c + asp_c L-Glutamate + Oxaloacetate <=> 2-Oxoglutarate + L-Aspartate |
GPR | b0928 |
Lower bound | -1000 |
Upper bound | 1000.0 |
Note that gene(s) in the GPR rule are automatically added to the "genes" field of the immutable reaction object.
reaction.genes
frozenset({<Gene b0928 at 0x17b8ae0a0>})
We can then add the reaction to the model. The input should be a list.
model.add_reactions([reaction])
model.reactions.ASPAMTR # verify that the new reaction, metabolite, and gene are now in the model
Reaction identifier | ASPAMTR |
Name | aspartate aminotransferase |
Memory address | 0x017b762400 |
Stoichiometry |
glu__L_c + oaa_c <=> akg_c + asp_c L-Glutamate + Oxaloacetate <=> 2-Oxoglutarate + L-Aspartate |
GPR | b0928 |
Lower bound | -1000 |
Upper bound | 1000.0 |
model.metabolites.asp_c
Metabolite identifier | asp_c |
Name | L-Aspartate |
Memory address | 0x017b8997c0 |
Formula | C4H6NO4 |
Compartment | c |
In 1 reaction(s) | ASPAMTR |
We can also provide the gene name
model.genes.b0928.name = 'aspC'
model.genes.b0928
Gene identifier | b0928 |
Name | aspC |
Memory address | 0x017b8ae0a0 |
Functional | True |
In 1 reaction(s) | ASPAMTR |
FBA is one of the most common and fundamental GEM-based analyses. It involves an optimization to estimate the flux (metabolite flow) through each reaction in the model, given the following constraints:
The system is at steady state - each metabolite must be consumed and produced at the same rate (represented by the equation S•v = 0).
Reaction fluxes must be within their defined lower and upper bounds (lb
and ub
, respectively).
The optimization will seek to minimize or maximize some objective defined by the user. Most often the objective is to maximize the flux through a "biomass" reaction, which represents an organism trying to allocate its resources such that it maximizes its growth rate.
To help understand and visualize FBA and the resulting reaction fluxes, there is an excellent tool called Escher FBA. Follow this link to Escher FBA, and start the browser app by clicking the Launch image.
By default, the app is maximizing flux through the Biomass_Ecoli_core_w_GAM
reaction, and flux values are represented by reaction arrow color and thickness. Hovering over a reaction name will show some information, as well as options to knock-out the reaction or to change the objective to maximizing or minimizing flux through that reaction.
Try changing the objective to different reactions, and see how the flux distribution changes. Also try knocking out some reactions to view how it affects the results. You can reset the app using the Reset Map
button in the lower right-hand corner.
Now we will perform FBA using (the less pretty but more functional) COBRApy. First let us take a look at the optimization objective (Biomass reaction by default in this model).
We can check the default objective function in this model with cobra.util.solver
from cobra.util.solver import linear_reaction_coefficients
linear_reaction_coefficients(model)
{<Reaction Biomass_Ecoli_core at 0x17b7e5df0>: 1.0}
Alternatively, we can use list comprehension. The "objective_coefficient" indicates which reactions are being maximized (1) or minimized (-1).
[[x.id, x.objective_coefficient] for x in model.reactions if x.objective_coefficient !=0]
[['Biomass_Ecoli_core', 1.0]]
The 2 cells above also tell us the id of the objective functions. We can look into this reaction in more detail
model.reactions.Biomass_Ecoli_core
Reaction identifier | Biomass_Ecoli_core |
Name | Biomass Objective Function with GAM |
Memory address | 0x017b7e5df0 |
Stoichiometry |
1.496 3pg_c + 3.7478 accoa_c + 59.81 atp_c + 0.361 e4p_c + 0.0709 f6p_c + 0.129 g3p_c + 0.205 g6p_c + 0.2557 gln__L_c + 4.9414 glu__L_c + 59.81 h2o_c + 3.547 nad_c + 13.0279 nadph_c + 1.7867 oaa_c... 1.496 3-Phospho-D-glycerate + 3.7478 Acetyl-CoA + 59.81 ATP + 0.361 D-Erythrose 4-phosphate + 0.0709 D-Fructose 6-phosphate + 0.129 Glyceraldehyde 3-phosphate + 0.205 D-Glucose 6-phosphate + 0.2557... |
GPR | |
Lower bound | 0.0 |
Upper bound | 1000.0 |
As one might expect, the biomass reaction formula is quite long and has therefore been truncated in the preview. View the full formula to see all the metabolites involved, and the (molar) ratios in which they are consumed/produced.
Let's examine the entire reaction stoichiometry
model.reactions.Biomass_Ecoli_core.build_reaction_string()
'1.496 3pg_c + 3.7478 accoa_c + 59.81 atp_c + 0.361 e4p_c + 0.0709 f6p_c + 0.129 g3p_c + 0.205 g6p_c + 0.2557 gln__L_c + 4.9414 glu__L_c + 59.81 h2o_c + 3.547 nad_c + 13.0279 nadph_c + 1.7867 oaa_c + 0.5191 pep_c + 2.8328 pyr_c + 0.8977 r5p_c --> 59.81 adp_c + 4.1182 akg_c + 3.7478 coa_c + 59.81 h_c + 3.547 nadh_c + 13.0279 nadp_c + 59.81 pi_c'
We can also view the objective direction (maximize or minimize the reaction flux)
model.objective_direction
'max'
In order to run FBA, we can simply call model.optimize()
. As indicated above, this will aim to maximmize the objective function. If we wanted to minimize it we could run model.optimize(objective_sense='minimize')
.
solution = model.optimize()
solution
fluxes | reduced_costs | |
---|---|---|
ACALD | 0.000000 | 6.938894e-18 |
ACALDt | 0.000000 | 0.000000e+00 |
ACKr | 0.000000 | 1.040834e-17 |
ACONTa | 6.007250 | 0.000000e+00 |
ACONTb | 6.007250 | 1.387779e-17 |
... | ... | ... |
THD2 | 0.000000 | -2.546243e-03 |
TKT1 | 1.496984 | -1.387779e-17 |
TKT2 | 1.181498 | 1.387779e-17 |
TPI | 7.477382 | -6.938894e-18 |
ASPAMTR | 0.000000 | -6.938894e-18 |
96 rows × 2 columns
We can also view a summary of the returned optimal flux distribution
model.summary()
1.0 Biomass_Ecoli_core = 0.8739215069684306
Metabolite | Reaction | Flux | C-Number | C-Flux |
---|---|---|---|---|
glc__D_e | EX_glc__D_e | 10 | 6 | 100.00% |
nh4_e | EX_nh4_e | 4.765 | 0 | 0.00% |
o2_e | EX_o2_e | 21.8 | 0 | 0.00% |
pi_e | EX_pi_e | 3.215 | 0 | 0.00% |
Metabolite | Reaction | Flux | C-Number | C-Flux |
---|---|---|---|---|
co2_e | EX_co2_e | -22.81 | 1 | 100.00% |
h2o_e | EX_h2o_e | -29.18 | 0 | 0.00% |
h_e | EX_h_e | -17.53 | 0 | 0.00% |
From the summary, we can view some details beyond the value of the objective, such as which metabolites are being consumed from the media, which are being produced, and at what rate.
We can also get a summary of the fluxes involving a specific metabolite
model.metabolites.atp_c.summary()
C10H12N5O13P3
Percent | Flux | Reaction | Definition |
---|---|---|---|
66.58% | 45.51 | ATPS4r | adp_c + 4.0 h_e + pi_c <=> atp_c + h2o_c + 3.0 h_c |
23.44% | 16.02 | PGK | 3pg_c + atp_c <=> 13dpg_c + adp_c |
2.57% | 1.758 | PYK | adp_c + h_c + pep_c --> atp_c + pyr_c |
7.41% | 5.064 | SUCOAS | atp_c + coa_c + succ_c <=> adp_c + pi_c + succoa_c |
Percent | Flux | Reaction | Definition |
---|---|---|---|
12.27% | -8.39 | ATPM | atp_c + h2o_c --> adp_c + h_c + pi_c |
76.46% | -52.27 | Biomass_Ecoli_core | 1.496 3pg_c + 3.7478 accoa_c + 59.81 atp_c + 0.361 e4p_c + 0.0709 f6p_c + 0.129 g3p_c + 0.205 g6p_c + 0.2557 gln__L_c + 4.9414 glu__L_c + 59.81 h2o_c + 3.547 nad_c + 13.0279 nadph_c + 1.7867 oaa_c + 0.5191 pep_c + 2.8328 pyr_c + 0.8977 r5p_c --> 59.81 adp_c + 4.1182 akg_c + 3.7478 coa_c + 59.81 h_c + 3.547 nadh_c + 13.0279 nadp_c + 59.81 pi_c |
0.33% | -0.2235 | GLNS | atp_c + glu__L_c + nh4_c --> adp_c + gln__L_c + h_c + pi_c |
10.94% | -7.477 | PFK | atp_c + f6p_c --> adp_c + fdp_c + h_c |
Alternatively, we can get a summary of the fluxes involving a specific reaction
model.reactions.GAPD.summary()
g3p_c + nad_c + pi_c <=> 13dpg_c + h_c + nadh_c
Bounds: -1000.0, 1000.0
Flux: 16.02
We have mentioned above that we could change an objective function to maximize/minimize. What if we want to change the reaction to optimize altogether? Let us now optimize the flux through ATPM ("ATP Maintenance"), which is just the hydrolysis of ATP.
model.reactions.ATPM.build_reaction_string()
'atp_c + h2o_c --> adp_c + h_c + pi_c'
Change the objective to ATPM
model.objective = 'ATPM'
This will effectively maximize the production of ATP. Can you think of why we chose the ATPM
reaction as the objective to do this?
Now let's run with the new objective and summarize the results
solution = model.optimize()
model.summary()
1.0 ATPM = 174.99999999999983
Metabolite | Reaction | Flux | C-Number | C-Flux |
---|---|---|---|---|
glc__D_e | EX_glc__D_e | 10 | 6 | 100.00% |
o2_e | EX_o2_e | 60 | 0 | 0.00% |
Metabolite | Reaction | Flux | C-Number | C-Flux |
---|---|---|---|---|
co2_e | EX_co2_e | -60 | 1 | 100.00% |
h2o_e | EX_h2o_e | -60 | 0 | 0.00% |
Note that there is now no metabolic flux through the biomass reaction model.reactions.Biomass_Ecoli_core.summary()
. This gives an error because of its null flux.
solution.fluxes.Biomass_Ecoli_core
0.0
Let's set the objective back to Biomass again, for the next sections
model.objective = 'Biomass_Ecoli_core'
As you may have seen in the Escher FBA app, we can simulate the effect of a gene knockout. In practice, this entails setting the upper and lower flux bounds of the associated reaction equal to zero, so that it cannot be used.
Although in reality we knockout a gene from an organism, not a reaction, we will start with knocking out a reaction.
Let's first copy the original model and optimize biomass production to view the initial maximum flux value
# copy the model so we don't alter the original
model_ko = model.copy()
biomass_original = model_ko.optimize().objective_value
biomass_original # view starting biomass flux
0.8739215069684305
What if we knock out the AKGDH reaction? Is the output of the next cell expected after AKGDH is knocked out?
model_ko.reactions.AKGDH.knock_out()
model_ko.reactions.AKGDH
Reaction identifier | AKGDH |
Name | 2-Oxogluterate dehydrogenase |
Memory address | 0x017bb721c0 |
Stoichiometry |
akg_c + coa_c + nad_c --> co2_c + nadh_c + succoa_c 2-Oxoglutarate + Coenzyme A + Nicotinamide adenine dinucleotide --> CO2 + Nicotinamide adenine dinucleotide - reduced + Succinyl-CoA |
GPR | b0726 and b0116 and b0727 |
Lower bound | 0 |
Upper bound | 0 |
Note that the reaction is still present in the model, but it now cannot carry any flux. If we wanted to completely remove it from the model altogether, we could use the remove_from_model
function: model.reactions.AKGDH.remove_from_model()
Let's now re-optimize biomass flux
model_ko.optimize().objective_value
0.8583074080226875
And then calculate the difference in biomass flux between the original and knock-out objectives.
biomass_original - model_ko.optimize().objective_value
0.01561409894574306
Since we observe a decrease in the biomass flux after knocking out the AKGDH reaction, it indicates that the model found an alternative but less efficient pathway to generate the required biomass precursors. However, the biomass is not zero, so we would predict that this knockout would likely not be lethal to the cell.
In reality, genes are knocked out, not reactions. Let us now try knocking out the gapA (b1779) gene encoding the GAPD (Glyceraldehyde-3-phosphate dehydrogenase) reaction.
# copy the original model again
model_ko = model.copy()
# knock out the b1779 gene
model_ko.genes.b1779.knock_out()
# check which reactions became inactive (lower bound == upper bound == 0)
[rxn.id for rxn in model_ko.reactions if rxn.upper_bound == rxn.lower_bound]
['GAPD']
Let's view the GAPD reaction. Note the Lower and Upper bounds
model_ko.reactions.GAPD
Reaction identifier | GAPD |
Name | glyceraldehyde-3-phosphate dehydrogenase |
Memory address | 0x017bb72970 |
Stoichiometry |
g3p_c + nad_c + pi_c <=> 13dpg_c + h_c + nadh_c Glyceraldehyde 3-phosphate + Nicotinamide adenine dinucleotide + Phosphate <=> 3-Phospho-D-glyceroyl phosphate + H+ + Nicotinamide adenine dinucleotide - reduced |
GPR | b1779 |
Lower bound | -1000.0 |
Upper bound | 1000.0 |
And finally let's re-optimize the knockout model
model_ko.optimize().objective_value
1.2803448850014704e-15
Here we can see that the gapA gene (and its encoded GAPD reaction) were quite important, as the biomass flux has effectively been reduced to nearly zero ($<10^{-14}$). This is consistent with reports that E. coli cannot grow without this gene.
Since it is a common analysis, COBRApy has specific functions for iterating through each gene (or reaction) in the model, knocking it out, and calculating the associated objective value.
from cobra.flux_analysis import single_gene_deletion, single_reaction_deletion
# delete all genes, one by one, and view the results
gene_del_results = single_gene_deletion(model)
gene_del_results
ids | growth | status | |
---|---|---|---|
0 | {b1603} | 0.873922 | optimal |
1 | {b2133} | 0.873922 | optimal |
2 | {b3737} | 0.374230 | optimal |
3 | {b0723} | 0.814298 | optimal |
4 | {b2935} | 0.873922 | optimal |
... | ... | ... | ... |
133 | {s0001} | 0.211141 | optimal |
134 | {b4153} | 0.873922 | optimal |
135 | {b1380} | 0.873922 | optimal |
136 | {b4077} | 0.873922 | optimal |
137 | {b3733} | 0.374230 | optimal |
138 rows × 3 columns
We find 2 genes whose knockout renders the model infeasible
gene_del_results.loc[gene_del_results.status=='infeasible',]
ids | growth | status | |
---|---|---|---|
19 | {b2416} | NaN | infeasible |
132 | {b2415} | NaN | infeasible |
Question: Can you find the chemical formula of the reactions they catalyze, and suggest why these specific reactions are rendered infeasible?
model.genes.get_by_id('b2415')
Gene identifier | b2415 |
Name | ptsH |
Memory address | 0x017b7b8310 |
Functional | True |
In 2 reaction(s) | GLCpts, FRUpts2 |
Let's delete all reactions, one by one, and view the results
rxn_del_results = single_reaction_deletion(model)
rxn_del_results
ids | growth | status | |
---|---|---|---|
0 | {TPI} | 0.704037 | optimal |
1 | {ETOHt2r} | 0.873922 | optimal |
2 | {MALS} | 0.873922 | optimal |
3 | {AKGt2r} | 0.873922 | optimal |
4 | {FRUpts2} | 0.873922 | optimal |
... | ... | ... | ... |
91 | {EX_for_e} | 0.873922 | optimal |
92 | {EX_glc__D_e} | NaN | infeasible |
93 | {SUCCt2_2} | 0.873922 | optimal |
94 | {FORti} | 0.873922 | optimal |
95 | {NADTRHD} | 0.873922 | optimal |
96 rows × 3 columns
If you still have time, try knocking out genes that encode an isozyme or a complex subunit to see what effect it has on a reaction (remember that the bounds will change to zero once it has been knocked out).
# copy the original model again
model_ko = model.copy()
First try to inactivate the AKGDH reaction by knocking out one or more of its associated genes
# insert your code here
model_ko.reactions.AKGDH
Reaction identifier | AKGDH |
Name | 2-Oxogluterate dehydrogenase |
Memory address | 0x0179da0df0 |
Stoichiometry |
akg_c + coa_c + nad_c --> co2_c + nadh_c + succoa_c 2-Oxoglutarate + Coenzyme A + Nicotinamide adenine dinucleotide --> CO2 + Nicotinamide adenine dinucleotide - reduced + Succinyl-CoA |
GPR | b0726 and b0116 and b0727 |
Lower bound | 0.0 |
Upper bound | 1000.0 |
Next try knocking out one or more of the genes associated with the ACALD reaction
# insert your code here
model_ko.reactions.ACALD
Reaction identifier | ACALD |
Name | acetaldehyde dehydrogenase (acetylating) |
Memory address | 0x0179da0af0 |
Stoichiometry |
acald_c + coa_c + nad_c <=> accoa_c + h_c + nadh_c Acetaldehyde + Coenzyme A + Nicotinamide adenine dinucleotide <=> Acetyl-CoA + H+ + Nicotinamide adenine dinucleotide - reduced |
GPR | b0351 or b1241 |
Lower bound | -1000.0 |
Upper bound | 1000.0 |
Did you notice anything different in how each of these reactions responds to having one of its genes knocked out? Is that consistent with your understanding of the difference between isozymes and enzyme complex subunits?
This notebook was adapted from a notebook previously developed by Jonathan Robinson.