Metabolic Networks


Tools for Metabolic Network Construction and Analysis

Nobody is alone; networks are everywhere. A brief description of the existing networks and tools.

Getting Connected :- Networks everywhere


Metabolites are responsible for communication with the environment. They are building blocks, energy providers, signaling molecules, defenders, and regulators of intracellular physical properties. They carry out this wide range of functions under the control of highly regulated networks of genes and proteins (enzymes). These networks, combined with their metabolite counterparts, determine the physiological state of an organism in a specific environment or genetic condition. They can show the topological behavior typical of scale-free networks (power-law degree distribution) or of hierarchical networks. They are robust, redundant, flexible, and tightly controlled, and they obey the laws of thermodynamics.
Analyzing these networks to understand the dynamic behavior of a biological system holds great promise for present-day scientific priorities (energy, health, business, food, etc.).

We are interested in analyzing the behavior of metabolites in metabolomics experiments and in explaining that behavior through the combined analysis and interpretation of existing large-scale databases.

Networks can be analyzed in terms of node connectivity, module and motif identification, clustering coefficient, and network statistics.
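As a rough illustration of two of these measures, the sketch below computes node connectivity (degree), the degree distribution, and the local clustering coefficient for a small undirected network. The edge list and node names are invented placeholders, not real biochemistry, and the function names are hypothetical helpers rather than part of any tool mentioned in this post.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy network (placeholder node names, illustration only).
EDGES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "D"), ("C", "E")]


def build_adjacency(edges):
    """Return an undirected adjacency map: node -> set of neighbours."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def degrees(adj):
    """Node connectivity: number of neighbours of each node."""
    return {node: len(neigh) for node, neigh in adj.items()}


def degree_distribution(adj):
    """Count how many nodes have each degree k."""
    return dict(Counter(len(neigh) for neigh in adj.values()))


def clustering_coefficient(adj, node):
    """Fraction of neighbour pairs of `node` that are themselves linked."""
    neigh = adj[node]
    k = len(neigh)
    if k < 2:
        return 0.0  # undefined for k < 2; reported as 0 by convention here
    links = sum(1 for u, v in combinations(neigh, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    print(degrees(adj))
    print(degree_distribution(adj))
    print({n: round(clustering_coefficient(adj, n), 3) for n in adj})
```

For a scale-free network, the degree distribution from a much larger edge list would be expected to look roughly linear on a log-log plot; a five-edge toy network is far too small to show that.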

1. Large Scale Datasets :-

Thanks to advances in computer technology, we have access to many large datasets that host valuable biological information. The main datasets we use are as follows.

  1. PubChem
  2. Kyoto Encyclopedia of Genes and Genomes (KEGG)
  3. MetaCrop:- A recently published database of crop metabolism. Pathways can be downloaded in SBML format, and kinetic parameters of biochemical reactions are provided.
  4. BioCyc (MetaCyc, AraCyc, EcoCyc)
  5. Reactome
  6. Arabidopsis Reactome
  7. Stanford Pathway Knowledge
  8. Biomodel
  9. Atomic Reconstruction of Metabolism (ARM)
  10. MetNetDB
  11. BRENDA :- For enzyme information
  12. SABIO-RK:- Reaction kinetics database. SBML supported.
  13. NIST Enzyme Thermodynamics Database
  14. Nature Pathway Interaction Database
  15. Pathway Commons Database : Reactome + BioCyc for E. coli, yeast, and human.
  16. Nature Signaling Gateway

I would have liked to cover the whole list here, but a more comprehensive list of existing network and pathway databases already exists. Check www.pathguide.org for an incredible collection of network sources.

Regulation of metabolism can be accomplished at many levels, but the main levels of control are the gene and the enzyme. Control at the enzyme level is more favorable in terms of energy consumption, but we do not yet have a comprehensive view of how the cell achieves it.

With the availability of super-fast sequencing platforms, we will be surrounded by huge biochemical networks of extreme complexity. Simplifying this complexity with the aid of artificial intelligence, machine learning, or advanced computer algorithms will be a major task.

2. Network exchange Format :-

A network can be defined as a collection of interactions between pairs of nodes. It can be represented in a text file. Different software packages prefer different file formats, and it is fairly easy to convert one into another, for example with TextPad or MS Excel. In biology, however, we need to specify the schema of a network, because a ton of things can be linked to a node or an edge. The large number of databases and software packages has therefore driven us toward a standard network exchange file format.

BioPAX, SBML, and PSI-MI are the frequently used formats. Others, such as SIF, XML, and KGML, are also available, but they have certain restrictions. I would recommend the BioPAX and SBML formats (they support subcellular localization). Good conversion tools exist, such as KEGG2SBML, KEGG2BioPax, and SBML2BioPax. SBML is more favorable because it is supported by over a hundred platforms.
For work in Cytoscape, I would recommend converting or creating networks in SIF (Simple Interaction Format).
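To make the text-file idea concrete, here is a minimal sketch of converting a two-column, tab-delimited edge table into SIF rows and reading them back. It assumes the basic SIF layout of source, interaction type, and one or more targets on each line. The interaction label `reacts_with` and the function names are arbitrary choices made for this example.

```python
def edge_table_to_sif(lines, interaction="reacts_with"):
    """Convert 'source<TAB>target' rows into 'source<TAB>type<TAB>target' rows.

    Blank lines, '#' comment lines, and rows with fewer than two
    columns are skipped. The interaction label is an arbitrary placeholder.
    """
    sif_rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        source, target = fields[0].strip(), fields[1].strip()
        sif_rows.append(f"{source}\t{interaction}\t{target}")
    return sif_rows


def parse_sif(lines):
    """Parse SIF rows into (source, interaction, target) triples.

    A row may list several targets after the interaction type;
    rows with fewer than three fields (isolated nodes) yield no edge.
    """
    edges = []
    for line in lines:
        # Tabs take priority as the delimiter so names may contain spaces.
        raw = line.split("\t") if "\t" in line else line.split()
        fields = [f.strip() for f in raw if f.strip()]
        if len(fields) < 3:
            continue
        source, interaction, *targets = fields
        edges.extend((source, interaction, t) for t in targets)
    return edges


if __name__ == "__main__":
    # KEGG-style compound IDs used purely as placeholder node names.
    table = ["C00031\tC00092", "C00092\tC00085"]
    sif = edge_table_to_sif(table)
    print("\n".join(sif))
    print(parse_sif(sif))
```

A plain format like this carries only nodes, edges, and one edge label, which is the restriction mentioned above: compartments, stoichiometry, and kinetics need SBML or BioPAX.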


3. Visualization Tools.


Once the networks are in the right format, we need to visualize them on the computer screen. Plenty of software is available for this task. I liked the following.

  1. Cytoscape:- The best visualization software: powerful, with amazing plugins and support for BioPAX, SBML, and simple text formats. We can map any kind of data onto the generated networks.
  2. Vanted:- Network statistics. It has the nice advantage of letting us query metabolite data (time series or treatment) against the KEGG database. We can also cluster the data, map the identified metabolites onto the KEGG pathways, and correlate them with the unidentified ones on a single screen. It provides an Excel template for uploading the experimental data. Use of KEGG IDs is recommended.
  3. Pajek
  4. VisANT
  5. UCINET
The choice of platform is up to the user's preference, but Cytoscape is the superior one. We need to choose according to our targets.

A recent paper in Nature Protocols presents a comparative analysis of different visualization tools. It is a good effort and worth checking.



A screenshot of Cytoscape showing the protein-protein interaction network retrieved with the keyword "hexokinase" in the APID2NET plugin. Hexokinase is a primary regulator of energy metabolism and is responsible for the first determining step of glucose metabolism. Its expression level is primarily affected in cancer or stress conditions.

4. Metabolomics Data Mapping and Analysis.


We can map metabolomics data onto the existing networks with these tools.
  1. KEGG Array :- Uses KEGG IDs and relative ratios. Good for visualizing the co-, down-, and up-regulated metabolites involved in various metabolic pathways.
  2. KEGG Color Pathway : This service can be used to check how many compounds have a pathway annotation in the KEGG database.
  3. KEGG Atlas : Global maps for exploring KEGG virtual cells and organisms. Map the data with KEGG IDs.
  4. KEGG PathComp/E-zyme:- A KEGG web service that computes all possible pathways between two compounds or enzymes at a given threshold.
  5. KEGG_pathcomp
  6. Cytoscape:- Any numerical value can be uploaded to Cytoscape as a node or edge attribute file. Usable identifiers include name, KEGG ID, CAS number, ChEBI ID, PubChem ID, EC number, BioPAX ID, etc.
  7. Vanted :- Loads the experimental data onto the KEGG pathways using KEGG IDs. It also provides some nice statistics tools that can be run on the network.
  8. Pathway Tools :- Yet to explore.
  9. ARACNE :- Waiting for a helping hand.
  10. MetaNetwork
  11. geWorkbench:- A big platform for the integration of large-scale biological datasets. The ARACNE algorithm is integrated into it.
  12. Systems Biology Workbench:- It needs an SBML file, which is then analyzed with its powerful simulation tools.
  13. CellDesigner:- One of the most widely used tools for creating and analyzing biochemical models. It requires SBML files, which can be downloaded from the MetaCrop database.
  14. COPASI:- Complex Pathway Simulator. Requires SBML format (available from the MetaCrop database). It has good simulation power.
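Several of the tools above expect an identifier plus a numerical value per metabolite. The sketch below shows one way such a table might be prepared: it computes log2 treated/control ratios keyed by ID and labels each metabolite as up, down, or unchanged. The column names, the threshold of 1.0, and the function names are assumptions made for illustration. Each tool has its own import format, so the output would need adapting.

```python
import math


def ratio_table(treated, control, threshold=1.0):
    """Return (id, log2_ratio, label) rows for IDs present in both inputs.

    `treated` and `control` map an identifier to a positive intensity.
    `threshold` is an arbitrary cut-off on |log2 ratio| (assumption).
    """
    rows = []
    for cid in sorted(treated.keys() & control.keys()):
        t, c = treated[cid], control[cid]
        if t <= 0 or c <= 0:
            continue  # log ratio is undefined for non-positive values
        log_ratio = math.log2(t / c)
        if log_ratio >= threshold:
            label = "up"
        elif log_ratio <= -threshold:
            label = "down"
        else:
            label = "unchanged"
        rows.append((cid, log_ratio, label))
    return rows


def to_tsv(rows):
    """Render rows as a tab-delimited table with a header line."""
    out = ["ID\tlog2_ratio\tregulation"]
    out += [f"{cid}\t{lr:.3f}\t{label}" for cid, lr, label in rows]
    return "\n".join(out)


if __name__ == "__main__":
    # Invented intensities; KEGG-style IDs are placeholders only.
    treated = {"C00031": 8.0, "C00092": 1.0, "C00085": 3.0}
    control = {"C00031": 2.0, "C00092": 4.0, "C00085": 3.0}
    print(to_tsv(ratio_table(treated, control)))
```

Keeping the identifier in the first column mirrors the advice above to use KEGG IDs wherever possible, since the ID is what the mapping tools match against their pathway nodes.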

5. Candidate Identification:-

Through statistical treatment with supervised and unsupervised methods, correlation analysis, likelihood, relevance networks, clustering, etc., we can identify candidate metabolites for further exploration of their role in biology.
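As one hedged example of the correlation approach, the sketch below builds a simple correlation-thresholded network in the spirit of relevance networks: every pair of metabolite profiles is compared by Pearson correlation, and pairs whose absolute correlation reaches a cut-off become edges. The cut-off of 0.9, the profile values, and the function names are assumptions for illustration. This is not the implementation used by any tool listed here.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def relevance_network(profiles, cutoff=0.9):
    """Return (a, b, r) edges for metabolite pairs with |r| >= cutoff.

    `profiles` maps a metabolite name to its values across samples.
    """
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if abs(r) >= cutoff:
            edges.append((a, b, r))
    return edges


if __name__ == "__main__":
    # Invented profiles across four samples (placeholder names).
    profiles = {
        "m1": [1, 2, 3, 4],
        "m2": [2, 4, 6, 8],
        "m3": [4, 3, 2, 1],
        "m4": [1, -1, 1, -1],
    }
    for a, b, r in relevance_network(profiles):
        print(f"{a}\t{b}\t{r:.2f}")
```

Highly connected nodes in such a network are natural candidates for follow-up. With few samples, though, high correlations arise easily by chance, so the cut-off deserves care.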

  1. Statistica:- Excellent statistical software for univariate and multivariate treatment of data. Comprehensive help files with good examples.
  2. TIGR MultiExperiment Viewer : An excellent open-source statistical tool from TIGR. Highly recommended.
  3. Likelynet.
  4. Pajek.
  5. r-project
These are the main tools for filtering out the candidates. Once we have the candidates, we need to look for the reason why they show a particular behavior in the metabolomics data. Our first approach should be literature mining with existing tools.

6. Literature Mining.

Due to the explosion of biological information in the past few years, it is really hard to read every paper to trace the relevant information. The lack of a strict vocabulary and phrasing makes the situation more complex (ontological efforts to remedy this are appreciable). We therefore need computer tools that can correlate two or more keywords and search for any relationship between them using their built-in vocabulary library. Our target is to get all the information about our candidates from existing published work.

  1. EBIMed:- A good search tool based on MEDLINE. The output includes GO annotations, UniProt IDs, and much more data in tabular form. Try searching for "Oleic Acid" on EBIMed.
  2. Pubmatrix.
  3. Pathbinder
  4. Pubmed Assistant
  5. Agilent Literature Search.
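The keyword-correlation idea behind these tools can be sketched very roughly as counting how often two keywords appear in the same abstract. The example below does only that. It uses no controlled vocabulary or synonym handling, the abstracts are invented placeholder sentences, and it does not represent how any of the listed tools work internally.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence(abstracts, keywords):
    """Count abstracts in which each pair of keywords appears together.

    Matching is case-insensitive and on whole words/phrases only.
    Returns a Counter keyed by alphabetically ordered keyword pairs.
    """
    patterns = {
        k: re.compile(r"\b" + re.escape(k) + r"\b", re.IGNORECASE)
        for k in keywords
    }
    counts = Counter()
    for text in abstracts:
        present = sorted(k for k, p in patterns.items() if p.search(text))
        counts.update(combinations(present, 2))
    return counts


if __name__ == "__main__":
    # Invented sentences standing in for abstracts (not real publications).
    abstracts = [
        "Oleic acid levels rise under cold stress.",
        "Cold stress alters hexokinase expression.",
        "Oleic acid and hexokinase were both measured under cold stress.",
    ]
    keywords = ["oleic acid", "hexokinase", "cold stress"]
    for pair, n in cooccurrence(abstracts, keywords).most_common():
        print(pair, n)
```

The lack of a strict vocabulary noted above is exactly what limits this naive approach: a synonym or abbreviation of a keyword would be missed unless it is added to the keyword list by hand.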

References.

  1. Cytoscape :- Analyzing and Visualizing Network Data
  2. MetaNetwork: A computational protocol for the genetic study of metabolic networks.
  3. Kyoto Encyclopedia of Genes and Genomes.
  4. Reactome :- A knowledgebase of Biological Pathways.
  5. SABIO-RK Reaction Kinetics Database :- SBML Download
  6. Pajek
  7. Vanted
  8. Systems Biology Markup Language
  9. Biopax
  10. CellDesigner
  11. VisANT :- An integrative platform for network analysis
  12. Biocyc
  13. MetnetDb
  14. KnowledgePathwayBase
  15. Biomodel
  16. Pathways Interaction Database NCI 
  17. KEGG2SBML
  18. KEGG2BioPax
  19. SBML2BioPax
  20. KeggArray
  21. PathwayTools
  22. r-project
  23. Multi-Experimental Viewer
  24. Algorithm for the Reconstruction of Accurate Cellular Networks 
  25. Systems Biology WorkBench

Important Articles :-


  1. "Metabolomics Networks" on Google Scholar
  2. Understanding Biological Function Through Molecular Networks
  3. Integrated Genomic and Proteomic Analyses of a Systematically Perturbed Metabolic Network
  4. Metabolic network structure determines key aspects of functionality and regulation
  5. The large-scale organization of metabolic networks
  6. Principles of transcriptional control in the metabolic network of Saccharomyces cerevisiae
  7. Getting connected: analysis and principles of biological networks
  8. Current progress in network research: toward reference networks for key model organisms
