Reading and using the pathway map:
Build your own model from text
This website implements an interactive pathway map (IPM) built using INDRA, an automated model assembly system for molecular biology. The goal of INDRA-IPM is to allow users to build, contextualize, and share biological pathway models by describing them in natural language.
The visualization aims to display pathways in a visual style similar to that used by biologists in textbooks and presentations. In addition, it offers a layer of contextualization and an interactive user interface.
We start off displaying a pre-built model which demonstrates all of these features. The RAS Pathway Map model was drawn by Dr. Frank McCormick in collaboration with the NCI RAS Initiative community.
Users can define their own biological models in text under the “Build” tab. Here, we start off with the text necessary to build the NCI RAS Pathway Map as an example. A full list of the mechanistic relationships that can be represented by INDRA (and therefore INDRA-IPM) can be found in the software documentation, and examples of models described in natural language (processed via the TRIPS system and assembled by INDRA) can be found in Gyori, Bachman, et al. (2017).
Users should note that the natural language processing systems are fairly robust but not without limitations. Proper grammar and punctuation should be used. The reading systems do not consider newlines to be sentence separators and may return erroneous output for sentences which are not terminated with a period.
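Since newlines are not treated as sentence separators, text written one statement per line can be tidied before pasting it into the “Build” tab. The following is a minimal sketch, assuming each non-empty line holds exactly one sentence; the function name `prepare_model_text` is hypothetical and not part of INDRA or the IPM.

```python
def prepare_model_text(raw: str) -> str:
    """Join one-sentence-per-line text into period-terminated sentences.

    Assumption: every non-empty line is a single complete sentence.
    """
    sentences = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if not line.endswith("."):
            line += "."  # the readers expect a terminating period
        sentences.append(line)
    return " ".join(sentences)


if __name__ == "__main__":
    print(prepare_model_text("EGFR binds EGF\nBRAF phosphorylates MAP2K1."))
```

This only repairs missing periods and line breaks; grammar and entity naming still need to be checked by hand.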
The recognition and grounding of named entities (proteins, etc.) to database identifiers is done automatically. Nevertheless, using standardized names such as HGNC symbols (as opposed to informal synonyms) is preferred to avoid ambiguity. To normalize node names in the pathway map, the IPM performs name standardization, in which entities mentioned by their synonyms are normalized to standard names such as HGNC symbols (for instance, MEK1, Map2k1 and Mek1 are all normalized to the standard symbol MAP2K1). Note that by clicking on a node, a tooltip opens that allows linking out to databases (HGNC, UniProt, CiteAb), and checking the original text that the standardized node was created from.
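As a rough illustration of what name standardization does, the toy lookup below maps the synonyms mentioned above to one standard symbol. It is a sketch only: the table `SYNONYMS` and the function `standardize_name` are hypothetical, and the IPM relies on INDRA's automated grounding rather than on a hand-written dictionary like this one.

```python
# Hypothetical, hand-written synonym table for illustration only.
# Keys are lowercased so that "MEK1" and "Mek1" resolve identically.
SYNONYMS = {
    "mek1": "MAP2K1",
    "map2k1": "MAP2K1",
}


def standardize_name(name: str) -> str:
    """Return the standard symbol for a known synonym, else the name unchanged."""
    return SYNONYMS.get(name.lower(), name)


if __name__ == "__main__":
    for mention in ("MEK1", "Map2k1", "Mek1"):
        print(mention, "->", standardize_name(mention))
```

All three mentions collapse to a single node name, which is why the map shows one MAP2K1 node regardless of how the protein was written in the text.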
INDRA-IPM also recognizes protein families and complexes and grounds them in the FamPlex ontology. In some cases, there is ambiguity in the name of a specific gene and a family it is part of. An example of this is the grounding of “JUN” from text to the JUN family, which also includes the JUN gene. In this case the user can use a synonym such as “c-JUN” that refers to the singular entity in order to reference only the gene and not the family.
We have exposed two reading systems to users. The REACH reader developed by the CLU Lab at the University of Arizona is an information extraction system for the biomedical domain, which aims to read scientific literature and extract cancer signaling pathways. We recommend users try REACH first due to its speed. The TRIPS/DRUM system developed by IHMC may offer greater mechanistic detail in some use cases (for instance, it supports recognizing complex molecular conditions such as “BRAF-V600E not bound to Vemurafenib”), but it takes significantly longer to run.
Users can project data from the Cancer Cell Line Encyclopedia (CCLE) onto their pathway maps. This is done automatically when the IPM is first loaded (using the LOXIMVI skin cancer cell line), and the context can be changed to any other CCLE cell line in the Model Options dialogue panel. Wild-type genes are colored green, while mutated genes are colored orange. Color intensity indicates the relative level of expression. Context is unavailable for gray nodes because they were not measured in CCLE.
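The coloring rules can be summarized in a few lines of code. This is a sketch of the rules as described here, not the IPM's actual rendering code: the function `node_style` is hypothetical, and it assumes expression has already been scaled to a relative value between 0 and 1.

```python
from typing import Optional, Tuple


def node_style(
    mutated: Optional[bool], relative_expression: Optional[float]
) -> Tuple[str, Optional[float]]:
    """Return (hue, intensity) for a node given its cell line context.

    Assumptions: `None` means the gene was not measured, and
    `relative_expression` is a relative level scaled to the range 0..1.
    """
    if mutated is None or relative_expression is None:
        return ("gray", None)  # no context available for this node
    hue = "orange" if mutated else "green"
    # Clamp so that out-of-range inputs still give a valid intensity.
    intensity = min(max(relative_expression, 0.0), 1.0)
    return (hue, intensity)


if __name__ == "__main__":
    print(node_style(False, 0.8))
    print(node_style(True, 0.3))
    print(node_style(None, None))
```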
Users can share models using the NDEx network sharing website. To upload the current model, click the “NDEX” button at the bottom of the interface, then click “Upload”. A link to NDEx will appear once the upload is complete.
One can load a model by entering the unique key at the end of this link (e.g., 9b901d8f-2e2d-11e9-9f06-0ac135e8bacf) into the Load field. Alternatively, one can share the link in the address bar (e.g., pathwaymap.indra.bio/?uuid=9b901d8f-2e2d-11e9-9f06-0ac135e8bacf) which will send a user to the IPM website and immediately load the shared model. Shared models preserve their text description, INDRA statements, graph layout, cell line context, and any evidence retrieved from INDRA DB.
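For readers who handle shared links programmatically, the unique key can be pulled out of either kind of link with the Python standard library. This is a minimal sketch: the function `model_key_from_link` is hypothetical, and it assumes the key is either the `uuid` query parameter (as in the address-bar link) or the last path segment of the link.

```python
import uuid
from typing import Optional
from urllib.parse import parse_qs, urlparse


def model_key_from_link(link: str) -> Optional[str]:
    """Extract the model key from a shared link, or return None if absent.

    Assumption: the key is the `uuid` query parameter if present,
    otherwise the text after the last "/" in the link.
    """
    values = parse_qs(urlparse(link).query).get("uuid")
    candidate = values[0] if values else link.rstrip("/").rsplit("/", 1)[-1]
    try:
        # Validate and normalize the candidate as a UUID.
        return str(uuid.UUID(candidate))
    except ValueError:
        return None


if __name__ == "__main__":
    link = "pathwaymap.indra.bio/?uuid=9b901d8f-2e2d-11e9-9f06-0ac135e8bacf"
    print(model_key_from_link(link))
```

The returned string is what would be entered into the Load field.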
Users can export models in a variety of formats.
In order to simplify the user interface, only PNG export is available on mobile devices with limited screen width.
This work was funded by ARO Grants W911NF‐14‐1‐0397 and W911NF‐15‐1‐0544 under the DARPA Big Mechanism and Communicating with Computers programs, and by NIGMS Grant P50GM107618.