This page is directly from the PyMood "Description" page on allometra.com.
PyMood is a desktop application for visualizing genomic data. It is used to analyze and display relations
between complete genomes, genomic fragments, proteins, ESTs, full length cDNAs, and virtually any sets of
sequences. The program is also used to visualize and analyze expression data.
The two most common applications of PyMood are visualization of sequence similarities between a query set of
sequences and up to three other gene sets, and visualization of expression data from up to three samples. The
visualization principle in PyMood is coloring and positioning objects (genes, proteins, etc.) in space according
to their data values. PyMood uses additive color mixing of red, green, and blue.
Every dataset has one primary color assigned to it, and every data point gets a color saturation degree
according to its data value. The objects are positioned in space according to their three coordinates (Red - x,
Green - y, Blue - z). When sequence similarities are displayed, the PyMood pipeline uses parsed data from the
BLAST outputs produced by the PyMood BLAST Launcher as the spatial coordinates. An example of visualization output
is presented in Figure 1.
Here, PyMood displays BLAST outputs from the comparison of Yersinia pestis with E. coli K12 (the
red database), Salmonella typhi (the green database), and Salmonella typhimurium (the blue
database) in the 3D Viewer. Every Yersinia pestis gene has three coordinates according to its similarity
to the best sequence hits in the other three species. The darkest objects have coordinates (0,0,0), and in this
particular experiment, they are Yersinia pestis proteins that lack homologs in E. coli, S. typhi,
and S. typhimurium. The white objects have coordinates (100,100,100) and represent the most
conserved genes in all four genomes. The 3D output in PyMood can be rotated (Figure 2a), zoomed in and out
(Figure 2b), shifted, and queried (Figure 3).
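The coloring and positioning scheme described above can be sketched in a few lines of Python. This is only an illustration, not PyMood's own code: the function name `similarity_to_rgb`, the `max_score` parameter, and the clamping of out-of-range values are assumptions, and the 0 to 100 scale is taken from the (0,0,0) and (100,100,100) coordinates in the example.

```python
def similarity_to_rgb(red_score, green_score, blue_score, max_score=100.0):
    """Map three data values to spatial coordinates and an 8-bit RGB color.

    Hypothetical helper: the 0..max_score scale and the clamping are assumptions.
    """
    scores = (red_score, green_score, blue_score)
    # Clamp each value into [0, max_score]; these are the (x, y, z) coordinates.
    coords = tuple(min(max(float(s), 0.0), max_score) for s in scores)
    # Scale each coordinate to the 0-255 range of an 8-bit color channel.
    rgb = tuple(round(c / max_score * 255) for c in coords)
    return coords, rgb


# A gene with no hit in the red database and full-strength hits in the
# green and blue databases lands on the cyan corner of the cube.
coords, rgb = similarity_to_rgb(0, 100, 100)
print(coords, "#%02x%02x%02x" % rgb)
```

With this mapping, an object at the origin is black and an object at (100,100,100) is white, matching the description of the darkest and the white objects.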
The genes can be selected using three different options that can be combined with each other:
"Selection by Range", "Selection by Equality", and "Selection by Search (gene description)" (Figure 3).
The produced outputs allow visual presentation of patterns in the database, so the user can form the queries
based on observed patterns. For example, if someone is interested in retrieving the Yersinia pestis
genes evolving at the same rate in all four bacteria (the genes having equally distant homologs in the other
three species), the user simply selects the option R=G=B and enters an offset value in the "Selection by
Equality" box. The program then selects all the genes positioned on the main gray meridian, which runs from the
darkest element to the whitest elements (Figure 4).
If an expression dataset were presented in this picture, the genes selected by R(ed)=G(reen)=B(lue) would be the
genes equally expressed in all three conditions, i.e., housekeeping genes.
Similarly, by selecting the option B=G, R=0, the user will select the cyan diagonal on the bottom panel (Figure
1). These genes are Yersinia pestis genes with equally distant homologs in S. typhi and in S.
typhimurium, but without homologs in E. coli K12. In this particular example, they are good
candidates for genes responsible for pathogenicity.
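The two equality selections described above (R=G=B, and B=G with R=0) can be approximated with a simple filter. The sketch below is an assumption about how such a query might work, not a description of PyMood's implementation: the function `select_by_equality`, its parameters, the way the offset is applied as a tolerance, and the gene IDs and coordinates are all hypothetical.

```python
def select_by_equality(genes, equal=("R", "G", "B"), zero=(), offset=5.0):
    """Return IDs of genes whose `equal` channels agree within `offset`
    and whose `zero` channels lie within `offset` of 0.

    `genes` maps a gene ID to its (R, G, B) coordinates.
    """
    index = {"R": 0, "G": 1, "B": 2}
    selected = []
    for gene_id, coords in genes.items():
        equal_values = [coords[index[c]] for c in equal]
        zero_values = [coords[index[c]] for c in zero]
        # Channels count as "equal" when their spread does not exceed the offset.
        channels_agree = max(equal_values) - min(equal_values) <= offset
        zeros_hold = all(abs(v) <= offset for v in zero_values)
        if channels_agree and zeros_hold:
            selected.append(gene_id)
    return selected


# Hypothetical coordinates, for illustration only.
genes = {
    "geneA": (0, 0, 0),      # no homolog in any database
    "geneB": (98, 100, 97),  # near the gray meridian
    "geneC": (2, 80, 83),    # near the cyan diagonal
    "geneD": (90, 10, 40),   # none of the above
}
on_gray = select_by_equality(genes)                               # R=G=B
on_cyan = select_by_equality(genes, equal=("G", "B"), zero=("R",))  # B=G, R=0
print(on_gray, on_cyan)
```

Note that under this reading a gene at the origin satisfies both queries, so a real query for the cyan diagonal might also be combined with a range selection to exclude genes that have no homologs at all.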
Clicking on the "Annotation" button retrieves the annotation data corresponding to the selected elements. The
Annotation window (Figure 5) displays the IDs and Annotations of the selected genes and their best BLAST hits
from up to three target databases, and their similarity scores (expectation values or percentage of identity).
The IDs are linked to a database of the user's choice (PyMood has a built-in browser). The raw BLAST output
data can also be retrieved from the Annotation window.
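To make the notion of a "best BLAST hit" with its similarity score concrete, here is a minimal sketch that picks the best hit per query from standard 12-column tabular BLAST output. The format used by PyMood's built-in parser is not described on this page, so the input format, the choice of bit score as the ranking criterion, the function name `best_hits`, and the sample rows are all assumptions.

```python
import csv
import io


def best_hits(tabular_blast):
    """Return {query_id: (subject_id, percent_identity, e_value)}, keeping
    for each query the hit with the highest bit score.

    Assumes the standard 12 tab-separated columns: query, subject,
    % identity, length, mismatches, gap opens, q.start, q.end,
    s.start, s.end, e-value, bit score.
    """
    best = {}
    best_bits = {}
    for row in csv.reader(tabular_blast, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        identity, e_value, bits = float(row[2]), float(row[10]), float(row[11])
        if query not in best or bits > best_bits[query]:
            best[query] = (subject, identity, e_value)
            best_bits[query] = bits
    return best


# Made-up rows, for illustration only.
sample = io.StringIO(
    "q1\ts1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380\n"
    "q1\ts2\t75.0\t180\t45\t0\t1\t180\t5\t184\t2e-40\t150\n"
    "q2\ts9\t60.2\t120\t48\t1\t3\t122\t10\t129\t3e-12\t62.0\n"
)
hits = best_hits(sample)
for query, (subject, identity, e_value) in hits.items():
    print(query, subject, identity, e_value)
```

Running this once per target database would give each query gene up to three best-hit scores, which is the kind of triple the 3D view places on the red, green, and blue axes.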
Another example of 3D visualization using PyMood is presented in Figure 6. Here, Arabidopsis cDNAs are sorted
in space according to their levels of expression in three growth conditions as detected by Affymetrix GeneChip.
This visualization allowed effortless querying for groups of genes showing distinct expression patterns. This
dataset produced clear patterns with well separated groups of genes. The groups can be easily retrieved using
PyMood selection widgets.
The 2D Visualization capabilities of PyMood include:
- generation of a Canvas with the colored elements (genes, proteins, DNA, or peptide sequences). PyMood
elements and the Canvas can be displayed in a variety of shapes and sizes. Their color will reflect the
associated data values.
- 2D Plot with the elements sorted in 2D according to their color/data values.
- automated generation of web pages with Canvas and 2D Plot images where every element is clickable and linked
to a database of the user's choice (Figure 7).
The other major part of the PyMood package is the PyMood BLAST Launcher (Figure 8), which allows:
- running BLAST programs in a user-friendly environment.
- manipulations with FastA files: combining, splitting files and sequences, creation of non-redundant sets.
- post-processing of BLAST outputs (BLAST parser is built into the PyMood BLAST Launcher).
- automated creation of browsable web tables from parsed BLAST data outputs.
- clustering, or finding connected (homologous) groups of genes.
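The last item, finding connected groups of genes, corresponds to computing connected components of a graph whose edges are sequence hits. The page does not say which algorithm the BLAST Launcher uses; the sketch below uses a generic union-find approach, and the function name `connected_groups` and the gene IDs are hypothetical.

```python
def connected_groups(hit_pairs):
    """Group gene IDs into connected components.

    `hit_pairs` is an iterable of (gene, gene) pairs, one per hit that is
    considered significant. Returns a sorted list of sorted ID lists.
    """
    parent = {}

    def find(gene):
        # Register unseen genes as their own root, then walk up to the root.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for a, b in hit_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for gene in parent:
        groups.setdefault(find(gene), set()).add(gene)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical hits: a-b and b-c link three genes; d-e form a second group.
clusters = connected_groups([("a", "b"), ("b", "c"), ("d", "e")])
print(clusters)
```

Because membership is transitive, genes "a" and "c" end up in the same group even though they have no direct hit to each other.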
The PyMood package also includes the PyMood Help Browser, and an upgrade tool.
PyMood is cross-platform and works on Windows 2000/XP, Mac OS X 10.2/10.3, and