What is OrthoDisease?
How is OrthoDisease constructed?
What are the advantages of Inparanoid analysis?
How do I find orthologs to genes associated with my disease?
How do I find orthologs to my genes/proteins?
What is free text search?
What is "OMIM gene map search"?
What is table generator?
What organisms are present in OrthoDisease?
What is "bootstrap % value"?
What is OrthoDisease?
Over 1000 human genes have been associated with a disease phenotype to date, and this number continues to rise on a
daily basis. Studying their orthologous counterparts in model organisms helps in understanding the role these genes play
both in normal and pathological situations. A confounding factor in this process is that cross-species comparisons often
identify genes which, although highly similar, are not true orthologs and may in fact be functionally dissimilar. To
resolve this we plan to identify truly orthologous genes in several model organisms, initially in the worm C. elegans,
the fly D. melanogaster, the mustard weed A. thaliana and the yeast S. cerevisiae. We will collect and curate a set of genes
that are associated with diseases by extracting and filtering excerpts from the Online Mendelian Inheritance in Man. The
sequences of the resulting genes will then be analyzed and scored for orthology using methods previously developed at the
Center for Genomics and Bioinformatics, primarily the Inparanoid algorithm. This will yield
assignments of orthologs in model organisms, at different levels of confidence, forming the database
OrthoDisease.
BACK
How is OrthoDisease constructed?
OrthoDisease is constructed primarily using Inparanoid analysis. Inparanoid is a program that automatically detects orthologs
(or groups of orthologs) between two species. Pairwise comparisons between several model organisms are shown here. The species
data come mainly from the Ensembl resource, as well as from several submitted genomes. The program itself and its accessory
programs can be downloaded from the Inparanoid website. The algorithm is based on pairwise similarity scores, which
are by default calculated with the NCBI BLAST program. Inparanoid detects best-best hits between sequences from two different species.
These are the two seed ORTHOLOGS that form an orthologous group. Other sequences are added to this group if they are closely related
to one of the main orthologs. These additional members of the orthologous group are called IN-PARALOGS. A confidence value is provided for
each in-paralog that shows how closely related it is to the main ortholog. Genes/proteins are derived from the Online Mendelian Inheritance
in Man (OMIM) database, both from the "Morbid map", which contains genes directly linked to specific diseases, and from the "Genemap",
which contains genes with a known cytogenetic location that have been mentioned in OMIM (but not necessarily in morbidmap).
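The seeding and in-paralog steps described above can be illustrated with a minimal Python sketch. It is not the Inparanoid program itself: the function names, the toy sequence identifiers and the scores are all hypothetical, and the confidence formula (how far an in-paralog's score lies between the seed-pair score and the seed's self-score) is an assumption chosen so that the main ortholog would score 1.0.

```python
from typing import Dict, List, Optional, Tuple

Scores = Dict[Tuple[str, str], float]  # (query, target) -> similarity score

def best_hit(query: str, scores: Scores) -> Optional[str]:
    """Return the highest-scoring target for `query`, or None if it has no hits."""
    hits = {t: s for (q, t), s in scores.items() if q == query}
    return max(hits, key=hits.get) if hits else None

def seed_orthologs(a_to_b: Scores, b_to_a: Scores) -> List[Tuple[str, str]]:
    """Find best-best hits: a's best hit is b AND b's best hit is a."""
    pairs = []
    for a in sorted({q for q, _ in a_to_b}):
        b = best_hit(a, a_to_b)
        if b is not None and best_hit(b, b_to_a) == a:
            pairs.append((a, b))
    return pairs

def in_paralogs(seed_a: str, seed_b: str, a_to_b: Scores, within_a: Scores) -> Dict[str, float]:
    """Same-species sequences closer to seed_a than seed_a is to seed_b.

    The confidence value is an assumed formula, not taken from the text above.
    """
    cutoff = a_to_b[(seed_a, seed_b)]
    self_score = within_a[(seed_a, seed_a)]
    if self_score <= cutoff:
        return {}
    found = {}
    for (q, t), s in within_a.items():
        if q == seed_a and t != seed_a and s > cutoff:
            found[t] = (s - cutoff) / (self_score - cutoff)
    return found

# Toy scores (made up): species A = hA*, species B = wB*.
a_to_b = {("hA1", "wB1"): 500, ("hA1", "wB2"): 120,
          ("hA2", "wB1"): 450, ("hA2", "wB2"): 100}
b_to_a = {("wB1", "hA1"): 500, ("wB1", "hA2"): 450,
          ("wB2", "hA1"): 120, ("wB2", "hA2"): 100}
within_a = {("hA1", "hA1"): 900, ("hA1", "hA2"): 700, ("hA1", "hA3"): 200}

seeds = seed_orthologs(a_to_b, b_to_a)
paralogs = in_paralogs("hA1", "wB1", a_to_b, within_a)
print(seeds, paralogs)
```

In this toy data hA2 also hits wB1 strongly, but only hA1 and wB1 are each other's best hit, so they seed the group. hA2 joins as an in-paralog because it is more similar to hA1 than hA1 is to wB1, while hA3 is left out.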
BACK
What are the advantages of Inparanoid analysis?
By definition, orthologs between two species have evolved from one single gene in their last common ancestor. Thus, orthologs
are likely to have the same function in both species. Another way to detect orthologs would be from phylogenetic trees. This is
widely used for single gene families, but tree-based methods are slow and difficult to automate. Moreover, preliminary steps such as
clustering genes into homologous families and creating multiple alignments are needed. The topology of the phylogenetic
tree is also strongly dependent on the choice of tree-building method. Automatic clustering methods based on two-way best genome-wide
matches, on the other hand, have so far not effectively separated in-paralogs from out-paralogs. The problem of in-paralog
clustering is more important when analyzing eukaryotic genomes, because eukaryotic genes form large homologous families that cannot be
classified by simple best-best hit methods.
Inparanoid is a fully automatic method for finding orthologs and in-paralogs between
TWO species. Ortholog clusters in Inparanoid are seeded with a two-way best pairwise match, after which an algorithm for
adding in-paralogs is applied. The method bypasses multiple alignments and phylogenetic trees, which can be slow and error-prone
steps in classical ortholog detection. Still, it robustly detects complex orthologous relationships and assigns confidence values
for in-paralogs.
BACK
How do I find orthologs to genes associated with my disease?
The recommended tool in this case is "disease search". It lets you search the database by disease name. The result
is a link to all proteins associated with that disease. Clicking on a protein symbol will display
the Inparanoid clusters for that protein. If no disease is found, try again using a shorter search term. By default the program
requires all search terms to match ("AND"); this can be relaxed by choosing "OR", which matches any of the terms. Note that only diseases which
have an associated protein accession number are present in this search.
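The difference between the "AND" default and the "OR" option can be sketched as follows. This is a hypothetical illustration, not the site's actual search code: the function names, the case-insensitive substring matching and the sample disease names are all assumptions.

```python
from typing import List

def matches(record: str, query: str, mode: str = "AND") -> bool:
    """Check a record against whitespace-separated search terms.

    Assumes case-insensitive substring matching (not confirmed by the FAQ).
    """
    terms = query.lower().split()
    if not terms:
        return False
    text = record.lower()
    hits = [term in text for term in terms]
    # "AND": every term must be found; "OR": at least one term must be found.
    return all(hits) if mode == "AND" else any(hits)

def search(records: List[str], query: str, mode: str = "AND") -> List[str]:
    """Return the records that match the query in the given mode."""
    return [r for r in records if matches(r, query, mode)]

# Illustrative disease names only; not a real database extract.
diseases = [
    "Breast cancer, early-onset",
    "Colorectal cancer",
    "Muscular dystrophy, Duchenne type",
]

and_hits = search(diseases, "breast cancer")           # both terms required
or_hits = search(diseases, "breast cancer", mode="OR")  # either term suffices
short_hits = search(diseases, "dystroph")               # shorter term, broader match
print(and_hits, or_hits, short_hits)
```

Under the substring assumption, a shorter term such as "dystroph" matches more spellings than a longer one, which is why shortening the search term can help when nothing is found.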
BACK
How do I find orthologs to my genes/proteins?
The recommended tool in this case is "gene search". It lets you search the database by protein name. The result
is a link to all matching proteins associated with a disease. Clicking on a protein symbol will display
the Inparanoid clusters for that protein. If no protein is found, try again using a shorter search term. By default the program
requires all search terms to match ("AND"); this can be relaxed by choosing "OR", which matches any of the terms. Note that only proteins
which have an associated disease in MIM are present in this search. Alternatively, one can use the "OMIM gene map search",
which performs the search in a similar way but on a larger dataset: genes with a known cytogenetic location that have been
mentioned in OMIM (but not necessarily in morbidmap).
BACK
What is free text search?
This is a search tool that one can use to search by, for example, protein name, gene name, disease, EC number or MIM number.
By default the program requires all search terms to match ("AND"); this can be relaxed by choosing "OR", which matches any of the terms.
Note that only proteins which have an associated disease in the morbid map are present in this search.
BACK
What is "OMIM gene map search"?
This is a search tool that one can use to search by, for example, protein name, gene name, disease, EC number or MIM number.
By default the program requires all search terms to match ("AND"); this can be relaxed by choosing "OR", which matches any of the terms.
Note that only proteins which have an associated disease in MIM are present in this search. It is similar to "free text
search" but differs in that the search is performed on a larger dataset: genes with a known cytogenetic location
that have been mentioned in OMIM (but not necessarily in the morbidmap).
BACK
What is "table generator"?
This is a tool which generates a table of orthologs of human disease genes for a selected organism. If you require the full Inparanoid cluster data for that organism, it can be downloaded here.
BACK
What organisms are present in OrthoDisease?
New genomes are added regularly; at the time of writing there are 25, and the number is rising. Visit the Inparanoid site to find out which are currently available.
BACK
What is "bootstrap % value"?
In an Inparanoid cluster, each cluster member receives a score between zero and one. The higher the score, the closer the similarity
to the main orthologs (which always have a score of 1.0). The alignment of the main ortholog pair is scored for reliability using bootstrapping.
A value of 100% indicates the most reliable (most likely) assignment of the main ortholog pair.
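As a rough illustration of how a bootstrap percentage can be obtained, the sketch below resamples alignment columns with replacement and counts how often the main ortholog pair still outscores its best competing hit. This is a generic bootstrap under stated assumptions, not necessarily the procedure Inparanoid uses: the function name, the per-column score inputs and the choice of a single rival sequence are all hypothetical.

```python
import random
from typing import Sequence

def bootstrap_support(ortholog_cols: Sequence[float],
                      rival_cols: Sequence[float],
                      replicates: int = 1000,
                      seed: int = 0) -> float:
    """Percentage of resampled alignments in which the ortholog pair wins.

    ortholog_cols / rival_cols: hypothetical per-column alignment scores of the
    query against its main ortholog and against the next-best candidate.
    """
    if not ortholog_cols or len(ortholog_cols) != len(rival_cols):
        raise ValueError("score lists must be non-empty and of equal length")
    rng = random.Random(seed)  # fixed seed so repeated runs give the same value
    n = len(ortholog_cols)
    wins = 0
    for _ in range(replicates):
        # Draw n column indices with replacement (one bootstrap replicate).
        idx = [rng.randrange(n) for _ in range(n)]
        if sum(ortholog_cols[i] for i in idx) > sum(rival_cols[i] for i in idx):
            wins += 1
    return 100.0 * wins / replicates

# Made-up column scores: the ortholog wins in every column, so no resampling
# can overturn the assignment.
clear_case = bootstrap_support([5, 4, 6, 5], [1, 2, 1, 0])
print(clear_case)
```

When the two candidates score similarly across columns, some replicates favour the rival and the percentage drops below 100, signalling a less certain main ortholog assignment.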
BACK