This page presents general statistics about the data present in DIDA.
General database statistics
Variant effects
Almost all variants present in DIDA are non-synonymous: 68.41% are missense, 13.74% are frameshift and 8.79% are nonsense. All intronic (2.75%) and silent (0.82%) variants occur in combination with a non-synonymous variant.
Allelic state of the digenic combinations
A digenic combination is composed of four alleles: if two of those four alleles differ from the reference sequence, the digenic combination is considered di-allelic. When there are three or four variant alleles, the combination is classified as tri-allelic or tetra-allelic, respectively. Almost two-thirds (62.44%) of the digenic combinations present in DIDA are di-allelic. Most of the remaining third are tri-allelic, and more than half of these tri-allelic combinations belong to one disease: Bardet-Biedl syndrome. There are also four tetra-allelic examples, where both variants are present in a homozygous state.
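The classification rule above can be sketched in a few lines. This is only an illustration, not DIDA's own code: the function name and its arguments are hypothetical, and it assumes that each of the two genes carries at least one variant allele (otherwise the combination would not be digenic).

```python
# Hypothetical helper illustrating the allelic-state rule described above.
# Assumption: each gene contributes 1 (heterozygous) or 2 (homozygous)
# variant alleles, so the total is always 2, 3 or 4.
STATE_BY_VARIANT_ALLELE_COUNT = {
    2: "di-allelic",
    3: "tri-allelic",
    4: "tetra-allelic",
}


def classify_allelic_state(variant_alleles_gene_a: int, variant_alleles_gene_b: int) -> str:
    """Return the allelic state for the variant allele counts of two genes."""
    for count in (variant_alleles_gene_a, variant_alleles_gene_b):
        if count not in (1, 2):
            raise ValueError("each gene must carry 1 or 2 variant alleles")
    total = variant_alleles_gene_a + variant_alleles_gene_b
    return STATE_BY_VARIANT_ALLELE_COUNT[total]


if __name__ == "__main__":
    print(classify_allelic_state(1, 1))  # heterozygous in both genes
    print(classify_allelic_state(2, 1))  # homozygous in one gene, heterozygous in the other
    print(classify_allelic_state(2, 2))  # homozygous in both genes
```

Under these assumptions, a combination that is heterozygous in both genes is di-allelic, and one that is homozygous in both genes is tetra-allelic.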
Oligogenic effect of the digenic combinations
Each digenic combination was categorized into one of two digenic effect classes: either "on/off", where variants in both genes are required to develop the disease, or "severity", where variants in one gene are enough to develop the disease and carrying variants in both genes increases its severity or affects its age of onset. Overall, 26.29% of digenic combinations belong to the "severity" class, 31.92% are classified as "on/off", and for 41.78% no classification could be derived from the information present in the publication.
Distribution of the relationship between the genes involved in digenic combinations
It has been shown that the genes that cause a digenic disease often have a physical or functional relationship. For each digenic combination, we determined this relationship, focussing on five different relationship types:
- directly interacting: the two genes are annotated as directly interacting in BioGrid, IntAct or ConsensusPathDb.
- indirectly interacting: the two genes share a third gene with which they both directly interact according to BioGrid, IntAct or ConsensusPathDb.
- pathway membership: the two genes have the same pathway annotation in KEGG or Reactome.
- co-expression: the two genes are expressed in the same tissue(s) according to the annotations retrieved from GNF/Atlas.
- similar function: the two genes have one or more conserved functional motifs or conserved domains in common according to Pfam or InterPro.
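As a rough illustration of how these five relationship types could be checked for one gene pair, consider the sketch below. It is not the procedure used to build DIDA: the function names are hypothetical, the gene names are placeholders, and the plain dictionaries stand in for annotations that would in practice come from the resources listed above. The interaction map is assumed to list partners symmetrically.

```python
# Hypothetical sketch: annotations are plain dicts mapping a gene name to a
# set of interaction partners, pathways, tissues or domains.
def shares_annotation(annotations, gene_a, gene_b):
    """True if the two genes have at least one annotation in common."""
    return bool(annotations.get(gene_a, set()) & annotations.get(gene_b, set()))


def relationship_types(gene_a, gene_b, interactions, pathways, tissues, domains):
    """Return the set of relationship types that apply to a gene pair."""
    partners_a = interactions.get(gene_a, set())
    partners_b = interactions.get(gene_b, set())
    found = set()
    if gene_b in partners_a or gene_a in partners_b:
        found.add("directly interacting")
    # A shared third gene: a common partner other than the two genes themselves.
    if (partners_a & partners_b) - {gene_a, gene_b}:
        found.add("indirectly interacting")
    if shares_annotation(pathways, gene_a, gene_b):
        found.add("pathway membership")
    if shares_annotation(tissues, gene_a, gene_b):
        found.add("co-expression")
    if shares_annotation(domains, gene_a, gene_b):
        found.add("similar function")
    return found


if __name__ == "__main__":
    # Placeholder data, not taken from any real database.
    interactions = {
        "GENE_A": {"GENE_C"},
        "GENE_B": {"GENE_C"},
        "GENE_C": {"GENE_A", "GENE_B"},
    }
    pathways = {"GENE_A": {"pathway_1"}, "GENE_B": {"pathway_1", "pathway_2"}}
    print(relationship_types("GENE_A", "GENE_B", interactions, pathways, {}, {}))
```

In this toy example the pair shares the partner GENE_C and the annotation pathway_1, so it falls into two categories at once, which is the "multiple categories" situation counted in the statistics on this page.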
The 213 digenic combinations present in DIDA involve 116 unique gene pairs (54.46%). Twenty-one of those pairs (18.10%) cannot be assigned to any of the five relationship categories, 30 (25.86%) show one type of relationship and more than half (65 or 56.03%) belong to multiple categories.
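The percentages in this paragraph follow directly from the counts. The short check below recomputes them from the figures quoted on this page; the variable and function names are chosen here for illustration only.

```python
# Recompute the gene-pair percentages from the counts quoted in the text.
def percentage(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


total_combinations = 213
unique_gene_pairs = 116
pairs_by_relationship_count = {"none": 21, "one": 30, "multiple": 65}

# The three groups should account for every unique gene pair.
assert sum(pairs_by_relationship_count.values()) == unique_gene_pairs

if __name__ == "__main__":
    print("unique pairs:", percentage(unique_gene_pairs, total_combinations))
    for label, count in pairs_by_relationship_count.items():
        print(label, percentage(count, unique_gene_pairs))
```

Note that the first percentage is taken relative to the 213 combinations, while the other three are relative to the 116 unique gene pairs.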
Disease statistics
Diseases with at least two digenic combinations are shown on the graph
The most represented digenic disease in DIDA is Bardet-Biedl syndrome, with about 20% of digenic combinations (43 cases), 20% (76) of variants and 9% (12) of genes mapped to this disorder. The majority (29 out of 44 or 66%) of diseases in DIDA are represented by one or two digenic combinations.