In the life sciences, biological entities rarely exist in isolation: they connect and interact to form large biological networks. Given the growing availability of network data and the rise of network mining algorithms, we collected and constructed a heterogeneous biological network centered on genes and diseases to support further data analysis and algorithm development.
An Example of a Heterogeneous Biological Network Focused on Breast Cancer

(Black nodes represent genes and red nodes represent diseases. Genes and diseases can be linked through other biological entities: chemicals, pathways, and mutations.)
Pipeline: Construction of the Network
We collected node-related biological networks from 7 databases. Together they contain 163,024 nodes and 25,265,607 edges. As shown in the table below, there are 6 types of node: 27,165 gene nodes from NCBI; 2,665 disease nodes from Disease-Ontology; 15,076 chemical nodes and 2,363 pathway nodes from CTD; 108,023 mutation nodes from DisGeNET, the largest group; and 7,732 phenotype nodes from HPO.
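As a quick consistency check, the per-type node counts listed in the table can be tallied against the stated total. The sketch below is illustrative only: the names `NODE_COUNTS`, `total_nodes`, and `largest_type` are hypothetical helpers, not part of the released data.

```python
# Node counts per type, copied from the table in this document.
NODE_COUNTS = {
    "Gene": 27165,
    "Disease": 2665,
    "Chemical": 15076,
    "Mutation": 108023,
    "Pathway": 2363,
    "Phenotype": 7732,
}


def total_nodes(counts):
    """Return the number of nodes summed over all node types."""
    return sum(counts.values())


def largest_type(counts):
    """Return the node type with the most nodes."""
    return max(counts, key=counts.get)


# The six per-type counts should add up to the stated 163,024 nodes.
print(total_nodes(NODE_COUNTS))
print(largest_type(NODE_COUNTS))
```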
Data Format and Full Data Download
In these biological networks, chemical, mutation, pathway, and phenotype nodes are each associated with gene nodes and with disease nodes. There are also gene-gene, disease-disease, and gene-disease associations. Together these associations comprise 11 types of edge: disease-chemical, gene-chemical, disease-pathway, gene-pathway, and gene-disease edges from CTD; disease-mutation and gene-mutation edges from DisGeNET; disease-phenotype and gene-phenotype edges from HPO; gene-gene edges from BioGRID; and disease-disease edges from BioSNAP. The table lists gene-disease edges in both directions (Gene-Disease and Disease-Gene), and the edge total of 25,265,607 counts both rows.
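The edge types described above can be written down as a small schema that maps each (source type, target type) pair to its source database. This is a minimal sketch for illustration; `EDGE_SOURCES` and the two helper functions are hypothetical names, and the released files may organize this information differently.

```python
# Edge schema as described in the text: (source type, target type) -> database.
EDGE_SOURCES = {
    ("Disease", "Chemical"): "CTD",
    ("Gene", "Chemical"): "CTD",
    ("Disease", "Pathway"): "CTD",
    ("Gene", "Pathway"): "CTD",
    ("Gene", "Disease"): "CTD",
    ("Disease", "Mutation"): "DisGeNET",
    ("Gene", "Mutation"): "DisGeNET",
    ("Disease", "Phenotype"): "HPO",
    ("Gene", "Phenotype"): "HPO",
    ("Gene", "Gene"): "BioGRID",
    ("Disease", "Disease"): "BioSNAP",
}


def is_gene_disease_centered(schema):
    """True if every edge type starts at a gene or a disease node."""
    return all(src in {"Gene", "Disease"} for src, _ in schema)


def bridging_types(schema):
    """Node types linked to both genes and diseases, excluding those two."""
    gene_side = {dst for src, dst in schema if src == "Gene"}
    disease_side = {dst for src, dst in schema if src == "Disease"}
    return sorted((gene_side & disease_side) - {"Gene", "Disease"})


print(len(EDGE_SOURCES))
print(is_gene_disease_centered(EDGE_SOURCES))
print(bridging_types(EDGE_SOURCES))
```

The bridging types are the entities through which a gene and a disease can be related indirectly, as in the breast cancer example.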
Items | Counts | ID Type | Database |
---|---|---|---|
Gene | 27165 | Entrez ID | NCBI |
Disease | 2665 | MESH ID | Disease-Ontology |
Chemical | 15076 | MESH ID | CTD |
Mutation | 108023 | SNPID | DisGeNET |
Pathway | 2363 | KEGG ID | CTD |
Phenotype | 7732 | HPO ID | HPO |
Disease-Chemical | 1310249 | MESH ID-MESH ID | CTD |
Disease-Mutation | 54032 | MESH ID-SNPID | DisGeNET |
Disease-Pathway | 150855 | MESH ID-KEGG ID | CTD |
Disease-Phenotype | 1033 | MESH ID-HPO ID | HPO |
Disease-Disease | 1826 | MESH ID-MESH ID | BioSNAP |
Gene-Chemical | 507004 | Entrez ID-MESH ID | CTD |
Gene-Mutation | 116033 | Entrez ID-SNPID | DisGeNET |
Gene-Pathway | 135345 | Entrez ID-KEGG ID | CTD |
Gene-Phenotype | 160931 | Entrez ID-HPO ID | HPO.obo |
Gene-Gene | 379235 | Entrez ID-Entrez ID | BioGRID |
Gene-Disease | 11224532 | Entrez ID-MESH ID | CTD |
Disease-Gene | 11224532 | MESH ID-Entrez ID | CTD |
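Once the edge lists are downloaded, they can be combined into a single typed graph. The sketch below assumes each edge has already been parsed into a `(source type, source ID, target type, target ID)` tuple; the on-disk file format is not specified here, so parsing is left out. The IDs `G1`, `G2`, `D1`, `C1`, and `P1` are placeholders, not real Entrez, MeSH, or KEGG identifiers, and the function names are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected typed adjacency map.

    edges: iterable of (src_type, src_id, dst_type, dst_id) tuples.
    Nodes are keyed as (type, id) so IDs from different vocabularies never collide.
    """
    adj = defaultdict(set)
    for src_type, src_id, dst_type, dst_id in edges:
        a, b = (src_type, src_id), (dst_type, dst_id)
        adj[a].add(b)
        adj[b].add(a)
    return adj


def shared_intermediates(adj, gene, disease):
    """Return non-gene, non-disease nodes adjacent to both the gene and the disease."""
    common = adj.get(gene, set()) & adj.get(disease, set())
    return sorted(n for n in common if n[0] not in ("Gene", "Disease"))


# Toy edges with placeholder IDs, for illustration only.
toy_edges = [
    ("Gene", "G1", "Chemical", "C1"),
    ("Disease", "D1", "Chemical", "C1"),
    ("Gene", "G1", "Pathway", "P1"),
    ("Disease", "D1", "Pathway", "P1"),
    ("Gene", "G1", "Disease", "D1"),
    ("Gene", "G2", "Chemical", "C1"),
]

graph = build_graph(toy_edges)
print(shared_intermediates(graph, ("Gene", "G1"), ("Disease", "D1")))
```

Keying nodes by `(type, id)` matters because chemicals and diseases both use MESH IDs in this dataset, so the type is needed to keep the two node sets apart.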
Contact Us
College of Informatics
Huazhong Agricultural University
Wuhan, Hubei 430070
China
Jingbo Xia, xiajingbo.math@gmail.com
Kaiyin Zhou, zhoukaiyinhzau@gmail.com
Yuxing Wang, yuxingwang.www@gmail.com