Heterogeneous Biological Network

In the life sciences, biological entities rarely exist in isolation. They connect and interact with one another to form a huge biological network. Given the availability of such network data and the steady rise of network mining algorithms, we collected data and constructed a heterogeneous biological network centered on genes and diseases to support further data analysis and algorithm development.

An Example of a Heterogeneous Biological Network Focused on Breast Cancer

An example of a heterogeneous biological network centered on genes and diseases.
(Black nodes represent genes and red nodes represent diseases. Genes and diseases can be related through other biological entities, such as chemicals, pathways, and mutations.)

Pipeline: The Construction of the Network

We collected node-related biological networks from 7 databases. Together, these networks contain 163,024 nodes and 25,265,607 edges. As shown in the table below, there are 6 types of node: 27,165 gene nodes from NCBI; 2,665 disease nodes from Disease-Ontology; 15,076 chemical nodes and 2,363 pathway nodes from CTD; 108,023 mutation nodes from DisGeNET, the largest group; and 7,732 phenotype nodes from HPO.

Data Format and Full Data Downloading

In these biological networks, chemical, mutation, pathway, and phenotype nodes are each associated with gene nodes and with disease nodes. There are also gene-gene, disease-disease, and gene-disease associations. Hence, these associations comprise 11 types of edge: disease-chemical, gene-chemical, disease-pathway, gene-pathway, and gene-disease edges from CTD; disease-mutation and gene-mutation edges from DisGeNET; disease-phenotype and gene-phenotype edges from HPO; gene-gene edges from BioGRID; and disease-disease edges from BioSNAP.
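The 11 edge types listed above can be written down as a small schema, which makes it easy to see which entity types link genes to diseases indirectly. The following is a minimal sketch, not part of the released data: the names EDGE_TYPES, neighbor_types, and bridging_types are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Edge types as listed in the text: (node type, node type) -> source database.
# The dictionary name and layout are illustrative assumptions.
EDGE_TYPES = {
    ("disease", "chemical"): "CTD",
    ("gene", "chemical"): "CTD",
    ("disease", "pathway"): "CTD",
    ("gene", "pathway"): "CTD",
    ("gene", "disease"): "CTD",
    ("disease", "mutation"): "DisGeNET",
    ("gene", "mutation"): "DisGeNET",
    ("disease", "phenotype"): "HPO",
    ("gene", "phenotype"): "HPO",
    ("gene", "gene"): "BioGRID",
    ("disease", "disease"): "BioSNAP",
}


def neighbor_types(edge_types):
    """Map each node type to the set of node types it can share an edge with."""
    adjacent = defaultdict(set)
    for a, b in edge_types:
        adjacent[a].add(b)
        adjacent[b].add(a)
    return adjacent


def bridging_types(edge_types, left="gene", right="disease"):
    """Node types, other than the endpoints, that link to both left and right."""
    adjacent = neighbor_types(edge_types)
    return (adjacent[left] & adjacent[right]) - {left, right}


print(len(EDGE_TYPES))                     # number of edge types
print(sorted(bridging_types(EDGE_TYPES)))  # entity types bridging genes and diseases
```

Under this schema, the bridging types are chemical, mutation, pathway, and phenotype, so a gene and a disease with no direct edge can still be connected by a two-step path through any of these four entity types.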

Item                Count      ID Type               Database
Gene                27165      Entrez ID             NCBI
Disease             2665       MESH ID               Disease-Ontology
Chemical            15076      MESH ID               CTD
Mutation            108023     SNPID                 DisGeNET
Pathway             2363       KEGG ID               CTD
Phenotype           7732       HPO ID                HPO
Disease-Chemical    1310249    MESH ID-MESH ID       CTD
Disease-Mutation    54032      MESH ID-SNPID         DisGeNET
Disease-Pathway     150855     MESH ID-KEGG ID       CTD
Disease-Phenotype   1033       MESH ID-HPO ID        HPO
Disease-Disease     1826       MESH ID-MESH ID       BioSNAP
Gene-Chemical       507004     Entrez ID-MESH ID     CTD
Gene-Mutation       116033     Entrez ID-SNPID       DisGeNET
Gene-Pathway        135345     Entrez ID-KEGG ID     CTD
Gene-Phenotype      160931     Entrez ID-HPO ID      HPO.obo
Gene-Gene           379235     Entrez ID-Entrez ID   BioGRID
Gene-Disease        11224532   Entrez ID-MESH ID     CTD
Disease-Gene        11224532   MESH ID-Entrez ID     CTD
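As a quick consistency check, the counts in the table above can be summed and compared with the stated totals of 163,024 nodes and 25,265,607 edges. This is only a sketch: the dictionary names are illustrative, and the numbers are copied from the table by hand rather than read from the downloadable files.

```python
# Node counts per type, copied from the table.
NODE_COUNTS = {
    "Gene": 27165,
    "Disease": 2665,
    "Chemical": 15076,
    "Mutation": 108023,
    "Pathway": 2363,
    "Phenotype": 7732,
}

# Edge counts per table row. Gene-Disease and Disease-Gene are separate rows
# with the same count, i.e. the same association listed in both directions.
EDGE_COUNTS = {
    "Disease-Chemical": 1310249,
    "Disease-Mutation": 54032,
    "Disease-Pathway": 150855,
    "Disease-Phenotype": 1033,
    "Disease-Disease": 1826,
    "Gene-Chemical": 507004,
    "Gene-Mutation": 116033,
    "Gene-Pathway": 135345,
    "Gene-Phenotype": 160931,
    "Gene-Gene": 379235,
    "Gene-Disease": 11224532,
    "Disease-Gene": 11224532,
}

total_nodes = sum(NODE_COUNTS.values())
total_edges = sum(EDGE_COUNTS.values())

# Total when each gene-disease association is counted once instead of twice.
edges_counted_once = total_edges - EDGE_COUNTS["Disease-Gene"]

print(total_nodes)         # 163024
print(total_edges)         # 25265607
print(edges_counted_once)  # 14041075
```

The stated edge total matches only when the gene-disease associations are counted in both directions. Across the 11 edge types, with each gene-disease association counted once, the sum is 14,041,075.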

Contact Us

College of Informatics
Huazhong Agricultural Univ
Wuhan, Hubei 430070
Jingbo Xia, xiajingbo.math@gmail.com
Kaiyin Zhou, zhoukaiyinhzau@gmail.com
Yuxing Wang, yuxingwang.www@gmail.com
