This project uses NLP techniques to automatically mine the plant traits associated with rice genes from the literature, with the aim of advancing knowledge discovery in plant cultivation and molecular breeding. The main work of this research includes:
◆ Developed the Rice Trait Ontology (RTO). (Details in 1.1 and 1.2.)
◆ Developed encRiceTrait, a Python tool for the rice trait ontology. The traits are normalized with RTO1.0. (Details in 1.1.)
◆ Developed an unsupervised Gene-to-trait association extraction (GTAE) Pipeline. (Mentioned in 2.2)
◆ Developing a gold-standard corpus for rice traits (ongoing). (Mentioned in 2.4)
1. Tools and Resources
❏ 1.1 enrRiceTrait, an ontology-based tool for rice trait enrichment
(Video released in April 2021.) Alternatively, watch the video here.
❏ 1.2 Rice Trait Ontology Development
We released RTO (format version 1.1, referenced ontologies: TO, WTO). Download the RTO obo file here. (12/23/2020)
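For readers who want to inspect the obo file programmatically, the following is a minimal sketch of reading `[Term]` stanzas with the Python standard library. It relies only on the general OBO layout (a `[Term]` header followed by `tag: value` lines). The function names `parse_obo_terms` and `build_name_index`, the `EX:` identifiers, and the sample terms are hypothetical placeholders and are not taken from RTO itself.

```python
def parse_obo_terms(text):
    """Return {term_id: {tag: [values]}} for every [Term] stanza in OBO text."""
    terms = {}
    current = None  # tag dict of the [Term] stanza being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            continue
        if current is None or not line or line.startswith("!") or ":" not in line:
            continue  # header lines, blank lines, comments, other stanzas
        tag, _, value = line.partition(":")
        tag = tag.strip()
        value = value.split(" ! ")[0].strip()  # drop the trailing "! comment"
        current.setdefault(tag, []).append(value)
        if tag == "id":
            terms[value] = current
    return terms


def build_name_index(terms):
    """Map each lower-cased term name to its identifier."""
    return {
        name.lower(): term_id
        for term_id, tags in terms.items()
        for name in tags.get("name", [])
    }


# Hypothetical sample in OBO layout; the identifiers are made up for illustration.
SAMPLE = """format-version: 1.2

[Term]
id: EX:0000001
name: plant trait

[Term]
id: EX:0000002
name: plant height
is_a: EX:0000001 ! plant trait

[Typedef]
id: part_of
name: part of
"""

terms = parse_obo_terms(SAMPLE)
name_index = build_name_index(terms)
print(name_index.get("plant height"))
```

A lookup table of this kind is one simple way to map a trait string found in text to an ontology identifier. For real use, a dedicated OBO library would handle escaping, synonyms, and obsolete terms more carefully than this sketch does.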
2. Main Idea
❏ 2.1 Methodology Video: Gene-to-trait association extraction (GTAE) Pipeline
(Video released in January 2021.) Alternatively, watch the video here.
❏ 2.2 Methodology Figure: Gene-to-trait association extraction (GTAE) Pipeline
❏ 2.3 Methodology Figure: Unsupervised Rice Trait Extraction
❏ 2.4 Methodology: 2.1k Project for Rice Traits Annotation
3. Early Results
❏ 3.1 Result: Unsupervised Rice Gene Mention Extraction
We developed a HunFlair-based unsupervised method and compared its performance with other known gene taggers on the rice gene mentions in OryzaGP (29,098 keywords across 13,136 PubMed abstracts).
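As an illustration of how such a tagger comparison can be scored, here is a minimal sketch of exact-match precision, recall, and F1 over mention spans. This is an assumption about the evaluation setup and not a description of the metric actually used in this work. The `(document id, start, end)` span representation, the function name `score_mentions`, and the toy spans are all hypothetical.

```python
def score_mentions(gold, predicted):
    """Exact-match precision, recall and F1 over (doc_id, start, end) spans."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy spans, invented for illustration only.
gold_spans = {("doc1", 0, 4), ("doc1", 20, 25), ("doc2", 5, 9)}
tagger_spans = {("doc1", 0, 4), ("doc2", 5, 9), ("doc2", 30, 33), ("doc1", 40, 44)}

precision, recall, f1 = score_mentions(gold_spans, tagger_spans)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Running the same function on the output of each tagger against the same gold spans gives directly comparable numbers. Relaxed (overlap-based) matching would be a reasonable alternative when span boundaries differ between tools.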
❏ 3.2 Result: Novel Discovery of Gene-to-Trait Associations for Rice
The GTAE pipeline discovered thousands of novel gene-to-trait associations for rice. Download the associations with sentence- and abstract-level evidence here.
❏ 3.3 Results Visualization
Visualization of the relations between plant traits and related genes.
There are 323 red nodes, which represent plant traits (covering all plants, not only rice), and 119 yellow nodes, which represent genes. A line between a red node and a yellow node indicates that the corresponding plant trait and gene co-occur within a sentence at least four times. Lines also link some pairs of yellow nodes, which implies that those genes may interact.
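The trait-to-gene edges described above can be sketched as a thresholded co-occurrence count. This is a minimal illustration under one reading of the rule, namely that a pair is linked when it appears together in at least four sentences. The input format (one pair of trait and gene sets per sentence), the function name `build_trait_gene_edges`, and the placeholder names are hypothetical. Gene-to-gene edges are left out because the criterion for drawing them is not stated here.

```python
from collections import Counter
from itertools import product

MIN_COOCCURRENCE = 4  # threshold stated in the figure description


def build_trait_gene_edges(sentences, min_count=MIN_COOCCURRENCE):
    """Count sentence-level trait/gene co-occurrences and keep frequent pairs.

    `sentences` is an iterable of (traits, genes) pairs, one per sentence.
    Returns {(trait, gene): count} for pairs seen in at least `min_count` sentences.
    """
    counts = Counter()
    for traits, genes in sentences:
        # Each distinct pair is counted once per sentence.
        for pair in product(set(traits), set(genes)):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


# Placeholder annotations, invented for illustration only.
annotated_sentences = (
    [({"trait_x"}, {"GENE_A"})] * 4
    + [({"trait_x", "trait_y"}, {"GENE_B"})] * 2
    + [({"trait_y"}, {"GENE_A", "GENE_B"})]
)

edges = build_trait_gene_edges(annotated_sentences)
print(edges)
```

Only the pair that reaches the threshold is kept as an edge; the remaining pairs are dropped, which is what keeps the resulting graph readable.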
4. RTO Web Service
❏ 4.1 RTO Ontology web service
Yufei is developing a webpage to showcase RTO ontology terms. Click to visit the website.
❏ 4.2 Video of the work
Yufei’s talk at the Bio-Ontologies COSI, ISMB 2022
◆ Xinzhi Yao, Yun Liu, Qidong Deng, Yusha Liu, Xinchen Ma, Yufei Shen, Qianqian Peng, Zaiwen Feng, Jingbo Xia*. RTO, A Specific Crop Ontology for Rice Trait Concepts. Annual International Conference on Intelligent Systems for Molecular Biology (ISMB), Madison, WI, 10-14 July 2022 (Session Bio-Ontologies COSI). https://doi.org/10.5281/zenodo.6950749
Developer: HZAU BioNLP Team