Piwi-interacting RNAs (piRNAs) are a new class of small non-coding RNAs, 26-31 nucleotides in length. They perform their biological functions by interacting with piwi-subfamily Argonaute proteins. Emerging evidence indicates that piRNAs are involved in various biological processes in germ cells and somatic cells, including transposon silencing, gene expression, heterochromatin modification, histone modification and DNA methylation. Furthermore, a large number of piRNAs have been validated as associated with multiple diseases, and identifying disease-associated piRNAs can facilitate the diagnosis and treatment of disease. However, research on the involvement of piRNAs in human diseases remains in its infancy.
In this study, we propose a method called iPiDi-PUL, which identifies novel piRNA-disease associations via positive unlabeled learning. iPiDi-PUL has the following advantages: (i) Three different types of biological data, namely piRNA sequence information, disease semantic terms and the experimentally validated piRNA-disease association network, are integrated to describe the features of piRNA-disease associations comprehensively. In addition, Principal Component Analysis (PCA) is applied to reduce noise and obtain more accurate features for each association. (ii) Multiple negative training subsets are constructed from unknown piRNA-disease pairs, on which multiple Random Forests (RFs) are trained to overcome the instability of a single predictor. (iii) A web server for iPiDi-PUL has been established at http://bliulab.net/iPiDi-PUL, which makes it convenient for readers to explore diseases related to detected piRNAs. Experimental results on a large benchmark dataset indicate that iPiDi-PUL is a useful computational method and can identify new piRNA-disease associations effectively.
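The idea behind advantage (ii), drawing several negative subsets from the unknown pairs and averaging the predictors trained on them, can be illustrated with a minimal sketch. This is not the iPiDi-PUL implementation. It assumes each piRNA-disease pair is already a numeric feature vector (the feature integration and PCA steps are omitted), it substitutes a simple nearest-centroid scorer for the Random Forest so that it runs with the Python standard library alone, and all names (`train_base`, `pu_bagging_scores`, `n_rounds`) are hypothetical.

```python
import random


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_base(pos, neg):
    """Stand-in base learner (nearest centroid), NOT a Random Forest.

    Returns a function mapping a feature vector to a score in [0, 1];
    higher means closer to the positive centroid than to the negative one.
    """
    c_pos, c_neg = centroid(pos), centroid(neg)

    def score(x):
        d_pos, d_neg = sq_dist(x, c_pos), sq_dist(x, c_neg)
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total

    return score


def pu_bagging_scores(positives, unlabeled, n_rounds=25, seed=0):
    """Average the scores of n_rounds base learners over the unlabeled pairs.

    In each round a random subset of the unlabeled pairs, the same size as
    the positive set (an assumption of this sketch), is treated as negative.
    """
    rng = random.Random(seed)
    k = min(len(positives), len(unlabeled))
    sums = [0.0] * len(unlabeled)
    for _ in range(n_rounds):
        idx = rng.sample(range(len(unlabeled)), k)
        pseudo_negatives = [unlabeled[i] for i in idx]
        score = train_base(positives, pseudo_negatives)
        for i, x in enumerate(unlabeled):
            sums[i] += score(x)
    return [s / n_rounds for s in sums]


if __name__ == "__main__":
    # Toy 2-D features: known associations cluster near (1, 1).
    known = [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
    unknown = [[0.0, 0.0], [0.1, -0.1], [1.0, 0.95], [0.0, 0.1]]
    for pair, s in zip(unknown, pu_bagging_scores(known, unknown)):
        print(pair, round(s, 3))
```

Because any single random negative subset may accidentally contain true but unvalidated associations, averaging over many subsets reduces the influence of one unlucky draw, which is the instability that point (ii) refers to. Unknown pairs with high averaged scores are the candidates for novel associations.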