Abstract:
Microsporidia are spore-forming eukaryotes that are related to fungi but have unique traits
that set them apart. They have compact genomes as a result of evolutionary gene loss associated
with their complete dependency on hosts for survival. Despite having a relatively small number of
genes, a disproportionately high percentage of the genes in microsporidian genomes code for proteins
whose functions remain unknown (hypothetical proteins—HPs). Computational annotation of HPs
has become a more efficient and cost-effective alternative to experimental investigation. This study
developed a robust bioinformatics pipeline for annotating HPs from Vittaforma corneae, a clinically
important microsporidian that causes ocular infections in immunocompromised individuals. Here,
we describe various steps to retrieve sequences and homologs and to carry out physicochemical
characterization, protein family classification, identification of motifs and domains, protein–protein
interaction network analysis, and homology modelling using a variety of online resources.
Classification of protein families produced consistent findings across platforms, demonstrating the
accuracy of in silico annotation. A total of 162 out of 2034 HPs (about 8%) were fully annotated,
with the bulk of them categorized as binding proteins, enzymes, or regulatory proteins. The functions
of several HPs from Vittaforma corneae were accurately inferred. This improved our understanding of
microsporidian HPs despite challenges related to the obligate nature of microsporidia, the absence of
fully characterized genes, and the lack of homologous genes in other systems.
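
As an illustration of the physicochemical characterization step, the following minimal sketch computes two common sequence-level descriptors: amino acid composition and the grand average of hydropathy (GRAVY) on the Kyte-Doolittle scale. It is not the pipeline used in this work, which relied on online resources; the function names and the example sequence are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _clean(sequence: str) -> str:
    """Normalize a protein sequence and reject non-standard residues."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return seq


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy over all residues."""
    seq = _clean(sequence)
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def composition(sequence: str) -> dict:
    """Return the fraction of each residue present in the sequence."""
    seq = _clean(sequence)
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical peptide, not a Vittaforma corneae sequence.
    example = "MKTAYIAKQR"
    print(f"GRAVY: {gravy(example):.3f}")
    print(f"Composition: {composition(example)}")
```

A negative GRAVY value suggests an overall hydrophilic protein and a positive value an overall hydrophobic one; descriptors of this kind are typically read alongside the family, domain, and interaction evidence rather than on their own.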