<div>Proteins function in living organisms as enzymes, antibodies, sensors, and transporters, among myriad other roles. Understanding protein function therefore has major implications for the biological and medical sciences. It is widely accepted that protein function is largely determined by protein structure, and that proteins with similar sequences tend to fold into similar structures. Protein sequences and structures are thus the primary evidence used to find homologous proteins, that is, proteins that share a common ancestry. Moreover, protein structures are known to be more conserved than protein sequences over the course of evolution. Finding remote homologous proteins with limited sequence similarity is therefore a fundamental yet challenging problem in computational biology, and an indispensable step towards understanding protein function.
<br>
<br>Here, three novel methods for finding remote homologous proteins are presented, each with a different goal: (a) the PROtein STructure Alignment (PROSTA) family, which automatically determines and aligns the structures of protein pockets and interaction interfaces; (b) the ContactLib method, which scans tens of thousands of protein structures for homologous structures in seconds; and (c) the CMsearch method, which simultaneously explores the protein sequence space and the protein structure space and performs cross-modal search for homologous proteins. Multiple experiments on finding homologous proteins and on protein structure prediction show significant performance improvements over state-of-the-art methods. Moreover, case studies are presented in which our method discovers, for the first time, structural similarities between pairs of functionally related protein-DNA complexes.</div>