A study of structural properties on profile HMMs
Motivation: Although profile hidden Markov models (pHMMs) are a useful tool for detecting divergent members of protein families, they are not successful when the proteins fall in the twilight zone. We propose a method that adds structural properties to the pHMM training phase. To this end, we introduce a novel sequence weighting algorithm that assigns a different weight to each amino acid in a protein; the weights reflect the structural similarity of the amino acids. From the same training set of aligned homologous proteins, we build five pHMMs, one for each structural property. The properties used are primary, secondary and tertiary structure, accessibility, and residue packing. We call the resulting tool HMMER-STRUCT. Results: We used the SCOP database for our experiments and performed take-one-family-out cross-validation over superfamilies. The structural alignment tool MAMMOTH-mult was used to align the training-set proteins. Performance was evaluated with ROC curves and Precision/Recall curves, and we used the paired two-tailed t-test to assess the significance of our results. We evaluated only the ability of our models to identify remote homologs. The experiments were divided into two parts: first, we evaluated the performance of the models based on individual structural properties; next, we combined the models. Both sets of experiments showed significant improvements over the HMMER package.
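The evaluation described above (a ROC-based score per held-out family, then a paired t-test between two methods) can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `paired_t_statistic` and all numbers in the example are hypothetical, and the sketch assumes each method yields one ROC AUC per held-out family, so that the samples are paired by family.

```python
from math import sqrt
from statistics import mean, stdev


def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    scores: higher means "more likely a homolog".
    labels: 1 = true homolog, 0 = non-homolog.
    A tie between a positive and a negative counts as half a correct ordering.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_t_statistic(a, b):
    """Paired t-test statistic for matched samples a and b.

    Returns (t, degrees_of_freedom). The two-tailed p-value is then read
    from a Student t distribution with that many degrees of freedom.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1


if __name__ == "__main__":
    # Hypothetical per-family AUCs for two methods (illustrative values only,
    # not results from the paper).
    auc_structural = [0.91, 0.84, 0.77, 0.88]
    auc_baseline = [0.85, 0.80, 0.78, 0.81]
    t, dof = paired_t_statistic(auc_structural, auc_baseline)
    print(f"t = {t:.3f} with {dof} degrees of freedom")
```

The sketch uses only the standard library, so it stops at the t statistic. If SciPy is available, `scipy.stats.ttest_rel` returns both the statistic and the two-tailed p-value for the same paired design.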