The accuracy of topology prediction for transmembrane proteins can be greatly increased by incorporating data that unambiguously determine the position of a given segment relative to the membrane. Such data can be either experimentally determined topology data or protein domains and motifs that are consistently found on the same side of the membrane in transmembrane proteins. These data were gathered into the TOPDB and TOPDOM databases, respectively.
One of the most important pieces of information lost during the structure determination of transmembrane proteins is the position of the protein relative to the lipid bilayer. To recover this information, we developed the TMDET algorithm and used it to determine the position of all transmembrane proteins relative to the membrane. These data are collected in the PDBTM database.
Moreover, we created the HTP database of the human α-helical transmembrane proteome, containing the predicted and/or experimentally established topology of each transmembrane protein, together with the reliability of the prediction. The CCTOP prediction method was used both to distinguish transmembrane proteins in the proteome and to predict their topology. In addition to the topology itself, the reliability of each prediction was estimated. It was demonstrated that the prediction accuracy exceeds 98% for more than 60% of the predictions on the benchmark sets, and this probably holds for the whole predicted human transmembrane proteome.