Limited proteolysis of inactive secretory precursors to produce active proteins and peptides is an ancient post‐translational processing mechanism that enables cells to regulate the level of specific bioactive polypeptides and to generate diverse products from precursor molecules. Many biologically active secretory proteins and peptides are initially synthesized as larger, inactive precursors, usually in the form of pre‐pro‐proteins, which, in addition to signal peptide cleavage, undergo post‐translational processing to generate the mature, active molecule.

To illustrate the prediction performance for varying thresholds and prediction sensitivities, we present receiver operating characteristic (ROC) curves, plotting sensitivity on the y‐axis and false‐positive rate on the x‐axis.

Fig. 2. The predictive performance shown as sensitivity versus false‐positive rate in a ROC diagram.

Fig. 1. Sequence logos of aligned propeptide cleavage sites centered at P1, where cleavage takes place between P1 and P1′.

Arginine was by far the most frequent amino acid residue at P1, with a frequency of 92%. At P2 the frequencies of R and K were 22 and 43%, respectively, while the frequency of R was 50% at P4. In these cases, a lysine at P7 or two histidines at P7 and P8, respectively, may have compensatory effects.

Using a 4‐fold cross‐validated training approach, the optimal combinations of symmetric window size/number of hidden units were 13/2, 19/4, 17/8 and 11/2, respectively, for the four networks.

Another caveat when using the furin‐type cleavage site method is that a predicted furin‐type cleavage site does not mean that the substrate is actually cleaved by furin in vivo.
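As a minimal sketch of how a ROC curve such as the one described above can be traced, the following Python snippet sweeps a threshold over per‐site prediction scores and records the false‐positive rate (x‐axis) and sensitivity (y‐axis) at each step. The function names `roc_points` and `auc`, as well as the scores and labels, are hypothetical and invented for illustration; they are not taken from the networks or data sets described here.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Return (false_positive_rate, sensitivity) pairs, one per threshold.

    scores: prediction score per candidate site (higher = more site-like).
    labels: 1 for a true cleavage site, 0 for a non-site.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted
    # Sweep the threshold from the strictest to the most lenient value.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up example: six candidate sites, three of them true cleavage sites.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
for fpr, sensitivity in curve:
    print(f"FPR={fpr:.2f}  sensitivity={sensitivity:.2f}")
print(f"AUC={auc(curve):.3f}")
```

Each point on the curve corresponds to one choice of threshold, so lowering the threshold moves along the curve toward higher sensitivity at the cost of a higher false‐positive rate.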

After implementation of the neural network prediction method for furin‐type cleavage sites, we scanned the published literature for recent reports on furin‐mediated cleavage in proteins that were not used in our training data. We found three examples reported as furin cleavage sites and correctly predicted by our furin‐type network. Proteins with sites predicted by the furin network may therefore, in some cases, be physiological substrates of a furin‐like PC. In mammals, seven members of the PC family have been identified, with furin being the first discovered and the best characterized.

Compensating base pair changes demonstrate the conservation of structure even when the sequence is not conserved, for example a GC base pair in one sequence being replaced by a homologous AU pair in another sequence.