Introduction to O-Glycosylation
Mucin-type O-glycan biosynthesis is initiated by the transfer of N-acetylgalactosamine (GalNAc) from UDP-GalNAc to the hydroxyl groups of Ser or Thr residues in a polypeptide, catalyzed by a large family of polypeptide N-α-acetylgalactosaminyltransferases (ppGalNAc Ts). In humans, 20 isoforms have been identified. Multiple members of the ppGalNAc T family have also been identified in Drosophila, Caenorhabditis elegans, and other single and multicellular organisms. Several have close sequence orthologues across species, suggesting that a number of ppGalNAc Ts may be responsible for biologically significant functions that have been conserved during evolution.
Structurally, the ppGalNAc Ts consist of an N-terminal catalytic domain tethered by a short linker to a C-terminal ricin-like lectin domain. Presently, the role of the lectin domain and its interactions with the catalytic domain are not well understood.
Initial studies on the ppGalNAc Ts have revealed a wide range of peptide and glycopeptide substrate properties. For example, ppGalNAc T7 and T10 prefer substrates previously modified with O-linked GalNAc on nearby Ser/Thr residues, and are therefore said to have glycopeptide or filling-in activities [2-4]. Interestingly, the catalytic domain of ppGalNAc T10 has been shown to be solely responsible for its near-absolute glycopeptide specificity. In contrast, isoforms such as ppGalNAc T2 and T4 possess altered preferences against glycopeptide substrates [6-9], while others, ppGalNAc T1 and T2, can be partially inhibited by neighboring glycosylation [4,10,11]. These latter transferases, which prefer non-glycosylated over glycosylated substrates, have been called early or initiating transferases.
From the large number of ppGalNAc T family members, with their diverse properties, it is clear that mucin-type O-glycosylation will not be governed by a simple set of rules such as those found for the N-glycosylation of Asn residues or the O-xylosylation of Ser residues of proteoglycans (i.e., Asn-Xaa (not Pro)-Ser/Thr or Acidic-Acidic-Xaa-Ser-Gly-Xaa-Gly, respectively [12,13]). Nevertheless, database analysis of known mucin-type O-glycosylation sites has resulted in a number of algorithms [14-16,20] for the approximate prediction of sites of mucin-type O-glycosylation. Not unexpectedly, these approaches do not account for the wide range and remarkable reproducibility of the O-glycan site-to-site occupancy observed in the mucins that have been characterized to date [10,11]. Importantly, the predictive approaches cannot take into account the different peptide substrate specificities of the various ppGalNAc T isoforms.
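For contrast, a simple rule such as the N-glycosylation sequon Asn-Xaa (not Pro)-Ser/Thr can be checked with a single pattern match. The Python sketch below is only an illustration of that rule; the function name find_sequons and the test peptide are hypothetical and are not part of ISOGlyP.

```python
import re

# N-glycosylation sequon: Asn, then any residue except Pro, then Ser or Thr.
# The pattern is wrapped in a lookahead so that overlapping sequons are all found.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(sequence):
    """Return the 1-based positions of the Asn in each N-X(not P)-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]

if __name__ == "__main__":
    # Hypothetical peptide, one-letter amino acid codes.
    print(find_sequons("ANGSNPTNNTS"))
```

No comparably short pattern exists for mucin-type O-glycosylation, which is why isoform-specific preference data are needed.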
Recently, a series of oriented random (glyco)peptide substrate libraries of the general form GAGA(X)nT(X)nAGAGK (where X = randomized (glyco)amino acids and n = 4 or 5; see Table 1) have been developed for quantitatively determining the amino acid residue preferences (so-called enhancement values) of the catalytic domain of the ppGalNAc Ts. With these substrates, unique substrate preference data for all amino acid residues (except Thr, Trp, and Cys), and including Ser(Thr)-O-GalNAc, have been obtained for a series of ppGalNAc Ts [5,17-19]. These studies have shown that peptide sequence, neighboring glycosylation, and overall charge modulate the peptide substrate specificity of each ppGalNAc T's catalytic domain. It has further been shown that the product of the transferase-specific enhancement values correlates with previously reported glycosylation patterns of the ppGalNAc Ts against a series of peptide substrates, demonstrating the potential for predicting isoform-specific glycosylation. ISOGlyP uses these enhancement values to perform its predictive calculations, giving the so-called enhancement value product (EVP). EVPs greater than one indicate an increased preference for glycosylation by the transferase, while values less than one suggest disfavored glycosylation by the transferase.
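The EVP calculation can be pictured as a product of position-specific enhancement values around a candidate Ser or Thr. The Python sketch below is a minimal illustration and not the ISOGlyP implementation: the function name, the table layout, the handling of missing residues (treated as neutral, 1.0), and all numerical values are assumptions made for this example.

```python
from math import prod

def enhancement_value_product(sequence, site, ev_table, default=1.0):
    """Multiply the enhancement values of the residues flanking a Ser/Thr site.

    sequence -- peptide in one-letter codes
    site     -- 0-based index of the candidate Ser or Thr
    ev_table -- {offset from site: {residue: enhancement value}} (hypothetical layout)
    default  -- value assumed for residues absent from the table (assumption)
    """
    if sequence[site] not in "ST":
        raise ValueError("site must point at a Ser or Thr residue")
    values = []
    for offset, residue_values in ev_table.items():
        position = site + offset
        # Offsets that fall outside the peptide are skipped (no end-effect model).
        if 0 <= position < len(sequence):
            values.append(residue_values.get(sequence[position], default))
    return prod(values)

if __name__ == "__main__":
    # Made-up enhancement values for a single hypothetical isoform.
    toy_table = {-1: {"P": 1.5, "D": 0.5}, 1: {"P": 2.0}, 3: {"P": 2.0}}
    print(enhancement_value_product("GAPTPAPG", 3, toy_table))
```

With this toy table, the Thr at index 3 of GAPTPAPG receives 1.5 x 2.0 x 2.0 = 6.0, which would be read as a favored site for that hypothetical isoform.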
Brief Description of Experimental Approach
Enhancement values were obtained using the random (glyco)peptides listed in Table 1 [5,17-19]. Briefly, random peptides P-VI-VIII were partially glycosylated by the ppGalNAc T, and the random glycopeptide product was isolated on a mixed bed lectin. Both the initial random peptide and the isolated random glycopeptide were Edman sequenced to determine the compositions of the X residues of the peptide. Transferase enhancement values were obtained from the ratios of the mole fraction of each residue type (glycopeptide:starting peptide). Enhancement values greater than one indicate an increased preference for the specific residue type by the transferase, while values less than one indicate that the residue is disfavored by the transferase. Ser enhancement values were obtained from peptide P-VIII, taking advantage of the observation that Thr residues are much better acceptors than Ser residues for most ppGalNAc Ts. Preferences for glycosylated Ser-O-GalNAc (S*) residues were obtained using random glycopeptide GP-II and UDP-GalNAz as the GalNAc donor. Upon biotinylation of the glycopeptide using azido-alkyne "click" chemistry, the ppGalNAc T-glycosylated glycopeptide was isolated on immobilized avidin. Subsequent Edman sequencing yielded the enhancement values as described above.
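The ratio described above (mole fraction in the isolated glycopeptide divided by mole fraction in the starting peptide) can be sketched in a few lines of Python. This is an illustration only: the function names and the residue amounts are hypothetical, and real Edman sequencing data would require background and yield corrections that are not modeled here.

```python
def mole_fractions(amounts):
    """Convert residue amounts (e.g. pmol at one X position) to mole fractions."""
    total = sum(amounts.values())
    return {residue: amount / total for residue, amount in amounts.items()}

def enhancement_values(glycopeptide_amounts, starting_amounts):
    """Enhancement value = mole fraction in product / mole fraction in starting peptide."""
    product = mole_fractions(glycopeptide_amounts)
    start = mole_fractions(starting_amounts)
    return {
        residue: product[residue] / start[residue]
        for residue in start
        if residue in product and start[residue] > 0
    }

if __name__ == "__main__":
    # Made-up amounts for one randomized position.
    starting = {"P": 10.0, "A": 10.0, "D": 10.0}
    isolated = {"P": 20.0, "A": 10.0, "D": 10.0}
    for residue, value in enhancement_values(isolated, starting).items():
        print(residue, round(value, 2))
```

In this made-up case Pro is enriched in the glycosylated product (value 1.5), while Ala and Asp are depleted (value 0.75 each).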
How to Interpret Enhancement Value Product (EVP) Values
We view the EVP values as reflecting relative rates of glycosylation. The higher the value, the faster and more likely a site would be glycosylated by a particular transferase isoform. An EVP value of 1 would indicate that the transferase perceives the sequence as relatively neutral, i.e., neither inhibited nor enhanced, but nevertheless likely to be glycosylated. An EVP value greater than 1 would suggest a higher rate or likelihood of glycosylation; for example, a value of 2 would suggest twice the rate of a neutral site. Very large EVPs would suggest exceptional sites. EVP values less than 1 would suggest that the transferase does not prefer that site, although it could conceivably still glycosylate the site given enough time or transferase. Nevertheless, an EVP value of 0.2 would suggest a very poor site that is not likely to be glycosylated. Keep in mind that we are simply multiplying the enhancement values to obtain the product; at present we do not know whether some positions might be more important (weighted) than others. Further studies will be required to address this issue. Also note that the EVP values do not take into account end effects; therefore, based on our experience, predictions within 3-5 residues of the N- or C-terminus of a peptide may be too high. Finally, the EVP values currently calculated by ISOGlyP for a Ser or Thr residue in the same flanking peptide sequence are identical and do not reflect the intrinsically lower rates of Ser glycosylation compared to Thr glycosylation. At present, few systematic studies quantifying this difference have been performed on T1 and T2, and none on the other isoforms. Therefore, we recommend that the EVP values for Ser residues be decreased by a factor of approximately 10 when comparing them to the EVP values for Thr residues.
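The two practical caveats above (the approximately 10-fold reduction for Ser and the reduced reliability near peptide termini) can be applied as a post-processing step. The Python sketch below is one hedged way to do so; the function name, the default factor of 10, and the 5-residue window are assumptions taken from the rough guidance in the text, not ISOGlyP parameters.

```python
def interpret_evp(evp, residue, index, length, ser_factor=10.0, end_window=5):
    """Return (adjusted EVP, near_terminus flag) for one predicted site.

    evp        -- EVP reported for the site
    residue    -- "S" or "T"
    index      -- 0-based position of the site in the peptide
    length     -- peptide length
    ser_factor -- assumed divisor for Ser sites (about 10, per the text)
    end_window -- assumed number of residues from either end treated as unreliable
    """
    if residue not in ("S", "T"):
        raise ValueError("residue must be 'S' or 'T'")
    # Ser sites are scaled down so they can be compared with Thr sites.
    adjusted = evp / ser_factor if residue == "S" else evp
    # Sites close to either terminus may have EVPs that are too high.
    near_terminus = index < end_window or index >= length - end_window
    return adjusted, near_terminus

if __name__ == "__main__":
    print(interpret_evp(2.0, "S", 10, 30))
    print(interpret_evp(3.0, "T", 2, 30))
```

A flagged site is not discarded; the flag only marks predictions that should be read with extra caution.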