Amino Acid Nomenclature and Coding Systems
The twenty standard amino acids that constitute the building blocks of peptides and proteins are each designated by two standardized abbreviation systems: a three-letter code and a single-letter code. These nomenclature systems were established by the IUPAC-IUB Joint Commission on Biochemical Nomenclature (JCBN) and are universally used in biochemistry, molecular biology, and peptide chemistry. Familiarity with both coding systems is essential for interpreting peptide sequences on Certificates of Analysis, research publications, and structural databases.
The three-letter codes (e.g., Ala for alanine, Gly for glycine, Phe for phenylalanine) are derived from the first three letters of the amino acid name in most cases, with exceptions for asparagine (Asn, to distinguish from aspartate, Asp), glutamine (Gln, to distinguish from glutamate, Glu), isoleucine (Ile), and tryptophan (Trp). Three-letter codes are the standard format for written descriptions of peptide sequences in COA documentation, product catalogs, and detailed experimental protocols.
Single-letter codes provide a compact representation for longer sequences and are the standard format for bioinformatics databases (UniProt, PDB, NCBI), sequence alignment tools, and mass spectrometry analysis software. The assignments are partially mnemonic but become arbitrary where several amino acids share the same first letter. As documented by the IUPAC-IUB Commission (European Journal of Biochemistry, 1984, DOI: 10.1111/j.1432-1033.1984.tb07877.x), the initial letter is used where possible: A for alanine, G for glycine, L for leucine, V for valine. Others rely on phonetic or associative mnemonics: F for phenylalanine (sounds like F), W for tryptophan (its double-ring indole evokes the letter W), and Y for tyrosine (the second letter of its name).
In addition to the 20 standard codes, several special symbols are used in sequence notation. The letter B represents Asx (asparagine or aspartate, when the identity is ambiguous), Z represents Glx (glutamine or glutamate), J represents Xle (leucine or isoleucine, which are isomers of identical mass and cannot be distinguished by mass measurement alone), and X represents any amino acid. These ambiguity codes appear primarily in sequence analysis contexts and should not appear on COAs, where the sequence should be fully determined. For a comprehensive database of peptide structures and sequences, see our structural peptide database.
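The two coding systems map onto each other one-to-one, so converting between them is a simple lookup. The following is a minimal Python sketch; the function names are hypothetical, the hyphen-separated input format is an assumption, and `Xaa` is used as the three-letter counterpart of X.

```python
# One-to-one mapping between the three-letter and single-letter codes
# of the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}

# Ambiguity codes described above; they are reported, not converted.
AMBIGUITY_CODES = {"B": "Asx", "Z": "Glx", "J": "Xle", "X": "Xaa"}


def to_one_letter(sequence):
    """Convert hyphen-separated three-letter codes, e.g. 'Ala-Gly-Val'."""
    return "".join(THREE_TO_ONE[code] for code in sequence.split("-"))


def to_three_letter(sequence):
    """Convert a single-letter string, e.g. 'AGV', to three-letter codes."""
    return "-".join(ONE_TO_THREE[letter] for letter in sequence)


def ambiguous_positions(sequence):
    """Return the 1-based positions that hold B, Z, J, or X."""
    return [
        position
        for position, letter in enumerate(sequence, start=1)
        if letter in AMBIGUITY_CODES
    ]
```

For example, `to_one_letter("Ala-Gly-Val")` returns `"AGV"`, and a non-empty result from `ambiguous_positions` flags a sequence that is not fully determined.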
Complete Amino Acid Properties Table
The following table presents the physicochemical properties of all 20 standard amino acids. Molecular weights are given as average masses (used for gravimetric calculations) and monoisotopic masses (used for interpreting high-resolution mass spectrometry data). Residue masses are monoisotopic and represent the contribution of each amino acid within a peptide chain: the monoisotopic amino acid mass minus 18.011 Da for the water molecule lost during peptide bond formation (the corresponding average mass of water is 18.015 Da). Hydrophobicity values follow the Kyte-Doolittle scale (Journal of Molecular Biology, 1982, DOI: 10.1016/0022-2836(82)90515-0), where positive values indicate hydrophobic and negative values indicate hydrophilic character. Side chain pKa values are from Pace, Grimsley, and Scholtz (Journal of Biological Chemistry, 2009, DOI: 10.1074/jbc.R800080200).
| Amino Acid | 3-Letter | 1-Letter | Avg MW (Da) | Mono MW (Da) | Mono Residue Mass (Da) | Side-Chain pKa | Hydropathy (KD) | Classification |
|---|---|---|---|---|---|---|---|---|
| Alanine | Ala | A | 89.094 | 89.048 | 71.037 | — | 1.8 | Non-polar, aliphatic |
| Arginine | Arg | R | 174.203 | 174.112 | 156.101 | 12.48 | −4.5 | Positively charged |
| Asparagine | Asn | N | 132.119 | 132.053 | 114.043 | — | −3.5 | Polar, uncharged |
| Aspartate | Asp | D | 133.104 | 133.038 | 115.027 | 3.65 | −3.5 | Negatively charged |
| Cysteine | Cys | C | 121.159 | 121.020 | 103.009 | 8.18 | 2.5 | Polar, uncharged |
| Glutamate | Glu | E | 147.130 | 147.053 | 129.043 | 4.25 | −3.5 | Negatively charged |
| Glutamine | Gln | Q | 146.146 | 146.069 | 128.059 | — | −3.5 | Polar, uncharged |
| Glycine | Gly | G | 75.067 | 75.032 | 57.021 | — | −0.4 | Non-polar, aliphatic |
| Histidine | His | H | 155.156 | 155.069 | 137.059 | 6.00 | −3.2 | Positively charged |
| Isoleucine | Ile | I | 131.175 | 131.095 | 113.084 | — | 4.5 | Non-polar, aliphatic |
| Leucine | Leu | L | 131.175 | 131.095 | 113.084 | — | 3.8 | Non-polar, aliphatic |
| Lysine | Lys | K | 146.189 | 146.106 | 128.095 | 10.53 | −3.9 | Positively charged |
| Methionine | Met | M | 149.208 | 149.051 | 131.040 | — | 1.9 | Non-polar, aliphatic |
| Phenylalanine | Phe | F | 165.192 | 165.079 | 147.068 | — | 2.8 | Non-polar, aromatic |
| Proline | Pro | P | 115.132 | 115.063 | 97.053 | — | −1.6 | Non-polar, aliphatic |
| Serine | Ser | S | 105.093 | 105.043 | 87.032 | — | −0.8 | Polar, uncharged |
| Threonine | Thr | T | 119.119 | 119.058 | 101.048 | — | −0.7 | Polar, uncharged |
| Tryptophan | Trp | W | 204.229 | 204.090 | 186.079 | — | −0.9 | Non-polar, aromatic |
| Tyrosine | Tyr | Y | 181.191 | 181.074 | 163.063 | 10.07 | −1.3 | Polar, aromatic |
| Valine | Val | V | 117.148 | 117.079 | 99.068 | — | 4.2 | Non-polar, aliphatic |
KD = Kyte-Doolittle hydrophobicity index. Positive values indicate hydrophobic residues; negative values indicate hydrophilic residues. pKa values shown are for isolated amino acids; actual pKa values within a peptide may shift by 1-2 units depending on the local microenvironment.
Molecular Weight Calculations for Peptides
Calculating the theoretical molecular weight of a peptide from its amino acid sequence is a fundamental operation in peptide chemistry. The calculation uses the residue masses from the table above and accounts for the water molecules lost during peptide bond formation through condensation. For a peptide containing n amino acid residues, the molecular weight is the sum of all n residue masses plus the mass of one water molecule (representing the H at the N-terminus and the OH at the C-terminus): 18.011 Da when working with monoisotopic masses, as in the residue column of the table, or 18.015 Da when working with average masses.
Equivalently, the peptide molecular weight can be calculated by summing the full amino acid molecular weights and subtracting the mass of (n-1) water molecules, one for each of the (n-1) peptide bonds formed. Both approaches yield identical results, provided that monoisotopic and average values are not mixed. For example, the tripeptide Ala-Gly-Val has a monoisotopic molecular weight of: 71.037 (Ala residue) + 57.021 (Gly residue) + 99.068 (Val residue) + 18.011 (terminal water, monoisotopic) = 245.137 Da.
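The residue-sum approach can be sketched in a few lines of Python. This is an illustrative sketch rather than a validated tool: it assumes an unmodified linear peptide written in single-letter codes, uses three-decimal monoisotopic residue masses for the 20 standard residues, and takes 18.011 Da as the monoisotopic mass of water. The function name is hypothetical.

```python
# Monoisotopic residue masses (Da), rounded to three decimals.
MONO_RESIDUE_MASS = {
    "A": 71.037, "R": 156.101, "N": 114.043, "D": 115.027, "C": 103.009,
    "E": 129.043, "Q": 128.059, "G": 57.021, "H": 137.059, "I": 113.084,
    "L": 113.084, "K": 128.095, "M": 131.040, "F": 147.068, "P": 97.053,
    "S": 87.032, "T": 101.048, "W": 186.079, "Y": 163.063, "V": 99.068,
}

# Monoisotopic mass of water: the H at the N-terminus plus the OH at
# the C-terminus.
WATER_MONO = 18.011


def monoisotopic_mass(sequence):
    """Sum the residue masses and add one terminal water molecule."""
    return sum(MONO_RESIDUE_MASS[aa] for aa in sequence) + WATER_MONO


print(round(monoisotopic_mass("AGV"), 3))  # 245.137
```

Because the inputs are rounded to three decimals, the last digit of the result can differ by 0.001 Da from a value computed from unrounded atomic masses.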
Post-translational and chemical modifications alter the calculated molecular weight and must be accounted for when comparing theoretical and observed masses on a COA. Common modifications include: N-terminal acetylation (+42.011 Da), C-terminal amidation (-0.984 Da, replacing OH with NH2), methionine oxidation to sulfoxide (+15.995 Da), disulfide bond formation (-2.016 Da per bond, loss of two hydrogen atoms), and asparagine deamidation (+0.984 Da). These shifts are monoisotopic values. The mass of counter-ions (monoisotopic values: trifluoroacetic acid, 113.993 Da; acetate anion, 59.013 Da) is not included in the peptide molecular weight but contributes to the gross weight of the lyophilized product.
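The mass shifts listed above are additive, so a modified peptide's theoretical mass is the unmodified mass plus the sum of the applicable shifts. A minimal Python sketch follows; the dictionary keys and the function name are hypothetical labels chosen for illustration, and the shifts are the monoisotopic values quoted in the text.

```python
# Monoisotopic mass shifts (Da) for the modifications named in the text.
MODIFICATION_SHIFT = {
    "n_term_acetylation": 42.011,
    "c_term_amidation": -0.984,
    "met_oxidation": 15.995,
    "disulfide_bond": -2.016,
    "asn_deamidation": 0.984,
}


def modified_mass(unmodified_mass, modifications):
    """Add each shift once per occurrence.

    `modifications` maps a modification name to how many times it occurs,
    e.g. {"disulfide_bond": 2} for a peptide with two disulfide bonds.
    """
    shift = sum(
        MODIFICATION_SHIFT[name] * count
        for name, count in modifications.items()
    )
    return unmodified_mass + shift


# Ac-Ala-Gly-Val-NH2, starting from 245.137 Da for the free tripeptide
# (monoisotopic, residue masses plus 18.011 Da for water).
mass = modified_mass(245.137, {"n_term_acetylation": 1, "c_term_amidation": 1})
print(round(mass, 3))  # 286.164
```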
The choice between monoisotopic and average molecular weight depends on the mass spectrometry instrument and the size of the peptide. For peptides below approximately 2,000 Da analyzed on high-resolution instruments (Orbitrap, Q-TOF), the isotope peaks are resolved and the monoisotopic mass is compared against the theoretical monoisotopic value. For larger peptides where individual isotope peaks are not resolved, the centroid of the isotope envelope yields the average molecular weight, which is compared against the theoretical average value. COA mass spectrometry data should specify which value is reported. For additional context on mass spectrometric methods, see our mass spectrometry for peptide analysis article.
Hydrophobicity and Chromatographic Behavior
Amino acid hydrophobicity directly determines the reversed-phase HPLC retention behavior of peptides, making it one of the most practically relevant physicochemical properties for analytical characterization. The Kyte-Doolittle hydrophobicity scale, published in 1982 (Journal of Molecular Biology, DOI: 10.1016/0022-2836(82)90515-0), assigns each amino acid a numerical value reflecting its relative hydrophobicity based on experimentally determined water-to-vapor transfer free energies and the interior/exterior distribution of residues in known protein structures.
On the Kyte-Doolittle scale, the most hydrophobic residues are isoleucine (+4.5), valine (+4.2), and leucine (+3.8), while the most hydrophilic residues are arginine (-4.5) and lysine (-3.9), followed by asparagine, aspartate, glutamine, and glutamate (all -3.5). In reversed-phase HPLC, peptides with higher average hydrophobicity interact more strongly with the C18 stationary phase, requiring higher concentrations of organic solvent (typically acetonitrile) to elute. This means that more hydrophobic peptides have longer retention times and elute later in the gradient.
The relationship between peptide sequence and HPLC retention time has been modeled quantitatively. Mant et al. (Methods in Molecular Biology, 2007) demonstrated that retention time can be predicted with reasonable accuracy by summing the retention coefficients of individual amino acid residues, with corrections for chain length and nearest-neighbor effects. While detailed retention prediction is beyond the scope of this reference, the general principle is useful for evaluating COA chromatograms: a peptide with a highly hydrophobic sequence should elute at a relatively high percentage of organic solvent, and deviations from this expectation may indicate modifications or impurities.
Hydrophobicity also influences peptide solubility and aggregation propensity. Peptides with average Kyte-Doolittle values above +1.0 (calculated as the mean of all residue values in the sequence) tend to have limited aqueous solubility and may require organic co-solvents (DMSO, acetonitrile, or dilute acetic acid) for reconstitution. Such peptides are also more prone to aggregation through hydrophobic self-association. These practical considerations are relevant for peptide storage and handling. For information on how HPLC separates peptides based on these hydrophobic interactions, see our guide on HPLC purity testing.
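The mean Kyte-Doolittle value described above (often called the GRAVY score) is straightforward to compute. The sketch below is illustrative only: the function names are hypothetical, and the +1.0 cutoff is the rule of thumb stated in the text, not a validated solubility predictor.

```python
# Kyte-Doolittle hydropathy values for the 20 standard residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "E": -3.5, "Q": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Mean Kyte-Doolittle value over all residues in the sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def may_need_cosolvent(sequence, threshold=1.0):
    """Flag sequences whose mean hydropathy exceeds the rule-of-thumb cutoff."""
    return mean_hydropathy(sequence) > threshold


print(round(mean_hydropathy("AGV"), 2))   # 1.87
print(round(mean_hydropathy("KRDE"), 2))  # -3.85
```

The mean ignores residue order, so it cannot capture nearest-neighbor effects or local hydrophobic patches; it is a first-pass screen only.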
Ionizable Side Chains and pKa Values
Seven of the twenty standard amino acids have side chains with ionizable functional groups: aspartate (carboxyl, pKa 3.65), glutamate (carboxyl, pKa 4.25), histidine (imidazole, pKa 6.00), cysteine (thiol, pKa 8.18), tyrosine (phenol, pKa 10.07), lysine (amino, pKa 10.53), and arginine (guanidinium, pKa 12.48). In addition, the N-terminal alpha-amino group (pKa approximately 8.0) and C-terminal alpha-carboxyl group (pKa approximately 3.1) are ionizable. These pKa values collectively determine the charge state of a peptide at any given pH.
The theoretical isoelectric point (pI) of a peptide is the pH at which the net charge is zero. It can be calculated from the pKa values of all ionizable groups in the sequence using the Henderson-Hasselbalch equation. At pH values below the pI, the peptide carries a net positive charge; above the pI, it carries a net negative charge. The pI is relevant for predicting solubility (minimum solubility occurs near the pI), electrophoretic mobility, and ion-exchange chromatographic behavior.
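One common way to carry out this calculation is to express the net charge as a sum of Henderson-Hasselbalch terms and then search for the pH at which it crosses zero. The Python sketch below uses bisection and the pKa values quoted in this section. It assumes free termini, treats every cysteine as a free thiol, and ignores microenvironment shifts in pKa; the function names are hypothetical.

```python
# pKa values quoted in this section.
POSITIVE_PKA = {"K": 10.53, "R": 12.48, "H": 6.00}            # +1 when protonated
NEGATIVE_PKA = {"D": 3.65, "E": 4.25, "C": 8.18, "Y": 10.07}  # -1 when deprotonated
N_TERM_PKA = 8.0
C_TERM_PKA = 3.1


def net_charge(sequence, ph):
    """Net charge at a given pH from Henderson-Hasselbalch fractions."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))  # deprotonated C-terminus
    for aa in sequence:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Bisect between pH 0 and 14; net charge falls monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive, so the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2
```

For Ala-Gly-Val, which has no ionizable side chains, the sketch returns a pI of about 5.55, the midpoint of the two terminal pKa values.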
In the context of peptide degradation, the pKa of cysteine (8.18) is particularly significant because the thiolate anion (RS-) is the reactive species in disulfide exchange reactions. At pH 7.0, approximately 6% of cysteine side chains are deprotonated (thiolate form), while at pH 8.0, this increases to approximately 40%. This pH dependence explains why disulfide scrambling accelerates dramatically above pH 7 and why acidic storage conditions are recommended for disulfide-containing peptides, as detailed in our article on peptide degradation mechanisms.
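The thiolate percentages quoted above follow directly from the Henderson-Hasselbalch equation, as this short Python sketch shows. The function name is hypothetical, and the calculation assumes the isolated-residue pKa of 8.18 with no microenvironment shift.

```python
def deprotonated_fraction(ph, pka):
    """Fraction of an acidic group in its deprotonated form at a given pH."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


CYS_PKA = 8.18

for ph in (5.0, 7.0, 8.0):
    fraction = deprotonated_fraction(ph, CYS_PKA)
    print(f"pH {ph}: {fraction:.1%} thiolate")
# pH 5.0: 0.1% thiolate
# pH 7.0: 6.2% thiolate
# pH 8.0: 39.8% thiolate
```

Each unit drop in pH below the pKa reduces the thiolate fraction roughly tenfold, which is the quantitative basis for the acidic storage recommendation.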
Histidine is unique among the ionizable amino acids because its imidazole side chain has a pKa (6.00) near physiological pH, meaning that its protonation state changes significantly over the pH range commonly used in biological buffers. This property makes histidine-containing peptides sensitive to pH-dependent changes in charge, solubility, and chromatographic retention. The protonation of histidine also affects metal coordination, as the unprotonated imidazole nitrogen serves as a ligand for transition metal ions (Cu2+, Zn2+, Ni2+), which is relevant for metal-binding peptides in the research catalog such as copper-binding tripeptides.
Amino Acid Classification and Research Relevance
The twenty standard amino acids are classified based on the physicochemical properties of their side chains into several categories: non-polar aliphatic (Gly, Ala, Val, Leu, Ile, Pro, Met), aromatic (Phe, Trp, Tyr), polar uncharged (Ser, Thr, Cys, Asn, Gln), positively charged at pH 7 (Lys, Arg, His), and negatively charged at pH 7 (Asp, Glu). This classification reflects the chemical behavior of each residue and has direct implications for peptide properties including solubility, stability, chromatographic behavior, and reactivity.
Non-polar aliphatic residues contribute to the hydrophobic core of folded peptides and are the primary determinants of reversed-phase HPLC retention. Proline is unique within this group because its side chain forms a covalent bond with the backbone nitrogen, creating a pyrrolidine ring that constrains the backbone dihedral angles and introduces kinks or turns in the peptide chain. This structural rigidity makes proline a common residue at turn positions in biologically active peptide sequences.
Aromatic residues are significant for UV detection in analytical chemistry. All peptides absorb at 214 nm through the peptide bond, but absorbance at 280 nm depends on aromatic side chains: tryptophan absorbs strongly (indole ring, molar absorptivity approximately 5,500 M⁻¹ cm⁻¹) and tyrosine more weakly (phenol ring, approximately 1,490 M⁻¹ cm⁻¹). Phenylalanine absorbs only weakly, with its maximum near 257 nm, and contributes negligibly at 280 nm. The presence or absence of tryptophan and tyrosine therefore determines whether a peptide can be detected and quantified at 280 nm. The molar extinction coefficient values were established by Pace et al. (Protein Science, 1995, DOI: 10.1002/pro.5560040423) and are widely used for spectrophotometric concentration determination.
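These per-residue values can be combined into an estimated molar absorptivity for a whole peptide, and the Beer-Lambert law (A = ε × c × l) then converts a measured absorbance into a concentration. The Python sketch below is a simplified illustration: it counts only tryptophan and tyrosine, ignores any contribution from disulfide bonds, and uses hypothetical function names.

```python
TRP_E280 = 5500  # M^-1 cm^-1 per tryptophan
TYR_E280 = 1490  # M^-1 cm^-1 per tyrosine


def molar_absorptivity_280(sequence):
    """Estimated molar absorptivity at 280 nm from Trp and Tyr counts."""
    return TRP_E280 * sequence.count("W") + TYR_E280 * sequence.count("Y")


def concentration_molar(absorbance_280, sequence, path_length_cm=1.0):
    """Beer-Lambert law rearranged to c = A / (epsilon * l)."""
    epsilon = molar_absorptivity_280(sequence)
    if epsilon == 0:
        raise ValueError("no Trp or Tyr: peptide cannot be quantified at 280 nm")
    return absorbance_280 / (epsilon * path_length_cm)


print(molar_absorptivity_280("AWYY"))  # 8480
```

A peptide such as Ala-Gly-Val raises the `ValueError`, reflecting the point above that peptides lacking tryptophan and tyrosine must be quantified by other means, such as absorbance at 214 nm.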
From a stability perspective, certain residues serve as hotspots for specific degradation pathways. Methionine and tryptophan are primary targets for oxidation. Asparagine (especially in Asn-Gly motifs) and glutamine are susceptible to deamidation. Cysteine is involved in disulfide scrambling and air oxidation. Aspartate participates in acid-catalyzed peptide bond cleavage, particularly at Asp-Pro bonds. Knowledge of these susceptibilities, combined with the sequence of a specific research peptide, allows researchers to predict the most likely degradation products and evaluate COA data accordingly.
For additional resources on peptide structure and properties, explore our introduction to peptides, structural peptide database, and research glossary. Browse our full product catalog to view research peptides with complete sequence and molecular weight data documented on batch-specific Certificates of Analysis.





