VP Bioinformatics, Caris Life Sciences, Phoenix, Arizona, United States
Corresponding author details:
Phillip Stafford
VP Bioinformatics
Caris Life Sciences
Arizona, United States
Copyright:
© 2018 Stafford P. This is an open-access article distributed under the terms of the
Creative Commons Attribution 4.0 International
License, which permits unrestricted use,
distribution and reproduction in any medium,
provided the original author and source are
credited.
Keywords: In situ peptide synthesis; Photolithography; Random peptide; Immunosignature
There are many ways to manufacture peptide microarrays. For a
review see Gao et al. [27,28]. Light-directed chemical synthesis has
been quite successful, dating back to 1995 (Fodor et al. [29], US Patent
5405783A). Some synthesis methods direct light onto a photoacid [16]
or photobase [30] generator, which removes amino acid protecting
groups. Other methods use photoactivatable amino acids [29]. Some
methods use shadow masks [16], some use digital micromirrors [31],
some utilize a peptide ‘toner’ and a 20-‘color’ laser printer [32], while
still others use electrical circuits to alter the pH in a microwell [33].
Each of these synthesis methods relies on the systematic addition of
amino acids to a growing peptide. For random peptide libraries, all of
these methods can take advantage of reduced synthetic steps using
pseudorandom peptides.
One of the highest throughput methods for the production of
peptide microarrays is shadow-mask lithography [34]. Shadow masks
are pre-made, so they are not as flexible as programmable
light-directed arrays, and they require substantial up-front costs.
However, by implementing high-resolution steppers, automated
aligners, synthesizers and large wafers, mask-based peptide
microarrays can be very inexpensive, high density, high precision
and high throughput. These characteristics are important for
high-volume, low-cost production.
Random vs. pseudorandom libraries
Many immunological experiments make use of random sequence
peptides [35]. Phage display uses random sequence libraries [36]
and has been used to identify therapeutics [37], epitopes [38], even
enzyme modulators [39]. Panning large libraries can identify very
specific biomarkers, but these methods are neither cheap nor high
throughput. Immunosignatures use a fixed library of a given size
for any sample. A diverse surface of 3D shapes produced by random
sequence peptides has been very successful in many applications
[11,13,21,40-42]. Surprisingly, 10,000 peptides were sufficient to
distinguish over 15 different diseases simultaneously, with >95%
accuracy [14]. However, a library of 10,000 peptides likely has limited
specificity and sensitivity, so we decided to create at least 300,000
peptides for the next immunosignature iteration. If the number of masks M
equals the number of amino acids AA times the peptide length L (M=AA*L),
synthesis of 300,000 17mers would take over 2 weeks of continuous
synthesis. Altering the library from random to pseudorandom could
drastically reduce manufacturing time.
How in-situ peptide synthesis is affected by amino acid selection
The simplest algorithm for the number of synthesis steps (or
photomasks) required to produce a peptide of length L is M (number
of masks)=AA (number of amino acids used) * L (length). For a 17mer
using 20 amino acids, 20*17=340 synthetic steps are required. In
mask-based peptide photolithography, 340 steps would take over
2 weeks to complete. For immunosignature peptides, M can be less
than AA*L, but the resulting peptides become pseudorandom. Each
mask costs several thousand dollars. Each synthesis step takes
approximately 15 minutes. It is beneficial to reduce M. Figure 1
illustrates the bias imposed by forcing M to be less than AA*L.
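The mask-count arithmetic above can be checked with a minimal sketch. The function name `masks_required` is illustrative and not from the paper; the alphabet sizes 19 and 16 correspond to the control libraries described in the Experimental design section.

```python
def masks_required(num_amino_acids: int, peptide_length: int) -> int:
    """Masks for a fully random library: one mask per amino acid per position (M = AA * L)."""
    return num_amino_acids * peptide_length


# Alphabet sizes taken from the text: all 20, 19 (no Cys), 16 (no C, T, M, I).
for label, aa in [("20 amino acids", 20), ("19 amino acids", 19), ("16 amino acids", 16)]:
    print(f"{label}: M = {masks_required(aa, 17)}")
```

Any library built with fewer masks than this value cannot place every amino acid at every position independently, which is the sense in which it becomes pseudorandom.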
Experimental design
We asked two questions about reducing synthesis steps. First,
how do the chemical characteristics of the peptide library change
as mask numbers are reduced? Second, how does reduced diversity
alter the outcome of a biochemical assay? The first question was
addressed using predicted characteristics of hypothetical peptides.
For the second question, an immunosignature of monoclonal
antibodies and an immunosignature of human sera were used to
measure performance. We first established the starting parameters.
Kuznetsov noted that some amino acids are preferentially used
in immunosignatures [43]. To maintain consistency with the actual
immunosignature arrays that will be manufactured, we eliminated
Methionine, Threonine and Isoleucine because they appear rarely in
peptides that make up diagnostic immunosignatures [43]. Cysteine can form disulfide bonds under aqueous conditions, and was
removed for that reason. Thus our virtual 17mer peptide library used
16 amino acids (no C, T, M, or I). We created two control sets: M=272
uses 16 amino acids and no other mask restrictions. M=323 leaves
out only Cysteine. The M=323 case matches exactly the existing
10,000 peptide spotted peptide library that has been used in several
early publications (NCBI GEO accession GPL17600). We also created
intermediate test sets where M=35, M=70, and M=140.
The first analysis was completed using 100,000 virtual peptides.
For each library, an algorithm (see Supplemental Information) creates
M masks with 100,000 features and ‘synthesizes’ peptides using
these virtual masks. The resulting library is analyzed for chemical
properties using ProtParam (http://web.expasy.org/protparam/).
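The virtual-mask step can be pictured with a toy simulation. The sketch below is not the algorithm from the Supplemental Information. It assumes, hypothetically, that masks cycle through the 16-letter alphabet in a fixed order and that each feature is exposed in exactly 17 randomly chosen masks. All names are illustrative.

```python
import random

# The 16 residues left after removing C, T, M and I from the standard 20.
AMINO_ACIDS = "ADEFGHKLNPQRSVWY"


def make_mask_order(num_masks, alphabet=AMINO_ACIDS):
    """Assumption: mask i couples the amino acid at position i modulo the alphabet size."""
    return [alphabet[i % len(alphabet)] for i in range(num_masks)]


def synthesize(num_features, num_masks, length=17, seed=0):
    """Build one virtual peptide per feature from an ordered set of masks."""
    if num_masks < length:
        raise ValueError("need at least as many masks as residues per peptide")
    rng = random.Random(seed)
    order = make_mask_order(num_masks)
    peptides = []
    for _ in range(num_features):
        # Assumption: each feature is exposed in exactly `length` masks;
        # the peptide is the amino acids of those masks in mask order.
        exposed = sorted(rng.sample(range(num_masks), length))
        peptides.append("".join(order[i] for i in exposed))
    return peptides


library = synthesize(num_features=1000, num_masks=35)
print(library[:3])
```

In this toy model every peptide is a subsequence of the fixed mask order, so a small `num_masks` constrains which residues can follow which, while values approaching 272 loosen that constraint.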
The second analysis involved synthesizing a library of actual
peptides corresponding to the same mask restrictions. 384 peptides
were selected at random from each 100,000-peptide virtual library
and sent to Sigma Genosys (St. Louis, MO) for synthesis. We already
had several M=323 10,000 peptide microarrays made, so the human
sera experiments for M=323 were done by randomly picking 384
peptides from the 10,000 existing peptides to simulate a 384-peptide
library. Each of these 384-peptide sets (except M=323) was
tested using 51 different commercial monoclonals (Supplemental
Information Table 1). We previously published data on monoclonals
using the 10,000 peptide microarray [7]. Human sera from 8 healthy
controls and 8 patients diagnosed with coccidioidomycosis
[11] were used for all libraries including M=323.
Figure 3: Display of unique 2mers, 3mers, 4mers, 5mers and 6mers
contained within 100,000 peptides using the M=35, M=70, M=140,
M=272 and M=323 mask restriction sets.
The Y-axis is the proportion of unique nmers, and the X-axis is
increasing length of subsequences.
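As a rough illustration of the quantity plotted in Figure 3, the sketch below counts distinct n-mer subsequences in a peptide list. The caption does not define how the proportion is normalized, so dividing by the number of possible n-mers over the alphabet is an assumption made here, and the function names are hypothetical.

```python
def distinct_nmers(peptides, n):
    """Collect every distinct length-n window found in any peptide."""
    found = set()
    for peptide in peptides:
        for start in range(len(peptide) - n + 1):
            found.add(peptide[start:start + n])
    return found


def fraction_of_possible(peptides, n, alphabet_size):
    """Assumed normalization: distinct n-mers observed / alphabet_size ** n."""
    return len(distinct_nmers(peptides, n)) / alphabet_size ** n


# Toy library over a 3-letter alphabet (A, D, E), for illustration only.
toy_library = ["ADEA", "DEAD"]
for n in (2, 3):
    print(n, sorted(distinct_nmers(toy_library, n)),
          fraction_of_possible(toy_library, n, alphabet_size=3))
```

For a real library the same functions would be called with the 100,000 peptides and an alphabet size of 16 or 19, for n from 2 to 6.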
There were no funding agencies for this work. Data are
freely available at the Gene Expression Omnibus at NCBI. Platforms:
GPL17600, GPL17490, GPL17679; Experimental Series: GSE49217,
GSE50044, GSE50045.
Copyright © 2020 Boffin Access Limited.