Protein array of plant pathogen effector proteins
An academic team composed of scientists from multiple institutions wanted to characterize the biological function of 1,000 candidate effector proteins of a plant pathogen. They had received a grant from the USDA to develop a protein microarray. This valuable resource would be shared between the different members of the team to perform various functional assays. Upon completion of the project, this resource would be made publicly available and shared with the scientific community.
Since the pathogen genome was available, it was technically possible to order the synthesis of 1,000 genes. However, gene synthesis costs would have been prohibitive. A quick analysis showed that cloning the genes would be cheaper.
While the team had standard molecular biology skills, they were not prepared to handle a project of this magnitude. They quickly realized that knowing how to clone a few genes is not enough to clone 1,000 genes in a short period of time. They were under intense pressure to meet project deadlines. Delays in the cloning phase of the project would have delayed the rest of the scientific program.
Laying out the entire process was key to the project's success. This involved customizing the LIMS data model to track all the samples generated by the project. We developed dashboards to track overall progress, and custom reports to monitor the status of individual clones.
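As an illustration of this kind of status tracking, the sketch below counts clones per workflow stage. It is not the project's LIMS data model: the stage names, the `progress_report` function, and the clone identifiers are hypothetical and were chosen only to mirror the steps described in this case study.

```python
from collections import Counter
from enum import Enum


class Stage(Enum):
    """Hypothetical workflow stages; the real LIMS model may differ."""
    PRIMERS_DESIGNED = 1
    PCR_DONE = 2
    SIZE_QC_PASSED = 3
    SEQUENCE_VERIFIED = 4
    FAILED = 5


def progress_report(clones: dict[str, Stage]) -> dict[str, int]:
    """Count how many clones currently sit at each stage."""
    counts = Counter(stage.name for stage in clones.values())
    # List every stage, including those with no clones, in workflow order.
    return {stage.name: counts.get(stage.name, 0) for stage in Stage}
```

A dashboard could call `progress_report` on the current clone table and plot the counts per stage.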
Automated primer design
To succeed, the team needed to automate the design of PCR primers. They had to design 2,000 PCR primers with similar melting temperatures (Tm) and common overhangs. They also needed to design thousands of primers for verifying the clone sequences. Since many genes were fairly long, it was not possible to simply sequence the insert using universal primers.
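The idea of designing primers with similar Tm and common overhangs can be sketched as follows. This is a minimal illustration, not the team's actual tool: it uses the simple Wallace rule as a stand-in for whatever Tm model was really used, and the function names, the 60 °C target, and the length limits are all assumptions.

```python
def wallace_tm(seq: str) -> int:
    """Rough Tm estimate (Wallace rule): 2*(A+T) + 4*(G+C), in degrees C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def pick_annealing(template: str, target_tm: int = 60,
                   min_len: int = 18, max_len: int = 30) -> str:
    """Shortest 5' prefix of the template whose estimated Tm reaches target_tm."""
    template = template.upper()
    for length in range(min_len, min(max_len, len(template)) + 1):
        candidate = template[:length]
        if wallace_tm(candidate) >= target_tm:
            return candidate
    # No prefix reached the target: fall back to the longest allowed prefix.
    return template[:max_len]


def design_primer_pair(gene: str, fwd_overhang: str, rev_overhang: str,
                       target_tm: int = 60) -> tuple[str, str]:
    """Prepend the common overhangs to gene-specific annealing regions."""
    fwd = pick_annealing(gene, target_tm)
    rev = pick_annealing(reverse_complement(gene), target_tm)
    return fwd_overhang + fwd, rev_overhang + rev
```

Extending each annealing region only until it reaches the target keeps the Tm values of all primers close to one another, while the shared overhangs stay identical across the whole set. A production tool would probably use a nearest-neighbor Tm model and also screen for hairpins and primer dimers.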
Automating the cloning workflow
First, individual process steps were optimized to maximize stability and reproducibility. The goal of this preliminary work was to reduce the rate of failure of PCR and cloning reactions. A high failure rate would have resulted in higher costs and significant delays. Detailed Standard Operating Procedures were developed to allow laboratory technicians with limited molecular biology experience to contribute to the project.
As the project moved forward, it became necessary to automate the analysis of quality control data. At that scale, it was simply not possible to “look” at the data, as doing so would have been excessively time-consuming. Visual examination of sequencing traces would also have lacked reproducibility. A rigorous and automated data analysis process was necessary.
The first level of quality control was capillary electrophoresis performed on an Agilent TapeStation. One of the main benefits of using this instrument rather than traditional agarose-gel electrophoresis is the possibility of exporting electropherograms as spreadsheets. A script was developed to analyze the electropherogram profile and compare it to the expected size of the PCR product.
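A size check of this kind might look like the sketch below. It is only an illustration under stated assumptions: the column names `size_bp` and `intensity` are hypothetical and do not reflect the actual TapeStation export layout, and the 10% tolerance is an arbitrary choice.

```python
import csv
import io


def load_trace(csv_text: str) -> list[tuple[float, float]]:
    """Parse an exported trace into (size in bp, intensity) pairs.

    The 'size_bp' and 'intensity' column names are assumed for illustration.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(row["size_bp"]), float(row["intensity"])) for row in reader]


def main_peak(trace: list[tuple[float, float]]) -> float:
    """Return the fragment size at the highest-intensity point of the trace."""
    size, _ = max(trace, key=lambda point: point[1])
    return size


def passes_size_check(trace: list[tuple[float, float]],
                      expected_bp: float, tolerance: float = 0.10) -> bool:
    """True if the main peak lies within +/- tolerance of the expected size."""
    observed = main_peak(trace)
    return abs(observed - expected_bp) <= tolerance * expected_bp
```

Taking the single highest point is the simplest possible peak call. Any ladder or marker peaks present in a real export would need to be excluded first, and a fuller analysis might also flag secondary peaks that indicate nonspecific amplification.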
Clones that passed this first level of quality control were sent to a sequencing facility to verify each of the clone sequences. We developed a script to analyze the sequencing reads and to compare the assembled sequence to the gene sequence.
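The comparison step can be illustrated with the sketch below, which assumes the reads have already been assembled into a single sequence. It uses Python's standard `difflib` module in place of a real sequence aligner, and the `verify_clone` function and its return format are hypothetical.

```python
from difflib import SequenceMatcher


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def similarity(reference: str, candidate: str) -> float:
    """Similarity ratio between two sequences (1.0 means identical)."""
    # autojunk=False: DNA has only four symbols, so the default junk
    # heuristic would discard them all on long sequences.
    return SequenceMatcher(None, reference, candidate, autojunk=False).ratio()


def verify_clone(assembled: str, reference: str) -> dict:
    """Check that the assembled sequence contains the expected gene sequence."""
    assembled, reference = assembled.upper(), reference.upper()
    orientations = (assembled, reverse_complement(assembled))
    if any(reference in candidate for candidate in orientations):
        return {"status": "pass", "differences": []}
    # Otherwise report the edits against the best-matching orientation.
    best = max(orientations, key=lambda s: similarity(reference, s))
    matcher = SequenceMatcher(None, reference, best, autojunk=False)
    differences = [
        (tag, i1, i2, best[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    return {"status": "fail", "differences": differences}
```

Each reported difference gives the edit type, the affected positions in the reference, and the observed bases. Flanking vector sequence in a failed clone will also show up as insertions at the ends, so a real pipeline would likely trim it or use a proper alignment tool that takes base quality into account.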
In under four months, the team produced a collection of 900 plasmids covering more than 90% of the target genes. The plasmid collection was shared with the team's collaborators to produce the protein array.
The scripts developed for this project have been integrated into GenoFAB plasmid construction data services.