Sequencing Bioinformatics Analysis
The Translational Genomics Core at Partners Personalized Medicine offers standard bioinformatics analysis on all libraries sequenced in our facility. This standard package includes the return of demultiplexed, unaligned, filtered reads in FASTQ format, as well as a summary statistics report.
|Basic service|Requirements|Deliverable data|Mode of delivery|
|Base calling|None|FASTQ sequence, summary|Partners file transfer|
|Base calling and alignment|Reference sequence/files|FASTQ sequence, export summary|Partners file transfer utility, FTP|
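As an illustration of working with the delivered FASTQ files, the following is a minimal Python sketch that reads four-line FASTQ records and computes a few simple summary statistics. The function names `read_fastq` and `summarize` and the example reads are hypothetical, and the statistics shown are not necessarily those in the core's summary report.

```python
from io import StringIO


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@'


def summarize(records):
    """Return read count, total bases, and mean read length."""
    n_reads = 0
    n_bases = 0
    for _name, seq, _qual in records:
        n_reads += 1
        n_bases += len(seq)
    mean_length = n_bases / n_reads if n_reads else 0.0
    return {"reads": n_reads, "bases": n_bases, "mean_length": mean_length}


# Two made-up reads standing in for a delivered FASTQ file.
example = StringIO("@read1\nACGTAC\n+\nIIIIII\n@read2\nACGT\n+\nIIII\n")
stats = summarize(read_fastq(example))
```

For a real file, the `StringIO` object would be replaced by an open file handle (or a `gzip.open(path, "rt")` handle for compressed files).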
RNA-Seq and DGE analysis
For RNA-Seq and DGE analysis, we offer a standard data package that includes alignment of reads to a reference genome with TopHat, transcript abundance estimation with Cufflinks, and differential expression analysis with Cuffdiff and CummeRbund. The report also provides several quality control (QC) metrics for each sample.
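The order of the steps in this package can be sketched as a small Python helper that only builds the command lines, without running anything. This is an illustration, not the core's actual pipeline: the function name `tuxedo_commands`, the output directory names, the sample labels, and the thread count are all assumptions. Only the widely documented options `-p`, `-G`, `-o`, and `-L` are used.

```python
def tuxedo_commands(bowtie_index, annotation_gtf, samples, threads=4):
    """Build TopHat, Cufflinks, and Cuffdiff command lines (not executed here).

    samples maps a hypothetical condition label to a FASTQ path.
    """
    commands = []
    bams = []
    for label, fastq in samples.items():
        tophat_out = f"tophat_{label}"
        # Align reads to the reference genome.
        commands.append(["tophat", "-p", str(threads), "-G", annotation_gtf,
                         "-o", tophat_out, bowtie_index, fastq])
        # TopHat writes its alignments to accepted_hits.bam in the output directory.
        bam = f"{tophat_out}/accepted_hits.bam"
        bams.append(bam)
        # Estimate transcript abundance for this sample.
        commands.append(["cufflinks", "-p", str(threads), "-G", annotation_gtf,
                         "-o", f"cufflinks_{label}", bam])
    # Test for differential expression across all conditions.
    commands.append(["cuffdiff", "-p", str(threads), "-o", "cuffdiff_out",
                     "-L", ",".join(samples), annotation_gtf] + bams)
    return commands


# Hypothetical two-condition experiment.
commands = tuxedo_commands("genome_index", "genes.gtf",
                           {"control": "control.fastq", "treated": "treated.fastq"})
```

Each list could be passed to `subprocess.run(cmd, check=True)`. The Cuffdiff output directory is what CummeRbund would then read from within R.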
For shRNA methods, we provide the researcher with demultiplexed counts of the shRNA targets that appear in the library. Each batch of data is aligned to reference templates; typically, 80–95% of the reads align to the reference sequences. Additional or customized analysis should be discussed with us in advance, and a quote will be provided on a per-project basis. Data can be delivered to the customer via our secure Web interface, our FTP site, or an encrypted hard drive.
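To make the idea of per-target counts and an aligned fraction concrete, here is a minimal Python sketch. It uses exact string matching against the reference templates, which is a simplification and not necessarily the alignment method the core uses. The function name `count_shrna_targets`, the template names, and the sequences are made up for illustration.

```python
from collections import Counter


def count_shrna_targets(reads, templates):
    """Count reads per reference template using exact matching.

    templates maps a target name to its reference sequence.
    Returns (counts per target, fraction of reads that matched a template).
    """
    lookup = {seq: name for name, seq in templates.items()}
    counts = Counter({name: 0 for name in templates})
    unmatched = 0
    for read in reads:
        name = lookup.get(read)
        if name is None:
            unmatched += 1
        else:
            counts[name] += 1
    total = unmatched + sum(counts.values())
    aligned_fraction = (total - unmatched) / total if total else 0.0
    return dict(counts), aligned_fraction


# Made-up templates and reads.
templates = {"shA": "ACGTACGTAC", "shB": "TTGCAAGGCT"}
reads = ["ACGTACGTAC", "ACGTACGTAC", "TTGCAAGGCT", "GGGGGGGGGG", "ACGTACGTAC"]
counts, aligned_fraction = count_shrna_targets(reads, templates)
```

In this toy example four of the five reads match a template, so the aligned fraction falls at the low end of the typical range quoted above.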
Demultiplexing of pooled samples
During analysis of libraries sequenced in our facility, CASAVA demultiplexes pooled samples based on the index sequences provided during order entry. If the index sequences are incorrect, all reads will be sent to the “Undetermined Reads” folder and the libraries will not be demultiplexed. This can happen if the investigator provides the wrong sequence or enters the index sequence in the wrong orientation. When this happens, we always troubleshoot by determining which indexes are actually present in the “Undetermined Reads” folder and correcting the errors, but doing so can effectively double the time required for data analysis for all customers with libraries on that flow cell. Please use this guide to ensure that your indexes are entered correctly into our system.
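The behavior described here can be sketched in a few lines of Python. This is a simplified illustration with exact index matching, not CASAVA's implementation. The function name `demultiplex`, the sample names, and the example reads are assumptions.

```python
from collections import Counter, defaultdict


def demultiplex(reads, sample_indexes):
    """Assign reads to samples by exact index match.

    reads: iterable of (index_read, sequence) pairs.
    sample_indexes: maps an index sequence to a sample name.
    Returns (reads per bin, tally of index reads that matched no sample).
    """
    bins = defaultdict(list)
    undetermined = Counter()
    for index_read, sequence in reads:
        sample = sample_indexes.get(index_read)
        if sample is None:
            # No entered index matches: the read goes to the undetermined bin.
            bins["Undetermined"].append(sequence)
            undetermined[index_read] += 1
        else:
            bins[sample].append(sequence)
    return dict(bins), undetermined


# Example index sequences and reads for illustration only.
sample_indexes = {"ATCACG": "sample_1", "CGATGT": "sample_2"}
reads = [("ATCACG", "AAAA"), ("CGATGT", "CCCC"),
         ("TTAGGC", "GGGG"), ("TTAGGC", "TTTT")]
bins, undetermined = demultiplex(reads, sample_indexes)
```

Inspecting `undetermined.most_common()` mirrors the troubleshooting step: the most frequent unmatched index reads usually reveal which indexes are actually present on the flow cell.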
If you are using libraries generated with the Illumina TruSeq methods or any method where PCR enrichment is carried out after adaptor ligation, then your index should be added into our Laboratory Information Management System (LIMS) in the forward orientation.
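One way to check orientation before order entry is to compare the index you plan to enter against an index read observed in earlier sequencing data, in both the forward and reverse-complement orientations. The sketch below is a hypothetical helper, not part of our LIMS; the function names and example sequences are assumptions.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def index_orientation(entered, observed):
    """Report how an entered index relates to an observed index read."""
    entered = entered.upper()
    observed = observed.upper()
    if entered == observed:
        return "forward"
    if reverse_complement(entered) == observed:
        return "reverse_complement"
    return "no_match"


# Example sequences for illustration only.
forward_case = index_orientation("ATCACG", "ATCACG")
flipped_case = index_orientation("ATCACG", "CGTGAT")
```

A result of `"reverse_complement"` suggests the index was recorded in the opposite orientation from the one in which it is read.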