A workflow to estimate telomere length from matched tumor-normal whole genome sequencing data (BAMs)
Background
This workflow estimates telomere length from matched tumor-normal whole genome sequencing (WGS) data for 25 childhood acute lymphoblastic leukemia cases. The WGS data were generated on the Illumina NovaSeq 6000 platform.
Two tools, TelomereHunter and TelSeq, were applied to estimate telomere length from the matched tumor-normal WGS BAMs, which had been processed with the GATK "Data pre-processing for variant discovery" pipeline. A correlation plot was generated to compare the results from the two tools.
TelomereHunter
Telomere content was quantified with TelomereHunter, using ten telomere variant repeats: TCAGGG, TGAGGG, TTGGGG, TTCGGG, TTTGGG, ATAGGG, CATGGG, CTAGGG, GTAGGG and TAAGGG.
TelSeq
Workflow
step 1: run TelomereHunter
step 2: run TelSeq
step 4: clean aggregated tables and generate final outputs
For TelSeq, the telomere length (TL) of each sample is calculated as a weighted average of the estimates from all read groups within that sample (see https://github.com/zd1/telseq/issues/1).
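The per-sample weighted average described above can be sketched in Python as follows. This is a minimal illustration, not the workflow's actual script. It assumes each TelSeq output row has been read into a dict with the columns `Sample`, `Total` (reads in the read group) and `LENGTH_ESTIMATE`, and that `Total` is the weight. Both the column names and the choice of weight are assumptions to check against your TelSeq output and the linked issue. The function name `sample_telomere_length` is hypothetical.

```python
from collections import defaultdict


def sample_telomere_length(rows, weight_key="Total"):
    """Return {sample: weighted mean of LENGTH_ESTIMATE over its read groups}.

    Assumption: each row is a dict with "Sample", "LENGTH_ESTIMATE" and a
    weight column (here "Total", the read count of the read group).
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for row in rows:
        try:
            length = float(row["LENGTH_ESTIMATE"])
            weight = float(row[weight_key])
        except (TypeError, ValueError):
            # Skip read groups whose estimate or weight cannot be parsed as a number.
            continue
        if weight <= 0:
            # A read group with no reads contributes nothing to the average.
            continue
        weighted_sum[row["Sample"]] += length * weight
        weight_total[row["Sample"]] += weight
    return {sample: weighted_sum[sample] / weight_total[sample]
            for sample in weight_total}


# Toy input: two read groups for one sample (values are made up for illustration).
rows = [
    {"Sample": "S1", "ReadGroup": "rg1", "Total": "100", "LENGTH_ESTIMATE": "4.0"},
    {"Sample": "S1", "ReadGroup": "rg2", "Total": "300", "LENGTH_ESTIMATE": "8.0"},
]
result = sample_telomere_length(rows)
print(result)  # {'S1': 7.0}
```

In practice the rows could come from `csv.DictReader` over the TelSeq output table, provided the delimiter and header names match the assumptions above.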
Example correlation plot
Results from TelomereHunter and TelSeq were highly correlated (an example correlation plot generated in step 4).
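As a rough illustration of how the agreement between the two tools could be quantified, the sketch below computes a Pearson correlation coefficient between paired per-sample values. The source does not state which correlation measure was used, so Pearson is an assumption here. The function name `pearson_r` and the input values are hypothetical and are not results from this workflow.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


# Made-up paired values, one pair per sample, in the same sample order:
# TelomereHunter telomere content vs. TelSeq length estimate.
telomerehunter_values = [100.0, 200.0, 300.0, 400.0]
telseq_values = [2.0, 4.1, 5.9, 8.0]
r = pearson_r(telomerehunter_values, telseq_values)
print(round(r, 3))
```

Because TelomereHunter reports telomere content and TelSeq reports a length estimate, the two are on different scales. A correlation coefficient is unaffected by that, which is why it suits this kind of comparison.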