SAGE2Splice : unmapped SAGE tags reveal novel splice junctions

Files in this item

File: ubc_2005-0512.pdf
Size: 5.817Mb
Format: Adobe Portable Document Format
Title: SAGE2Splice : unmapped SAGE tags reveal novel splice junctions
Author: Kuo, Byron Yu-Lin
Degree: Master of Science - MSc
Program: Genetics
Copyright Date: 2005
Abstract: Serial analysis of gene expression (SAGE) is not only a method for profiling the global expression of genes, but also offers the opportunity to discover novel transcripts. SAGE tags are mapped to known transcripts to determine the source of the tags. We hypothesized that tags that map neither to a known transcript nor to the genome span a splice junction for which the exon combination or exon(s) are unknown. Splice junctions are typically recognized by the pair of highly conserved dinucleotides at each edge of an intron, GT at the 5' end and AG at the 3' end, as well as by other less conserved nucleotides flanking the junctions. In the known transcriptome, between 1.6% and 6.2% of predicted tags span a splice junction. We have developed an algorithm, SAGE2Splice, to efficiently map these unmapped SAGE tags to potential splice junctions in a genome. An evaluation scheme based on position weight matrices was designed to assess the quality of candidates. Candidates were classified into three types of spliced tags, reflecting the previous annotations of the putative splice junctions. A Type 1 tag spans a novel junction where the exons are known; a Type 2 tag spans a previously known and an unknown exon; and a Type 3 tag spans two previously unknown exons. Analysis of predicted tags extracted from EST sequences demonstrated that candidate junctions located closer to the centre of the tag are more reliable. Using high-sensitivity and high-specificity parameters, SAGE2Splice predicted 7,757 candidates from 1,639 of 20,000 unmapped tags. We selected 12 candidate splice junctions and tested them using RT-PCR. Nine of these twelve candidates were validated by RT-PCR and sequencing, and among these, four revealed previously uncharacterized exons.
To screen more unmapped SAGE tags, we proposed improvements to SAGE2Splice in engineering efficiency, program usability, and candidate evaluation, as well as the addition of a high-throughput laboratory procedure for testing the predicted candidates. We expect that many more novel transcripts can be discovered using SAGE2Splice. SAGE2Splice is available online at http://www.bcgsc.ca/sage2splice/.
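The idea of mapping an unmapped tag across a GT...AG intron can be illustrated with a small sketch. This is not the SAGE2Splice implementation, which the abstract does not detail. It is a naive exact-match search on one strand, and the names `find_all`, `candidate_junctions`, `Candidate`, `min_part`, and `max_intron` are hypothetical, as are the default parameter values. The tag is split at each internal position; the 5' part must be followed directly by GT and the 3' part must be preceded directly by AG further downstream.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    split: int           # number of tag bases on the 5' (donor) side
    donor_end: int       # index of the first intronic base (the G of GT)
    acceptor_start: int  # index of the first exonic base after the intron


def find_all(text: str, pattern: str) -> List[int]:
    """Return the start index of every (possibly overlapping) match."""
    hits = []
    i = text.find(pattern)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1)
    return hits


def candidate_junctions(tag: str, genome: str,
                        min_part: int = 2,
                        max_intron: int = 1000) -> List[Candidate]:
    """Naive search for GT...AG introns that would join the two tag halves.

    min_part and max_intron are illustrative assumptions, not values
    taken from the thesis.
    """
    out = []
    for split in range(min_part, len(tag) - min_part + 1):
        left, right = tag[:split], tag[split:]
        # Donor side: 5' tag part immediately followed by GT.
        donors = [i + split for i in find_all(genome, left + "GT")]
        # Acceptor side: 3' tag part immediately preceded by AG.
        acceptors = [i + 2 for i in find_all(genome, "AG" + right)]
        for d in donors:
            for a in acceptors:
                # The intron is genome[d:a]; it must hold at least GT + AG.
                if 4 <= a - d <= max_intron:
                    out.append(Candidate(split, d, a))
    return out


if __name__ == "__main__":
    # Toy sequence: exon, GT...AG intron, exon (made up for illustration).
    genome = "TTTACGATCA" + "GTAAACCCAG" + "CTTGGACCTT"
    tag = "GATCA" + "CTTGG"  # spans the junction, absent from the genome
    for c in candidate_junctions(tag, genome):
        print(c)
```

A search like this only proposes candidates. Ranking them, for example with position weight matrices over the flanking nucleotides as the abstract describes, would be a separate step.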
URI: http://hdl.handle.net/2429/16587
Series/Report no.: UBC Retrospective Theses Digitization Project [http://www.library.ubc.ca/archives/retro_theses/]

All items in cIRcle are protected by copyright, with all rights reserved.

UBC Library
1961 East Mall
Vancouver, B.C.
Canada V6T 1Z1
Tel: 604-822-6375
Fax: 604-822-3893