GSDS Help Page

  • Citation
  • Sample output
  • Sample input
  • Data implement

    Citation

    Guo AY, Zhu QH, Chen X, Luo JC. GSDS: a gene structure display server. Yi Chuan. 2007 Aug;29(8):1023-6.

    Sample output gene structure diagram:

    Blue lines represent the 5'-UTR or 3'-UTR, green lines represent exons, red lines represent
    the user-assigned marked regions, and thin black lines represent introns.
    If an NCBI entry contains more than one gene, GSDS lists all genes and their positions.
    If you input a GenBank GI, GSDS uses the accession number, not the GI, in the output.


    Sample Input Data

  • GenBank nucleotide accession number or gi:

    IDs can be entered one per line or separated by spaces, tabs, commas, or semicolons, e.g.:
    AY077757
    AY514043
    BK005038
    BK005071
    BK005073
    Please check that the NCBI entry has a CDS feature in its FEATURES table.
    If it has no CDS, no gene structure will be drawn for that gene.
    Because GSDS draws the gene structure diagram in the web browser, please do not submit GIs or accession numbers whose sequences are very long, such as a whole chromosome or a BAC clone. GSDS can handle an entry that contains several genes, but it may fail if the entry contains too many.
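
    As an illustration of the accepted separators, the following minimal sketch splits a pasted ID list on spaces, tabs, commas, semicolons, and line breaks. The helper name parse_ids is hypothetical; the parser GSDS actually uses is not shown on this page.

```python
import re


def parse_ids(text):
    """Split a pasted ID list on whitespace, commas, and semicolons.

    Hypothetical helper for illustration only; not GSDS's own code.
    """
    # One or more separators in a row count as a single delimiter.
    return [token for token in re.split(r"[\s,;]+", text.strip()) if token]


ids = parse_ids("AY077757, AY514043;BK005038\tBK005071\nBK005073")
print(ids)
```

    Empty input yields an empty list, so a blank form field can be detected before any lookup is attempted.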

  • CDS Sequence in FASTA format:

    The CDS sequence and the genomic sequence of the same gene must use the same ID

    >AY077757 CDS sequence
    ATGGGGGCGTTGGAGATATTAGATTACAACAACACTTTAGGAAAGAGAGACAGGGACTAT
    GAAGTGAAGGAAGCGGCATGCATGGGAATACAAAACGCTAGGCAGCTGCTCCAGTCCCTG
    ACGCAGGTGCGATCTCCAGTGGTGGACGAAGAATGCGATGTCATGGCTGGCGCTGCCATA
    TCCAAGTTTCAGAAGGTGGTGTCACTACTGAGTCGCACTGGTCATGCACGGTTTCGTAGG
    AGAACGCGCAACGCTGCTGTTGCCGGTTACGCAGGCGTCTTCTTAGAGAGCTCCAACTTC
    TTCAGAGAAAATTCCCAGGAGACGTCGAGGGACAGAATCGTCTCGTCGGGCCATGCTAGC
    CCATCTCAGTTCACGCCGACGTCCTCGTCCAAGCCTCCTCAGTCACCTGAATTGCAGGCG
    ATCAAATATAAGGTGTTTCCTCAAAGCTCTCGTTCCGCTGATGCGACGCCTGCCTCCAGT
    GACCCTGCTTCAGGAGTCCATCATCCAAAGCCACTTCAGATCCTTCACAGCTCCATGATG
    CAGCAAAGCATTCCAGAACATATACTGCGTCCAGTGGCTAGTGCTGCGTATCGGCCAACT
    GCCCTTCCCCCGAATCCGTTCAACAAACAGGAGGTGGGCAGCAAGGAGGGGGTGAGCGGC
    CACAGTCCGGACAGTTCGTTGAGCTCAGGACCTCCGCAATCAACTACAACGGCGTCGTTC
    CCAACCATGAGTGTGCAGGATGCGAGGATAACGAGCCTGCAGAATATGAAAACAGCCGAG
    CAACCTTCGGCGTTGCCCCCTCGCCCGCAGCCACCAACTCCCAAGAAAAAGTGCTCCGGG
    CAATCCGATGAGAACGGTGCAACTTGCGCAATCCTTGGCCGCTGCCATTGTTCAAAACGC
    AGGAAATTGCGGTTGAAGAGGACAATCACGGTTCGAGCAATCAGCAGCAAGTTGGCTGAT
    ATACCTTCGGATGAGTATTCATGGCGTAAGTATGGCCAGAAGCCTATCAAAGGATCACCA
    CATCCGAGAGGATACTACAAGTGCAGCAGCATACGAGGCTGTCCAGCGAGAAAACACGTA
    GAGCGGTCAATGGAAGACTCATCTATGTTGATTGTGACATACGAAGGCGATCATAACCAT
    CCGCAATCGTCATCTGCTAATGGCGGATTAACAGTGCAGTCGCAATAG
    >AY514043 CDS sequence
    ATGGCCGTCGATCTAATGCGTTTCCCCAAGATAGATGATCAAACGGCTATTCAAGAAGCT
    GCATCGCAAGGTTTACAGAGTATGGAGCATCTGATCCGCGTCCTCTCTAACCGTCCCGAA
    CAACAACACACCGTTGACTGCTCCGAGATCACTGATTTCACCGTTTCCAAATTCAAAACC
    GTCATTTCTCTTCTTAACCGTACCGGTCACGCCCGGTTTAGACGCGGACCTGTTCGCTCA
    TCCCCCGTCGTATCTCCTCCACTCCCACAGATCGTTAAAACTGCTCCGATTGTTTCGCAG
    CCGTTAAGAACAACGACTAATCTTTCTCAAACCGCTCCTCCTCCGTCGAGCTTCGTCCTT
    CCGAGGCAGCCCAGGCGGTCACACTCGGATTTCTCTAAACCGACCATCTTCGGTTCCAAA
    TCCAAAAGCTCCGACCTAGAGTTCTCGAAGGAGAACTTCAGCGTCTCTTTAAACTCTTCC
    TACATGTCGTCGGCGATTACCGGAGACGGCAGCGTCTCAAACGGGAAAATCTTCCTCGCC
    TCTGCTCCGTCGCAGCCAGTTACCTCCTCAGGAAAGCCACCGTTGGCCGGTCATCCTTAC
    AGAAAGAGATGCCTCGAGCACGAGCACTCCGAGAGTTTCTCCGGCAGAGTCTCCGGCTCA
    GGTCACGGGAAATGCCATTGCAAAAAAAGCAGGAAAAACAAGATGAAGAGAACAGTGAGG
    GTACCGGCGATAAGTGCAAAGATCGCCGATATTCCACCGGACGAGTACTCGTGGAGGAAG
    TACGGACAAAAGCCGATCAAAGGCTCACCACACCCACGTGGTTACTACAAGTGCAGTACG
    TATAGAGGATGTCCAGCAAGGAAACACGTGGAACGAGCGTTAGATGATCCAACGATGCTT
    ATCGTTACGTACGAAGGAGAGCACCGTCACAACCAATCCGCGGGGGGAATGCACGAGACT
    ATTTCTTCTTCAGGCGTTAATGATTTAGTGTTTGCCTCGGCTTGA
    >BK005038 CDS sequence
    ATGGAGGAGTGGAAGGATTCCAACCATCGAGGCGCGGATTACCTGATGACGATGCCGATG
    CAGAACTTCCTCGCCGACGCGTTCCCGCCACCGGAGCTCTTGGAGGGAGAAGGCGGGTTC
    GAGAAGCACGGCCTGTCGGTGGCCGTTGGCTCGCCGCCGCCGACGCCGCCGCCTCCGGAG
    GACGGGTGCTCGCCGCTGCCACTGACGCCGCAGTTCGGCCAGAAGTTCGGCTCCGGCGGC
    GGCGGCGGCGGCAGCCTCGCCGACAGGAGGGCGAGAGGCGGGTTCAGCAACGTCGCCAGG
    ATCAGCGTGCCGTACAACCAGCCGGCGGCGGACGTGTCGTCGGCGGGGGCGCCGTCGCCG
    TACGTGACGATCCCGCCCGGCCTGAGCCCGACGACTCTGCTGGAATCGCCGGTCTTCTCC
    AATGCCATGGGCCAGGCCTCGCCGACCACTGGGAAGCTGCACATGCTTGGTGGTGCCAAC
    GACAGCAATCCAATCAGATTTGAATCCCCTCGGATCGAAGAAGGATCTGGTGCATTTTCT
    TTCAAGCCTCTGAATCTCGCATCCTCACACTACGCAGCTGAAGAAAAGACGAAATCTCTA
    CCCAACAACCAGCATCAGTCGCTACCGATTTCTGTCAAGACTGAAGCTACTAGCATTCAA
    ACCGCACAAGATGAAGCAGCAGCCAACCAACTGATGCAGCCGCAGTTCAACGGCGGCAAG
    CGGAGCCGCGCTGCACCTGACAACGGCGGCGACGGCGAGGGCCAGCCGGCGGAGGGCGAC
    GCGAAGGCTGACTCCTCCTCCGGCGCGGCCGCGGTCGCCGTCGTCGCCGCCGCCGCGGCG
    GCGGTGGCGGAGGACGGGTACAGCTGGAGGAAGTACGGGCAGAAGCAGGTGAAGCACAGC
    GAGTACCCGAGGAGCTACTACAAGTGCACGCACGCCAGCTGCGCGGTGAAGAAGAAGGTG
    GAGCGCTCGCACGAGGGCCACGTCACGGAGATCATCTACAAGGGCACCCACAACCACCCC
    AAGCCGGCGGCGAGCCGCCGCCCCCCCGTCCATCCTCCGCCGCCGTCGCCGGCGACGACG
    ACGACGACGCCGCTGCCGCCAGGCGACGCGCAGGCCGACCACGCGCCCGACGGCGGCGGC
    GGCAGTACCCCAGTTGGCGCCGGACAGGCGGGCGCGGAGTGGCACAACGGCGGCGTGGTT
    GGCGGCGAGGGGCTGGTGGACGCGACGTCGTCTCCCTCCGTCCCCGGCGAGCTCTGCGAG
    TCGACGGCGTCGATGCAGGTCCATGAAGGCGCGGCGGCGGCGCAGCTGGGGGAATCCCCC
    GAGGGCGTCGACGTCACGTCTGCGGTGTCCGACGAGGTGGACAGGGATGACAAGGCGACG
    CACGTGTTGCCCCTGGCCGCCGCCGCCGCCGACGGCGAGAGCGACGAGCTGGAGCGAAAG
    AGAAGGAAGCTGGACTCCTGCGCCACCATGGACATGAGCACGGCGTCGAGGGCGGTGCGC
    GAGCCGCGGGTGGTGATCCAGACGACGAGCGAGGTGGACATCCTCGACGACGGCTACCGC
    TGGCGCAAGTACGGGCAGAAGGTCGTCAAGGGGAACCCCAACCCAAGCTCCTCCTCCTCC
    ATGGATGCTGATCGATCTCTCGTCGTCGTCGTCGTGATCAGGAGCTACTACAAGTGCACG
    CACCCGGGGTGCCTGGTGCGGAAGCACGTGGAGCGCGCGTCGCACGACCTCAAGTCGGTG
    ATCACCACGTACGAGGGGAAGCACAACCACGAGGTCCCCGCGGCGAGGAACAGCGGCCAC
    CCGGCGGGCTCGGCTTCGCCCGGCGGCGGCGCGGGGTCGTCGTCGCAGCCCCACGGCGTC
    GGCGTCGGCGGGCGCAGGCCGGAGGTGCCGTCGGTGCAGGAGAGCCTGATGAGGCTCGGC
    GGCGGCTGCGGCGCGGCGCCGTTCCCGCCCCACTTCGGCCTGCACCTGCCGCCGCCGCCG
    CCGAGGGACCCGCTCGCGCCGATGAGCAACTTCCCCTACTCGCTCGGCCACGCGCCGTCG
    CCGGCGCTGCGGGGCCTGCCGCCGCCCCCGCCCCCGCCGCCGTCGGCGTCGGCGCTGGCG
    GTGGCGGGGCTCGGCGGCGTGGTGGAGGGGCTCAAGTACCCGATGCTGGCGCCGCCGTCG
    GTGCACTCGCTGCTGAGGCACCGCCAGGGCGGCGGCATGGAGGCGGTGGTGGTCCCCAAG
    GCGGAGGTGAAGCAGGAGGCGATGCGGCCCGCCGCCGCCGTCGCCGGCGCGGGGCGCGGC
    GCGGCGGTGTATCAGCAGGCGATGAGCAGGGTGTCGCTGGGGAATCAACTGTAG
    >BK005071 CDS sequence
    ATGGCCGTGGACCTGATGGGCTGCTACGCCCCGCGCCGCGCAGACGACCAGCTCGCCATC
    CAGGAGGCGGCCACCGCCGGCCTCCGCAGCCTGGAGATGCTCGTGTCGTCCCTCTCCTCC
    TCCTCTCAGGCCGCCGGGGCTCACAAGGCCTCGCCGCAGCAGCAGCCGTTCGGCGAGATC
    GCCGACCAGGCCGTCTCCAAGTTCCGCAAGGTCATCTCCATCCTCGACCGCACCGGCCAC
    GCCCGCTTCCGCCGCGGCCCGGTCGAGTCGTCTGCTCCCGCCGCCCCCGTCGCTGCTGCT
    CCCCCTCCTCCTCCTCCACCACCGGCGCCGGTCGCTGCCGCCCTCGCGCCGACCTCCTCG
    CAGCCGCAGACCCTGACGCTGGACTTCACGAAGCCGAACCTGACCATGTCGGCCGCGACG
    TCCGTGACATCCACGTCGTTCTTCTCGTCGGTGACGGCCGGCGAGGGAAGCGTTTCCAAG
    GGCCGGAGCCTGCTCTCCTCCGGCAAGCCGCCGCTGTCTGGGCACAAGCGGAAGCCCTGC
    GCCGGCGGCCACTCCGAGGCCACCGCCAACGGCGGCCGCTGCCACTGCTCCAAGAGAAGG
    AAGAACCGGGTGAAGAGGACGATCCGAGTGCCGGCAATCAGCTCGAAGATCGCCGACATC
    CCGCCGGACGAATACTCGTGGAGGAAGTACGGCCAGAAACCCATCAAGGGCTCCCCTTAC
    CCACGGGGCTACTACAAGTGCAGCACAGTGAGAGGATGCCCCGCGCGGAAGCACGTGGAG
    CGCGCCACCGACGACCCGGCGATGCTGGTCGTGACCTACGAGGGCGAGCACCGCCACACG
    CCGGGGCCCCTCCCGGCGCCACCCGCCGCCGCCGCCGTCGCCGCGATGCCGGTGTCCGTC
    GCCGTGTCCACCGGCAACGGACATGTCTAA
    >BK005073 CDS sequence
    ATGACCGCCGCGCCGGGGAGCCTCCCGCTGGTGAACTCGAGGCCCGTCTCCCTCTCCTTG
    GCGGCGAGCAGGTCGTCCTTCTCCAGCCTGCTCAGTGGCGGCGCCGGCTCGTCGTTGAAC
    CTCATGACGCCGCCGTCTTCTCTCCCGCCGTCGTCGCCGTCGTCCTACTTCGGCGGCGTC
    TCGTCCTCCGGGTTTCTCGACTCGCCGATCCTCCTCACGCCCAGTTTATTCCCATCGCCG
    ACGACGACGGGCGCATTGTTCAGCTGGATTACGACGGCGACGGCGACGGCGGCGATAGCG
    CCGGAGAGCCAGGTGCAAGGAGGGGTCAAGGACGAGCAGCAACAGTACTCGGACTTCACG
    TTCCTGCCGACGGCGTCCACGGCGCCGGCGACGACGATGGCCGGAGCCACCGCGACGACG
    TCCAACTCCTTCATGCAGGACTCCATGCTAATGGCTCCATTGGGAGGGGACCCGTACAAT
    GGCGAGCAGCAGCAGCCATGGAGCTACCAAGAACCGACCATGGACGCTGACACTAGGCCA
    GCGGAATTCACCTCGTCGGCGGCGGCGGGTGACGTGGCCGGGAACGGCAGCTACAGCCAG
    GTGGCGGCGCCGGCGGCGGCCGGCGGCTTCCGTCAGCAGAGCCGGCGGTCGTCGGACGAC
    GGCTACAACTGGCGCAAGTACGGGCAGAAGCAGATGAAGGGGAGCGAGAACCCGCGCAGC
    TACTACAAGTGCACCTTCCCTGGCTGCCCGACCAAGAAGAAGGTGGAGCAGTCGCCGGAC
    GGCCAGGTCACCGAGATCGTCTACAAGGGCGCGCACAGCCACCCCAAGCCGCCGCAGAAC
    GGCCGCGGCCGCGGCGGCTCCGGCTACGCGCTGCATGGCGGCGCCGCCAGCGACGCATAC
    TCCTCCGCCGACGCGCTCTCCGGCACGCCGGTGGCGACGCCCGAGAACTCGTCGGCGTCG
    TTCGGGGACGACGAGGCGGTCAACGGCGTCAGCTCGTCGCTGCGGGTCGCCTCTAGCGTC
    GGCGGCGGCGAGGATCTCGACGACGACGAGCCTGATTCCAAGAGGTGGAGGAGAGACGGC
    GGCGACGGCGAGGGCGTCTCGCTGGTGGCCGGCAACCGGACGGTGCGTGAGCCGAGGGTG
    GTTGTGCAGACGATGAGCGACATCGACATCCTCGACGACGGCTACCGGTGGCGCAAGTAC
    GGGCAGAAGGTGGTCAAGGGCAACCCAAACCCAAGGAGCTACTACAAGTGCACGACGGCC
    GGGTGCCCCGTGCGGAAGCACGTGGAGCGCGCGTCCAACGACCTGCGCGCGGTGATCACC
    ACGTACGAGGGCAAGCACAACCACGACGTGCCCGCGGCGCGCGGGAGCGCCGCCGCCGCG
    CTCTACCGCGCCACGCCGCCGCCGCAGGCGAGCAACGCCGGCATGATGCCCACCACGGCG
    CAGCCCTCGAGCTACCTGCAGGGCGGCGGCGGCGTCCTTCCGGCCGGCGGGTACGGCGCG
    TCGTACGGCGGCGCGCCGACGACGACGCAGCCCGCGAACGGCGGTGGCTTCGCCGCCCTG
    TCCGGCCGGTTCGACGACGACGCGACGGGAGCGTCTTACTCTTACACGAGCCAGCAGCAG
    CAGCAGCCGAACGACGCGGTGTACTACGCGTCGAGAGCCAAGGATGAGCCGAGAGACGAC
    GGCATCATGTCGTTCTTTGAGCAGCCGCTGCTGTTTTGA

  • Genomic sequence in FASTA format:

    >AY077757 Gene Sequence
    CAGGATCGTTTCCAAGGCTGAGACACAGCTTGAGGTTTTATAAGCGGCATATCTTCATGA
    GCGGCGCAGCAGCAACAGCGGAAGCACATGAAATGAGATCTCTGGGATAACCATGCGGCC
    GCAACTAGAGTAACGACGCGGCGCGGTGAGCAGGTAGATCTTGATCTCCAATTCAACCCA
    TAGCTACGATCTGCGGGGATTCTGAAGCTCTTAGACTCCACAAGGTCGTGTTCTTAAATT
    GCATGTGTAAAGAGTTGCCGTCTACCGGGTGGTCTCGTGTGAATGGTCCAGTTCCAAACC
    ATCCCCGAGCGCAATCGTCGCACTGTTTTTCTCCCAGCCGGTGACCAGTGAGAAACTGGC
    CAACAGCTGATTGTTTAAACTCTATCTCCAAATCTATACACGGTGCAGTTCAGTCCAAGC
    CTGTTAGGACAGAATCATGCCGCACCCCGACCTTCAAATCAAAGTTACAGAGCTAGAAAC
    AGTAACCTTAAAACATGTGAGCCAGACATTGAAACTGTTGCTTCCATGCTGATCTAGTGT
    GTTGCGCGTGTCATATGTTCAACAGATGTGCCTCTTACGAACTGCTACAGAAGCTTCATG
    AATCACACAGCAATTGGCCTTTAAATCGTATGGCTTAACTTTTGATAGCAACCCTTCTAC
    AAGAGTGGAGTGCTTAATGAAAGTACGCCAATAAACGTAGTTCCTGCGACGTCTTCCCAG
    CGAACATGGGGGCGTTGGAGATATTAGATTACAACAACACTTTAGGAAAGAGAGACAGGG
    ACTATGAAGTGAAGGAAGCGGCATGCATGGGAATACAAAACGCTAGGCAGCTGCTCCAGT
    CCCTGACGCAGGTGCGATCTCCAGTGGTGGACGAAGAATGCGATGTCATGGCTGGCGCTG
    CCATATCCAAGTTTCAGAAGGTGGTGTCACTACTGAGTCGCACTGGTCATGCACGGTTTC
    GTAGGAGAACGCGCAACGCTGCTGTTGCCGGTTACGCAGGCGTCTTCTTAGAGAGCTCCA
    ACTTCTTCAGAGAAAATTCCCAGGAGACGTCGAGGGACAGAATCGTCTCGTCGGGCCATG
    CTAGCCCATCTCAGTTCACGCCGACGTCCTCGTCCAAGCCTCCTCAGTCACCTGAATTGC
    AGGCGATCAAATATAAGGTATAATATCATAACCCTTCAAAGTTCTCTTTAAAGCAACCCA
    AAGCTCGAAATCCGTCACTTCAATTGGTTATTTCAAATGTAATTTTTACTGTATCTGACC
    GCGTTAGGGTGATTCTCAATTTTTCAGGTGTTTCCTCAAAGCTCTCGTTCCGCTGATGCG
    ACGCCTGCCTCCAGTGACCCTGCTTCAGGAGTCCATCATCCAAAGCCACTTCAGATCCTT
    CACAGCTCCATGATGCAGCAAAGCATTCCAGAACATATACTGCGTCCAGTGGCTAGTGCT
    GCGTATCGGCCAACTGCCCTTCCCCCGAATCCGTTCAACAAACAGGAGGTGGGCAGCAAG
    GAGGGGGTGAGCGGCCACAGTCCGGACAGTTCGTTGAGCTCAGGACCTCCGCAATCAACT
    ACAACGGCGTCGTTCCCAACCATGAGTGTGCAGGATGCGAGGATAACGAGCCTGCAGAAT
    ATGAAAACAGCCGAGCAACCTTCGGCGTTGCCCCCTCGCCCGCAGCCACCAACTCCCAAG
    AAAAAGTGCTCCGGGCAATCCGATGAGAACGGTGCAACTTGCGCAATCCTTGGCCGCTGC
    CATTGTTCAAAACGCAGGTATTTGGTCCACATTCTCTGACCTGAAAACACAACATCGTAA
    TTTACTGCGTCGTAGCTGGACCTAGAAAAGTGAAAGAGGCAATTCTGAAATTGAATTTGG
    TTGACAGGAAATTGCGGTTGAAGAGGACAATCACGGTTCGAGCAATCAGCAGCAAGTTGG
    CTGATATACCTTCGGATGAGTATTCATGGCGTAAGTATGGCCAGAAGCCTATCAAAGGAT
    CACCACATCCGAGGTACAGATGTGATCTTTGCTTGAACCCCATGTGTATGCTGAATGTCG
    GTCCCTGACTGCATGTATTGTTTCATGGCTTAGGAACTGGTGTATTGAGACTTTGCTTAT
    GCAGACCATAGTTCTAAATTTGGGGACGTTTCTGGATGTGCAGAGGATACTACAAGTGCA
    GCAGCATACGAGGCTGTCCAGCGAGAAAACACGTAGAGCGGTCAATGGAAGACTCATCTA
    TGTTGATTGTGACATACGAAGGCGATCATAACCATCCGCAATCGTCATCTGCTAATGGCG
    GATTAACAGTGCAGTCGCAATAGACAACACGCACGTACATTGCCTTCGCATTATCGCCTA
    GTAATGAGGAAAGCACAAACTCCTCTCAATGGCTTACGCGTGAGGATGTCTGCAAGCATT
    TCAAGTTTTTGCCCAGTTTGTGCTCCATGTTTTTTTGTTAGGACATTACCTATGGCACAA
    TGCCCCCGTCCGACGAAGCCCGTGACTTATGTTCTGTAGCAATGTTCTCATGCGTGATTG
    GCTAGAGAAGTGTGCTCGACGAGTCAGGAACATTAACCTCCTAGGTGTGCCCCCAAAGTT
    GGAAGCGTTCTGCTTATCAGGGATCAAGAGGTACGCACAGACGAAGATATCTACAGGTGA
    TGCCTTTTAATTCTTCGGTGCTCAATGGCTCCACCTCTGGAGCGGAGAGAGAGAAGAATG
    AATATGAATGCAAGTATCCTTCGCGATGCGC
    >AY514043 Gene Sequence
    ATGGCCGTCGATCTAATGCGTTTCCCCAAGATAGATGATCAAACGGCTATTCAAGAAGCT
    GCATCGCAAGGTTTACAGAGTATGGAGCATCTGATCCGCGTCCTCTCTAACCGTCCCGAA
    CAACAACACACCGTTGACTGCTCCGAGATCACTGATTTCACCGTTTCCAAATTCAAAACC
    GTCATTTCTCTTCTTAACCGTACCGGTCACGCCCGGTTTAGACGCGGACCTGTTCGCTCA
    TCCCCCGTCGTATCTCCTCCACTCCCACAGATCGTTAAAACTGCTCCGATTGTTTCGCAG
    CCGTTAAGAACAACGACTAATCTTTCTCAAACCGCTCCTCCTCCGTCGAGCTTCGTCCTT
    CCGAGGCAGCCCAGGCGGTCACACTCGGATTTCTCTAAACCGACCATCTTCGGTTCCAAA
    TCCAAAAGCTCCGACCTAGAGTTCTCGAAGGAGAACTTCAGCGTCTCTTTAAACTCTTCC
    TACATGTCGTCGGCGATTACCGGAGACGGCAGCGTCTCAAACGGGAAAATCTTCCTCGCC
    TCTGCTCCGTCGCAGCCAGTTACCTCCTCAGGAAAGCCACCGTTGGCCGGTCATCCTTAC
    AGAAAGAGATGCCTCGAGCACGAGCACTCCGAGAGTTTCTCCGGCAGAGTCTCCGGCTCA
    GGTCACGGGAAATGCCATTGCAAAAAAAGGTATTGTTACGTTACGTTACGTCGCCCGCCT
    GTCGCTTTTAACAAACTTACTCAAGTGACTTCCGTTATTTTTAATTTCGATATATTCCAA
    CCCCTTGGTTGGCTATTATTACCCTCCTCGATACATCATTGATTAAATTACTACTTAATT
    ATTCAATTAGGTAAACCGTTAACATTATTCCCGGTTTAGTCAATAGTTATATAGGTTTAG
    CTCGCCGACAACTACTTTTAAAAACCTGGGTTTTGACCATTGACTTTTTAAATCCGAACC
    AGCTCATTAATTGATTGTTAATTTTTATATGAATGAAGCAGGAAAAACAAGATGAAGAGA
    ACAGTGAGGGTACCGGCGATAAGTGCAAAGATCGCCGATATTCCACCGGACGAGTACTCG
    TGGAGGAAGTACGGACAAAAGCCGATCAAAGGCTCACCACACCCACGGTAACTATCGTCT
    ATTTATCCACCGTTGATAAATAAATTAATTCCCATTGCAATCTATAAAGATCTAACGGTG
    GTTATTTGTTTATGATCGATGCAGTGGTTACTACAAGTGCAGTACGTATAGAGGATGTCC
    AGCAAGGAAACACGTGGAACGAGCGTTAGATGATCCAACGATGCTTATCGTTACGTACGA
    AGGAGAGCACCGTCACAACCAATCCGCGGGGGGAATGCACGAGACTATTTCTTCTTCAGG
    CGTTAATGATTTAGTGTTTGCCTCGGCTTGA
    >BK005038 Gene Sequence
    ATGGAGGAGTGGAAGGATTCCAACCATCGAGGCGCGGATTACCTGATGACGATGCCGATG
    CAGAACTTCCTCGCCGACGCGTTCCCGCCACCGGAGCTCTTGGAGGGAGAAGGCGGGTTC
    GAGAAGCACGGCCTGTCGGTGGCCGTTGGCTCGCCGCCGCCGACGCCGCCGCCTCCGGAG
    GACGGGTGCTCGCCGCTGCCACTGACGCCGCAGTTCGGCCAGAAGTTCGGCTCCGGCGGC
    GGCGGCGGCGGCAGCCTCGCCGACAGGAGGGCGAGAGGCGGGTTCAGCAACGTCGCCAGG
    ATCAGCGTGCCGTACAACCAGCCGGCGGCGGACGTGTCGTCGGCGGGGGCGCCGTCGCCG
    TACGTGACGATCCCGCCCGGCCTGAGCCCGACGACTCTGCTGGAATCGCCGGTCTTCTCC
    AATGCCATGGTATCCATCATCAATTCATCACTTGGTCTGTGGTTACGATTTCTAGCTTGA
    TAAGCACACCGAACCTTGTGTTTGTGTGTGCATATAATGTTAGTAAGTTTTAGTGTTAGA
    TTCGATTGCTTTTTGTGTCCATGTAATGTTAAGCTGGCAGGCAAGAATGTTTGCTTTACT
    CTACTGAACTTTTGACTTCTCCGGGAGTTCGTACTTGCTGCTCAGACTTCCTCCATCATA
    CTTGCTGCTTTGTTTTTAAAGATACTTATTTTTTTTCTTTCAGACAAATGAACCGCTACA
    GTTGCACTGTGGATGCATTATGCATATGCATAATAATCCAGCATTATGCGCTAAACACTG
    TCCCAATGTATATGTACCTAAGCCTAAATTAAAATTAGGCGTTAATCGTGTTCTTATTCA
    TATATATCAATGAACTCTAAGAATGCAGTGTCTTAGGGTGGTCAACTGAACTCATTTCCC
    AACGTTATGCGTTGGTGATACTTTTAAAATTAAGTTCACAAAAGTATGCAAACCCTGAAG
    CTTTGGACTAACCTGACCGGACAAAATGCTTTCTGTGTATGGGATTTTGCAGGGCCAGGC
    CTCGCCGACCACTGGGAAGCTGCACATGCTTGGTGGTGCCAACGACAGCAATCCAATCAG
    ATTTGAATCCCCTCGGATCGAAGAAGGATCTGGTGCATTTTCTTTCAAGCCTCTGAATCT
    CGCATCCTCACACTACGCAGCTGAAGAAAAGACGGTGAGCAGAGTTAACTGAAAAATTAT
    TCATACTCTTCGTACCTGTTCTTCCTGTTACTACTGATTCTGGTAGCTCCTAGTCACTGA
    ACTGTTTTTCAGGTTTCAAAATGACTTGATTGGCCAAGTTACTTCTGAATTCTGACTGAA
    CCAGCAATTTCTTTCCTGAAATTTTTTGTTCTTAACGTAGTAGTATCTGAAGAGATTTTG
    TGAGCATATAATGCCATTTTTTTTTGGTTTTGGCTGCAGAAATCTCTACCCAACAACCAG
    CATCAGTCGCTACCGATTTCTGTCAAGACTGAAGCTACTAGCATTCAAACCGCACAAGAT
    GAAGCAGCAGCCAACCAACTGATGCAGCCGCAGTTCAACGGCGGCAAGCGGAGCCGCGCT
    GCACCTGACAACGGCGGCGACGGCGAGGGCCAGCCGGCGGAGGGCGACGCGAAGGCTGAC
    TCCTCCTCCGGCGCGGCCGCGGTCGCCGTCGTCGCCGCCGCCGCGGCGGCGGTGGCGGAG
    GACGGGTACAGCTGGAGGAAGTACGGGCAGAAGCAGGTGAAGCACAGCGAGTACCCGAGG
    AGCTACTACAAGTGCACGCACGCCAGCTGCGCGGTGAAGAAGAAGGTGGAGCGCTCGCAC
    GAGGGCCACGTCACGGAGATCATCTACAAGGGCACCCACAACCACCCCAAGCCGGCGGCG
    AGCCGCCGCCCCCCCGTCCATCCTCCGCCGCCGTCGCCGGCGACGACGACGACGACGCCG
    CTGCCGCCAGGCGACGCGCAGGCCGACCACGCGCCCGACGGCGGCGGCGGCAGTACCCCA
    GTTGGCGCCGGACAGGCGGGCGCGGAGTGGCACAACGGCGGCGTGGTTGGCGGCGAGGGG
    CTGGTGGACGCGACGTCGTCTCCCTCCGTCCCCGGCGAGCTCTGCGAGTCGACGGCGTCG
    ATGCAGGTCCATGAAGGCGCGGCGGCGGCGCAGCTGGGGGAATCCCCCGAGGGCGTCGAC
    GTCACGTCTGCGGTGTCCGACGAGGTGGACAGGGATGACAAGGCGACGCACGTGTTGCCC
    CTGGCCGCCGCCGCCGCCGACGGCGAGAGCGACGAGCTGGAGCGAAAGAGAAGGTCGCAC
    ACCCATCAGTATCACTGTCCTCTCTGTTCCTTCCATGGCATTGCACGCGAAATCGTCTGA
    ATCGTGTGAAATGAATGAATGCTTCAGGAAGCTGGACTCCTGCGCCACCATGGACATGAG
    CACGGCGTCGAGGGCGGTGCGCGAGCCGCGGGTGGTGATCCAGACGACGAGCGAGGTGGA
    CATCCTCGACGACGGCTACCGCTGGCGCAAGTACGGGCAGAAGGTCGTCAAGGGGAACCC
    CAACCCAAGGTTCGTCTCGAGCCTCTCCTCCATTTTCTCAGAGCTCCTCCTCCTCCATGG
    ATGCTGATCGATCTCTCGTCGTCGTCGTCGTGATCAGGAGCTACTACAAGTGCACGCACC
    CGGGGTGCCTGGTGCGGAAGCACGTGGAGCGCGCGTCGCACGACCTCAAGTCGGTGATCA
    CCACGTACGAGGGGAAGCACAACCACGAGGTCCCCGCGGCGAGGAACAGCGGCCACCCGG
    CGGGCTCGGCTTCGCCCGGCGGCGGCGCGGGGTCGTCGTCGCAGCCCCACGGCGTCGGCG
    TCGGCGGGCGCAGGCCGGAGGTGCCGTCGGTGCAGGAGAGCCTGATGAGGCTCGGCGGCG
    GCTGCGGCGCGGCGCCGTTCCCGCCCCACTTCGGCCTGCACCTGCCGCCGCCGCCGCCGA
    GGGACCCGCTCGCGCCGATGAGCAACTTCCCCTACTCGCTCGGCCACGCGCCGTCGCCGG
    CGCTGCGGGGCCTGCCGCCGCCCCCGCCCCCGCCGCCGTCGGCGTCGGCGCTGGCGGTGG
    CGGGGCTCGGCGGCGTGGTGGAGGGGCTCAAGTACCCGATGCTGGCGCCGCCGTCGGTGC
    ACTCGCTGCTGAGGCACCGCCAGGGCGGCGGCATGGAGGCGGTGGTGGTCCCCAAGGCGG
    AGGTGAAGCAGGAGGCGATGCGGCCCGCCGCCGCCGTCGCCGGCGCGGGGCGCGGCGCGG
    CGGTGTATCAGCAGGCGATGAGCAGGGTGTCGCTGGGGAATCAACTGTAG
    >BK005071 Gene Sequence
    ATGGCCGTGGACCTGATGGGCTGCTACGCCCCGCGCCGCGCAGACGACCAGCTCGCCATC
    CAGGAGGCGGCCACCGCCGGCCTCCGCAGCCTGGAGATGCTCGTGTCGTCCCTCTCCTCC
    TCCTCTCAGGCCGCCGGGGCTCACAAGGCCTCGCCGCAGCAGCAGCCGTTCGGCGAGATC
    GCCGACCAGGCCGTCTCCAAGTTCCGCAAGGTCATCTCCATCCTCGACCGCACCGGCCAC
    GCCCGCTTCCGCCGCGGCCCGGTCGAGTCGTCTGCTCCCGCCGCCCCCGTCGCTGCTGCT
    CCCCCTCCTCCTCCTCCACCACCGGCGCCGGTCGCTGCCGCCCTCGCGCCGACCTCCTCG
    CAGCCGCAGACCCTGACGCTGGACTTCACGAAGCCGAACCTGACCATGTCGGCCGCGACG
    TCCGTGACATCCACGTCGTTCTTCTCGTCGGTGACGGCCGGCGAGGGAAGCGTTTCCAAG
    GGCCGGAGCCTGCTCTCCTCCGGCAAGCCGCCGCTGTCTGGGCACAAGCGGAAGCCCTGC
    GCCGGCGGCCACTCCGAGGCCACCGCCAACGGCGGCCGCTGCCACTGCTCCAAGAGAAGG
    TAAACAAACTCCCACGCCACTTCACTTCTCGAACGCCTCGAGCAACAATTTCCTCTCTAT
    TTCGTGTCACCTTCTCACGGTGGATTGGATTCCGTGTGCTTCCGCAGGAAGAACCGGGTG
    AAGAGGACGATCCGAGTGCCGGCAATCAGCTCGAAGATCGCCGACATCCCGCCGGACGAA
    TACTCGTGGAGGAAGTACGGCCAGAAACCCATCAAGGGCTCCCCTTACCCACGGTAAATC
    TCTTCCTCCTGCTTGAACACGAAATTCTCCCAAGAAATTCATCGCGGTTTGCAACGAACC
    TGACGTGTTCTGCGATTCGATTTTCAGGGGCTACTACAAGTGCAGCACAGTGAGAGGATG
    CCCCGCGCGGAAGCACGTGGAGCGCGCCACCGACGACCCGGCGATGCTGGTCGTGACCTA
    CGAGGGCGAGCACCGCCACACGCCGGGGCCCCTCCCGGCGCCACCCGCCGCCGCCGCCGT
    CGCCGCGATGCCGGTGTCCGTCGCCGTGTCCACCGGCAACGGACATGTCTAA
    >BK005073 Gene Sequence
    ATGACCGCCGCGCCGGGGAGCCTCCCGCTGGTGAACTCGAGGCCCGTCTCCCTCTCCTTG
    GCGGCGAGCAGGTCGTCCTTCTCCAGCCTGCTCAGTGGCGGCGCCGGCTCGTCGTTGAAC
    CTCATGACGCCGCCGTCTTCTCTCCCGCCGTCGTCGCCGTCGTCCTACTTCGGCGGCGTC
    TCGTCCTCCGGGTTTCTCGACTCGCCGATCCTCCTCACGCCCAGTGTAAGCAAGCACGGC
    CGCCGTAGCCGATCGAGCATCCCATTTCCTTTGCCAAATTGCACCGGCATGTGTGGCTGA
    TTATTGAGCTCCATCGTGTTTGTTTCAGTTATTCCCATCGCCGACGACGACGGGCGCATT
    GTTCAGCTGGATTACGACGGCGACGGCGACGGCGGCGATAGCGCCGGAGAGCCAGGTGCA
    AGGAGGGGTCAAGGACGAGCAGCAACAGTACTCGGACTTCACGTTCCTGCCGACGGCGTC
    CACGGCGCCGGCGACGACGATGGCCGGAGCCACCGCGACGACGTCCAACTCCTTCATGCA
    GGACTCCATGCTAATGGCTCCATTGGTAAGACGAACATCAAGCTCAACTTCTAAATTATT
    AGTGCAGAGCTAAGAAATTACGAACAGGGACTGACATGTGGGCCAGTAGTCGCATGGTAG
    TGAGTGATTTATCACAAACTTTTTTACTAAAAAAAAGTAATGATAAGCTTATCCGGATCA
    ACGACGTCCTTGCATTATTACGAATTTCTTGTGGATTGGGAGGTTCTCTTTTGACTTTTA
    TGCCAAAGCATACATATATGATGAAATGCATTAATATCTCGCGGCCAAGATATCTTAATC
    AACATGTTTTTTTAATGATAAGGGACACTCAGGCTAGATTTGCAAAAGCAAACTTCATTT
    TCACGTTCTCCATGAAAACCTAAGCACATACCATTAGTCAAGCTTGCTACAGTTCTAAAC
    TAGATCAATTAGTCAAAACTCCCTTGGAAAAAATAATTACCCTGCATGTGATGCACTCCA
    AAAAAAAGATGCATGTATCATTGATATTGTTGTGTTTGCAATATTGTAGGGAGGGGACCC
    GTACAATGGCGAGCAGCAGCAGCCATGGAGCTACCAAGAACCGACCATGGACGCTGACAC
    TAGGCCAGCGGAATTCACCTCGTCGGCGGCGGCGGGTGACGTGGCCGGGAACGGCAGCTA
    CAGCCAGGTGGCGGCGCCGGCGGCGGCCGGCGGCTTCCGTCAGCAGAGCCGGCGGTCGTC
    GGACGACGGCTACAACTGGCGCAAGTACGGGCAGAAGCAGATGAAGGGGAGCGAGAACCC
    GCGCAGCTACTACAAGTGCACCTTCCCTGGCTGCCCGACCAAGAAGAAGGTGGAGCAGTC
    GCCGGACGGCCAGGTCACCGAGATCGTCTACAAGGGCGCGCACAGCCACCCCAAGCCGCC
    GCAGAACGGCCGCGGCCGCGGCGGCTCCGGCTACGCGCTGCATGGCGGCGCCGCCAGCGA
    CGCATACTCCTCCGCCGACGCGCTCTCCGGCACGCCGGTGGCGACGCCCGAGAACTCGTC
    GGCGTCGTTCGGGGACGACGAGGCGGTCAACGGCGTCAGCTCGTCGCTGCGGGTCGCCTC
    TAGCGTCGGCGGCGGCGAGGATCTCGACGACGACGAGCCTGATTCCAAGAGGTGGAGGAG
    AGACGGCGGCGACGGCGAGGGCGTCTCGCTGGTGGCCGGCAACCGGACGGTGCGTGAGCC
    GAGGGTGGTTGTGCAGACGATGAGCGACATCGACATCCTCGACGACGGCTACCGGTGGCG
    CAAGTACGGGCAGAAGGTGGTCAAGGGCAACCCAAACCCAAGGTACGTTGCATGCGTGCG
    TAAACATATATCGATCTGTCACGTAGGTGTTCGACGCGTGTACGTGTGGGCTGACATGCA
    TCTGTGCTCTATCTGCAGGAGCTACTACAAGTGCACGACGGCCGGGTGCCCCGTGCGGAA
    GCACGTGGAGCGCGCGTCCAACGACCTGCGCGCGGTGATCACCACGTACGAGGGCAAGCA
    CAACCACGACGTGCCCGCGGCGCGCGGGAGCGCCGCCGCCGCGCTCTACCGCGCCACGCC
    GCCGCCGCAGGCGAGCAACGCCGGCATGATGCCCACCACGGCGCAGCCCTCGAGCTACCT
    GCAGGGCGGCGGCGGCGTCCTTCCGGCCGGCGGGTACGGCGCGTCGTACGGCGGCGCGCC
    GACGACGACGCAGCCCGCGAACGGCGGTGGCTTCGCCGCCCTGTCCGGCCGGTTCGACGA
    CGACGCGACGGGAGCGTCTTACTCTTACACGAGCCAGCAGCAGCAGCAGCCGAACGACGC
    GGTGTACTACGCGTCGAGAGCCAAGGATGAGCCGAGAGACGACGGCATCATGTCGTTCTT
    TGAGCAGCCGCTGCTGTTTTGA

  • CDS in exon position format

    BK005038 1 429
    BK005038 1013 1174
    BK005038 1420 2273
    BK005038 2368 2529
    BK005038 2564 3290
    BK005071 1 599
    BK005071 708 833
    BK005071 928 1132
    BK005073 1 225
    BK005073 329 565
    BK005073 1070 1842
    BK005073 1939 2422
    AY077757 726 1157
    AY077757 1288 1757
    AY077757 1868 1993
    AY077757 2144 2303
    AY514043 1 689
    AY514043 999 1127
    AY514043 1225 1411

  • GENE Length file format

    AY077757 2731
    AY514043 1411
    BK005038 3290
    BK005071 1132
    BK005073 2422
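
    The exon position and gene length files together determine the rest of the structure. The sketch below, which assumes 1-based inclusive coordinates as the sample data suggest, derives the intron intervals and the regions before the first and after the last CDS segment. The function name gene_regions is hypothetical, and whether GSDS draws the flanking regions as UTR is an assumption.

```python
def gene_regions(exons, gene_length):
    """Return (upstream, introns, downstream) as lists of (start, end).

    Coordinates are assumed to be 1-based and inclusive.
    Hypothetical helper for illustration only; not GSDS's own code.
    """
    exons = sorted(exons)
    # An intron is the gap between the end of one exon and the start of the next.
    introns = [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
        if next_start - prev_end > 1
    ]
    first_start, last_end = exons[0][0], exons[-1][1]
    upstream = [(1, first_start - 1)] if first_start > 1 else []
    downstream = [(last_end + 1, gene_length)] if last_end < gene_length else []
    return upstream, introns, downstream


# Values copied from the sample exon position and gene length data above.
exons = {
    "BK005071": [(1, 599), (708, 833), (928, 1132)],
    "AY077757": [(726, 1157), (1288, 1757), (1868, 1993), (2144, 2303)],
}
lengths = {"BK005071": 1132, "AY077757": 2731}

for gene_id in exons:
    print(gene_id, gene_regions(exons[gene_id], lengths[gene_id]))
```

    For BK005071 the CDS spans the whole gene, so only the two introns (600-707 and 834-927) are reported; AY077757 additionally has flanking regions 1-725 and 2304-2731.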

  • Domain Position in Gene:

    AY484394 77 176
    AY077757 1929 1993
    AY077757 2144 2261
    AY514043 1063 1127
    AY514043 1225 1342
    BK005038 154 252
    BK005038 1678 1851
    BK005038 2465 2529
    BK005038 2564 2729
    BK005071 769 833
    BK005071 928 1045
    BK005073 1259 1435
    BK005073 1778 1842
    BK005073 1939 2050

  • Output ID Order:

    If the user supplies an output ID order file, GSDS arranges the gene structure diagrams in that order. IDs can be entered one per line or separated by spaces, tabs, commas, or semicolons, e.g.:
    BK005038
    BK005071
    BK005073
    AY077757
    AY514043
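
    The effect of an ID order file can be sketched as a sort keyed on each ID's position in the supplied list. The function name order_by_ids is hypothetical, and placing unlisted IDs at the end is an assumption; this page does not say how GSDS treats them.

```python
def order_by_ids(gene_ids, id_order):
    """Sort gene IDs to follow a user-supplied order.

    Hypothetical helper for illustration only. IDs missing from id_order
    are placed last, keeping their original relative order (an assumption).
    """
    rank = {gene_id: position for position, gene_id in enumerate(id_order)}
    return sorted(gene_ids, key=lambda gene_id: rank.get(gene_id, len(rank)))


submitted = ["AY077757", "AY514043", "BK005038", "BK005071", "BK005073"]
wanted = ["BK005038", "BK005071", "BK005073", "AY077757", "AY514043"]
print(order_by_ids(submitted, wanted))
```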


    GSDS Data Implement

    GSDS can use the following three types of data to draw the gene structure schematic diagram.

    1. CDS and Genomic Sequences
    When users upload CDS and genomic sequences, GSDS uses the
    est2genome program in the EMBOSS package to align each CDS to its genomic sequence. It then extracts the 5'-UTR, exon, intron, and 3'-UTR information to draw the diagram.

    2. GenBank Accession Number or GI
    When users upload an accession number or GI, GSDS uses the
    EFetch program to retrieve the GenBank-format file from NCBI, then uses BioPerl to extract the CDS region and gene region.
    If the entry has no CDS region marked in its FEATURES table, or the ID has no entry in NCBI, GSDS reports this in the output.
    If the entry has a marked gene region, the gene length is used as the total length of the diagram.
    If the entry has no marked gene region, the source length is used as the total length of the diagram.
    If an entry has more than one annotated gene region, this information is listed below the diagram.

    3. Exon position
    When users upload exon position and gene length information, GSDS uses this information directly to draw the diagram.
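
    The sequence input (type 1) and the exon position input (type 3) describe the same structure: joining the exon segments of the genomic sequence should reproduce the CDS. The sketch below checks this on a toy sequence invented for illustration. It is a plain consistency check, not a substitute for the est2genome alignment, and the function name splice is hypothetical.

```python
def splice(genomic, exons):
    """Concatenate the exon segments of a genomic sequence.

    Exon coordinates are assumed to be 1-based and inclusive.
    Hypothetical helper for illustration only; not GSDS's own code.
    """
    return "".join(genomic[start - 1:end] for start, end in sorted(exons))


# Toy data, invented for illustration (not taken from any GenBank entry).
upstream = "CC"
exon1 = "ATGGCT"
intron = "GTAAGTTTCAG"
exon2 = "GATTGA"
downstream = "TT"
genomic = upstream + exon1 + intron + exon2 + downstream

cds = exon1 + exon2
exons = [(3, 8), (20, 25)]  # positions of exon1 and exon2 within genomic

print(splice(genomic, exons) == cds)
```

    A mismatch in such a check would suggest that the exon positions and the sequences do not belong to the same gene or use a different coordinate convention.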

  • Image Format

    PNG is a bitmap image format.
    SVG is Scalable Vector Graphics; SVG files can be edited in Illustrator. To view SVG images in a web browser, users need to install
    SVG Viewer. For more information about SVG, please refer here. For more information about SVG in Chinese, please refer here.


    © Center for Bioinformatics(CBI), Peking University. Contact: