PEDE Cluster Viewer Help

Description

This page is the viewer for the assembled EST sequences of our oligo-capped pig cDNA libraries. Search results provide information on similarity with UniGene clusters (human, mouse, cattle and pig), sequences in RefSeq (human and mouse), the draft sequence of the human genome, and cSNPs (SNPs in cDNA sequences). The result pages also report how much of the open reading frame is covered by the assembled sequences and by the longest clones. Searches can be done according to assembly release, cluster name, properties of cDNA libraries, keywords in BLAST search results, and information on human gene loci.

Usage

Items with a check box are enabled during the search only when they are checked.

Release

EST assembly is performed approximately bimonthly, and the latest version is released as soon as possible. Data in previous releases can be accessed by changing the release number accordingly.

Cluster Name

This search is by the name of an assembly. Names follow this rule:

[Release name][Contig or Singlet]-[Cluster number]

Examples:
20030531C-000305
20030430S-000028

A release name consists of 8 digits giving the date of assembly. Contigs and singlets are distinguished by the one-letter code "C" or "S". Cluster numbers have 6 digits and start at 000001.
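
The naming rule above can be checked mechanically. The following is a minimal Python sketch, not part of the viewer itself; the function name parse_cluster_name is hypothetical, and the YYYYMMDD digit order is an assumption inferred from the two examples.

```python
import re
from datetime import datetime

# Rule from the text: 8-digit release name, "C" or "S", a hyphen, 6-digit number.
CLUSTER_NAME = re.compile(r"(\d{8})([CS])-(\d{6})")

def parse_cluster_name(name):
    """Split a cluster name into (assembly date, kind, cluster number)."""
    m = CLUSTER_NAME.fullmatch(name)
    if m is None:
        raise ValueError(f"not a valid cluster name: {name!r}")
    release, code, number = m.groups()
    # Assumption: the 8 digits are ordered YYYYMMDD, as the examples suggest.
    assembly_date = datetime.strptime(release, "%Y%m%d").date()
    kind = "contig" if code == "C" else "singlet"
    return assembly_date, kind, int(number)
```

For instance, parse_cluster_name("20030430S-000028") identifies a singlet from the release assembled on 30 April 2003, while a name with a missing digit or a different letter code raises ValueError.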

You should use this item only when you know the cluster name(s) for your sequence(s) of interest.

Library Name

This search yields the contigs and singlets derived from the reads from a particular library.

Clone Name

This search returns the cluster(s) containing the read(s) derived from a particular clone. Read names are defined as follows:

[Library name]_[Plate name]_[Well].[Trailed character(s)]

Examples:
LNG01_0086_A07.b
OVRM1_0032_G11.b

Trailing characters are appended to the name produced by the above scheme to specify the direction and location of the read in the plasmid insert. These trailed characters essentially follow the St. Louis naming scheme. The search is done by forward match, that is, the query is compared with the beginning of each read name.
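
As an illustration of the read-name scheme and of forward matching, here is a short Python sketch. It is not the viewer's own code: the helper names are hypothetical, and the pattern assumes that library and plate names contain no underscore and that a well is one row letter followed by two digits, as in the examples.

```python
import re

# Assumed pattern: [Library]_[Plate]_[Well].[Trailed characters]
READ_NAME = re.compile(r"([^_]+)_([^_]+)_([A-Z]\d{2})\.(.+)")

def parse_read_name(name):
    """Split a read name into its library, plate, well and trailed characters."""
    m = READ_NAME.fullmatch(name)
    if m is None:
        raise ValueError(f"not a valid read name: {name!r}")
    library, plate, well, suffix = m.groups()
    return {"library": library, "plate": plate, "well": well, "suffix": suffix}

def forward_match(query, read_names):
    """Return the read names that begin with the query string."""
    return [name for name in read_names if name.startswith(query)]
```

With forward matching, a query such as LNG01_0086_A07 (without trailed characters) would select every read of that clone, whatever its direction.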

Search on BLAST result

This search is based on the results of BLAST searches of the cluster sequences in the database, performed by using NCBI UniGene data as queries. Human, mouse, cattle, and pig UniGene clusters are used in the search, and the UniGene database can be selected to limit the result to a particular species. The search can be done with the following items as queries:

UniGene ID
Accession number of the UniGene cluster
GenBank ID of the UniGene cluster
Keyword or locus name of the gene
Human chromosome where loci of human genes with high similarity are localized
BLAST score threshold for the homology search (This parameter affects the results of the keyword or locus name and human chromosome searches)
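
The way these query items combine can be sketched in Python. This is only an illustration under stated assumptions: the Hit record and its field names are hypothetical, and the actual database schema and matching rules of the viewer are not described here. The one behaviour taken from the text is that the score threshold is applied only to keyword or locus name searches and human chromosome searches.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    # Hypothetical record of one BLAST hit for a cluster.
    cluster: str           # cluster name, e.g. "20030531C-000305"
    database: str          # UniGene species: "human", "mouse", "cattle" or "pig"
    description: str       # gene description or locus name of the hit
    human_chromosome: str  # chromosome of the similar human locus, "" if none
    score: float           # BLAST score of the hit

def search_hits(hits, database=None, keyword=None, human_chromosome=None, min_score=0.0):
    """Return cluster names whose hits satisfy all of the given query items."""
    # Per the text, the threshold affects keyword/locus and chromosome searches.
    uses_threshold = keyword is not None or human_chromosome is not None
    matched = []
    for hit in hits:
        if database is not None and hit.database != database:
            continue
        if keyword is not None and keyword.lower() not in hit.description.lower():
            continue
        if human_chromosome is not None and hit.human_chromosome != human_chromosome:
            continue
        if uses_threshold and hit.score < min_score:
            continue
        if hit.cluster not in matched:
            matched.append(hit.cluster)
    return matched
```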

Clusters in the results can be restricted to those estimated as full-length CDS according to their counterparts in the UniGene clusters or RefSeq sequences (see below). A search based on the results of BLAST searches using human and mouse RefSeq protein sequences from NCBI is also available.

Search clusters including cSNP(s)

This option limits the search to clusters with putative SNP(s) within their cDNA sequences. Because reverse transcriptase has a high error rate, single-base mutations occur frequently and complicate the identification of actual SNPs. Therefore, an allele must occur in at least 2 reads to be counted toward a putative SNP. These putative SNPs have partly been confirmed in porcine genomic sequences, and their distribution in various breeds has been investigated (see Supplemental figures and tables).
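
The two-read rule can be illustrated with a small Python sketch. This is one possible reading of the rule, not the actual detection pipeline: the function name is hypothetical, and the input is assumed to be the bases that the aligned reads show at a single position of a cluster.

```python
from collections import Counter

def putative_snp_alleles(column, min_reads=2):
    """Return the alleles at one aligned position that form a putative SNP.

    column is a string with one base per read. An allele is kept only if it
    occurs in at least min_reads reads; the position is reported only if two
    or more alleles are kept. Gaps and ambiguous bases are ignored.
    """
    counts = Counter(base for base in column.upper() if base in "ACGT")
    supported = sorted(base for base, n in counts.items() if n >= min_reads)
    return supported if len(supported) >= 2 else []
```

Under this reading, a position where a single read differs from all the others is treated as a likely reverse transcriptase or sequencing error and is not reported.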

FL (full-length) cDNA

This search returns the contigs and singlets estimated to include the entire CDS region of the corresponding UniGene cluster or RefSeq sequence. In the "Cluster Information" page, presumption of a full-length CDS is omitted if the BLAST score is less than 50.
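
The score cutoff can be expressed as a short Python sketch. Only the cutoff of 50 comes from the text; the coverage test, the coordinate convention (positions on the reference sequence) and both function names are assumptions made for illustration.

```python
def covers_cds(aln_start, aln_end, cds_start, cds_end):
    """True if the aligned region spans the whole CDS of the reference."""
    return aln_start <= cds_start and aln_end >= cds_end

def full_length_call(blast_score, aln_start, aln_end, cds_start, cds_end, min_score=50):
    """Return True or False for a full-length estimate, or None if omitted."""
    # Below the cutoff no presumption is made, mirroring "Cluster Information".
    if blast_score < min_score:
        return None
    return covers_cds(aln_start, aln_end, cds_start, cds_end)
```

Returning None rather than False keeps "no estimate" distinct from "estimated not to be full-length".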