
Submitting Data

This site allows users to deposit their data into the NCBI Probe Database. Our public submission tools are currently under development; in the meantime, we are happy to assist users with the deposition of large-scale data or of isolated probes and their associated experimental data.

For simple submissions, please download the corresponding templates below and send the completed templates to probe-admin@ncbi.nlm.nih.gov.

Note: submission formats and the submission procedure may change in the future in response to submitters' feedback and software development.

Because existing probe types vary tremendously and modern technologies are complex, it is impossible to develop simple and exhaustive templates for all of them. Therefore, if you cannot find the right template or have any questions, please contact probe-admin@ncbi.nlm.nih.gov for additional templates and explanations.

Submission overview

Required files

  1. "[some name]_gen_info" - file with contact information and information that is common to all probes in the submission (text or document format)
  2. "[some name]_seq_info" - file with individual probes' data (tab-delimited or spreadsheet format)

Mandatory fields for all types of probes

  • marked by asterisk (*) in contact and general information file
  • #TRACKING - probe's unique identifier
  • #PROBE_NAME - probe's name as it appears in the probe's title
  • #PROBE_TYPE - registered probe types are found on Probe's home page http://www.ncbi.nlm.nih.gov/sites/entrez?db=probe
  • #DESIGN_ACCESSION - accession of the source or target sequence in GenBank (preferred), DDBJ, or EMBL on which the probe design was based

Optional fields

All other fields are optional. However, we expect a reasonable number of fields to be filled with data. There are two types of optional fields:

  • fields that are defined in the database such as #TGT_ACC (target accession), #VALIDATION, #TGT_TAXID and others
  • fields that do not have a dedicated place in the database, for example, #PROTEIN_KNOCKDOWN%; such fields are concatenated and displayed in the "Result" section or another appropriate section of the probe's report page

Basic rules

  • sequences' directionality should be from 5' to 3'
  • no fancy formatting in spreadsheets (uniform font, no borders, no colors other than default black)
  • all fields in the "general information" file (except for contact information) can be moved to the data file and provided individually for each probe
  • fields such as #MARKER_ALIASES, #PMID, #PRODUCT_SIZE, etc. can accept multiple values separated by semicolons
  • any number of arbitrary fields can be included in the submission; examples: #ALLELE_NUMBER, #AMPLIFIES_IN_ORG, etc.; we reserve the right to archive and display these fields at our discretion; please explain the meaning of these fields in the email text or in the *_gen_info file
  • fill empty cells with the word 'NaN'
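
As an illustration of the rules above, the following minimal Python sketch writes a tab-delimited "[some name]_seq_info" file. It is not an official NCBI tool. The helper names (`format_cell`, `write_seq_info`), the column order, the choice of optional columns, and all row values are assumptions made for the example. The sketch also assumes that multiple values are joined with a bare semicolon (no surrounding spaces), which this page does not specify.

```python
import csv
import io

# The four mandatory columns come from this page; the two optional
# columns and their order are chosen only for illustration.
COLUMNS = ["#TRACKING", "#PROBE_NAME", "#PROBE_TYPE",
           "#DESIGN_ACCESSION", "#PMID", "#PRODUCT_SIZE"]


def format_cell(value):
    """Render one cell: 'NaN' when empty, ';'-joined when multi-valued."""
    if value is None:
        return "NaN"
    if isinstance(value, (list, tuple)):
        # Assumption: a bare ';' with no spaces separates multiple values.
        return ";".join(str(v) for v in value) if value else "NaN"
    text = str(value).strip()
    return text if text else "NaN"


def write_seq_info(rows, handle):
    """Write a header line and one tab-delimited line per probe."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        writer.writerow([format_cell(row.get(col)) for col in COLUMNS])


# Placeholder data: identifiers, names, and PMIDs below are made up.
rows = [{
    "#TRACKING": "probe_0001",
    "#PROBE_NAME": "example probe 1",
    "#PROBE_TYPE": "siRNA",
    "#DESIGN_ACCESSION": "NM_000461.4",
    "#PMID": [111, 222],
    # "#PRODUCT_SIZE" is left out on purpose and becomes 'NaN'.
}]

buffer = io.StringIO()
write_seq_info(rows, buffer)
print(buffer.getvalue(), end="")
```

Writing plain text this way also sidesteps the spreadsheet formatting restrictions, because a tab-delimited file carries no fonts, borders, or colors.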

Additional comments

  • a probe's entry can contain more than one sequence
  • a sequence can have several features; obvious features, such as universal primers, can be briefly described in the methodology text; however, some features do not have a fixed position and differ for each probe (for example, the "variation" feature); please contact probe-admin@ncbi.nlm.nih.gov for a template that includes features for the sequences in your submission
  • #VALIDATION field accepts values "comp fail", "comp success", "wet lab fail", "wet lab success".
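
Before sending a submission, it may help to check each probe row against the mandatory fields and the accepted #VALIDATION values listed on this page. The sketch below is one possible pre-submission check, not part of any NCBI software. The function name `check_row` and the sample rows are hypothetical, and the sketch assumes that #VALIDATION holds a single value per probe.

```python
# Mandatory fields and accepted #VALIDATION values as listed on this page.
MANDATORY = ("#TRACKING", "#PROBE_NAME", "#PROBE_TYPE", "#DESIGN_ACCESSION")
VALIDATION_VALUES = {"comp fail", "comp success",
                     "wet lab fail", "wet lab success"}
EMPTY = ("", "NaN")  # 'NaN' marks an empty cell in the data file


def check_row(row):
    """Return a list of human-readable problems found in one probe row."""
    problems = []
    for field in MANDATORY:
        if row.get(field, "NaN") in EMPTY:
            problems.append(f"missing mandatory field {field}")
    # Assumption: #VALIDATION carries one value, not a ';'-separated list.
    validation = row.get("#VALIDATION", "NaN")
    if validation not in EMPTY and validation not in VALIDATION_VALUES:
        problems.append(f"unrecognized #VALIDATION value: {validation!r}")
    return problems


# Hypothetical rows for illustration only.
good = {"#TRACKING": "probe_0001", "#PROBE_NAME": "example probe 1",
        "#PROBE_TYPE": "siRNA", "#DESIGN_ACCESSION": "NM_000461.4",
        "#VALIDATION": "wet lab success"}
bad = {"#TRACKING": "probe_0002", "#PROBE_NAME": "NaN",
       "#PROBE_TYPE": "siRNA", "#DESIGN_ACCESSION": "NM_000461.4",
       "#VALIDATION": "passed"}

for row in (good, bad):
    print(row["#TRACKING"], check_row(row))
```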

The NCBI dbSTS and UniSTS databases

  • dbSTS is a division of GenBank for archiving genetic markers that contain #AMPLICON sequences; please visit dbSTS submission page for more information: Submitting sequences to dbSTS
  • UniSTS is a database of genetic markers associated with such information as genetic map and genomic position; please visit UniSTS submission page for more information: Submitting Map Data to UniSTS
  • an interface that will allow data to be submitted to dbSTS, UniSTS, and Probe simultaneously is currently under development

Available templates

Template General info file (.doc) Sequence info file (.txt) Comment
siRNA siRNA_gen_info.doc siRNA_seq_info.txt
  • for #TARGET_ACC provide RefSeq (RefSeq Project) accession of any mRNA transcript of the targeted gene (for example, NM_000461.4), preferably with version
  • if #TARGET_ACC is provided #ORGANISM_TAXID and #GENE_ID become unnecessary
  • if sequences cannot be disclosed but mapping to target gene is desired, please contact probe-admin@ncbi.nlm.nih.gov for options
  • #SENSE_OH_* and #ANTISENSE_OH_* mean "sense overhang" and "antisense overhang", respectively
shRNA shRNA_gen_info.doc shRNA_seq_info.txt
  • for #TARGET_ACC provide RefSeq (RefSeq Project) accession of any mRNA transcript of the targeted gene (for example, NM_000461.4), preferably with version
  • if #TARGET_ACC is provided #ORGANISM_TAXID and #GENE_ID become unnecessary
  • if sequences cannot be disclosed but mapping to target gene is desired, please contact probe-admin@ncbi.nlm.nih.gov for options
  • #HAIRPIN_SEQ is the sequence of the "straightened out" hairpin; usually it contains the sense and antisense sequences separated by a spacer (loop); example: Pr008816793; note: the software counts positions from 0, i.e., for this example #SENSE_POS1 and #SENSE_POS2 should be submitted as "0" and "20", and #ANTISENSE_POS1 and #ANTISENSE_POS2 as "32" and "52", respectively
dsRNA or esiRNA dsRNA_gen_info.doc dsRNA_seq_info.txt
  • for #TARGET_ACC provide RefSeq (RefSeq Project) accession of any mRNA transcript of the targeted gene (for example, NM_000461.4), preferably with version
  • #DESIGN_ACC is GenBank accession of sequence that was used for primer design
  • if #TARGET_ACC or #DESIGN_ACC is known, #ORGANISM_TAXID is not necessary
  • #PRIMER_F2_* and #PRIMER_R2_* fields are for information about nested primers (if applicable)
antisense or morpholino antisense_gen_info.doc antisense_seq_info.txt
  • for #TARGET_ACC provide RefSeq (RefSeq Project) accession of any mRNA transcript of the targeted gene (for example, NM_000461.4), preferably with version
  • if #TARGET_ACC is provided #ORGANISM_TAXID and #GENE_ID become unnecessary
  • if sequences cannot be disclosed but mapping to target gene is desired, please contact probe-admin@ncbi.nlm.nih.gov for options
Primer set PrimerSet_gen_info.doc PrimerSet_seq_info.txt
  • #PRIMER_F2_* and #PRIMER_R2_* fields are for information about nested primers (if applicable)
  • please c
TaqMan Taqman_gen_info.doc Taqman_seq_info.txt
  • #PRIMER_F2_* and #PRIMER_R2_* fields are for information about nested primers (if applicable)
genetic marker: STS, SSR, RFLP, AFLP, SSLP, SSCP marker_gen_info.doc marker_seq_info.txt
  • if your markers are associated with sequenced PCR products (#AMPLICON field) and/or mapping data (map, chromosome/linkage group, map position, etc.) please contact probe-admin@ncbi.nlm.nih.gov for options (you might want to submit the sequenced amplicons and/or mapping data to GenBank or UniSTS databases, respectively)
FISH (Fluorescence In Situ Hybridization) FISH_gen_info.doc FISH_seq_info.txt
  • #TARGET_TAXID and #NON_TARGET_TAXID fields can accept multiple values delimited by semicolons
  • please provide any other additional fields as necessary
  • please use #COMMENT field to provide any additional information about this probe
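
The 0-based position convention described in the shRNA template notes for #HAIRPIN_SEQ can be illustrated with a short Python sketch. It rests on an assumption that this page does not confirm: the end positions (#SENSE_POS2, #ANTISENSE_POS2) are treated as inclusive. Under that assumption, a 21-nt sense strand, an 11-nt loop, and a 21-nt antisense strand give the values "0", "20", "32", and "52" quoted in the note. The function names and all three sequences are made-up placeholders.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def hairpin_positions(sense, loop, antisense):
    """Compute 0-based positions within the straightened-out hairpin.

    Assumption: POS2 values are inclusive end positions.
    """
    hairpin = sense + loop + antisense
    return {
        "#HAIRPIN_SEQ": hairpin,
        "#SENSE_POS1": 0,
        "#SENSE_POS2": len(sense) - 1,
        "#ANTISENSE_POS1": len(sense) + len(loop),
        "#ANTISENSE_POS2": len(hairpin) - 1,
    }


# Placeholder sequences, not taken from any real probe record.
sense = "ACGTACGTACGTACGTACGTA"          # 21 nt
loop = "TTCAAGAGATT"                     # 11 nt
antisense = reverse_complement(sense)    # 21 nt

fields = hairpin_positions(sense, loop, antisense)
for name, value in fields.items():
    print(name, value)
```

If the database instead expects half-open intervals, the same numbers would correspond to 20-nt strands and a 12-nt loop, so it is worth confirming the convention with probe-admin@ncbi.nlm.nih.gov before submitting.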

Questions or Comments?
E-mail the NCBI Service Desk
