INSDC

Controlled vocabulary for /regulatory_class

A new qualifier, /regulatory_class, was introduced in version 10.4 (November 2014) of the Feature table definitions, to be supported from 15-DEC-2014.

This page was last updated in December 2017.

The text below outlines the format and the present list of allowed controlled vocabulary. Please note that mappings to Sequence Ontology (SO) are given in brackets () but should not be used as part of the value format. Please see below for examples.
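As a rough illustration of how the SO mapping is kept apart from the value, the sketch below splits one vocabulary entry of the form used on this page ("term (SO:nnnnnnn): definition", with the SO part optional) into its pieces. The names `VocabularyEntry` and `parse_entry` are hypothetical helpers invented for this example, and the seven-digit accession pattern is an assumption based on the accessions listed here.

```python
import re
from typing import NamedTuple, Optional


class VocabularyEntry(NamedTuple):
    term: str                    # the value used in /regulatory_class
    so_accession: Optional[str]  # SO mapping, for reference only
    definition: str


# Matches "term (SO:1234567): definition" or "term: definition".
# Assumption: SO accessions always have seven digits, as on this page.
_ENTRY_RE = re.compile(
    r"^(?P<term>\w+)(?:\s+\((?P<so>SO:\d{7})\))?:\s*(?P<definition>.+)$"
)


def parse_entry(line: str) -> VocabularyEntry:
    """Split one vocabulary entry line into term, SO accession and definition."""
    match = _ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a vocabulary entry: {line!r}")
    # match["so"] is None when the entry has no SO mapping.
    return VocabularyEntry(match["term"], match["so"], match["definition"])
```

Only the `term` field would be written as the qualifier value; the accession in parentheses is never part of it.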

Qualifier: /regulatory_class

Definition: a structured description of the classification of transcriptional, translational, replicational and chromatin structure related regulatory elements in a sequence
Value format: "TYPE"
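where TYPE is one of the following: attenuator, CAAT_signal, DNase_I_hypersensitive_site, enhancer, enhancer_blocking_element, GC_signal, imprinting_control_region, insulator, locus_control_region, matrix_attachment_region, minus_35_signal, minus_10_signal, polyA_signal_sequence, promoter, recoding_stimulatory_region, recombination_enhancer, replication_regulatory_region, response_element, ribosome_binding_site, riboswitch, silencer, TATA_box, terminator, transcriptional_cis_regulatory_region, uORF, other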
Examples:
/regulatory_class="promoter"
/regulatory_class="enhancer"
/regulatory_class="ribosome_binding_site"
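The value format above can be checked mechanically. The following is a minimal sketch, not an official INSDC validator: `REGULATORY_CLASSES` and `parse_regulatory_class` are hypothetical names, the set of terms is copied from the terms defined in the Comment section of this page, and exact, case-sensitive matching with straight double quotes is an assumption.

```python
import re

# Terms defined in the Comment section of this page.
REGULATORY_CLASSES = frozenset({
    "attenuator", "CAAT_signal", "DNase_I_hypersensitive_site", "enhancer",
    "enhancer_blocking_element", "GC_signal", "imprinting_control_region",
    "insulator", "locus_control_region", "matrix_attachment_region",
    "minus_35_signal", "minus_10_signal", "polyA_signal_sequence", "promoter",
    "recoding_stimulatory_region", "recombination_enhancer",
    "replication_regulatory_region", "response_element",
    "ribosome_binding_site", "riboswitch", "silencer", "TATA_box",
    "terminator", "transcriptional_cis_regulatory_region", "uORF", "other",
})

# Qualifier form: /regulatory_class="TYPE" with straight double quotes.
_QUALIFIER_RE = re.compile(r'^/regulatory_class="([^"]*)"$')


def parse_regulatory_class(line: str) -> str:
    """Return TYPE from a /regulatory_class qualifier, or raise ValueError."""
    match = _QUALIFIER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a /regulatory_class qualifier: {line!r}")
    value = match.group(1)
    # Assumption: terms are matched exactly, including case.
    if value not in REGULATORY_CLASSES:
        raise ValueError(f"{value!r} is not in the controlled vocabulary")
    return value
```

Under these assumptions, a value that carries the SO accession, such as "promoter (SO:0000167)", is rejected, which matches the note that SO mappings are not part of the value format.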

Comment: TYPE is a term taken from the INSDC controlled vocabulary for regulatory classes:

attenuator (SO:0000140): 1) region of DNA at which regulation of termination of transcription occurs, which controls the expression of some bacterial operons;
2) sequence segment located between the promoter and the first structural gene that causes partial termination of transcription.

CAAT_signal (SO:0000172): CAAT box; part of a conserved sequence located about 75 bp upstream of the start point of eukaryotic transcription units which may be involved in RNA polymerase binding; consensus=GG(C or T)CAATCT [1,2].

References: [1] Efstratiadis, A. et al. Cell 21, 653-668 (1980); [2] Nevins, J.R. “The pathway of eukaryotic mRNA formation” Ann Rev Biochem 52, 441-466 (1983)

DNase_I_hypersensitive_site (SO:0000685): DNA region representing open chromatin structure that is hypersensitive to digestion by DNase I.

enhancer (SO:0000165): a cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter.

enhancer_blocking_element: a transcriptional cis regulatory region that when located between an enhancer and a gene’s promoter prevents the enhancer from modulating the expression of the gene. Sometimes referred to as an insulator but may not include the barrier function of an insulator.

GC_signal (SO:0000173): GC box; a conserved GC-rich region located upstream of the start point of eukaryotic transcription units and which may occur in multiple copies or in either orientation; consensus=GGGCGG

imprinting_control_region: a regulatory region that controls epigenetic imprinting and affects the expression of target genes in an allele- or parent-of-origin-specific manner. Associated regulatory elements may include differentially methylated regions and non-coding RNAs.

insulator (SO:0000627): a chromatin boundary element or barrier that can block the encroachment of condensed chromatin from an adjacent region. May also include enhancer-blocking activity.

locus_control_region (SO:0000037): a DNA region that includes DNase hypersensitive sites located 5′ to a gene or gene cluster, and which confers high-level, position-independent, and copy number-dependent expression on that gene or gene cluster.

matrix_attachment_region (SO:0000036): DNA region that is required for the binding of chromatin to the nuclear matrix.

minus_35_signal (SO:0000176): a conserved hexamer about 35 bp upstream of the start point of bacterial transcription units; consensus=TTGACa or TGTTGACA

minus_10_signal (SO:0000175): Pribnow box; a conserved region about 10 bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TAtAaT [1,2,3,4]

References: [1] Schaller, H., Gray, C., and Hermann, K. Proc Natl Acad Sci USA 72, 737-741 (1974); [2] Pribnow, D. Proc Natl Acad Sci USA 72, 784-788 (1974); [3] Hawley, D.K. and McClure, W.R. “Compilation and analysis of Escherichia coli promoter DNA sequences” Nucl Acid Res 11, 2237-2255 (1983); [4] Rosenberg, M. and Court, D. “Regulatory sequences involved in the promotion and termination of RNA transcription” Ann Rev Genet 13, 319-353 (1979)

polyA_signal_sequence (SO:0000551): the recognition sequence necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA or ATTAAA.

promoter (SO:0000167): region on a DNA molecule involved in RNA polymerase binding to initiate transcription.

recoding_stimulatory_region (SO:1001268): site in an mRNA sequence that stimulates the recoding of a region in the same mRNA; for annotating a region of an mRNA that controls a recoding event, e.g., regulates stop codon translational readthrough, selenocysteine incorporation or ribosomal slippage.

recombination_enhancer (SO:0002059): a regulatory region that promotes or induces the process of recombination.

replication_regulatory_region (SO:0001682): region that is involved in the control of the process of nucleotide replication but is not the origin of replication.

response_element: a regulatory element that acts in response to a stimulus, usually via transcription factor binding.

ribosome_binding_site (SO:0000552): ribosome binding site

riboswitch (SO:0000035): a part of an mRNA that can act as a direct sensor of small molecules to control its own expression. A riboswitch is a cis element in the 5′ end of an mRNA that acts as a direct sensor of metabolites.

silencer (SO:0000625): a regulatory region which upon binding of transcription factors suppresses the transcription of the gene(s) it controls.

TATA_box (SO:0000174): Goldberg-Hogness box; a conserved AT-rich septamer found about 25 bp before the start point of some eukaryotic RNA polymerase II transcript units, which may be involved in positioning the enzyme for correct initiation; consensus=TATA(A or T)A(A or T) [1,2].

References: [1] Efstratiadis, A. et al. Cell 21, 653-668 (1980); [2] Corden, J., et al. “Promoter sequences of eukaryotic protein-encoding genes” Science 209, 1406-1414 (1980)

terminator (SO:0000141): sequence of DNA located at the end of the transcript that causes RNA polymerase to terminate transcription.

transcriptional_cis_regulatory_region (SO:0001055): a regulatory region that modulates the transcription of a gene or genes.

uORF (SO:0002027): a short open reading frame that is found in the 5′ untranslated region of an mRNA and plays a role in translational regulation. 

other: a regulatory class not included in any other term.

Regulatory classes not yet in the INSDC /regulatory_class controlled vocabulary can be annotated by entering /regulatory_class="other" with /note="[brief explanation of novel regulatory_class]".
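A small sketch of how the "other" convention might be applied when generating annotation: the hypothetical helper `other_regulatory_class` below returns the pair of qualifiers described above. Collapsing whitespace and rejecting embedded double quotes are simplifying assumptions of this example, not rules stated on this page.

```python
from typing import List


def other_regulatory_class(explanation: str) -> List[str]:
    """Build the qualifier pair for a class missing from the vocabulary."""
    # Collapse runs of whitespace so the note stays on one line.
    note = " ".join(explanation.split())
    if not note:
        raise ValueError("an 'other' regulatory class needs a brief explanation")
    # Simplification: refuse embedded double quotes instead of escaping them.
    if '"' in note:
        raise ValueError("explanation must not contain double quotes")
    return ['/regulatory_class="other"', f'/note="{note}"']
```

For a placeholder explanation such as "brief explanation of novel regulatory_class", the helper returns the /regulatory_class="other" qualifier followed by a /note qualifier carrying that text.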