TPA Submission Guidelines

Beginning in January 2025, TPA-Exp and TPA-Inf submission types will no longer be accepted as new submissions. Please see here for more information.

Third PArty data (TPA) are submitted to the databases of the International Nucleotide Sequence Database Collaboration (INSDC) as part of the process of publishing biological studies that include the assembly and/or annotation of existing INSDC reads and primary sequences. Publicly accessible TPA data are therefore linked to one or more publications that document how the data were derived and that support the derivation with peer-reviewed scientific evidence.

All TPA records belong to one of these classes: TPA:experimental, TPA:inferential, TPA:specialist_db, or TPA:assembly.

TPA:experimental describes records that include functional annotation derived at least in part from peer-reviewed wet-lab experimental investigation.

TPA:inferential describes records that include functional annotation derived from peer-reviewed bioinformatic investigation.

TPA:specialist_db describes records whose sequences are submitted from an existing authoritative public database that is built using INSDC sequence data and is described in an accepted peer-reviewed publication. The existing database is therefore recognized to be comprehensive, to have added value, and to be maintained long term.

TPA:assembly describes records reporting assembly or reassembly, for which the generation, whether it is purely informatic or informed by experimentation, has been subject to peer review. Annotation may or may not be available and is not required to be part of the peer review for this TPA class. A further requirement for TPA:assembly is for submitters to cite the original INSDC-accessioned reads.

TPA records are clearly labeled with keywords indicating their TPA status and their class. Constructed genomes where no experimental evidence is presented (in TPA:assembly) are permitted to include only annotation relating to genes of known function (as opposed to hypothetical proteins, for example). Submissions containing neither assembly information nor annotation that has resulted from peer-reviewed in vivo, in vitro or in silico experimentation are not accepted in TPA. The outputs of computational tools, feature identification algorithms, and homology search tools alone are not sufficient for TPA.
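The acceptance rule in the paragraph above can be read as a simple check: a submission needs peer-reviewed assembly information, peer-reviewed annotation, or both. The following is a minimal sketch of that reading only. The function name and its two boolean parameters are hypothetical and are not part of any INSDC submission tool, and the real review process applies further constraints that are not modeled here.

```python
from typing import Tuple


def tpa_eligibility(peer_reviewed_assembly: bool,
                    peer_reviewed_annotation: bool) -> Tuple[bool, str]:
    """Return (eligible, reason) for a candidate TPA submission.

    Hypothetical helper. Both flags are assumed to describe evidence that
    was assessed in the peer review of the linked publication. Output from
    computational tools alone does not count as peer-reviewed annotation.
    """
    if peer_reviewed_assembly or peer_reviewed_annotation:
        return True, "has peer-reviewed assembly and/or annotation"
    return False, "neither assembly nor annotation has been peer reviewed"


# A reassembly whose annotation came only from an automated tool can still
# qualify, because the assembly itself was peer reviewed.
print(tpa_eligibility(peer_reviewed_assembly=True, peer_reviewed_annotation=False))
```

Passing `False` for both flags corresponds to the case the guidelines reject outright.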

Below is a list of typical TPA entry types and the class to which they belong. Please note that this list is neither exhaustive (there may be further entry types that would be acceptable) nor defining of the complete set of requirements (other constraints will be applied that are documented outside this list).

| Record Type | TPA Class | Description |
|---|---|---|
| 1 | Experimental | CDS and related annotation applied to a sequence derived from existing genomic, EST and/or mRNA primary records with wet laboratory experimental evidence for existence of at least part of the transcript (e.g., RT-PCR, Northern). |
| 2 | Experimental | CDS and related annotation applied to a sequence derived from existing genomic, EST and/or mRNA primary records, in addition to novel sequencing, with wet laboratory experimental evidence for existence of at least part of the transcript (e.g., RT-PCR, Northern). |
| 3 | Experimental | CDS and related annotation applied to a sequence derived from existing genomic, EST and/or mRNA primary records with experimental evidence for the presence of the product (e.g., antibody staining, biochemical assay). |
| 4 | Experimental | CDS and related annotation applied to a sequence derived from existing genomic, EST and/or mRNA primary records, in addition to novel sequencing, with experimental evidence for the presence of the product (e.g., antibody staining, biochemical assay). |
| 5 | Experimental | Re-assignment of the product name and/or function of a coding gene where there is no change to existing annotated exon, mRNA and CDS locations and wet laboratory experimental evidence is presented. |
| 6 | Experimental | Annotation of non-coding transcripts, such as antisense regulators, with wet laboratory experimental evidence for their existence and/or function. |
| 7 | Experimental | Annotation of repeat features in association with transposon, retrotransposon, integron, iteron and insertion sequences with wet laboratory experimental evidence. |
| 8 | Experimental | Annotation of functional RNA genes, such as tRNAs, scRNAs, etc., with wet laboratory experimental evidence. |
| 9 | Experimental | A record submitted as part of a collection of annotated members of a gene family, where wet laboratory experimental evidence exists for the annotation. |
| 10 | Experimental | A record or set of records representing a novel assembly or reassembly of primary read and sequence data that exist in INSDC, for which wet laboratory experimental functional annotation data are presented and have been subject to peer review associated with the linked publication. |
| 11 | Inferential | CDS and related annotation applied to a sequence derived from existing genomic, EST and/or mRNA primary records with reported wet laboratory experimental evidence for a homologous molecule, but no direct wet laboratory experimental evidence. The reported experimental evidence must have been generated by the submission group and must comply with TPA requirements for peer review. |
| 12 | Inferential | CDS and related annotation applied to a sequence derived from existing genomic, EST and/or mRNA primary records, in addition to novel sequencing, with no wet laboratory experimental evidence. If novel sequence is used to bridge two pieces of sequence, experimental evidence for a homologous molecule should exist. |
| 13 | Inferential | Record of sequence and annotation concepts covered in a review paper or discussion section, where wet laboratory experimental evidence is reported, but not generated by the TPA submitter. |
| 14 | Inferential | Annotation of non-coding genes and transcripts with no wet laboratory experimental evidence for their existence and/or function, when submitted as part of a collection of sequences with experimental evidence for at least one member of the collection. |
| 15 | Inferential | Annotation of pseudogenes with no wet laboratory experimental evidence, when submitted as part of a study that includes TPA records of functional homologues of the pseudogene. |
| 16 | Inferential | Annotation of pseudogenes that are not part of a study for which there exists experimental evidence. |
| 17 | Inferential | A record submitted as part of a collection of annotated members of a gene family, where wet laboratory experimental evidence does not exist for the annotation. One or more other members of the set should have experimental evidence and should have been submitted to TPA:experimental or to the INSDC primary database. |
| 18 | Inferential | A record representing a completely sequenced genome, or completely sequenced naturally occurring extrachromosomal element, comprising features, most of which have assigned gene symbols or product identifiers, where the annotated features may be a mix of experimentally and inferentially determined data. |
| 19 | Inferential | A record or set of records representing a novel assembly or reassembly of primary read and sequence data that exist in INSDC, for which inferred functional annotation data are presented and have been subject to peer review associated with the linked publication. |
| 20 | Specialist DB | A record submitted as part of a comprehensive collection of annotations from a given class or set of classes of gene that are derived using a published standard operating procedure that has undergone peer review. These submissions will come only from authoritative public databases of standing (such as those meeting the criteria for repeated publications in the NAR Annual Database Issue) for which a genuine case can be made for the use of TPA. |
| 21 | Assembly | A record or set of records representing a novel assembly or reassembly of primary read and sequence data that exist in INSDC, for which no functional annotation data are presented, but for which the assembly has been assessed in the peer review process associated with the linked publication. |
| 22 | Assembly | A record or set of records representing a novel assembly or reassembly, with functional annotation, of primary read and sequence data that exist in INSDC, for which the assembly but not the functional annotation has been assessed in the peer review process associated with the linked publication. |
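
For readers who want to handle the list above programmatically, the record type numbers group into contiguous ranges per class: 1 to 10 are Experimental, 11 to 19 are Inferential, 20 is Specialist DB, and 21 to 22 are Assembly. The sketch below encodes that grouping as a lookup table. All names in it are hypothetical. Treating the "TPA-Exp" and "TPA-Inf" submission types from the notice at the top of this page as the TPA:experimental and TPA:inferential classes is an assumption of this sketch.

```python
# Record type number -> TPA class, following the ranges in the list above.
TPA_CLASS_BY_RECORD_TYPE = {
    **{n: "TPA:experimental" for n in range(1, 11)},   # types 1-10
    **{n: "TPA:inferential" for n in range(11, 20)},   # types 11-19
    20: "TPA:specialist_db",                           # type 20
    **{n: "TPA:assembly" for n in range(21, 23)},      # types 21-22
}

# Assumption: these are the classes affected by the January 2025 notice.
CLOSED_TO_NEW_SUBMISSIONS = {"TPA:experimental", "TPA:inferential"}


def tpa_class(record_type: int) -> str:
    """Return the TPA class for a record type number from the list (1-22)."""
    try:
        return TPA_CLASS_BY_RECORD_TYPE[record_type]
    except KeyError:
        raise ValueError(f"unknown TPA record type: {record_type!r}") from None


def accepts_new_submissions(record_type: int) -> bool:
    """True if the class of this record type is not in the closed set."""
    return tpa_class(record_type) not in CLOSED_TO_NEW_SUBMISSIONS


print(tpa_class(21), accepts_new_submissions(21))
print(tpa_class(5), accepts_new_submissions(5))
```

The list itself is described as non-exhaustive, so a lookup like this can only cover the typical entry types named here, not every acceptable submission.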

Below is a list of entry types that are not suitable for inclusion in the TPA dataset. Please note that this list is not exhaustive.


| Record Type | TPA Class | Description |
|---|---|---|
| A | Not accepted | Annotation of repeat (and no other) features. |
| B | Not accepted | Annotation that has arisen from an automated tool, such as GeneMark, tRNAscan or ORF finder, where no further evidence, experimental or otherwise, is presented for the annotation. The annotation in these cases has not been the subject of the peer review of the publication. |
| C | Not accepted | A record representing a completely sequenced genome including only features that have not been assigned gene symbols or product identifiers, none of which has wet laboratory experimental evidence. |