DDBJ Database BioInformatics Notes
Introduction
Databases are like information banks used for storing and retrieving sequence
information. The DNA Data Bank of Japan (DDBJ) is one of three nucleotide databases;
together with the National Center for Biotechnology Information (NCBI) and the European
Bioinformatics Institute (EMBL-EBI), it forms a consortium known as the International
Nucleotide Sequence Database Collaboration (INSDC). DDBJ is the only nucleotide sequence
databank of Asian origin and mainly collects sequences from Japanese researchers. It is a
primary nucleotide database: it collects data directly from researchers, and the data are
freely accessible to everyone. On accepting a nucleotide sequence, DDBJ issues the
submitter an accession number that is recognized internationally.
History
DDBJ was established in 1986 at the National Institute of Genetics (NIG), Japan,
with support from the Japanese Ministry of Education, Culture, Sports, Science and
Technology (MEXT). Later, for its efficient functioning, the Center for Information Biology
(CIB) was established at NIG in 1995. In 2004, NIG became a member of the Research
Organization of Information and Systems.
The functioning and maintenance of DDBJ are monitored by an international advisory
committee consisting of 9 members from Japan, Europe and the USA. The committee reviews
the functioning of DDBJ and reports its progress every year in the database issue of the
journal Nucleic Acids Research. Since its inception there has been a tremendous increase
in the number of sequences submitted to DDBJ.
Figure: Growth of DDBJ/EMBL/NCBI in terms of nucleotide data submitted over the years.
Roles of DDBJ
As a member of INSDC, the primary objective of DDBJ is to collect sequence data from
researchers all over the world and to issue a unique accession number for each entry. The
data collected from submitters are made publicly available, and anyone can access them
through the data retrieval tools available at DDBJ. Data submitted to DDBJ, EMBL or NCBI
are exchanged among the three every day, so at any given time the three databases contain
the same data.
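The daily exchange described above can be pictured as a merge keyed on accession number. The following is only a minimal sketch under stated assumptions: the function name `exchange`, the dictionary layout, and the accession strings are hypothetical placeholders, and the real INSDC exchange mechanism is not described in these notes.

```python
def exchange(*databases):
    """Merge the entries of all databases and copy the union back into each.

    Each database is modelled as a dict mapping accession number -> sequence.
    This is an illustrative model, not the actual INSDC exchange protocol.
    """
    merged = {}
    for db in databases:
        merged.update(db)      # collect every entry seen in any database
    for db in databases:
        db.update(merged)      # give each database the full set
    return merged


# Hypothetical placeholder accessions and sequences, one new entry per database.
ddbj = {"ACC0001": "ATGC"}
embl = {"ACC0002": "GGCC"}
ncbi = {"ACC0003": "TTAA"}

merged = exchange(ddbj, embl, ncbi)
```

Because accession numbers are unique across the collaboration, a plain dictionary merge is enough for this toy model; after the call, all three dictionaries hold the same three entries.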
Activities of DDBJ
The activities of DDBJ are as follows:
Collection of sequence
The sequences collected from submitters are stored in the database as entries. Each
entry consists of a nucleotide sequence, author information, a reference, the organism
from which the sequence was determined, the properties of the sequence, and so on.
Figure: Snapshots of steps taken to retrieve sequence from DDBJ using getentry tool.
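The contents of an entry listed above can be sketched as a small record type. This is only an illustration: the class name `Entry`, its field names, and the sample values are assumptions made here, not DDBJ's actual flat-file format or schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    """Illustrative model of one database entry (field names are hypothetical)."""
    accession: str                      # unique accession number issued on acceptance
    sequence: str                       # the nucleotide sequence itself
    authors: List[str]                  # author information
    reference: str                      # associated publication or reference
    organism: str                       # organism the sequence was determined from
    properties: Dict[str, str] = field(default_factory=dict)  # other sequence properties

    def length(self) -> int:
        """Return the sequence length in bases."""
        return len(self.sequence)


# Placeholder values for illustration only; not a real DDBJ record.
example = Entry(
    accession="ACC0001",
    sequence="ATGAAACTTGGG",
    authors=["A. Researcher"],
    reference="Unpublished",
    organism="Example organism",
    properties={"molecule_type": "DNA"},
)
```

Keeping the accession number as a separate field mirrors its role as the key by which an entry is retrieved.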
Software development
The DDBJ team continuously focuses on developing new software that can be used for data
analysis. For example, WINA (A Window Analysis Program for the number of synonymous
and nonsynonymous nucleotide substitutions) has been developed by DDBJ. It is a tool that
helps in visualizing the difference in the accumulation of synonymous and nonsynonymous
nucleotide substitutions.
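To make the idea of a window analysis concrete, here is a deliberately simplified sketch. It is not WINA's algorithm, which is not described in these notes: it classifies each differing codon pair of two aligned coding sequences as synonymous or nonsynonymous using the standard genetic code, then counts both kinds inside a sliding window. The function names, window size, and sample sequences are assumptions for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codons(seq_a, seq_b):
    """Label each aligned codon pair as 'same', 'syn' or 'nonsyn'."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, gap-free and in frame")
    labels = []
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            labels.append("same")
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            labels.append("syn")       # codon changed, amino acid unchanged
        else:
            labels.append("nonsyn")    # codon changed, amino acid changed
    return labels


def window_counts(labels, window, step=1):
    """Return (start_codon, syn_count, nonsyn_count) for each window position."""
    counts = []
    for start in range(0, len(labels) - window + 1, step):
        chunk = labels[start:start + window]
        counts.append((start, chunk.count("syn"), chunk.count("nonsyn")))
    return counts


# Made-up sequences of four codons each.
labels = classify_codons("ATGAAACTTGGG", "ATGAAGCTAGAG")
profile = window_counts(labels, window=2)
```

Plotting the two counts against window position gives the kind of picture the text describes: regions where nonsynonymous changes accumulate faster or slower than synonymous ones. A real analysis would estimate substitutions per site rather than count whole codons.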
Training courses
DDBJ also provides teaching assistance in bioinformatics. It conducts a
bioinformatics training course that teaches the analysis of data.
Data Updates
Once a sequence is submitted, the submitter receives an accession number. If the
submitter later needs to modify or update the sequence, the data update option is used.
Only the original submitter of the data is authorized to update it.
Database Services
DDBJ Omics Archive
DDBJ Omics Archive (DOR) contains quantitative genomics data from DNA microarray and
next-generation sequencing platforms. It exchanges data with EBI ArrayExpress and, like
ArrayExpress, follows the MINSEQE (Minimum Information about a High-Throughput
Sequencing Experiment) and MIAME (Minimum Information about a Microarray
Experiment) guidelines. DOR accepts unprocessed as well as processed data from researchers.