PROTEIN DATABASES
Exploring proteins without the right databases is like navigating a foreign city with no map. Thankfully, the bioinformatics world offers a vibrant collection of resources, each designed to reveal a different layer of protein identity, structure, function, evolution, and interactions. Whether you're hunting for high-quality curated sequences, looking up 3D structures, tracing domain families, mapping metabolic pathways, or uncovering interaction networks, these databases act as your scientific toolkit. Together, they turn raw protein data into meaningful biological insight, making research faster, smarter, and far more exciting.
Major Protein Databases
| No. | Database | Description | Link |
|---|---|---|---|
| 1 | UniProtKB/Swiss-Prot | Manually curated, non-redundant section of UniProtKB with high-quality protein sequences and functional information (domains, PTMs, variants). | https://www.uniprot.org/ |
| 2 | UniProtKB/TrEMBL | Automatically annotated supplement of UniProtKB, holding sequences not yet curated into Swiss-Prot. | https://www.uniprot.org/ |
| 3 | PDB | Protein Data Bank: experimentally determined 3D structures of proteins, nucleic acids, and complexes. | https://www.rcsb.org/ |
| 4 | GenBank Protein | NCBI repository of protein sequences derived from translations of annotated coding regions in GenBank. | https://www.ncbi.nlm.nih.gov/protein/ |
| 5 | RefSeq | Curated, non-redundant collection of sequences that serves as a reference standard for genomes, transcripts, and proteins. | https://www.ncbi.nlm.nih.gov/refseq/ |
| 6 | Pfam | Protein families and domains database; provides multiple sequence alignments and hidden Markov models for domain detection. The standalone Pfam site has been retired, and Pfam data is now served through InterPro. | https://pfam.xfam.org/ |
| 7 | InterPro | Integrates protein families, domains, and functional sites from multiple member databases (including Pfam and PROSITE); useful for functional annotation. | https://www.ebi.ac.uk/interpro/ |
| 8 | PROSITE | Database of protein domains, families, and functional sites described by patterns and profiles; useful for motif scanning. | https://prosite.expasy.org/ |
| 9 | ModBase | Database of comparative (homology) protein structure models for proteins without an experimental structure. | http://salilab.org/modbase/ |
| 10 | SCOPe | Structural Classification of Proteins (extended): organizes protein domain structures into a hierarchy (class, fold, superfamily, family) that gives evolutionary context. | https://scop.berkeley.edu/ |
| 11 | CATH | Structural classification database (Class, Architecture, Topology, Homologous superfamily); an alternative to SCOPe. | https://www.cathdb.info/ |
| 12 | STRING | Known and predicted protein-protein interaction (PPI) networks, covering physical and functional associations across many organisms. | https://string-db.org/ |
| 13 | BioGRID | Repository of experimentally determined protein and genetic interactions; valuable for microbial interactome studies. | https://thebiogrid.org/ |
| 14 | BRENDA | Comprehensive enzyme information system (reaction, kinetics, substrate and product data); links proteins to enzymatic function and pathways. | https://www.brenda-enzymes.org/ |
| 15 | KEGG Genes/Proteins | Genome-wide gene and protein sets annotated with pathways, orthology, and networks (including bacterial species); integral to mapping proteins onto metabolic pathways. | https://www.genome.jp/kegg/genes.html |
| 16 | UniRef | Clusters of UniProt sequences at three identity thresholds (100%, 90%, 50%); helps reduce redundancy and speed up searches. | https://www.uniprot.org/uniref/ |
| 17 | DisProt | Database of intrinsically disordered proteins and regions. | https://disprot.org/ |
| 18 | Human Protein Atlas | Human-centric reference for protein expression and subcellular localization. Analogous localization databases exist for bacteria, but this one gives a good idea of how protein context is catalogued. | https://www.proteinatlas.org/ |
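The Swiss-Prot versus TrEMBL distinction in rows 1 and 2 shows up directly in UniProt FASTA downloads: header lines start with `>sp|` for reviewed Swiss-Prot entries and `>tr|` for unreviewed TrEMBL entries, followed by the accession and the entry name. The snippet below is a minimal sketch of reading that prefix, not an official UniProt parser. The names `UniProtHeader` and `parse_uniprot_header` are made up for this illustration, and the TrEMBL header in the usage example is a made-up placeholder rather than a real entry.

```python
import re
from typing import NamedTuple, Optional


class UniProtHeader(NamedTuple):
    """Fields pulled from a UniProt-style FASTA header (illustrative type)."""
    section: str       # "Swiss-Prot" (reviewed) or "TrEMBL" (unreviewed)
    accession: str     # e.g. "P69905"
    entry_name: str    # e.g. "HBA_HUMAN"
    description: str   # protein name, i.e. the text before the "OS=" field


# "sp" marks Swiss-Prot entries and "tr" marks TrEMBL entries.
_SECTIONS = {"sp": "Swiss-Prot", "tr": "TrEMBL"}

# Expected layout: >db|ACCESSION|ENTRY_NAME Protein name OS=... OX=... ...
_HEADER_RE = re.compile(r"^>(sp|tr)\|([^|]+)\|(\S+)\s*(.*)$")


def parse_uniprot_header(line: str) -> Optional[UniProtHeader]:
    """Return the parsed header, or None if the line is not in UniProt format."""
    match = _HEADER_RE.match(line.strip())
    if match is None:
        return None
    db, accession, entry_name, rest = match.groups()
    # Keep only the protein name; drop the OS=/OX=/GN=/PE=/SV= fields.
    description = re.split(r"(?:^|\s)OS=", rest, maxsplit=1)[0]
    return UniProtHeader(_SECTIONS[db], accession, entry_name, description)


if __name__ == "__main__":
    headers = [
        ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2",
        # Made-up placeholder header, used only to show the "tr" prefix.
        ">tr|A0A000EXAMPLE|A0A000EXAMPLE_ECOLI Uncharacterized protein OS=Escherichia coli",
        # A header that is not in UniProt format is reported as None.
        ">NP_000549.1 hemoglobin subunit alpha [Homo sapiens]",
    ]
    for header in headers:
        print(parse_uniprot_header(header))
```

A check like this is handy when a FASTA file mixes both sections and only the manually curated entries are wanted; for a strict guarantee, downloading the reviewed set directly from UniProt is the safer route.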