
URL: https://alphafold.ebi.ac.uk/
Submission: On September 03 via manual from JP — Scanned from JP

ALPHAFOLD PROTEIN STRUCTURE DATABASE


Developed by DeepMind and EMBL-EBI

AlphaFold DB provides open access to over 200 million protein structure
predictions to accelerate scientific research.


BACKGROUND

AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D
structure from its amino acid sequence. It regularly achieves accuracy
competitive with experiment.

DeepMind and EMBL’s European Bioinformatics Institute (EMBL-EBI) have partnered
to create AlphaFold DB to make these predictions freely available to the
scientific community. The latest database release contains over 200 million
entries, providing broad coverage of UniProt (the standard repository of protein
sequences and annotations). We provide individual downloads for the human
proteome and for the proteomes of 47 other key organisms important in research
and global health. We also provide a download for the manually curated subset of
UniProt (Swiss-Prot).



Q8I3H7: May protect the malaria parasite against attack by the immune system.
Mean pLDDT 85.57.


In CASP14, AlphaFold was the top-ranked protein structure prediction method by a
large margin, producing predictions with high accuracy. While the system still
has some limitations, the CASP results suggest AlphaFold has immediate potential
to help us understand the structure of proteins and advance biological research.

Let us know how the AlphaFold Protein Structure Database has been useful in your
research, or if you have questions not answered in the FAQs, at
alphafold@deepmind.com.

If your use case isn’t covered by the database, you can generate your own
AlphaFold predictions using DeepMind’s Colab notebook or open source code. Both
resources also support multimer prediction.

Q8W3K0: A potential plant disease resistance protein. Mean pLDDT 82.24.





WHAT’S NEXT?

We plan to continue updating the database with structures for newly discovered
protein sequences, and to improve features and functionality in response to user
feedback. Please follow DeepMind's and EMBL-EBI’s social channels for updates.


LICENCE AND ATTRIBUTION

All of the data provided is freely available for both academic and commercial
use under Creative Commons Attribution 4.0 (CC-BY 4.0) licence terms. If you use
this resource, please cite the following papers:

Jumper, J et al. Highly accurate protein structure prediction with AlphaFold.
Nature (2021).

Varadi, M et al. AlphaFold Protein Structure Database: massively expanding the
structural coverage of protein-sequence space with high-accuracy models. Nucleic
Acids Research (2022).

The structures provided in this resource are predictions with varying levels of
confidence and should be interpreted carefully.


FAQS


HOW DO I SEARCH THE DATABASE?

The search bar at the top of the page accepts queries based on protein name
(e.g. Free fatty acid receptor 2), gene name (e.g. At1g58602), UniProt accession
(e.g. Q5VSL9), or organism name (e.g. E. coli).
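
When scripting lookups, it can help to tell a UniProt accession apart from a name-based query before searching. The sketch below is not part of AlphaFold DB: it relies on the accession pattern documented by UniProt, and the function name `looks_like_uniprot_accession` is hypothetical.

```python
import re

# Accession pattern as documented by UniProt (an assumption here:
# AlphaFold DB itself does not publish a validation rule).
_UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def looks_like_uniprot_accession(query: str) -> bool:
    """Return True if the whole query has the shape of a UniProt accession."""
    return _UNIPROT_ACCESSION.fullmatch(query.strip()) is not None
```

With the examples from the text, `Q5VSL9` has the shape of an accession, while `At1g58602`, `E. coli` and `Free fatty acid receptor 2` do not and would be treated as name-based queries.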

BLAST / sequence-based searching is not currently supported directly on this
site. However, you can use the EBI Protein Similarity Search tool to search
AlphaFold DB based on a query sequence. Other external tools support
structure-based search against AlphaFold DB, including FoldSeek, Dali and
3D-AF-Surfer. 

--------------------------------------------------------------------------------


WHICH PROTEINS ARE INCLUDED?

 

AlphaFold DB has grown in several stages:

 * July 2021: included 20 model organism proteomes, with sequences taken from
   the “one sequence per gene” reference proteomes provided in UniProt 2021_02.
 * December 2021: added Swiss-Prot, from UniProt 2021_04.
 * January 2022: added a collection of proteomes relevant to global health,
   taken from priority lists compiled by the World Health Organisation.
   Sequences were again taken from the “one sequence per gene” reference
   proteomes provided in UniProt 2021_04.
 * July 2022: added most of the remaining UniProt 2021_04. As part of this
   release we have also included an additional tar file on the AFDB download
   page and FTP, containing predictions for MANE Select sequences.
 * November 2022: updated a set of structures affected by a temporary numerical bug
   (miscompilation) in the previous July release (list of affected accessions,
   N.B. 160 MiB). This temporary issue resulted in low accuracy predictions with
   correspondingly low pLDDT for ~4% of the total structure predictions
   available in the database. This release includes:
   * Updated coordinates for affected structures. You can still access all old
     coordinates as v3 files, and easily compare v3 and v4 coordinates
   * Minor metadata changes in the mmCIF files for the rest of the structures
     (these files are released as v4). Please refer to our changelog for more
     details. Note that as part of this release we’ve also removed predictions
     with Cα-Cα distances > 10 Å.

The wider UniProt predictions are the output of a single model, while Swiss-Prot
/ proteomes entries represent the most confident prediction out of 5 model runs.
Internal benchmarking on CASP14 shows that the model used for UniProt
(“model_2_ptm”) is only marginally less accurate (about 1 GDT lower than
selecting from five models), and that there is a slight bias toward lower
confidence (about 1 pLDDT lower) because a single model is used rather than the
best of 5.
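
As a rough illustration of "most confident prediction out of 5 model runs", the sketch below picks the run with the highest mean pLDDT. Ranking by mean pLDDT is an assumption made for this example (the text above only says "most confident"), and the function name and input layout are hypothetical.

```python
from statistics import mean


def most_confident(plddt_by_model: dict[str, list[float]]) -> str:
    """Return the name of the model run with the highest mean pLDDT.

    Assumption: confidence is summarised as the mean per-residue pLDDT.
    """
    return max(plddt_by_model, key=lambda name: mean(plddt_by_model[name]))
```

Using a single model skips this selection step, which is consistent with the slightly lower average confidence described above.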

Not all sequences are covered; the most common reasons for a missing sequence
are:

 * It is outside our length range. The minimum length is 16 amino acids, while
   the maximum is 2,700 for proteomes / Swiss-Prot and 1,280 for the rest of
   UniProt. For the human proteome only, our download includes longer proteins
   segmented into fragments.
 * It contains non-standard amino acids (e.g. X).
 * It is not in the UniProt reference proteome “one sequence per gene” FASTA
   file.
 * It has been added or modified by UniProt in a more recent release.
 * It is a viral protein. These are currently excluded, pending improved support
   for polyproteins.
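
The first two criteria above can be checked locally before searching. The following is a minimal sketch, not a tool provided by AlphaFold DB: the function and constant names are hypothetical, and treating the 20 canonical one-letter codes as the "standard" amino acids is an assumption (the text only gives X as an example). The remaining criteria (reference proteome membership, UniProt release, viral origin) cannot be decided from the sequence alone and are not covered.

```python
# Assumption: "standard" means the 20 canonical one-letter amino acid codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MIN_LENGTH = 16
MAX_LENGTH_PROTEOME_OR_SWISSPROT = 2700
MAX_LENGTH_OTHER_UNIPROT = 1280


def exclusion_reasons(sequence: str, in_proteome_or_swissprot: bool) -> list[str]:
    """List the length and residue criteria that would exclude a sequence."""
    reasons = []
    max_length = (
        MAX_LENGTH_PROTEOME_OR_SWISSPROT
        if in_proteome_or_swissprot
        else MAX_LENGTH_OTHER_UNIPROT
    )
    if len(sequence) < MIN_LENGTH:
        reasons.append(f"shorter than {MIN_LENGTH} amino acids")
    elif len(sequence) > max_length:
        reasons.append(f"longer than {max_length} amino acids")
    # Report any residue codes outside the assumed standard alphabet.
    nonstandard = sorted(set(sequence.upper()) - STANDARD_AMINO_ACIDS)
    if nonstandard:
        reasons.append("non-standard amino acids: " + ", ".join(nonstandard))
    return reasons
```

An empty list means only that these two criteria do not rule the sequence out.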

We plan to continue updating the database. In the meantime, if your sequence(s)
aren’t included, you can generate your own AlphaFold predictions using
DeepMind’s Colab notebook and open source code, which also support multimer
predictions.

 

--------------------------------------------------------------------------------


WHAT IF I CAN’T FIND THE PROTEIN I’M INTERESTED IN?

 

If you can’t find the structure you’re looking for, here are some suggestions to
improve your search results:

 * Try searching by protein or gene name rather than specific UniProt accession.
 * If you don't see the sequence you are looking for, search for it with the
   EBI Protein Similarity Search tool against the sequences in AlphaFold DB.
   This finds the structure prediction if the query sequence is already in
   AlphaFold DB; if it is not, a prediction for a similar sequence may be
   available.
 * For human proteins longer than 2,700 amino acids, check the whole proteome
   download. This contains longer proteins predicted as overlapping fragments.
   For example, Titin has predicted fragment structures named as Q8WZ42-F1
   (residues 1–1400), Q8WZ42-F2 (residues 201–1600), etc.
 * Check that the protein isn’t excluded by any of the criteria covered in the
   previous FAQ.
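
The Titin example above suggests fragments of 1,400 residues that start every 200 residues. The sketch below enumerates such windows. The window and step values are inferred from that single example, and the handling of the final fragment (clipped to the end of the protein) is an assumption, so check the results against the actual files in the proteome download. The function name is hypothetical.

```python
def fragment_ranges(length: int, window: int = 1400, step: int = 200) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) residue ranges of overlapping fragments.

    Assumption: the last fragment is clipped so that it ends at the final residue.
    """
    ranges = []
    start = 1
    while True:
        end = min(start + window - 1, length)
        ranges.append((start, end))
        if end == length:
            break
        start += step
    return ranges
```

For a hypothetical 3,000-residue protein, the first two ranges are (1, 1400) and (201, 1600), matching the F1 and F2 pattern quoted for Titin.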


The AlphaFold source code and Colab notebook can be used to predict the
structures of proteins not in AlphaFold DB. Both resources have been updated to
support predicting multimer structures.

If you experience any issues with search, please contact afdbhelp@ebi.ac.uk. 

--------------------------------------------------------------------------------


HOW CAN I DOWNLOAD AND USE THE PREDICTED ALIGNED ERROR (PAE) FILE?

 

The PAE is displayed as an image for each of the structure predictions. If you
need the raw data with PAE for all residue pairs, you can download the PAE as a
JSON file using the button at the top of the structure page.

This file is in a custom format and it isn't supported by any existing software
– you will have to use Python or another programming language to analyse or plot
the information that is contained in it.

For a protein of length num_res, the JSON file is a list containing a single
object with the following structure:
[
    {
        "predicted_aligned_error": [[0, 1, 4, 7, 9, ...], ...], # Shape: (num_res, num_res).
        "max_predicted_aligned_error": 31.75
    }
]



The fields in the JSON file are:

 * predicted_aligned_error: The PAE value of the residue pair, rounded to the
   closest integer. For the PAE value at position (i, j), i is the residue on
   which the structure is aligned, j is the residue on which the error is
   predicted.
 * max_predicted_aligned_error: A number that denotes the maximum possible value
   of PAE. The smallest possible value of PAE is 0.
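
Reading the file needs only a JSON parser. The sketch below uses Python's standard `json` module. The function names are hypothetical, and the mapping from 1-based residue numbers to 0-based list indices is an assumption about how the matrix rows and columns are ordered.

```python
import json


def parse_pae(json_text: str):
    """Return (matrix, max_pae) from the text of a PAE JSON file."""
    records = json.loads(json_text)
    record = records[0]  # the top-level list wraps a single object
    matrix = record["predicted_aligned_error"]
    max_pae = record["max_predicted_aligned_error"]
    num_res = len(matrix)
    # The matrix should have shape (num_res, num_res).
    if any(len(row) != num_res for row in matrix):
        raise ValueError("PAE matrix is not square")
    return matrix, max_pae


def pae_at(matrix, aligned_residue: int, scored_residue: int):
    """PAE at (i, j): aligned on residue i, error predicted on residue j.

    Assumption: residues are numbered from 1 in matrix order.
    """
    return matrix[aligned_residue - 1][scored_residue - 1]
```

To use it on a downloaded file, read the file contents (for example with `open(path).read()`) and pass the text to `parse_pae`. Note that `pae_at(m, i, j)` and `pae_at(m, j, i)` can differ, because the two residues play different roles.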

We updated the PAE JSON file format on 28th July 2022 to reduce file size by a
factor of 4. Please ensure you read the 2D matrix of PAE values from the
predicted_aligned_error field instead of the removed 1D "distance" field and
avoid using the old "residue1" and "residue2" fields.

If you are using a script or third party tool to read the PAE JSON file
programmatically and you are seeing errors (e.g. missing field "distance"),
check with the author of the program whether the latest PAE JSON format is
supported.
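
If you maintain a script that still has to read files saved before the format change, one option is to accept both layouts. The sketch below is an illustration under stated assumptions, not a documented conversion: it assumes the old fields are three parallel 1D lists, that "residue1" and "residue2" hold 1-based residue numbers, and that "residue1" corresponds to the row of the new matrix. The function name is hypothetical.

```python
def pae_matrix(record: dict) -> list:
    """Return the 2D PAE matrix from a new-format or old-format record."""
    if "predicted_aligned_error" in record:
        return record["predicted_aligned_error"]  # current format

    # Old format (assumed layout): parallel 1D lists of residue pairs and values.
    values = record.get("distance", record.get("distances"))
    if values is None:
        raise KeyError("no PAE values found in record")
    rows, cols = record["residue1"], record["residue2"]
    num_res = max(max(rows), max(cols))
    matrix = [[0.0] * num_res for _ in range(num_res)]
    for i, j, value in zip(rows, cols, values):
        # Assumption: residue numbers start at 1 and residue1 is the row index.
        matrix[i - 1][j - 1] = value
    return matrix
```

Checking for the new field first means files downloaded after the change are never reinterpreted.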

 

--------------------------------------------------------------------------------


WHO SHOULD I CONTACT WITH ENQUIRIES?

 

For questions and feedback about the AlphaFold DB website, please contact
afdbhelp@ebi.ac.uk.

For sharing feedback on structure predictions, please use the feedback buttons
on each structure page.

For other questions about AlphaFold not directly related to the database, please
contact the AlphaFold team at alphafold@deepmind.com. Please do not share
anything confidential with DeepMind.

For press enquiries, please contact press@deepmind.com or comms@ebi.ac.uk.

 

--------------------------------------------------------------------------------

EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK. +44
(0)1223 49 44 44

Copyright © EMBL 2022 | EMBL-EBI is part of the European Molecular Biology
Laboratory | Terms of use | License and Disclaimer
