alphafold.ebi.ac.uk
34.149.152.8
Public Scan
URL: https://alphafold.ebi.ac.uk/
Submission: On September 03 via manual from JP — Scanned from JP
Text Content
ALPHAFOLD PROTEIN STRUCTURE DATABASE
Developed by DeepMind and EMBL-EBI

Search examples: Free fatty acid receptor 2, At1g58602, Q5VSL9, E. coli

AlphaFold DB provides open access to over 200 million protein structure predictions to accelerate scientific research.

BACKGROUND

AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment.

DeepMind and EMBL’s European Bioinformatics Institute (EMBL-EBI) have partnered to create AlphaFold DB to make these predictions freely available to the scientific community. The latest database release contains over 200 million entries, providing broad coverage of UniProt (the standard repository of protein sequences and annotations). We provide individual downloads for the human proteome and for the proteomes of 47 other key organisms important in research and global health. We also provide a download for the manually curated subset of UniProt (Swiss-Prot).

Q8I3H7: May protect the malaria parasite against attack by the immune system. Mean pLDDT 85.57.

In CASP14, AlphaFold was the top-ranked protein structure prediction method by a large margin, producing predictions with high accuracy. While the system still has some limitations, the CASP results suggest AlphaFold has immediate potential to help us understand the structure of proteins and advance biological research.

Let us know how the AlphaFold Protein Structure Database has been useful in your research, or if you have questions not answered in the FAQs, at alphafold@deepmind.com.
If your use case isn’t covered by the database, you can generate your own AlphaFold predictions using DeepMind’s Colab notebook or open source code. Both resources also support multimer prediction.

Q8W3K0: A potential plant disease resistance protein. Mean pLDDT 82.24.

FIND OUT MORE

* Methodology
* Human proteome predictions
* Downloads
* About AlphaFold DB
* DeepMind
* EMBL-EBI

WHAT’S NEXT?

We plan to continue updating the database with structures for newly discovered protein sequences, and to improve features and functionality in response to user feedback. Please follow DeepMind's and EMBL-EBI’s social channels for updates.

LICENCE AND ATTRIBUTION

All of the data provided is freely available for both academic and commercial use under Creative Commons Attribution 4.0 (CC-BY 4.0) licence terms.

If you use this resource, please cite the following papers:

* Jumper, J et al. Highly accurate protein structure prediction with AlphaFold. Nature (2021).
* Varadi, M et al. AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Research (2022).

The structures provided in this resource are predictions with varying levels of confidence and should be interpreted carefully.

FAQS

HOW DO I SEARCH THE DATABASE?

The search bar at the top of the page accepts queries based on protein name (e.g. Free fatty acid receptor 2), gene name (e.g. At1g58602), UniProt accession (e.g. Q5VSL9), or organism name (e.g. E. coli).

BLAST / sequence-based searching is not currently supported directly on this site. However, you can use the EBI Protein Similarity Search tool to search AlphaFold DB based on a query sequence. Other external tools support structure-based search against AlphaFold DB, including FoldSeek, Dali and 3D-AF-Surfer.

--------------------------------------------------------------------------------

WHICH PROTEINS ARE INCLUDED?
AlphaFold DB has grown in several stages:

* July 2021: included 20 model organism proteomes, with sequences taken from the “one sequence per gene” reference proteomes provided in UniProt 2021_02.
* December 2021: added Swiss-Prot, from UniProt 2021_04.
* January 2022: added a collection of proteomes relevant to global health, taken from priority lists compiled by the World Health Organisation. Sequences were again taken from the “one sequence per gene” reference proteomes provided in UniProt 2021_04.
* July 2022: added most of the remaining UniProt 2021_04. As part of this release we have also included an additional tar file on the AFDB download page and FTP, containing predictions in MANE select.
* November 2022: updated a set of structures affected by a temporary numerical bug (miscompilation) in the previous July release (list of affected accessions, N.B. 160 MiB). This temporary issue resulted in low accuracy predictions with correspondingly low pLDDT for ~4% of the total structure predictions available in the database. This release includes:
  * Updated coordinates for affected structures. You can still access all old coordinates as v3 files, and easily compare v3 and v4 coordinates.
  * Minor metadata changes in the mmCIF files for the rest of the structures (these files are released as v4).
  Please refer to our changelog for more details. Note that as part of this release we’ve also removed predictions with Cα-Cα distances greater than 10 Å.

The wider UniProt predictions are the output of a single model, while Swiss-Prot / proteomes entries represent the most confident prediction out of 5 model runs. Internal benchmarking on CASP14 shows that the model used for UniProt (“model_2_ptm”) is insignificantly less accurate (-1 GDT versus five models), and that there is a slight bias toward lower confidence (-1 pLDDT) due to the effect of using one model rather than selecting from 5.

Not all sequences are covered; the most common reasons for a missing sequence are:

* It is outside our length range. The minimum length is 16 amino acids, while the maximum is 2,700 for proteomes / Swiss-Prot and 1,280 for the rest of UniProt. For the human proteome only, our download includes longer proteins segmented into fragments.
* It contains non-standard amino acids (e.g. X).
* It is not in the UniProt reference proteome “one sequence per gene” FASTA file.
* It has been added or modified by UniProt in a more recent release.
* It is a viral protein. These are currently excluded, pending improved support for polyproteins.

We plan to continue updating the database. In the meantime, if your sequence(s) aren’t included, you can generate your own AlphaFold predictions using DeepMind’s Colab notebook and open source code, which also support multimer predictions.

--------------------------------------------------------------------------------

WHAT IF I CAN’T FIND THE PROTEIN I’M INTERESTED IN?

If you can’t find the structure you’re looking for, here are some suggestions to improve your search results:

* Try searching by protein or gene name rather than specific UniProt accession.
* If you don't see the sequence you are looking for, try searching for it using the EBI Protein Similarity Search tool against the sequences in AlphaFold DB. If the query sequence is already in AlphaFold DB, this gives access to its structure prediction; if it is not, a structure prediction for a similar sequence may be available.
* For human proteins longer than 2,700 amino acids, check the whole proteome download. This contains longer proteins predicted as overlapping fragments. For example, Titin has predicted fragment structures named as Q8WZ42-F1 (residues 1–1400), Q8WZ42-F2 (residues 201–1600), etc.
* Check that the protein isn’t excluded by any of the criteria covered in the previous FAQ.

The AlphaFold source code and Colab notebook can be used to predict the structures of proteins not in AlphaFold DB.
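As a rough illustration of the length and amino-acid criteria from the previous FAQ, and of the fragment naming in the Titin example, the sketch below checks a sequence locally and computes fragment residue ranges. It is not part of AlphaFold DB: the function names are hypothetical, treating the length bounds as inclusive is an assumption, and the 1,400-residue window with a 200-residue step is only extrapolated from the two quoted fragments (F1 and F2). It covers just two of the five exclusion reasons; reference proteome membership, UniProt release and viral origin cannot be checked from the sequence alone.

```python
# Thresholds quoted from the FAQ; treating both bounds as inclusive is an assumption.
MIN_LENGTH = 16
MAX_LENGTH_PROTEOMES = 2700  # proteomes / Swiss-Prot
MAX_LENGTH_UNIPROT = 1280    # rest of UniProt
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def coverage_issues(sequence, in_proteome_or_swissprot=True):
    """Return the reasons (possibly none) a sequence falls outside the stated criteria."""
    issues = []
    max_length = MAX_LENGTH_PROTEOMES if in_proteome_or_swissprot else MAX_LENGTH_UNIPROT
    if len(sequence) < MIN_LENGTH:
        issues.append(f"shorter than {MIN_LENGTH} residues")
    elif len(sequence) > max_length:
        issues.append(f"longer than {max_length} residues")
    # Any letter outside the 20 standard amino acids (e.g. X) is reported.
    unusual = sorted(set(sequence.upper()) - STANDARD_AMINO_ACIDS)
    if unusual:
        issues.append("non-standard residues: " + ", ".join(unusual))
    return issues


def fragment_range(index, window=1400, stride=200):
    """1-based inclusive residue range of fragment F<index>.

    Hypothetical: the pattern is extrapolated from F1 (1-1400) and F2 (201-1600).
    """
    if index < 1:
        raise ValueError("fragment index starts at 1")
    start = (index - 1) * stride + 1
    return start, start + window - 1


def fragment_id(accession, index):
    """Fragment name in the style of the Titin example, e.g. Q8WZ42-F2."""
    return f"{accession}-F{index}"


print(coverage_issues("MKT"))                       # ['shorter than 16 residues']
print(fragment_id("Q8WZ42", 2), fragment_range(2))  # Q8WZ42-F2 (201, 1600)
```

An empty list from `coverage_issues` means only that the sequence passes these two checks, not that a prediction exists in the database.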
Both the source code and the Colab notebook have been updated to support predicting multimer structures.

If you experience any issues with search, please contact afdbhelp@ebi.ac.uk.

--------------------------------------------------------------------------------

HOW CAN I DOWNLOAD AND USE THE PREDICTED ALIGNED ERROR (PAE) FILE?

The PAE is displayed as an image for each of the structure predictions. If you need the raw data with PAE for all residue pairs, you can download the PAE as a JSON file using the button at the top of the structure page. This file is in a custom format that isn't supported by any existing software; you will have to use Python or another programming language to analyse or plot the information it contains.

For a protein of length num_res, the JSON file has the following structure of arrays format:

    [
      {
        "predicted_aligned_error": [[0, 1, 4, 7, 9, ...], ...],  # Shape: (num_res, num_res).
        "max_predicted_aligned_error": 31.75
      }
    ]

The fields in the JSON file are:

* predicted_aligned_error: The PAE value of the residue pair, rounded to the closest integer. For the PAE value at position (i, j), i is the residue on which the structure is aligned, j is the residue on which the error is predicted.
* max_predicted_aligned_error: A number that denotes the maximum possible value of PAE. The smallest possible value of PAE is 0.

We updated the PAE JSON file format on 28th July 2022 to reduce file size by 4x. Please ensure you read the 2D matrix of PAE values from the predicted_aligned_error field instead of the removed 1D "distance" field, and avoid using the old "residue1" and "residue2" fields. If you are using a script or third party tool to read the PAE JSON file programmatically and you are seeing errors (e.g. missing field "distance"), check with the author of the program whether the latest PAE JSON format is supported.

--------------------------------------------------------------------------------

WHO SHOULD I CONTACT WITH ENQUIRIES?

* For questions and feedback about the AlphaFold DB website, please contact afdbhelp@ebi.ac.uk.
* For sharing feedback on structure predictions, please use the feedback buttons on each structure page.
* For other questions about AlphaFold not directly related to the database, please contact the AlphaFold team at alphafold@deepmind.com. Please do not share anything confidential with DeepMind.
* For press enquiries, please contact press@deepmind.com or comms@ebi.ac.uk.

--------------------------------------------------------------------------------

EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK. +44 (0)1223 49 44 44
Copyright © EMBL 2022 | EMBL-EBI is part of the European Molecular Biology Laboratory
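For the PAE JSON format described in the FAQ above, a minimal reading sketch in Python might look like the following. It uses only the standard library and the two field names given in the FAQ. The helper names `load_pae` and `mean_pae`, the mapping of matrix index 0 to residue 1, and the tiny 3x3 matrix are illustrative assumptions, not real AlphaFold DB data or tooling.

```python
import json


def load_pae(text):
    """Parse the PAE JSON text; return (matrix, max_predicted_aligned_error)."""
    entry = json.loads(text)[0]  # the top level is a one-element list
    if "predicted_aligned_error" not in entry:
        # Files from before the July 2022 change carry residue1/residue2/distance instead.
        raise ValueError("no predicted_aligned_error field; possibly the old format")
    matrix = entry["predicted_aligned_error"]
    if any(len(row) != len(matrix) for row in matrix):
        raise ValueError("PAE matrix is not square")
    return matrix, entry["max_predicted_aligned_error"]


def mean_pae(matrix, aligned_residues, scored_residues):
    """Mean PAE over pairs (i, j): i is the aligned residue, j the scored residue.

    Residue numbers are 1-based here; index 0 == residue 1 is an assumption.
    """
    values = [matrix[i - 1][j - 1] for i in aligned_residues for j in scored_residues]
    return sum(values) / len(values)


# Made-up toy matrix for a 3-residue "protein", for illustration only.
EXAMPLE = """
[{"predicted_aligned_error": [[0, 2, 9], [2, 0, 8], [12, 7, 0]],
  "max_predicted_aligned_error": 31.75}]
"""

pae, max_pae = load_pae(EXAMPLE)
print(max_pae)                        # 31.75
print(mean_pae(pae, [1, 2], [3]))     # 8.5
print(mean_pae(pae, [3], [1, 2]))     # 9.5
```

The last two calls give different values because the matrix need not be symmetric: the value at (i, j) depends on which residue the structure is aligned on, so the order of the two residue sets matters.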