golgi.sandbox.google.com
2a00:1450:400c:c0b::451
Public Scan
Submitted URL: http://alphafoldserver.com/
Effective URL: https://golgi.sandbox.google.com/welcome
Submission: On May 10 via api from US — Scanned from DE
Form analysis
0 forms found in the DOM
Text Content
ALPHAFOLD SERVER BETA

ALPHAFOLD SERVER
POWERED BY ALPHAFOLD 3

AlphaFold 3 model is a Google DeepMind and Isomorphic Labs collaboration

HOW DOES ALPHAFOLD SERVER WORK?

AlphaFold Server is a web-service that can generate highly accurate biomolecular structure predictions containing proteins, DNA, RNA, ligands, ions, and also model chemical modifications for proteins and nucleic acids in one platform. It’s powered by the newest AlphaFold 3 model.

TAKE A LOOK AT SOME EXAMPLES

* Protein-RNA-Ion: PDB 8AW3
* Protein-Glycan-Ion: PDB 7BBV
* Protein-DNA-Ion: PDB 7RCE

TERMS OF USE AND ATTRIBUTION

AlphaFold Server is for non-commercial use only, subject to AlphaFold Server Terms of Service. AlphaFold Server output cannot be used in docking or screening tools or to train machine learning models or related technology for biomolecular structure prediction similar to AlphaFold Server.

If you use an AlphaFold Server prediction, please cite our paper: Abramson, J et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature (2024).

FREQUENTLY ASKED QUESTIONS

WHAT BIOLOGICAL MOLECULE TYPES CAN BE MODELED WITH ALPHAFOLD SERVER?
You can model a structure consisting of one or more of the following biological molecule types:

* Proteins
* DNA
* RNA
* Biologically common ligands: ATP, ADP, AMP, GTP, GDP, FAD, NADP, NADPH, NDP, heme, heme C, myristic acid, oleic acid, palmitic acid, citric acid, chlorophylls A and B, bacteriochlorophylls A and B
* Biologically common ions: Ca2+, Co2+, Cu2+, Fe3+, K+, Mg2+, Mn2+, Na+, Zn2+, Cl-
* Biologically common post-translational modifications (PTMs) of amino acid residues:
  * Phosphorylation of serine, threonine, tyrosine and histidine residues
  * Acetylation of lysine residues
  * Methylation of lysine and arginine residues
  * Malonylation of cysteine residues
  * Hydroxylation of proline, lysine and asparagine residues
  * Palmitoylation of cysteine residues
  * Succinylation of asparagine residues
  * S-nitrosylation of cysteine residues
  * Formylation of tryptophan residues
  * Crotonylation of lysine residues
  * Citrullination of lysine and arginine residues
* Glycan chains (including branched chains) composed of certain sugars: alpha/beta-D-glucose, alpha/beta-D-mannose, alpha-L-fucose, beta-D-galactose, N-acetyl-beta-D-glucosamine
* Biologically common chemical modifications of the nucleic acids:
  * DNA
    * Methylation of cytosine, guanine, and adenine
    * Carboxylation of cytosine
    * Oxidation of guanine
    * Formylation of cytosine
  * RNA
    * Methylation of cytosine, guanine, adenine, and uracil
    * Isomerisation of uridine into pseudouridine
    * Formylation of cytosine

The modeled structure can be composed of multiple proteins, nucleic acids, ligands, and ions. Each protein and nucleic acid chain can have any number of chemical modifications, subject to the token limit.

WHAT IS THE MAXIMUM JOB SIZE ALLOWED?

The total size of the job is limited by the number of 'tokens' in the structure; the limit is 5,000 tokens.
Tokens are counted in the following way:

* Proteins: 1 token per standard amino acid residue
* DNA, RNA: 1 token per input nucleotide base
* Ligands: 1 token per atom in the ligand
* Ions: 1 token per ion
* Modifications (excluding glycans): 1 token per atom for all atoms of the modified amino acid residue or nucleotide
* Glycan PTMs: 1 token per atom in the glycan (in addition to the 1 token for the residue the glycan is attached to)

Note that each protein chain and nucleotide chain must contain at least 4 amino acids or nucleotides, respectively.

HOW MANY JOBS CAN I RUN ON ALPHAFOLD SERVER?

10 jobs per day. If you don’t have enough quota you can save your job and submit it when your quota refreshes. We plan to explore other approaches for quota allocation in the future, including weekly or monthly allocations.

HOW SHOULD I DEFINE INPUTS FOR ALPHAFOLD SERVER?

* For a protein, enter the single-letter amino acid sequence or paste in the contents of a FASTA file, including any comment line(s). Use only standard single-letter codes (nonstandard ones like B, J, O, U and X are unsupported).
* For DNA, enter the single-letter nucleotide sequence in standard notation (5’-3’). Use only standard single-letter codes (A, C, T, and G). For double-stranded DNA, please select the “+ Reverse complement” option from the more_vert (⋮) menu of your DNA entry to add the complementary strand.
* For RNA, similarly enter the single-letter nucleotide sequence in standard notation (5’-3’). Use only standard single-letter codes for RNA nucleotides (A, C, U, and G).
* For ligands, ions, and post-translational modifications, select the desired entity from the list of supported types. The three-letter codes displayed in the UI come from the Protein Data Bank’s Chemical Component Dictionary.
* If multiple copies of an entity are present (for example, a homomultimeric protein), indicate this by setting the number of copies in the corresponding field.
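The token-counting rules and input conventions above can be sketched in a few lines of Python. This is only an illustrative pre-submission check, not the Server's own validation logic: the function names are hypothetical, the protein alphabet is assumed to be the 20 standard amino acids, copies of a chain are assumed to count toward the limit once per copy, and modified residues and glycans (which are counted per atom) are left out for brevity.

```python
TOKEN_LIMIT = 5000     # job-size limit stated in the FAQ
MIN_CHAIN_LENGTH = 4   # minimum residues or nucleotides per chain

# Allowed single-letter codes. The protein set is an assumption: the 20
# standard amino acids, consistent with B, J, O, U and X being unsupported.
ALPHABETS = {
    "protein": set("ACDEFGHIKLMNPQRSTVWY"),
    "dna": set("ACGT"),
    "rna": set("ACGU"),
}


def clean_sequence(text):
    """Drop FASTA comment lines (starting with '>') and all whitespace."""
    lines = [ln.strip() for ln in text.splitlines()]
    kept = [ln for ln in lines if not ln.startswith(">")]
    return "".join("".join(kept).split()).upper()


def validate_chain(kind, text):
    """Return the cleaned sequence, or raise ValueError if it breaks a rule."""
    seq = clean_sequence(text)
    bad = sorted(set(seq) - ALPHABETS[kind])
    if bad:
        raise ValueError(f"unsupported {kind} codes: {', '.join(bad)}")
    if len(seq) < MIN_CHAIN_LENGTH:
        raise ValueError(f"{kind} chain shorter than {MIN_CHAIN_LENGTH}")
    return seq


def count_tokens(chains, ligand_atoms=(), n_ions=0):
    """Count tokens for unmodified chains, ligands and ions.

    chains: iterable of (kind, sequence_text, copies)
    ligand_atoms: atom count of each ligand copy (1 token per atom)
    n_ions: number of ions (1 token each)
    """
    total = 0
    for kind, text, copies in chains:
        # 1 token per residue or nucleotide, repeated for every copy
        # (assumption: each copy counts toward the limit).
        total += len(validate_chain(kind, text)) * copies
    total += sum(ligand_atoms)
    total += n_ions
    return total
```

A job would then be within the limit when `count_tokens(...) <= TOKEN_LIMIT`.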
ARE THERE ANY RESTRICTIONS ON THE PROTEIN SEQUENCES THAT ARE ALLOWED?

* Yes, we are currently restricting sequences from a small number of viral pathogens. If you run a job that encounters the filter and have any questions, please get in touch with the AlphaFold team via feedback.
* Based on our detailed external consultations we consider the release of AlphaFold 3 capabilities through the Server to be low risk, and we are using AlphaFold Server as a test-bed to explore how to develop robust filters for future biological AI models.
* The current restricted sequences are not meant to cover a comprehensive set of all possible pathogens; instead, they are a sample used to develop a filter. We plan to evolve this, including possible changes to what we restrict in the Server, through our active engagement with experts and the community.

We share more details on Google DeepMind’s commitment to biosafety and responsible deployment of AlphaFold in our blog.

HOW CAN I INTERPRET CONFIDENCE METRICS TO CHECK THE ACCURACY OF STRUCTURES?

Similar to AlphaFold2 and AlphaFold-Multimer, outputs include confidence metrics. The main metrics are:

* pLDDT: a per-atom confidence estimate on a 0-100 scale where a higher value indicates higher confidence. pLDDT aims to predict a modified LDDT score that only considers distances to polymers. For proteins this is similar to the lDDT-Cα metric but with more granularity, as it can vary per atom, not just per residue. For ligand atoms the modified LDDT considers the errors only between the ligand atom and polymers, not other ligand atoms, and for DNA/RNA a wider radius of 30 Å is used for the modified LDDT instead of 15 Å. The pLDDT is shown as color outputs in the image of the structure, using the same value-to-color mapping as in AFDB.
* PAE (predicted aligned error): estimate of the error in the relative position and orientation between two tokens in the predicted structure.
  Higher values indicate higher predicted error and therefore lower confidence. For proteins and nucleic acids, the PAE score is essentially the same as in AlphaFold2, where the error is measured relative to frames constructed from the protein backbone. For small molecules and post-translational modifications, a frame is constructed for each atom from its closest neighbors from a reference conformer.
* pTM and ipTM scores: the predicted template modeling (pTM) score and the interface predicted template modeling (ipTM) score are both derived from a measure called the template modeling (TM) score. This measures the accuracy of the entire structure (Zhang and Skolnick, 2004; Xu and Zhang, 2010). A pTM score above 0.5 means the overall predicted fold for the complex might be similar to the true structure. ipTM measures the accuracy of the predicted relative positions of the subunits within the complex. Values higher than 0.8 represent confident high-quality predictions, while values below 0.6 suggest a likely failed prediction. ipTM values between 0.6 and 0.8 are a gray zone where predictions could be correct or incorrect. TM score is very strict for small structures or short chains, so pTM assigns values less than 0.05 when fewer than 20 tokens are involved; for these cases PAE or pLDDT may be more indicative of prediction quality.

For a detailed description of these confidence metrics see our paper. For protein components, the AlphaFold: A Practical guide course for structures provides additional tutorials on the confidence metrics.

If you are interested in a specific entity or interaction, then there are confidences available in the downloadable outputs that are specific to each chain or chain-pair, as opposed to the full complex. See the JSON section for more details on all the confidence metrics that are returned.

HOW MANY PREDICTIONS ARE RETURNED WHEN I RUN A JOB?

The model samples five predictions per seed.
The top-ranked prediction is displayed on the result page, and all samples along with their associated confidences are available to download as a zip file via the Download button.

For ranking of the full complex use the ranking_score (higher is better). This score uses overall structure confidences (pTM and ipTM), but also includes terms that penalize clashes and encourage disordered regions not to have spurious helices. These extra terms mean the score should only be used to rank structures.

If you are interested in a specific entity or interaction, you may want to rank by a metric specific to that chain or chain-pair, as opposed to the full complex. In that case, use the per-chain or per-chain-pair confidence metrics described in the JSON section for ranking.

HOW DO I INTERPRET ALL THE OUTPUTS IN THE DOWNLOADED JSON FILES?

For each predicted sample we provide two JSON files. One contains summary metrics (summaries for either the whole structure, per chain or per chain-pair) and the other contains full 1D or 2D arrays.

Summary outputs:

* ptm: A scalar in the range 0-1 indicating the predicted TM-score for the full structure.
* iptm: A scalar in the range 0-1 indicating predicted interface TM-score (confidence in the predicted interfaces) for all interfaces in the structure.
* fraction_disordered: A scalar in the range 0-1 that indicates what fraction of the predicted structure is disordered, as measured by accessible surface area; see our paper for details.
* has_clash: A boolean indicating if the structure has a significant number of clashing atoms (more than 50% of a chain, or a chain with more than 100 clashing atoms).
* ranking_score: A scalar in the range [-100, 1.5] that can be used for ranking predictions. It incorporates ptm, iptm, fraction_disordered and has_clash into a single number with the following equation:
  0.8 × ipTM + 0.2 × pTM + 0.5 × disorder − 100 × has_clash
* chain_pair_pae_min: A [num_chains, num_chains] array.
  Element (i, j) of the array contains the lowest PAE value across rows restricted to chain i and columns restricted to chain j. This has been found to correlate with whether two chains interact or not, and in some cases can be used to distinguish binders from non-binders.
* chain_pair_iptm: A [num_chains, num_chains] array. Off-diagonal element (i, j) of the array contains the ipTM restricted to tokens from chains i and j. Diagonal element (i, i) contains the pTM restricted to chain i. Can be used for ranking a specific interface between two chains, when you know that they interact, e.g. for antibody-antigen interactions.
* chain_ptm: A [num_chains] array. Element i contains the pTM restricted to chain i. Can be used for ranking individual chains when the structure of that chain is most of interest, rather than the cross-chain interactions it is involved with.
* chain_iptm: A [num_chains] array that gives the average confidence (interface pTM) in the interface between each chain and all other chains. Can be used for ranking a specific chain, when you care about where the chain binds to the rest of the complex and you do not know which other chains you expect it to interact with. This is often the case with ligands.

Full array outputs:

* full_pae: A [num_tokens, num_tokens] array. Element (i, j) indicates the predicted error in the position of token j, when the prediction is aligned to the ground truth using the frame of token i.
* atom_plddts: A [num_atoms] array. Element i indicates the predicted local distance difference test (pLDDT) for atom i in the prediction.
* contact_probs: A [num_tokens, num_tokens] array. Element (i, j) indicates the predicted probability that token i and token j are in contact (8Å between the representative atom for each token); see our paper for details.
* token_chain_ids: A [num_tokens] array indicating the chain ids corresponding to each token in the prediction.
* atom_chain_ids: A [num_atoms] array indicating the chain ids corresponding to each atom in the prediction.

WHAT SHOULD I DO IF I HAVE UNKNOWN RESIDUES OR NUCLEOTIDES IN MY PROTEIN, DNA OR RNA SEQUENCE?

AlphaFold Server was not designed to model unknown residues or nucleotides (e.g. X for unknown residues and N for unknown nucleotides). Please substitute one of the standard residues or nucleotides that is appropriate for your particular case. In general, consider the following substitutions:

* Proteins: replace unknown protein residues with alanine (A)
* DNA: replace unknown nucleotides with poly-T (T), but other nucleotides are also suitable
* RNA: replace unknown nucleotides with poly-U (U), but other nucleotides are also suitable

WHAT ARE SEEDS AND HOW ARE THEY SET?

The model uses a 'seed' for internal random number generation. Normally this seed is sampled automatically, and will be resampled when cloning a job. Running multiple different seeds of the model and ranking over all the predictions can lead to improved accuracy. The seed used is saved into the output information per run.

To set a specific seed, turn off auto seed selection in the preview screen (after clicking the 'Continue and Preview job' button). The seed can be any integer between 0 and 4,294,967,295. When cloning a job where the seed was set, the seed will return to being automatically chosen by default.

CAN I IMPORT JOB FILES INTO ALPHAFOLD SERVER?

Yes, we support efficiently importing multiple draft jobs by uploading JSON files with up to 100 jobs per file. Please note that you have a storage capacity of up to 500 saved drafts in your history, so be mindful to manage your uploads to stay within the limit.

To create a JSON file, please refer to this example for the JSON file syntax. Inside each .zip file with modeling results, you'll find a JSON file named 'job_name_job_request.json' containing the job inputs.
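The limits mentioned above (seeds between 0 and 4,294,967,295, and at most 100 jobs per uploaded file) can be checked before uploading. The sketch below is a hypothetical helper, not part of the Server: it treats each job as an opaque dictionary and assumes an upload file is a top-level JSON array of job objects, which should be confirmed against the linked example of the JSON syntax.

```python
import json

MAX_JOBS_PER_FILE = 100        # "up to 100 jobs per file"
MAX_SAVED_DRAFTS = 500         # draft storage capacity in the job history
MAX_SEED = 4_294_967_295       # largest allowed seed, i.e. 2**32 - 1


def check_seed(seed):
    """Return the seed if it is an integer in the allowed range."""
    is_int = isinstance(seed, int) and not isinstance(seed, bool)
    if not is_int or not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be an integer in [0, {MAX_SEED}]")
    return seed


def split_into_upload_files(jobs):
    """Serialise jobs into JSON strings holding at most 100 jobs each.

    jobs: list of job dictionaries (contents are not inspected here).
    Assumption: each upload file is a JSON array of job objects.
    """
    if len(jobs) > MAX_SAVED_DRAFTS:
        raise ValueError("more jobs than the saved-draft capacity allows")
    return [
        json.dumps(jobs[start:start + MAX_JOBS_PER_FILE], indent=2)
        for start in range(0, len(jobs), MAX_JOBS_PER_FILE)
    ]
```

For instance, 250 placeholder jobs would be split into three strings holding 100, 100 and 50 jobs, each of which could be written to its own file.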
These files offer a convenient starting point for generating new jobs, as they are easily editable in standard text editors or in a programming system like Google Colab notebooks.

Once your file is prepared, click the 'Upload JSON' button to upload your JSON files. Imported jobs will appear as saved drafts in your job history, and you can click on the more_vert (⋮) menu of your job to edit or run them.

HOW CAN I RUN A MODELING JOB AGAIN?

Select the “Clone and reuse” option in the more_vert (⋮) menu of your job history. This option also allows further modification of the job before running.

Alternatively, upload the JSON file “job_name_job_request.json” that is part of the .zip file containing modeling results. Press the “Upload JSON” button and specify the corresponding JSON file; the imported job will appear as a saved draft in your job history. The JSON files can be shared with other users who want to reproduce your job on the Server.

Note that exact reproducibility is not guaranteed over time, due to changes in underlying compiler optimisations.

WHAT IS DIFFERENT ABOUT THE NEW ALPHAFOLD 3 MODEL COMPARED TO ALPHAFOLD2?

AlphaFold 3 can predict many biomolecules in addition to proteins. AlphaFold 2 predicts structures of proteins and protein-protein complexes. AlphaFold 3 can generate predictions containing proteins, DNA, RNA, ions, ligands, and chemical modifications. The new model also improves the protein complex modelling accuracy. Please refer to our paper for more information on performance improvements.

AlphaFold 2 generally produces looping “ribbon-like” predictions for disordered regions. AlphaFold 3 also does this, but will occasionally output segments with secondary structure within disordered regions instead, mostly spurious alpha helices with very low confidence (pLDDT) and inconsistent position across predictions.

WHAT ARE SOME LIMITATIONS OF THE ALPHAFOLD 3 MODEL?
The accuracy of the model varies across biomolecules and interface types; model confidence outputs are correlated with prediction accuracy, and the strength of the correlation varies per molecule type.

In some cases, optimal model performance can only be achieved by running multiple seeds and taking the top-ranked sample; this is particularly the case for antibody-antigen interactions. See our paper for more details on the model and its limitations.

The model occasionally produces overlapping atoms in the predictions, and in some homomers entire chains have been observed to overlap. Clashes mostly occur for protein-nucleic acid complexes with both greater than 100 nucleotides and greater than 2,000 residues in total.

The model can produce spurious structural order in disordered regions. These regions are typically marked as very low confidence, but they can lack the distinctive ribbon-like appearance that AlphaFold 2 produces in disordered regions.

The presence of disordered regions affects nearby pLDDT; removing disordered tails can give a clearer picture of confidence in ordered regions.

Model outputs do not always have the correct chirality, but this will vary across predictions, making it possible to select predictions with correct chirality in most cases.

WHAT MOLECULE TYPES ARE NOT SUPPORTED VIA ALPHAFOLD SERVER?

AlphaFold Server does not support ligands, ions and modifications that are not in the molecule list section above. Additionally, AlphaFold Server is not capable of predicting water molecules or hydrogen atoms, and is not aware of membrane planes for membrane proteins.

WHAT TERMS OF USE APPLY TO ALPHAFOLD SERVER PREDICTIONS?

AlphaFold Server predictions are provided for non-commercial use only, under and subject to AlphaFold Server Output Terms of Use.
* You cannot use AlphaFold Server outputs in docking or screening tools or to train machine learning models or related technology for biomolecular structure prediction.
* You can publish, share and adapt AlphaFold Server output in accordance with AlphaFold Server Terms of Service, including the requirement to provide clear notice that ongoing use is subject to AlphaFold Server Output Terms of Use and of any modifications you make.

HOW SHOULD I CITE ALPHAFOLD SERVER?

Please reference our paper: Abramson, J et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature (2024).

WHO SHOULD I CONTACT WITH ENQUIRIES AND FEEDBACK?

Please get in touch with the AlphaFold Server team via the feedback button and we’ll be happy to assist you with questions. Reporting an issue from the result page automatically includes the associated job ID. We're working hard to answer all inquiries but there may be a short delay in our response due to the high volume we are receiving.

WHAT IS THE DIFFERENCE BETWEEN ALPHAFOLD SERVER AND THE ALPHAFOLD DATABASE?

AlphaFold Database is a large collection of precomputed protein predictions, generated with the AlphaFold2 model. It covers a significant proportion of the proteins in UniProt, and entries can be quickly downloaded, including in bulk. However, the predictions are always single chains (even if the protein forms multimers in nature) and contain only the protein part (no ligands or co-factors). AlphaFold Database is free for commercial and non-commercial use, and requires no registration to access the structures.

AlphaFold Server is a web-service that offers customized biomolecular structure prediction. It makes several newer AlphaFold 3 capabilities available, including support for a wider range of molecule types (DNA, RNA, ions, ligands, chemical modifications).
The service is free for non-commercial use and requires a simple sign-up that involves accepting non-commercial use terms.

HOW CAN I INCREASE THE DIVERSITY OF MY PREDICTIONS?

Run again with different seeds (a seed will be chosen automatically if not set). Users of AlphaFold2 have had success in generating diverse predictions by customizing MSA and/or template inputs to the model. This is not currently possible in the Server, but we hope to provide the ability to do similar customisations soon.

CAN I MODEL GLYCOSYLATED PROTEINS?

To describe the glycan chains, we are using 3-letter CCD codes (Chemical Components in the PDB) of the corresponding glycans. Please note that stereoisomers are described by different CCD codes, e.g. mannose (C6H12O6) could be described as MAN for alpha-D-mannose and BMA for beta-D-mannose.

* The Server supports the following glycan residues to be attached to a protein residue:
  * N (Asparagine): BGC, BMA, GLC, MAN, NAG
  * T (Threonine): BGC, BMA, FUC, GLC, MAN, NAG
  * S (Serine): BGC, BMA, FUC, GLC, MAN, NAG
* Branched glycans can be constructed in the form of a tree with either one or two downstream connections per glycan, attached to a protein residue. Up to 8 glycan residues in total are supported. Here are some examples that demonstrate how to input branching glycans:
  * NAG: NAG is a single glycan residue.
  * NAG(BMA): NAG has a single child which is BMA.
  * NAG(BMA(BGC)): NAG has 1 child which is BMA; BMA has one child which is BGC.
  * NAG(FUC)(NAG): NAG has 2 children which are FUC and NAG.
  * NAG(NAG(MAN(MAN(MAN)))): linear glycan chain.
  * NAG(NAG(MAN(MAN(MAN)(MAN(MAN)(MAN))))): branched glycan chain.

Glycan-glycan connections should also be chemically valid. For example, GLC(NAG)(MAN) is not a valid branched glycan because NAG and MAN can’t form glycosidic bonds to GLC.
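The bracket notation above describes a tree, so it can be read with a small recursive parser. The sketch below is illustrative only and is not the Server's parser: `Glycan` and `parse_glycan` are hypothetical names, and the code checks only what the text states (the six listed CCD codes, at most two children per glycan, at most 8 residues in total). It does not check which protein residue the root attaches to, nor whether each glycosidic bond is chemically valid.

```python
from dataclasses import dataclass, field

SUPPORTED_CODES = {"BGC", "BMA", "FUC", "GLC", "MAN", "NAG"}
MAX_RESIDUES = 8    # "Up to 8 glycan residues in total"
MAX_CHILDREN = 2    # one or two downstream connections per glycan


@dataclass
class Glycan:
    code: str
    children: list = field(default_factory=list)


def parse_glycan(text):
    """Parse e.g. 'NAG(FUC)(NAG)' into (root Glycan, residue count)."""
    pos = 0
    count = 0

    def node():
        nonlocal pos, count
        code = text[pos:pos + 3]          # all listed codes are 3 letters
        if code not in SUPPORTED_CODES:
            raise ValueError(f"unsupported code {code!r} at position {pos}")
        pos += 3
        count += 1
        current = Glycan(code)
        # Each parenthesised group directly after a code is one child subtree.
        while pos < len(text) and text[pos] == "(":
            pos += 1
            current.children.append(node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1
        if len(current.children) > MAX_CHILDREN:
            raise ValueError(f"{code} has more than {MAX_CHILDREN} children")
        return current

    root = node()
    if pos != len(text):
        raise ValueError(f"unexpected text at position {pos}")
    if count > MAX_RESIDUES:
        raise ValueError(f"more than {MAX_RESIDUES} glycan residues")
    return root, count
```

Parsing `NAG(FUC)(NAG)` gives a root `NAG` with two children, `FUC` and `NAG`, matching the description in the examples.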
The Server assumes that glycosidic bonds are formed between atoms at positions that have the highest frequency of occurrence in similar bonds from the PDB; this might lead to different bond positions in the modeled structure than expected. Specifying exact atoms for the glycosidic bond is not currently supported.

RELATED POSTS

DOMAIN-SPECIFIC TECHNOLOGY
ALPHAFOLD 3 PREDICTS THE STRUCTURE AND INTERACTIONS OF ALL OF LIFE’S MOLECULES
Introducing AlphaFold 3, an AI model developed by Google DeepMind and Isomorphic Labs. By accurately predicting the structure of proteins, DNA, RNA, ligands and more, and how they interact, we expect it to transform our understanding of the biological world and drug discovery.

TECHNOLOGY
ACCELERATING RESEARCH IN NEARLY EVERY FIELD OF BIOLOGY
By solving a decades-old scientific challenge, our AI system is helping to solve crucial problems like treatments for disease or breaking down single-use plastics. One day, it might even help unlock the mysteries of how life itself works.