Sequence file (in FASTA format)
As part of its algorithm, AbLIFT runs BLAST (Basic Local Alignment Search Tool) to collect homologous sequences.
When collecting sequences, the algorithm must use the full protein sequence rather than a truncated sequence that reflects missing density in the structure.
Any mismatch between the sequence in the sequence file and the sequence in the PDB file that is not due to lack of density may cause the run to fail or produce ambiguous results.
In summary: the only purpose of the sequence file is to represent regions that are unseen in the PDB file due to lack of density.
When uploading your own structure, you must also provide a matching sequence file in FASTA format that obeys the following rules:
- The file should include a single sequence.
- It should be a plain text file (not a Word or rich text (RTF) file). We recommend using TextWrangler or a similar editor to create and edit the file.
- The sequence title should start with a > sign followed by some name. For example: >Cool_protein_chain_A
- In the amino-acid sequence (of the designed chain):
  - Make sure that there are no point mutations between the uploaded PDB file and the uploaded FASTA file.
  - Include missing loops. Many PDB files lack density for some residues (usually at the N/C termini or in loops that were not resolved). Make sure you include the missing residues in the sequence file; including residues missing from the middle of the protein structure is essential.
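The formatting rules above can be checked before uploading. The following is a minimal sketch, not part of AbLIFT: the function name `check_fasta_text` is hypothetical, and restricting the sequence to the 20 standard amino-acid letters is an assumption that the rules above do not state. It does not compare the sequence against the PDB file.

```python
# Assumption: only the 20 standard amino-acid letters are accepted.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def check_fasta_text(text):
    """Return a list of problems found in the FASTA text (empty if none)."""
    problems = []
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    # The title line must come first: '>' followed by a name.
    if not lines or not lines[0].startswith(">") or len(lines[0]) < 2:
        problems.append("first line must be '>' followed by a name")

    # Exactly one title line means exactly one sequence.
    titles = [line for line in lines if line.startswith(">")]
    if len(titles) != 1:
        problems.append(f"expected 1 title line, found {len(titles)}")

    # Everything that is not a title line is treated as sequence.
    sequence = "".join(line for line in lines if not line.startswith(">"))
    if not sequence:
        problems.append("no sequence found")
    unexpected = sorted(set(sequence.upper()) - STANDARD_RESIDUES)
    if unexpected:
        problems.append(f"unexpected characters in sequence: {unexpected}")
    return problems
```

A file saved as Word or RTF would typically fail the first check, because its first line does not start with a > sign.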
After preparing the PDB and FASTA files, we strongly suggest the following checks (the example uses PDB entry 1AHW, chain B):
- Extract the sequence from the PDB file (command line example for PyMOL: save 1AHW_B.fasta, 1AHW and chain B).
- Align the sequence from the FASTA file with the PDB-extracted sequence to make sure that the FASTA sequence is equal to or longer than the PDB sequence, i.e. that gaps in the alignment (if any) appear only in the PDB-extracted sequence and that there are no point mutations (for an easy alignment, use BLAST2SEQ).
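As a rough local complement to the alignment check, the comparison can be sketched with Python's standard `difflib` module. This is an illustration under stated assumptions: `difflib` is a generic text-diff tool and not a biological sequence aligner, the function name `compare_fasta_to_pdb` is hypothetical, and both sequences are assumed to be plain one-letter strings with the title lines already removed. BLAST2SEQ remains the suggested tool.

```python
from difflib import SequenceMatcher


def compare_fasta_to_pdb(fasta_seq, pdb_seq):
    """Return a list of problems; an empty list means the PDB sequence
    differs from the FASTA sequence only by missing residues."""
    matcher = SequenceMatcher(None, fasta_seq, pdb_seq, autojunk=False)
    problems = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            # Residues differ between the two files: possible point mutation.
            problems.append(
                f"mismatch: FASTA {i1 + 1}-{i2} '{fasta_seq[i1:i2]}' "
                f"vs PDB '{pdb_seq[j1:j2]}'"
            )
        elif tag == "insert":
            # The PDB sequence has residues that the FASTA file lacks.
            problems.append(
                f"PDB residues '{pdb_seq[j1:j2]}' are absent from the FASTA "
                f"sequence (after FASTA position {i1})"
            )
        # 'equal' blocks match; 'delete' blocks are residues present only
        # in the FASTA file, which is the allowed case (missing density).
    return problems
```

An empty result suggests that gaps fall only in the PDB-extracted sequence; any reported mismatch should be confirmed with a proper alignment.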
**If you have problems uploading your sequence file and cannot find an obvious reason based on the rules above, try opening the file in a different text editor to see whether it contains characters that were invisible in your original editor, and delete any such characters. If you are using Windows, ask a friend who uses a Mac to check your file in a Mac plain text editor.
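Invisible characters can also be located programmatically. The sketch below is an assumption-laden illustration: the text above does not say which characters cause upload problems, so the hypothetical helper `find_hidden_characters` simply reports every byte that is neither printable ASCII nor a line feed (this includes byte-order marks, carriage returns from Windows line endings, and tabs).

```python
def find_hidden_characters(data):
    """Return (offset, byte value) pairs for every byte in `data` that is
    not printable ASCII (0x20-0x7E) and not a line feed (0x0A)."""
    return [
        (offset, byte)
        for offset, byte in enumerate(data)
        if byte != 0x0A and not 0x20 <= byte <= 0x7E
    ]


def report_hidden_characters(path):
    """Print the offset and hex value of each suspicious byte in a file."""
    with open(path, "rb") as handle:
        for offset, byte in find_hidden_characters(handle.read()):
            print(f"offset {offset}: byte 0x{byte:02X}")
```

Calling `report_hidden_characters` with the path to the sequence file prints nothing when the file contains only printable ASCII and line feeds.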
Example of a valid sequence file: