How to Convert a TXT File to FASTA

Researchers analyze protein sequence data in clinical studies to help find treatments for illnesses. Protein sequence data is put in the FASTA (fast-all) format so that software programs understand how to process it. A FASTA record consists of one description line that begins with the greater than symbol (>), followed by one or more lines of sequence data. Sequence data lines are conventionally kept to 80 characters or fewer and use the IUB/IUPAC (International Union of Biochemistry/International Union of Pure and Applied Chemistry) one-letter codes. Converting a TXT (plain text) file to FASTA format involves editing an existing text file of protein sequence data so that it has a valid description line and correctly formatted sequence data lines. Text editor programs like Notepad make this simple to do.
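As a rough illustration of that layout, the Python sketch below checks whether a block of text looks like a single FASTA record. The function name `looks_like_fasta` and the `max_width` parameter are assumptions made for this example, and the check is deliberately loose: it looks only at the description line and the line lengths, not at the letters themselves.

```python
def looks_like_fasta(text, max_width=80):
    """Return True if text looks like one FASTA record.

    Assumed rules for this sketch: the first non-blank line starts
    with '>', at least one sequence line follows, and no sequence
    line is longer than max_width characters.
    """
    # Ignore blank lines so trailing newlines do not matter.
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 2 or not lines[0].startswith(">"):
        return False
    # Every remaining line must be sequence data of acceptable length.
    return all(
        not line.startswith(">") and len(line) <= max_width
        for line in lines[1:]
    )
```

A file that contains only raw sequence letters fails this check because it has no description line, which is the gap the steps below fill in.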

Instructions

    • 1

      Open the protein sequence text file you want to edit in a text editing program such as Notepad.

    • 2

      Edit or add the description line so that it follows the FASTA format. For example, >gi|129295|sp|P01013|OVAX_CHICK GENE X PROTEIN (OVALBUMIN-RELATED) is a valid FASTA description line. This line identifies and describes the sequence data lines that follow it. The FASTA format requires the greater than symbol (>) as the first character of the line so that the software program can recognize the descriptive information and avoid processing it as protein sequence data. Keep the whole description on a single line.

    • 3

      Press the "Enter" key to insert a line break once the description line is edited.

    • 4

      Edit or add the protein sequence data lines so that they conform to the IUB/IUPAC standard codes. The IUB/IUPAC standard assigns a single letter of the alphabet to each amino acid or nucleotide that can appear in a FASTA sequence. For example, QIKDLLVSSSTDLDTTLVLVNAIYFKGMWKTAFNAEDTREMPFHVTKQESKPVQMMCMNNSFNVATLPAE
      is one line of valid sequence data because every character in it is a standard amino acid code. It starts with the letter "Q," representing glutamine, and ends with the letter "E," representing glutamate.

    • 5

      Add more sequence data lines, edit existing sequence data lines or add line breaks so that no line is longer than 80 characters. Adhering to the FASTA line length and letter code standards helps ensure that the program reads the sequence correctly. Each letter in the IUB/IUPAC standard is a code that identifies one residue, such as glutamine or glutamate, for the software program that processes FASTA-formatted data.

    • 6

      Click "File" and select "Save." If a dialog box appears, click the "Save" button. The file is still plain text, and its contents are now in FASTA format.
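For files that are too long to edit comfortably by hand, the same steps can be sketched in Python. This is a minimal example, not a standard tool: the function name `to_fasta`, the `width` parameter and the `ALLOWED` character set are all assumptions made for illustration. The set is deliberately permissive (any letter, plus `*` and `-`), so tighten it if the target program accepts fewer codes.

```python
# Characters accepted as sequence data in this sketch (an assumption:
# any letter A-Z, plus '*' for a stop and '-' for a gap).
ALLOWED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ*-")


def to_fasta(description, raw_sequence, width=80):
    """Build one FASTA record from a description and raw sequence text."""
    # Remove spaces, tabs and line breaks, and use upper case throughout.
    sequence = "".join(raw_sequence.split()).upper()
    if not sequence:
        raise ValueError("no sequence data found")
    # Reject anything that is not an accepted letter code.
    bad = sorted(set(sequence) - ALLOWED)
    if bad:
        raise ValueError("unexpected characters: " + "".join(bad))
    # The description line must start with exactly one '>'.
    header = ">" + description.strip().lstrip(">").strip()
    # Break the sequence into lines of at most `width` characters.
    body = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header] + body) + "\n"


# Hypothetical input: a description and loosely formatted sequence text.
record = to_fasta("example_id example description", "qikdll vsss\ntdld")
```

Writing `record` to a file with Python's built-in `open` gives the same result as saving from the text editor.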
