Computing sequence length from a fasta file
Often, one wants to calculate the sequence length from a FASTA file.
Example
Say you have a FASTA file called my_sequence.fasta
:
>my_sequence
ATCGATCGATCG
Here, you’d like to compute “12” as the length of the ATCGATCGATCG
sequence.
One-liner
To achieve this on a bigger FASTA file, run this code in your Shell:
awk '/^>/ {print; next; } { seqlen = length($0); print seqlen}' my_sequences.fasta
This will return results to the stdout (your screen):
12
You can execute this code on a multi-line FASTA file.
FASTQ to FASTA
If needed, first convert your FASTQ to FASTA with seqtk.
seqtk seq -a in.fq.gz > out.fa
Written on June 2, 2021