Description
Database preprocessing: removal of duplicated sequences and of sequences with illegal symbols. To preferentially keep one duplicate sequence over another, place preferred sequences first.
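The "preferred sequences first" rule can be sketched as a first-occurrence filter. This is only an illustration, not metatag's actual implementation: the function name `preprocess_records`, the alphabet `LEGAL_PEPTIDE`, and the choice to compare duplicates by sequence content rather than by ID are all assumptions.

```python
from typing import Iterable, Iterator, Set, Tuple

# Assumption: the 20 standard amino acid letters; the alphabet that
# metatag actually accepts may be broader.
LEGAL_PEPTIDE = set("ACDEFGHIKLMNPQRSTVWY")


def preprocess_records(
    records: Iterable[Tuple[str, str]],
    legal: Set[str] = LEGAL_PEPTIDE,
) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs, skipping sequences with illegal
    symbols and keeping only the first occurrence of each sequence."""
    seen = set()
    for record_id, seq in records:
        seq = seq.upper()
        if not set(seq) <= legal:
            continue  # contains at least one illegal symbol
        if seq in seen:
            continue  # duplicate of a record seen earlier
        seen.add(seq)
        yield record_id, seq


# Hypothetical records: the preferred record comes first, so it is the one kept.
records = [
    ("preferred_1", "MKTAYIAK"),
    ("other_1", "MKTAYIAK"),  # same sequence as preferred_1
    ("bad_1", "MKT#YIAK"),    # contains the illegal symbol '#'
    ("other_2", "MSLLTEVE"),
]
kept = list(preprocess_records(records))
print(kept)
```

Because the filter keeps whichever copy it meets first, the order of records in the input (or the order in which files are concatenated) decides which duplicate survives.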
Epilog
Semidán Robaina Estévez (srobaina@ull.edu.es), 2021
Usage
usage: metatag preprocess [-h] [args]
Arguments
| short | long | default | help |
|---|---|---|---|
| -h | --help | | show this help message and exit |
| | --in | None | path to fasta file or directory containing fasta files |
| | --dna | | declare if sequences are nucleotides. Defaults to peptide sequences. |
| | --translate | | choose whether nucleotide sequences are translated with prodigal |
| | --export-duplicates | | choose whether duplicated sequences are exported to file (same directory as outfile) |
| | --outfile | None | path to output fasta file |
| | --relabel | | relabel record IDs with numeric IDs |
| | --relabel_prefix | None | prefix to be added to sequence IDs |
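
As a rough illustration of how these arguments combine, the sketch below assembles a `metatag preprocess` command line as an argument list. The helper `build_preprocess_command` and the file paths are hypothetical, and it assumes that the options listed without a default (`--dna`, `--translate`, `--export-duplicates`, `--relabel`) are boolean switches that take no value.

```python
import shlex
from typing import List, Optional


def build_preprocess_command(
    in_path: str,
    outfile: str,
    dna: bool = False,
    translate: bool = False,
    export_duplicates: bool = False,
    relabel: bool = False,
    relabel_prefix: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for `metatag preprocess` (hypothetical helper)."""
    cmd = ["metatag", "preprocess", "--in", in_path, "--outfile", outfile]
    # Assumption: these four options are plain switches without a value.
    if dna:
        cmd.append("--dna")
    if translate:
        cmd.append("--translate")
    if export_duplicates:
        cmd.append("--export-duplicates")
    if relabel:
        cmd.append("--relabel")
    if relabel_prefix is not None:
        cmd += ["--relabel_prefix", relabel_prefix]
    return cmd


# Hypothetical paths, for illustration only.
cmd = build_preprocess_command(
    "data/sequences.fasta",
    "results/clean.fasta",
    dna=True,
    translate=True,
    export_duplicates=True,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where metatag is installed.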