Previously seen on Slack
It would be great if there were some way to convert an alignment into a SAM/BAM.Record so that it could be exported to the filesystem for future use. Note that I will only refer to SAM.Record from here on, but the same would be expected for BAM.Record, along the lines of #25.
Expected Behavior
Ideally, this would simply be a new method for the SAM.Record constructor. Something like:
XAM.SAM.Record(aln::PairwiseAlignment; kwargs...)
There would need to be other information (possibly with defaults?) added, as a SAM.Record contains more fields than
Current Behavior
There is no such behavior.
Possible Solution / Implementation
The challenge here is that SAM.Records store more information than PairwiseAlignments. Here's a map of what I think could be inferred vs. what needs to be passed as an argument.
| Column | Field name | PairwiseAlignment accessor | Possible default |
| --- | --- | --- | --- |
| 1 | QNAME | N/A | N/A |
| 2 | FLAG | N/A* | N/A |
| 3 | RNAME | N/A | N/A |
| 4 | POS | N/A | 1** |
| 5 | MAPQ | unknown | 255 |
| 6 | CIGAR | cigar(aln.a.aln) | N/A |
| 7 | RNEXT | N/A | N/A*** |
| 8 | PNEXT | N/A | N/A*** |
| 9 | TLEN | length(aln.a.seq) | N/A |
| 10 | SEQ | aln.a.seq | N/A |
| 11 | QUAL | N/A | "*" |
| 12 | NM**** | count_mismatches(aln) | N/A |
Notes
*
It might be possible to compute some of the flags based on the sequence, but I don't know how to do so.
**
In my experience, BioAlignments.cigar generates an alignment that always starts at position 1, and then pads it with deletions.
***
These fields are context-sensitive. Maybe generating records should be stateful 🤷?
****
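Technically, the NM tag is optional, but it is included in the recommended practices, and some tools fail if it isn't included. Note that the SAM spec defines NM as the edit distance to the reference, which counts inserted and deleted bases as well as mismatches.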
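To make the mapping above concrete, here is a minimal sketch in plain Julia of how the twelve fields could be assembled into one SAM line. It is not XAM's API: `sam_flag` and `sam_line` are hypothetical names, and the sketch takes the CIGAR and sequence as strings that have already been extracted (per the table, from `cigar(aln.a.aln)` and `aln.a.seq`). The defaults for POS, MAPQ, TLEN and QUAL follow the table. The `"*"` and `0` defaults for RNEXT and PNEXT are my assumption, taken from the SAM spec's "unavailable" values. The only FLAG bits covered are 0x4 (unmapped) and 0x10 (reverse strand), which have to be supplied by the caller since they can't be read off the alignment alone.

```julia
# Hypothetical helper: build a SAM FLAG from the two bits a caller is likely to know.
# 0x4 = segment unmapped, 0x10 = SEQ is reverse complemented (SAM spec).
function sam_flag(; unmapped::Bool = false, reverse::Bool = false)
    flag = 0x0000
    unmapped && (flag |= 0x0004)
    reverse && (flag |= 0x0010)
    return Int(flag)
end

# Hypothetical helper: join the 11 mandatory fields plus the NM tag into one
# tab-separated SAM line. Uses only Base; nothing here calls XAM or BioAlignments.
function sam_line(; qname::AbstractString,
                    rname::AbstractString,
                    cigar::AbstractString,      # e.g. string form of cigar(aln.a.aln)
                    seq::AbstractString,        # e.g. string form of aln.a.seq
                    nm::Integer,                # e.g. count_mismatches(aln)
                    flag::Integer = sam_flag(),
                    pos::Integer = 1,           # default from the table
                    mapq::Integer = 255,        # default from the table
                    rnext::AbstractString = "*",  # assumption: "unavailable" per SAM spec
                    pnext::Integer = 0,           # assumption: "unavailable" per SAM spec
                    tlen::Integer = length(seq),  # as proposed in the table
                    qual::AbstractString = "*")
    fields = (qname, flag, rname, pos, mapq, cigar, rnext, pnext, tlen, seq, qual, "NM:i:$nm")
    return join(fields, '\t')
end

line = sam_line(qname = "read1", rname = "ref", cigar = "4M", seq = "ACGT", nm = 0)
println(line)
```

If XAM can construct a `SAM.Record` from such a string (I believe it can, but I have not checked that here), the proposed constructor could be a thin wrapper that pulls `cigar`, `seq` and `nm` out of the `PairwiseAlignment` and forwards everything else as keyword arguments.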
Context
I was working on generating very specific alignments for variant calling purposes, and found that I couldn't export the result to another tool. Having this functionality would remedy that.