API: Reads
Exported functions
Making reads
Pseudoseq.Sequencing.paired_reads - Function
paired_reads(p::MoleculePool, flen::Int, rlen::Int = flen)
Create a set of paired-end reads from a pool of DNA molecules p.
flen sets the length of the forward read, and rlen sets the length of the reverse read. If you only provide flen, then the function sets rlen = flen.
If a molecule in the pool is not long enough to create a forward and/or reverse read, then that molecule will simply be skipped.
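The behaviour described above can be illustrated with a toy model on plain strings. This is only a sketch and does not call Pseudoseq: toy_paired_reads and revcomp are hypothetical helpers, the reverse read is assumed to be the reverse complement of the molecule's far end (as in standard paired-end sequencing), and a molecule is assumed to count as "not long enough" when it is shorter than the longer of the two read lengths.

```julia
# Toy model of paired-end read extraction on plain strings.
# `toy_paired_reads` and `revcomp` are hypothetical helpers, not Pseudoseq functions.

# Reverse complement of a DNA string (A<->T, C<->G; other characters pass through).
function revcomp(s::AbstractString)
    comp = Dict('A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C')
    return join(get(comp, c, c) for c in reverse(s))
end

function toy_paired_reads(molecules::Vector{String}, flen::Int, rlen::Int = flen)
    read_pairs = Tuple{String,String}[]
    for m in molecules
        # Assumption: skip a molecule that cannot supply both reads in full.
        length(m) < max(flen, rlen) && continue
        fwd = m[1:flen]                      # forward read from the left end
        rev = revcomp(m[end-rlen+1:end])     # reverse read from the right end
        push!(read_pairs, (fwd, rev))
    end
    return read_pairs
end

# The second molecule is shorter than the read length, so it is skipped.
read_pairs = toy_paired_reads(["AACCGGTTAC", "ACG"], 4)
```

Here read_pairs holds a single pair, because only the first molecule is long enough to yield two 4-base reads.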
Pseudoseq.Sequencing.unpaired_reads - Function
unpaired_reads(p::Molecules{T}, len::Int) where {T<:AbstractSequencingView}
Create a set of single-end reads from a pool of DNA molecules p.
len sets the length of the reads.
The end (strand) from which the reading begins for each DNA molecule in the pool is determined at random for each molecule, with 50:50 probability.
If you don't provide a value for len, then the function will read each DNA molecule in its entirety.
If a molecule in the pool is not long enough to create a read of the requested length, then that molecule will simply be skipped.
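As a rough illustration of the 50:50 choice of starting end, the sketch below models single-end reads on plain strings. It does not use Pseudoseq: toy_unpaired_reads is a hypothetical helper, and for brevity a read taken from the far end is simply the reversed molecule, with complementing left out.

```julia
using Random

# Hypothetical helper, not part of Pseudoseq.
function toy_unpaired_reads(rng::AbstractRNG, molecules::Vector{String}, len::Int)
    reads = String[]
    for m in molecules
        length(m) < len && continue           # too short: skip the molecule
        if rand(rng, Bool)                    # 50:50 choice of starting end
            push!(reads, m[1:len])            # read from the left end
        else
            push!(reads, reverse(m)[1:len])   # read from the right end (no complementing)
        end
    end
    return reads
end

reads = toy_unpaired_reads(MersenneTwister(1), ["AACCGGTTAC", "ACG"], 4)
```

Passing the random number generator explicitly keeps the sketch reproducible; which end is chosen for a given molecule depends on the seed.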
Introducing errors
Pseudoseq.Sequencing.edit_substitutions - Function
edit_substitutions(f::Function, reads::Reads)
Add sequencing errors to reads, or remove them, in the form of single base substitutions, according to some error generating or deleting function f.
The function provided must accept two arguments:
1. A vector of Substitution structs, which the function will mutate.
2. The nucleotide sequence of the read.
The function need not return anything; any value it does return is ignored.
Some acceptable functions already provided with Pseudoseq include FixedProbSubstitutions and ClearSubstitutions. If you want to develop your own function, their implementations provide simple examples of how it can be done. You can also pass an anonymous function.
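The two-argument, mutate-in-place contract described above might be sketched as follows. Everything here is a stand-in: ToySubstitution is a hypothetical struct whose fields are assumed for illustration (the real Substitution struct may look different), ToyFixedProb only imitates the idea of a fixed per-base probability, and nothing is passed to the real edit_substitutions.

```julia
using Random

# Hypothetical stand-in for Pseudoseq's Substitution struct; the real fields may differ.
struct ToySubstitution
    pos::Int     # 1-based position in the read
    base::Char   # base the read will report at that position
end

# Function object: each base is substituted with fixed probability `prob`.
struct ToyFixedProb
    prob::Float64
    rng::AbstractRNG
end

function (f::ToyFixedProb)(subs::Vector{ToySubstitution}, seq::AbstractString)
    for (i, b) in enumerate(seq)
        if rand(f.rng) < f.prob
            # Pick a replacement base that differs from the original one.
            alternatives = filter(!=(b), ['A', 'C', 'G', 'T'])
            push!(subs, ToySubstitution(i, rand(f.rng, alternatives)))
        end
    end
    return nothing   # the return value is ignored; only the mutation matters
end

# An anonymous function that clears every recorded substitution.
clear_all = (subs, seq) -> empty!(subs)

subs = ToySubstitution[]
ToyFixedProb(1.0, MersenneTwister(42))(subs, "ACGT")   # prob = 1.0: every base is hit
n_after_add = length(subs)
clear_all(subs, "ACGT")
```

A callable struct is convenient when the function needs parameters such as a probability, while an anonymous function is enough for a stateless operation like clearing.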
Pseudoseq.Sequencing.FixedProbSubstitutions - Type
A simple function object that can be used with edit_substitutions or edit_substitutions!.
Randomly applies sequencing errors - substitutions - to all reads according to a fixed, uniform, per-base probability.
Pseudoseq.Sequencing.ClearSubstitutions - Type
A simple function object that can be used with edit_substitutions or edit_substitutions!.
Simply clears all sequencing errors - substitutions - for all reads.
Generating FASTQ files
Pseudoseq.Sequencing.generate - Function
generate(R1name::String, R2name::String, reads::Reads{Paired,<:AbstractSequencingView})
This method only works for paired reads. Instead of interleaving R1 and R2 reads in a single FASTQ file, R1 and R2 reads are partitioned into two separate FASTQ files.
generate(filename::String, reads::Reads)
Write the reads out to a FASTQ formatted file with the given filename.
If this method is used with a paired-end read type, then the FASTQ file will be interleaved; all R1 reads will be odd records, and all R2 reads will be even records in the file.
Reads are named according to the sequence in the input genome they came from. For example, @Reference_1_R1 means the first sequence in the genome, and @Reference_2_R1 means the second sequence in the genome.
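To make the interleaved layout concrete, the sketch below writes one read pair to an in-memory buffer as standard four-line FASTQ records. It is not Pseudoseq's implementation: ToyRead, write_record and write_interleaved are hypothetical, the constant quality character is a placeholder, and the _R2 suffix on the second read's name is an assumption extrapolated from the _R1 names above.

```julia
# Hypothetical record type and writers; Pseudoseq's own internals may differ.
struct ToyRead
    name::String
    seq::String
end

# Write one four-line FASTQ record. The constant quality 'I' is a placeholder.
function write_record(io::IO, r::ToyRead)
    println(io, "@", r.name)
    println(io, r.seq)
    println(io, "+")
    println(io, repeat('I', length(r.seq)))
end

# Interleaved layout: R1 lands in odd records, R2 in even records.
function write_interleaved(io::IO, read_pairs)
    for (r1, r2) in read_pairs
        write_record(io, r1)
        write_record(io, r2)
    end
end

# The "_R2" name is an assumed counterpart to the documented "_R1" naming.
read_pairs = [(ToyRead("Reference_1_R1", "AACC"), ToyRead("Reference_1_R2", "GTAA"))]
buf = IOBuffer()
write_interleaved(buf, read_pairs)
fastq_lines = split(String(take!(buf)), '\n'; keepempty = false)
```

Splitting the same pairs across two files would amount to calling write_record on r1 and r2 with two different IO handles.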