API: Reads

Exported functions

Making reads

Pseudoseq.Sequencing.paired_reads (Function)
paired_reads(p::MoleculePool, flen::Int, rlen::Int = flen)

Create a set of paired-end reads from a pool of DNA molecules p.

flen sets the length of the forward read, and rlen sets the length of the reverse read. If you provide only flen, then rlen defaults to flen.

Note

If a molecule in the pool is not long enough to create a forward and/or reverse read, then that molecule will simply be skipped.
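The skipping rule and the roles of flen and rlen can be illustrated with a small conceptual sketch. This is not Pseudoseq's implementation: plain strings stand in for the molecule pool, `sketch_paired_reads`, `complement`, and `revcomp` are hypothetical helpers, and taking the reverse read as the reverse complement of the far end is an assumption based on how paired-end sequencing usually works.

```julia
# Conceptual sketch only: plain strings stand in for Pseudoseq's molecule pool.

# Complement of a single unambiguous DNA base; anything else maps to 'N'.
function complement(base::Char)
    base == 'A' ? 'T' : base == 'T' ? 'A' : base == 'C' ? 'G' : base == 'G' ? 'C' : 'N'
end

# Reverse complement of a DNA string.
revcomp(s::AbstractString) = join(complement(c) for c in reverse(s))

# Hypothetical helper: returns a (forward, reverse) read pair for every molecule
# long enough to supply both reads, and skips the rest.
function sketch_paired_reads(molecules::Vector{String}, flen::Int, rlen::Int = flen)
    pairs = Tuple{String,String}[]
    for m in molecules
        # Skip molecules that are too short for either read.
        length(m) < max(flen, rlen) && continue
        fwd = m[1:flen]                       # forward read from one end
        rev = revcomp(m[end-rlen+1:end])      # reverse read from the opposite end (assumed)
        push!(pairs, (fwd, rev))
    end
    return pairs
end

# The second molecule is shorter than flen, so it is skipped.
pairs = sketch_paired_reads(["AAAACCCCGG", "ACG"], 4)
```

Here rlen falls back to flen because only one length is passed, mirroring the default described above.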

Pseudoseq.Sequencing.unpaired_reads (Function)
unpaired_reads(p::Molecules{T}, len::Int) where {T<:AbstractSequencingView}

Create a set of single-end reads from a pool of DNA molecules p.

len sets the length of the reads.

The end (strand) from which reading begins is chosen at random for each DNA molecule in the pool, with 50:50 probability.

If you don't provide a value for len, then the function will read each DNA molecule in its entirety.

Note

If a molecule in the pool is not long enough to create a read of length len, then that molecule will simply be skipped.
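The random choice of starting end can be sketched as follows. This is an illustration under stated assumptions rather than Pseudoseq's code: strings stand in for molecules, `sketch_unpaired_reads` and its helpers are hypothetical names, and reading from the far end is modelled as a reverse complement.

```julia
using Random

# Complement of a single unambiguous DNA base; anything else maps to 'N'.
function complement(base::Char)
    base == 'A' ? 'T' : base == 'T' ? 'A' : base == 'C' ? 'G' : base == 'G' ? 'C' : 'N'
end

# Reverse complement of a DNA string.
revcomp(s::AbstractString) = join(complement(c) for c in reverse(s))

# Hypothetical helper: one read of length `len` per molecule, started from a
# randomly chosen end. Molecules shorter than `len` are skipped.
function sketch_unpaired_reads(rng::AbstractRNG, molecules::Vector{String}, len::Int)
    reads = String[]
    for m in molecules
        length(m) < len && continue
        if rand(rng, Bool)                    # 50:50 choice of starting end
            push!(reads, m[1:len])
        else
            push!(reads, revcomp(m[end-len+1:end]))
        end
    end
    return reads
end

rng = MersenneTwister(42)
reads = sketch_unpaired_reads(rng, ["AAAACCCC", "AAAACCCC", "AC"], 3)
```

Each surviving molecule yields either the read from its first three bases or the reverse complement of its last three, depending on the coin flip.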


Introducing errors

Pseudoseq.Sequencing.edit_substitutions (Function)
edit_substitutions(f::Function, reads::Reads)

Add sequencing errors to reads, or remove them, in the form of single-base substitutions, according to some error-generating or error-deleting function f.

The function provided must accept two arguments:

1. A vector of Substitution structs, which the function will mutate.
2. The nucleotide sequence of the read.

The function need not return anything; any value it does return is ignored.

Some acceptable functions already provided with Pseudoseq include FixedProbSubstitutions and ClearSubstitutions. If you want to develop your own function, their implementations provide a simple example of how it can be done. You can of course also just create an anonymous function.
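As a rough illustration of the two-argument contract, the sketch below builds one function that adds substitutions with a fixed per-base probability and one that clears them. Everything here is hypothetical: `DemoSubstitution` is a stand-in whose fields are assumed (the real Substitution struct may look different), and `demo_fixed_prob` and `demo_clear` are not Pseudoseq functions.

```julia
using Random

# Hypothetical stand-in for Pseudoseq's Substitution struct; the real fields may differ.
struct DemoSubstitution
    position::Int   # 1-based position in the read
    base::Char      # base that replaces the original one
end

# Returns a two-argument function (subs, seq) that mutates `subs` in place,
# adding a substitution at each position with probability `p`.
function demo_fixed_prob(rng::AbstractRNG, p::Float64)
    return function (subs::Vector{DemoSubstitution}, seq::AbstractString)
        for (i, original) in enumerate(seq)
            rand(rng) < p || continue
            # Pick a replacement base that differs from the original.
            alternatives = filter(b -> b != original, ['A', 'C', 'G', 'T'])
            push!(subs, DemoSubstitution(i, rand(rng, alternatives)))
        end
        return nothing   # the return value is not used
    end
end

# An anonymous function that deletes every recorded substitution.
demo_clear = (subs, seq) -> empty!(subs)

subs = DemoSubstitution[]
add_errors = demo_fixed_prob(MersenneTwister(1), 1.0)  # p = 1.0 hits every base
add_errors(subs, "ACGT")
n_added = length(subs)
all_differ = all(s -> s.base != "ACGT"[s.position], subs)
demo_clear(subs, "ACGT")
```

Both functions work purely by mutating the vector they are given, which is why the return value can be ignored.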


Generating FASTQ files

Pseudoseq.Sequencing.generate (Function)
generate(R1name::String, R2name::String, reads::Reads{Paired,<:AbstractSequencingView})

This method works only for paired reads. Instead of interleaving R1 and R2 reads in a single FASTQ file, it writes the R1 and R2 reads to two separate FASTQ files.

generate(filename::String, reads::Reads)

Write the reads out to a FASTQ formatted file with the given filename.

If this method is used with a paired-end read type, then the FASTQ file will be interleaved; all R1 reads will be odd records, and all R2 reads will be even records in the file.

Note

Reads are named according to the sequence in the input genome they came from. For example, @Reference_1_R1 means the first sequence in the genome, and @Reference_2_R1 means the second sequence in the genome.
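The interleaved layout and the naming scheme can be pictured with a short sketch that writes records to an in-memory buffer. It is not how generate is implemented: `sketch_interleaved` is a hypothetical helper, the input tuples of (reference index, R1 sequence, R2 sequence) are an assumed shape, and the constant quality string is a placeholder.

```julia
# Hypothetical helper: writes each pair as two consecutive four-line FASTQ
# records, R1 first, so R1 reads are odd records and R2 reads are even records.
function sketch_interleaved(io::IO, pairs)
    for (ref, r1, r2) in pairs
        for (tag, seq) in (("R1", r1), ("R2", r2))
            println(io, "@Reference_", ref, "_", tag)  # name from the source sequence
            println(io, seq)
            println(io, "+")
            println(io, "I"^length(seq))               # placeholder quality scores
        end
    end
end

io = IOBuffer()
sketch_interleaved(io, [(1, "ACGT", "TTGA"), (2, "GGCA", "CATG")])
lines = split(String(take!(io)), '\n'; keepempty = false)

# Every fourth line, starting from the first, is a record header.
headers = lines[1:4:end]
```

In this sketch the headers alternate between R1 and R2, and the number after Reference identifies which input sequence the pair came from.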
