API: Reads

Exported functions

Making reads

Pseudoseq.Sequencing.paired_reads — Function

paired_reads(p::MoleculePool, flen::Int, rlen::Int = flen)

Create a set of paired-end reads from a pool of DNA molecules p.

flen sets the length of the forward read, and rlen sets the length of the reverse read. If you only provide flen, then the function sets rlen = flen.

Note

If a molecule in the pool is not long enough to create a forward and/or reverse read, then that molecule will simply be skipped.
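The skipping rule in the note can be sketched on plain strings. This is only an illustration under stated assumptions: toy_paired_reads and revcomp are hypothetical helpers, not part of Pseudoseq, molecules are ordinary strings rather than a MoleculePool, and the reverse read is assumed to be the reverse complement of the molecule's far end, as is conventional for paired-end sequencing.

```julia
# Toy illustration only: `toy_paired_reads` and `revcomp` are hypothetical
# helpers, not Pseudoseq functions. Molecules are plain strings here.

# Reverse-complement a DNA string (A<->T, C<->G).
function revcomp(s::AbstractString)
    comp = Dict('A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C', 'N' => 'N')
    return join(comp[c] for c in reverse(s))
end

function toy_paired_reads(molecules::Vector{String}, flen::Int, rlen::Int = flen)
    pairs = Tuple{String,String}[]
    for m in molecules
        # Skip molecules that are too short for the forward or the reverse read.
        length(m) < max(flen, rlen) && continue
        fwd = m[1:flen]                      # forward read from the start of the molecule
        rev = revcomp(m[end-rlen+1:end])     # reverse read from the far end, opposite strand
        push!(pairs, (fwd, rev))
    end
    return pairs
end

# The 3 bp molecule is skipped; rlen defaults to flen = 4.
pairs = toy_paired_reads(["AACCGGTTAC", "ACG"], 4)
```

Here only the first molecule yields a read pair, because the second is shorter than the requested read length.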


Pseudoseq.Sequencing.unpaired_reads — Function

unpaired_reads(p::Molecules{T}, len::Int) where {T<:AbstractSequencingView}

Create a set of single-end reads from a pool of DNA molecules p.

len sets the length of the reads.

For each DNA molecule in the pool, the end (strand) from which reading begins is chosen at random, with 50:50 probability.

If you don't provide a value for len, then the function will read each DNA molecule in its entirety.

Note

If a molecule in the pool is not long enough to create a read of length len, then that molecule will simply be skipped.
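The random choice of end can be sketched as follows. This is a minimal illustration, not Pseudoseq's implementation: toy_unpaired_reads is a hypothetical helper, molecules are plain strings, and a read taken from the far end is assumed to be reported as its reverse complement.

```julia
using Random

# Toy illustration only: `toy_unpaired_reads` is a hypothetical helper,
# not a Pseudoseq function. Molecules are plain strings here.
function toy_unpaired_reads(rng::AbstractRNG, molecules::Vector{String}, len::Int)
    comp = Dict('A' => 'T', 'T' => 'A', 'C' => 'G', 'G' => 'C', 'N' => 'N')
    reads = String[]
    for m in molecules
        length(m) < len && continue          # too short for a read: skip
        if rand(rng, Bool)                   # 50:50 choice of which end to read from
            push!(reads, m[1:len])           # read from the start of the molecule
        else
            # Read from the far end, reported on the opposite strand.
            push!(reads, join(comp[c] for c in reverse(m[end-len+1:end])))
        end
    end
    return reads
end

# One read is produced; the 3 bp molecule is skipped.
reads = toy_unpaired_reads(MersenneTwister(1), ["AACCGGTTAC", "ACG"], 4)
```

Passing the random number generator explicitly keeps the sketch reproducible; whether Pseudoseq exposes such an argument is not stated here.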


Introducing errors

Pseudoseq.Sequencing.edit_substitutions — Function

edit_substitutions(f::Function, reads::Reads)

Add sequencing errors to reads, or remove them, in the form of single base substitutions, according to some error-generating or error-deleting function f.

The function provided must be a function that accepts two arguments:

1. A vector of Substitution structs, which the function will mutate.
2. The nucleotide sequence of the read.

The function need not return anything, and anything it does return will not be used.

Some acceptable functions already provided with Pseudoseq include FixedProbSubstitutions and ClearSubstitutions. If you want to develop your own function, their implementations provide a simple example of how it can be done. You can of course also just create an anonymous function.
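The two-argument contract described above can be sketched without Pseudoseq itself. Everything named here is an assumption made for illustration: ToySubstitution stands in for the real Substitution struct (whose fields may differ), and toy_edit! is a hypothetical driver that mimics how such a function would be applied to each read.

```julia
# Toy illustration only. `ToySubstitution` is a hypothetical stand-in for
# Pseudoseq's Substitution struct; the real field names may differ.
struct ToySubstitution
    pos::Int      # 1-based position of the error in the read
    base::Char    # the erroneous base reported at that position
end

# Deterministic replacement base, so that the example is reproducible.
const NEXT_BASE = Dict('A' => 'C', 'C' => 'G', 'G' => 'T', 'T' => 'A')

# An error-generating function: takes the substitution vector and the read
# sequence, mutates the vector in place, and returns nothing useful.
function every_tenth_base!(subs, seq)
    for pos in 10:10:length(seq)
        push!(subs, ToySubstitution(pos, NEXT_BASE[seq[pos]]))
    end
    return nothing
end

# Hypothetical driver: applies f to each read's substitution vector and sequence.
function toy_edit!(f::Function, subs_per_read, seqs)
    for (subs, seq) in zip(subs_per_read, seqs)
        f(subs, seq)   # any return value is ignored
    end
    return subs_per_read
end

seqs = ["ACGTACGTACGT", "ACGT"]
subs = [ToySubstitution[] for _ in seqs]
toy_edit!(every_tenth_base!, subs, seqs)
after_adding = deepcopy(subs)

# An anonymous function works too; this one clears all recorded errors.
toy_edit!((s, _) -> empty!(s), subs, seqs)
```

The first call records one substitution at position 10 of the 12 bp read and none for the 4 bp read; the second call removes them again.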


Pseudoseq.Sequencing.FixedProbSubstitutions — Type

A simple function object that can be used with edit_substitutions or edit_substitutions!.

Randomly applies sequencing errors - substitutions - to all reads according to a fixed, uniform, per-base probability.


Pseudoseq.Sequencing.ClearSubstitutions — Type

A simple function object that can be used with edit_substitutions or edit_substitutions!.

Simply clears all sequencing errors - substitutions - for all reads.


Generating FASTQ files

Pseudoseq.Sequencing.generate — Function

generate(R1name::String, R2name::String, reads::Reads{Paired,<:AbstractSequencingView})

This method only works for paired reads. Instead of interleaving R1 and R2 reads in a single FASTQ file, R1 and R2 reads are partitioned into two separate FASTQ files.


generate(filename::String, reads::Reads)

Write the reads out to a FASTQ formatted file with the given filename.

If this method is used with a paired-end read type, then the FASTQ file will be interleaved; all R1 reads will be odd records, and all R2 reads will be even records in the file.

Note

Reads are named according to the sequence in the input genome they came from. For example, @Reference_1_R1 means the first sequence in the genome, and @Reference_2_R1 means the second sequence in the genome.
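The interleaved layout described above (R1 reads as odd records, R2 reads as even records) can be sketched like this. It is an illustration only: toy_write_interleaved is a hypothetical helper rather than Pseudoseq's writer, the read names are made up, and the quality line is a constant placeholder.

```julia
# Toy illustration only: `toy_write_interleaved` is a hypothetical helper,
# not a Pseudoseq function. Reads are (name, sequence) tuples.
function toy_write_interleaved(io::IO,
                               r1::Vector{Tuple{String,String}},
                               r2::Vector{Tuple{String,String}})
    length(r1) == length(r2) ||
        throw(ArgumentError("R1 and R2 must contain the same number of reads"))
    for ((n1, s1), (n2, s2)) in zip(r1, r2)
        # R1 is written first (odd record), then its mate R2 (even record).
        for (name, seq) in ((n1, s1), (n2, s2))
            # Four lines per FASTQ record: header, sequence, separator, qualities.
            println(io, '@', name)
            println(io, seq)
            println(io, '+')
            println(io, repeat('I', length(seq)))   # placeholder quality string
        end
    end
    return nothing
end

io = IOBuffer()
toy_write_interleaved(io, [("frag1_R1", "AACC")], [("frag1_R2", "GTAA")])
fastq = String(take!(io))
```

Writing to an IO object rather than a filename keeps the sketch easy to inspect; the two-file variant would simply send R1 and R2 records to different streams.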
