Molecule Pool

API: MoleculePool

Exported functions

Making a pool of molecules

Pseudoseq.makepool - Function.
makepool(gen::Vector{BioSequence{DNAAlphabet{2}}}, ng::Int = 1, iscircular::Bool = false)

Create a pool of ng copies of a genome defined by the gen vector of sequences.

Note

The argument iscircular is currently not used.

makepool(rdr::FASTA.Reader, ng::Int = 1, iscircular::Bool = false)

Create a pool of ng copies of the genome read in from the FASTA.Reader.

Note

The argument iscircular is currently not used.

makepool(file::String, ng::Int, iscircular::Bool = false)

Create a pool of ng copies of the genome stored in the FASTA-formatted file.

Note

The argument iscircular is currently not used.


Molecule pool transformations

Fragment

Pseudoseq.fragment - Method.
fragment(p::MoleculePool, meansize::Int)

Create a new pool by breaking up the DNA molecules in an input pool.

This method breaks up each DNA molecule in the pool p so that the average length of the resulting fragments is approximately meansize.

Each molecule is fragmented by scattering an appropriate number of breakpoints across it and then cutting it at those breakpoints.

Note

Breakpoints are scattered entirely at random across a molecule. No two breakpoints can fall at the same position, because positions are sampled without replacement.

Note

The appropriate number of breakpoints to scatter across a molecule is calculated as:

\[\frac{L}{S} - 1\]

where $L$ is the length of the molecule being fragmented and $S$ is the desired expected fragment size. This calculation assumes that breakpoints fall randomly across the molecule (see the note above).
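
As a worked instance of the formula, take $L = 10000$ and $S = 500$ (illustrative values, not taken from the package):

\[\frac{10000}{500} - 1 = 19\]

Nineteen breakpoints cut the molecule into $19 + 1 = 20$ fragments, whose mean length is $10000 / 20 = 500 = S$.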

Note

If a DNA molecule being fragmented is smaller than the desired meansize, it is not broken; it is simply included in the new pool as it is.
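
The notes above can be illustrated with a minimal sketch in plain Julia. It is not Pseudoseq's implementation: the helpers nbreakpoints and sketch_fragment are hypothetical, molecules are modelled as integer ranges rather than sequences, and rounding $L/S$ down with integer division is an assumption.

```julia
using Random

# Hypothetical helper (not part of Pseudoseq): number of breakpoints for a
# molecule of length len and a desired mean fragment size meansize.
# Integer division is an assumption; the formula itself is L / S - 1.
nbreakpoints(len::Int, meansize::Int) = max(div(len, meansize) - 1, 0)

# Hypothetical helper: cut the interval 1:len into fragments, returned as ranges.
function sketch_fragment(rng::AbstractRNG, len::Int, meansize::Int)
    nbreaks = nbreakpoints(len, meansize)
    # Nothing to cut: the molecule is passed through whole.
    nbreaks == 0 && return [1:len]
    # A breakpoint at position i cuts between base i and base i + 1, so the
    # valid positions are 1:(len - 1). Taking the first nbreaks entries of a
    # random permutation samples positions without replacement.
    breaks = sort!(randperm(rng, len - 1)[1:nbreaks])
    starts = [1; breaks .+ 1]
    stops = [breaks; len]
    return [a:b for (a, b) in zip(starts, stops)]
end

frags = sketch_fragment(MersenneTwister(1), 10_000, 500)
println(length(frags), " fragments, mean length ", sum(length, frags) / length(frags))
```

Because the breakpoint positions are distinct, every fragment contains at least one base, and the fragment lengths always sum to the length of the original molecule.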


Subsampling

subsample(p::MoleculePool, n::Int)

Create a new pool by sampling an input pool.

Note

DNA molecules in the input pool p are selected according to the uniform distribution; no one molecule is more or less likely to be selected than another.

Note

Sampling is done without replacement, so the new pool cannot receive the same molecule from the input pool twice.
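
Uniform sampling without replacement can be sketched in plain Julia as follows. The helper sketch_subsample is hypothetical and the pool is modelled as a vector of labels; raising an error when n exceeds the pool size is an assumption, since the behaviour of subsample in that case is not stated here.

```julia
using Random

# Hypothetical helper (not part of Pseudoseq): pick n distinct elements of
# pool, each element being equally likely to be chosen.
function sketch_subsample(rng::AbstractRNG, pool::AbstractVector, n::Int)
    # Assumption: asking for more molecules than the pool holds is an error.
    n <= length(pool) || throw(ArgumentError("cannot draw $n molecules from a pool of $(length(pool))"))
    # The first n entries of a random permutation are n distinct indices.
    idx = randperm(rng, length(pool))[1:n]
    return pool[idx]
end

pool = ["mol$i" for i in 1:10]
picked = sketch_subsample(MersenneTwister(42), pool, 4)
println(picked)
```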


Tagging

Pseudoseq.tag - Method.
tag(u::MoleculePool, ntags::Int)

Create a pool of tagged DNA molecules from some input pool.

The new tagged pool has the same DNA molecules as the input pool. However, each DNA molecule in the new tagged pool will be assigned a tag in the range of 1:ntags.

Any molecule derived from a tagged molecule inherits that molecule's tag. For example, if a tagged DNA fragment is later broken up by a fragment transform, all of the smaller fragments derived from it inherit its tag.
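
Tag inheritance can be pictured with a small sketch in plain Julia. The SketchMolecule type and the sketch_cut helper are hypothetical and say nothing about how Pseudoseq stores molecules internally; the breakpoints are assumed to be sorted and to lie inside the molecule.

```julia
# Hypothetical molecule record: a span on the genome plus the tag it carries.
struct SketchMolecule
    span::UnitRange{Int}
    tag::Int
end

# Hypothetical helper: cut a molecule after each position in breaks.
# Every child fragment is built with the tag of the molecule it came from.
function sketch_cut(m::SketchMolecule, breaks::Vector{Int})
    starts = [first(m.span); breaks .+ 1]
    stops = [breaks; last(m.span)]
    return [SketchMolecule(a:b, m.tag) for (a, b) in zip(starts, stops)]
end

long = SketchMolecule(1:1000, 7)
children = sketch_cut(long, [250, 600])
println([(c.span, c.tag) for c in children])
```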

Note

Which fragment gets a certain tag is completely random. It is possible for two distinct DNA molecules in a pool to be assigned the same tag. The likelihood of that happening depends on the size of the tag pool (ntags), and the number of fragments in the pool.
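
To get a feel for how likely shared tags are, the sketch below assumes that each molecule draws its tag independently and uniformly from 1:ntags. That assumption, and the helper names sketch_tags and collision_probability, are illustrative rather than taken from Pseudoseq. Under it, the chance that at least two molecules share a tag is the classic birthday-problem probability.

```julia
using Random

# Hypothetical helper (not part of Pseudoseq): one tag per molecule, drawn
# independently and uniformly from 1:ntags, so repeated tags are possible.
sketch_tags(rng::AbstractRNG, nmolecules::Int, ntags::Int) = rand(rng, 1:ntags, nmolecules)

# Probability that at least two of nmolecules share a tag, under the same
# uniform and independent assumption.
function collision_probability(nmolecules::Int, ntags::Int)
    # More molecules than tags guarantees a shared tag.
    nmolecules > ntags && return 1.0
    p_unique = 1.0
    for i in 0:(nmolecules - 1)
        # Molecule i + 1 must avoid the i tags already in use.
        p_unique *= (ntags - i) / ntags
    end
    return 1.0 - p_unique
end

tags = sketch_tags(MersenneTwister(3), 50, 1_000_000)
println(collision_probability(50, 1_000_000))
```

A larger ntags relative to the number of molecules in the pool makes a shared tag less likely.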
