SequenceDistanceGraph

class SequenceDistanceGraph : public DistanceGraph

The most important features of this class are:

  • Nodes start from 1.

  • Nodes are signed to represent the direction in which they are being considered.
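The two conventions above can be illustrated with a minimal sketch. It assumes that `sgNodeID_t` is a signed 64-bit integer and that nodes are stored in a vector whose slot 0 is an unused placeholder. `ToyNode`, `node_index`, `is_reverse`, and `make_toy_nodes` are hypothetical names used only for illustration; they are not part of the class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumption: node IDs are signed 64-bit integers.
using sgNodeID_t = int64_t;

// Hypothetical stand-in for a graph node.
struct ToyNode {
    std::string sequence;
};

// Storage index of a node: the sign only encodes the direction.
inline std::size_t node_index(sgNodeID_t id) {
    assert(id != 0);  // 0 is never a valid ID because nodes start from 1
    return static_cast<std::size_t>(id < 0 ? -id : id);
}

// True when the node is being considered in the reverse direction.
inline bool is_reverse(sgNodeID_t id) {
    return id < 0;
}

// Toy node store with slot 0 reserved, so that valid IDs start at 1.
inline std::vector<ToyNode> make_toy_nodes() {
    return {ToyNode{""}, ToyNode{"ACGT"}, ToyNode{"GGA"}};
}
```

With this layout, `2` and `-2` refer to the same stored node and differ only in the direction in which it is read.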

Public Functions

void load_from_gfa(std::string filename)

Loads the graph from a GFA file. The GFA version (1 or 2) is detected automatically and the file is loaded accordingly.

Parameters
  • filename: Path of the GFA file to load
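
Version detection can be sketched as follows. This is not the library's implementation: it assumes the version is announced by a `VN:Z:` tag on an `H` header line at the top of the file, as the GFA specifications describe, and `detect_gfa_version` is a hypothetical helper name.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the major GFA version (1 or 2) announced by
// the header line, or 0 when the first line carries no usable version tag.
inline int detect_gfa_version(std::istream &in) {
    std::string line;
    if (!std::getline(in, line)) return 0;         // empty input
    if (line.empty() || line[0] != 'H') return 0;  // first line is not a header

    const std::string tag = "VN:Z:";
    const std::string::size_type pos = line.find(tag);
    if (pos == std::string::npos) return 0;        // header without version tag

    const std::string::size_type value_pos = pos + tag.size();
    if (value_pos >= line.size()) return 0;        // tag present but empty

    // Only the major version digit matters for choosing a parser.
    const char major = line[value_pos];
    if (major == '1') return 1;
    if (major == '2') return 2;
    return 0;
}
```

A caller would typically open the file, call the helper once, rewind the stream, and dispatch to a version-specific parser. What to do when the result is 0 (for example, falling back to GFA1) is a design choice this sketch leaves open.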

void write(std::ofstream &output_file)

Todo: move to DistanceGraph

sgNodeID_t add_node(Node n)

Adds a new node to the graph

Return

Returns the ID of the added node

Parameters
  • n: Node object to add

sgNodeID_t add_node(std::string seq)

Adds a new node to the graph from a string

Return

Returns the ID of the added node

Parameters
  • seq: Sequence of the node to add

bool is_sane() const

Graph sanity check. Verifies that the graph conforms to the expected structure.

Return

Whether the graph is valid or not

void remove_node(sgNodeID_t n)

Todo: enable extra breaks in repeats

Deletes a node from the graph.

Parameters
  • n: ID of the node to delete

void join_path(const SequenceDistanceGraphPath p, bool consume_nodes = true)

Creates a new node containing the sequence of the full path and connects it to the same end connections as the path. Optionally removes the nodes that only participate in this path.

Parameters
  • p: Path to join into a single node

  • consume_nodes: If true, removes the nodes that only participate in this path
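
The sequence of the joined node can be sketched as below. This simplified illustration assumes that consecutive nodes abut with no overlap or gap, and that a negative ID means the node is read as its reverse complement. `reverse_complement` and `path_sequence` are hypothetical helpers, and the sketch does not reproduce how the class handles links or overlaps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using sgNodeID_t = int64_t;  // assumption: signed 64-bit node IDs

// Reverse complement of a DNA sequence; non-ACGT characters become 'N'.
inline std::string reverse_complement(const std::string &seq) {
    std::string rc(seq.rbegin(), seq.rend());
    for (char &c : rc) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            case 'T': c = 'A'; break;
            default:  c = 'N'; break;
        }
    }
    return rc;
}

// Hypothetical helper: sequence spelled by a path of signed node IDs.
// sequences[i] holds the forward sequence of node i; slot 0 is unused.
// Assumption: consecutive nodes abut with no overlap or gap.
inline std::string path_sequence(const std::vector<std::string> &sequences,
                                 const std::vector<sgNodeID_t> &path) {
    std::string joined;
    for (sgNodeID_t id : path) {
        assert(id != 0);
        const std::size_t idx = static_cast<std::size_t>(id < 0 ? -id : id);
        assert(idx < sequences.size());
        joined += (id > 0) ? sequences[idx] : reverse_complement(sequences[idx]);
    }
    return joined;
}
```

The resulting string is what a joined node would carry under these assumptions. Reconnecting the path ends and consuming the original nodes are separate steps not shown here.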

std::vector<SequenceDistanceGraphPath> get_all_unitigs(uint16_t min_nodes)

Todo: deprecate and replace/merge with get_all_lines

void expand_node(sgNodeID_t nodeID, std::vector<std::vector<sgNodeID_t>> bw, std::vector<std::vector<sgNodeID_t>> fw)

Makes multiple copies of a node to expand it as a repeat. Each copy is connected to the backward (bw) and forward (fw) nodes as specified, and those connections are removed from the original node.

Parameters
  • nodeID: ID of the node to expand

  • bw: Backward connections for each copy

  • fw: Forward connections for each copy
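
The idea of repeat expansion can be sketched on a toy graph. The sketch assumes that entry `i` of `bw` and `fw` lists the neighbours of copy `i`, and it simplifies links to forward-only directed edges between positive IDs. `ToyGraph` and `expand_node_sketch` are hypothetical and do not mirror the class's link representation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

using sgNodeID_t = int64_t;  // assumption: signed 64-bit node IDs

// Hypothetical toy graph: sequences indexed by node ID (slot 0 unused) and
// directed edges (from, to) meaning "from is followed by to".
struct ToyGraph {
    std::vector<std::string> sequences{""};
    std::set<std::pair<sgNodeID_t, sgNodeID_t>> edges;

    sgNodeID_t add_node(const std::string &seq) {
        sequences.push_back(seq);
        return static_cast<sgNodeID_t>(sequences.size() - 1);
    }
};

// Creates one copy of `node` per entry of bw/fw, moves the listed edges from
// the original node to the copy, and returns the IDs of the copies.
inline std::vector<sgNodeID_t> expand_node_sketch(
        ToyGraph &g, sgNodeID_t node,
        const std::vector<std::vector<sgNodeID_t>> &bw,
        const std::vector<std::vector<sgNodeID_t>> &fw) {
    assert(bw.size() == fw.size());
    assert(node > 0);
    const std::size_t idx = static_cast<std::size_t>(node);
    assert(idx < g.sequences.size());

    std::vector<sgNodeID_t> copies;
    for (std::size_t i = 0; i < bw.size(); ++i) {
        // Copy the sequence first: add_node may reallocate the vector.
        const std::string seq = g.sequences[idx];
        const sgNodeID_t copy = g.add_node(seq);
        for (sgNodeID_t b : bw[i]) {
            g.edges.erase(std::make_pair(b, node));
            g.edges.emplace(b, copy);
        }
        for (sgNodeID_t f : fw[i]) {
            g.edges.erase(std::make_pair(node, f));
            g.edges.emplace(copy, f);
        }
        copies.push_back(copy);
    }
    return copies;
}
```

For a repeat R with edges A→R, B→R, R→C, and R→D, expanding with `bw = {{A}, {B}}` and `fw = {{C}, {D}}` yields two copies that resolve the repeat into A→R'→C and B→R''→D, leaving the original R without those connections.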

std::vector<SequenceSubGraph> get_all_bubbly_subgraphs(uint32_t maxsubgraphs = 0)

Todo: deprecate, please

void print_bubbly_subgraph_stats(const std::vector<SequenceSubGraph> &bubbly_paths)

Todo: deprecate, please

std::vector<sgNodeID_t> oldnames_to_nodes(std::string _oldnames)

Converts a list of old node names into the corresponding node IDs in the graph.

Return

List of node IDs from the graph

Parameters
  • _oldnames: String containing old names
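
A name lookup of this kind can be sketched as follows. The sketch assumes that the names are comma-separated and that an unknown name is an error; both are assumptions, since the format of `_oldnames` is not specified here. `names_to_ids_sketch` is a hypothetical free function, not the member itself.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

using sgNodeID_t = int64_t;  // assumption: signed 64-bit node IDs

// Hypothetical helper: splits `names` on commas (assumed separator) and
// looks each name up in a name -> ID map such as oldnames_to_ids.
// Throws std::out_of_range when a name is not present in the map.
inline std::vector<sgNodeID_t> names_to_ids_sketch(
        const std::unordered_map<std::string, sgNodeID_t> &name_to_id,
        const std::string &names) {
    std::vector<sgNodeID_t> ids;
    std::istringstream in(names);
    std::string name;
    while (std::getline(in, name, ',')) {
        if (name.empty()) continue;  // tolerate empty fields
        ids.push_back(name_to_id.at(name));
    }
    return ids;
}
```

The returned IDs keep the order of the input names, so the result can be used directly as a list of nodes in the graph.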

const std::string &nodeID_to_name(sgNodeID_t id) const

Gets the name of a node.

Return

Name of the node

Parameters
  • id: Node ID in the graph

Public Members

The node container holds the actual nodes of the graph; nodes are generally accessed through this structure using their IDs.

std::string filename

Name of the files containing the graph and the FASTA.

std::vector<std::string> oldnames

Maps node IDs to input names.

std::unordered_map<std::string, sgNodeID_t> oldnames_to_ids

Maps input names to node IDs.

WorkSpace &ws

Reference to the WorkSpace associated with this graph.