Read alignment #57

@tmaklin

Ideas for adding read alignment to kbo.

Design

  • Implement a separate align command and keep call and find assembly-only.
  • Output from align should probably be a .sam file.
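
To make the .sam idea concrete, here is a minimal sketch of what writing single-end records could look like. The `SamRecord` struct and the `sam_header` and `sam_line` functions are hypothetical names for illustration and are not part of kbo; only the 11 tab-separated mandatory SAM fields and the `@HD`/`@SQ` header lines come from the SAM format itself.

```rust
/// Hypothetical container for the SAM fields this sketch fills in.
/// Not part of kbo; the names are for illustration only.
struct SamRecord<'a> {
    qname: &'a str, // read name
    flag: u16,      // bitwise FLAG (0 = forward strand, mapped)
    rname: &'a str, // reference sequence name
    pos: u32,       // 1-based leftmost mapping position
    mapq: u8,       // mapping quality (255 = not available)
    cigar: &'a str, // CIGAR string
    seq: &'a str,   // read sequence
}

/// Header for a single-reference alignment: one @HD and one @SQ line.
fn sam_header(ref_name: &str, ref_len: usize) -> String {
    format!("@HD\tVN:1.6\tSO:unsorted\n@SQ\tSN:{}\tLN:{}\n", ref_name, ref_len)
}

/// Format one alignment line with the 11 mandatory fields.
/// RNEXT, PNEXT and TLEN are left unset ("*", 0, 0) because the sketch
/// assumes single-end reads; QUAL is "*" (not stored).
fn sam_line(r: &SamRecord) -> String {
    format!(
        "{}\t{}\t{}\t{}\t{}\t{}\t*\t0\t0\t{}\t*",
        r.qname, r.flag, r.rname, r.pos, r.mapq, r.cigar, r.seq
    )
}
```

With a single `@SQ` line this matches the single-reference case; a multi-reference output would need one `@SQ` line per reference.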

Questions

  • Should align take prebuilt indexes?
  • Long reads only, or short reads as well? Aligning long reads is much like mapping assemblies, so the theory should translate well.
  • Should map allow reads as input?
  • Should kbo read alignment be designed mainly for mapping against a single reference? Multi-reference cases could be left to themisto.
  • Pangenome indexes and kbo? This would work well considering we can also identify the genome jumps via the 'RR' output from kbo.

Problems/considerations

  • The index loses information about which sequence a k-mer belongs to ("colors")
  • -> Multi-reference alignment therefore requires indexing each reference separately.
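
The loss of colors can be illustrated with a toy example. The sketch below uses plain hash sets as a stand-in for the SBWT index, which is an assumption made for brevity and not how kbo stores k-mers; all function names are hypothetical. A merged set can only answer whether a k-mer occurs somewhere, while one set per reference can also say where.

```rust
use std::collections::HashSet;

/// Distinct k-mers of one sequence (toy stand-in for an index).
/// Note: `windows` panics if k == 0.
fn kmer_set(seq: &[u8], k: usize) -> HashSet<Vec<u8>> {
    seq.windows(k).map(|w| w.to_vec()).collect()
}

/// A single index over all references: membership only, no colors.
fn merged_index(refs: &[(&str, &[u8])], k: usize) -> HashSet<Vec<u8>> {
    refs.iter().flat_map(|r| kmer_set(r.1, k)).collect()
}

/// One index per reference: the reference names act as colors.
/// Rebuilds each set on every call, which is fine for a toy example.
fn references_with_kmer<'a>(
    refs: &[(&'a str, &[u8])],
    k: usize,
    kmer: &[u8],
) -> Vec<&'a str> {
    refs.iter()
        .filter(|r| kmer_set(r.1, k).contains(kmer))
        .map(|r| r.0)
        .collect()
}
```

For example, with references "ACGTA" and "TTACG" and k = 3, the merged set contains ACG but cannot tell that it occurs in both references, whereas the per-reference lookup can.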

SBWT and colors

  • We don't have an efficient algorithm for extracting matching statistics on colored SBWTs.
  • The SBWT Rust crate does not support colors, although support could be ported over from the C++ code.

Algorithms

  • map relies on indexing the query and streaming the reference; this may not scale to (short) reads.
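
A toy version of the index-the-query, stream-the-reference pattern shows one way the scaling concern could arise. This sketch uses exact k-mer lookups in a hash set instead of SBWT matching statistics, so it is only an assumption-laden illustration and not kbo's actual algorithm; `stream_reference` is a hypothetical name.

```rust
use std::collections::HashSet;

/// Index the query's k-mers, then stream the reference and report,
/// for each reference k-mer start position, whether that k-mer
/// occurs in the query. Note: `windows` panics if k == 0.
fn stream_reference(query: &[u8], reference: &[u8], k: usize) -> Vec<bool> {
    // Build step: one index per query.
    let index: HashSet<&[u8]> = query.windows(k).collect();
    // Stream step: one full pass over the reference per query.
    reference.windows(k).map(|w| index.contains(w)).collect()
}
```

If each read is treated as its own query, every read costs one index build plus one full pass over the reference, so the total work grows with the number of reads times the reference length.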
