December 21, 2010
New de novo assembler
- Scaffolding is integrated into the assembly. This means better resolution of contigs and insertion of Ns when two contigs cannot be joined in sequence but there is pair information that connects them.
- Automatic paired distance estimation: with the -e option, the de novo assembler estimates the fragment size of your paired data.
- Improved use of unpaired reads for resolving ambiguities in the de Bruijn Graph.
- Various improvements of the assembly quality.
- New parameter for specifying the maximum bubble size. There is a default value which is automatically calculated based on the input data.
- New white paper with benchmarks and results from quality control.
- Bug fix: Fixed a bug in the de novo assembler that caused an increased number of Ns in the results because the sequence of the read spanning two contigs was not looked up correctly. The de novo assembler now produces far fewer Ns for low-coverage assemblies.
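To illustrate what inserting Ns between two contigs connected only by pair information means, here is a minimal Python sketch. It is not the algorithm used by the CLC de novo assembler: the function names, the input layout, and the median-based gap estimate are all assumptions made for illustration.

```python
from statistics import median


def estimate_gap(pairs, fragment_size):
    """Estimate the gap between two contigs from read pairs that span them.

    Hypothetical input layout: each pair is (left_tail, right_head), where
    left_tail is the number of bases from the start of the first read to the
    end of the left contig, and right_head is the number of bases from the
    start of the right contig to the end of the second read.
    """
    gaps = [fragment_size - (left_tail + right_head)
            for left_tail, right_head in pairs]
    # Use the median to limit the effect of outlier pairs; always leave
    # at least one N so the join stays visible in the scaffold.
    return max(1, round(median(gaps)))


def join_with_ns(left_contig, right_contig, gap):
    """Join two contigs into one scaffold, filling the gap with Ns."""
    return left_contig + "N" * gap + right_contig


# Three spanning pairs and an assumed fragment size of 300 bases.
pairs = [(120, 150), (100, 175), (140, 135)]
gap = estimate_gap(pairs, fragment_size=300)
scaffold = join_with_ns("ACGTACGT", "TTGACCAA", gap)
print(gap, scaffold)
```

The fragment size passed in here plays the role of the paired distance that the -e option estimates from the data.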
New read mapper
- Greatly improved mapping speed (see the white paper for more details on speed and quality)
- Support for complex genomes with many repeats
- The previous read mapper is still included as a legacy version to allow color space mapping, which is not supported in the new mapper.
- The forward only mode of the clc_mapper now also works for paired reads.
Updated naming of tools
We have updated the names of the tools to be more consistent, and to reflect the use of "mapping" rather than "assembly" throughout the software. We have provided a helper script to assist updating existing scripts based on the old naming scheme. Read more here
A new license tool is included that will make it very easy to:
- Download a license based on a license order ID. This would previously require some email exchange with CLC bio but can now be done in one go without involvement of CLC bio.
- Request and download an evaluation license directly.
Furthermore, the software now checks whether the license is valid for the particular version of the CLC Assembly Cell.
A new restriction has been added for running the CLC Assembly Cell on large computers: if the system has more than 64 cores (hyper-threaded cores), it cannot run with a static license. In this case, a network license is needed.
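If you want to check in advance whether a machine falls under this restriction, a small Python sketch such as the following can help. It only counts the logical cores reported by the operating system, which include hyper-threaded cores. It does not reproduce the license check performed by the CLC Assembly Cell, and the constant and function names are hypothetical.

```python
import os

# Limit taken from the release note above: more than 64 cores
# requires a network license.
STATIC_LICENSE_CORE_LIMIT = 64


def needs_network_license(logical_cores, limit=STATIC_LICENSE_CORE_LIMIT):
    """Return True if the core count exceeds the static-license limit."""
    return logical_cores > limit


# os.cpu_count() reports logical CPUs and may return None if undetermined.
cores = os.cpu_count() or 1
if needs_network_license(cores):
    print(f"{cores} logical cores: a network license is needed")
else:
    print(f"{cores} logical cores: a static license can be used")
```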
You can now trim adapters from sequencing reads prior to assembly or mapping. Read more here
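As a rough illustration of what adapter trimming does, the Python sketch below removes a 3' adapter from a read, using exact matching only. It is not the trimming algorithm of the CLC Assembly Cell: the function name, the min_overlap parameter, and the example adapter sequence are assumptions made for this example. Real trimmers usually also tolerate mismatches.

```python
def trim_adapter(read, adapter, min_overlap=3):
    """Remove a 3' adapter from a read using exact matching.

    A full adapter occurrence is cut together with everything after it.
    Otherwise, a prefix of the adapter at the very end of the read is cut
    if it is at least min_overlap bases long.
    """
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Try the longest partial adapter first, down to min_overlap bases.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    return read


adapter = "AGATCGGAAGAGC"  # example adapter sequence, chosen for illustration
full = trim_adapter("ACGTTGCAAGATCGGAAGAGCTTT", adapter)
partial = trim_adapter("ACGTTGCAAGATC", adapter)
untouched = trim_adapter("ACGTACGT", adapter)
print(full, partial, untouched)
```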
- Added support for read group information in castosam.
- Added support for non-specific reads in castosam
- Added progress reporting to castosam and samtocas
- sort_pairs auto-detects input files. It now supports SOLiD paired-end and Ion Torrent files.
- Proper out of memory error messages are shown if a tool runs out of memory
- Various bug fixes
CLC Assembly Cell 4.0 Beta (in beta test with selected customers)
- Comprehensive redesign of de novo assembly algorithm, providing higher quality results
- Contigs are reported as scaffolds when paired data is used