Researchers have released "Helixer," a new deep-learning tool designed to relieve a major bottleneck in genome analysis: gene annotation. The software identifies genes directly from raw DNA sequences, without experimental reference data or prior biological knowledge.
In genomics, structural annotation—the mapping of genes, introns, and UTRs (untranslated regions)—has traditionally been considered the most computationally and data-intensive step. While sequencing has long been automated, the precise localization of functional units often required months of manual curation or comparison with related species.
Key technical facts:
- Deep learning approach: Helixer detects start and stop signals and complex structures such as coding sequences (CDS) and introns directly within the sequence.
- Cross-species capability: The first tool that works reliably across taxonomic boundaries (plants, fungi, insects, vertebrates).
- Performance: It achieves prediction accuracy that is nearly on par with manually curated reference annotations; for plants, it significantly outperforms established prediction tools.
- Efficiency: Cuts analysis time from months to a fraction of that, significantly accelerating high-throughput bioinformatics workflows.
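To make the idea of structural annotation concrete, the toy sketch below shows how a per-base labeling of a sequence can be collapsed into feature intervals such as UTRs, coding sequences, and introns. It is an illustration only and is not Helixer's code or interface: the single-letter label alphabet (N, U, C, I), the function name `labels_to_features`, and the example string are all assumptions made for this example.

```python
from itertools import groupby

# Hypothetical per-base labels, chosen for illustration only:
#   N = intergenic, U = UTR, C = coding sequence (CDS), I = intron
def labels_to_features(labels):
    """Collapse a per-base label string into (label, start, end) intervals.

    Coordinates are 0-based and end-exclusive. Intergenic runs ("N")
    are skipped because they are not gene features.
    """
    features = []
    pos = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)  # length of this run of identical labels
        if label != "N":
            features.append((label, pos, pos + length))
        pos += length
    return features

# A made-up 17-base labeling: one gene with two coding exons and one intron.
example = "NNUUCCCIIICCCUUNN"
for label, start, end in labels_to_features(example):
    print(label, start, end)
```

In a real workflow the intervals would typically be written to a standard annotation format such as GFF3 rather than printed.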
“Helixer demonstrates that modern AI methods can help overcome this bottleneck,” says Prof. Björn Usadel. Helixer is already being actively used in biotechnology and plant breeding. The bioinformatics pipeline thus bridges the technological gap between automated data generation and functional interpretation.
Source: Forschungszentrum Jülich (01/2026)
