RRID | Searching in Community Resources

TALLYMER

RRID:SCR_001244

http://www.zbh.uni-hamburg.de/?id=211

A collection of flexible and memory-efficient software programs for k-mer counting and indexing of large sequence sets. It is based on enhanced suffix arrays, which give much greater flexibility in the choice of k-mer size. It can process large datasets of several billion bases.

Proper citation: TALLYMER (RRID:SCR_001244)

Source: SciCrunch Registry
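As a rough illustration of what k-mer counting means in the TALLYMER entry above, the following is a minimal sketch that tallies k-mers with a plain dictionary. It is not the enhanced suffix array approach the entry describes and would not scale to billions of bases; the function name `count_kmers` and the toy sequences are assumptions made for this example.

```python
from collections import Counter

def count_kmers(sequences, k):
    """Count every k-mer (substring of length k) across the given sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A sequence shorter than k yields an empty range and contributes nothing.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts

# Toy input, chosen only for illustration.
counts = count_kmers(["ACGTACGT", "CGTA"], 4)
print(counts.most_common(3))
```

A hash-based counter like this needs a separate pass (or a separate table) for each k, which is the kind of restriction an index-based design is meant to relax.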

ADMIXTURE

RRID:SCR_001263

https://dalexander.github.io/admixture/download.html

A software tool for maximum likelihood estimation of individual ancestries from multilocus SNP genotype datasets. It uses the same statistical model as STRUCTURE but calculates estimates much more rapidly using a fast numerical optimization algorithm. It uses a block relaxation approach to alternately update allele frequency and ancestry fraction parameters. Each block update is handled by solving a large number of independent convex optimization problems, which are tackled using a fast sequential quadratic programming algorithm. Convergence of the algorithm is accelerated using a novel quasi-Newton acceleration method. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.

Proper citation: ADMIXTURE (RRID:SCR_001263)

Source: SciCrunch Registry
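To make the quantity being maximized in the ADMIXTURE entry above more concrete, here is a minimal sketch of a binomial log-likelihood over ancestry fractions and allele frequencies. The exact form used here (genotype counts of 0, 1, or 2 copies of an allele, with the constant binomial term dropped) is an assumption for this example, as are the names `log_likelihood`, `Q`, and `F`. The sketch only evaluates the likelihood; it does not implement block relaxation, sequential quadratic programming, or quasi-Newton acceleration.

```python
import math

def log_likelihood(genotypes, Q, F):
    """Log-likelihood of genotypes given ancestry fractions and allele frequencies.

    genotypes[i][j]: copies (0, 1 or 2) of the counted allele for individual i at SNP j
    Q[i][k]:         fraction of individual i's ancestry from population k
    F[k][j]:         frequency of the counted allele in population k at SNP j
    Assumes every mixed frequency p lies strictly between 0 and 1.
    """
    total = 0.0
    for i, row in enumerate(genotypes):
        for j, g in enumerate(row):
            # Allele frequency for this individual: ancestry-weighted mix of populations.
            p = sum(Q[i][k] * F[k][j] for k in range(len(F)))
            total += g * math.log(p) + (2 - g) * math.log(1.0 - p)
    return total

# One individual, two populations, two SNPs (toy values for illustration only).
example = log_likelihood([[1, 2]], [[0.5, 0.5]], [[0.2, 0.6], [0.8, 0.4]])
print(example)
```

An alternating scheme would hold `F` fixed while improving `Q`, then hold `Q` fixed while improving `F`, re-evaluating this function to monitor convergence.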

frappe

RRID:SCR_001264

http://med.stanford.edu/tanglab/software/frappe.html

Software using a frequentist approach for estimating individual ancestry proportions.

Proper citation: frappe (RRID:SCR_001264)

Source: SciCrunch Registry

RetroSeq

RRID:SCR_005133

https://github.com/tk2/RetroSeq

A tool for discovery and genotyping of transposable element variants (TEVs), also known as mobile element insertions, from next-generation sequencing reads aligned to a reference genome in BAM format. The goal is to call TEVs that are not present in the reference genome but are present in the sequenced sample. RetroSeq can be used to locate any class of viral insertion in any species for which whole-genome sequencing data and a suitable reference genome are available. RetroSeq is a two-phase process. The first is the read pair discovery phase, in which discordant mate pairs are detected and assigned to a TE class (Alu, SINE, LINE, etc.) using the annotated TE elements in the reference and/or alignment with Exonerate to the supplied library of viral sequences.

Proper citation: RetroSeq (RRID:SCR_005133)

Source: SciCrunch Registry

VirusFinder

RRID:SCR_005205

http://bioinfo.mc.vanderbilt.edu/VirusFinder/

Software tool for efficient and accurate detection of viruses and their integration sites in host genomes through next generation sequencing data. Specifically, it detects virus infection, co-infection with multiple viruses, virus integration sites in host genomes, as well as mutations in the virus genomes. It also facilitates virus discovery by reporting novel contigs, long sequences assembled from short reads that map neither to the host genome nor to the genomes of known viruses. VirusFinder 2 works with both paired-end and single-end data, unlike the previous 1.x versions that accepted only paired-end reads. The types of NGS data that VirusFinder 2 can deal with include whole genome sequencing (WGS), whole transcriptome sequencing (RNA-Seq), targeted sequencing data such as whole exome sequencing (WES) and ultra-deep amplicon sequencing.

Proper citation: VirusFinder (RRID:SCR_005205)

Source: SciCrunch Registry

NGS-SNP

RRID:SCR_005182

http://stothard.afns.ualberta.ca/downloads/NGS-SNP/

A collection of command-line scripts for providing rich annotations for SNPs identified by the sequencing of transcripts or whole genomes from organisms with reference sequences in Ensembl. Included among the annotations, several of which are not available from any existing SNP annotation tools, are the results of detailed comparisons with orthologous sequences. These comparisons allow, for example, SNPs to be sorted or filtered based on how drastically the SNP changes the score of a protein alignment. Other fields indicate the names of overlapping protein domains or features, and the conservation of both the SNP site and flanking regions. NCBI, Ensembl, and UniProt IDs are provided for genes, transcripts, and proteins when applicable, along with Gene Ontology terms, a gene description, phenotypes linked to the gene, and an indication of whether the SNP is novel or known. A "Model_Annotations" field provides several annotations obtained by transferring the SNP in silico to an orthologous gene, typically in a well-characterized species.

Proper citation: NGS-SNP (RRID:SCR_005182)

Source: SciCrunch Registry

Ergatis

RRID:SCR_005377

http://ergatis.sourceforge.net/

A web interface and scalable software system for bioinformatics workflows that is used to create, run, and monitor reusable computational analysis pipelines. It contains pre-built components for common bioinformatics analysis tasks. These components can be arranged graphically to form highly-configurable pipelines. Each analysis component supports multiple output formats, including the Bioinformatic Sequence Markup Language (BSML). The current implementation includes support for data loading into project databases following the CHADO schema, a highly normalized, community-supported schema for storage of biological annotation data. Ergatis uses the Workflow engine to process its work on a compute grid. Workflow provides an XML language and processing engine for specifying the steps of a computational pipeline. It provides detailed execution status and logging for process auditing, facilitates error recovery from point of failure, and is highly scalable with support for distributed computing environments. The XML format employed enables commands to be run serially, in parallel, and in any combination or nesting level.

Proper citation: Ergatis (RRID:SCR_005377)

Source: SciCrunch Registry

CREST

RRID:SCR_005257

http://toolshed.g2.bx.psu.edu/repository/display_tool?repository_id=5d0de444b1f9ac52&tool_config=database%2Fcommunity_files%2F000%2Frepo_136%2Fcrest.xml&changeset_revision=4f6952e0af48

An algorithm for detecting genomic structural variations at base-pair resolution using next-generation sequencing data. CREST uses pieces of DNA called soft clips to find structural variations. Soft clips are the DNA segments produced during sequencing that fail to properly align to the reference genome as the sample genome is reassembled. CREST uses the soft clips to precisely identify sites of chromosomal rearrangement or where pieces of DNA are inserted or deleted.

Proper citation: CREST (RRID:SCR_005257)

Source: SciCrunch Registry
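The soft clips mentioned in the CREST entry above are recorded in an alignment's CIGAR string as `S` operations. The following minimal sketch shows how the clipped bases at either end of a read could be pulled out of a read sequence and its CIGAR string. The helper name `soft_clips` and the toy read are assumptions for this example; this is not CREST's code and it does not attempt any structural variation calling.

```python
import re

# One CIGAR operation: a length followed by an operation letter (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clips(read_seq, cigar):
    """Return (left_clip, right_clip): the soft-clipped bases at each end of a read."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) sit outside soft clips and their bases are absent from the read.
    core = [o for o in ops if o[1] != "H"]
    left = right = ""
    if core and core[0][1] == "S":
        left = read_seq[:core[0][0]]
    if len(core) > 1 and core[-1][1] == "S":
        right = read_seq[len(read_seq) - core[-1][0]:]
    return left, right

# Toy read: 3 clipped bases, 5 aligned bases, 2 clipped bases.
print(soft_clips("AACCGGTTAC", "3S5M2S"))
```

Many reads soft-clipped at the same reference position are the kind of signal a breakpoint caller would then examine further.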

MolBioLib

RRID:SCR_005372

http://sourceforge.net/projects/molbiolib/

A compact, portable, and extensively tested C++11 software framework and set of applications tailored to the demands of next-generation sequencing data and applicable to many other applications. It is designed to work with common file formats and data types used both in genomic analysis and general data analysis. A central relational-database-like Table class is a flexible and powerful object to intuitively represent and work with a wide variety of tabular datasets, ranging from alignment data to annotations. MolBioLib includes programs to perform a wide variety of analysis tasks such as computing read coverage, annotating genomic intervals, and novel peak calling with a wavelet algorithm. This package assumes fluency in both UNIX and C++.

Proper citation: MolBioLib (RRID:SCR_005372)

Source: SciCrunch Registry

BioExtract

RRID:SCR_005397

http://www.bioextract.org/GuestLogin

An open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet.

Proper citation: BioExtract (RRID:SCR_005397)

Source: SciCrunch Registry

SPLITREAD

RRID:SCR_005264

http://splitread.sourceforge.net/

Software for detecting INDELs (small insertions and deletions less than 50 bp in size) as well as large deletions within coding regions from exome sequencing data. It can also be applied to whole genome sequencing data.

Proper citation: SPLITREAD (RRID:SCR_005264)

Source: SciCrunch Registry

Hydra

RRID:SCR_005260

http://code.google.com/p/hydra-sv/

Software that detects structural variation (SV) breakpoints by clustering discordant paired-end alignments whose signatures corroborate the same putative breakpoint. Hydra can detect breakpoints caused by all classes of structural variation. Moreover, it was designed to detect variation in both unique and duplicated genomic regions; therefore, it will examine paired-end reads having multiple discordant alignments. Hydra does not attempt to classify SV breakpoints based on the mapping distances and orientations of each breakpoint cluster; it merely detects and reports breakpoints. This is an intentional decision, as it was observed that in loci affected by complex rearrangements, the type of variant suggested by the breakpoint signature is not always correct. Hydra does report the orientations, distances, number of supporting read-pairs, etc., for each breakpoint. It is suggested that downstream methods be used to classify variants based on the genomic features that they overlap and the co-occurrence of other breakpoints. For example, the authors developed BEDTools for exactly this purpose, and the breakpoints reported by Hydra are in the BEDPE format used by BEDTools. Future releases of Hydra will include scripts that assist in the classification process.

Proper citation: Hydra (RRID:SCR_005260)

Source: SciCrunch Registry

GEM

RRID:SCR_005339

http://cgs.csail.mit.edu/gem/

Java software for studying protein-DNA interaction using ChIP-seq / ChIP-exo data. It links binding event discovery and motif discovery with positional priors in the context of a generative probabilistic model of ChIP data and genome sequence, and it resolves ChIP data into explanatory motifs and binding events at unsurpassed spatial resolution. GEM reciprocally improves motif discovery using binding event locations, and binding event predictions using discovered motifs.

Proper citation: GEM (RRID:SCR_005339)

Source: SciCrunch Registry

Fulcrum

RRID:SCR_005523

http://pringlelab.stanford.edu/projects.html

Software to collapse identical and near-identical Illumina and 454 reads (such as those from PCR clones) into single error-corrected sequences; it can process paired-end as well as single-end reads. Fulcrum is customizable and can be deployed on a single machine, a local network or a commercially available MapReduce cluster, and it has been optimized to maximize ease-of-use, cross-platform compatibility and future scalability. Sequence datasets have been collapsed by up to 71%, and the reduced number and improved quality of the resulting sequences allow assemblers to produce longer contigs while using less memory.

Proper citation: Fulcrum (RRID:SCR_005523)

Source: SciCrunch Registry
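The idea of collapsing reads described in the Fulcrum entry above can be sketched in two small steps: grouping exact duplicates, and taking a per-position majority vote over a group of near-identical reads to get one corrected sequence. This is a simplified illustration under stated assumptions (equal-length reads, groups already formed) and is not Fulcrum's algorithm; the function names `collapse_identical` and `consensus` are hypothetical.

```python
from collections import Counter

def collapse_identical(reads):
    """Collapse exact duplicate reads into (sequence, count), most frequent first."""
    counts = Counter(reads)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def consensus(reads):
    """Majority base at each position of a group of equal-length reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

# Toy reads for illustration only.
reads = ["ACGT", "ACGT", "TTGA", "ACGT"]
print(collapse_identical(reads))
print(consensus(["ACGT", "ACGT", "ACCT"]))
```

Deciding which reads are similar enough to share a group is the hard part, and it is deliberately left out of this sketch.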

DiProGB

RRID:SCR_005651

http://diprogb.fli-leibniz.de/

Genome browser that encodes the genome sequence by physico-chemical dinucleotide properties such as stacking energy, melting temperature or twist angle. Analyses can be performed for the + and - strands, as well as for the double strand.

Proper citation: DiProGB (RRID:SCR_005651)

Source: SciCrunch Registry

Bismark

RRID:SCR_005604

http://www.bioinformatics.babraham.ac.uk/projects/bismark/

Software tool to map bisulfite-converted sequence reads and determine cytosine methylation states. It is a flexible aligner and methylation caller for Bisulfite-Seq applications, used to map bisulfite-treated sequencing reads to a genome of interest and perform methylation calls in a single step.

Proper citation: Bismark (RRID:SCR_005604)

Source: SciCrunch Registry
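The principle behind a methylation call, as referred to in the Bismark entry above, is that bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines remain C. The minimal sketch below applies that rule to a read already aligned, without gaps, to the forward strand of a reference. It is an illustration of the principle only, not Bismark's implementation; the function name `call_methylation` and the toy sequences are assumptions.

```python
def call_methylation(ref, read):
    """Call methylation at each reference cytosine covered by an ungapped forward-strand read.

    Returns a list of (position, state) where state is 'methylated' (read shows C)
    or 'unmethylated' (read shows T). Other read bases at a reference C are skipped.
    """
    calls = []
    for pos, (ref_base, read_base) in enumerate(zip(ref.upper(), read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls.append((pos, "methylated"))
        elif read_base == "T":
            calls.append((pos, "unmethylated"))
    return calls

# Toy reference and bisulfite read: the C at position 4 has been converted to T.
print(call_methylation("ACGTCC", "ACGTTC"))
```

Reads from the reverse strand would need the complementary rule (G in the reference read as A), which this sketch omits.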

Staden Package

RRID:SCR_005629

http://staden.sourceforge.net/

A fully developed set of DNA sequence assembly (Gap4 and Gap5), editing and analysis tools (Spin) for Unix, Linux, MacOSX and MS Windows.

Proper citation: Staden Package (RRID:SCR_005629)

Source: SciCrunch Registry

NovelSeq

RRID:SCR_003136

http://compbio.cs.sfu.ca/software-novelseq

Software pipeline to detect novel sequence insertions using high throughput paired-end whole genome sequencing data.

Proper citation: NovelSeq (RRID:SCR_003136)

Source: SciCrunch Registry

mrCaNaVaR

RRID:SCR_003135

http://mrcanavar.sourceforge.net/

Copy number caller that analyzes whole-genome next-generation sequencing read depth to discover large segmental duplications and deletions. It can also predict absolute copy numbers of genomic intervals.

Proper citation: mrCaNaVaR (RRID:SCR_003135)

Source: SciCrunch Registry
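The read-depth reasoning in the mrCaNaVaR entry above can be illustrated with simple arithmetic: if regions known to be present in two copies have an average depth d, a window with depth 2d suggests roughly four copies. The sketch below is a minimal version of that scaling under stated assumptions (diploid control regions, no GC or mappability correction); it is not mrCaNaVaR's method, and the name `estimate_copy_number` and the depth values are hypothetical.

```python
def estimate_copy_number(window_depths, control_depths, ploidy=2):
    """Scale each window's read depth by the mean depth of control regions.

    control_depths are depths of windows assumed to be present at `ploidy` copies.
    """
    baseline = sum(control_depths) / len(control_depths)
    return [ploidy * depth / baseline for depth in window_depths]

# Toy depths: the second window looks like a one-copy deletion, the third like a duplication.
print(estimate_copy_number([30, 15, 60], [30, 30, 30]))
```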

deCODE genetics

RRID:SCR_003334

http://www.decode.com/

A biopharmaceutical company applying its discoveries in human genetics to develop drugs and diagnostics for common diseases. They specialize in gene discovery: their population approach and resources have enabled them to isolate key genes contributing to major public health challenges from cardiovascular disease to cancer. The company's genotyping capacity is now one of the highest in the world. They have a large population-based biobank containing whole blood and DNA samples with extensive relevant phenotypic information from around 120,000 Icelanders. In the company's work in more than 50 disease projects, their statistical and informatics departments have established themselves in data processing and analysis. deCODE genetics is widely recognized as a center of excellence in genetic research.

Proper citation: deCODE genetics (RRID:SCR_003334)

Source: SciCrunch Registry
