SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
Database of positive selection based on a rigorous branch-site specific likelihood test. Positive selection is detected using CODEML on all branches of animal gene trees.
Proper citation: Selectome: a Database of Positive Selection (RRID:SCR_004542)
http://aws.amazon.com/1000genomes/
A dataset containing the full genomic sequence of 1,700 individuals, freely available for research use. The 1000 Genomes Project is an international research effort coordinated by a consortium of 75 companies and organizations to establish the most detailed catalogue of human genetic variation. The project has grown to 200 terabytes of genomic data, including DNA sequenced from more than 1,700 individuals, that researchers can now access on AWS for use in disease research free of charge. The data are available to all via Amazon S3 at: http://s3.amazonaws.com/1000genomes The 1000 Genomes Project aims to include the genomes of more than 2,662 individuals from 26 populations around the world, and the NIH will continue to add the remaining genome samples to the data collection this year. Public Data Sets on AWS provide a centralized repository of public data hosted on Amazon Simple Storage Service (Amazon S3). The data can be accessed from AWS services such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic MapReduce (Amazon EMR), which provide organizations with the scalable compute resources needed to take advantage of these large data collections. AWS stores the public data sets at no charge to the community; researchers pay only for the additional AWS resources they need for further processing or analysis of the data. All 200 TB of the latest 1000 Genomes Project data are available in a public Amazon S3 bucket. You can access the data via simple HTTP requests, or take advantage of the AWS SDKs in languages such as Ruby, Java, Python, .NET and PHP. Researchers can use the Amazon EC2 utility computing service to work with this data without the usual capital investment required at this scale.
AWS also provides a number of orchestration and automation services to help teams make their research available to others to remix and reuse. Making the data available via a bucket in Amazon S3 also means that customers can crunch the information using Hadoop via Amazon Elastic MapReduce, and take advantage of the growing collection of tools for running bioinformatics job flows, such as CloudBurst and Crossbow.
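The "simple HTTP requests" mentioned above can be sketched as follows. This is a minimal sketch, assuming the bucket still answers anonymous S3 REST listing requests at the address given in the description; the helper names `listing_url`, `parse_keys`, and `fetch_keys` are hypothetical and not part of any AWS SDK. The `list-type`, `prefix`, and `max-keys` query parameters are those of the S3 ListObjectsV2 REST call.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Base address taken from the description above.
BUCKET_URL = "http://s3.amazonaws.com/1000genomes"
# XML namespace used by S3 bucket-listing responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def listing_url(prefix="", max_keys=10):
    """Build a ListObjectsV2 request URL for the public bucket."""
    query = urlencode({"list-type": 2, "prefix": prefix, "max-keys": max_keys})
    return f"{BUCKET_URL}?{query}"


def parse_keys(xml_text):
    """Return (key, size_in_bytes) pairs from a ListBucketResult document."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext(f"{S3_NS}Key"), int(item.findtext(f"{S3_NS}Size")))
        for item in root.iter(f"{S3_NS}Contents")
    ]


def fetch_keys(prefix="", max_keys=10):
    """Fetch one page of the listing over HTTP (requires network access)."""
    with urlopen(listing_url(prefix, max_keys)) as response:
        return parse_keys(response.read())
```

Calling `fetch_keys(max_keys=5)` would return the first five object keys with their sizes, if the bucket is reachable. For bulk transfers or paginated listings, one of the AWS SDKs named in the description is likely the more practical route.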
Proper citation: 1000 Genomes Project and AWS (RRID:SCR_008801)
http://montana.eagle-i.net/i/0000012b-00be-4e65-df3b-3fdc80000000
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on October 27, 2023. Core for Microarray analysis, Database development, Systems biology analysis, Genome assembly, Pathway data analysis, Expression data analysis, Metagenomics analysis. The core's purpose is to maintain equipment and software for bioinformatic research, promote bioinformatics education on the MSU campus, and provide training and support to biologists implementing bioinformatics tools in their research.
Proper citation: Montana State University Bioinformatics Core Facility (RRID:SCR_009937)
http://www.sanger.ac.uk/science/tools/alien-hunter
Software for the prediction of putative Horizontal Gene Transfer (HGT) events with the implementation of Interpolated Variable Order Motifs (IVOMs). The predictions (embl format) can be automatically loaded into Artemis genome viewer.
Proper citation: Alien-hunter (RRID:SCR_015967)
http://www.scienceexchange.com/facilities/university-of-utah
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on April 15, 2024. Labs and facilities of the University of Utah, which include: Microarray and Genomic Analysis Core Facility, Flow Cytometry Core Facility, Mutation Generation and Detection Facility, and the Transgenic and Gene Targeting Core.
Proper citation: University of Utah Labs and Facilities (RRID:SCR_001042)
http://archive.ics.uci.edu/ml/datasets/EEG+Database
Data set from a large study to examine EEG correlates of genetic predisposition to alcoholism. It contains measurements from 64 electrodes placed on the scalp sampled at 256 Hz (3.9-msec epoch) for 1 second. There were two groups of subjects: alcoholic and control. Each subject was exposed to either a single stimulus (S1) or to two stimuli (S1 and S2) which were pictures of objects chosen from the 1980 Snodgrass and Vanderwart picture set. When two stimuli were shown, they were presented in either a matched condition where S1 was identical to S2 or in a non-matched condition where S1 differed from S2. There were 122 subjects and each subject completed 120 trials where different stimuli were shown. The electrode positions were located at standard sites (Standard Electrode Position Nomenclature, American Electroencephalographic Association 1990). Zhang et al. (1995) describe the data collection process in detail. There are three versions of the EEG data set:
* The Small Data Set (smni97_eeg_data.tar.gz) contains data for 2 subjects, alcoholic a_co2a0000364 and control c_co2c0000337. For each of the 3 matching paradigms, c_1 (one presentation only), c_m (match to previous presentation) and c_n (no-match to previous presentation), 10 runs are shown.
* The Large Data Set (SMNI_CMI_TRAIN.tar.gz and SMNI_CMI_TEST.tar.gz) contains data for 10 alcoholic and 10 control subjects, with 10 runs per subject per paradigm. The test data used the same 10 alcoholic and 10 control subjects as the training data, but with 10 out-of-sample runs per subject per paradigm.
* The Full Data Set contains all 120 trials for 122 subjects. The entire set of data is about 700 MBytes.
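The recording parameters given above imply the size of a single trial. The arithmetic can be sketched as below; the variable names are illustrative, and the trial total is a nominal figure that assumes every subject contributed all 120 trials.

```python
# Figures taken from the description above.
SAMPLING_RATE_HZ = 256
TRIAL_DURATION_S = 1
N_ELECTRODES = 64
N_SUBJECTS = 122
TRIALS_PER_SUBJECT = 120

# Time between consecutive samples, in milliseconds.
sample_interval_ms = 1000 / SAMPLING_RATE_HZ
# Samples recorded per electrode in one trial.
samples_per_channel = SAMPLING_RATE_HZ * TRIAL_DURATION_S
# Total measurements in one trial across all electrodes.
values_per_trial = samples_per_channel * N_ELECTRODES
# Nominal number of trials in the full data set.
nominal_trials = N_SUBJECTS * TRIALS_PER_SUBJECT

print(f"sample interval: {sample_interval_ms} ms")    # 3.90625 ms, quoted as 3.9 msec
print(f"values per trial: {values_per_trial}")        # 256 samples x 64 electrodes
print(f"nominal trials: {nominal_trials}")
```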
Proper citation: EEG Database (RRID:SCR_001581)
The EBI genomes pages give access to a large number of complete genomes including bacteria, archaea, viruses, phages, plasmids, viroids and eukaryotes. Methods using whole genome shotgun data are used to gain a large amount of genome coverage for an organism. WGS data for a growing number of organisms are being submitted to DDBJ/EMBL/GenBank. Genome entries have been listed in their appropriate category which may be browsed using the website navigation tool bar on the left. While organelles are all listed in a separate category, any from Eukaryota with chromosome entries are also listed in the Eukaryota page. Within each page, entries are grouped and sorted at the species level with links to the taxonomy page for that species separating each group. Within each species, entries whose source organism has been categorized further are grouped and numbered accordingly. Links are made to:
* taxonomy
* complete EMBL flatfile
* CON files
* lists of CON segments
* Project
* Proteomes pages
* FASTA file of Proteins
* list of Proteins
Proper citation: EBI Genomes (RRID:SCR_002426)
Multicenter observational study designed to identify genetic determinants of diabetic nephropathy. It is conducted in eleven U.S. clinical centers and a coordinating center, and with four ethnic groups (European Americans, African Americans, Mexican Americans, and American Indians). Two strategies are used to localize susceptibility genes: a family-based linkage study and a case-control study using mapping by admixture linkage disequilibrium (MALD). In the family-based study, probands with diabetic nephropathy are recruited with their parents and selected siblings. Linkage analyses will be conducted to identify chromosomal regions containing genes that influence the development of diabetic nephropathy or related quantitative traits such as serum creatinine concentration, urinary albumin excretion, and plasma glucose concentrations. Regions showing evidence of linkage will be examined further with both genetic linkage and association studies to identify genes that influence diabetic nephropathy or related traits. Two types of MALD studies are being done. One is a case-control study of unrelated individuals of Mexican American heritage in which both cases and controls have diabetes, but only the case has nephropathy. The other is a case-control study of African American patients with nephropathy (cases) and their spouses (controls) unaffected by diabetes and nephropathy; offspring are genotyped when available to provide haplotype data. 
The specific goals of this program:
* Delineate genomic regions associated with the development and progression of renal disease(s)
* Evaluate whether there is a genetic link between diabetic nephropathy and diabetic retinopathy
* Improve outcomes
* Provide protection for people at risk and slow the progression of renal disease
* Help establish a resource for genetic studies of kidney disease and diabetic complications by creating a repository of genetic samples and a database
* Encourage studies of the genetics of progressive renal disease
Proper citation: Family Investigation of Nephropathy of Diabetes (RRID:SCR_001525)
http://bpg.utoledo.edu/~afedorov/lab/eid.html
Data sets of protein-coding intron-containing genes that contain gene information from humans, mice, rats, and other eukaryotes, as well as genes from species whose genomes have not been completely sequenced. This is a comprehensive and convenient dataset of sequences for computational biologists who study exon-intron gene structures and pre-mRNA splicing. The database is derived from GenBank release 112, and it contains protein-coding genes that harbor introns, along with extensive descriptions of each gene and its DNA and protein sequences, as well as splice motif information. They have created subdatabases of genes whose intron positions have been experimentally determined. The collection also contains data on untranslated regions of gene sequences and intron-less genes. For species with entirely sequenced genomes, species-specific databases have been generated. A novel Mammalian Orthologous Intron Database (MOID) has been introduced which includes the full set of introns that come from orthologous genes that have the same positions relative to the reading frames.
Proper citation: EID: Exon-Intron Database (RRID:SCR_002469)
http://www.sci.unisannio.it/docenti/rampone/
Data set of Homo sapiens exons, introns and splice regions extracted from GenBank Rel.123, intended to provide standardized material for training and assessing the prediction accuracy of computational approaches to gene identification and characterization. From the complete GenBank (Primate Sequences Division) Rel.123 (162,557 entries), entries of human nuclear DNA including a complete CDS and more than one exon have been selected, and 4523 exons and 3802 introns have been extracted from these entries. Details about the extracted exons and introns are reported (locus, number, start and end position in the entry, sequence, length, G+C content, presence of non-AGCT data (nucleotide scan check)). Statistics are also reported (overall nucleotides, average G+C content, nucleotide scan check results, number of introns not starting with GT or not ending with AG, minimum / maximum / average length, length standard deviation). 3799+3799 donor and acceptor sites have been extracted as windows of 140 nucleotides around each splice site. After discarding sequences that do not include canonical GT-AG junctions (65+74), that include insufficient data (not enough material for a 140-nucleotide window) (686+589), that include non-AGCT bases (29+30), or that are redundant (218+226), there are 2796+2880 windows. Finally, there are 271,937+332,296 windows of false splice sites, selected by searching for canonical GT-AG pairs in non-splicing positions. False sites within a range of +/- 60 of a true splice site are marked as proximal.
Proper citation: HS3D - Homo Sapiens Splice Sites Dataset (RRID:SCR_002939)
http://rp-www.cs.usyd.edu.au/~yangpy/software/MFGE.html
A hybrid software system for feature selection and sample classification of high-dimensional datasets. It is designed for microarray data but can be applied to any other high-dimensional dataset. It uses multiple filters to produce a normalized score for each feature. The score indicates the usefulness of each feature. It is then translated into a frequency map in which more useful features receive a higher frequency.
Proper citation: MF-GE (RRID:SCR_003509)
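The idea of combining several filters into one normalized score and then into a frequency map can be sketched as below. This is an illustrative sketch only, not MF-GE's actual implementation: the min-max normalization, the averaging across filters, the linear score-to-frequency mapping, and all function names are assumptions made for the example.

```python
def min_max(scores):
    """Rescale one filter's raw scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def combined_scores(filter_outputs):
    """Average the normalized scores of several filters, feature by feature.

    filter_outputs: one list of raw scores per filter, all of equal length.
    """
    normalized = [min_max(scores) for scores in filter_outputs]
    return [sum(column) / len(normalized) for column in zip(*normalized)]


def frequency_map(scores, max_frequency=10):
    """Map scores in [0, 1] to integer frequencies in 1..max_frequency.

    Assumed mapping: linear, so higher-scoring features get higher frequencies.
    """
    return [1 + round(s * (max_frequency - 1)) for s in scores]


# Hypothetical raw scores from two filters for three features.
filter_a = [0.2, 0.9, 0.5]
filter_b = [10, 40, 10]
scores = combined_scores([filter_a, filter_b])
print(frequency_map(scores))  # [1, 10, 3]
```

A frequency map of this kind could, for example, be used to sample features in proportion to their frequency when building candidate subsets; whether MF-GE uses it in that way is not stated in the description.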
Curated lists of genes associated with speech / language phenotypes and structural or functional abnormalities observed in patient populations. Entrez ID gene information, as well as gene expression profiles from the Allen Brain Atlas, are available. You can also download expression data for a given gene in JSON or XML format.
Proper citation: Speech Language Disorders Database (RRID:SCR_003655)
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on January 11, 2023. Archiving services, insertional site analysis, pharmacology and toxicology resources, and reagent repository for academic investigators and others conducting gene therapy research. Databases and educational resources are open to everyone. Other services are limited to gene therapy investigators working in academic or other non-profit organizations. Stores reserve or back-up clinical grade vector and master cell banks. Maintains samples from any gene therapy related pharmacology or toxicology study that has been submitted to FDA by a U.S. academic investigator and that requires storage under Good Laboratory Practices. For certain gene therapy clinical trials, FDA has required post-trial monitoring of patients, evaluating clinical samples for evidence of clonal expansion of cells. To help academic investigators comply with this FDA recommendation, the NGVB offers assistance with clonal analysis using LAM-PCR and LM-PCR technology.
Proper citation: National Gene Vector Biorepository (RRID:SCR_004760)
http://www.linked-neuron-data.org/
Neuroscience data and knowledge from multiple scales and multiple data sources that has been extracted, linked, and organized to support comprehensive understanding of the brain. The core is the CAS Brain Knowledge base, a very large scale brain knowledge base based on automatic knowledge extraction and integration from various data and knowledge sources. The LND platform provides services for neuron data and knowledge extraction, representation, integration, visualization, semantic search and reasoning over the linked neuron data. Currently, LND extracts and integrates semantic data and knowledge from the following resources: PubMed, INCF-CUMBO, Allen Reference Atlas, NIF, NeuroLex, MeSH, DBPedia/Wikipedia, etc.
Proper citation: Linked Neuron Data (RRID:SCR_003658)
http://bc02.iis.sinica.edu.tw/gobu/manual/index.html
Gene Ontology Browsing Utility (GOBU) is a Java-based software program for integrating biological annotation catalogs under an extendable software architecture. Users may interact with the Gene Ontology and user-defined hierarchy data of genes, and then use its plugins to, among other things, (1) browse the GO hierarchy with user-defined data, (2) browse GO-oriented expression levels in the user data, (3) compute GO enrichment, and/or (4) customize data reporting. A set of classes and utility functions has been established so that a customized program can be made as a plugin or a command-line tool that programmatically manipulates the Gene Ontology and specified user data. See the source code repository for examples. Reference: Lin WD, Chen YC, Ho JM, Hsiao CD. GOBU: Toward an Integration Interface for Biological Objects. Journal of Information Science and Engineering. 2006 22(1):19-29. Platform: Windows compatible, Mac OS X compatible, Linux compatible, Unix compatible
Proper citation: Gene Ontology Browsing Utility (GOBU) (RRID:SCR_005662)
http://www.sgn.cornell.edu/bulk/input.pl?modeunigene
Allows users to download Unigene or BAC information using a list of identifiers, or complete datasets via FTP. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: Sol Genomics Network - Bulk download (RRID:SCR_007161)
http://linux1.softberry.com/spldb/SpliceDB.html
Database of canonical and non-canonical mammalian splice sites. The information about verified splice site sequences for canonical and non-canonical sites is presented with the supporting evidence. Weight matrices were built for the major splice groups, which can be incorporated into gene prediction programs.
Proper citation: SpliceDB (RRID:SCR_006262)
http://www.ncbi.nlm.nih.gov/CBBresearch/Schaffer/fastlink.html
Software application (entry from Genetic Analysis Software)
Proper citation: FASTLINK (RRID:SCR_009177)
https://github.com/gaow/genetic-analysis-software/blob/master/pages/FASTEHPLUS.md
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 7, 2016.
Proper citation: FASTEHPLUS (RRID:SCR_009176)
http://www.mds.qmw.ac.uk/statgen/dcurtis/software.html
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on May 16, 2023. Software application for non-parametric analysis (entry from Genetic Analysis Software)
Proper citation: ERPA (RRID:SCR_009173)