SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
Database that catalogs experimentally verified pathogenicity, virulence and effector genes from fungal, oomycete and bacterial pathogens, which infect animal, plant, fungal and insect hosts. It is an invaluable resource in the discovery of genes in medically and agronomically important pathogens, which may be potential targets for chemical intervention. In collaboration with the FRAC team, it also includes antifungal compounds and their target genes. Each entry is curated by domain experts and is supported by strong experimental evidence (gene disruption experiments, STM, etc.), as well as literature references in which the original experiments are described. Each gene is presented with its nucleotide and deduced amino acid sequence, as well as a detailed description of the predicted protein's function during the host infection process. To facilitate data interoperability, genes have been annotated using controlled vocabularies and links to external sources (Gene Ontology terms, EC Numbers, NCBI taxonomy, EMBL, PubMed and FRAC).
Proper citation: PHI-base (RRID:SCR_003331)
Database of polymorphisms and mutations of the human mitochondrial DNA. It reports published and unpublished data on human mitochondrial DNA variation. All data are curated by hand. If you would like to submit published articles to be included in MITOMAP, please send the citation and a PDF.
Proper citation: MITOMAP - A human mitochondrial genome database (RRID:SCR_002996)
A manually curated resource of signal transduction pathways in humans. All pathways are freely available for download in BioPAX level 3.0, PSI-MI version 2.5 and SBML version 2.1 formats. The slim pathway models representing only core reactions in each pathway are available at NetSlim. All the NetPath pathway models are also submitted to WikiPathways. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: NetPath (RRID:SCR_003567)
http://exac.broadinstitute.org/
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on January 9, 2023. An aggregated data platform for genome sequencing data created by a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available for the wider scientific community. The data set provided on this website spans 61,486 unrelated individuals sequenced as part of various disease-specific and population genetic studies. They have removed individuals affected by severe pediatric disease, so this data set should serve as a useful reference set of allele frequencies for severe disease studies. All of the raw data from these projects have been reprocessed through the same pipeline, and jointly variant-called to increase consistency across projects. They ask that you not publish global (genome-wide) analyses of these data until after the ExAC flagship paper has been published, estimated to be in early 2015. If you're uncertain which category your analyses fall into, please email them. The aggregation and release of summary data from the exomes collected by the Exome Aggregation Consortium has been approved by the Partners IRB (protocol 2013P001477, Genomic approaches to gene discovery in rare neuromuscular diseases).
Proper citation: ExAc (RRID:SCR_004068)
An information management framework for comprehensive ion channel information. It is a knowledge base system centered on genetically expressed ion channel models and it encourages researchers in the field to contribute, build and refine the information through an interactive wiki-like interface. It is web-based, freely accessible and currently contains 187 annotated ion channels with 50 Hodgkin-Huxley models (September 2014). Channelpedia provides an ideal platform to collectively build an ion channel knowledge base by accommodating both structured and unstructured data. The current version of Channelpedia contains the following sections: Introduction, Genes, Ontologies, Interactions, Structure, Expression, Distribution, Function, Kinetics and Models. Newly published literature related to ion channels is automatically queried every week from PubMed and added to the respective categories. Currently, Channelpedia contains ~180,000 abstracts related to ion channels from PubMed.
Proper citation: ChannelPedia (RRID:SCR_003807)
A database of genomic and protein data for Drosophila site-specific transcription factors.
Proper citation: FlyTF.org (RRID:SCR_004123)
https://scicrunch.org/scicrunch/data/source/nlx_154697-4/search?q=*
Virtual database indexing mouse brain region gene expression data from the Gene Expression Nervous System Atlas (GENSAT), the Allen Mouse Brain Atlas, and Mouse Genome Informatics (MGI).
Proper citation: Integrated Brain Gene Expression (RRID:SCR_004197)
THIS RESOURCE IS NO LONGER IN SERVICE; REPLACED BY NEPHROSEQ; A growing database of publicly available renal gene expression profiles, a sophisticated analysis engine, and a powerful web application designed for data mining and visualization of gene expression. It provides unique access to datasets from the Personalized Molecular Nephrology Research Laboratory incorporating clinical data which is often difficult to collect from public sources and mouse data.
Proper citation: Nephromine (RRID:SCR_003813)
http://life.ccs.miami.edu/life/
The LIFE search engine contains data generated from the LINCS Pilot Phase and integrates LINCS content by leveraging a semantic knowledge model and common LINCS metadata standards. LIFE makes LINCS content discoverable and includes aggregate results linked to Harvard Medical School, the Broad Institute, and other LINCS centers, which provide more information, including experimental conditions and raw data. Please visit the LINCS Data Portal.
Proper citation: LINCS Information Framework (RRID:SCR_003937)
http://www.broadinstitute.org/mmgp/
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on January 6, 2023. Database providing access and limited analysis to the MMGP portal data sets. These include the MMRC funded reference array comparative genomic hybridization (aCGH) and gene expression data and additional public multiple myeloma datasets. The MMGP will be updated with new features such as additional data and analysis tools as they become available.
Proper citation: Multiple Myeloma Genomics Portal (RRID:SCR_003722)
http://www.hgsc.bcm.tmc.edu/content/hapmap-3-and-encode-3
Draft release 3 for genome-wide SNP genotyping and targeted sequencing in DNA samples from a variety of human populations (sometimes referred to as the HapMap 3 samples). This release contains the following data: * SNP genotype data generated from 1184 samples, collected using two platforms: the Illumina Human1M (by the Wellcome Trust Sanger Institute) and the Affymetrix SNP 6.0 (by the Broad Institute). Data from the two platforms have been merged for this release. * PCR-based resequencing data (by Baylor College of Medicine Human Genome Sequencing Center) across ten 100-kb regions (collectively referred to as ENCODE 3) in 712 samples. Since this is a draft release, please check this site regularly for updates and new releases. The HapMap 3 sample collection comprises 1,301 samples (including the original 270 samples used in Phase I and II of the International HapMap Project) from 11 populations, listed below alphabetically by their 3-letter labels. Five of the ten ENCODE 3 regions overlap with the HapMap-ENCODE regions; the other five are regions selected at random from the ENCODE target regions (excluding the 10 HapMap-ENCODE regions). All ENCODE 3 regions are 100 kb in size and are centered within each respective ENCODE region. The HapMap 3 and ENCODE 3 data are downloadable from the ftp site.
Proper citation: HapMap 3 and ENCODE 3 (RRID:SCR_004563)
THIS RESOURCE IS NO LONGER IN SERVICE, documented May 26, 2016. Search engine that integrates over 100 curated and publicly contributed data sources and provides integrated views on the genomic, proteomic, transcriptomic, genetic and functional information currently available. Information featured in the database includes gene function, orthologies, gene expression, pathways and protein-protein interactions, mutations and SNPs, disease relationships, related drugs and compounds.
Proper citation: IntegromeDB (RRID:SCR_004620)
A curated database that provides comprehensive integrated biological information for Saccharomyces cerevisiae along with search and analysis tools to explore these data. SGD allows researchers to discover functional relationships between sequence and gene products in fungi and higher organisms. The SGD also maintains the S. cerevisiae Gene Name Registry, a complete list of all gene names used in S. cerevisiae which includes a set of general guidelines to gene naming. Protein Page provides basic protein information calculated from the predicted sequence and contains links to a variety of secondary structure and tertiary structure resources. Yeast Biochemical Pathways allows users to view and search for biochemical reactions and pathways that occur in S. cerevisiae as well as map expression data onto the biochemical pathways. Literature citations are provided where available.
Proper citation: SGD (RRID:SCR_004694)
Database of positive selection based on a rigorous branch-site specific likelihood test. Positive selection is detected using CODEML on all branches of animal gene trees.
Proper citation: Selectome: a Database of Positive Selection (RRID:SCR_004542)
http://aws.amazon.com/1000genomes/
A dataset containing the full genomic sequence of 1,700 individuals, freely available for research use. The 1000 Genomes Project is an international research effort coordinated by a consortium of 75 companies and organizations to establish the most detailed catalogue of human genetic variation. The project has grown to 200 terabytes of genomic data, including DNA sequenced from more than 1,700 individuals, that researchers can now access on AWS for use in disease research free of charge. The dataset is available to all via Amazon S3 and can be found at: http://s3.amazonaws.com/1000genomes The 1000 Genomes Project aims to include the genomes of more than 2,662 individuals from 26 populations around the world, and the NIH will continue to add the remaining genome samples to the data collection this year. Public Data Sets on AWS provide a centralized repository of public data hosted on Amazon Simple Storage Service (Amazon S3). The data can be seamlessly accessed from AWS services such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic MapReduce (Amazon EMR), which provide organizations with the highly scalable compute resources needed to take advantage of these large data collections. AWS is storing the public data sets at no charge to the community. Researchers pay only for the additional AWS resources they need for further processing or analysis of the data. All 200 TB of the latest 1000 Genomes Project data are available in a public Amazon S3 bucket. You can access the data via simple HTTP requests, or take advantage of the AWS SDKs in languages such as Ruby, Java, Python, .NET and PHP. Researchers can use the Amazon EC2 utility computing service to dive into this data without the usual capital investment required to work with data at this scale.
AWS also provides a number of orchestration and automation services to help teams make their research available to others to remix and reuse. Making the data available via a bucket in Amazon S3 also means that customers can crunch the information using Hadoop via Amazon Elastic MapReduce, and take advantage of the growing collection of tools for running bioinformatics job flows, such as CloudBurst and Crossbow.
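As a rough illustration of the plain-HTTP access mentioned above, the following Python sketch builds object and listing URLs against the bucket address quoted in the description and parses an S3 listing response. It is only a sketch under stated assumptions: the helper names (`object_url`, `listing_url`, `parse_listing`) and the example key are hypothetical, the `prefix` and `max-keys` query parameters are those of the standard S3 bucket-listing request, and whether the bucket still permits anonymous listing is not confirmed here. No network request is made.

```python
from urllib.parse import quote, urlencode
import xml.etree.ElementTree as ET

# Bucket URL as quoted in the description above.
BASE_URL = "http://s3.amazonaws.com/1000genomes"
# XML namespace used by S3 in bucket-listing responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def object_url(key):
    """Build the plain-HTTP URL for one object key (hypothetical helper)."""
    return f"{BASE_URL}/{quote(key)}"


def listing_url(prefix, max_keys=10):
    """Build a bucket-listing URL restricted to keys under `prefix`."""
    return f"{BASE_URL}?{urlencode({'prefix': prefix, 'max-keys': max_keys})}"


def parse_listing(xml_text):
    """Extract object keys from an S3 ListBucketResult XML document."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.iter(f"{S3_NS}Key")]


if __name__ == "__main__":
    # "example/readme file.txt" is a made-up key used only to show URL quoting.
    print(object_url("example/readme file.txt"))
    print(listing_url("example/"))
```

The resulting URLs could be fetched with any HTTP client, for example `urllib.request.urlopen`, and the response body of a listing request passed to `parse_listing`.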
Proper citation: 1000 Genomes Project and AWS (RRID:SCR_008801)
http://archive.ics.uci.edu/ml/datasets/EEG+Database
Data set from a large study to examine EEG correlates of genetic predisposition to alcoholism. It contains measurements from 64 electrodes placed on the scalp sampled at 256 Hz (3.9-msec epoch) for 1 second. There were two groups of subjects: alcoholic and control. Each subject was exposed to either a single stimulus (S1) or to two stimuli (S1 and S2) which were pictures of objects chosen from the 1980 Snodgrass and Vanderwart picture set. When two stimuli were shown, they were presented in either a matched condition where S1 was identical to S2 or in a non-matched condition where S1 differed from S2. There were 122 subjects and each subject completed 120 trials where different stimuli were shown. The electrode positions were located at standard sites (Standard Electrode Position Nomenclature, American Electroencephalographic Association 1990). Zhang et al. (1995) describe the data collection process in detail. There are three versions of the EEG data set. * The Small Data Set (smni97_eeg_data.tar.gz) contains data for 2 subjects, alcoholic a_co2a0000364 and control c_co2c0000337. For each of the 3 matching paradigms, c_1 (one presentation only), c_m (match to previous presentation) and c_n (no-match to previous presentation), 10 runs are shown. * The Large Data Set (SMNI_CMI_TRAIN.tar.gz and SMNI_CMI_TEST.tar.gz) contains data for 10 alcoholic and 10 control subjects, with 10 runs per subject per paradigm. The test data used the same 10 alcoholic and 10 control subjects as the training data, but with 10 out-of-sample runs per subject per paradigm. * The Full Data Set contains all 120 trials for 122 subjects. The entire set of data is about 700 MBytes.
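The sampling figures quoted above imply a fixed size for each trial, which the short Python sketch below works out. Only the three constants come from the description; the function name `trial_dimensions` is hypothetical, and the on-disk layout of the actual files is not assumed.

```python
SAMPLING_RATE_HZ = 256  # stated in the description
TRIAL_DURATION_S = 1    # stated in the description
N_ELECTRODES = 64       # stated in the description


def trial_dimensions(rate_hz=SAMPLING_RATE_HZ,
                     duration_s=TRIAL_DURATION_S,
                     electrodes=N_ELECTRODES):
    """Return (sample interval in ms, samples per electrode, values per trial)."""
    interval_ms = 1000 / rate_hz             # time between consecutive samples
    samples = rate_hz * duration_s           # samples recorded per electrode
    values = electrodes * samples            # measurements in one trial
    return interval_ms, samples, values


if __name__ == "__main__":
    interval_ms, samples, values = trial_dimensions()
    print(f"{interval_ms:.1f} ms between samples")
    print(f"{samples} samples per electrode, {values} values per trial")
```

The sample interval of 1000 / 256 ms rounds to the 3.9 msec given in the description.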
Proper citation: EEG Database (RRID:SCR_001581)
The EBI genomes pages give access to a large number of complete genomes including bacteria, archaea, viruses, phages, plasmids, viroids and eukaryotes. Methods using whole genome shotgun data are used to gain a large amount of genome coverage for an organism. WGS data for a growing number of organisms are being submitted to DDBJ/EMBL/GenBank. Genome entries have been listed in their appropriate category which may be browsed using the website navigation tool bar on the left. While organelles are all listed in a separate category, any from Eukaryota with chromosome entries are also listed in the Eukaryota page. Within each page, entries are grouped and sorted at the species level with links to the taxonomy page for that species separating each group. Within each species, entries whose source organism has been categorized further are grouped and numbered accordingly. Links are made to: * taxonomy * complete EMBL flatfile * CON files * lists of CON segments * Project * Proteomes pages * FASTA file of Proteins * list of Proteins
Proper citation: EBI Genomes (RRID:SCR_002426)
Multicenter observational study designed to identify genetic determinants of diabetic nephropathy. It is conducted in eleven U.S. clinical centers and a coordinating center, and with four ethnic groups (European Americans, African Americans, Mexican Americans, and American Indians). Two strategies are used to localize susceptibility genes: a family-based linkage study and a case-control study using mapping by admixture linkage disequilibrium (MALD). In the family-based study, probands with diabetic nephropathy are recruited with their parents and selected siblings. Linkage analyses will be conducted to identify chromosomal regions containing genes that influence the development of diabetic nephropathy or related quantitative traits such as serum creatinine concentration, urinary albumin excretion, and plasma glucose concentrations. Regions showing evidence of linkage will be examined further with both genetic linkage and association studies to identify genes that influence diabetic nephropathy or related traits. Two types of MALD studies are being done. One is a case-control study of unrelated individuals of Mexican American heritage in which both cases and controls have diabetes, but only the case has nephropathy. The other is a case-control study of African American patients with nephropathy (cases) and their spouses (controls) unaffected by diabetes and nephropathy; offspring are genotyped when available to provide haplotype data. 
The specific goals of this program: * Delineate genomic regions associated with the development and progression of renal disease(s) * Evaluate whether there is a genetic link between diabetic nephropathy and diabetic retinopathy * Improve outcomes * Provide protection for people at risk and slow the progression of renal disease * Help establish a resource for genetic studies of kidney disease and diabetic complications by creating a repository of genetic samples and a database * Encourage studies of the genetics of progressive renal disease
Proper citation: Family Investigation of Nephropathy of Diabetes (RRID:SCR_001525)
http://bpg.utoledo.edu/~afedorov/lab/eid.html
Data sets of protein-coding intron-containing genes that contain gene information from humans, mice, rats, and other eukaryotes, as well as genes from species whose genomes have not been completely sequenced. This is a comprehensive and convenient dataset of sequences for computational biologists who study exon-intron gene structures and pre-mRNA splicing. The database is derived from GenBank release 112, and it contains protein-coding genes that harbor introns, along with extensive descriptions of each gene and its DNA and protein sequences, as well as splice motif information. They have created subdatabases of genes whose intron positions have been experimentally determined. The collection also contains data on untranslated regions of gene sequences and intron-less genes. For species with entirely sequenced genomes, species-specific databases have been generated. A novel Mammalian Orthologous Intron Database (MOID) has been introduced which includes the full set of introns that come from orthologous genes that have the same positions relative to the reading frames.
Proper citation: EID: Exon-Intron Database (RRID:SCR_002469)
http://www.sci.unisannio.it/docenti/rampone/
Data set of Homo sapiens exons, introns and splice regions extracted from GenBank Rel.123 with the aim of providing standardized material to train and to assess the prediction accuracy of computational approaches for gene identification and characterization. From the complete GenBank (Primate Sequences Division) Rel.123 (162,557 entries), entries of human nuclear DNA including a complete CDS and more than one exon have been selected, and 4523 exons and 3802 introns have been extracted from these entries. Details about extracted exons and introns are reported (locus, number, start and end position in the entry, sequence, length, G+C content, presence of non-AGCT data (nucleotide scan check)). Statistics are also reported (overall nucleotides, average G+C content, nucleotide scan check results, number of introns not starting with GT or not ending with AG, minimum / maximum / average length, length standard deviation). 3799+3799 donor and acceptor sites have been extracted as windows of 140 nucleotides around each splice site. After discarding sequences not including canonical GT-AG junctions (65+74), including insufficient data (not enough material for a 140-nucleotide window) (686+589), including non-AGCT bases (29+30), and redundant (218+226), there are 2796+2880 windows. Finally, there are 271,937 + 332,296 windows of false splice sites, selected by searching for canonical GT-AG pairs in non-splicing positions. The false sites within a range of +/- 60 from a true splice site are marked as proximal.
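The filters described above (canonical GT-AG junctions, non-AGCT bases, enough sequence for a 140-nucleotide window) can be sketched in Python as follows. This is an illustrative sketch, not the data set's own extraction procedure: the function names are hypothetical, and centering the window on the splice position with 70 nucleotides on each side is an assumption, since the description gives only the window length.

```python
def is_canonical_intron(intron):
    """True if the intron starts with GT and ends with AG (GT-AG rule)."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def has_only_agct(seq):
    """True if the sequence contains no symbols other than A, G, C, T."""
    return set(seq.upper()) <= set("AGCT")


def splice_window(entry, site, size=140):
    """Return `size` nucleotides around `site`, or None if the entry is too short.

    Assumption: the window is centered, with size // 2 nucleotides on each side.
    """
    half = size // 2
    start, end = site - half, site + half
    if start < 0 or end > len(entry):
        return None  # insufficient data for a full window
    return entry[start:end]


if __name__ == "__main__":
    print(is_canonical_intron("GTAAGTCCAG"))        # starts GT, ends AG
    print(has_only_agct("ACGTNACGT"))               # N is not an AGCT base
    print(splice_window("A" * 200, 10) is None)     # too close to the entry start
```

A window would be kept only if all three checks pass, mirroring the discard steps listed in the description.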
Proper citation: HS3D - Homo Sapiens Splice Sites Dataset (RRID:SCR_002939)