SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
Allows annotation of gene expression at all stages of development and in all tissue types (including subcellular location) using the standard Drosophila anatomy ontology. All input methods use a controlled vocabulary to ensure data integrity.
Proper citation: Flannotator (RRID:SCR_001608)
http://cddb.nhlbi.nih.gov/cddb/
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 16, 2013. This database is intended to serve as a learning tool to obtain curated information for the design of microarray targets to scan collecting duct tissues (human, rat, mouse). The database focuses on regulatory and transporter proteins expressed in the collecting duct, but when collecting duct proteins are a member of a larger family of proteins, common additional members of the family are included even if they have not been demonstrated to be expressed in the collecting duct. An Internet-accessible database has been devised for major collecting duct proteins involved in transport and regulation of cellular processes. The individual proteins included in this database are those culled from literature searches and from previously published studies involving cDNA arrays and serial analysis of gene expression (SAGE). Design of microarray targets for the study of kidney collecting duct tissues is facilitated by the database, which includes links to curated base pair and amino acid sequence data, relevant literature, and related databases. Use of the database is illustrated by a search for water channel proteins, aquaporins, and by a subsequent search for vasopressin receptors. Links are shown to the literature and to sequence data for human, rat, and mouse, as well as to relevant web-based resources. Extension of the database is dynamic and is done through a maintenance interface. This permits creation of new categories, updating of existing entries, and addition of new ones. CDDB is a database that organizes lists of genes found in collecting duct tissues from three mammalian species: human, rat, and mouse. Proteins are divided into categories by family relationships and functional classification, and each category is assigned a section in the database. Each section includes links to the literature and to sequence information for genes, proteins, expressed sequence tags, and related information. 
The user can peruse a section or use a search engine at the bottom of the web page to search the database for a name or abbreviation or for a link to a sequence. Each entry in the database includes links to relevant papers in the kidney and collecting duct literature. It uses links to PubMed to generate MEDLINE searches for retrieval of references. In addition, each entry includes links to curated sequence data available in LocusLink. Individual links are made to sequence and protein data for human, rat, and mouse. Links are then added as curated sequences become available for proteins identified in the renal collecting duct and for proteins identified in kidney and similar in function or homologous to proteins identified in the collecting duct.
Proper citation: Collecting Duct Database (RRID:SCR_000759)
http://cmbi.bjmu.edu.cn/mirsnp
Database of human SNPs in predicted miRNA-mRNA binding sites, based on information from dbSNP135 and mirBASE18. MirSNP is highly sensitive and covers most experimentally confirmed SNPs that affect miRNA function. MirSNP may be combined with researchers' own GWAS or eQTL positive data sets to identify putative miRNA-related SNPs among trait- or disease-associated variants. The developers aim to update the MirSNP database as new versions of mirBASE and dbSNP become available.
Proper citation: MirSNP (RRID:SCR_001629)
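Combining MirSNP with a researcher's own GWAS or eQTL hits amounts to intersecting two lists of SNP identifiers. The following is a minimal sketch of that idea, not MirSNP's own tooling: the record layout (SNP id, miRNA, target gene) and all identifiers in the usage example are hypothetical placeholders.

```python
def mirna_related_hits(gwas_hits, mirsnp_records):
    """Return the GWAS hits that also appear in a MirSNP-style table.

    gwas_hits      -- iterable of SNP ids from the researcher's own study
    mirsnp_records -- iterable of (snp_id, mirna, target_gene) tuples;
                      this layout is an assumption made for illustration
    """
    # Index the predicted binding-site SNPs by SNP id.
    by_snp = {}
    for snp, mirna, gene in mirsnp_records:
        by_snp.setdefault(snp, []).append((mirna, gene))
    # Keep only the study hits that fall in a predicted miRNA binding site.
    return {snp: by_snp[snp] for snp in gwas_hits if snp in by_snp}


# Hypothetical placeholder data, not real database content.
records = [
    ("rs0000001", "miR-A", "GENE1"),
    ("rs0000001", "miR-B", "GENE1"),
    ("rs0000002", "miR-C", "GENE2"),
]
hits = mirna_related_hits(["rs0000001", "rs0000009"], records)
```

Here `hits` maps each overlapping SNP to every predicted miRNA/target pair it affects, which is the candidate list one would then follow up on.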
https://www.hgmd.cf.ac.uk/ac/introduction.php?lang=english
Curated database of known (published) gene lesions responsible for human inherited disease.
Proper citation: Human Gene Mutation Database (RRID:SCR_001621)
Database and browser that provides a central resource to archive and display associations between genetic variation and high-throughput molecular-level phenotypes. This effort originated with the NIH GTEx roadmap project; however, the scope of this resource will be extended to include any available genotype/molecular phenotype datasets.
Proper citation: GTEx eQTL Browser (RRID:SCR_001618)
A database of maternal gene expression information for ascidians, colloquially known as sea squirts. Information available includes DNA sequences, expression patterns of ESTs, and cDNA data from uncleaved fertilized eggs. The goal is to use the database to understand the molecular mechanisms that establish the embryonic body plans of chordates and, in the future, to understand evolution from invertebrates to vertebrates.
Proper citation: MAboya Gene Expression Patterns and Sequence Tags (RRID:SCR_000763)
https://fungi.ensembl.org/Neurospora_crassa/Info/Index
The project's strategy involves Whole Genome Shotgun (WGS) sequencing, in which sequence from the entire genome is generated and reassembled. This method is standard for microbial genome sequencing and has been successfully applied to Drosophila. Neurospora is an ideal candidate for this approach because of the low repeat content of its genome. Neurospora crassa Database has expanded its scope by including a mitochondrial annotation, incorporating information from the Neurospora compendium, and assigning NCU numbers to tRNAs and rRNAs. The annotation process has been improved to predict untranslated regions and to reduce the number of spurious predictions. As a result, version 3 contains 9,826 genes, 794 fewer than version 2. During the initial phase of a WGS project, both ends of the 4 kb inserts from a plasmid library prepared using randomly sheared and size-selected DNA are sequenced. The shotgun reads are assembled by recognizing overlapping regions of sequence and making use of the known orientation and distance of the paired reads from each plasmid. Obtaining deep sequence coverage through high levels of sequence redundancy ensures that the majority of the genome is represented in the initial assembly and that the consensus sequence is of high quality. The approach to the initial assembly was conservative: the project would rather fail to join sequence contigs that might overlap than risk making false joins between two closely related but non-overlapping genomic regions. Hence, the initial assembly contains many sequence contigs, and over time these contigs will increase in size and decrease in number as they are joined together. After shotgun sequencing and assembly, a second phase of sequencing obtained additional sequence from specific regions that were missing from the original assembly or were recognized to be of low quality in the consensus.
The Neurospora crassa sequencing project reflects a close collaboration between the Broad Institute and the Neurospora research community. Principal investigators include Bruce Birren and Chad Nusbaum from the Broad Institute, Matt Sachs at the Oregon Graduate Institute of Science and Technology, Chuck Staben at the University of Kentucky and Jak Kinsey at the Fungal Genetics Stock Center at the University of Kansas Medical Center. In addition, the project has a larger Advisory Board made up of a number of Neurospora researchers. Sponsors: The project has been funded by the National Science Foundation to sequence the N. crassa genome and make the information publicly available.
Proper citation: Neurospora crassa Database (RRID:SCR_001372)
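The overlap-based, conservative assembly described in this entry can be illustrated with a toy greedy merger. This is a simplified sketch and not the project's actual assembler: it ignores paired-read orientation and distance, sequencing errors, and reverse complements, and the `min_overlap` threshold and the example reads are assumptions chosen for illustration.

```python
def overlap_len(left, right, min_overlap):
    """Longest suffix of `left` that equals a prefix of `right`.

    Returns 0 when the best overlap is shorter than `min_overlap`.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def greedy_assemble(reads, min_overlap):
    """Repeatedly merge the pair of contigs with the longest overlap.

    A higher `min_overlap` is more conservative: fewer joins are made,
    so more (and shorter) contigs remain, but false joins are less likely.
    """
    contigs = list(reads)
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap_len(a, b, min_overlap)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            return contigs  # no pair overlaps enough; stop joining
        merged = contigs[i] + contigs[j][k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (i, j)]
        contigs.append(merged)


# Toy reads, invented for this example.
reads = ["ATGCGT", "CGTACC", "GGGTTT"]
relaxed = greedy_assemble(reads, min_overlap=3)
strict = greedy_assemble(reads, min_overlap=4)
```

With a 3-base threshold the first two reads are joined into one contig; with a 4-base threshold no join is trusted and all three reads stay separate, which mirrors the trade-off between missed joins and false joins described above.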
http://hscl.cimr.cam.ac.uk/bloodexpress/
A database of gene expression in mouse haematopoiesis, integrating 271 individual microarray experiments derived from 15 distinct studies done on most characterized mouse blood cell types. Gene expression information has been discretized to absent/present/unknown calls. It supports gene-centric searches to find out where a gene of interest is expressed, and what other genes follow the same (or a similar) pattern of expression. It also supports cell-centric searches to find out what genes are expressed in specific cell types/studies and not others.
Proper citation: BloodExpress (RRID:SCR_001142)
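The two search modes in this entry can be pictured as simple operations on discretized call vectors. The sketch below is only an illustration of the idea, not BloodExpress's implementation: the single-letter encoding (`P` present, `A` absent, `U` unknown), the gene names, and the cell type labels are all hypothetical.

```python
def pattern_distance(a, b):
    """Count cell types where two call vectors disagree, skipping unknowns."""
    return sum(1 for x, y in zip(a, b) if "U" not in (x, y) and x != y)


def similar_genes(gene, calls, max_mismatches=0):
    """Gene-centric search: genes whose pattern matches that of `gene`."""
    target = calls[gene]
    return sorted(
        other
        for other, vec in calls.items()
        if other != gene and pattern_distance(target, vec) <= max_mismatches
    )


def expressed_only_in(calls, cell_types, wanted):
    """Cell-centric search: genes present in `wanted` cell types, absent elsewhere."""
    wanted = set(wanted)
    hits = []
    for gene, vec in calls.items():
        if all(
            call == ("P" if cell in wanted else "A")
            for cell, call in zip(cell_types, vec)
        ):
            hits.append(gene)
    return sorted(hits)


# Hypothetical calls: one letter per cell type, in the order of `cell_types`.
cell_types = ["HSC", "B", "T"]
calls = {"g1": "PPA", "g2": "PPA", "g3": "PAA", "g4": "PUA"}

same_pattern = similar_genes("g1", calls)
near_pattern = similar_genes("g1", calls, max_mismatches=1)
hsc_b_only = expressed_only_in(calls, cell_types, ["HSC", "B"])
```

One design choice worth noting: unknown calls are ignored when comparing two genes but count against a gene in the cell-centric search, since an unknown call cannot confirm presence or absence.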
http://exon.cshl.org/cgi-bin/atprobe/atprobe.pl
Arabidopsis thaliana promoter binding element database that focuses on specific binding elements on known genes, found with experimental methods.
Proper citation: AtProbe (RRID:SCR_005412)
Database for identifying orthologous phenotypes (phenologs). Mapping between genotype and phenotype is often non-obvious, complicating prediction of genes underlying specific phenotypes. This problem can be addressed through comparative analyses of phenotypes. We define phenologs based upon overlapping sets of orthologous genes associated with each phenotype. Comparisons of >189,000 human, mouse, yeast, and worm gene-phenotype associations reveal many significant phenologs, including novel non-obvious human disease models. For example, phenologs suggest a yeast model for mammalian angiogenesis defects and an invertebrate model for vertebrate neural tube birth defects. Phenologs thus create a rich framework for comparing mutational phenotypes, identify adaptive reuse of gene systems, and suggest new disease genes. To search for phenologs, go to the basic search page and enter a list of genes in the box provided, using Entrez gene identifiers for mouse/human genes, locus ids for yeast (e.g., YHR200W), or sequence names for worm (e.g., B0205.3). It is expected that the genes in this list will all be associated with a particular system, trait, mutational phenotype, or disease. The search will return all identified model organism/human mutational phenotypes that show any overlap with the input set of genes, ranked according to their hypergeometric probability scores. Clicking on a particular phenolog will result in a list of genes associated with the phenotype, from which potential new candidate genes can be identified. Currently known phenotypes in the database are available from the link labeled "Find phenotypes", where the associated genes can be submitted as queries or, alternatively, searched directly from the link provided.
Proper citation: Phenologs (RRID:SCR_005529)
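The ranking by hypergeometric probability mentioned in this entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the Phenologs site's code: it assumes the score is the upper-tail probability of seeing at least the observed overlap by chance, that `n_orthologs` is the size of the shared ortholog pool, and the gene and phenotype names are placeholders.

```python
from math import comb


def hypergeom_tail(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / comb(population, draws)


def rank_phenotypes(query_genes, phenotype_to_genes, n_orthologs):
    """Rank phenotypes by how unlikely their overlap with the query is."""
    query = set(query_genes)
    results = []
    for name, genes in phenotype_to_genes.items():
        overlap = query & set(genes)
        if not overlap:
            continue  # only phenotypes with some overlap are returned
        p = hypergeom_tail(n_orthologs, len(set(genes)), len(query), len(overlap))
        results.append((name, sorted(overlap), p))
    return sorted(results, key=lambda r: r[2])  # smallest probability first


# Placeholder data: a pool of 10 orthologs and three phenotypes.
phenotypes = {
    "phenotype_a": ["g1", "g2", "g3"],
    "phenotype_b": ["g1", "g9"],
    "phenotype_c": ["g7"],
}
ranked = rank_phenotypes(["g1", "g2", "g3"], phenotypes, n_orthologs=10)
```

In this toy case `phenotype_a` ranks first because all three of its genes are in the query, `phenotype_b` follows with a single shared gene, and `phenotype_c` is omitted because it has no overlap.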
Database of the international consortium working together to mutate all protein-coding genes in the mouse using a combination of gene trapping and gene targeting in C57BL/6 mouse embryonic stem (ES) cells. Detailed information on targeted genes is available. The IKMC includes the following programs:
* Knockout Mouse Project (KOMP) (USA)
** CSD, a collaborative team at the Children's Hospital Oakland Research Institute (CHORI), the Wellcome Trust Sanger Institute and the University of California at Davis School of Veterinary Medicine, led by Pieter deJong, Ph.D., CHORI, along with K. C. Kent Lloyd, D.V.M., Ph.D., UC Davis; and Allan Bradley, Ph.D. FRS, and William Skarnes, Ph.D., at the Wellcome Trust Sanger Institute.
** Regeneron, a team at the VelociGene division of Regeneron Pharmaceuticals, Inc., led by David Valenzuela, Ph.D. and George D. Yancopoulos, M.D., Ph.D.
* European Conditional Mouse Mutagenesis Program (EUCOMM) (Europe)
* North American Conditional Mouse Mutagenesis Project (NorCOMM) (Canada)
* Texas A&M Institute for Genomic Medicine (TIGM) (USA)
Products (vectors, mice, ES cell lines) may be ordered from the above programs.
Proper citation: International Knockout Mouse Consortium (RRID:SCR_005574)
A publicly available database of transposed elements (TEs) located within protein-coding genes of 7 organisms: human, mouse, chicken, zebrafish, fruit fly, nematode and sea squirt. Using TranspoGene, the user can learn about many aspects of the effect these TEs have on their host genes, such as exonization events (including alternative splicing-related data); insertion of TEs into introns, exons, and promoters; the specific location of the TE within the gene; evolutionary divergence of the TE from its consensus sequence; and involvement in diseases. The TranspoGene database is searchable through its website, supports many kinds of searches and is available for download. TranspoGene contains information regarding the specific type and family of the TEs, genomic and mRNA location, sequence, supporting transcript accession and alignment to the TE consensus sequence. The database also contains host gene-specific data: gene name, genomic location, Swiss-Prot and RefSeq accessions, diseases associated with the gene and splicing pattern. The TranspoGene and microTranspoGene databases can be used by researchers interested in the effect of TE insertion on the eukaryotic transcriptome.
Proper citation: TranspoGene (RRID:SCR_005634)
http://www.gene-regulation.com/pub/databases.html#transfac
Manually curated database of eukaryotic transcription factors, their genomic binding sites and DNA binding profiles. Used to predict potential transcription factor binding sites.
Proper citation: TRANSFAC (RRID:SCR_005620)
http://www.dbs.ifi.lmu.de/~bundschu/LHGDN.html
A text-mining-derived database with a focus on extracting and classifying gene-disease associations with respect to several biomolecular conditions. It uses a machine-learning-based algorithm to extract semantic gene-disease relations from a textual source of interest. The semantic gene-disease relations were extracted with F-measures of 78. More specifically, the textual source utilized here originates from Entrez Gene's GeneRIF (Gene Reference Into Function) database (Mitchell, et al., 2003). LHGDN was created based on a GeneRIF version from March 31st, 2009, consisting of 414241 phrases. These phrases were further restricted to the organism Homo sapiens, which resulted in a total of 178004 phrases. We benchmark our approach on two different tasks. The first task is the identification of semantic relations between diseases and treatments. The available data set consists of manually annotated PubMed abstracts. The second task is the identification of relations between genes and diseases from a set of concise phrases, so-called GeneRIF (Gene Reference Into Function) phrases. In our experimental setting, we do not assume that the entities are given, as is often the case in previous relation extraction work. Rather the extraction of the entities is solved as a subproblem. Compared with other state-of-the-art approaches, we achieve very competitive results on both data sets. To demonstrate the scalability of our solution, we apply our approach to the complete human GeneRIF database. The resulting gene-disease network contains 34758 semantic associations between 4939 genes and 1745 diseases. The gene-disease network is publicly available as a machine-readable RDF graph. We extend the framework of Conditional Random Fields towards the annotation of semantic relations from text and apply it to the biomedical domain. Our approach is based on a rich set of textual features and achieves a performance that is competitive to leading approaches.
The model is quite general and can be extended to handle arbitrary biological entities and relation types. The resulting gene-disease network shows that the GeneRIF database provides a rich knowledge source for text mining.
Proper citation: Literature-derived human gene-disease network (RRID:SCR_005653)
A knowledgebase of Biochemically, Genetically and Genomically structured genome-scale metabolic network reconstructions. BiGG integrates several published genome-scale metabolic networks into one resource with standard nomenclature which allows components to be compared across different organisms. BiGG can be used to browse model content, visualize metabolic pathway maps, and export SBML files of the models for further analysis by external software packages. Users may follow links from BiGG to several external databases to obtain additional information on genes, proteins, reactions, metabolites and citations of interest.
Proper citation: BiGG Database (RRID:SCR_005809)
http://igdb.nsclc.ibms.sinica.edu.tw/
The IGDB.NSCLC database aims to facilitate and prioritize identified lung cancer genes and microRNAs for pathological and mechanistic studies of lung tumorigenesis and for developing new strategies for clinical interventions. Various lung cancer genomic datasets were integrated and curated to present (1) lung cancer genes with somatic mutations, experimental support, and statistical significance in association with clinicopathological features; (2) genomic alterations, including copy number alterations (CNA) detected by high-density SNP arrays, gain or loss regions detected by arrayed comparative genome hybridization (aCGH), and loss of heterozygosity (LOH) detected by microsatellite markers; and (3) aberrant expression of genes and microRNAs detected by various microarrays. The IGDB.NSCLC database provides user-friendly interfaces and search functions to display multiple layers of evidence for detecting lung cancer target genes and microRNAs, with emphasis on concordant alterations: (1) genes with altered expression located in the CNA regions; (2) microRNAs with altered expression located in the CNA regions; (3) somatic mutation genes located in the CNA regions; and (4) genes associated with clinicopathological features located in the CNA regions. These concordantly altered genes and miRNAs should be prioritized for further basic and clinical studies.
Proper citation: IGDB.NSCLC (RRID:SCR_006048)
http://www.informatics.jax.org
International database for the laboratory mouse. Data offered by The Jackson Laboratory include integrated genetic, genomic, and biological data. MGI creates and maintains an integrated representation of mouse genetic, genomic, expression, and phenotype data; develops reference data sets and consensus data views; synthesizes comparative genomic data between mouse and other mammals; maintains a set of links and collaborations with other bioinformatics resources; develops and supports analysis and data submission tools; and provides technical support for database users. Projects contributing to this resource are: Mouse Genome Database (MGD) Project, Gene Expression Database (GXD) Project, Mouse Tumor Biology (MTB) Database Project, Gene Ontology (GO) Project at MGI, and MouseCyc Project at MGI.
Proper citation: Mouse Genome Informatics (MGI) (RRID:SCR_006460)
Database and discovery platform containing publicly available collections of genes and variants associated to human diseases. Integrates data from curated repositories, GWAS catalogues, animal models and scientific literature.
Proper citation: DisGeNET (RRID:SCR_006178)
http://www.snpedia.com/index.php/SNPedia
Wiki investigating human genetics including information about the effects of variations in DNA, citing peer-reviewed scientific publications. It is used by Promethease to analyze and help explain your DNA. It is based on a wiki model in order to foster communication about genetic variation and to allow interested community members to help it evolve to become ever more relevant. As the cost of genotyping (and especially of fully determining your own genomic sequence) continues to drop, we'll all want to know more - a lot more - about the meaning of these DNA variations and SNPedia will be here to help. SNPedia has been launched to help realize the potential of the Human Genome Project to connect to our daily lives and well-being. For more information see the Wikipedia page, http://en.wikipedia.org/wiki/SNPedia
* Download URL: http://www.SNPedia.com/index.php/Bulk
* Web Service URL: http://bots.SNPedia.com/api.php
Proper citation: SNPedia (RRID:SCR_006125)
ProPortal is a database containing genomic, metagenomic, transcriptomic and field data for the marine cyanobacterium Prochlorococcus. Our goal is to provide a source of cross-referenced data across multiple scales of biological organization, from the genome to the ecosystem, embracing the full diversity of ecotypic variation within this microbial taxon, its sister group, Synechococcus, and the phage that infect them. The site currently contains the genomes of 13 Prochlorococcus strains, 11 Synechococcus strains and 28 cyanophage strains that infect one or both groups. Cyanobacterial and cyanophage genes are clustered into orthologous groups that can be accessed by keyword search or through a genome browser. Users can also identify orthologous gene clusters shared by cyanobacterial and cyanophage genomes. Gene expression data for Prochlorococcus ecotypes MED4 and MIT9313 allow users to identify genes that are up- or downregulated in response to environmental stressors. In addition, the transcriptome in synchronized cells grown on a 24-h light-dark cycle reveals the choreography of gene expression in cells in a "natural" state. Metagenomic sequences from the Global Ocean Survey from Prochlorococcus, Synechococcus and phage genomes are archived so users can examine the differences between populations from diverse habitats. Finally, an example of cyanobacterial population data from the field is included.
Proper citation: ProPortal (RRID:SCR_006112)