SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
http://www.genoscope.cns.fr/externe/tetraodon/
The initial objective of Genoscope was to compare the genomic sequences of this fish with those of humans to help in the annotation of human genes and to estimate their number. This strategy is based on the common genetic heritage of the vertebrates: from one species of vertebrate to another, even for those as far apart as a fish and a mammal, the same genes are present for the most part. In the case of the compact genome of Tetraodon, this common complement of genes is contained in a genome eight times smaller than that of humans. Although the length of the exons is similar in these two species, the size of the introns and the intergenic sequences is greatly reduced in this fish. Furthermore, these regions, in contrast to the exons, have diverged completely since the separation of the lineages leading to humans and Tetraodon. The Exofish method, developed at Genoscope, exploits this contrast such that the conserved regions which can be identified by comparing genomic sequences of the two species correspond only to coding regions. Using preliminary sequencing results of the genome of Tetraodon in the year 2000, Genoscope evaluated the number of human genes at about 30,000, whereas much higher estimates were current. The progress of the annotation of the human genome has since supported the Genoscope hypothesis, with values as low as 22,000 genes and a consensus of around 25,000 genes. The sequencing of the Tetraodon genome at a depth of about 8X, carried out as a collaboration between Genoscope and the Whitehead Institute Center for Genome Research (now the Broad Institute), was finished in 2002, with the production of an assembly covering 90% of the euchromatic region of the genome of the fish. This has permitted the application of Exofish at a larger scale in comparisons with the genome of humans, but also with those of the two other vertebrates sequenced at the time (Takifugu, a fish closely related to Tetraodon, and the mouse).
The conserved regions detected in this way have been integrated into the annotation procedure, along with other resources (cDNA sequences from Tetraodon and ab initio predictions). Of the 28,000 genes annotated, some families were examined in detail: selenoproteins, and Type 1 cytokines and their receptors. The comparison of the proteome of Tetraodon with those of mammals has revealed some interesting differences, such as a major diversification of some hormone systems and of the collagen molecules in the fish. A search for transposable elements in the genomic sequences of Tetraodon has also revealed a high diversity (75 types), which contrasts with their scarcity; the small size of the Tetraodon genome is due to the low abundance of these elements, of which some appear to still be active. Another factor in the compactness of the Tetraodon genome, which has been confirmed by annotation, is the reduction in intron size, which approaches a lower limit of 50-60 bp, and which preferentially affects certain genes. The availability of the sequences from the genomes of humans and mice on one hand, and Takifugu and Tetraodon on the other, provides new opportunities for the study of vertebrate evolution. We have shown that the level of neutral evolution is higher in fish than in mammals. The protein sequences of fish also diverge more quickly than those of mammals. A key mechanism in evolution is gene duplication, which we have studied by taking advantage of the anchoring of the majority of the sequences from the assembly on the chromosomes. The result of this study speaks strongly in favor of a whole genome duplication event, very early in the lineage of ray-finned fish (Actinopterygians). Even stronger evidence came from synteny studies between the genomes of humans and Tetraodon.
Using a high-resolution synteny map, we have reconstituted the genome of the vertebrate which predates this duplication - that is, the last common ancestor to all bony vertebrates (most of the vertebrates apart from cartilaginous fish and agnathans such as lampreys). This ancestral karyotype contains 12 chromosomes, and the 21 Tetraodon chromosomes derive from it by the whole genome duplication and a surprisingly small number of interchromosomal rearrangements. In contrast, exchanges between chromosomes have been much more frequent in the lineage that leads to humans. Sponsors: The project was supported by the Consortium National de Recherche en Genomique and the National Human Genome Research Institute.
Proper citation: Tetraodon Genome Browser (RRID:SCR_007079)
http://weizhong-lab.ucsd.edu/cd-hit/
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on February 28, 2023. Software program for clustering biological sequences, with applications such as making non-redundant databases, finding duplicates, identifying protein families, filtering sequence errors, and improving sequence assembly. It is very fast and can handle extremely large databases. CD-HIT significantly reduces the computational and manual effort in many sequence analysis tasks and aids in understanding the data structure and correcting the bias within a dataset. The CD-HIT package has CD-HIT, CD-HIT-2D, CD-HIT-EST, CD-HIT-EST-2D, CD-HIT-454, CD-HIT-PARA, PSI-CD-HIT, CD-HIT-OTU and over a dozen scripts.
* CD-HIT (CD-HIT-EST) clusters similar proteins (DNAs) into clusters that meet a user-defined similarity threshold.
* CD-HIT-2D (CD-HIT-EST-2D) compares 2 datasets and identifies the sequences in db2 that are similar to db1 above a threshold.
* CD-HIT-454 identifies natural and artificial duplicates from pyrosequencing reads.
* CD-HIT-OTU clusters rRNA tags into OTUs.
The usage of other programs and scripts can be found in the CD-HIT user's guide. CD-HIT was originally developed by Dr. Weizhong Li at Dr. Adam Godzik's Lab at the Burnham Institute (now Sanford-Burnham Medical Research Institute). THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: CD-HIT (RRID:SCR_007105)
http://hcv.lanl.gov/content/immuno/immuno-main.html
The HCV Immunology Database contains a curated inventory of immunological epitopes in HCV and their interaction with the immune system, with associated retrieval and analysis tools. The funding for the HCV database project has stopped, and this website and the HCV immunology database are no longer maintained. The site will stay up, but problems will not be fixed. The database was last updated in September 2007. The HIV immunology website contains the same tools, and may be usable for non-HCV-specific analyses. For new epitope information, users of this database can try the Immune Epitope Database (http://www.immuneepitope.org).
Proper citation: HCV Immunology Database (RRID:SCR_007086)
Software platform, general technologies and theoretical supports for computational biology with the grand aim to make precise whole cell simulation at the molecular level possible. Technologies include formalisms and techniques, including technologies to predict, obtain or estimate parameters such as reaction rates and concentrations of molecules in the cell. The E-Cell System is a software platform for modeling, simulation and analysis of complex, heterogeneous and multi-scale systems like the cell. The E-Cell Project is open to anyone who shares with us the view that developing cell simulation technology, and solving the various conceptual, computational and experimental problems that will continue to arise in pursuing it, may have a multitude of significant scientific, medical and engineering impacts on our society, even if that ultimate goal may not yet be within ten years' reach.
Proper citation: Electronic Cell Project (RRID:SCR_007381)
Functional genomic database for malaria parasites. Database for Plasmodium spp. Provides resource for data analysis and visualization at gene-by-gene or genome-wide scale. PlasmoDB 5.5 contains annotated genomes, evidence of transcription, proteomics evidence, protein function evidence, population biology and evolution data. Data can be queried by selecting from query grid or drop down menus. Results can be combined with each other on query history page. Search results can be downloaded with associated functional data and registered users can store their query history for future retrieval or analysis. Key community database for malaria researchers, intersecting many types of laboratory and computational data, aggregated by gene.
Proper citation: PlasmoDB (RRID:SCR_013331)
http://www.viprbrc.org/brc/home.do?decorator=vipr
Provides searchable public repository of genomic, proteomic and other research data for different strains of pathogenic viruses along with suite of tools for analyzing data. Data can be shared, aggregated, analyzed using ViPR tools, and downloaded for local analysis. ViPR is an NIAID-funded resource that supports research on viral pathogens in the NIAID Category A-C Priority Pathogen lists and those causing (re)emerging infectious diseases. It provides a dedicated gateway to SARS-CoV-2 data that integrates data from external sources (GenBank, UniProt, Immune Epitope Database, Protein Data Bank), direct submissions, analysis pipelines and expert curation, and provides a suite of bioinformatics analysis and visualization tools for virology research.
Proper citation: Virus Pathogen Resource (ViPR) (RRID:SCR_012983)
http://www.rcsb.org/#Category-welcome
Collection of structural data of biological macromolecules. Database of information about 3D structures of large biological molecules, including proteins and nucleic acids. Users can perform queries on data and analyze and visualize results.
Proper citation: Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) (RRID:SCR_012820)
Integrated database resource consisting of 16 main databases, broadly categorized into systems information, genomic information, and chemical information. In particular, gene catalogs in completely sequenced genomes are linked to higher-level systemic functions of cell, organism, and ecosystem. Analysis tools are also available. KEGG may be used as reference knowledge base for biological interpretation of large-scale datasets generated by sequencing and other high-throughput experimental technologies.
Proper citation: KEGG (RRID:SCR_012773)
A high-quality integrated knowledge resource specialized in the immunoglobulins (IG) or antibodies, T cell receptors (TR), major histocompatibility complex (MHC) of human and other vertebrate species, and in the immunoglobulin superfamily (IgSF), MHC superfamily (MhcSF) and related proteins of the immune system (RPI) of vertebrates and invertebrates, serving as the global reference in immunogenetics and immunoinformatics. IMGT provides common access to sequence, genome and structure immunogenetics data, based on the concepts of IMGT-ONTOLOGY and on the IMGT Scientific chart rules. IMGT works in close collaboration with EBI (Europe), DDBJ (Japan) and NCBI (USA). IMGT consists of sequence databases, a genome database, a structure database, and a monoclonal antibodies database, along with Web resources and interactive tools.
Proper citation: IMGT - the international ImMunoGeneTics information system (RRID:SCR_012780)
http://umber.sbs.man.ac.uk/dbbrowser/bioie/
BioIE is a rule-based system that extracts informative sentences relating to protein families, their structures, functions and diseases from the biomedical literature. Based on manual definition of templates and rules, it aims at precise sentence extraction rather than wide recall. After uploading source text or retrieving abstracts from MEDLINE, users can extract sentences based on predefined or user-defined template categories. BioIE also provides a brief insight into the syntactic and semantic context of the source text by looking at word, N-gram and MeSH-term distributions. Important applications of BioIE include, for example, the annotation of microarray data and of protein databases.
Proper citation: BioIE: Extracting Informative Sentences From the Biomedical Literature (RRID:SCR_013464)
https://www.encodeproject.org/
Consortium to build comprehensive parts list of functional elements in human genome. This includes elements that act at protein and RNA levels, and regulatory elements that control cells and circumstances in which gene is active. Data from 2012-present.
Proper citation: Encode (RRID:SCR_015482)
http://apps.cytoscape.org/apps/cluepedia
Data analysis software and search tool for new markers potentially associated with pathways. CluePedia calculates linear and non-linear statistical dependencies from experimental data and investigates interrelations within each pathway to reveal associations through gene/protein/miRNA enrichments.
Proper citation: CluePedia Cytoscape plugin (RRID:SCR_015784)
http://molevol.cmima.csic.es/castresana/Gblocks_server.html
Software that eliminates poorly aligned positions and divergent regions of a DNA or protein alignment so that it becomes more suitable for phylogenetic analysis. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: Gblocks (RRID:SCR_015945)
http://nematode.lab.nig.ac.jp/
Expression pattern map of the 100 Mb genome of the nematode Caenorhabditis elegans through EST analysis and systematic whole mount in situ hybridization. NEXTDB is the database to integrate all information from their expression pattern project and to make the data available to the scientific community. Information available in the current version is as follows:
* Map: Visual expression of the relationships among the cosmids, predicted genes and the cDNA clones.
* Image: In situ hybridization images that are arranged by their developmental stages.
* Sequence: Tag sequences of the cDNA clones are available.
* Homology: Results of BLASTX search are available.
Users of the data presented on the NEXTDB web pages should not publish the information without permission and appropriate acknowledgment. Methods are available for:
* In situ hybridization on whole mount embryos of C. elegans
* Protocols for large scale in situ hybridization on C. elegans larvae
Proper citation: NEXTDB (RRID:SCR_004480)
Open source environment for sharing, processing and analyzing stem cell data, bringing together stem cell data sets with tools for curation, dissemination and analysis. Standardization of the analytical approaches will enable researchers to directly compare and integrate their results with experiments and disease models in the Commons.
Key features of the Stem Cell Commons:
* Contains stem cell related experiments
* Includes microarray and Next-Generation Sequencing (NGS) data from human, mouse, rat and zebrafish
* Data from multiple cell types and disease models
* Carefully curated experimental metadata using controlled vocabularies
* Export in the Investigation-Study-Assay tabular format (ISA-Tab) that is used by over 30 organizations worldwide
* A community oriented resource with public data sets and freely available code in public code repositories such as GitHub
Currently in development:
* Development of Refinery, a novel analysis platform that links Commons data to the Galaxy analytical engine
* ChIP-seq analysis pipeline (additional pipelines in development)
* Integration of experimental metadata and data files with Galaxy to guide users to choose workflows, parameters, and data sources
Stem Cell Commons is based on open source software and is available for download and development.
Proper citation: Stem Cell Commons (RRID:SCR_004415)
http://www.ncbi.nlm.nih.gov/biosystems/
Database that provides access to biological systems and their component genes, proteins, and small molecules, as well as literature describing those biosystems and other related data throughout Entrez. A biosystem, or biological system, is a group of molecules that interact directly or indirectly, where the grouping is relevant to the characterization of living matter. BioSystem records list and categorize components, such as the genes, proteins, and small molecules involved in a biological system. The companion FLink tool, in turn, allows you to input a list of proteins, genes, or small molecules and retrieve a ranked list of biosystems. A number of databases provide diagrams showing the components and products of biological pathways along with corresponding annotations and links to literature. This database was developed as a complementary project to (1) serve as a centralized repository of data; (2) connect the biosystem records with associated literature, molecular, and chemical data throughout the Entrez system; and (3) facilitate computation on biosystems data. The NCBI BioSystems Database currently contains records from several source databases: KEGG, BioCyc (including its Tier 1 EcoCyc and MetaCyc databases, and its Tier 2 databases), Reactome, the National Cancer Institute's Pathway Interaction Database, WikiPathways, and Gene Ontology (GO). It includes several types of records such as pathways, structural complexes, and functional sets, and is designed to accommodate other record types, such as diseases, as data become available. Through these collaborations, the BioSystems database facilitates access to, and provides the ability to compute on, a wide range of biosystems data. If you are interested in depositing data into the BioSystems database, please contact them.
Proper citation: NCBI BioSystems Database (RRID:SCR_004690)
http://www.proconsortium.org/pro/
An ontological representation of protein-related entities by explicitly defining them and showing the relationships between them. Each PRO term represents a distinct class of entities (including specific modified forms, orthologous isoforms, and protein complexes) ranging from the taxon-neutral to the taxon-specific. The ontology has a meta-structure encompassing three areas: proteins based on evolutionary relatedness (ProEvo); protein forms produced from a given gene locus (ProForm); and protein-containing complexes (ProComp). NOTICE: The PRO ID format has changed from PRO: to PR: (e.g. PRO:000000563 is now PR:000000563).
Proper citation: PR (RRID:SCR_004964)
GenMAPP is a free computer application designed to visualize gene expression and other genomic data on maps representing biological pathways and groupings of genes. Integrated with GenMAPP are programs to perform a global analysis of gene expression or genomic data in the context of hundreds of pathway MAPPs and thousands of Gene Ontology Terms (MAPPFinder), import lists of genes/proteins to build new MAPPs (MAPPBuilder), and export archives of MAPPs and expression/genomic data to the web. The main features underlying GenMAPP are:
* Draw pathways with easy to use graphics tools
* Color genes on MAPP files based on user-imported genomic data
* Query data against MAPPs and the Gene Ontology
Enhanced features include the simultaneous view of multiple color sets, expanded species-specific gene databases and custom database options.
Proper citation: Gene Map Annotator and Pathway Profiler (RRID:SCR_005094)
https://www.rostlab.org/services/snpdbe/
A database to fill the annotation gap left by the high cost of experimental testing for functional significance of protein variants. It joins related bits of knowledge, currently distributed throughout various databases, into a consistent, easily accessible, and updatable resource. It currently covers over 155,000 protein sequences which come from more than 2,600 organisms. Overall more than one million single amino acid substitutions (SAASs) are referenced, consisting of natural variants, SAASs from mutagenesis experiments and sequencing conflicts. SNPdbe offers the following pieces of information (if available) on each SAAS:
* Experimentally derived functional and structural impact
* Predicted functional effect
* Associated disease
* Average heterozygosity
* Experimental evidence of the nsSNP
* Evolutionary conservation of wildtype and mutant amino acid
* Link-outs to external databases
A convenient web interface to query SAASs on the following levels is offered:
* Protein and gene identifiers and keywords
* Disease keywords
* Protein sequence on different sequence identity thresholds
* Variant identifier (dbSNP rs, SwissVar, PMD) or specific mutant like XposY and specified sequence
Users can also submit protein sequences along with experimentally substantiated mutations, both to have their functional effect predicted and for inclusion in the database.
Proper citation: SNPdbe (RRID:SCR_005190)
http://bioplex.hms.harvard.edu/
Database of cell lines, each expressing a tagged version of a protein from the ORFeome collection. The overarching project goal is to determine protein interactions for every member of the collection.
Proper citation: BioPlex (RRID:SCR_016144)