SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
https://confluence.crbs.ucsd.edu/display/NIF/StemCellInfo
Data tables providing an overview of information about stem cells that have been derived from mice and humans. The tables summarize published research that characterizes cells that are capable of developing into cells of multiple germ layers (i.e., multipotent or pluripotent) or that can generate the differentiated cell types of another tissue (i.e., plasticity) such as a bone marrow cell becoming a neuronal cell. The tables do not include information about cells considered progenitor or precursor cells or those that can proliferate without the demonstrated ability to generate cell types of other tissues. The tables list the tissue from which the cells were derived, the types of cells that developed, the conditions under which differentiation occurred, the methods by which the cells were characterized, and the primary references for the information.
Proper citation: National Institutes of Health Stem Cell Tables (RRID:SCR_008359)
https://brads.nichd.nih.gov/Home/
Access to data from the Division of Intramural Population Health Research (DIPHR) of the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) from completed studies, including biospecimens and ancillary data.
Proper citation: Biospecimen Repository Access and Data Sharing (RRID:SCR_017383)
Collection of reference datasets for human immunology, derived from control subjects in the NIAID ImmPort database. Available data include flow cytometry, CyTOF, multiplex ELISA, gene expression, HAI titers, clinical lab tests, HLA type, and others.
Proper citation: The 10000 Immunomes (RRID:SCR_016624)
https://www.grnpedia.org/trrust/
TRRUST is a reference database of human transcriptional regulatory interactions. TRRUST v2 is a manually curated, expanded reference database of human and mouse transcriptional regulatory interactions.
Proper citation: Transcriptional Regulatory Relationships Unrevealed by Sentence based Text mining database (RRID:SCR_022554)
https://www.vet.k-state.edu/research/docs/BRITE-application.pdf
The BRITE Veterinary Student Program provides DVM students interested in research with a subsidized, in-depth mentored research experience. The opportunity can be used to gain research experience, to obtain an MS, or to jump-start a DVM/PhD program. The BRITE veterinary student program is designed to expose DVM students to hypothesis-driven research activities, methodologies involved in design and execution of laboratory experiments and ethical issues pertinent to biomedical research, at a formative stage of their veterinary education. BRITE veterinary students are given a unique opportunity to utilize the rigorous didactic basic science training obtained during the first two years of the professional curriculum in pursuit of a research problem relevant to human and animal health. Sponsors: The program is funded by Kansas State University.
Proper citation: Basic Research Immersion Training Experience Veterinary Student Program (RRID:SCR_008305)
http://www.vetmed.wisc.edu/ms-phd/
The Comparative Biomedical Sciences Graduate Degree program provides exceptional graduate research training in core areas of animal and human health including genomics, immunology, molecular and cellular biology, physiology, infectious disease, neuroscience, pharmacology and toxicology, and oncology. Seventy-five faculty members from a diverse range of UW departments, including Bacteriology, Biochemistry, Medical Microbiology and Immunology, Medicine, Oncology, Pathology, and Radiology, in addition to the 4 departments of the School of Veterinary Medicine, are trainers in the program. These internationally recognized professors, as well as the integrative nature of our program, provide outstanding and unique research opportunities for our students. Because the University of Wisconsin is consistently ranked as one of the top 10 graduate institutions in the nation, the strength of our program is due not only to the superb research and teaching of our faculty but also to the University as a whole. Approximately 55 students, most of whom are Ph.D. candidates, are currently enrolled in the program. Research strategies and academic curricula are tailored to the specific needs of each individual student. Graduates from our program are highly successful in the biotechnology industry and at top-ranked research institutions in the U.S. and abroad. The Comparative Biomedical Sciences Graduate Program offers a diverse range of research opportunities in multiple fields of study. Sponsors: CBMS is supported by the University of Wisconsin.
Proper citation: Comparative Biomedical Sciences Graduate Program (RRID:SCR_008304)
http://amazonia.montp.inserm.fr/
A web interface and associated tools for easy query of public human transcriptome data by keyword, through thematic pages with list annotations. Amazonia provides a thematic entry to public transcriptomes: users may, for instance, query a gene on a Stem Cells page, where they will see the expression of their favorite gene across selected microarray experiments related to stem cell biology. This selection of samples can be customized at will among the 6331 samples currently present in the database. Every transcriptome study results in the identification of lists of genes relevant to a given biological condition. In order to include this valuable information in any new query in the Amazonia database, the database indicates for each gene the lists in which it is included. This is a straightforward and efficient way to synthesize hundreds of microarray publications. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: AmaZonia: Explore the Jungle of Microarrays Results (RRID:SCR_008405)
http://www.ebi.ac.uk/asd/altextron/indexhtml
THIS RESOURCE IS NO LONGER IN SERVICE. A computer-generated, high-quality dataset of human transcript-confirmed constitutive and alternative exons and introns. The alternative events have been delineated and annotated with various characterizations. AltExtron is the prototype database for the production version AltSplice. AltExtron is more geared towards investigating various aspects of the methodologies used, and focuses in general on the biology behind alternative splicing. The complete data used in this work are available for download in several flat files, containing human genes, introns, exons, isoform events, human-mouse comparisons, and additional information on GC-AG introns. Two versions of AltExtron data are available: one as a prototype (for human) and another as the latest build (for human, Drosophila, mouse, and others) based on EMBL/GenBank (Feb 2003).
Proper citation: AltExtron Database (RRID:SCR_008404)
THIS RESOURCE IS NO LONGER IN SERVICE; it has been replaced by the Monarch Initiative. LAMHDI, the initiative to Link Animal Models to Human DIsease, is designed to accelerate the research process by providing biomedical researchers with a simple, comprehensive Web-based resource to find the best animal model for their research. LAMHDI is a free, Web-based resource to help researchers bridge the gap between bench testing and human trials. It provides a free, unbiased resource that enables scientists to quickly find the best animal models for their research studies. LAMHDI includes mouse data from MGI, the Mouse Genome Informatics website; zebrafish data from ZFIN, the Zebrafish Model Organism Database; rat data from RGD, the Rat Genome Database; yeast data from SGD, the Saccharomyces Genome Database; and fly data from FlyBase. LAMHDI.org is operational today, and data is added regularly. Enhancements are planned to let researchers contribute their knowledge of the animal models available through LAMHDI. The LAMHDI goal is to allow researchers to share information about and access to animal models so they can refine research and testing, and reduce or replace the use of animal models where possible. LAMHDI Database Search: LAMHDI brings together scientifically validated information from various sources to create a composite multi-species database of animal models of human disease. To do this, the LAMHDI database is prepared from a variety of sources. The LAMHDI team takes publicly available data from OMIM, NCBI's Entrez Gene database, Homologene, and WikiPathways, and builds a mathematical graph (think of it as a map or a web) that links these data together. OMIM is used to link human diseases with specific human genes, and Entrez provides universal identifiers for each of those genes.
Human genes are linked to their counterpart genes in other species with Homologene, and those genes are linked to other genes tentatively or authoritatively using the data in WikiPathways. This preparatory work gives LAMHDI a web of human diseases linked to specific human genes, orthologous human genes, homologous genes in other species, and both human and non-human genes involved in specific metabolic pathways associated with those diseases. LAMHDI includes model data that partners provide directly from their data structures. For instance, MGI provides information about mouse models, including a disease for each model, as well as some genetic information (the ID of the model, in fact, identifies one or more genes). ZFIN provides genetic information for each zebrafish model, but no diseases, so zebrafish models are integrated by using the genes as the glue. For instance, a zebrafish model built to feature the zebrafish PKD2 gene would plug into the larger disease-gene map at the node representing the zebrafish PKD2 gene, which is connected to the node representing the human PKD2 gene, which in turn is connected to the node representing the human disease known as polycystic kidney disease. (Some of the partner data LAMHDI receives can even extend the base map. MGI provides a disease for every model, and in some cases this allows the creation of a disease-to-gene relationship in the LAMHDI database that might not already be documented in the OMIM dataset.) With curatorial and model information in hand, LAMHDI runs a lengthy automated process that exhaustively searches for every possible path between each model and each disease in the data, up to a set number of hops, producing for each disease-to-model pair a set of links from the disease to the model. The algorithm avoids circular paths and paths that include more than one disease anywhere in the middle of the path. 
At the end of this phase, LAMHDI has a comprehensive set of paths representing all the disease-to-model relationships in the data, varying in length from one hop to many hops. Each disease-to-model path is essentially a string of nodes in the data, where each node represents a disease, a gene, a linkage between genes (an orthologue, a homologue, or a pathway connection, referred to as a gene cluster or association), or a model. Each node has a human-friendly label, a set of terms and keywords, and - in most cases - a URL linking the node to the data source where it originated. When a researcher submits a search on the LAMHDI website, LAMHDI searches for the user's search terms in its precomputed list of all known disease-to-model paths. It looks for the terms not only in the disease and model nodes, but also in every node along each path. The complete set of hits may include multiple paths between any given disease-to-model pair of endpoints. Each of these disease-to-model pair sets is ordered by the number of hops it involves, and the one involving the fewest hops is chosen to represent its respective disease-to-model pair in the search results presented to the user. Results are sorted by scores that represent their matches. The number of hops is one barometer of the strength of the evidence linking the model and the disease: fewer hops indicate the relationship is stronger; more hops indicate it may be weaker. This indicator works best for comparing models from a single partner dataset: MGI explicitly identifies a disease for each mouse model, so there can be disease-to-model hits for mice that involve just one hop. Because ZFIN does not explicitly identify a disease for each model, no zebrafish model will involve fewer than four hops to the nearest disease, from the zebrafish model to a zebrafish gene to a gene cluster to a human gene to a human disease.
Proper citation: LAMHDI: The Initiative to Link Animal Models to Human DIsease (RRID:SCR_008643)
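The path search described in the LAMHDI entry can be illustrated with a small sketch. This is not LAMHDI's actual code: the toy graph, the node names, the `KIND` labels, and the choice to stop each path at the first model node are all assumptions made for illustration. It enumerates paths from a disease to a model up to a hop limit, skips circular paths and paths with a second disease in the middle, and keeps the fewest-hop path per disease-to-model pair.

```python
from collections import defaultdict

# Hypothetical toy graph; node names and kinds are illustrative only.
KIND = {
    "disease:PKD": "disease",
    "gene:human_PKD2": "gene",
    "cluster:PKD2_orthologs": "cluster",
    "gene:zebrafish_pkd2": "gene",
    "model:zebrafish_pkd2_line": "model",
    "model:mouse_Pkd2_line": "model",
}
EDGES = [
    ("disease:PKD", "gene:human_PKD2"),
    ("gene:human_PKD2", "cluster:PKD2_orthologs"),
    ("cluster:PKD2_orthologs", "gene:zebrafish_pkd2"),
    ("gene:zebrafish_pkd2", "model:zebrafish_pkd2_line"),
    ("disease:PKD", "model:mouse_Pkd2_line"),  # direct disease annotation (MGI-style)
]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of node pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def disease_to_model_paths(adj, kind, max_hops):
    """Enumerate simple disease-to-model paths of at most max_hops edges."""
    paths = []

    def walk(path):
        node = path[-1]
        if kind[node] == "model":
            paths.append(tuple(path))
            return  # assumption: a path ends at the first model it reaches
        if len(path) - 1 == max_hops:
            return  # hop limit reached without finding a model
        for nxt in sorted(adj[node]):
            if nxt in path:
                continue  # avoid circular paths
            if kind[nxt] == "disease":
                continue  # avoid a second disease in the middle of the path
            walk(path + [nxt])

    for start in sorted(n for n in kind if kind[n] == "disease"):
        walk([start])
    return paths


def shortest_per_pair(paths):
    """Keep the fewest-hop path for each (disease, model) pair."""
    best = {}
    for p in paths:
        key = (p[0], p[-1])
        if key not in best or len(p) < len(best[key]):
            best[key] = p
    return best


adj = build_adjacency(EDGES)
for (disease, model), path in shortest_per_pair(
    disease_to_model_paths(adj, KIND, max_hops=4)
).items():
    print(disease, "->", model, "hops:", len(path) - 1)
```

In this toy graph the mouse model sits one hop from the disease and the zebrafish model four hops away, which mirrors the hop counts the description gives for MGI and ZFIN models.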
http://alizadehlab.stanford.edu/
An open-source project for the Mouse Exonic Evidence-Based Oligonucleotide Chip (MEEBOChip); the developers are in the process of building the human counterpart, HEEBOChip. The set of 70mers for MEEBOChip is already available from Illumina, Inc., with synthesis of HEEBOChip 70mers in progress. Both arrays are based on a novel selection of exonic long-oligonucleotides (70-mers) from a genomic annotation of the corresponding complete genome sequences, using a transcriptome-based annotation of exon structure for each genomic locus. Using a combination of existing and custom-tailored tools and datasets (including millions of mRNA and EST sequences), they built and performed a systematic examination of transcript-supported exon structure for each genomic locus at the base-pair level (i.e., exonic evidence). This strategy allowed them to select both constitutive and in many cases alternative exons for nearly every gene in the corresponding genome (e.g., protocadherin locus), allowing an unprecedented exploration of human and mouse biology. Furthermore, they used experimentally derived data to hone the selection of these 70mers, helping maximize their performance under typical fluorescent labeling and hybridization conditions. Specifically, they applied and refined the ArrayOligoSelector algorithm from Joe DeRisi's laboratory to select 70mers, considering not only their uniqueness (i.e., hybridization specificity) within the context of the entire genome, but also the need to overcome the known biases of labeling and hybridization methods (e.g., 3'-biased reverse transcription and in vitro transcription reactions).
Proper citation: Alizadehlab: MeeboChip and HeeboChip Open Source Project (RRID:SCR_008384)
http://www.molecularbrain.org/
MolecularBrain is an attempt to collect, collate, analyze and present the microarray-derived gene expression data from various brain regions side by side. The transcription profile of any gene in Mouse (online) and Human Brain (not yet) can be accessed as a histogram along with links to access various aspects of that gene. The expression levels were calculated from microarray data deposited at GEO (Gene Expression Omnibus). The MolecularBrain database can be searched using the built-in search tool with the terms Entrez GeneID, gene symbol, synonym or description. Gene information along with expression values can also be accessed from the alphabetical list of gene symbols in the footer. The protocol and GEO sample information are available.
Proper citation: Molecular Brain: Transcription Profiles of Mouse and Human Brains (RRID:SCR_008689)
http://www.cmbi.ru.nl/GeneSeeker/
The GeneSeeker allows you to search across different databases simultaneously, given a known human genetic location and expression/phenotypic pattern. The GeneSeeker returns any gene names found that are located at the specified location and expressed in the specified tissue. To search for more than one expression location in a single search, enter them in the textbox for the expression location and separate them with logical operators (and, or, not). You can specify as many tissues as you want; the program starts 20 queries simultaneously and then waits for a query to finish before starting another, to keep server loads to a minimum. You can also search only for expression: leave the cytogenetic location fields blank and run the query. If you only want to look for one cytogenetic location, fill in only the first location field, and the GeneSeeker will search with only this one. Housekeeping genes found in Swiss-Prot can be excluded, or genes that are to be excluded can be specified. Human chromosome localizations are translated with an Oxford grid to mouse chromosome localizations, and then submitted to the MGD. Sponsors: GeneSeeker is a service provided by the Centre for Molecular and Biomolecular Informatics (CMBI).
Proper citation: GeneSeeker (RRID:SCR_008347)
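The query throttling mentioned in the GeneSeeker entry (20 queries at a time, with a new one started only when a running one finishes) can be sketched with a bounded worker pool. This is only an illustration of the idea, not GeneSeeker's implementation: `fake_query`, `LoadTracker`, and `run_queries` are hypothetical names, and the remote database call is simulated with a short sleep.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 20  # from the description: 20 queries run at once


class LoadTracker:
    """Hypothetical helper that records how many queries run at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.active -= 1
        return False


def fake_query(tissue, tracker):
    """Hypothetical stand-in for one remote database query."""
    with tracker:
        time.sleep(0.01)  # simulate network latency
        return f"genes expressed in {tissue}"


def run_queries(tissues, tracker):
    """Run one query per tissue with at most MAX_PARALLEL in flight."""
    # The pool starts a queued query only when one of its workers becomes free.
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        return list(pool.map(lambda t: fake_query(t, tracker), tissues))


tracker = LoadTracker()
results = run_queries([f"tissue{i}" for i in range(50)], tracker)
print(len(results), "queries finished; peak concurrency:", tracker.peak)
```

A fixed-size pool is one simple way to cap server load; a semaphore around each request would achieve the same bound.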
http://mips.gsf.de/services/genomes/uwe25/
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 15, 2013. This is the official database of the environmental chlamydia genome project. This resource provides access to finished sequence for Parachlamydia-related symbiont UWE25 and to a wide range of manual annotations, automatic analyses and derived datasets. Functional classification and description have been manually annotated according to the Annotation guidelines. Chlamydiae are the major cause of preventable blindness and sexually transmitted disease. Genome analysis of a chlamydia-related symbiont of free-living amoebae revealed that it is twice as large as any of the pathogenic chlamydiae and had few signs of recent lateral gene acquisition. We showed that about 700 million years ago the last common ancestor of pathogenic and symbiotic chlamydiae was already adapted to intracellular survival in early eukaryotes and contained many virulence factors found in modern pathogenic chlamydiae, including a type III secretion system. Ancient chlamydiae appear to be the originators of mechanisms for the exploitation of eukaryotic cells. Environmental chlamydiae have recently been recognized as obligate endosymbionts of free-living amoebae and have been implicated as potential human pathogens. Environmental chlamydiae form a deep-branching evolutionary lineage within the medically important order Chlamydiales. Despite their high diversity and ubiquitous distribution in clinical and environmental samples, only limited information about the genetics and ecology of these microorganisms is available. The Parachlamydia-related Acanthamoeba symbiont UWE25 was therefore selected as a representative environmental chlamydia strain for whole genome sequencing. Comparative genome analysis was performed using PEDANT and SIMAP. Sponsors: The environmental chlamydia genome project was funded by the bmb+f (German Federal Ministry of Education and Research) and is part of the Competence Network PathoGenoMiK.
Proper citation: Protochlamydia amoebophila UWE25 (RRID:SCR_008222)
http://www.phac-aspc.gc.ca/msds-ftss/
Material Safety Data Sheets for chemical products are available to laboratory workers for most chemicals and reagents. However, because many laboratory workers, whether in research, public health, teaching, etc., are exposed not only to chemicals but to infectious substances as well, there was a large gap in the readily available safety literature for employees. These MSDS are produced for personnel working in the life sciences as quick safety reference material relating to infectious micro-organisms. The MSDS are organized to contain health hazard information such as infectious dose, viability (including decontamination), medical information, laboratory hazard, recommended precautions, handling information and spill procedures. The intent of these documents is to provide a safety resource for laboratory personnel working with these infectious substances. Because these workers are usually working in a scientific setting and are potentially exposed to much higher concentrations of these human pathogens than the general public, the terminology in these MSDS is technical and detailed, containing information that is relevant specifically to the laboratory setting. It is hoped that, along with good laboratory practices, these MSDS will help provide a safer, healthier environment for everyone working with infectious substances. The MSDS are run by the Public Health Agency of Canada. The Public Health Agency of Canada (PHAC) is the main Government of Canada agency responsible for public health in Canada. PHAC's primary goal is to strengthen Canada's capacity to protect and improve the health of Canadians and to help reduce pressures on the health-care system. To do this, the Agency is working to build an effective public health system that enables Canadians to achieve better health and well-being in their daily lives by promoting good health, helping prevent and control chronic diseases and injury, and protecting Canadians from infectious diseases and other threats to their health.
PHAC is also committed to reducing health disparities between the most advantaged and disadvantaged Canadians. Because public health is a shared responsibility, the Public Health Agency of Canada works in close collaboration with all levels of government (provincial, territorial and municipal) to build on each other's skills and strengths. The Agency also works closely with non-government organizations, including civil society and business, and other countries and international organizations like the World Health Organization (WHO) to share knowledge, expertise and experiences.
Proper citation: Material Safety Data Sheets for Infectious Substances of Canada (RRID:SCR_013003)
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on August 19, 2019. It hosts records of currently available essential genes among a wide range of organisms. For prokaryotes, DEG contains essential genes in more than 10 bacteria, such as E. coli, B. subtilis, H. pylori, S. pneumoniae, M. genitalium and H. influenzae, whereas for eukaryotes, DEG contains those in yeast, humans, mice, worms, fruit flies, zebrafish and the plant A. thaliana. Users can BLAST query sequences against DEG, and can also search for essential genes by their functions and names. Essential gene products comprise excellent targets for antibacterial drugs. Essential genes in a bacterium constitute a minimal genome, forming a set of functional modules, which play key roles in the emerging field of synthetic biology.
Proper citation: DEG - Database of Essential Genes (RRID:SCR_012929)
PhenomicDB is a multi-organism phenotype-genotype database including human, mouse, fruit fly, C. elegans, and other model organisms. The inclusion of gene indices (NCBI Gene) and orthologs (same gene in different organisms) from HomoloGene allows comparison of the phenotypes of a given gene across many organisms simultaneously. PhenomicDB contains data from publicly available primary databases: FlyBase, Flyrnai.org, WormBase, Phenobank, CYGD, MatDB, OMIM, MGI, ZFIN, SGD, DictyBase, NCBI Gene, and HomoloGene. We brought this wealth of data into a single integrated resource by coarse-grained semantic mapping of the phenotypic data fields, by including common gene indexes (NCBI Gene), and by the use of associated orthology relationships (HomoloGene). PhenomicDB is intended as a first step towards comparative phenomics and will improve the understanding of gene functions by combining the knowledge about phenotypes from several organisms. It is not intended to compete with the much more dedicated primary source databases but tries to compensate for its partial loss of depth by linking back to the primary sources. The basic functional concept of PhenomicDB is an integrated meta-search engine for phenotypes. Users should be aware that comparison of genotypes or even phenotypes between organisms as different as yeast and man can face serious scientific hurdles. Nevertheless, finding that the phenotype of a given mouse gene is described as "similar to psoriasis" and at the same time that the human ortholog has been described as a gene causing skin defects can lead to novel and interesting hypotheses. Similarly, a gene involved in cancer in mammalian organisms could show a proliferation phenotype in a lower organism such as yeast and thus give further insights to a researcher.
Proper citation: PhenomicDB (RRID:SCR_013051)
A database of phylogenetic trees of animal genes. It aims at developing a curated resource that gives reliable information about ortholog and paralog assignments, and the evolutionary history of various gene families. TreeFam defines a gene family as a group of genes that evolved after the speciation of single-metazoan animals. It also tries to include outgroup genes like yeast (S. cerevisiae and S. pombe) and plant (A. thaliana) to reveal these distant members. TreeFam is also an ortholog database. Unlike other pairwise alignment based ones, TreeFam infers orthologs by means of gene trees. It fits a gene tree into the universal species tree and finds historical duplication, speciation and loss events. TreeFam uses this information to evaluate tree building, guide manual curation, and infer complex ortholog and paralog relations. The basic elements of TreeFam are gene families that can be divided into two parts: TreeFam-A and TreeFam-B families. TreeFam-B families are automatically created. They might contain errors given complex phylogenies. TreeFam-A families are manually curated from TreeFam-B ones. Family names and node names are assigned at the same time. The ultimate goal of TreeFam is to present a curated resource for all the families. Keywords: phylogenetic tree, animal, vertebrate, invertebrate, gene, ortholog, paralog, evolutionary history, gene families, single-metazoan animals, outgroup genes like yeast (S. cerevisiae and S. pombe), plant (A. thaliana), historical duplications, speciations, losses, Human, Genome, comparative genomics
Proper citation: Tree families database (RRID:SCR_013401)
THIS RESOURCE IS NO LONGER IN SERVICE, documented on August 16, 2019. The Fugu genome is among the smallest vertebrate genomes and has proved to be a valuable reference genome for identifying genes and other functional elements such as regulatory elements in the human and other vertebrate genomes, and for understanding the structure and evolution of vertebrate genomes. This site presents version 4 of the Fugu genome, released in October 2004 by the International Fugu Genome Consortium. Fugu rubripes has a very compact genome, with less than 15% consisting of dispersed repetitive sequence, which makes it ideal for gene discovery. A draft sequence of the fugu genome was determined by the International Fugu Genome Consortium in 2002 using the "whole-genome shotgun" sequencing strategy. Fugu is the second vertebrate genome to be sequenced, the first being the human genome. This webpage presents the annotation made on the fourth assembly by the IMCB team using the Ensembl annotation pipeline. We are continuing with the gap filling work and linking of the scaffolds to obtain super-contigs.
Proper citation: Fugu Genome Project (RRID:SCR_013014)
http://mips.gsf.de/genre/proj/ustilago/
The MIPS Ustilago maydis Genome Database aims to present information on the molecular structure and functional network of the entirely sequenced, filamentous fungus Ustilago maydis. The underlying sequence is the initial release of the high quality draft sequence of the Broad Institute. The goal of the MIPS database is to provide a comprehensive genome database in the Genome Research Environment in parallel with other fungal genomes to enable in-depth fungal comparative analysis. The specific aims are to:
1. Generate and assemble Whole Genome Shotgun sequence reads yielding 10X coverage of the U. maydis genome
2. Integrate the genomic sequence assembly with physical maps generated by Bayer CropScience
3. Perform automated annotation of the sequence assembly
4. Align the strain 521 assembly with the FB1 assembly provided by Exelixis
5. Release the sequence assembly and results of our annotation and analysis to the public
Ustilago maydis is a basidiomycete fungal pathogen of maize and teosinte. The genome size is approximately 20 Mb. The fungus induces tumors on host plants and forms masses of diploid teliospores. These spores germinate and form haploid meiotic products that can be propagated in culture as yeast-like cells. Haploid strains of opposite mating type fuse and form a filamentous, dikaryotic cell type that invades plant tissue to reinitiate infection. Ustilago maydis is an important model system for studying pathogen-host interactions and has been studied for more than 100 years by plant pathologists. Molecular genetic research with U. maydis focuses on recombination, the role of mating in pathogenesis, and signaling pathways that influence virulence. Recently, the fungus has emerged as an excellent experimental model for the molecular genetic analysis of phytopathogenesis, particularly in the characterization of infection-specific morphogenesis in response to signals from host plants.
Ustilago maydis also serves as an important model for other basidiomycete plant pathogens that are more difficult to work with in the laboratory, such as the rust and bunt fungi. Genomic sequence of U. maydis will also be valuable for comparative analysis of other fungal genomes, especially with respect to understanding the host range of fungal phytopathogens. The analysis of U. maydis would provide a framework for studying the hundreds of other Ustilago species that attack important crops, such as barley, wheat, sorghum, and sugarcane. Comparisons would also be possible with other basidiomycete fungi, such as the important human pathogen C. neoformans. Commercially, U. maydis is an excellent model for the discovery of antifungal drugs. In addition, maize tumors caused by U. maydis are prized in Hispanic cuisine and there is interest in improving commercial production. The complete putative gene set of the Broad Institute's second release is loaded into the database; in addition, all deviating putative genes from a putative gene set produced by MIPS with different gene prediction parameters are also loaded. The complete dataset will then be analysed, and gene predictions will be manually corrected based on combined information derived from different gene prediction algorithms and, more importantly, protein and EST comparisons. Gene prediction will be restricted to ORFs larger than 50 codons; smaller ORFs will be included only if similarities to other proteins or EST matches confirm their existence or if a coding region was postulated by all prediction programs used. The resulting proteins will be annotated. They will be classified according to the MIPS classification catalogue and will receive appropriate descriptions. All proteins with a known, characterized homolog will be automatically assigned to functional categories using the MIPS functional catalog. All extracted proteins are in addition automatically analysed and annotated by the PEDANT suite.
Proper citation: MIPS Ustilago maydis Database (RRID:SCR_007563)
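The ORF inclusion rule stated in the MIPS Ustilago maydis entry reads as a small decision function. The sketch below is a hedged illustration, not MIPS code: the `OrfCandidate` fields, the predictor names, and the `keep_orf` helper are hypothetical, and only the 50-codon threshold and the three rescue conditions come from the description.

```python
from dataclasses import dataclass

MIN_CODONS = 50  # from the description: ORFs larger than 50 codons are kept


@dataclass
class OrfCandidate:
    """Hypothetical record for one predicted open reading frame."""
    name: str
    codons: int
    has_protein_similarity: bool = False
    has_est_match: bool = False
    predicted_by: frozenset = frozenset()  # names of programs that called it


def keep_orf(orf, all_predictors):
    """Apply the inclusion rule: long ORFs pass, short ones need support."""
    if orf.codons > MIN_CODONS:
        return True
    # Short ORFs are rescued by protein similarity or an EST match ...
    if orf.has_protein_similarity or orf.has_est_match:
        return True
    # ... or by agreement among every prediction program used.
    return bool(all_predictors) and orf.predicted_by >= all_predictors


predictors = frozenset({"predictorA", "predictorB"})  # hypothetical program names
candidates = [
    OrfCandidate("orf1", codons=120),
    OrfCandidate("orf2", codons=40),
    OrfCandidate("orf3", codons=40, has_est_match=True),
    OrfCandidate("orf4", codons=40, predicted_by=predictors),
    OrfCandidate("orf5", codons=40, predicted_by=frozenset({"predictorA"})),
]
for orf in candidates:
    print(orf.name, "kept" if keep_orf(orf, predictors) else "dropped")
```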
http://projects.tcag.ca/humandup/
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 17, 2013. It contains information about segmental duplications in the human genome. The criteria used to identify regions of segmental duplication are: Sequence identity of at least 90, Sequence length of at least 5 kb, Not be entirely composed of repetitive elements. Background Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5 of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. Results Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53) of sequence contained segmental duplications, each with size equal or more than 5 kb and 90 identity. We have also detected that 38.9 Mb (1.28) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. Conclusion Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve. The segmental duplication data and summary statistics are available for download. 
Data for the human genome (based on the May 2004 Human Genome Assembly, hg17): duplication relationships visualized in GBrowse; duplicon pair relationships (GFF); genes within duplication regions (HTML); genome duplication content (MS Excel). The segmental duplication data can be visualized in a genome browser in the GBrowse section. Selected human genome annotation tracks (except the segmental duplication track) have also been obtained from UCSC and loaded into the genome browser. Detailed information (e.g. overlapping genes, overlapping clones, detailed alignment) can be obtained by clicking on a duplication cluster in GBrowse. Both keyword search and BLAT search are available. Analyses based on previous human genome assemblies can be found in the Previous Analyses section. Acknowledgments: We thank The Centre for Applied Genomics at the Hospital for Sick Children (HSC) as well as collaborators worldwide. Supported by Genome Canada, the Howard Hughes Medical Institute International Scholar Program (to S.W.S.), and the HSC Foundation.
Proper citation: Human Genome Segmental Duplication Database (RRID:SCR_007728)
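The three criteria listed for this resource, and the summary figures quoted from the June 2002 assembly, can be checked with a short sketch. It assumes the parenthesized figures in the description are percentages; the `Alignment` type, its field names, and both helper functions are hypothetical and are not taken from the database's own BLAST-based pipeline.

```python
from dataclasses import dataclass

# Thresholds from the stated criteria.
MIN_IDENTITY_PCT = 90.0
MIN_LENGTH_BP = 5_000  # 5 kb


@dataclass
class Alignment:
    """Hypothetical pairwise alignment record (field names are assumptions)."""
    identity_pct: float  # percent sequence identity
    length_bp: int       # aligned length in base pairs
    repeat_bp: int       # aligned bases annotated as repetitive elements


def is_segmental_duplication(aln: Alignment) -> bool:
    """Apply the three criteria: identity, length, not entirely repeats."""
    return (
        aln.identity_pct >= MIN_IDENTITY_PCT
        and aln.length_bp >= MIN_LENGTH_BP
        and aln.repeat_bp < aln.length_bp
    )


def percent(part: float, whole: float, digits: int = 2) -> float:
    """Share of `whole` taken by `part`, as a rounded percentage."""
    return round(100.0 * part / whole, digits)


# Figures quoted in the description for the June 2002 assembly.
duplicated_pct = percent(107.4, 3043.1)               # duplicated Mb / total Mb
misassigned_pct = percent(38.9, 3043.1)               # misassigned Mb / total Mb
paralogous_snp_pct = percent(199_965, 2_327_473, 1)   # flagged SNPs / all SNPs
```

The three computed values agree with the 3.53, 1.28, and 8.6 given in the description, which supports reading those figures as percentages.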