Data Releases & Updates
PATRIC Data Release and Website Update. (29 June 2009)
Bacterial Genome Data
The following four Coxiella genomes have been annotated using PATRIC's automated genome annotation pipeline: Coxiella burnetii CbuG_Q212, Coxiella burnetii CbuK_Q154, Coxiella burnetii Dugway 5J108-111, Coxiella burnetii RSA 331. In addition, Coxiella burnetii RSA 493 has been reannotated for consistency. Coxiella ortholog groups and corresponding MSA are now available for comparative analysis across these five genomes. Metabolic pathways for these five genomes have been computed using Pathway Tools.
Two new Rickettsia genomes, Rickettsia endosymbiont of Ixodes scapularis and Rickettsia peacockii str. Rustic, have been added to the PATRIC database and have received PATRIC annotations. A previously added genome, Rickettsia rickettsii str. Iowa, has also received PATRIC annotations. Rickettsia ortholog groups, MSA and trees have been updated to include these new genomes. Metabolic pathways for these three genomes have been computed using Pathway Tools.
Viral Genome Data
Six new Coronavirus genomes, five Hepatitis E virus genomes and twelve new Lyssavirus genomes have been added to the PATRIC database and are released with their primary RefSeq/GenBank annotations.
Web Site Enhancements
- PubMed Integration Enhancements
A new Filter Publications by Scope option has been added on the Literature/PubMed integration page which allows users to see literature from other closely related genomes in the same organism class. (Example)
PATRIC Data Release and Website Update. (10 April 2009)
Bacterial Genome Data
Brucella Mass Spec Data
Two new mass spec datasets have been added for Brucella melitensis biovar Abortus 2308 and Brucella abortus S19 genomes. Results of these experiments are available as direct evidence under the Experiment Data tab for these genomes and as indirect evidence (via ortholog groups) for all other Brucella genomes.
Brucella Host Response Data
A new Brucella Host Response Experiment Dataset is now available under the Experiment Data tab for Brucella.
Metabolic Pathways
PATRIC has recently upgraded to Pathway Tools version 12.5. Using this new version of Pathway Tools, we have recomputed metabolic pathways for all of the bacterial genomes annotated by PATRIC. Pathway data can be accessed using links provided under the Pathways tab for individual genomes. (Example: Brucella suis 1330 Pathways).
Viral Genome Data
Six new Coronavirus genomes have been added to the PATRIC database and fifteen new genomes have received manual curation. Ortholog groups, MSA, and trees have been updated to include newly curated genomes. PATRIC genome classification has been updated to include new genomes.
Five new Hepatitis E genomes have been added to the PATRIC database and have received manual curation. Ortholog groups, MSA, and trees have been updated to include newly curated genomes. PATRIC genome classification has been updated to include new genomes.
One new Lyssavirus genome has been added to the PATRIC database and has received manual curation. Ortholog groups, MSA, and trees have been updated to include the newly curated genome. PATRIC genome classification has been updated to include the new genome.
Web Site Enhancements
- Gene Set Explorer Enhancements
We have enhanced Gene Set Explorer (GSE) to allow creation of new groups (gene sets) based on the area/items selected from existing groups.
PATRIC Data Release and Website Update. (9 February 2009)
Bacterial Genome Data
A new Brucella genome, Brucella str. BO2, is being released to the public for the first time on the PATRIC website. Draft sequencing of this genome was performed at CDC and it has received its primary annotation from PATRIC. Annotation includes protein coding genes, functional annotation, pseudogenes, RNA features, and riboswitches. Brucella ortholog groups, MSA, and trees have been updated to include this new genome.
Two new Coxiella genomes, Coxiella burnetii CbuG_Q212 and Coxiella burnetii CbuK_Q154 have been added to our database and are included in this release with primary RefSeq annotations.
Viral Genome Data
Three new Calicivirus genomes have been added to the PATRIC database. PATRIC genome classification has been updated to include all of the new genomes.
Thirteen new Coronavirus genomes have been added to the PATRIC database and five new genomes have received manual curation. Ortholog groups, MSA, and trees have been updated to include newly curated genomes. PATRIC genome classification has been updated to include new genomes.
Five Hepatitis A genomes that were classified as complete genomes are now classified as partial genomes. An alternative ORF for polyprotein has been annotated in each Hepatitis A genome. Ortholog groups, MSA, and trees have been updated to include the new protein.
Thirty-eight new Hepatitis E genomes have been added to the PATRIC database. Two new genomes have received manual curation. Ortholog groups, MSA, and trees have been updated to include newly curated genomes. PATRIC genome classification has been updated to include new genomes.
Web Site Enhancements
- PubMed Integration
We have developed a simple but effective literature retrieval system that quickly identifies publications relevant to an organism, genome, or gene of interest using PubMed and Entrez Programming Utilities (eUtils) from NCBI. A user can access relevant publications from organism, genome, and gene level pages on the PATRIC website. Our system automatically derives search terms using genome metadata and/or functional annotation and database identifiers of a gene/protein. Multiple search terms are combined to form a search string and then queried in real-time against PubMed using eUtils. Results are parsed and presented on the website with summary and direct links to abstracts and full text articles (Example). In addition, results can be filtered by area of interest (i.e. Countermeasures, Diagnosis, Disease, Epidemiology, or Gene expression) and/or time period (i.e. past week, past month, or past year) using pre-computed links. Since our system always queries PubMed in real-time, results are always based on the latest information available in PubMed.
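The query-building step described above can be illustrated with a minimal sketch. It assumes that search terms are quoted and joined with OR and that an area of interest is appended with AND; the notes do not say how PATRIC actually combines terms, so both choices are assumptions, as are the function names and the sample locus tag. The endpoint and the `db`, `term`, `retmax`, `reldate`, and `datetype` parameters are those of NCBI's public ESearch utility. No network request is made here.

```python
from urllib.parse import urlencode

# Public NCBI ESearch endpoint (part of the Entrez Programming Utilities).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_search_string(terms, area=None):
    """Combine search terms into one PubMed query string.

    Assumption: terms are quoted and OR-ed, and an optional area of
    interest is AND-ed onto the result. PATRIC's real rules are not
    described in the release notes.
    """
    quoted = ['"{}"'.format(t) for t in terms if t]
    query = " OR ".join(quoted)
    if area:
        query = "({}) AND {}".format(query, area)
    return query


def build_esearch_url(terms, area=None, past_days=None, retmax=20):
    """Return an ESearch URL; the caller decides when to fetch it."""
    params = {
        "db": "pubmed",
        "term": build_search_string(terms, area),
        "retmax": retmax,
    }
    if past_days is not None:
        # Restrict to records published within the last N days.
        params["reldate"] = past_days
        params["datetype"] = "pdat"
    return ESEARCH_URL + "?" + urlencode(params)


# Hypothetical input: a genome name plus an illustrative locus tag.
url = build_esearch_url(
    ["Brucella suis 1330", "BR0001"], area="Epidemiology", past_days=30
)
print(url)
```

Because the URL is rebuilt on every request, a page using this approach would reflect whatever PubMed holds at query time, which matches the real-time behaviour described above.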
PATRIC Data Release and Website Update. (7 October 2008)
Bacterial Genome Data
The following four new genomes, three Brucella and a close relative, are being released to the public for the first time on the PATRIC website. All four genomes have received their primary annotation from PATRIC. Annotation includes protein coding genes, functional annotation, pseudogenes, RNA features, and riboswitches. Brucella ortholog groups, MSA, and trees have been updated to include new genomes.
| Genome Name | Sequence Source | Contigs | Size |
|---|---|---|---|
| Brucella sp. BO1 | CDC | 55 | 3366774 |
| Brucella ceti | LANL | 7 | 3389269 |
| Brucella abortus 2308 isolate A | LANL | 9 | 3277197 |
| Ochrobactrum intermedium | LANL | 4 | 4725392 |
Forty-one new Calicivirus genomes have been added to the PATRIC database. 176 genomes have received standardized annotation and product names. Ortholog groups, MSA, and trees have been updated to include all curated genomes. PATRIC genome classification has been updated to include all of the new genomes.
Three new Coronavirus genomes have been added to the PATRIC database and have received manual curation. Ortholog groups, MSA, and trees have been updated to include new genomes. PATRIC genome classification has been updated to include new genomes.
Two new Hepatitis A genomes have been added to the PATRIC database and have received manual curation. Ortholog groups, MSA, and trees have been updated to include new genomes. PATRIC genome classification has been updated to include new genomes.
Thirteen new Hepatitis E genomes have been added to the PATRIC database and have received manual curation. Ortholog groups, MSA, and trees have been updated to include new genomes. PATRIC genome classification has been updated to include new genomes and the elevation of Avian hepatitis E virus to species rank by the ICTV last summer.
Twelve new Lyssavirus genomes have been added to the PATRIC database. Twenty-six genomes have received standardized annotations. Lyssavirus ortholog groups, MSA, and trees have been updated to include newly curated genomes. PATRIC genome classification has been updated to include new genomes.
Web Site Enhancements
- Multiple Gene Groups and Gene Set Explorer
The PATRIC website now supports creation of multiple gene sets/groups. A user can now save genes of interest and results from various searches as a new group or add them to one of the previously created groups. All the created groups and their members are listed on the Group Tools and Management Page (which replaces the old Cart Page). From here, users can create new groups, delete/edit existing groups, or download groups and their members to the local machine. We have also integrated a group manipulation tool called Gene Set Explorer (GSE), which was developed within VBI. Users can select multiple groups and launch GSE, which displays the groups as a table and as a Venn diagram. Users can interact with and manipulate the Venn diagram. When one or more areas in the Venn diagram are selected, the corresponding records in the table are highlighted. The Venn diagram can also be saved as an image.
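The relationship between gene groups and Venn diagram areas can be sketched with plain set operations. This is an illustration only, not GSE's implementation: the function names, group names, and gene identifiers are hypothetical. Each area of the diagram corresponds to the exact combination of groups a gene belongs to, and selecting areas amounts to taking the union of their genes.

```python
def venn_regions(groups):
    """Map each Venn area to its genes.

    `groups` maps a group name to a set of gene IDs. An area is keyed by
    the frozenset of group names that contain the gene (and no others).
    """
    regions = {}
    all_genes = set().union(*groups.values())
    for gene in sorted(all_genes):
        key = frozenset(name for name, members in groups.items() if gene in members)
        regions.setdefault(key, set()).add(gene)
    return regions


def group_from_regions(regions, selected):
    """Build a new gene group from the selected Venn areas."""
    new_group = set()
    for key in selected:
        new_group |= regions.get(frozenset(key), set())
    return new_group


# Hypothetical groups and gene IDs.
groups = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g3", "g5"},
}
regions = venn_regions(groups)

# Genes shared by A and B, whether or not they are also in C.
shared = group_from_regions(regions, [{"A", "B"}, {"A", "B", "C"}])
print(sorted(shared))
```

Keying areas by a frozenset of group names keeps the lookup independent of the order in which groups were selected.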
- Unclosed Genomes
For the first time, unclosed bacterial genomes are displayed on the PATRIC website. Organism Overview and Genome Overview pages have been modified to display genomes with multiple contigs.
- Google Search
We have added Google Search to the PATRIC website. It is present in the header and can be accessed from any page on the website. In addition to searching the entire PATRIC site for a given search term, it also searches the entire web, PubMed, news, images, video, and books. Search results are organized and presented as different tabs. Here is a sample result page for keyword Brucella.
PATRIC Data Release. (3 September 2008)
Bacterial Genome Data
Genomic sequence (in draft stage) for a new Brucella genome, Brucella BO1, is being released to the public for the first time on the PATRIC Download pages. The contigs were supplied to PATRIC by the Centers for Disease Control and are provided in FASTA format. Annotation of the sequence will be available in the next PATRIC data release.
PATRIC Data Release and Website Update. (11 June 2008)
Bacterial Genome Data
A new Brucella genome, Brucella melitensis ATCC 23457, is being released to the public for the first time on the PATRIC website. The genomic sequence was supplied to PATRIC by LANL and has received its primary annotation from PATRIC. Annotation includes protein coding genes, pseudogenes, RNA features, and riboswitches.
A total of 144 new pseudogenes were identified and annotated in Brucella genomes. As a result, 291 previously annotated CDS have been removed. Brucella ortholog groups, MSA, and trees have been updated to include the new genome.
A total of 511 new pseudogenes were identified and annotated in Rickettsia genomes. As a result, 1107 previously annotated CDS have been removed. Rickettsia ortholog groups, MSA, and trees have been updated to reflect annotation changes.
A new Coxiella plasmid, Coxiella burnetii 'MSU Goat Q177', has been loaded into the database and is released on the PATRIC website with primary RefSeq annotation. A new Coxiella unclosed/draft genome, Coxiella burnetii RSA 334, is now available on the PATRIC FTP site.
Viral Genome Data
Nineteen new Calicivirus genomes have been added to the PATRIC database. PATRIC genome classification has been updated to include new genomes.
Four new Coronavirus genomes have been added to the PATRIC database. Seven new genomes have received manual curation. In addition, gene and product names have been standardized for all of the features in previously annotated genomes. Ortholog groups, MSA, and trees have been updated to include newly curated genomes. PATRIC genome classification has also been updated to include all new genomes.
Sixteen new Lyssavirus genomes have been added to the PATRIC database and fourteen of them have received standardized annotations. Lyssavirus ortholog groups, MSA and trees have been updated to include newly curated genomes. PATRIC genome classification has also been updated to include all new genomes.
Web Site Enhancements
- Partial Genomes for Viruses
A total of 24,266 partial genome sequences related to PATRIC viral pathosystems have been obtained from GenBank and loaded into the PATRIC database. All of the partial genome sequences are listed under the Partial Genome Sequence Tab under the Genomes Tab for each of the viral pathosystems (Example: Lyssavirus Partial Genome Sequences). From the partial genome list, one can navigate to the Genome Summary Page, Genome Browser, or Feature Table. One can also download the list of partial genomes as an Excel or text file. The Genome Finder Tool has been enhanced to allow one to quickly search for partial genome sequences using keyword, taxonomy id, sequence size, country of origin, or host species. The Genomic Features Search Tool now allows one to quickly gather genomic features of interest from partial and/or complete genome sequences. Partial genome sequences and corresponding gene and protein sequences can also be searched using the BLAST Search Tool.
- Integration of Swiss-Prot (SIB) Annotations
A total of 4,621 PATRIC bacterial proteins have been mapped to corresponding Swiss-Prot (SIB) annotation records. The number of proteins with SIB annotations is presented on the Genome Overview Page (Example: Brucella suis 1330). The number is hyperlinked to the Feature Table, allowing one to quickly review the list of features with SIB annotations. SIB Annotations, when available, are displayed on the Feature Overview Page (Example). Any protein features annotated by SIB are displayed on the AA Evidence Page (Example).
- Integration and Visualization of PDB Structures
PATRIC annotated proteins are mapped to corresponding PDB structures using BLASTP search. Mappings are organized in three distinct categories: exact match, partial match, and similar. A total of 804 proteins have been mapped to 36 PDB structures as exact or partial matches, while 4580 more proteins have been mapped to a total of 1477 PDB structures based on sequence similarity. For each genome, the number of proteins with PDB structures is summarized on the Genome Overview Page. The number is hyperlinked to the Feature Table, allowing easy access to features with PDB structures. At organism, genome, and feature levels, a new 3D Structure Tab has been added which shows the list of PDB structures available (Example). From here, one can navigate to the 3D Structure Visualization Page (Example). On this page, the 3D structure of the protein is displayed using the JMOL viewer. Areas of interest, such as pre-computed IEDB epitopes, InterPro Domains, and Swiss-Prot annotated protein features, can be highlighted on the 3D structure view.
- Literature Data
Literature references parsed from the GenBank submission files and Swiss-Prot annotations are displayed under the Literature Tab at the Organism, Genome and Feature levels. The available literature set for any organism can also be searched by keywords using the Literature Search Tool.
PATRIC Data Release and Website Update. (18 April 2008)
Bacterial Genome Data
A new mass spec dataset has been added for Brucella abortus S19 genome. Results of this experiment are available as direct evidence under the Experiment Data tab for the genome.
A new Rickettsia genome, Rickettsia rickettsii str. Iowa (NC_010263), has been added to our database. It is included in this release with primary RefSeq annotations. Four of the whole-genome-shotgun sequence records annotated at PATRIC have been superseded by new assemblies of the genomes at NCBI. Since the original genome sequences have not changed, the annotations at PATRIC remain the same.
A microarray dataset has been added for Rickettsia conorii Malish 7 genome. Experiment Results are available as direct evidence under the Experiment Data tab for the genome. Results from the same microarray experiment are also available as indirect evidence (via ortholog groups) for all other Rickettsia genomes.
All of the bacterial genomes have received complete and improved RNA annotations.
Viral Genome Data
One new Calicivirus genome has been added to the PATRIC database and has received standardized annotations. PATRIC genome classification has been updated to include the new genome.
One new Coronavirus genome has been added to the PATRIC database and thirty-five additional genomes have received standardized annotations. Coronavirus ortholog groups, MSA and trees have been updated to include newly annotated genomes. PATRIC genome classification has also been updated. Nine additional genomes have been designated as Reference Genomes.
One new Hepatitis A genome has been added to the PATRIC database and has received standardized annotations. This genome has also been added to the PATRIC genome classification.
One new Hepatitis E genome has been added to the PATRIC database and has received manual curation. New annotations for specific protein domains have been provided for all Hepatitis A virus genomes.
Two new Lyssavirus genomes have been added to the PATRIC database and have received standardized annotations.
Web Site Enhancements
- Integration of Rickettsia Microarray Data
Microarray data from Rickettsia conorii Malish 7 genome were integrated with PATRIC. Experiment results are available as direct evidence under the Experiment Data tab for this genome. Results from the same experiment are also mapped to proteins in other Rickettsia genomes via ortholog groups and presented as indirect evidence. The Experiment Data Search Tool has been enhanced to allow querying on microarray data.
- New Genomic Feature Search Tool
The Gene/Protein Search Tool has been replaced by a new Genomic Feature Search Tool. This tool allows searching on not only protein coding genes and CDSs, but all other DNA feature types.
- BLAST Improvements
New organism specific BLAST libraries containing full genome, gene, protein sequences have been created. They are available through the BLAST Search page.
PATRIC Data Release and Website Update. (31 January 2008)
Bacterial Genome Data
The Brucella abortus S19 genome has been sequenced and the annotated genome is being released to the public for the first time on the PATRIC website. Its annotation includes RNA species and identification of pseudogenes. With regard to pseudogene identification, the boundaries of each pseudogene have been identified; however, the precise locations of gene disruptions have not been curated. Brucella ortholog groups, multiple sequence alignments (MSAs), and trees have been recalculated to include this new genome.
A new mass spectrometry experiment has been released under the experiment data tab.
Two new Rickettsia genomes, Rickettsia africae and Rickettsia massiliae, have been added to the public database and received PATRIC annotations, which include annotation of RNA species and pseudogenes. Rickettsia ortholog groups, multiple sequence alignments (MSAs), and trees have been updated to include these two genomes.
A new Coxiella genome, Coxiella burnetii RSA 331, has been added to our database. Also, the genome that was previously described as Coxiella burnetii Dugway 7E9-12 has been renamed as Coxiella burnetii Dugway 5J108-111 to reflect the naming correction made by the sequencing center. Coxiella burnetii RSA 331 genome and Coxiella burnetii Dugway 5J108-111 are currently presented with their primary annotations. The proteins on the Coxiella plasmids have received manual curation.
Viral Genome Data
Sixteen new Coronavirus genomes have been added to the PATRIC database, and nine additional genomes have received standardized annotations. The classification of Coronavirus genomes has been updated. Coronavirus ortholog groups, multiple sequence alignments (MSAs), and trees have been recalculated.
Five genomes have been added to the PATRIC database.
Three new Hepatitis E genomes have been added to the PATRIC database, and the genome classification scheme has been updated. A new Hepatitis E species tree based on the full genome nucleotide sequence has been built and is available with this release. The proposed genotype, Genotype 5, is not represented in this tree since its level of divergence from the other sequences prevented a good tree from being constructed.
Two new Lyssavirus genomes have been added to the PATRIC database, and these have received standardized annotations.
Web Site Enhancements
- Experiment Data
Mass Spectrometry data from Brucella were integrated into the PATRIC system in the October 2007 website update. In this release, refinements to both the user interface and database have been made. These refinements include a distinction in the database between data that was generated using the same genomic strain and data that can be mapped from a different strain (direct and indirect evidence, respectively). This is clearly presented when looking in the Experiment Data tab for individual genomes and individual genes. The tables that display experimental data have been reformatted to list genes just a single time, rather than listing the data for each experimental condition for each gene in a single table. Users can now drill down to the data on experimental conditions for the genes in which they are interested. The data is now summarized in a scatter plot with supporting data for each peptide that contributes to the data for that protein. A new query for experimental data has been implemented with a look and feel that is consistent with PATRIC's other queries. The user can specify the genomes in which they are interested, whether they are interested in direct or indirect data, and which keywords or annotations they would like to use in their query. (Link to Brucella Experiment Page)
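The direct/indirect distinction can be made concrete with a small sketch. It assumes, for simplicity, that each ortholog group holds at most one gene per genome and that experimental hits are a set of gene IDs from the strain used in the experiment; the function name, genome labels, and gene IDs are hypothetical, and PATRIC's database schema is not described in these notes.

```python
def propagate_evidence(hits, ortholog_groups, source_genome):
    """Label experimental evidence as direct or indirect.

    hits: gene IDs observed in the experiment, all from `source_genome`.
    ortholog_groups: list of dicts mapping genome name -> gene ID
        (simplifying assumption: one member per genome per group).
    Returns (genome, gene, evidence_kind, supporting_gene) tuples.
    """
    records = []
    for group in ortholog_groups:
        source_gene = group.get(source_genome)
        if source_gene not in hits:
            continue  # no experimental support for this group
        for genome, gene in group.items():
            # Same strain as the experiment: direct. Mapped through the
            # ortholog group to another genome: indirect.
            kind = "direct" if genome == source_genome else "indirect"
            records.append((genome, gene, kind, source_gene))
    return records


# Hypothetical ortholog groups and gene IDs.
ortholog_groups = [
    {"B. abortus S19": "s19_001", "B. suis 1330": "bs_101"},
    {"B. abortus S19": "s19_002", "B. suis 1330": "bs_102"},
]
records = propagate_evidence({"s19_001"}, ortholog_groups, "B. abortus S19")
for record in records:
    print(record)
```

Keeping the supporting gene in each record lets an indirect result link back to the strain where the measurement was actually made.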
PATRIC Data Release and Website Update. (15 October 2007)
Bacterial Genome Data
A new Brucella genome is included in this release, Brucella suis ATCC23445. The Brucella suis ATCC23445 sequence was supplied to PATRIC by Los Alamos National Labs (LANL) and received its primary annotation from PATRIC. The Brucella ovis genome has also received a PATRIC annotation in this release for a total of seven Brucella genomes with PATRIC provided/updated annotations. Ortholog groups have been calculated to include the seven PATRIC-curated Brucella genomes. The seven PATRIC-curated Brucella genomes have been evaluated for split genes resulting from frame-shifts or nonsense codons using the GenVar program (download GenVar). The segments of these split genes have been joined and curated as pseudogenes, though some of these potential pseudogenes may be the result of sequencing errors. The resulting proteins have been analyzed by our Protein Annotation Pipeline (PAP), and are included as members of ortholog groups to permit comparison across genomes. Multiple sequence alignments and trees have been created for the new ortholog groups.
For Rickettsia, we have added an analysis of microarray data evaluating the transcriptional effects of nutrient-limiting conditions. The analysis and data are presented under the "Collaborative Research" tab for Rickettsia.
Viral Genome Data
This data release includes 33 new Calicivirus genomes, 3 new genomes for Coronaviruses, 2 new genomes for Hepatitis E viruses, and 1 new genome for Lyssaviruses. In addition, 13 Hepatitis A genomes, 9 Lyssavirus genomes, and many Coronavirus genomes have received manual curation. For Coronaviruses, we've also updated all of our locus tags. Ortholog groups, multiple sequence alignments, and trees have been recalculated to include these newly curated genomes. A mapping of old and new locus tags is available under the download section.
Enhancements to functionality of the website:
- New Search Tools
A Genome Finder tool allows one to find genomic sequences of interest by keyword (organism name, genome name, accession, etc.), NCBI Taxon ID, or GI number. Also, a Genomic Pattern Search allows one to search any pattern (defined as regular expression) in the complete sequences of one or more selected genomes. This version will search only against genome sequences, but the next version will allow users to find protein sequences.
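As an illustration of what a regular-expression search over genome sequences involves, here is a minimal sketch using Python's standard `re` module. It is not PATRIC's implementation: the function name, the genome labels, and the toy sequences are hypothetical, and the sketch scans the forward strand only, reports non-overlapping matches, and uses 0-based, end-exclusive coordinates.

```python
import re


def find_pattern(genomes, pattern):
    """Search each selected genome sequence for a regular expression.

    genomes: dict mapping a genome name to its nucleotide sequence.
    Returns a dict mapping genome name -> list of (start, end, match).
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = {}
    for name, sequence in genomes.items():
        hits[name] = [
            (m.start(), m.end(), m.group(0)) for m in regex.finditer(sequence)
        ]
    return hits


# Toy sequences; real genomes would be read from FASTA files.
genomes = {
    "genome_1": "ATGAAATAGCCATGCCCTGA",
    "genome_2": "CCCCGGGG",
}

# A start codon, the fewest whole codons possible, then a stop codon.
orf_pattern = r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)"
hits = find_pattern(genomes, orf_pattern)
print(hits["genome_1"])
```

Searching protein sequences, as planned for the next version, would only require passing amino acid sequences and a pattern over the amino acid alphabet.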
- BLAST Search Improvements
We have enabled users to BLAST against complete genome sequences in this release. Sequences in the BLAST report can now be added directly to the feature cart.
- Improved Visibility of Reference Genomes
Reference genomes are now indicated with a visual cue in a number of lists and tables throughout the site, including the ortholog group filtering mechanism.
- Improved Representation of Multi-segment Features in Table Layouts
The feature table now represents features with multiple segments on a single row. An added visual cue distinguishes these features. These features will include the CDSs stitched together based on GenVar data, as well as known frame shifts and splice sites.
- Genome Classification on Viral Organism Landing Pages
The main page for each viral organism now displays the PATRIC-defined classification of available genomes. These classification schemes have been derived from literature and in collaboration with our organism experts. This should enable users to more easily find their genomes of interest directly from each viral organism's home page.
The mechanism to filter and search for ortholog groups has been laid out more intuitively, allowing users to find ortholog groups with specific genome memberships, specific keywords, or of specific sizes. Also, histograms at the top of the ortholog pages are now clickable. Clicking on a bar within the histogram will display the ortholog groups consisting of the number of members indicated in the histogram.
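The three filter criteria and the clickable histogram can be sketched as follows. The record layout (`id`, `product`, `members`) and all names and values are hypothetical; the sketch treats a group's size as its number of members and genome membership as the set of genomes contributing at least one member.

```python
from collections import Counter


def filter_groups(groups, present_in=(), absent_from=(), keyword=None, size=None):
    """Return IDs of ortholog groups matching every supplied criterion."""
    selected = []
    for group in groups:
        genomes = set(group["members"])
        if not set(present_in) <= genomes:
            continue  # a required genome has no member in this group
        if genomes & set(absent_from):
            continue  # an excluded genome has a member in this group
        if keyword and keyword.lower() not in group["product"].lower():
            continue
        if size is not None and len(group["members"]) != size:
            continue
        selected.append(group["id"])
    return selected


def size_histogram(groups):
    """Count how many groups exist for each group size."""
    return Counter(len(group["members"]) for group in groups)


# Hypothetical ortholog groups; `members` lists one genome per member.
groups = [
    {"id": "OG1", "product": "ABC transporter", "members": ["G1", "G2", "G3"]},
    {"id": "OG2", "product": "hypothetical protein", "members": ["G1", "G2"]},
    {"id": "OG3", "product": "ABC transporter permease", "members": ["G1", "G3"]},
]

histogram = size_histogram(groups)
# Clicking the bar for size 2 corresponds to filtering on that size.
size_two = filter_groups(groups, size=2)
print(histogram, size_two)
```

Combining `present_in` and `absent_from` also covers comparative questions such as finding groups present in some genomes but missing from another.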
PATRIC Data Release and Website Update. (17 August 2007)
Bacterial Genome Data
Two new Brucella genomes are included in this release, Brucella ovis ATCC 25840 and Brucella canis. Brucella canis was supplied to PATRIC by Los Alamos National Labs (LANL) and received its primary annotation from PATRIC. Brucella ovis ATCC 25840 maintains the annotation provided by its sequencing center (JCVI, formerly known as TIGR) in this data release. The species tree for Brucella genomes has been recalculated to include these new genomes. Ortholog groups have been calculated to include the five PATRIC-curated Brucella genomes. The five PATRIC-curated Brucella genomes have been evaluated for split genes resulting from frame-shifts or nonsense codons using the GenVar program (download GenVar). The segments of these split genes have been joined and curated as pseudogenes, though some of these potential pseudogenes may be the result of sequencing errors. The resulting proteins have been analyzed by our Protein Annotation Pipeline (PAP), and are included as members of ortholog groups to permit comparison across genomes.
Rickettsia protein annotations have been manually reviewed by the PATRIC curation team. One member of each ortholog group has been manually curated. In this release we are also removing 1580 previously released gene predictions, which may be the result of gene overprediction by GeneMark or Glimmer. All of them have no orthologs in other PATRIC Rickettsia genomes, are shorter than 100 amino acids, have no identified homology in NCBI's BLAST databases, and have no predicted protein domains.
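The four removal criteria stated above combine as a simple conjunction, sketched below. The record type, field names, and sample values are hypothetical; only the criteria themselves (no orthologs, shorter than 100 amino acids, no BLAST homology, no predicted domains) come from the text.

```python
from dataclasses import dataclass


@dataclass
class GenePrediction:
    locus: str
    protein_length: int   # length in amino acids
    has_ortholog: bool    # ortholog in another PATRIC Rickettsia genome
    has_blast_hit: bool   # homology found in NCBI's BLAST databases
    has_domain: bool      # any predicted protein domain


def is_likely_overprediction(gene, max_length=100):
    """True only when a prediction meets all four removal criteria."""
    return (
        not gene.has_ortholog
        and gene.protein_length < max_length
        and not gene.has_blast_hit
        and not gene.has_domain
    )


# Hypothetical predictions.
predictions = [
    GenePrediction("locus_a", 62, False, False, False),   # meets all four
    GenePrediction("locus_b", 62, True, False, False),    # has an ortholog
    GenePrediction("locus_c", 240, False, False, False),  # too long
]
removed = [g.locus for g in predictions if is_likely_overprediction(g)]
print(removed)
```

Requiring all four conditions at once keeps the filter conservative: any single line of supporting evidence is enough to retain a prediction.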
Viral Genome Data
This data release contains 26 new genomes for Calicivirus, 24 new genomes for Coronaviruses, 12 for Hepatitis A, 2 for Hepatitis E, and 7 for Lyssaviruses. These numbers include both complete and nearly complete genomes. Forty-five Coronavirus genomes have received standardized PATRIC annotations, and their genes and mature peptides are now included in Coronavirus ortholog groups.
SARS Coronavirus 3D Structures
Solved 3D structure data for SARS Coronavirus proteins are now available from the PATRIC website. Users can gain access to this information through the 3D Structure tab on the Coronavirus page. These structures are provided through the Resource Center for Biodefense Proteomics Research.
Brucella abortus 2308 Mass Spectrometry Data
Mass spectrometry data from the outer membrane fraction of Brucella abortus 2308 and related mutant strains are now available from the PATRIC website. Users can gain access to this information through the Experimental Data tab on the Brucella page, as well as Experimental Data tabs for Brucella abortus 2308 and individual genes. These data are provided through the Resource Center for Biodefense Proteomics Research.
Brucella suis Genome Prioritization Data
To facilitate downstream research on validation and development of countermeasures, we are using bioinformatics methods for prioritizing the pathogen genomes. As a first step, we have evaluated several types of data for Brucella suis. We have taken into account information on pathways, druggable protein domains, protein localization, and literature on essential and virulence genes in this analysis. Methods and results of this analysis are available through the Collaborative Research tab on the Brucella page.
PATRIC Website Update. (01 June 2007)
This update contains significant changes to the PATRIC website. The changes were designed to make the website easier to use and to help you find the information that you are looking for. Some of the highlights are detailed below. We encourage you to use the site Feedback capability to give us input on the features provided, to request features for future work, or to ask questions about the current site.
Additional Support for Comparative Genomics
Enhanced comparative genomics capabilities allow you to make time-saving comparisons across genomes, such as finding features of interest that are unique to specific genomes, or conversely, features that are common to all but one of many related genomes. At the feature level, you can perform a multiple sequence alignment across members of an ortholog group, and visually arrange members (and their sequences) based on phylogeny. At the pathway level, you can now perform comparative analysis examining relationships between PATRIC's bacterial reference genomes and the human host genome.
A Suite of New Searches and Sophisticated Context-Sensitive Filtering
We have added nine new specialized searches to provide rich querying of PATRIC's data. You can query against PATRIC's annotation data, RefSeq/GenBank data or both. You can search for epitopes, GO Terms, EC number, InterProScan, TIGRFam, PFam, BLOCKS, COGS, and more. In addition to the searches, you can narrow your search results by using the new context-sensitive filtering tools. For example, when presented with a list of ortholog groups, you can filter based on genome membership, size of group or keyword (such as protein function).
Enhanced Support for Collecting Related Sequences of Interest
An improved feature cart (much like a shopping cart) has been added throughout the site, allowing you to gather sequences of interest while you work. The feature cart allows you to collect large sets of features from organism feature tables; sets of related features from the ortholog group pages; sets of related features from search results tables; and from individual feature pages. Once collected, these features can be exported as FASTA DNA or FASTA Protein sequences.
New Website Organization, Navigation, and Look-and-Feel
We have re-designed the website to provide more flexible navigation between and among website areas. Specifically, we provide support for "organism-centric" task flows for those interested in specific properties and features of specific genomes, as well as support for "search-centric" task flows for those interested in locating resources both within and across PATRIC organisms. The look and feel of the new website has been upgraded as well, leveraging the latest web-technology to provide application-like productivity.
PATRIC Data Release and Website Update. (16 April 2007)
This data release contains new genomes for Coronaviruses (18 genomes), Hepatitis A (2 genomes), and Lyssaviruses (14 genomes). The new Hepatitis A and Lyssavirus genomes have had their DNA annotations and gene product naming standardized. Nineteen Coronavirus genomes have received this standardization in this release as well.
The functional annotation of the gene products for Coxiella burnetii RSA493 has been manually reviewed by the PATRIC curation team to ensure proper naming and correct GO term and EC number assignments. Based on this updated functional annotation, pathways have been reconstructed computationally for Coxiella burnetii.
To accommodate the new genomes, phylogenetic trees have been reconstructed for Hepatitis A, Lyssaviruses, and Coronaviruses. In addition, a new tree has been constructed for the Caliciviruses, based on the amino acid sequence of the capsid protein rather than ORF1. Additional phylogenetic trees for Lyssaviruses are now available, and provide broader Lyssavirus coverage than the whole genomes available through PATRIC.
Ortholog groups have been rebuilt to incorporate the proteins of the new genomes added to the database. For the first time, ortholog groups are also available for Caliciviruses and Coronaviruses.
PATRIC Data Release and Website Update. (15 January 2007)
This release contains newly curated genomes and website enhancements. Updated curation includes standardization of protein product names and names of mature peptides for all Calicivirus genomes and curation of 10 additional Hepatitis A genomes. Additionally, all bacterial protein products have received updated functional curation through our Automated Protein Curation Pipeline (APCP), which adopts functional annotation from TIGRfams, SwissProt, and BLAST hits to the NCBI non-redundant database, in decreasing order of preference. The SOP for this pipeline is posted on the Standard Operating Procedures page.
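The "decreasing order of preference" rule can be read as a simple fallback chain: use the first source that offers an annotation. The sketch below illustrates that idea only. The source labels, the function name, and the "hypothetical protein" default are assumptions and are not taken from the APCP SOP.

```python
# Sources in decreasing order of preference, as listed in the release note.
SOURCE_PREFERENCE = ("TIGRfams", "SwissProt", "BLAST-nr")

def choose_annotation(hits, default="hypothetical protein"):
    """Return (source, product name) from the most preferred source with a hit.

    hits maps a source label to a product name, or to None/"" if that source
    produced nothing. The default product name is an assumed placeholder.
    """
    for source in SOURCE_PREFERENCE:
        product = hits.get(source)
        if product:
            return source, product
    return None, default

# Illustrative hits only: no TIGRfams match, so SwissProt wins over BLAST.
example_hits = {
    "TIGRfams": None,
    "SwissProt": "Chaperone protein DnaK",
    "BLAST-nr": "molecular chaperone DnaK",
}
chosen = choose_annotation(example_hits)
```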
The reference genomes for the bacterial pathogens have had RNAs curated. We improve upon traditional tRNAscan-SE prediction of tRNA genes by coupling it with a second tRNA-finding program, Aragorn. We determine the endpoints of small subunit (16S) and large subunit (23S and 5S) rRNAs using secondary-structure-based multiple sequence alignments, with trimming to match the endpoints of the E. coli RNAs. Profiles and tools from Rfam are applied to identify many small RNA genes in genomic sequences. Among the RNA genes that can be detected this way are those for trans-acting regulatory RNAs, riboswitches, RNase P RNA, SRP RNA, 6S RNA, plasmid replication RNAs, retron RNAs, self-splicing introns and other ribozymes, the rRNA modification guide RNAs of Archaea, and other microbial RNAs. The gene table has been updated with a filter that allows the user to selectively display CDS, RNA, and/or mature peptides.
We have integrated Immune Epitope Database data with our sequence data. The first iteration of the interface for this data can be reached from the Epitope Search page. Additionally, each protein information page lists any epitopes present in the protein sequence within the evidence section of the page.
Two bioinformatic analysis "special projects" have been carried out as use cases for building new pipelines. The rationale, workflow, and results of these special projects are available on the Coxiella Collaborative Research page and the Lyssavirus Collaborative Research page. The first of these projects aims to identify the secreted or membrane-attached proteins for Rickettsia and Coxiella genomes as one approach to identifying potential vaccine candidates. The second project aims to design PCR primers that broadly amplify Lyssavirus genomes. We anticipate that the results of additional projects will become available as projects are completed.
Only minor changes have been made to website functionality with this release. To better organize genomes, particularly for pathogens with a large number of genomes, we have implemented a "Group Name" column that specifies the groups under which the genomes are classified. Generally, a group is one reference genome and all of its associated genomes. We hope that this makes genome data easier to access. The other enhancement, as described above, is the ability to filter any genome's gene table to show CDS, mature peptides, and/or RNAs.
