Value Addition Metrics
| Evaluation Date | Accuracy: % incorrect functional names | Accuracy: % suspicious functional names | Accuracy: overall % correct functional names | Completeness: functional names | Completeness: genetic names | Completeness: GO assignment | Completeness: EC number | Consistency: consistency measure |
|---|---|---|---|---|---|---|---|---|
| Oct-06 | 7.0% | 7.3% | 85.6% | 100.0% | 57.5% | 23.6% | 30.1% | 26.8% |
| Apr-07 | 0.3% | 0.2% | 99.5% | 99.9% | 96.9% | 96.9% | 96.8% | 96.4% |
| Feb-08 | 0.6% | 0.4% | 99.1% | 99.9% | 96.1% | 89.4% | 96.7% | 93.4% |

| Evaluation Date | % Consistency using Same Function: FIGfams as a basis | % Consistency using Same Function: TIGRfam Equivalogs as a basis | % Consistency using Same Function: TIGRfams as a basis |
|---|---|---|---|
| Feb-08 | 84% | 94% | 89% |

All BRCs responsible for bacterial pathogens are assessing the value they add to primary genome annotations through an agreed-upon, common set of metrics: accuracy, completeness, and consistency. How each is calculated is described below.
Accuracy
TIGR will sample BRC genes to test for accuracy. The BRC genes will be searched against 100 randomly chosen TIGRFAM equivalog HMMs, which yields sets of genes that should have the same function. A human curator will inspect a small set of the results and make a subjective assessment of the correctness of each functional name assignment. The statistic will be reported as the percentage of "correct" functional name annotations. After a period allowing for evaluation and refinement by the BRCs, the accuracy statistics will be made available to the general scientific community on the BRC-central web site.
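As a rough illustration of how the curator's verdicts could be turned into the reported percentages, the following sketch tallies one label per inspected gene. The label strings (`correct`, `suspicious`, `incorrect`) follow the column headings in the table above; the function name and input format are assumptions made for this example, not part of the BRC procedure.

```python
from collections import Counter

# Hypothetical label set, taken from the accuracy columns of the summary table.
LABELS = ("correct", "suspicious", "incorrect")

def accuracy_summary(verdicts):
    """Return the percentage of inspected genes given each curator verdict.

    `verdicts` is a sequence with one label per inspected gene
    (an assumed input format for illustration only).
    """
    verdicts = list(verdicts)
    if not verdicts:
        raise ValueError("no curator verdicts supplied")
    unknown = set(verdicts) - set(LABELS)
    if unknown:
        raise ValueError(f"unrecognized verdict labels: {sorted(unknown)}")
    counts = Counter(verdicts)
    total = len(verdicts)
    return {label: 100.0 * counts[label] / total for label in LABELS}

# Made-up sample: 8 correct, 1 suspicious, 1 incorrect.
sample = ["correct"] * 8 + ["suspicious", "incorrect"]
summary = accuracy_summary(sample)
```

Here `summary["correct"]` is the figure that corresponds to the "overall % correct functional names" column.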
Completeness
TIGR will perform an exhaustive search of BRC genes against TIGRFAM equivalog HMMs to identify sets of genes that have the same function. The TIGRFAM members that carry functional names, genetic names, GO IDs, and EC numbers will serve as the source of the datatypes expected to appear in the BRC genes. Completeness will be measured by counting the functional names, genetic names, GO IDs, and EC numbers that have been assigned and submitted in GFF3 to BRC-central. Note that this metric does not assess the correctness of the annotations, only whether an annotation is provided. The completeness statistic will be reported as a percentage of possible annotations, based on the metric: (number of actual annotations) / (number of expected annotations). After a period allowing for evaluation and refinement by the BRCs, the completeness statistic will be made available to the general scientific community on the BRC-central web site.
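The ratio above can be sketched per datatype as follows. This is a minimal example under stated assumptions: the datatype keys, the dictionary layout (gene ID mapped to a set of datatypes), and the function name are all hypothetical, and the parsing of GFF3 submissions is left out entirely.

```python
# Hypothetical datatype keys for the four annotation types named in the text.
DATATYPES = ("functional_name", "genetic_name", "go_id", "ec_number")

def completeness(expected, actual):
    """Compute (actual annotations) / (expected annotations) per datatype.

    expected: dict mapping gene ID -> set of datatypes the matching
              TIGRFAM equivalog says the gene should carry.
    actual:   dict mapping gene ID -> set of datatypes actually present
              in the submitted annotation.
    Returns a dict of fractions; a datatype with no expected annotations
    maps to None because the ratio is undefined.
    """
    result = {}
    for dtype in DATATYPES:
        expected_genes = [g for g, types in expected.items() if dtype in types]
        if not expected_genes:
            result[dtype] = None
            continue
        # Only presence is counted; correctness is not assessed here.
        present = sum(1 for g in expected_genes if dtype in actual.get(g, set()))
        result[dtype] = present / len(expected_genes)
    return result

# Made-up example: two genes, one missing its expected EC number.
expected = {"gene1": {"functional_name", "ec_number"}, "gene2": {"functional_name"}}
actual = {"gene1": {"functional_name"}, "gene2": {"functional_name"}}
scores = completeness(expected, actual)
```

Multiplying each fraction by 100 gives the percentage form used in the summary table.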
Consistency
TIGR will perform an exhaustive search of BRC genes against TIGRFAM equivalog HMMs to identify sets of genes that have the same function. Each set of genes will be expected to have consistent functional names. However, consistency will be measured for functional name assignments within each BRC, not for assignments across centers. The functional names from a BRC will only be compared to each other; they will not be compared against the TIGRFAM name. The consistency statistic will be reported as the likelihood of any two genes having the same annotated text string. After a period allowing for evaluation and refinement by the BRCs, the consistency statistic will be made available to the general scientific community on the BRC-central web site.
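One way to read "the likelihood of any two genes having the same annotated text string" is as the fraction of gene pairs, drawn from within the same equivalog set, whose functional names match exactly. The sketch below takes that reading and pools pairs across all sets for a single BRC. The pooling choice, the function name, and the input layout are assumptions for illustration; the source does not specify how sets are aggregated.

```python
from collections import Counter
from math import comb

def pairwise_consistency(gene_sets):
    """Fraction of within-set gene pairs with identical functional names.

    gene_sets: iterable of lists, each list holding the functional-name
               strings one BRC assigned to the genes hitting one equivalog.
    Returns None when no set contains at least two genes.
    """
    matching_pairs = 0
    total_pairs = 0
    for names in gene_sets:
        # All unordered pairs of genes in this set.
        total_pairs += comb(len(names), 2)
        # Pairs that share exactly the same text string.
        matching_pairs += sum(comb(count, 2) for count in Counter(names).values())
    if total_pairs == 0:
        return None
    return matching_pairs / total_pairs

# Made-up example: one fully consistent set, one with a divergent name.
sets_for_one_brc = [
    ["DNA gyrase subunit A"] * 3,
    ["recA protein", "recA protein", "recombinase A"],
]
score = pairwise_consistency(sets_for_one_brc)
```

In this example 4 of the 6 within-set pairs match, so `score` is 4/6. The comparison is an exact string match, so names differing only in case or punctuation count as inconsistent.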
