Category: Diagnostic testing. Evaluation of diagnostic tests involves some subtle but important statistical issues. These webpages show some interesting examples of diagnostic tests, offer pointers for critical evaluation of studies of diagnostic tests, and present practical applications of diagnostic tests in your day-to-day medical practice. Articles are arranged by date with the most recent entries at the top. You can find the theme and closely related categories, definitions, and other resources at the bottom of this page.
Stats: ROC curve for an imperfect gold standard (March 12, 2008). Someone asked me about how to use an ROC curve if you have more than two categories. Apparently the gold standard that the researchers were using was known to be imperfect, so they wanted an intermediate category (possible disease).
Stats: Does prevalence affect sensitivity (January 31, 2008). Dear Professor Mean, Does lowering the prevalence of a disease have an effect on sensitivity?
Stats: Postlude to my Dallas talk (November 11, 2007). I gave a talk this morning to the American College of Allergy, Asthma & Immunology. I documented my preparations for this talk on my webpages and wanted to share some thoughts I had during and after the talk.
Stats: Handout for diagnostic testing (November 6, 2007). I have been busy preparing a handout describing the basics of diagnostic testing (e.g., sensitivity and specificity), the medical issues associated with these tests (e.g., the difficulty in testing for a rare disease, the need to balance the costs of false positives and false negatives), and applications of the likelihood ratio. I also show how to use the likelihood ratio slide rule.
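The difficulty in testing for a rare disease can be illustrated with a short calculation. The sketch below is not taken from the handout; it assumes a hypothetical test that is 90% sensitive and 90% specific and uses Bayes' theorem to show how the positive predictive value changes with prevalence. The function name and the numbers are illustrative assumptions.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of disease given a positive test, via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical test (assumed values): 90% sensitive, 90% specific.
for prevalence in (0.50, 0.10, 0.01):
    ppv = positive_predictive_value(0.90, 0.90, prevalence)
    print(f"prevalence {prevalence:.2f} -> PPV {ppv:.3f}")
```

With these assumed values the positive predictive value drops from 0.90 at 50% prevalence to 0.50 at 10% prevalence and to roughly 0.08 at 1% prevalence, even though sensitivity and specificity are held fixed.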
Stats: Continuing education questions for a talk on diagnostic tests (July 24, 2007). As part of my talk to the American College of Allergy, Asthma & Immunology, I have been asked to present two questions related to my topic (Use of Diagnostic Tests for Making Clinical Decisions). These questions would consist of a brief clinical stem followed by four choices on how to manage the situation. These will be presented prior to my talk and then afterwards to see how effective the training is.
Stats: Classic calculations for a diagnostic test (July 20, 2007). I created a table that illustrates many of the classic calculations for a diagnostic test.
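As a rough illustration of what such calculations usually look like, here is a minimal sketch that starts from the four cells of a two-by-two table. It is not the table from the entry; the function name and the counts are made up for the example, and zero cells (which would cause division by zero) are not handled.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard summary measures from a 2x2 table of test result by disease status.

    tp: diseased, test positive     fp: healthy, test positive
    fn: diseased, test negative     tn: healthy, test negative
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1.0 - specificity),
        "lr_negative": (1.0 - sensitivity) / specificity,
    }


# Made-up counts, for illustration only.
summary = diagnostic_summary(tp=80, fp=30, fn=20, tn=170)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values computed this way are only meaningful when the proportion of diseased patients in the table reflects the prevalence in the population where the test will be used.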
Stats: Code for drawing new likelihood ratio slide rule (July 12, 2007). I have made some minor changes to my likelihood ratio slide rule. The original code was lost somewhere, so I wrote some new code and added documentation. I also changed the orientation of the slide rule so it can be held horizontally and shaded the regions that need to be cut out or away.
Stats: Recommendations from Sackett et al for evaluating a diagnostic test (July 2, 2007). There is a lot of controversy about diagnostic testing, and I have mentioned some of these controversies in other weblog entries. I wanted to review what the experts say about diagnostic testing. The definitive resource for evaluating any medical controversy is Evidence-based Medicine: How to Practice and Teach EBM. David L. Sackett, Scott W. Richardson, William Rosenberg, Brian R. Haynes (1998) Edinburgh: Churchill Livingstone.
Stats: Use of diagnostic tests for making clinical decisions (June 15, 2007). I'm giving a talk for the American College of Allergy, Asthma & Immunology with the title "Use of diagnostic tests for making clinical decisions." Here's an abstract of this talk.
Stats: Applying likelihood ratios in your head (June 1, 2007). Someone sent me a nice email complimenting my likelihood ratio slide rule. He/she also pointed out a simple way to apply likelihood ratios in your head.
Stats: Quantifying the ability of dreams to predict the future (April 10, 2007). Someone wrote to me about a diary they had kept for the past eight years about their dreams. About every other month or so, a dream of theirs came true. I was asked if I could quantify the likelihood of successful predictions. Assessing psychic phenomena is outside my area of expertise, but I offered a few general suggestions, partly because I thought that an analogy to diagnostic testing was interesting.
Stats: What makes a good diagnostic test? (April 6, 2007). I've been invited to give a talk at the annual meeting of the American College of Allergy, Asthma & Immunology. The tentative title of the talk is "What makes a good diagnostic test?" It will be part of a plenary session and I'll be followed by two speakers debating the merits of two particular diagnostic tests. I don't have a lot of details at this time, but as I develop my talk, I'll put details here on this weblog.
Stats: Incorporating risk factors into diagnostic test calculations (November 9, 2006). A contributor to the Evidence-Based Health email discussion group (PK) raised an interesting question about how to incorporate information about risk factors when applying the results of a diagnostic test. When you are estimating a pre-test probability for a diagnostic test, you need to take three steps: (1) find an estimate of the prevalence of the disease in the general population, (2) modify this estimate based on characteristics of your particular practice, and (3) further modify this estimate based on characteristics of the individual patient that is currently sitting in front of you.
Stats: Mathematical derivation of the odds form of Bayes theorem (October 16, 2006). I had included some rather technical details on my web page about likelihood ratios, but I thought it would be best to move it to a separate page.
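For readers who want the gist without following the link, the standard derivation can be sketched as follows. This is the textbook argument rather than a copy of the linked page; the notation ($D^+$ and $D^-$ for disease present and absent, $T^+$ for a positive test) is an assumption made for this sketch.

```latex
% Bayes theorem applied separately to disease present and disease absent
P(D^+ \mid T^+) = \frac{P(T^+ \mid D^+)\,P(D^+)}{P(T^+)}
\qquad
P(D^- \mid T^+) = \frac{P(T^+ \mid D^-)\,P(D^-)}{P(T^+)}

% Dividing the first equation by the second cancels P(T^+)
\frac{P(D^+ \mid T^+)}{P(D^- \mid T^+)}
  = \frac{P(T^+ \mid D^+)}{P(T^+ \mid D^-)} \times \frac{P(D^+)}{P(D^-)}

% In words: post-test odds = likelihood ratio x pre-test odds, where
\mathrm{LR}^+ = \frac{P(T^+ \mid D^+)}{P(T^+ \mid D^-)}
  = \frac{\text{sensitivity}}{1 - \text{specificity}}
```

The same argument with a negative test result gives the likelihood ratio for a negative test, (1 - sensitivity) / specificity.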
Stats: Calculations involving diagnostic tests using open source abstracts (October 5, 2006). I spent a few hours reviewing 200+ abstracts published in BiomedCentral that had the words "sensitivity" and "specificity" in the title. There were four which had enough information in the abstract to be used as teaching examples on how to calculate sensitivity, specificity, positive predictive value, and/or negative predictive value.
Stats: A novel diagnostic test (January 26, 2006). A recently published article on diagnosing cancer got a lot of press. The article (Diagnostic Accuracy of Canine Scent Detection in Early- and Late-Stage Lung and Breast Cancers. McCulloch M, Jezierski T, Broffman M, Hubbard A, Kirk Turner, Janecki T. Integrative Cancer Therapies 2006: 5(1); 1-10) noted that canines have an unusually sensitive sense of smell and might be able to diagnose cancer by sniffing breath samples from human patients. This is rather intriguing, since dogs have already been trained to locate explosives, cadavers, drugs, and so forth.
Stats: An error slips through the peer review process (September 19, 2005). A group of residents wanted me to look at an article because they were confused about the calculation of the likelihood ratio. The numbers that they got were quite different from those in the publication. It turns out that they were calculating things correctly, and did not realize that the paper had several serious errors in some of the more fundamental calculations of sensitivity and specificity.
Stats: Likelihood ratio--extra information (August 3, 2005). In a meta-analysis of studies of diagnosing anemia (Guyatt 1992 JGIM 7(2): 145-53), serum ferritin was found to be the most effective test. Here are the results of this test.
Stats: The costs of a false positive test (March 1, 2005). The New York Times had an excellent article on newborn screening tests, Panel to Advise Testing Babies for 29 Diseases. Kolata G. The New York Times, February 21, 2005. Unfortunately, this article is no longer available online. But it discusses a recent push to standardize and expand the screening tests for newborns to include 29 different diseases.
Stats: Spectrum Bias (January 4, 2005). I tried to start a page on diagnostic tests a while back, but have not had the time to fully develop it. One of the important issues for diagnostic tests is spectrum bias. The sensitivity and specificity of a diagnostic test can depend on who exactly is being tested. Think of disease as a range of possibilities from slight to moderate to extreme. If only a portion of the disease range is included, you may get an incorrect impression of how well a diagnostic test works. This is known as spectrum bias.
Stats: Unnecessary diagnostic tests (October 25, 2004). You would think that you can never have enough information about your health. Barring financial considerations, the more testing the better. That actually is not true. In some situations, too many diagnostic tests are being run, and it hurts rather than helps the patient. American Medical News has an article about this, Lab tests go under a critical microscope: Experts point out that good tests used badly can lead to bad medicine. Victoria Stagg Elliott. Nov. 1, 2004. www.ama-assn.org/amednews/2004/11/01/hlsd1101.htm. They offer several good examples.
Stats: Full-Body Computed Tomography Screening (September 6, 2004). Full body scans represent a good example of the conflicting considerations when you need to evaluate a screening test. A full body scan uses a CT (Computerized Tomography) scan to examine the inside of your body. These full body scans are heavily advertised as a way to detect physiologic abnormalities that might provide an early warning of cancer, heart disease, or other illnesses. Many organizations, including the U.S. Food and Drug Administration, strongly discourage the use of full body scans in healthy adults with no obvious symptoms of disease.
Stats: Unbalanced sample sizes for evaluating a diagnostic test (August 5, 2004). I get a lot of questions about unbalanced sample sizes. Quite often the mechanics of the research protocol make it easier to find a lot of patients in one group and only a few in another group. For example, someone is evaluating a diagnostic test and notes that only 16% of the patients in the study will actually have the disease being tested for. Will this cause any bias, he wonders? Any loss in precision? You will lose some precision, but there is no bias of any kind.
Stats: Evaluating the AUC for an ROC curve (July 27, 2004). Someone asked me where I got the following guidance for Area Under the Curve (AUC) for a Receiver Operating Characteristic (ROC) curve: 0.50 to 0.75 = fair, 0.75 to 0.92 = good, 0.92 to 0.97 = very good, 0.97 to 1.00 = excellent. I cannot find where I got these numbers. It must be a sign of senility on my part.
Stats: Pap smears for women without a cervix (June 24, 2004). The most recent issue of JAMA has an article by Sirovich and Welch, Cervical Cancer Screening Among Women Without a Cervix, which estimates that almost 10 million women in the United States have received a pap smear unnecessarily because they have had a full hysterectomy and no longer have a cervix. For women who have had only a partial hysterectomy or where the hysterectomy was done for cervical neoplasia, regular pap smears are recommended. For the other women, though, this is an unnecessary test, because the pap smear is trying to detect cancer in an organ that the woman no longer has.
Stats: Prostate Specific Antigen testing (May 31, 2004). A recent report in the New England Journal of Medicine highlights the continuing controversy over Prostate-Specific Antigen (PSA) testing. This controversy is interesting to me because it highlights the uncertain nature of medical research. Keep in mind that I am not a doctor (read my disclaimer) and if you are confronting this issue with regard to your own health, please discuss this with your doctor. PSA is a test commonly used to detect prostate cancer, and any value larger than 4.0 ng per milliliter is considered by some as cause for additional testing. The article examines prevalence of prostate cancer among men in the control arm of a large randomized prevention trial. Of the 9,459 men in the trial, 2,950 had measured PSA that never exceeded 4.0, and yet 15% of these men had prostate cancer confirmed by biopsy.
Stats: Likelihood ratio slide rule (October 24, 2002). The use of likelihood ratios requires some tedious calculations. I have developed a simple slide rule that will do likelihood ratio calculations for you.
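The tedious part is the conversion between probabilities and odds: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert the result back to a probability. A minimal sketch of that arithmetic follows; the function name and the example values are assumptions made for illustration, and this is not the slide rule code itself.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio using odds."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical patient with a 25% pre-test probability of disease.
print(post_test_probability(0.25, 6.0))   # positive result, assumed LR+ of 6
print(post_test_probability(0.25, 0.1))   # negative result, assumed LR- of 0.1
```

With a pre-test probability of 25% (odds of 1 to 3), a likelihood ratio of 6 gives post-test odds of 2 to 1, or a probability of about 67%.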
Stats: Sample size for a diagnostic study (September 3, 1999). Dear Professor Mean, How big should a study of a diagnostic test be? I want to estimate a sample size for the sensitivity and specificity of a test. I guess confidence intervals would address this, but is there a calculation analogous to a power analysis that would apply to figure out the size of the groups beforehand? -- Jovial John
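One common way to size the groups beforehand is to pick a desired confidence interval half-width and solve the normal-approximation formula n = z^2 p (1 - p) / d^2 for each group. The sketch below assumes that approach; it is not necessarily the method recommended in the full article, and the function name and planning values are hypothetical.

```python
import math
from statistics import NormalDist


def n_for_proportion(expected, half_width, confidence=0.95):
    """Patients needed so a normal-approximation confidence interval for a
    proportion (sensitivity or specificity) has the requested half-width."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil(z * z * expected * (1.0 - expected) / half_width ** 2)


# Assumed planning values: sensitivity near 0.90, specificity near 0.80,
# each to be estimated within plus or minus 0.05.
n_diseased = n_for_proportion(expected=0.90, half_width=0.05)
n_healthy = n_for_proportion(expected=0.80, half_width=0.05)
print(n_diseased, n_healthy)
```

Sensitivity is estimated only from patients who have the disease and specificity only from those who do not, so the two group sizes are computed separately. The normal approximation becomes unreliable when the expected proportion is very close to 0 or 1.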
Stats: ROC curve (August 18, 1999). Dear Professor Mean: I was at a meeting in Belgium and the buzz statistic was ROC Analysis. I think it stands for Receiver Operating Characteristic curve. It seems to be used for predictive values. I seemed to be a lone ranger in not understanding as they were showing in several presentations "by this curve you can see this is good or bad" and they didn't look very different. Do you have a simple explanation about ROC curves?
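In brief, an ROC curve plots the true positive rate (sensitivity) against the false positive rate (1 - specificity) for every possible cutoff of a continuous test. The sketch below builds those points from a small made-up data set and computes the area under the curve with the trapezoidal rule. The function names and data are hypothetical, and higher scores are assumed to indicate disease.

```python
def roc_points(scores, has_disease):
    """Return (false positive rate, true positive rate) pairs, one per cutoff."""
    n_pos = sum(has_disease)
    n_neg = len(has_disease) - n_pos
    points = [(0.0, 0.0)]
    # Lower the cutoff step by step; a score at or above the cutoff is a positive test.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, d in zip(scores, has_disease) if d and s >= cutoff)
        fp = sum(1 for s, d in zip(scores, has_disease) if not d and s >= cutoff)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up test scores and true disease status (1 = diseased, 0 = healthy).
scores = [9, 8, 7, 6, 5, 4, 3, 2]
has_disease = [1, 1, 0, 1, 0, 1, 0, 0]
print(auc(roc_points(scores, has_disease)))
```

For this made-up data the area is 0.8125, which equals the proportion of diseased/healthy pairs in which the diseased patient has the higher score. A test no better than chance gives an area near 0.5, and a perfect test gives 1.0.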
Theme and closely related categories:
This webpage was written by Steve Simon on 2003-09-08, edited by Steve Simon, and was last modified on 2008-07-08. Send feedback to ssimon at cmh dot edu or click on the email link at the top of the page.