Medical diagnosis wizard


To make it clear how Bayes theorem works, you will develop an online medical diagnosis wizard using PHP. This wizard could also have been called a calculator, except that it takes four input steps to supply the prerequisite information, followed by a fifth step to review the result.

The wizard works by asking the user to supply the various pieces of information critical to computing the full posterior probability. The user can examine the posterior distribution to determine which disease hypothesis enjoys the highest probability based on:

  1. The diagnostic test information
  2. The sample data used to estimate the prior and likelihood distributions

Bayes Wizard: Step 1

Step 1 in using Bayes theorem to make a medical diagnosis involves specifying the number of disease alternatives that you will examine along with the number of symptoms or evidence keys. In the generic example you will look at, you will evaluate three disease alternatives based on evidence from two diagnostic tests. Each diagnostic test can only produce a positive or negative result. This means that the total number of symptom combinations, or evidence keys, you can observe is four (++, +-, -+, or --).
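
The count of evidence keys follows from the number of tests: with n tests that each yield a positive or negative result, there are 2^n combinations. As a minimal sketch (the function name evidenceKeys is hypothetical and is not part of the wizard's code), the keys can be enumerated like this:

```php
<?php
/**
 * Enumerate every evidence key for $numTests binary diagnostic tests.
 * Each key is a string of '+' and '-' characters, one character per test.
 * Hypothetical helper for illustration only.
 */
function evidenceKeys(int $numTests): array
{
    $keys = [''];
    for ($i = 0; $i < $numTests; $i++) {
        $next = [];
        foreach ($keys as $prefix) {
            // Extend every partial key with both possible test outcomes.
            $next[] = $prefix . '+';
            $next[] = $prefix . '-';
        }
        $keys = $next;
    }
    return $keys;
}

// Two tests give four keys: ++, +-, -+, --
echo implode(', ', evidenceKeys(2)), "\n";
```

Adding a third test would double the list to eight keys, which is why the number of likelihood values to enter in later steps grows quickly with the number of tests.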

Figure 3. Form to enter disease hypotheses and symptom possibilities

Bayes Wizard: Step 2

Step 2 involves entering the disease and symptom labels. In this case, you are just going to enter d1, d2, and d3 for the disease labels and ++, +-, -+ and -- for the symptom labels. The two symbols used for symptom labels signify whether the results of the two diagnostic tests came out positive or negative.

Figure 4. Form to enter disease and symptom labels

Bayes Wizard: Step 3

Step 3 involves entering the prior probabilities for each disease. You will use the data table below to determine the prior probabilities to enter for step three and the likelihood to enter for step four (this data table originally appeared in Introduction to Probability). Using this example allows you to confirm that the final result you obtain from the wizard agrees with the results you can find in this book.

Figure 5. Joint frequency of diseases and symptoms

The prior probability of each disease is the number of patients diagnosed with that disease divided by the total number of diagnosed cases in this sample. The relevant prior probabilities for each disease are entered in the following form:

Figure 6. Form to enter disease priors
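
The division that produces each prior can be sketched in a few lines of PHP. The counts used here are an assumption: they are intended to match the per-disease totals in the data table (Figure 5), which is not reproduced in this text, so check them against the figure before relying on them. The function name priorsFromCounts is hypothetical and is not part of Bayes.php.

```php
<?php
// Diagnosed cases per disease. ASSUMED values, meant to mirror the
// per-disease totals of the data table in Figure 5.
$diseaseCounts = ['d1' => 3215, 'd2' => 2125, 'd3' => 4660];

/**
 * Convert per-hypothesis frequency counts into prior probabilities.
 * Hypothetical helper for illustration only.
 */
function priorsFromCounts(array $counts): array
{
    $total = array_sum($counts);
    if ($total <= 0) {
        throw new InvalidArgumentException('Counts must sum to a positive number.');
    }
    $priors = [];
    foreach ($counts as $label => $count) {
        // Prior = cases of this disease / all diagnosed cases.
        $priors[$label] = $count / $total;
    }
    return $priors;
}

$priors = priorsFromCounts($diseaseCounts);
foreach ($priors as $label => $p) {
    printf("P(%s) = %.4f\n", $label, $p);
}
```

Because every prior is divided by the same total, the resulting values always sum to 1, which is a quick sanity check on whatever you type into the form.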

You do not have to rely upon a data table such as the previous one to derive the prior probability estimates. In some cases, you can derive prior probabilities by using common-sense reasoning: The prior probability of a fair two-sided coin coming up heads is 0.5. The prior probability of selecting a queen of hearts from a randomized deck of cards is 1/52.

You also commonly run into situations where you initially have no good estimates of what the prior probability of each hypothesis might be. In such cases, it is common to posit noninformative priors. If you have four hypothesis alternatives, then the noninformative prior distribution would be 1/4 or 0.25 for each hypothesis. You might note here that Bayesians often criticize the use of a null hypothesis in significance testing because it amounts to assuming noninformative priors in cases where positing informative priors might be more theoretically or empirically justified.
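
A noninformative prior of this kind simply assigns 1/n to each of n hypotheses. The following is a small sketch under that assumption; the function name uniformPriors and the labels h1 through h4 are made up for illustration.

```php
<?php
/**
 * Build a noninformative (uniform) prior over the given hypothesis labels.
 * Hypothetical helper for illustration only.
 */
function uniformPriors(array $labels): array
{
    $n = count($labels);
    if ($n === 0) {
        throw new InvalidArgumentException('At least one hypothesis is required.');
    }
    // Every hypothesis receives the same probability, 1/n.
    return array_fill_keys($labels, 1 / $n);
}

// Four hypothesis alternatives, each with prior 0.25.
print_r(uniformPriors(['h1', 'h2', 'h3', 'h4']));
```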

A final way to derive estimates of the prior probability of each hypothesis, P(Hi), is through a subjective estimate of what those probabilities might be given everything you have learned about the way the world works up to that point: P(H=h | Everything you know). You will often find Bayesian inference sharing the same bed with a subjective view of probability in which the probability of a proposition is equated with one's subjective degree of belief in the proposition.

What is important in this discussion is that Bayesian inference is a flexible technique that allows you to estimate prior probabilities using objective methods, common-sense logical methods, and subjective methods. When using subjective methods, you must still be willing to defend your prior probability estimates. You may use objective data to help set and justify your subjective estimates, which means that Bayesian inference is not necessarily in conflict with more objectively oriented approaches to statistical inference.

Bayes Wizard: Step 4

The data table provides you with information you can use to compute the probability of the symptoms (like test results) given the disease, also known as the likelihood distribution P(E | H).

To see how the likelihood values entered below were computed, you can unpack P(E|H) using the frequency format for computing conditional probabilities:

P(E | H) = {E & H} / {H}

This tells you that you need to divide a joint frequency count {E & H} by a marginal frequency count {H} to obtain the likelihood value for each cell in your likelihood matrix. The top left cell of your likelihood matrix, P(E='++' | H='d1'), can be immediately computed from the joint and marginal frequency counts appearing in the data table:

P(E='++' | H='d1') = 2110 / 3215 = .6563

All the likelihood values entered in Step 4 were computed in this manner.
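
The same frequency-format calculation can be applied to a whole row of the data table at once. In this sketch the joint counts for d1 are an assumption (they are meant to match Figure 5, which is not reproduced in this text), and the marginal count {H} is obtained by summing the row rather than typing it in separately. The function name likelihoodsGivenHypothesis is hypothetical.

```php
<?php
// Joint counts {E & H} for disease d1, keyed by evidence key.
// ASSUMED values, meant to mirror the d1 row of the table in Figure 5.
$jointCountsD1 = ['++' => 2110, '+-' => 301, '-+' => 704, '--' => 100];

/**
 * Compute P(E | H) for one hypothesis from its joint frequency counts.
 * Hypothetical helper for illustration only.
 */
function likelihoodsGivenHypothesis(array $jointCounts): array
{
    $marginal = array_sum($jointCounts); // {H}: all cases of this disease
    if ($marginal <= 0) {
        throw new InvalidArgumentException('Joint counts must sum to a positive number.');
    }
    $likelihoods = [];
    foreach ($jointCounts as $evidenceKey => $count) {
        // P(E | H) = {E & H} / {H}
        $likelihoods[$evidenceKey] = $count / $marginal;
    }
    return $likelihoods;
}

$likelihoodsD1 = likelihoodsGivenHypothesis($jointCountsD1);
foreach ($likelihoodsD1 as $evidenceKey => $p) {
    printf("P(E='%s' | H='d1') = %.4f\n", $evidenceKey, $p);
}
```

Deriving the marginal from the row guards against a mistyped denominator, since the four likelihoods for one disease must then sum to 1.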

Figure 7. Form to enter likelihood of symptoms given the disease

It should be noted that many statisticians use likelihood as a system of inference instead of, or in addition to, Bayesian inference. This is because likelihoods also provide a metric one can use to evaluate the relative degree of support for several hypotheses given the data.

In the previous example, you can see that the probability of a particular evidence key varies for each hypothesis under consideration. The probability of the ++ evidence key is the greatest for the d1 hypothesis. You can assess which hypothesis is best supported by the data by:

  1. Examining the likelihood of the evidence key given each hypothesis key
  2. Selecting the hypothesis that maximizes the likelihood of the evidence key

Doing so would be an example of inference according to the principle of maximum likelihood.
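
The two steps above can be sketched as a small selection routine. The likelihood values for the ++ evidence key are written as count ratios; those counts are an assumption intended to match Figure 5 and should be checked against it. The function name maximumLikelihoodHypothesis is hypothetical.

```php
<?php
// P(E='++' | H) for each disease, expressed as joint count / marginal count.
// ASSUMED counts, meant to mirror the data table in Figure 5.
$likelihoodOfPlusPlus = [
    'd1' => 2110 / 3215,
    'd2' => 396 / 2125,
    'd3' => 510 / 4660,
];

/**
 * Return the hypothesis label whose likelihood for the observed
 * evidence key is largest. Hypothetical helper for illustration only.
 */
function maximumLikelihoodHypothesis(array $likelihoods): string
{
    if ($likelihoods === []) {
        throw new InvalidArgumentException('No hypotheses supplied.');
    }
    arsort($likelihoods); // sort by value, descending, keeping the labels
    return (string) array_key_first($likelihoods);
}

echo maximumLikelihoodHypothesis($likelihoodOfPlusPlus), "\n";
```

Note that this selection ignores the priors entirely; it ranks hypotheses only by how probable each one makes the observed evidence.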

Another interesting point to note is that all the values in the above likelihood distribution sum to a value greater than 1. What this means is that the likelihood distribution is not really a probability distribution because it lacks the defining property that the distribution of values sum to 1. This summation property is not essential for the purposes of evaluating the relative support for different hypotheses. What is important for this purpose is that the "likelihood supplies a natural order of preference among the possibilities under consideration" (from R.A. Fisher's Statistical Methods and Scientific Inference, p. 68).
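
One way to check the summation property is to build the likelihood matrix and add up its values. This is only a sketch: the joint counts are an assumption intended to match Figure 5 (not reproduced in this text), and the variable names are made up for illustration.

```php
<?php
// Joint counts {E & H}, one row per disease. ASSUMED values, meant to
// mirror the data table in Figure 5.
$jointCounts = [
    'd1' => ['++' => 2110, '+-' => 301,  '-+' => 704,  '--' => 100],
    'd2' => ['++' => 396,  '+-' => 132,  '-+' => 1187, '--' => 410],
    'd3' => ['++' => 510,  '+-' => 3568, '-+' => 73,   '--' => 509],
];

// Likelihood matrix: $likelihoods[disease][evidenceKey] = P(E | H).
$likelihoods = [];
foreach ($jointCounts as $disease => $row) {
    $marginal = array_sum($row);
    foreach ($row as $evidenceKey => $count) {
        $likelihoods[$disease][$evidenceKey] = $count / $marginal;
    }
}

// Sum over the whole matrix: one unit of probability per disease.
$matrixTotal = 0.0;
foreach ($likelihoods as $row) {
    $matrixTotal += array_sum($row);
}

// Sum over diseases for a single evidence key: not constrained to equal 1.
$plusPlusTotal = array_sum(array_column($likelihoods, '++'));

printf("Sum of all likelihood values: %.4f\n", $matrixTotal);
printf("Sum of P(E='++' | H) over diseases: %.4f\n", $plusPlusTotal);
```

With three diseases, the matrix as a whole sums to 3, because the likelihoods for each individual disease sum to 1. The values for a single evidence key taken across diseases are what you compare when ranking hypotheses, and nothing forces those to sum to 1.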

You may not understand fully the concept of likelihood from this brief discussion, but I do hope that you appreciate its importance to the overall Bayes theorem calculation and its importance as the foundation for another system of inference. The likelihood system of inference is preferred by many statisticians because you don't have to resort to the dubious practice of trying to estimate the prior probability of each hypothesis.

Maximum likelihood estimators also have many desirable mathematical properties that make them nice to work with (the properties often cited include transitivity, additivity, a lack of bias in large samples, and invariance under transformations, among others). For these reasons, it is often a good idea to closely examine your likelihood distribution in addition to your posterior distribution when making inferences from your data.

Bayes Wizard: Step 5

The final step of the process involves displaying the posterior distribution of the diseases given the symptoms P(H | E):

Figure 8. Probability of each disease given symptoms

The section of the script that was used to compute and display the posterior distribution looks like this:

Listing 4. Computing and displaying the posterior distribution
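<?php
// Bayes.php defines the Bayes class used below.
include "Bayes.php";

// Values submitted by the earlier wizard steps.
$disease_labels = $_POST["disease_labels"];
$symptom_labels = $_POST["symptom_labels"];
$priors         = $_POST["priors"];
$likelihoods    = $_POST["likelihoods"];

// Pass in the priors and likelihoods, then compute the posterior P(H | E).
$bayes = new Bayes($priors, $likelihoods);
$bayes->getPosterior();

// Label the result table and write it to the browser.
$bayes->setRowLabels($symptom_labels);    // aka evidence labels
$bayes->setColumnLabels($disease_labels); // aka hypothesis labels
$bayes->toHTML();
?>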

You begin by passing the priors and likelihoods obtained from the previous wizard steps to the Bayes constructor. Using this information, you compute the posterior with the $bayes->getPosterior() method. To output the posterior distribution to the browser, you first set the row and column labels to display, then output the posterior distribution using the $bayes->toHTML() method.
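
To see what a posterior calculation of this kind involves, here is a standalone sketch of Bayes theorem applied to a prior vector and a likelihood matrix. It is not the Bayes.php class, whose internals are not shown on this page; the function name posteriorFromPriorsAndLikelihoods is hypothetical, and the counts are an assumption intended to match Figure 5.

```php
<?php
/**
 * Compute P(H | E) for every evidence key and hypothesis.
 * $priors[h] = P(H=h); $likelihoods[h][e] = P(E=e | H=h).
 * Returns $posterior[e][h]. Hypothetical helper, not the Bayes.php API.
 */
function posteriorFromPriorsAndLikelihoods(array $priors, array $likelihoods): array
{
    $posterior = [];
    foreach ($likelihoods as $h => $row) {
        foreach ($row as $e => $pEvidenceGivenH) {
            // Numerator of Bayes theorem: P(E | H) * P(H).
            $posterior[$e][$h] = $pEvidenceGivenH * $priors[$h];
        }
    }
    foreach ($posterior as $e => $joint) {
        $pEvidence = array_sum($joint); // P(E): sum of the numerators over H
        foreach ($joint as $h => $value) {
            $posterior[$e][$h] = $pEvidence > 0 ? $value / $pEvidence : 0.0;
        }
    }
    return $posterior;
}

// ASSUMED joint counts, meant to mirror the data table in Figure 5.
$jointCounts = [
    'd1' => ['++' => 2110, '+-' => 301,  '-+' => 704,  '--' => 100],
    'd2' => ['++' => 396,  '+-' => 132,  '-+' => 1187, '--' => 410],
    'd3' => ['++' => 510,  '+-' => 3568, '-+' => 73,   '--' => 509],
];

// Derive the priors and likelihoods from the counts.
$grandTotal = 0;
foreach ($jointCounts as $row) {
    $grandTotal += array_sum($row);
}
$priors = [];
$likelihoods = [];
foreach ($jointCounts as $h => $row) {
    $marginal = array_sum($row);
    $priors[$h] = $marginal / $grandTotal;
    foreach ($row as $e => $count) {
        $likelihoods[$h][$e] = $count / $marginal;
    }
}

$posterior = posteriorFromPriorsAndLikelihoods($priors, $likelihoods);
foreach ($posterior as $e => $row) {
    foreach ($row as $h => $p) {
        printf("P(H='%s' | E='%s') = %.3f\n", $h, $e, $p);
    }
}
```

Each row of the result, one per evidence key, sums to 1, so unlike the likelihood matrix it is a proper probability distribution over the diseases.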




The content shown in this page was first published by IBM developerWorks and is reprinted with permission from Paul Meagher (www.datavore.com)

