Application of Information Technology in Medical Sciences


MEDICAL DECISION MAKING

Dr. Ali M. Hadianfard

(Medical Informatics)

Faculty member of AJUMS (paramedical school) http://www.alihadianfard.info/download.html

Further reading

• Biomedical Informatics: Computer Applications in Health Care and Biomedicine, Edward H. Shortliffe, James J. Cimino, 4th Ed., 2014 (chapters 3 & 22).
• Decision Support Systems and Intelligent Systems, Efraim Turban, Jay E. Aronson, Ting-Peng Liang, 7th Ed., 2004 (chapter 2).
• Medical Decision Making, Harold C. Sox, Michael C. Higgins, Douglas K. Owens, 2nd Ed., 2013 (chapters 3, 5 & 6).
• From Patient Data to Medical Knowledge: The Principles and Practice of Health Informatics, Paul Taylor, 2006 (chapter 10).
• Clinical Decision Support Systems: Theory and Practice (Health Informatics), Eta S. Berner, 2nd Ed., 2006 (chapters 1 & 2).
• Fuzzy Control and Identification, John H. Lilly, 2010 (chapters 1 & 2).

What is Medical Informatics?

The study of applying computer technology to manage medical information in order to affect medical care and support problem-solving and decision-making.

Decision-making

Decision-making can be regarded as the cognitive process (the thought process) of selecting a logical choice from the available alternatives. Common examples include shopping, deciding what to eat, and deciding whom or what to vote for in an election or referendum.

Problem-solving

A problem occurs when a system does not meet its established goals, does not yield the predicted results, or does not work as planned. Problem-solving may also deal with identifying new opportunities.

Sometimes the terms decision-making and problem-solving are used interchangeably.

Decision-making process (1)

Simon's model (1977) is the most concise and yet complete characterization of rational decision-making. It involves three major phases: Intelligence, Design, and Choice. He later added a fourth phase, Implementation. Monitoring can be considered a fifth phase.

Decision-making process (2)

Intelligence phase:

• Scanning the environment
• Data collection (objectives)
• Problem identification
• Problem ownership (at the national or international level; a problem exists in an organization only if someone or some group takes on the responsibility of attacking it and if the organization has the ability to solve it)
• Problem classification (definable category: well-structured problems, unstructured problems)
• Problem statement

Decision-making process (3)

Design phase:

This phase includes understanding the problem and testing solutions for feasibility. A model of the decision-making problem is constructed, tested, and validated.

• Formulate a model based on the relationships among all the variables. (Modeling involves conceptualizing the problem and abstracting it to quantitative and/or qualitative form. For a mathematical model, the variables are identified and their mutual relationships are established.)
• Validate the model
• Set criteria for choice
• Search for alternatives
• Predict and measure outcomes

Decision-making process (4)

Choice phase:

• Selection of a proposed solution to the model
• Selection of the best (good) alternative(s)
• Plan for implementation
• The solution is tested to determine its viability and profitability

The boundary between the design and choice phases is often unclear because certain activities can be performed during both of them and because one can return frequently from choice activities to design activities. For example, one can generate new alternatives while performing an evaluation of existing ones.

Decision-making process (5)

Implementation phase:

Once the proposed solution seems reasonable, we are ready for the last phase: implementation of the decision.

Implementation means putting a recommended solution to work. Successful implementation results in solving the real problem. Failure leads to a return to an earlier phase of the process. In fact, we can return to an earlier phase during any of the latter three phases.

Decision-making under different conditions

Certainty:

• There is perfect knowledge of all the information needed to make a decision.
• Problems are structured.
• Solutions are already available from past experience.

Risk:

• Information is incomplete.
• The problem and the alternatives are defined, but there is no guarantee how each solution will work.
• It is feasible to make a list of all possible outcomes and assign probabilities to the various outcomes.

Uncertainty:

• Information is very poor.
• Problems are unstructured.
• The decision maker cannot list all possible outcomes and/or cannot assign probabilities to the various outcomes.

Medical Decision Making is under Uncertainty

Decision making is one of the quintessential activities of the healthcare professional. Some decisions are made on the basis of deductive reasoning (reasoning from general principles to a specific conclusion; antonym: inductive) or of physiological principles. Many decisions, however, are made on the basis of knowledge that has been gained through collective experience: the clinician often must rely on empirical knowledge of associations between symptoms and disease to evaluate a problem.

A decision that is based on these usually imperfect associations will be, to some degree, uncertain.

Clinical data are imperfect. The degree of imperfection varies, but all clinical data, including the results of diagnostic tests, the history given by the patient, and the findings on physical examination, are uncertain.

Example: uncertain condition

Mr. James is a 59-year-old man with coronary artery disease. The patient often experiences chest pain (angina). Mr. James has twice undergone coronary artery bypass graft (CABG) surgery. Unfortunately, he has again begun to have chest pain, which becomes progressively more severe, despite medication. If the heart muscle is deprived of oxygen, the result can be a heart attack (myocardial infarction), in which a section of the muscle dies.

Should Mr. James undergo a third operation?

The medications are not working; without surgery, he runs a high risk of suffering a heart attack, which may be fatal. On the other hand, the surgery is hazardous. Not only is the surgical mortality rate for a third operation higher than that for a first or second one, but the chance that surgery will relieve the chest pain is also lower than that for a first operation. All choices in the example entail considerable uncertainty. Furthermore, the risks are grave; an incorrect decision may substantially increase the chance that Mr. James will die. The decision will be difficult even for experienced clinicians.

The use of probability or odds as an expression of uncertainty avoids the ambiguities inherent in common descriptive terms.

Probability

Probability is represented numerically by a number between 0 and 1. Statements with a probability of 0 are false. Statements with a probability of 1 are true. An event that is certain to occur has a probability of 1; an event that is certain not to occur has a probability of 0. Statements with a probability of 0.5 (50%) are just as likely to be true as false. The probability of event A is written p[A]. The sum of the probabilities of all possible, mutually exclusive and collectively exhaustive outcomes of a chance event must equal 1, e.g., p[heads] + p[tails] = 1.0.

Probabilities can be combined to yield new probabilities.

and (for independent events A and B):

p[A ∩ B] = p[A] × p[B]

or (for mutually exclusive events A and B):

p[A ∪ B] = p[A] + p[B]

Conditional Probability

The probability that event A will occur given that event B is known to occur is called the conditional probability of event A given event B, denoted by p[A|B] and read as “the probability of A given B.” Thus a post-test probability is a conditional probability predicated on the test or finding. For example, if 30% of patients who have a swollen leg have a blood clot, we say the probability of a blood clot given a swollen leg is 0.3, denoted: p[blood clot | swollen leg] = 0.3.

p[A ∩ B] = p[A] × p[B|A] (the multiplication rule, from which Bayes’ theorem is derived)
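As a minimal sketch of how the multiplication rule leads to Bayes’ theorem, the Python snippet below reuses the swollen-leg figure of 0.3 from the example above. The values for p[swollen leg] and p[blood clot] are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def joint_probability(p_a: float, p_b_given_a: float) -> float:
    """Multiplication rule: p[A ∩ B] = p[A] × p[B|A]."""
    return p_a * p_b_given_a


def bayes(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Bayes' theorem: p[A|B] = p[B|A] × p[A] / p[B]."""
    return p_b_given_a * p_a / p_b


p_clot_given_swollen = 0.30  # from the example in the text
p_swollen = 0.10             # hypothetical: share of patients with a swollen leg
p_clot = 0.05                # hypothetical: share of patients with a blood clot

# Probability that a patient has both a swollen leg and a blood clot.
p_both = joint_probability(p_swollen, p_clot_given_swollen)

# Reverse the conditioning: probability of a swollen leg given a blood clot.
p_swollen_given_clot = bayes(p_clot_given_swollen, p_swollen, p_clot)

print(f"p[swollen leg and blood clot] = {p_both:.3f}")
print(f"p[swollen leg | blood clot]   = {p_swollen_given_clot:.2f}")
```

With these assumed inputs, the joint probability is 0.03 and the reversed conditional probability is 0.6.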

Odds

The odds of an event are the ratio of the probability of the event occurring to the probability of the event not occurring. Odds and probability are equivalent ways of expressing uncertainty. The relationship between the odds of an event and its probability is the following:

odds = p / (1 − p)

where p is the probability that the event will occur. For example, if the probability of the event is 0.67, the odds of the event are 0.67 divided by 0.33, or about 2 to 1. Another way to express the odds of an event is p:(1 − p). Thus, writing 2:1 is equivalent to saying “2 to 1 odds.” Some find it especially useful to use odds to express their opinion about very infrequent events (1 to 99 odds, rather than a probability of 0.01) or very common events (99 to 1 odds, rather than a probability of 0.99).
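The conversion between probability and odds can be sketched in a few lines of Python. The function names are illustrative and not taken from any library.

```python
def probability_to_odds(p: float) -> float:
    """Convert a probability (0 <= p < 1) to odds in favor: p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


def odds_to_probability(odds: float) -> float:
    """Convert odds in favor (>= 0) back to a probability: odds / (1 + odds)."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


# A probability of 0.67 corresponds to odds of roughly 2 to 1.
print(probability_to_odds(0.67))

# Odds of 1 to 99 correspond to a probability of 0.01.
print(odds_to_probability(1 / 99))
```

Converting in one direction and then back returns the starting value, which is a quick check that the two formulas are consistent.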

Probability Assessment

Probability assessment means asking a person to use a number to express how strongly he or she believes that an event will occur.

• When to estimate probability: If the probability of a disease is very low, doing nothing will be the best choice. Treating without further testing is the best choice if the probability of the target condition is relatively high. Testing (getting more information) is best when the probability of disease is intermediate.

• How to estimate probability:
  - Subjective Probability Assessment
  - Objective Probability Estimates
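The three-way choice described under "When to estimate probability" can be sketched as a simple threshold rule. The two cut-off values below are hypothetical placeholders; the slides do not give numbers, and in practice the thresholds depend on the disease, the test, and the treatment.

```python
def recommend_action(p_disease: float,
                     test_threshold: float = 0.10,    # hypothetical cut-off
                     treat_threshold: float = 0.80    # hypothetical cut-off
                     ) -> str:
    """Map an estimated disease probability to one of three actions."""
    if not 0.0 <= p_disease <= 1.0:
        raise ValueError("p_disease must be between 0 and 1")
    if p_disease < test_threshold:
        return "do nothing"                      # probability very low
    if p_disease > treat_threshold:
        return "treat without further testing"   # probability relatively high
    return "test"                                # probability intermediate


for p in (0.01, 0.40, 0.95):
    print(f"p = {p:.2f}: {recommend_action(p)}")
```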

Probability Assessment - The source of information

1. Personal experience: When estimating probability, a clinician relies on personal experience with similar events. For example, a surgeon uses her experience with similar patients when she estimates the probability that Mr. Jones will survive an open heart operation.

2. Published experience: Published articles report the frequency of death after surgical procedures. These reports provide an average frequency for a large but not necessarily diverse population, raising questions about its applicability to a specific population.

3. Attributes of the patient: The experienced clinician uses published reports and personal experience to make an estimate that applies to the average patient. He or she then adjusts the estimate upward or downward from this average figure if the patient has unusual characteristics that might affect the patient's risk (e.g., advanced age or many chronic conditions).

Subjective Probability Assessment (1)

It is based on personal experience. It relies on unconscious mental processes that have been described and studied by cognitive psychologists. These processes are termed cognitive heuristics. A cognitive heuristic is a mental process by which we learn, recall, or process information; we can think of heuristics as rules of thumb (guides or principles based on experience or practice). We may make mistakes in estimating probability in deceptive clinical situations.

Subjective Probability Assessment (2)

Three heuristics have been identified as important in estimation of probability:

1) Representativeness. The probability that A belongs to class B is judged by the degree to which A is representative of, or similar to, B. For instance, what is the probability that this patient who has a swollen leg belongs to the class of patients who have blood clots? If the patient has all the classic findings (signs and symptoms) associated with a blood clot, the clinician judges that the patient is highly likely to have a blood clot. Difficulties occur with the use of this heuristic when the disease is rare (very low prior probability, or prevalence); when the clinician’s previous experience with the disease is atypical, thus giving an incorrect mental representation; when the patient’s clinical profile is atypical; and when the probability of certain findings depends on whether other findings are present.

(More examples can be found in Medical Decision Making, Harold C. Sox, Michael C. Higgins, Douglas K. Owens, 2nd Ed., 2013, pp. 38-44.)

Subjective Probability Assessment (3)

2) Availability. Our estimate of the probability of an event is influenced by the ease with which we remember similar events. Events more easily remembered are judged more probable; this rule is the availability heuristic, and it is often misleading. We remember dramatic, atypical, or emotion-laden events more easily and therefore are likely to overestimate their probability. A clinician who had cared for a patient who had a swollen leg and who then died from a blood clot would vividly remember thrombosis as a cause of a swollen leg. The clinician would remember other causes of swollen legs less easily, and he or she would tend to overestimate the probability of a blood clot in patients with a swollen leg.

3) Anchoring and adjustment. A clinician makes an initial probability estimate (the anchor) and then adjusts the estimate based on further information. For instance, the clinician makes an initial estimate of the probability of heart disease as 0.5. If he or she then learns that all the patient’s brothers had died of heart disease, the clinician should raise the estimate because the patient’s strong family history of heart disease increases the probability that he or she has heart disease, a fact the clinician could ascertain from the literature. The usual mistake is to adjust the initial estimate (the anchor) insufficiently in light of the new information.

Objective Probability Estimates (1)

Published research results can serve as a guide for more objective estimates of probabilities. We can use the prevalence of disease in the population or in a subgroup of the population, or clinical prediction rules, to estimate the probability of disease. Estimates of disease prevalence in a defined population often are available in the medical literature.

The prevalence of a disease in patients who have in common a symptom, physical finding, or diagnostic test result helps a clinician to diagnose the disease.

Symptoms, such as difficulty with urination, or signs, such as a palpable prostate nodule, can be used to place patients into a clinical subgroup in which the probability of disease is known. This approach may be limited by difficulty in placing a patient in the correct clinically defined subgroup, especially if the criteria for classifying patients are ill-defined. A trend has been to develop guidelines, known as clinical prediction rules, to help clinicians assign patients to well-defined subgroups in which the probability of disease is known.

Objective Probability Estimates (2) – Example

A medical student evaluates a young man with abdominal pain. She is concerned about the possibility of appendicitis. The pain is present throughout the abdomen and is associated with loose bowel movements. The patient does not have localized abdominal tenderness, fever, or an increased blood leukocyte count. The medical student presents the patient to the chief surgical resident who, to the student’s surprise, discharges the patient from the emergency room.

The chief surgical resident knows that the prevalence of appendicitis among self-referred adult males with abdominal pain is only 1%. The student should use this information as a starting point as she uses the patient’s clinical findings to estimate the probability of appendicitis. If the history and physical examination do not suggest appendicitis, the probability of appendicitis is very low, since it was 1% in the average patient with abdominal pain. If the examination does suggest appendicitis, the student’s estimate of probability must reflect the low prevalence of appendicitis in all men with abdominal pain.

Objective Probability Estimates (3)

Clinical prediction rules are developed from systematic study of patients who have a particular diagnostic problem; they define how clinicians can use combinations of clinical findings to estimate probability. Clinical prediction rules are statistical models of the diagnostic process. The symptoms or signs that make an independent contribution to the probability that a patient has a disease are identified and assigned numerical weights based on statistical analysis of the finding’s contribution (regression analysis assigns a numerical weight to each predictor). The result is a list of symptoms and signs for an individual patient, each with a corresponding numerical contribution to a total score (discriminant score). The total score places a patient in a subgroup with a known probability of disease.

Clinical prediction rules – Example

Ms. Troy, a 65-year-old woman who had a heart attack 4 months ago, has an abnormal heart rhythm (arrhythmia), is in poor medical condition, and is about to undergo elective surgery.

What is the probability that Ms. Troy will suffer a cardiac complication? Table 3.1 lists clinical findings and their corresponding diagnostic weights. We add the diagnostic weights for each of the patient’s clinical findings to obtain the total score. The total score places the patient in a group with a defined probability of cardiac complications, as shown in Table 3.2. Ms. Troy receives a score of 20; thus, the clinician can estimate that the patient has a 27 % chance of developing a severe cardiac complication.
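The scoring procedure can be sketched in Python as follows. Tables 3.1 and 3.2 are not reproduced in these slides, so the weights and the score band below are hypothetical stand-ins chosen only so that the sketch reproduces the score of 20 and the 27% figure quoted above; they should not be read as the published values.

```python
from typing import Iterable, Optional

# Hypothetical diagnostic weights standing in for Table 3.1.
WEIGHTS = {
    "heart attack within 6 months": 10,
    "arrhythmia": 5,
    "poor medical condition": 5,
}

# Hypothetical score band standing in for Table 3.2. Only the 27% figure
# comes from the text; the band limits are assumptions.
RISK_BANDS = [
    (20, 30, 0.27),  # (lowest score, highest score, probability of complication)
]


def total_score(findings: Iterable[str]) -> int:
    """Add the diagnostic weights of the patient's clinical findings."""
    return sum(WEIGHTS[finding] for finding in findings)


def risk_for_score(score: int) -> Optional[float]:
    """Look up the probability for a score; None if no band covers it."""
    for low, high, probability in RISK_BANDS:
        if low <= score <= high:
            return probability
    return None


ms_troy = ["heart attack within 6 months", "arrhythmia", "poor medical condition"]
score = total_score(ms_troy)
print(score, risk_for_score(score))
```

Returning None for scores outside the listed band keeps the sketch from inventing probabilities that the slides do not supply.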

Recursive partitioning

Recursive partitioning is a statistical process that leads to an algorithm for classifying patients. In recursive partitioning, the diagnostic process is represented by a series of yes–no decision points. If a patient has a finding, he is placed in one group; if not, he is placed in a second group. Each of the two groups resulting from the first yes–no decision point is subjected to a second yes–no question about another finding. The process continues until it reaches a pre-defined stopping point. The goal of the process is to place each patient into a group in which the prevalence of disease is either very high or very low. Typically, the finding that is used at each yes–no decision point is the one that best discriminates between the diseased and non-diseased patients at that point in the partitioning process.
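A toy version of this process is sketched below in Python. The six patients and the two findings are made-up illustration data, and "best discriminates" is approximated by the largest difference in disease prevalence between the yes and no groups, which is a simplification rather than the statistical criterion used in published studies.

```python
from typing import Dict, List, Optional, Tuple

Patient = Tuple[Dict[str, bool], bool]  # (findings, has_disease)


def prevalence(patients: List[Patient]) -> float:
    """Fraction of patients in the group who have the disease."""
    return sum(has_disease for _, has_disease in patients) / len(patients)


def best_finding(patients: List[Patient], findings: List[str]) -> Optional[str]:
    """Pick the finding whose yes/no groups differ most in prevalence."""
    best, best_gap = None, 0.0
    for finding in findings:
        yes = [p for p in patients if p[0][finding]]
        no = [p for p in patients if not p[0][finding]]
        if not yes or not no:
            continue  # this finding does not split the group
        gap = abs(prevalence(yes) - prevalence(no))
        if gap > best_gap:
            best, best_gap = finding, gap
    return best


def partition(patients: List[Patient], findings: List[str]) -> dict:
    """Split recursively until a group is pure or no finding helps."""
    p = prevalence(patients)
    finding = best_finding(patients, findings) if 0.0 < p < 1.0 else None
    if finding is None:
        return {"prevalence": p, "n": len(patients)}
    remaining = [f for f in findings if f != finding]
    yes = [pt for pt in patients if pt[0][finding]]
    no = [pt for pt in patients if not pt[0][finding]]
    return {"finding": finding,
            "yes": partition(yes, remaining),
            "no": partition(no, remaining)}


# Made-up patients: (findings, has_disease)
patients = [
    ({"fever": True, "exudate": True}, True),
    ({"fever": True, "exudate": True}, True),
    ({"fever": True, "exudate": False}, False),
    ({"fever": False, "exudate": True}, True),
    ({"fever": False, "exudate": False}, False),
    ({"fever": False, "exudate": False}, False),
]
tree = partition(patients, ["fever", "exudate"])
print(tree)
```

In this toy data set the first split is on "exudate", because it separates the diseased from the non-diseased patients completely, so both resulting groups are already at a stopping point.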

Recursive partitioning – Example

A modified version of recursive partitioning was used to categorize adults with a sore throat as having a high, medium, or low probability of having a beta hemolytic streptococcal infection (Figure 3.8). According to some authors, patients with a high likelihood of infection should have treatment without obtaining a throat culture. In patients with a low likelihood of infection, neither throat culture nor treatment may be indicated.

Decision Tree


Decision Tree

The decision tree is a method for representing and comparing the expected outcomes of each decision alternative. It is one way to display an algorithm. It can be used when the outcomes are uncertain, e.g., when the results of a surgical operation are unknown. This technique helps clinicians to clarify the decision problem and thus to choose the alternative that is most likely to help the patient.

Example: There are two available therapies for a fatal illness. The length of a patient’s life after either therapy is unpredictable, as illustrated by the frequency distribution shown in Fig. 3.6 and summarized in Table 3.5. Each therapy is associated with uncertainty: regardless of which therapy a patient receives, he will die by the end of the fourth year, but there is no way to know which year will be the patient’s last.

Figure 3.6 shows that survival until the fourth year is more likely with therapy B, but the patient might die in the first year with therapy B or might survive to the fourth year with therapy A. Which of the two therapies is preferable?

Decision tree - component

A decision tree diagram consists of 3 types of nodes:

1) Decision nodes - commonly represented by squares
2) Chance nodes - represented by circles
3) End (terminal) nodes or Outcomes - represented by triangles or solid circles

A chance node is shown as a circle from which several lines emanate. Each line represents one of the possible outcomes.

An example of decision tree

The example of decision tree – A chance node

The outcome of a chance event, unknowable for the individual, can be represented by the expected value at the chance node.

Decision Tree – Expected-value Decision Making

Calculating Expected Value:

The term expected value is used to characterize a chance event, such as the outcome of a therapy. If the outcomes of a therapy are measured in units of duration of survival, units of sense of well-being, or dollars, the therapy is characterized by the expected duration of survival, expected sense of well-being, or expected monetary cost that it will confer on, or incur for, the patient, respectively. To use expected-value decision making, we follow this strategy when there are therapy choices with uncertain outcomes: (1) calculate the expected value of each decision alternative and then (2) pick the alternative with the highest expected value.

Calculating Expected Value – Example 1

We use the average duration of life after therapy (survival) as a criterion for choosing among therapies. The first step we take in calculating the mean survival for a therapy is to divide the population receiving the therapy into groups of patients who have similar survival rates. Then, we multiply the survival time in each group by the fraction of the total population in that group. Finally, we sum these products over all possible survival values.

Mean survival for therapy A = (0.2 × 1) + (0.4 × 2) + (0.3 × 3) + (0.1 × 4) = 2.3 years.

Mean survival for therapy B = (0.05 × 1) + (0.15 × 2) + (0.45 × 3) + (0.35 × 4) = 3.1 years.

Because therapy B has the higher mean survival, we should select therapy B.
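The same calculation can be written as a short Python sketch. The survival fractions are the ones used in Example 1; the function and variable names are illustrative.

```python
from typing import Dict


def expected_value(distribution: Dict[float, float]) -> float:
    """Sum of value × probability over all outcomes of a chance event."""
    if abs(sum(distribution.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(value * probability for value, probability in distribution.items())


# Years of survival -> fraction of patients (from Example 1).
therapies = {
    "A": {1: 0.20, 2: 0.40, 3: 0.30, 4: 0.10},
    "B": {1: 0.05, 2: 0.15, 3: 0.45, 4: 0.35},
}

mean_survival = {name: expected_value(dist) for name, dist in therapies.items()}
best = max(mean_survival, key=mean_survival.get)

for name, years in mean_survival.items():
    print(f"Therapy {name}: {years:.1f} years")
print(f"Preferred by expected value: therapy {best}")
```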

Calculating Expected Value – Example 2

The person who faced this decision was a 60-year-old man we will call ‘‘Hank.’’ Hank had a history of eczema. Because of this chronic condition, he was unconcerned when a rash first appeared near his anus. However, the persistent discomfort eventually led Hank to seek medical attention. His dermatologist performed a biopsy, which showed that a rare skin cancer, called perianal Paget’s disease , was causing Hank’s newly discovered rash. This disease starts in the epidermis but often will metastasize. Therefore, the dermatologist referred Hank to an oncologist for treatment.

First treatment alternative – traditional surgery: In this case Hank would lose the function of his rectum and be forced to live the remainder of his life with a colostomy bag.

Second treatment alternative – microscopically directed surgery: the resections might stop short of the anal mucosa, thereby avoiding the risk of a colostomy.

Third treatment alternative – do nothing: This alternative would leave him with untreated local disease, which ultimately could result in an invasive cancer, metastases, and death.

Expected value analysis captures only a single aspect of what can happen to Hank: the length of his life. It ignores the impact of the decision on other factors, such as his quality of life.

Sensitivity Analysis

Sensitivity analysis is the systematic exploration of how the value of one or more parameters will affect the decision-making implications of a model. This tool can support the validity of decision analysis by revealing how changes in the probabilities will affect the conclusions of the analysis.
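A one-way sensitivity analysis can be sketched in Python by varying a single probability and observing where the preferred alternative changes. The scenario and every number in it (10 years of life after successful surgery, 6.5 years without surgery) are hypothetical and serve only to show the mechanics.

```python
def expected_years_with_surgery(p_operative_death: float,
                                years_if_survive: float = 10.0) -> float:
    """Expected survival when surgery carries a risk of operative death."""
    return (1.0 - p_operative_death) * years_if_survive


def preferred_option(p_operative_death: float,
                     years_without_surgery: float = 6.5) -> str:
    """Choose the alternative with the higher expected survival."""
    ev_surgery = expected_years_with_surgery(p_operative_death)
    return "surgery" if ev_surgery > years_without_surgery else "no surgery"


# Vary the operative mortality from 0.0 to 0.5 in steps of 0.1.
for tenths in range(0, 6):
    p = tenths / 10
    ev = expected_years_with_surgery(p)
    print(f"p(death) = {p:.1f}: EV(surgery) = {ev:.1f} years -> {preferred_option(p)}")
```

Under these assumptions the recommendation switches from surgery to no surgery somewhere between an operative mortality of 0.3 and 0.4, so the conclusion would be sensitive to that parameter if its true value were uncertain within that range.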

Calculating Expected Value – Example 3 (1)

Life Expectancy = 20 years

Calculating Expected Value – Example 3 (2)

Corneal edema: (0.3 * 0.7 * 0.003) + (0.7 * 0.7 * 0.003) = 0.0021

Lens dislocation: (0.85 * 0.78 * 0.037) + (0.15 * 0.78 * 0.037) = 0.0288

Opacified PC (Posterior Capsule): (0.0032 * 0.73 * 0.933) + (0.9968 * 0.86 * 0.933) = 0.80199

Retinal detachment: 0.027 * 0.73 = 0.01971

Complications after 3 months: 0.0021 + 0.0288 + 0.80199 + 0.01971 = 0.8526

Surgery: (0.3 * 0.8526) + (0.7 * 0.86) = 0.8578

Life expectancy with surgery: 0.8578 * 20 = 17.16 years

Life expectancy without surgery: 0.71 * 20 = 14.2 years

Decision: 17.16 > 14.2 ===> SURGERY

Folding Back: evaluating a decision tree when the problem includes more than one decision node

Algorithm for folding back a decision tree

1. Start with the most distal nodes.

2. Replace each chance node with its expected value p1y1 + p2y2 + p3y3 + . . . + pNyN, where p1, p2, p3, . . . , pN are the probabilities for the possible outcomes and y1, y2, y3, . . . , yN are the corresponding values associated with the outcomes.

3. Replace each decision node with the maximum expected value for the possible alternatives, that is, the maximum of x1, x2, x3, . . . , xM, where x1, x2, x3, . . . , xM are the expected values for the possible alternatives.

4. Repeat until the initial node is reached.
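The folding-back steps above can be sketched as a short recursive function in Python. The node classes are an illustrative representation, not part of any library; the example tree reuses the therapy A and therapy B survival figures from Example 1, although the same function handles trees with nested decision nodes.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Outcome:
    value: float  # e.g. years of survival at a terminal node


@dataclass
class Chance:
    branches: List[Tuple[float, "Node"]]  # (probability, child node)


@dataclass
class Decision:
    options: List[Tuple[str, "Node"]]  # (alternative name, child node)


Node = Union[Outcome, Chance, Decision]


def fold_back(node: Node) -> float:
    """Return the expected value of a node, working from the distal nodes."""
    if isinstance(node, Outcome):
        return node.value
    if isinstance(node, Chance):
        # Step 2: a chance node becomes p1*y1 + p2*y2 + ... + pN*yN.
        return sum(p * fold_back(child) for p, child in node.branches)
    # Step 3: a decision node becomes the maximum over its alternatives.
    return max(fold_back(child) for _, child in node.options)


def survival_node(fractions: List[float]) -> Chance:
    """Chance node for survival of 1, 2, 3, ... years with given fractions."""
    return Chance([(p, Outcome(float(year)))
                   for year, p in enumerate(fractions, start=1)])


tree = Decision([
    ("therapy A", survival_node([0.20, 0.40, 0.30, 0.10])),
    ("therapy B", survival_node([0.05, 0.15, 0.45, 0.35])),
])
print(f"Expected value at the root: {fold_back(tree):.1f} years")
```

Because the recursion evaluates children before their parent, it starts at the most distal nodes and stops when the initial node has been replaced by a single number.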

CLINICAL DECISION SUPPORT SYSTEMS

(CDSSs)


DEFINITION (1)

Any computer system that uses clinical data or medical knowledge to help health care practitioners make a decision can be considered a Clinical Decision Support System (CDSS). CDSSs are computer systems designed to impact clinician decision making about individual patients at the point in time that these decisions are made. In other words, they are designed to assist clinicians at the point of care.

DEFINITION (2)

Systems that provide CDS do not simply assist with the retrieval of relevant information; they communicate information that takes into consideration the particular clinical context, offering situation-specific information and recommendations. At the same time, such systems do not themselves perform clinical decision making; they provide relevant knowledge and analyses that enable the ultimate decision makers (clinicians, patients, and health care organizations) to develop more informed judgments.

If used properly, CDSSs have the potential to change the way medicine has been taught and practiced.

BENEFITS OF CDSS

To influence:

• physician behavior
• diagnostic test ordering
• other care processes
• costs of care
• clinical outcomes

THE USES OF CDSS

Although the use of CDSSs is still not widely accepted among clinicians, they have been applied in many fields. CDSSs have been used as:

• Diagnostic systems
• Reminder and alert systems
• Disease management systems
• Drug-dosing or prescribing systems
• Outpatient service tools for diverse purposes such as prevention/screening, drug dosing, acute disease management, and chronic disease management

THE FIVE RIGHTS OF CDSS

CDS systems may be described in terms of five right things that they do: they “provide

I. the right information,
II. to the right person,
III. in the right format,
IV. through the right channel,
V. at the right point in workflow

to improve health and health care decisions and outcomes.”

TYPES OF CDSSs (1)

Systems that provide CDS come in three basic varieties:

1) They may use information about the current clinical context to retrieve highly relevant online documents, as with so-called “infobuttons”.

2) They may provide patient-specific, situation-specific alerts, reminders, physician order sets, or other recommendations for direct action.

3) They may organize and present information in a way that facilitates problem solving and decision making, as in dashboards, graphical displays, documentation templates, structured reports, and order sets.

Order sets are a good example of this last variety because they may both facilitate decision making by providing a mnemonic function and enhance workflow by providing a means to select a group of relevant activities quickly. Many observers also consider knowledge resources that distill the medical literature and that facilitate manual selection of content relevant to the current situation to be simple decision-support systems.

TYPES OF CDSSs (2)

CDSSs can also be classified by:

• The timing at which they provide support: before, during, or after the clinical decision is made.
• Whether the support is active or passive, that is, whether the CDSS actively provides alerts or passively responds to physician input or patient-specific information.
• Whether they are stand-alone systems or part of noncommercial computer-based patient record systems or computerized physician order entry (CPOE) systems.
• Whether they are knowledge-based systems or non-knowledge-based systems that employ machine learning and other statistical pattern recognition approaches.

KNOWLEDGE-BASED SYSTEMS (1)

A knowledge-based system (KBS) is a computer program that reasons and uses a knowledge base to solve complex problems. It symbolically encodes, in a knowledge base, facts, heuristics, and models derived from experts in a field and uses that knowledge to provide problem analysis or advice that the expert might have provided if asked the same question.

Many of today’s knowledge-based CDSS arose out of earlier expert systems. Many of the earliest systems were diagnostic decision support systems. The intent of these CDSS was no longer to simulate an expert’s decision making, but to assist the clinician in his or her own decision making. The system was expected to provide information for the user, rather than to come up with “the answer,” as was the goal of earlier expert systems.

KNOWLEDGE-BASED SYSTEMS (2)

There are three parts to most CDSS:

I. the knowledge base,
II. the inference or reasoning engine, and
III. a mechanism to communicate with the user.

KNOWLEDGE-BASED SYSTEMS (3)

The knowledge base consists of compiled information that is often, but not always, in the form of if–then rules (rule-based system). A rule contains an antecedent and a consequent. The general form of the rule is ‘IF {condition} THEN {statement}’. An example of an if–then rule might be:

IF a new order is placed for a particular blood test that tends to change very slowly,
AND IF that blood test was ordered within the previous 48 hours,
THEN alert the physician.

In this case, the rule is designed to prevent duplicate test ordering. Other types of knowledge bases might include probabilistic associations of signs and symptoms with diagnoses, or known drug–drug or drug–food interactions.
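The duplicate-order rule can be sketched in Python as a single function. The list of slow-changing tests and the function name are hypothetical; a real CDSS would take both the rule and the test catalogue from its knowledge base.

```python
from datetime import datetime, timedelta
from typing import Iterable

# Hypothetical knowledge-base entry: tests whose results change very slowly.
SLOW_CHANGING_TESTS = {"HbA1c"}


def duplicate_order_alert(test_name: str,
                          new_order_time: datetime,
                          previous_order_times: Iterable[datetime],
                          window: timedelta = timedelta(hours=48)) -> bool:
    """IF the test changes slowly AND it was ordered within the window, THEN alert."""
    if test_name not in SLOW_CHANGING_TESTS:
        return False  # antecedent not met: the rule does not apply to this test
    return any(timedelta(0) <= new_order_time - earlier <= window
               for earlier in previous_order_times)


now = datetime(2024, 1, 3, 12, 0)
yesterday = datetime(2024, 1, 2, 12, 0)   # 24 hours earlier
last_week = datetime(2023, 12, 27, 12, 0)

print(duplicate_order_alert("HbA1c", now, [yesterday]))   # within 48 hours
print(duplicate_order_alert("HbA1c", now, [last_week]))   # outside 48 hours
```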

KNOWLEDGE-BASED SYSTEMS (4)

• The second part of the CDSS is called the inference engine or reasoning mechanism, which contains the formulas for combining the rules or associations in the knowledge base with actual patient data. It operates on the rules to generate the necessary behavior.

• Finally, there has to be a communication mechanism, a way of getting the patient data into the system and getting the output of the system to the user who will make the actual decision.

KNOWLEDGE-BASED SYSTEMS (5) - Example

The decision support system’s knowledge base contains information about diseases and their signs and symptoms. The inference engine maps the patient signs and symptoms to those diseases and might suggest some diagnoses for the clinicians to consider. These systems generally do not generate only a single diagnosis, but usually generate a set of diagnoses based on the available information. Because the clinician often knows more about the patient than can be put into the computer, the clinician will be able to eliminate some of the choices. Most of the diagnostic systems have been stand-alone systems.

KNOWLEDGE-BASED SYSTEMS (6) - Example

Some CDSS are part of computerized physician order entry (CPOE) systems. Such a system takes a new medication order and the patient’s current medications as input; the knowledge base might include a drug database, and the output would be an alert about drug interactions so that the physician could change the order.
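The three parts of such a system can be sketched in Python: a small knowledge base, a function that acts as the inference step, and a returned list of alerts as the communication mechanism. The single interaction entry is an illustrative placeholder, not a clinical drug database, and all names are hypothetical.

```python
from typing import Dict, FrozenSet, List

# Illustrative knowledge base: unordered drug pair -> interaction note.
INTERACTIONS: Dict[FrozenSet[str], str] = {
    frozenset({"warfarin", "aspirin"}): "increased bleeding risk",
}


def check_new_order(new_drug: str, current_medications: List[str]) -> List[str]:
    """Compare a new order with current medications and return any alerts."""
    alerts = []
    for medication in current_medications:
        pair = frozenset({new_drug.lower(), medication.lower()})
        note = INTERACTIONS.get(pair)
        if note is not None:
            alerts.append(f"{new_drug} + {medication}: {note}")
    return alerts


# Input: a new order plus the patient's current medications.
for alert in check_new_order("aspirin", ["warfarin", "metformin"]):
    print("ALERT:", alert)
```

Storing each pair as a frozenset makes the lookup independent of the order in which the two drugs are named.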

NON-KNOWLEDGE-BASED SYSTEMS

Unlike knowledge-based decision support systems, some of the non-knowledge-based CDSS use a form of artificial intelligence called machine learning, which allows the computer to learn from past experiences and/or to recognize patterns in the clinical data. Artificial neural networks and genetic algorithms are two types of non-knowledge-based systems.

TOP-DOWN AND BOTTOM-UP SYSTEMS

Top-down systems use rules, typically derived from experts. Bottom-up systems use tools like neural networks or machine learning in which “smart” software can find novel or unexpected information by analyzing large datasets for associations.

Top-down systems typically require ongoing maintenance and supervision, whereas bottom-up systems can be self-teaching.

EARLY CDSSs

(1)

The use of diagnostic systems dates back many years. Starting in the late 1960s, F. T. de Dombal and his associates at the University of Leeds developed the Leeds abdominal pain system, which used sensitivity, specificity, and disease-prevalence data for various signs, symptoms, and test results to calculate, using Bayes' theorem, the probability of seven possible explanations for acute abdominal pain (appendicitis, diverticulitis, perforated ulcer, cholecystitis, small-bowel obstruction, pancreatitis, and nonspecific abdominal pain). Surgical or pathologic diagnoses served as the gold standard.
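A Bayesian calculation over several competing diagnoses can be sketched as follows. This is a minimal illustration, not the Leeds system itself: it assumes the findings are conditionally independent given the diagnosis, and every number and finding name (`rlq_pain`, `fever`) is hypothetical.

```python
def posterior(priors, likelihoods, findings):
    """Bayes' theorem over mutually exclusive diagnoses.

    priors:      {diagnosis: P(diagnosis)}
    likelihoods: {diagnosis: {finding: P(finding present | diagnosis)}}
    findings:    {finding: True/False} as observed in the patient
    """
    unnormalised = {}
    for dx, prior in priors.items():
        p = prior
        for finding, present in findings.items():
            p_present = likelihoods[dx][finding]
            # Multiply by P(observation | diagnosis), assuming independent findings.
            p *= p_present if present else (1.0 - p_present)
        unnormalised[dx] = p
    total = sum(unnormalised.values())
    return {dx: p / total for dx, p in unnormalised.items()}


# Hypothetical numbers for illustration only.
priors = {"appendicitis": 0.2, "cholecystitis": 0.1, "nonspecific pain": 0.7}
likelihoods = {
    "appendicitis": {"rlq_pain": 0.8, "fever": 0.6},
    "cholecystitis": {"rlq_pain": 0.1, "fever": 0.5},
    "nonspecific pain": {"rlq_pain": 0.2, "fever": 0.1},
}
result = posterior(priors, likelihoods, {"rlq_pain": True, "fever": True})
```

The result is a probability for each candidate diagnosis that sums to 1, so the output is a ranked list rather than a single answer.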

EARLY CDSSs

(2)

In 1972 and 1973, Internist-1, a diagnostic consultant system, was designed by Myers, Pople, and Miller at the University of Pittsburgh, Pennsylvania, United States. It was introduced as a stand-alone, knowledge-based expert system.

EARLY CDSSs

(3)

In 1985-1986, the QMR (Quick Medical Reference) system was developed based on Internist-1. The system included about 600 diseases and 4500 clinical findings or disease manifestations related to diseases in general internal medicine. The clinical findings were obtained from medical history, symptoms, signs, and laboratory results. QMR used a many-to-many data model between diseases and findings; for example, fever, as a disease manifestation, was associated with multiple diseases. QMR was designed to ask questions in order to formulate a differential diagnosis, and it played a passive role.

EARLY CDSSs

(4)

MYCIN, a stand-alone system, was developed at Stanford University in the mid-1970s by Dr. Edward Shortliffe as a knowledge-based expert system to diagnose infectious diseases and recommend suitable treatments. Artificial intelligence (AI) was used to design the MYCIN program, and the inference engine was based on a set of 'IF-THEN' rules. As a consultation system, MYCIN helped clinicians decide on the best option, playing a passive role in giving advice: for example, determining the bacterial cause of bacteremia and meningitis and recommending suitable antibiotics and dosages.
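How an inference engine can work over 'IF-THEN' rules is sketched below. The rules and fact names are invented for illustration and are not taken from MYCIN, and the simple forward-chaining loop is one possible strategy rather than a description of MYCIN's own reasoning mechanism.

```python
# Each rule: (set of facts that must all hold, fact to conclude).
# The rules are invented for illustration and are not taken from MYCIN.
RULES = [
    ({"gram_negative", "rod_shaped"}, "suspect_enterobacteriaceae"),
    ({"suspect_enterobacteriaceae", "blood_culture_positive"}, "consider_bacteremia_therapy"),
]


def forward_chain(facts, rules):
    """Apply IF-THEN rules repeatedly until no new conclusion is added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # IF all conditions are known THEN add the conclusion.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


derived = forward_chain(
    {"gram_negative", "rod_shaped", "blood_culture_positive"}, RULES
)
```

The second rule fires only because the first one has already added its conclusion, which shows how chained rules build on each other.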

EARLY CDSSs

(5)

In the 1980s, DXplain was developed at the Massachusetts General Hospital, Boston, United States, as a diagnosis support system to aid practitioners in making a differential diagnosis based on manifestations entered into the system. DXplain was a knowledge-based expert system containing about 5000 clinical manifestations (signs, symptoms, medical laboratory results) associated with about 2200 diseases.

EARLY CDSSs

(6)

LDS (Latter-day Saints) Hospital in Salt Lake City, Utah, United States, established a hospital information system named HELP (Health Evaluation through Logical Processing) in the 1970s. Unlike stand-alone diagnostic systems such as QMR, MYCIN, and DXplain, the HELP system was an attempt to integrate a diagnosis system with the electronic patient record (EPR). Therefore, if an abnormality was documented in the patient record, the HELP system, acting as a monitoring program, could alert users. For example, if a patient who is allergic to penicillin is prescribed a drug in the penicillin class, the HELP system generates an alert and recommends an alternative that may be preferable. The HELP system provided a model of an active role in clinical decision support.

FACTORS ASSOCIATED WITH CDSS SUCCESS

 Providing alerts/reminders automatically as part of the workflow.

 Providing the suggestions at a time and location where the decisions were being made.

 Providing actionable recommendations.

 Computerizing the entire process.

Design and implementation issues also matter: how the data are entered, the development and maintenance of the knowledge base, and the vocabulary and user interface, since these systems may change the usual way patient care is delivered.

Guidelines for Selecting and Implementing Clinical Decision Support Systems

(1)

 Assuring that users understand the limitations: for example, when the knowledge base and/or reasoning mechanism of the CDSS is not transparent to the user.

 Assuring that the knowledge is from reputable sources: for example, what rules are actually included in the system, and what is the evidence behind them?

 Assuring that the system is appropriate for the local site: for example, does the clinical vocabulary in the system match that in the EMR? What normal values does a system that alerts on abnormal laboratory tests assume?

Guidelines for Selecting and Implementing Clinical Decision Support Systems

(2)

 Assuring that users are properly trained: for example, vendors of CDSS need to be clear about what expertise is assumed in using the system. Clinician training is needed for physicians to use the system appropriately.

 Assuring that the knowledge base is monitored and maintained: for example, alerts must be calibrated to warn the user often enough to prevent serious errors, and responsibility for updating the knowledge base in a timely manner must be assigned, because new diseases are discovered and new medications come on the market.

SOFT COMPUTING TECHNIQUES

SOFT COMPUTING - DEFINITION

Soft computing is a term applied to a field within computer science that is characterized by the use of inexact solutions to computationally hard tasks. The solutions are unpredictable and uncertain. Earlier computational approaches could model and precisely analyze only relatively simple systems. More complex systems arising in biology, medicine, the humanities, management sciences, and similar fields often remained intractable to conventional mathematical and analytical methods.

Fuzzy logic, neural networks, genetic algorithms, and Bayesian networks (a probability-based approach) all fall within the field of soft computing.

FUZZY LOGIC

(1)

Fuzzy logic is modeled on the human reasoning process. It was introduced by Zadeh (Professor Dr. Lotfali Askar Zadeh). To implement or simulate fuzzy systems, it is almost unavoidable to write computer programs. MATLAB is commonly used for such simulations because matrix manipulation and plotting are easy to program in it. MATLAB is a high-level language and interactive environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications.

FUZZY LOGIC

(2)

Fuzzy identification and control methods are used in many systems. Some automobile manufacturers use fuzzy logic to control automatic braking systems. Another application is employing a fuzzy system to control turbidity in washing machine water. Fuzzy set theory and fuzzy logic are a highly suitable and applicable basis for developing knowledge-based systems in medicine for tasks such as the interpretation of sets of medical findings and the diagnosis of diseases.

FUZZY LOGIC

(3)

A fuzzy set is a collection of real numbers having partial membership in the set. This is in contrast with conventional, or crisp sets , to which a number can belong or not belong, but not partially belong. A crisp set can be described by a characteristic function μ: X → {0,1}. A fuzzy set can be described as a function μ: X → [0,1]. The value μ(x) indicates the degree to which x has the property.

Total membership in the set is specified by a membership value of 1, absolute exclusion from the set is specified by a membership value of 0, and partial membership in the set is specified by a membership value between 0 and 1.

FUZZY LOGIC

(4)

An example of representing the medical concept "high fever" as a fuzzy set is illustrated in the figure, where x is the patient's body temperature:

a) if x is greater than 39 °C, then the membership function μ(x) of the medical concept "high fever" is 1, i.e. the patient surely has "high fever";

b) if x is less than 38.5 °C, then the membership function μ(x) of the medical concept "high fever" is 0, i.e. the patient surely does not have "high fever";

c) if x is in the interval [38.5 °C, 39 °C], then the patient has the property "high fever" to some degree in [0,1].
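The "high fever" fuzzy set can be written as a small membership function. The two thresholds come from the example above. The linear ramp between them is an assumption, since the shape of the curve in the figure is not stated in the text, and the function name `high_fever` is chosen for the example.

```python
def high_fever(temp_c, low=38.5, high=39.0):
    """Degree of membership of a temperature (in °C) in the fuzzy set "high fever"."""
    if temp_c <= low:
        return 0.0          # surely not "high fever"
    if temp_c >= high:
        return 1.0          # surely "high fever"
    # Linear ramp between the two thresholds (assumed shape).
    return (temp_c - low) / (high - low)


degrees = [high_fever(t) for t in (38.0, 38.75, 39.5)]
```

A crisp set would return only 0 or 1. The ramp is what allows a temperature of 38.75 °C to count as "high fever" to a partial degree.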

FUZZY LOGIC

(5)

If the physical quantity under consideration is described by a word rather than a symbol, that word is called a linguistic variable. Fuzzy sets are also usually given linguistic names, called linguistic values. For instance, for the "height" variable, we could define a fuzzy set with the linguistic value "tall". We could also define fuzzy sets with the linguistic values "short" and "medium". Linguistic names are used for variables and their values in fuzzy logic because people usually think and speak in linguistic terms, not mathematical symbols.

As another example, the linguistic values for the nutrition status variable are severe malnutrition, moderate malnutrition, mild malnutrition, and normal.

FUZZY LOGIC

(6)

BMI value (kg/m²)    Linguistic value
< 18.5               Underweight
18.5 – 24.9          Normal
25 – 30              Overweight
> 30                 Obesity

FUZZY LOGIC

(7)

A fuzzy logic model comprises four parts: the fuzzifier, the fuzzy rules, the fuzzy inference engine, and the defuzzifier. The fuzzifier is the first phase of the fuzzy logic model; this step is also known as fuzzification or fuzzy classification. In this step a crisp set of input data is converted into a fuzzy variable by using fuzzy linguistic values instead of the original numerical values and by determining the type of membership function. The membership value ranges from 0 to 1, for minimum and maximum membership respectively.

FUZZY LOGIC

(8)

A fuzzy rule contains an antecedent and a consequent. The general form of a rule is 'IF {condition} THEN {statement}'. The logical operations 'AND', 'OR', and 'NOT' are also used to build up the rules. For example, Rule:

if (BMI is underweight) or (BMI is overweight) or (BMI is obesity) then (the patient has malnutrition)

The inference engine interprets and reasons over the fuzzy rules to generate the output from the input. There are two types of fuzzy inference systems: the Mamdani method and the Takagi-Sugeno-Kang (TSK) method. Both methods are similar in many aspects.
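The example rule can be evaluated with the commonly used min/max operators for fuzzy AND and OR. This is a sketch: the choice of these particular operators and the membership degrees for the patient are assumptions made for illustration.

```python
def fuzzy_and(*degrees):
    """Fuzzy AND as the minimum of the membership degrees."""
    return min(degrees)


def fuzzy_or(*degrees):
    """Fuzzy OR as the maximum of the membership degrees."""
    return max(degrees)


def fuzzy_not(degree):
    """Fuzzy NOT as the complement of the membership degree."""
    return 1.0 - degree


# Hypothetical fuzzified BMI for one patient:
# degree of membership in each linguistic value.
bmi = {"underweight": 0.0, "normal": 0.3, "overweight": 0.7, "obesity": 0.1}

# IF (BMI is underweight) OR (BMI is overweight) OR (BMI is obesity)
# THEN (the patient has malnutrition)
malnutrition = fuzzy_or(bmi["underweight"], bmi["overweight"], bmi["obesity"])
```

The rule fires to the degree of its strongest condition, so here the conclusion "malnutrition" holds to degree 0.7 rather than being simply true or false.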

FUZZY LOGIC

(9)

The main difference between the Mamdani and Sugeno methods lies in the output membership functions: in the Mamdani method the output is a fuzzy set that requires defuzzification, whereas in the Sugeno method the output membership functions are either linear or constant.

FUZZY LOGIC

(10)

The defuzzifier (the step is also called defuzzification) is the last phase in the fuzzy logic model. It converts the output fuzzy linguistic value back into a single crisp numerical value. As shown in the figure, there are several defuzzification methods, including the largest of max, the centroid of area, the bisector of area, and the mean of max. Centroid defuzzification is a popular method; it returns the center of the area under the membership function curve of the output variable.
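Centroid defuzzification can be approximated numerically by sampling the output membership function. In this sketch the triangular output set, the 0 to 10 output range, and the rule strength of 0.6 are all assumed values, and the sampled sum is only an approximation of the true centroid of area.

```python
def centroid(xs, mus):
    """Centroid of area for a sampled output membership function."""
    area = sum(mus)
    if area == 0:
        raise ValueError("membership function is zero everywhere")
    return sum(x * mu for x, mu in zip(xs, mus)) / area


def triangle(x, left, peak, right):
    """Triangular membership function."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Sample the output universe from 0 to 10 in steps of 0.1.
xs = [i / 10 for i in range(0, 101)]
# Mamdani-style output: the fuzzy set clipped at an assumed rule strength of 0.6.
clipped = [min(triangle(x, 2.0, 5.0, 8.0), 0.6) for x in xs]
crisp = centroid(xs, clipped)
```

Because the clipped set is symmetric around 5, the crisp output lands at approximately 5. An asymmetric output set would pull the centroid toward its heavier side.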

Artificial Neural Networks (ANN) (1)

The history of neural networking arguably started in the late 1800s with scientific attempts to study the workings of the human brain. In 1890, William James published the first work about brain activity patterns. In 1943 , McCulloch and Pitts produced a model of the neuron that is still used today in artificial neural networking. In 1949 , Donald Hebb published The Organization of Behavior, which outlined a law for synaptic neuron learning. MATLAB is an ideal tool for working with artificial neural networks for a number of reasons. First, MATLAB is highly efficient in performing vector and matrix calculations. Second, MATLAB comes with a specialized Neural Network Toolbox which contains a number of useful tools for working with artificial neural networks.

Artificial Neural Networks (ANN) (2)

ANNs simulate human thinking and learn from examples. An ANN consists of nodes called neurodes (which correspond to neurons) and weighted connections (which correspond to nerve synapses) that transmit signals between the neurodes in a unidirectional manner.

Artificial Neural Networks (ANN) (3)

An ANN typically contains three kinds of layers: the input layer, the output layer, and one or more hidden (middle) layers. The input layer receives the data and the output layer communicates the results, while the hidden layers process the incoming data and determine the results.

Artificial Neural Networks (ANN) (4)

An artificial neural network may be described as a set of neurons or nodes X_i, each transforming its total or net input x_in_i into an output or activity x_i according to an activation function (or transfer function) f(x_in_i). Each node X_i sends its output to other units X_j through connections, each having a certain effectiveness or weight w_ij. The net input to any unit X_j is usually modelled as the sum of all the outputs x_i from other units (and, in recurrent nets, from itself), weighted by the weights w_ij of the respective connections.
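The computation performed by a single node can be sketched directly from this description: the net input is x_in_j = Σ_i w_ij · x_i and the output is x_j = f(x_in_j). The logistic sigmoid used for f here is an assumption, since the text does not name a specific activation function.

```python
import math


def sigmoid(z):
    """Logistic activation function (an assumed choice for f)."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs, weights, activation=sigmoid):
    """Compute x_j = f(x_in_j), where x_in_j = sum_i w_ij * x_i."""
    net_input = sum(w * x for w, x in zip(weights, inputs))
    return activation(net_input)


# Two incoming signals with one excitatory and one inhibitory weight.
y = node_output([1.0, 0.5], [0.4, -0.8])
```

In this example the two weighted inputs cancel, so the net input is 0 and the sigmoid returns its midpoint value of 0.5.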

Artificial Neural Networks (ANN) (6)

The most important distinction is that between feed-forward and feed-back (or recurrent) nets. In the former, the information is thought to pass just once through the net, starting in an input layer of units and ending in an output layer. Between the input and the output layers, hidden layers of neurons may exist, which as a rule enhance the computational power of the ANN. Recurrent nets have more complicated dynamics, with signals going back and forth between the nodes for some time.

Artificial Neural Networks (ANN) (9)

These systems can learn from examples when supplied with known results for a large amount of data. The system studies this information, makes guesses for the correct output, compares the guesses to the given results, finds patterns that match the input to the correct output, and adjusts the weights of the connections between the neurodes accordingly in order to produce the correct results. This iterative process is known as training the artificial neural network.
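The guess, compare, and adjust cycle can be illustrated with the perceptron learning rule, one of the simplest training procedures. This is a stand-in chosen for brevity; the text does not say which training algorithm is meant. The toy data set (logical AND) and the function names are assumptions.

```python
def predict(weights, bias, inputs):
    """Threshold unit: output 1 if the weighted sum exceeds 0, else 0."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """samples: list of (inputs, target) pairs with target 0 or 1."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            guess = predict(weights, bias, inputs)   # make a guess
            error = target - guess                   # compare with the known result
            # Adjust the connection weights in proportion to the error.
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


# Toy "known results": logical AND of two binary inputs.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(data)
```

After training, `predict(w, b, x)` reproduces the target for every example in `data`. A single threshold unit can only learn linearly separable patterns, which is one reason practical networks use hidden layers.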

Artificial Neural Networks (ANN) (10)

Take myocardial infarction as an example: data covering a variety of signs and symptoms from large numbers of patients who are known either to have or not to have had a myocardial infarction can be used to train the neural network. Once the network is trained, i.e., once the weighted associations of signs and symptoms with the diagnosis are determined, the system can be used on new cases to determine whether the patient has a myocardial infarction.

Artificial Neural Networks (ANN) (11)

There are many advantages and disadvantages to using artificial neural networks. Advantages include eliminating the need to program IF-THEN rules and eliminating the need for direct input from experts. ANNs can also process incomplete data by inferring what the data should be, and they can improve every time they are used because of their dynamic nature. ANNs do not require a large database to make predictions about outcomes, but the more comprehensive the training data set is, the more accurate the ANN is likely to be.

Artificial Neural Networks (ANN) (12)

Even though all of these advantages exist, there are some disadvantages. The training process involved can be time consuming. ANNs follow a statistical pattern recognition approach to derive their formulas for weighting and combining data. The resulting formulas and weights are often not easily interpretable, and the system cannot explain or justify why it uses certain data the way it does, which can make the reliability and accountability of these systems a concern.

Artificial Neural Networks (ANN) (13)

Despite the above concerns, artificial neural networks have many applications in the medical field . In a review article on the use of neural networks in health care, Baxt provides a chart that shows various applications of ANNs, which include the diagnosis of appendicitis, back pain, dementia, myocardial infarction, psychiatric emergencies, sexually transmitted diseases, skin disorders, and temporal arteritis. ANNs can predict which patients are at high risk for cancers such as oral cancer.

Genetic Algorithm (GA)

(1)

GAs were developed by John Holland at the University of Michigan in the 1960s and 1970s, and are based on Darwin's evolutionary theories of natural selection and survival of the fittest. Just as species change to adapt to their environment, GAs 'reproduce' themselves in various recombinations in an effort to find a new recombinant that is better adapted than its predecessors.

Genetic Algorithm (GA)

(2)

In other words, without any domain-specific knowledge, components of random sets of solutions to a problem are evaluated, the best ones are kept and are then recombined and mutated to form the next set of possible solutions to be evaluated, and this continues until the proper solution is discovered. The fitness function is used to determine which solutions are good and which ones should be eliminated.
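The evaluate, keep, recombine, and mutate loop can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the fitness function (count the 1s in a bit string), the population size, the mutation rate, and the choice to keep the best half unchanged.

```python
import random


def fitness(bits):
    """Toy fitness function: the number of 1s in the bit string."""
    return sum(bits)


def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Start from a random set of candidate solutions.
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]      # keep the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_bits)           # one-point recombination
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness), history


best, history = evolve()
```

Because the best half survives unchanged, the best fitness recorded in `history` never decreases from one generation to the next. Recombination and mutation supply the new variants that may improve on it.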

Genetic Algorithm (GA) (3)

GAs are similar to neural networks in that they derive their knowledge from patient data. Genetic algorithms have also been applied in health care , but there are fewer examples of this type of CDSS than those based on neural networks. However, GAs have proved to be a helpful aid in the diagnosis of female urinary incontinence. Genetic Algorithms are explored in medical applications to characterize patterns and results.

Genetic Algorithm (GA) (4)

For example, GAs have been used to optimize image analysis, such as assessing classes of cells in blood cell microscope images, and to facilitate magnetic resonance tomography (MRT) treatment planning and 3D visualization of image data. Genetic algorithms developed for mammography were adapted for mining data on patients with abdominal aortic aneurysms by analyzing abdominal computed tomography (CT) scan reports for common patterns and features of successful and unsuccessful surgeries.

Genetic Algorithm (GA) (5)

Genetic algorithms can be used for optimizing pharmaceutical products. One study showed that genetic algorithms were able to identify additional antibacterial peptides with high activity.

Finally, genetic algorithms have been shown to enhance the precision of artificial neural networks (ANNs), for example in hip-bone fracture prediction, and to optimize the search strategies of ANNs used to predict and discriminate pneumonia within a training group.

Genetic Algorithm (GA)

(6)

It is suggested that combining genetic algorithms and artificial neural networks to form genetic algorithm neural networks (GANNs) is an important approach for improving the analysis of medical data.

Measurement of the Operating Characteristics of Decision Models (Diagnostic Tests)

Classification of Test Results

Test result       Reality: Yes        Reality: No
Positive          True Positive       False Positive
Negative          False Negative      True Negative

• A True Positive (TP) is a positive test result obtained for a patient in whom the disease is present (the test result correctly classifies the patient as having the disease).

• A True Negative (TN) is a negative test result obtained for a patient in whom the disease is absent (the test result correctly classifies the patient as not having the disease).

• A False Positive (FP) is a positive test result obtained for a patient in whom the disease is absent (the test result incorrectly classifies the patient as having the disease).

• A False Negative (FN) is a negative test result obtained for a patient in whom the disease is present (the test result incorrectly classifies the patient as not having the disease).

How to measure the performance of a decision model (test)

To measure the performance of a test for a disease, first perform the test in patients who are known to have the target condition and in patients who are known to be free of the target condition (but might have other diseases). Then, calculate the frequency of a result in patients with the target condition and in patients who do not have the target condition. We also need a gold standard test (the procedure that defines the true state of the patient in a study of test performance, also known as the "diagnostic reference standard"). Sensitivity, specificity, accuracy, precision, false-negative rate, and false-positive rate are calculated to measure the concordance and discordance between the index test (decision model) and the disease state.

Sensitivity

The ability to detect true positives (really sick).

The likelihood that a diseased patient has a positive test.

In conditional probability notation, the true-positive rate of a test result is

P[positive test result|disease] or P[+|D]

The true-positive rate (TPR) of a model is

$\text{Sensitivity} = \text{TPR} = \frac{TP}{TP + FN}$

Specificity

The ability to detect true negatives (really healthy). The likelihood that a patient that does not have the target condition has a negative test.

In conditional probability notation, the true-negative rate of a test result is

P[negative test result|no disease] or P[−| no D]

The true-negative rate (TNR) of a model is

$\text{Specificity} = \text{TNR} = \frac{TN}{TN + FP}$

Accuracy

The ability to detect true results (true positives and true negatives).

$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN}$

Precision (Positive Predictive Value)

The ability to detect true positives (really sick) among all positive results.

PV+ is the fraction of patients with a positive test who also have the target condition.

$\text{PV+} = \frac{TP}{TP + FP}$

Negative Predictive Value

The ability to detect true negatives (really healthy) among all negative results. PV− is the fraction of patients with a negative test result who do not have disease.

$\text{PV−} = \frac{TN}{TN + FN}$

False-Negative Rate (FNR)

The likelihood that a patient who has the target condition has a negative test.

P[negative test result|disease] or P[−|D]

$\text{False-Negative Rate} = 1 - \text{Sensitivity} = \frac{FN}{TP + FN}$

False Positive Rate (FPR)

The likelihood that a patient who does not have the target condition has a positive test.

P[positive test result|no disease] or P[+|no D]

$\text{False-Positive Rate} = 1 - \text{Specificity} = \frac{FP}{TN + FP}$
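All of these measures follow from the four cells of the classification table, as the sketch below shows. The function name `operating_characteristics` and the counts in the usage example are hypothetical.

```python
def operating_characteristics(tp, fp, fn, tn):
    """Compute the measures defined above from the four classification counts."""
    return {
        "sensitivity": tp / (tp + fn),               # true-positive rate
        "specificity": tn / (tn + fp),               # true-negative rate
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "ppv": tp / (tp + fp),                       # precision, PV+
        "npv": tn / (tn + fn),                       # PV-
        "fnr": fn / (tp + fn),                       # 1 - sensitivity
        "fpr": fp / (tn + fp),                       # 1 - specificity
    }


# Hypothetical study: 100 diseased and 200 disease-free patients.
measures = operating_characteristics(tp=90, fp=20, fn=10, tn=180)
```

Sensitivity, specificity, FNR, and FPR are each computed within one column of the table (diseased or disease-free patients). The predictive values are computed within one row, so they depend on how common the disease is in the study population.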

Likelihood Ratio

The LR of a test combines the measures of test discrimination to give one number that characterizes the discriminatory power of a test.

The LR indicates the amount by which the odds of disease change based on the test result. We describe the performance of a test that has only two possible outcomes (e.g., positive or negative) by two LRs: one corresponding to a positive test result and the other corresponding to a negative test result. These ratios are abbreviated LR+ and LR−, respectively.

Likelihood Ratio

$\text{Likelihood Ratio Positive (LR+)} = \frac{\text{Sensitivity}}{1 - \text{Specificity}}$

$\text{Likelihood Ratio Negative (LR−)} = \frac{1 - \text{Sensitivity}}{\text{Specificity}}$
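The two ratios, and the way an LR changes the odds of disease, can be sketched as follows. The function names and the sensitivity, specificity, and pre-test probability in the usage example are assumed values chosen for illustration.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with two possible outcomes."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, lr):
    """Convert the probability to odds, multiply by the LR, and convert back."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)


# Assumed test: sensitivity 0.9, specificity 0.8; assumed pre-test probability 0.5.
lr_pos, lr_neg = likelihood_ratios(0.9, 0.8)
after_positive = post_test_probability(0.5, lr_pos)
after_negative = post_test_probability(0.5, lr_neg)
```

An LR+ above 1 raises the probability of disease after a positive result, and an LR− below 1 lowers it after a negative result. The further each ratio is from 1, the more the test result changes the odds.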

Example 1

One test used to screen blood donors for HIV antibody is an enzyme-linked immunoassay (EIA). So that the performance of the EIA can be measured, the test is performed on 400 patients.

Example 2

A mammogram, or breast X-ray, is the diagnostic test used in screening for breast cancer. Imagine that mammography has a sensitivity of 77% and a specificity of 95%, and assume that the incidence of breast cancer in the screening population is 0.6%. If the mammogram is positive, how likely is it that the patient has breast cancer?

An incidence of 0.6% would mean that in a population of 10 000 there would be 60 cases of cancer. A sensitivity of 77% would mean that out of 60 cases of cancer 46 would be correctly identified. A specificity of 95% would mean that out of 9940 disease-free cases, 9443 would be correctly identified as disease-free, leaving 497 false positives. It follows that a total of 543 cases would be identified as cancer: 497 disease-free cases plus 46 disease cases. If you had a positive mammogram, the likelihood of your having cancer is therefore 46/543 or around 0.08.

We can do the same calculation using Bayes’ theorem. In the above data, the probability of a positive mammogram if cancer is present, p(s|d), is 0.77. The prior probability of cancer, p(d), is 6/1000 = 0.006. The probability of a positive mammogram, p(s), is 543/10 000 = 0.054.

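By Bayes' theorem, $p(d|s) = \frac{p(s|d) \times p(d)}{p(s)} = \frac{0.77 \times 0.006}{0.054} \approx 0.086$, which agrees with the count-based figure of about 0.08 up to rounding.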
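The same calculation can be done without intermediate rounding. In this sketch the function name is an assumption, and the parameter `prevalence` corresponds to what the example calls the incidence in the screening population.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | positive test) by Bayes' theorem."""
    true_pos = sensitivity * prevalence                     # p(s|d) * p(d)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)    # p(s|no d) * p(no d)
    # The denominator true_pos + false_pos is p(s), the probability of a positive test.
    return true_pos / (true_pos + false_pos)


# Mammography example: sensitivity 77%, specificity 95%, prevalence 0.6%.
ppv = positive_predictive_value(0.77, 0.95, 0.006)
```

The result is about 0.085: even with a positive mammogram, fewer than one in ten such patients has cancer, because the disease is rare in the screening population.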