Final Presentation: Coreference Resolution


i2b2 Shared Task 2011
Coreference Resolution in Clinical Text
David Hinote
Carlos Ramirez
What is coreference resolution?
• Nouns, pronouns, and phrases that refer to the same object,
person, or idea are coreferent.
o Example: "Alexander was playing soccer yesterday. He
fell and broke his knee."
o "Alexander", "he", and "his" refer to the same person, so
they are said to be coreferent.
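A minimal sketch of how a coreference chain can be represented
in code, in the spirit of the example above; the class and
method names are illustrative only, not part of any system
described here.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a coreference chain groups all mentions of one entity.
// Class and method names are hypothetical, for illustration only.
public class CorefChain {
    private final List<String> mentions = new ArrayList<>();

    public void add(String mention) {
        mentions.add(mention);
    }

    public List<String> getMentions() {
        return mentions;
    }

    public static void main(String[] args) {
        CorefChain alexander = new CorefChain();
        alexander.add("Alexander"); // proper noun
        alexander.add("he");        // pronoun referring back
        alexander.add("his");       // possessive pronoun
        System.out.println(alexander.getMentions()); // [Alexander, he, his]
    }
}
```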
The i2b2 Challenge
I2B2: "Informatics for Integrating Biology and the Bedside"
• This program has issued a challenge in NLP involving
Coreference resolution.
• The challenge is to find co-referential relations within a given
medical document.
• The concepts that can be corefered are all annotated.
• There are 5 classes of concepts:
o Problem
o Person
o Treatment
o Test
o Pronoun
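These five classes map naturally onto a Java enum; a minimal
sketch (the type name is ours, not from the official task
distribution):

```java
// The five annotated concept classes from the i2b2 task,
// modeled as a Java enum. The enum name is illustrative.
public enum ConceptType {
    PROBLEM,
    PERSON,
    TREATMENT,
    TEST,
    PRONOUN
}
```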
Concept Mentions
People
• Any mention that refers to a person or group of people
o Dr. Lightman, the patient, cardiology
Problems
• A mention that refers to the reason the subject of the
document is in the hospital
o Heart attack, blood pressure, broken leg
Tests
• Tests performed by doctors
o EKG, temperature, CAT scan
Treatments
• Solutions to the problem mentions, or work performed to cure
patients
o Brain surgery, ice pack, Tylenol
Pronouns
• A pronoun mention can refer to any of the four other types of
mentions
Approaches for Competing
• Using existing, publicly available tools
o Stanford NLP
o BART Coreference
o LingPipe
o CherryPicker
o Reconcile
o ARKRef
o Apache OpenNLP
• Coding our own Coreference Tool
Other Coreference Tools
• We obtained versions of other coreference tools and tested
them on our data.
• All of the tools we found were either still in their initial
development stages, or were built for one specific purpose and
left alone afterward (e.g., coreference on the MUC datasets).
• Testing showed that, even at their best, the other tools we
found do not perform acceptably on our data.
• After attempts to train the other tools on our data failed, we
felt it best to code our own approach.
Other Tools Statistics
Algorithm
• Because the data we are working with is so specific, we chose
a rule-based approach to coreference resolution.
• This means that we study the characteristics of each kind of
coreferent link ourselves and program a method for that link
manually.
• We examine the concepts in a file, and if a pair meets our
criteria, we create a method to link them (see the sketch
below).
• The idea is to create rules that are specific, yet general
enough to apply to similar mentions in all documents.
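A minimal sketch of what such a rule-based linker might look
like, assuming a hypothetical Rule interface; the one rule shown
(case-insensitive string match) is illustrative, not one of our
actual rules.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a rule-based linker. Each hand-written rule decides
// whether two concept mentions corefer; a pair is linked if any
// rule fires. All names here are illustrative.
interface Rule {
    boolean link(String mentionA, String mentionB);
}

public class RuleBasedLinker {
    private final List<Rule> rules = new ArrayList<>();

    public RuleBasedLinker() {
        // Example rule: exact string match, ignoring case.
        rules.add((a, b) -> a.equalsIgnoreCase(b));
        // Further rules (head-word match, abbreviation tables, ...)
        // would be added here as they are coded by hand.
    }

    public boolean corefer(String a, String b) {
        for (Rule rule : rules) {
            if (rule.link(a, b)) {
                return true;
            }
        }
        return false;
    }
}
```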
Our Application
• To help visualize coreferent links and see which links our
program detects, we use a GUI created with Java.
• Our program is developed using the Mercurial version control
system, which lets us keep each other's code up to date.
• It uses our coded algorithms to determine coreferent links
between the given concept mentions.
• It displays coreferent links as lines:
o Blue for true (annotated) links.
o Red for links detected by our algorithm.
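A minimal Swing sketch of this color convention; the coordinates
are hard-coded placeholders, whereas the real GUI derives them
from mention positions in the document.

```java
import java.awt.Color;
import java.awt.Graphics;
import javax.swing.JFrame;
import javax.swing.JPanel;

// Sketch of the link display: blue lines for true (annotated)
// links, red lines for links detected by the algorithm.
public class LinkPanel extends JPanel {
    @Override
    protected void paintComponent(Graphics g) {
        super.paintComponent(g);
        g.setColor(Color.BLUE);  // true (annotated) link
        g.drawLine(20, 30, 200, 30);
        g.setColor(Color.RED);   // link detected by the algorithm
        g.drawLine(20, 60, 200, 60);
    }

    public static void main(String[] args) {
        JFrame frame = new JFrame("Coreference links");
        frame.add(new LinkPanel());
        frame.setSize(260, 140);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setVisible(true);
    }
}
```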
Our Application
Programmed in Java, our application can use databases and the
internet to gather information about the concept mentions being
tested.
• We have set up a database holding data that gives meaning to
the concept mentions being tested, or to certain key words in a
sentence that contains a mention. If words or phrases meet our
criteria, they can be added to the appropriate table straight
from the program window.
• For each mention, the program also extracts information from
Google.com searches, which can give it a wealth of information
about the mentions.
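A minimal sketch of the keyword lookup against such a database,
assuming a JDBC connection; the URL, the table name
mention_keywords, and the column names are hypothetical, since
the actual schema is not shown here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch of a keyword lookup. The connection string, table name
// ("mention_keywords"), and column are assumptions, not the
// project's actual schema.
public class KeywordLookup {
    public static boolean isKnownKeyword(String word) throws SQLException {
        String url = "jdbc:mysql://localhost/coref"; // assumed URL
        try (Connection conn = DriverManager.getConnection(url, "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT 1 FROM mention_keywords WHERE keyword = ?")) {
            ps.setString(1, word.toLowerCase());
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next(); // true if the word is in the table
            }
        }
    }
}
```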
Sample file
• Viewing Concepts & i2b2 Chain
File with both UHD and i2b2 Links Shown
Statistics for our System
Progress
• We are currently at around a 75% F1 score, averaged over all
test files (see the sketch after this list for how F1 is
computed).
• Most algorithms for resolving coreference tend to score in the
60% range.
• With the time we have left, we expect to increase this score.
• We still haven't added detection for "Treatment" type
concepts, which constitute a significant percentage of the
concepts missed when computing our F1 score.
• Detection for "Test" type concepts still needs work.
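For reference, F1 is the harmonic mean of precision and recall;
a minimal sketch of the computation, with invented counts. The
missed Treatment concepts lower recall, which drags F1 down even
when precision is high.

```java
// F1 is the harmonic mean of precision and recall:
// P = TP / (TP + FP), R = TP / (TP + FN), F1 = 2PR / (P + R).
public class F1Score {
    public static double f1(int truePos, int falsePos, int falseNeg) {
        double precision = truePos / (double) (truePos + falsePos);
        double recall = truePos / (double) (truePos + falseNeg);
        return 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Invented counts, for illustration only.
        System.out.printf("F1 = %.2f%n", f1(75, 25, 25)); // F1 = 0.75
    }
}
```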
Current work
Test Mentions
• Precision on "Test" type concepts is relatively low (30%).
• This is mainly because many of the tests involve specific body
parts (e.g., "chest x-ray" and "chest CT" are sometimes linked
by our rules).
• Tests also often involve times (e.g., "an x-ray was performed
on 5 Aug." would link with "the x-ray on... December 10,
2010").
• They also involve position (e.g., "x-ray on left lung" vs.
"x-ray on right lung"); a guard for such cases is sketched
below.
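A minimal sketch of the kind of guard these rules need, using
laterality as the example; the method names and word list are
illustrative, not our actual code.

```java
// Guard for Test-mention rules: block a link when the two
// mentions carry conflicting laterality markers.
public class TestMentionGuard {

    // Returns "left", "right", or null if no marker is found.
    private static String laterality(String mention) {
        String m = mention.toLowerCase();
        if (m.contains("left"))  return "left";
        if (m.contains("right")) return "right";
        return null;
    }

    // A link is blocked when both mentions are lateralized but
    // disagree on the side.
    public static boolean conflictingLaterality(String a, String b) {
        String sideA = laterality(a);
        String sideB = laterality(b);
        return sideA != null && sideB != null && !sideA.equals(sideB);
    }

    public static void main(String[] args) {
        System.out.println(conflictingLaterality(
            "x-ray on left lung", "x-ray on right lung")); // true
    }
}
```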
Current Work
Problem Mentions
• Work on these mentions is about 50% complete.
• To finish, a few more database tables will need to be set up,
and certain types of medical vocabulary loaded into them.
• We will also need a system for finding phrases that are made
of different words but mean the same thing, i.e., a thesaurus
(sketched below).
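A minimal sketch of the thesaurus idea, mapping surface phrases
to a canonical form so that differently worded mentions can be
compared; the entries shown are invented examples, not our
actual vocabulary tables.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: normalize phrases to a canonical form before comparing.
// The dictionary entries are invented examples.
public class Thesaurus {
    private final Map<String, String> canonical = new HashMap<>();

    public Thesaurus() {
        canonical.put("heart attack", "myocardial infarction");
        canonical.put("mi", "myocardial infarction");
    }

    public String normalize(String phrase) {
        String key = phrase.toLowerCase();
        return canonical.getOrDefault(key, key);
    }

    public boolean sameMeaning(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    public static void main(String[] args) {
        Thesaurus t = new Thesaurus();
        System.out.println(t.sameMeaning("heart attack", "MI")); // true
    }
}
```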
Possible future problems
• The main risk of a rule-based approach is that our rules might
be too specific to work with the contest data once it is
distributed.
• Given the execution speed of our program, we should have
enough time to make any necessary modifications in the three
days between the contest data being released and the results
being due.
• There is also the slight problem that our application is built
for a very specific purpose and is probably hard to generalize
beyond the context of medical documents.
• Most coreference resolution tools are this way, though.
• Not being able to code fast enough!
Future Necessities
• A reliable way to find the temporal setting of a particular
sentence (see the sketch after this list).
o Did a described injury happen 20 years ago, or is the
doctor giving instructions for a future case? These are
not coreferent even though they may use the same words.
• Thesaurus work
o Finding phrases that mean the same thing, but use
completely different words.
• Output
o The program does not yet output files in the i2b2
competition format; we will add this feature as the
competition deadline draws near.
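As a rough illustration of the temporal-setting item above, a
sketch of one small ingredient: pulling explicit date
expressions out of a sentence with a regular expression. The
pattern and class name are ours; a real solution would also need
tense cues ("will", "years ago") and document-section context.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: extract the first explicit date expression, if any.
// The regex is deliberately simple and illustrative.
public class TemporalCue {
    private static final Pattern DATE = Pattern.compile(
        "\\b(\\d{1,2}\\s+)?(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
        + "[a-z]*\\.?(\\s+\\d{1,2},?)?(\\s+\\d{4})?\\b");

    public static String firstDate(String sentence) {
        Matcher m = DATE.matcher(sentence);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstDate("An x-ray was performed on 5 Aug."));
        // prints: 5 Aug
    }
}
```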