UIMA - Mayo Clinic Informatics

UIMA SHARP 4 - NLP May 25, 2010

Outline
• UIMA Terminology (not just TLAs)
• Parts of a UIMA pipeline
• Running a pipeline
• Viewing annotations
• Creating a new annotator

UIMA terminology
• CAS, XCAS, JCAS, View
• Analysis Engine (AE) / Annotator
  – Aggregate Analysis Engine
• XML output: XCAS, XMI
• Type System, JCasGen
• CAS Visual Debugger (CVD)
• CPE (Collection Processing Engine)

UIMA and Eclipse
• UIMA plugin for Eclipse requires EMF
• UIMA plugin provides visual editors for descriptors
• An “Update site” exists for installing the plugin

UIMA Pipeline Flow
• Collection Reader
• (CAS Initializer - deprecated)
• Analysis Engine (AE) / Annotator
• CAS Consumer

Pipeline Example
UIMA term            Example
Collection Reader    Read files from a dir
Analysis Engine      Sentence annotator
Analysis Engine      Tokenizer annotator
CAS Consumer         Output tokens to a DB
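
The example pipeline above can be sketched in plain Java to show how the four roles hand data to each other. This is a toy model only and does not use the UIMA API: `Doc` stands in for the CAS, `Span` for an annotation, `Engine` for an Analysis Engine, an in-memory list of strings for the Collection Reader's directory of files, and `TokenCollector` for the CAS Consumer's database. All of these names are hypothetical, and the sentence and token rules are deliberately naive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PipelineSketch {

    /** Stand-in for an annotation: a typed span over the document text. */
    static final class Span {
        final String type;
        final int begin;
        final int end;
        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /** Stand-in for the CAS: the document text plus every annotation added so far. */
    static final class Doc {
        final String text;
        final List<Span> spans = new ArrayList<>();
        Doc(String text) { this.text = text; }

        /** Returns a copy of the spans of one type, so callers may add spans while iterating. */
        List<Span> ofType(String type) {
            List<Span> out = new ArrayList<>();
            for (Span s : spans) {
                if (s.type.equals(type)) out.add(s);
            }
            return out;
        }
    }

    /** Stand-in for an Analysis Engine: reads the container and adds annotations to it. */
    interface Engine {
        void process(Doc doc);
    }

    /** Naive rule (an assumption for this sketch): a sentence ends at each period. */
    static final class SentenceAnnotator implements Engine {
        public void process(Doc doc) {
            String t = doc.text;
            int start = 0;
            for (int i = 0; i < t.length(); i++) {
                if (t.charAt(i) == '.') {
                    doc.spans.add(new Span("Sentence", start, i + 1));
                    start = i + 1;
                    while (start < t.length() && Character.isWhitespace(t.charAt(start))) start++;
                }
            }
            if (start < t.length()) doc.spans.add(new Span("Sentence", start, t.length()));
        }
    }

    /** Depends on the Sentence spans: a token is a run of letters or digits inside a sentence. */
    static final class TokenizerAnnotator implements Engine {
        public void process(Doc doc) {
            for (Span s : doc.ofType("Sentence")) {
                int tokenStart = -1;
                for (int i = s.begin; i <= s.end; i++) {
                    boolean inWord = i < s.end && Character.isLetterOrDigit(doc.text.charAt(i));
                    if (inWord && tokenStart < 0) tokenStart = i;
                    if (!inWord && tokenStart >= 0) {
                        doc.spans.add(new Span("Token", tokenStart, i));
                        tokenStart = -1;
                    }
                }
            }
        }
    }

    /** Stand-in for the CAS Consumer: collects token text instead of writing to a DB. */
    static final class TokenCollector {
        final List<String> rows = new ArrayList<>();
        void consume(Doc doc) {
            for (Span s : doc.ofType("Token")) rows.add(doc.text.substring(s.begin, s.end));
        }
    }

    /** Runs reader -> sentence annotator -> tokenizer -> consumer over each document. */
    static List<String> run(List<String> documents) {
        List<Engine> engines = Arrays.asList(new SentenceAnnotator(), new TokenizerAnnotator());
        TokenCollector consumer = new TokenCollector();
        for (String text : documents) {            // Collection Reader role
            Doc doc = new Doc(text);
            for (Engine e : engines) e.process(doc); // Analysis Engines, in order
            consumer.consume(doc);                   // CAS Consumer role
        }
        return consumer.rows;
    }

    public static void main(String[] args) {
        // prints [Family, history, of, hyperlipidemia, No, chest, pain]
        System.out.println(run(Arrays.asList("Family history of hyperlipidemia. No chest pain.")));
    }
}
```

Because the tokenizer only looks inside Sentence spans, swapping the two engines in the list would yield no tokens, which is why the order of Analysis Engines matters.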

Options for running UIMA tools
• Tools:
  – CPE Configurator
  – CVD
• Options:
  – Command line scripts/.bat files
  – Run within Eclipse

Tying together a UIMA pipeline
• Type System
  – Defines the data types passed along
• CAS (Common Analysis Structure)
  – Container for the data

Tying together a UIMA pipeline
• CPE descriptor – select the parts
  – Collection Reader
  – Analysis Engine(s)
  – CAS Consumer
• Aggregate analysis engine
  – Multiple Analysis Engines and their order

Options for running a pipeline
• CVD GUI
  – Single Aggregate Analysis Engine
  – No Collection Reader
• CPE GUI
• From Java code: parse a CpeDescription, produce a CollectionProcessingEngine from it, and invoke its process() method (see “2.3. Running a CPE from Your Own Java Application”)

Example: Running a pipeline
Running cTAKES within Eclipse using a CPE
• Use run configuration UIMA_CPE_GUI--clinical_documents_pipeline
• CPE test1.xml from clinical documents pipeline\desc\collection_processing_engine

Options for viewing annotations
• CVD
• Annotation viewer
• XML viewer
• Text editor

Example: Viewing annotations
Viewing annotations using the CVD
• Load the Type System
• Load the XCAS or XMI

Example: Running an AE in CVD
Using CVD to run an Analysis Engine
  – No Collection Reader
  – Single Analysis Engine (can be an aggregate)
  – No CAS Consumer
  – Just paste/type in text to process
Family history of hyperlipidemia.

Creating a New Annotator
• Create Java project
• Right click -> Add UIMA Nature
• Add UIMA jars to .classpath (Build Path)
• Create Analysis Engine (AE) descriptor
• Add types to AE descriptor, or optionally create separate Type System descriptor
• Write code!
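
For the "Write code!" step, the heart of an annotator is usually a routine that scans the document text and records begin/end offsets for each thing it finds. The sketch below shows only that logic in plain Java, with no UIMA dependency. In a real annotator it would typically live in the process(JCas) method of a class extending JCasAnnotator_ImplBase, and each hit would become an annotation object of a type generated by JCasGen and added to the indexes. `TermAnnotatorSketch`, `Mention`, and the two-term list are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TermAnnotatorSketch {

    /** Stand-in for a UIMA annotation: character offsets plus the covered text. */
    static final class Mention {
        final int begin;
        final int end;
        final String coveredText;
        Mention(int begin, int end, String coveredText) {
            this.begin = begin;
            this.end = end;
            this.coveredText = coveredText;
        }
    }

    // Hypothetical term list; a real annotator would more likely load terms from a resource.
    private static final Pattern TERMS =
            Pattern.compile("\\b(hyperlipidemia|hypertension)\\b", Pattern.CASE_INSENSITIVE);

    /** Returns one Mention per whole-word, case-insensitive match of a listed term. */
    static List<Mention> annotate(String documentText) {
        List<Mention> mentions = new ArrayList<>();
        Matcher m = TERMS.matcher(documentText);
        while (m.find()) {
            mentions.add(new Mention(m.start(), m.end(), m.group()));
        }
        return mentions;
    }

    public static void main(String[] args) {
        // prints 18-32 hyperlipidemia
        for (Mention x : annotate("Family history of hyperlipidemia.")) {
            System.out.println(x.begin + "-" + x.end + " " + x.coveredText);
        }
    }
}
```

The offsets follow the usual convention of an inclusive begin and an exclusive end, so the covered text is `documentText.substring(begin, end)`.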

Questions?

Supplemental slides follow

Example: Creating a PEAR file
• Right click -> Add UIMA Nature
• Right click -> Generate Pear
• Select Analysis Engine descriptor
• Select OS and JDK
• Modify Properties if needed
• Select what to include

Example: Modifying a parameter
UIMA’s descriptor editors allow you to modify most parameters without looking at the XML itself.

Links
• Getting started with UIMA: http://uima.apache.org/doc-uima-annotator.html
• UIMA Update site for use in Eclipse: http://www.apache.org/dist/incubator/uima/eclipse-update-site/

