Object Oriented Data Analysis, Last Time
• SiZer Analysis
  – Zooming version, -- Dependent version
  – Mass flux data, -- Cell cycle data
• Image Analysis
  – 1st Generation, -- 2nd Generation
• Object Representation
  – Landmarks
  – Boundaries
  – Medial

OODA in Image Analysis
First Generation Problems:
• Denoising
• Segmentation (find object boundaries)
• Registration (align objects)
(all about single images)

OODA in Image Analysis
Second Generation Problems:
• Populations of Images
  – Understanding Population Variation
  – Discrimination (a.k.a. Classification)
• Complex Data Structures (& Spaces)
• HDLSS Statistics

Image Object Representation
Major Approaches for Images:
• Landmark Representations
• Boundary Representations
• Medial Representations

Landmark Representations
Landmarks for fly wing data:

Landmark Representations
Major Drawback of Landmarks:
• Need to always find each landmark
• Need same relationship
• I.e. Landmarks need to correspond
• Often fails for medical images
• E.g. How many corresponding landmarks on a set of kidneys, livers or brains???
Boundary Representations
Major sets of ideas:
• Triangular Meshes
  – Survey: Owen (1998)
• Active Shape Models
  – Cootes, et al (1993)
• Fourier Boundary Representations
  – Keleman, et al (1997 & 1999)

Boundary Representations
Example of triangular mesh rep’n:
From: www.geometry.caltech.edu/pubs.html

Boundary Representations
Main Drawback: Correspondence
• For OODA (on vectors of parameters): Need to “match up points”
• Easy to find triangular mesh
  – Lots of research on this driven by gamers
• Challenge to match mesh across objects
  – There are some interesting ideas…

Medial Representations
Main Idea: Represent Objects as:
• Discretized skeletons (medial atoms)
• Plus spokes from center to edge
• Which imply a boundary
Very accessible early reference:
• Yushkevich, et al (2001)

Medial Representations
2-d M-Rep Example: Corpus Callosum (Yushkevich)
Atoms – Spokes – Implied Boundary

Medial Representations
3-d M-Rep Example: From Ja-Yeon Jeong
Bladder – Prostate – Rectum
Atoms – Spokes – Implied Boundary

Medial Representations
3-d M-reps: there are several variations
Two choices: From Fletcher (2004)

Medial Representations
Statistical Challenge
• M-rep parameters are:
  – Locations $\in \mathbb{R}^2, \mathbb{R}^3$
  – Radii $> 0$
  – Angles (not comparable)
• Stuffed into a long vector
• I.e. many direct products of these

Medial Representations
Statistical Challenge:
• How to analyze angles as data?
• E.g. what is the average of: 3°, 4°, 358°, 359°
  – 181° ??? (average of the numbers)
  – 1° (of course!)
• Correct View of angular data: Consider as points on the unit circle

Medial Representations
What is the average (181°?) or (1°?)
of: 3°, 4°, 358°, 359°

Medial Representations
Statistical Challenge
• Many direct products of:
  – Locations $\in \mathbb{R}^2, \mathbb{R}^3$
  – Radii $> 0$
  – Angles (not comparable)
• Appropriate View: Data Lie on Curved Manifold
  Embedded in higher dim’al Eucl’n Space

Medial Representations
Data on Curved Manifold Toy Example:

Medial Representations
Data on Curved Manifold Viewpoint:
• Very Simple Toy Example (last movie)
• Data on a Cylinder = $\mathbb{R}^1 \times S^1$
• Notes:
  – Simplest non-Euclidean Example
  – 2-d data, embedded on manifold in $\mathbb{R}^3$
  – Can flatten the cylinder, to a plane
  – Have periodic representation
  – Movie by: Suman Sen
• Same idea for more complex direct prod’s

A Challenging Example
• Male Pelvis
  – Bladder – Prostate – Rectum
  – How do they move over time (days)?
  – Critical to Radiation Treatment (cancer)
• Work with 3-d CT
  – Very Challenging to Segment
  – Find boundary of each object?
  – Represent each Object?

Male Pelvis – Raw Data
One CT Slice (in 3d image)
Labeled: Tail Bone, Rectum, Prostate

Male Pelvis – Raw Data
Prostate: manual segmentation
Slice by slice
Reassembled

Male Pelvis – Raw Data
Prostate: Slices: Reassembled in 3d
How to represent?
Thanks: Ja-Yeon Jeong

Object Representation
• Landmarks (hard to find)
• Boundary Rep’ns (no correspondence)
• Medial representations
  – Find “skeleton”
  – Discretize as “atoms” called M-reps

3-d m-reps
Bladder – Prostate – Rectum (multiple objects, J. Y. Jeong)
• Medial Atoms provide “skeleton”
• Implied Boundary from “spokes” → “surface”

3-d m-reps
M-rep model fitting
• Easy, when starting from binary (blue)
• But very expensive (30 – 40 minutes technician’s time)
• Want automatic approach
• Challenging, because of poor contrast, noise, …
• Need to borrow information across training sample
• Use Bayes approach: prior & likelihood → posterior
• ~Conjugate Gaussians, but there are issues:
• Major HDLSS challenges
• Manifold aspect of data

Mildly Non-Euclidean Spaces
Statistical Analysis of M-rep Data
Recall: Many direct products of:
• Locations
• Radii
• Angles
I.e.
points on smooth manifold
Data in non-Euclidean Space
But only mildly non-Euclidean

Mildly Non-Euclidean Spaces
Good source for statistical analysis of Mildly non-Euclidean Data:
Fletcher (2004), Fletcher, et al (2004)
Main ideas:
• Work with geodesic distances
• I.e. distances along surface of manifold

Mildly Non-Euclidean Spaces
What is the mean of data on a manifold?
• Bad choice:
  – Mean in embedded space
  – Since will probably leave manifold
  – Think about unit circle
• How to improve?
• Approach: study characterizations of mean
  – There are many
  – Most fruitful: Fréchet mean

Mildly Non-Euclidean Spaces
Fréchet mean of numbers:
\[ \bar{X} = \arg\min_x \sum_{i=1}^n (X_i - x)^2 \]
Fréchet mean in Euclidean Space:
\[ \bar{X} = \arg\min_x \sum_{i=1}^n \|X_i - x\|^2 = \arg\min_x \sum_{i=1}^n d(X_i, x)^2 \]
Fréchet mean on a manifold:
Replace Euclidean $d$ by Geodesic $d$

Mildly Non-Euclidean Spaces
Fréchet Mean:
• Only requires a metric (distance) space
• Geodesic distance gives geodesic mean
Well known in robust statistics:
• Replace Euclidean distance
• With Robust distance, e.g. $L^2$ with $L^1$
• Reduces influence of outliers
• Gives another notion of robust median

Mildly Non-Euclidean Spaces
E.g. Fréchet Mean for data on a circle

Mildly Non-Euclidean Spaces
E.g. Fréchet Mean for data on a circle:
• Not always easily interpretable
  – Think about “distances along arc”
  – Not about “points in $\mathbb{R}^2$”
  – Sum of squared distances “strongly feels the largest”
• Not always unique
  – But unique “with probability one”
  – Non-unique requires strong symmetry
  – But possible to have many means

Mildly Non-Euclidean Spaces
E.g. Fréchet Mean for data on a circle:
• Not always sensible notion of center
  – Sometimes prefer “top & bottom”?
  – At end: farthest points from data
• Not continuous Function of Data
  – Jump from 1 – 2
  – Jump from 2 – 8
• All false for Euclidean Mean
• But all happen generally for manifold data

Mildly Non-Euclidean Spaces
E.g.
Fréchet Mean for data on a circle:
• Also of interest is Fréchet Variance:
\[ \hat{\sigma}^2 = \min_x \frac{1}{n} \sum_{i=1}^n d(X_i, x)^2 \]
• Works like sample variance
• Note values in movie, reflecting spread in data
• Note theoretical version:
\[ \sigma^2 = \min_x E_X \, d(X, x)^2 \]
• Useful for Laws of Large Numbers, etc.

Mildly Non-Euclidean Spaces
Useful Viewpoint for data on manifolds:
• Tangent Space
• Plane touching at one point
• At which point? Geodesic (Fréchet) Mean
Hence terminology “mildly non-Euclidean” (pic next page)

Mildly Non-Euclidean Spaces
Pics from: Fletcher (2004)

Mildly Non-Euclidean Spaces
“Exponential Map” Terminology: From Complex Exponential Function
Exponential Map:
\[ \theta \mapsto e^{i\theta} = \cos\theta + i \sin\theta \]
($\theta$: In Tangent Space, $e^{i\theta}$: On Manifold)

Mildly Non-Euclidean Spaces
Exponential Map Terminology
Memory Trick:
• Exponential Map: Tangent Plane → Curved Manifold
• Log Map (Inverse): Curved Manifold → Tangent Plane

Mildly Non-Euclidean Spaces
Analog of PCA? Principal geodesics (PGA):
• Replace line that best fits data
• By geodesic that best fits the data (geodesic through Fréchet mean)
• Implemented as PCA in tangent space
• But mapped back to surface
• Fletcher (2004)

PGA for m-reps, Bladder-Prostate-Rectum
Bladder – Prostate – Rectum, 1 person, 17 days
PG 1 – PG 2 – PG 3
(analysis by Ja Yeon Jeong)

Mildly Non-Euclidean Spaces
Other Analogs of PCA???
• Why pass through geodesic mean?
• Sensible for Euclidean space
• But obvious for non-Euclidean?
Perhaps “geodesic that explains data as well as possible” (no mean constraint)?
• Does this add anything?
• All same for Euclidean case (since least squares fit contains mean)

Mildly Non-Euclidean Spaces
E.g.
PGA on the unit sphere:
Shown in figure: Unit Sphere Data, Geodesic Mean, PG 1, Best Fit Geodesic

Mildly Non-Euclidean Spaces
E.g. PGA on the unit sphere: Which is “best”?
• Perhaps best fit?
• What about PG2?
  – Should go through geo mean?
• What about PG3?
  – Should cross PG1 & PG2 at same point?
  – Need constrained optimization
• Gaussian Distribution on Manifold???
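
The PGA recipe from the slides (find the Fréchet mean, map the data to the tangent space with the log map, do PCA there, map back with the exponential map) can be sketched on the unit sphere as follows. This is only a minimal illustration in plain Python, not Fletcher's implementation. The helper names (`log_map`, `exp_map`, `frechet_mean`, `principal_geodesic`, `sphere_point`), the fixed-point iteration for the mean, the power iteration for the leading direction, and the toy data are all assumptions made for this example. It also assumes the data are concentrated (no antipodal pairs, not all identical).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def scale(u, c):
    return [c * a for a in u]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def log_map(p, x):
    """Log map: point x on the unit sphere -> tangent vector at p."""
    c = dot(p, x)
    w = add(x, scale(p, -c))      # part of x orthogonal to p
    nw = norm(w)
    if nw < 1e-12:                # x coincides with p (antipodes assumed absent)
        return [0.0, 0.0, 0.0]
    theta = math.atan2(nw, c)     # geodesic distance from p to x
    return scale(w, theta / nw)

def exp_map(p, v):
    """Exponential map: tangent vector v at p -> point on the unit sphere."""
    t = norm(v)
    if t < 1e-12:
        return list(p)
    return add(scale(p, math.cos(t)), scale(v, math.sin(t) / t))

def frechet_mean(points, iters=100):
    """Geodesic (Frechet) mean: average in the tangent space, map back, repeat."""
    m = list(points[0])
    for _ in range(iters):
        logs = [log_map(m, x) for x in points]
        step = scale([sum(col) for col in zip(*logs)], 1.0 / len(points))
        m = exp_map(m, step)
        m = scale(m, 1.0 / norm(m))   # guard against rounding drift
    return m

def principal_geodesic(points, iters=200):
    """Return (mean, direction): PG 1 is t -> exp_map(mean, t * direction)."""
    m = frechet_mean(points)
    logs = [log_map(m, x) for x in points]
    n = len(points)
    # Second-moment matrix of the tangent vectors (their mean is ~0 at m).
    cov = [[sum(v[i] * v[j] for v in logs) / n for j in range(3)]
           for i in range(3)]
    d = max(logs, key=norm)           # start power iteration from a data direction
    for _ in range(iters):
        d = [dot(row, d) for row in cov]
        d = scale(d, 1.0 / norm(d))   # assumes the data are not all identical
    return m, d

def sphere_point(lon, lat):
    return [math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat)]

# Hypothetical toy data: spread mostly along the equator, slightly off it.
data = [sphere_point(lon, 0.0) for lon in (-0.4, -0.2, 0.0, 0.2, 0.4)]
data += [sphere_point(0.0, 0.1), sphere_point(0.0, -0.1)]

mean, direction = principal_geodesic(data)
print("geodesic mean:", mean)
print("PG 1 direction (tangent vector at the mean):", direction)
print("point on PG 1:", exp_map(mean, scale(direction, 0.3)))
```

Note that this sketch forces PG 1 through the geodesic mean, which is exactly the constraint the slides question. A "best fit geodesic" with no mean constraint would need a different (constrained) optimization and is not attempted here.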