The Needle in the Haystack - Office of Information Technology

The Needle in the Haystack: Find the Offending File

Robert K. Henry, CISSP, GCIH, GCFA
Information Security Officer

HR Has an Employee Grievance
 Hostile Workplace – Sexual Harassment
 Inappropriate/offensive files stored on web server and displayed in office
 College Staff Already Involved

College Investigation
 Course Site Files Deleted
   Six weeks prior to HR grievance report
 No Backups!
   Backup system was on the fritz at the time the files were deleted

College Investigation
 How do we get the goods?
 College systems admin made manual backups to local PC drive
 Not removed from local drive after backup system was repaired

The Mission:
 Find inappropriate material among 6 GB of mixed images, word-processed, and text files
 Identify owner/creator of files
 More than 7000 files

Search Options
 Manual
 grep
 ssdeep
 foremost
 sorter
 Content Based Image Retrieval (CBIR)
Evaluation Criteria:
 Easy!
 Free!

Search Options
 Manual (The First Responder's Strategy)
   Thumbnails
   Slide Show
   One-at-a-time
 zzzzzzzzzzzzzzzzzz!
 Too much room for error
 Pretty inefficient (32 hours of searching)
 Two people spent two workdays each going through the DVDs

Search Options
 But . . . it worked!
 Identified inappropriate word-processed files and images in one directory on one of the DVDs
 Because the files had been copied multiple times, the creator/owner doesn't show up in Windows file properties
 Did I mention the files were uploaded via FTP with shared user IDs?
 Not much accountability!

Search Options

There’s gotta be an easier way!

Search Options--grep
 Built-in *nix string search command, also available for Windows
 Steps to conduct search with grep (1)
 Make a forensic image of the disks

#dd if=/dev/sr0 of=dvdimage.img conv=noerror,sync
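The slides do not mention verification, but a common follow-up to imaging is to confirm that the image matches the source by comparing hashes. The sketch below is only an illustration under that assumption: it uses an ordinary file as a stand-in for /dev/sr0 so it can run anywhere, and the file names are hypothetical. Note that conv=sync pads a short final block with NULs, so the hashes match only when the source size is a multiple of the block size.

```shell
# Stand-in for the DVD device: 8 blocks of 512 bytes (hypothetical test data).
workdir=$(mktemp -d)
dd if=/dev/zero bs=512 count=8 2>/dev/null | tr '\0' 'A' > "$workdir/source.bin"

# Same dd options as the slide, pointed at the stand-in file.
dd if="$workdir/source.bin" of="$workdir/dvdimage.img" conv=noerror,sync 2>/dev/null

# Hash both files and keep only the digest field of the sha256sum output.
src_hash=$(sha256sum < "$workdir/source.bin" | cut -d ' ' -f 1)
img_hash=$(sha256sum < "$workdir/dvdimage.img" | cut -d ' ' -f 1)

if [ "$src_hash" = "$img_hash" ]; then
    echo "image verified"
else
    echo "image differs from source"
fi
```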

Search Options--grep
 Steps to conduct search with grep (2)
 Extract Strings
 ASCII strings first

#strings --radix=d dvdimage.img > dvdimage.str

 Unicode strings second

#srch_strings -t d -e l dvdimage.img > dvdimage.uni.str

Search Options--grep
 Steps to conduct search with grep (3)
 Examine Strings Files
 Create “dirty word” file
 Use “dirty word” file to search strings for, well, dirty words

#grep -f dirtyWords.txt dvdimage.str > grepOutput.txt

#grep -f dirtyWords.txt dvdimage.uni.str > grepOutput.uni.txt
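By default grep is case-sensitive and treats each line of the word file as a regular expression, so a list of plain words may miss capitalized hits. A minimal sketch of the same search with -i and -F added follows; the sample strings file and the word list are made up for illustration and are not from the investigation.

```shell
workdir=$(mktemp -d)

# Hypothetical strings output: decimal byte offset, then the extracted text.
cat > "$workdir/sample.str" <<'EOF'
    512 Quarterly budget notes
   4096 This JOKE is not safe for work
   8192 Course syllabus, fall term
EOF

# Hypothetical word list, one term per line.
printf '%s\n' 'joke' 'nsfw' > "$workdir/dirtyWords.txt"

# -i ignores case, -F treats each term as a literal string, -f reads terms from a file.
hits=$(grep -i -F -f "$workdir/dirtyWords.txt" "$workdir/sample.str")
printf '%s\n' "$hits"
```

Each hit keeps the decimal offset written by the strings step, which points back to where the text sits in the image.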

Search Options--grep
 Results
 Process sounds a little involved, however . . .
 Took about 30 minutes to image the DVDs and run commands. Not bad!
 Identified word-processed files with inappropriate jokes
 Doesn't get image files (didn't expect it to)
 Doesn't identify creator of files
   Zero non-repudiation
   Doesn't help investigation confirm or deny ownership of files
 Bonus: found survey data with Too Much Information
   Protected student information in clear text

Search Options--ssdeep
 Linux and Windows
 http://ssdeep.sourceforge.net/
 Uses fuzzy hashing
   A “partial” or “inexact” hashing of files to identify similar files
 Its author, Jesse Kornblum, even uses the phrase “finding needles in haystacks” in his documentation!
 Haven't heard of it being used to find questionable pictures, but why not give it a try?

Search Options--ssdeep
 “ssdeep! Go find files in the test directory that look like files in the homeStuff directory!”

#ssdeep -lrd test homeStuff

 Bummer: identified exact matches only

Search Options--ssdeep
 Need to try carving out a portion of the file for true fuzziness
 Skip the first 20 blocks (header info and more) of the file and cut out the next 70 blocks for the hash comparison:

#dd if=dsc00219.jpg of=219partial.jpg skip=20 count=70

 Create file for comparison

#ssdeep 219partial.jpg > testhash.txt

 Compare fuzzy hash of image to images in directory

#ssdeep -lrm testhash.txt homeStuff
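With no bs option, dd works in 512-byte blocks, so skip=20 count=70 drops the first 10,240 bytes and keeps the next 35,840. The sketch below checks that arithmetic on a hypothetical stand-in file (not the original JPEG) and does not run ssdeep itself.

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for dsc00219.jpg: 100 blocks of 512 bytes.
dd if=/dev/zero of="$workdir/sample.bin" bs=512 count=100 2>/dev/null

# Same skip/count as the slide; dd defaults to 512-byte blocks.
dd if="$workdir/sample.bin" of="$workdir/partial.bin" skip=20 count=70 2>/dev/null

skipped_bytes=$(( 20 * 512 ))
carved_bytes=$(( $(wc -c < "$workdir/partial.bin") ))
echo "skipped $skipped_bytes bytes, carved $carved_bytes bytes"
```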

Search Options--ssdeep
 Results: Not Promising
 Can check for similarities in files on a file-by-file basis, but that's too much like a manual search
 Can easily find exact matches
   So you must already have the file you are looking for ???
 However . . .
 Useful for an intellectual property issue or finding known bad files

Search Options--foremost
 Linux and Windows
 http://sourceforge.net/projects/foremost/
 Identifies files based on a database of file headers and footers
 Find a list of most file headers at http://www.wotsit.org

Search Options--foremost
 This is the header of a gzip file displayed in a hex editor
 The gzip header is 0x1f 0x8b 0x08
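The header check can be illustrated on a single file with od. This is only a sketch of signature matching, not how foremost is implemented: foremost scans a whole image for headers and footers, while the hypothetical has_gzip_magic function below looks only at the first three bytes of one file. The sample files are made up for the example.

```shell
workdir=$(mktemp -d)

# Hypothetical sample files: one starting with the gzip magic bytes, one plain text.
printf '\037\213\010rest-of-file' > "$workdir/sample.gz"
printf 'just some text' > "$workdir/plain.txt"

# Succeeds if the first three bytes of the file are 1f 8b 08.
has_gzip_magic() {
    header=$(od -An -tx1 -N3 "$1" | tr -d ' \n')
    [ "$header" = "1f8b08" ]
}

for f in "$workdir/sample.gz" "$workdir/plain.txt"; do
    if has_gzip_magic "$f"; then
        echo "$f: gzip header found"
    else
        echo "$f: no gzip header"
    fi
done
```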

Search Options--foremost

#foremost -o pathToOutputDirectory -c pathToConfigFile pathToImageFile

foremost--Results

Search Options--sorter
 Linux and Windows
 Perl wrapper for several Sleuth Kit tools
 http://www.sleuthkit.org/
 Runs against a disk image
 Finds active or deleted files
 Then displays thumbnail view of the files

Search Options--sorter

#sorter -s -d pathToOutputDirectory pathToInputFile

Search Options--sorter
 Results
 Saves many steps compared to foremost
 Still have a bunch of thumbnails to look through

Search Options

There’s gotta be an easier way!

Search Options--CBIR
 Content Based Image Retrieval
 Commercial Versions Available
   My office (me) was too cheap; didn’t even look into commercial options!
 Free and Open Source
   imgSeek: Linux and Windows, http://www.imgseek.net/
   GNU Image Finding Tool (GIFT): Linux, http://www.gnu.org/software/gift/gift.html

Search Options--CBIR
 imgSeek Demo

Lessons Learned
 Mission Accomplished!
 Not so much
   Found inappropriate material among 6 GB of mixed images, word-processed, and text files
   Failed to identify owner/creator of files
   Identified a potentially useful tool

Lessons Learned
 Need to develop incident response procedure for entire organization
 Procedures for breaches of Personally Identifiable Information and Payment Card data are on the books
 Procedures for responding to HR requests need documentation
   And need distribution to decentralized IT units

References:
 The Sleuth Kit (includes sorter): http://www.sleuthkit.org/
 foremost: http://sourceforge.net/projects/foremost/
 ssdeep: http://ssdeep.sourceforge.net/
 imgSeek: http://www.imgseek.net/
 GIFT (GNU Image Finding Tool): http://www.gnu.org/software/gift/gift.html

Presentation available at:  http://boisestate.edu/oit/iso/HTCIA&CBIR.ppt

Questions?
