Getting Control of Your Instances with Policy

Download Report

Transcript Getting Control of Your Instances with Policy

Searching Binary Data in SQL Server 2012

Steve Jones SQLServerCentral.com

Coming up…

Speaker

Bob Ward

Title

Windows Azure SQL Database Troubleshooting

Room

Theatre Chris Webb DAXMD: SSAS Multidimensional meets DAX and Power View Exhibition B Argenis Fernandez Lean and Mean: Running SQL Server on Windows Server Core Suite 3 Tim Mitchell Mark Broadbent Andre Kamman Cleaning Up Dirty Data with SSIS Moves Like Jagger: Upgrading to SQL Server 2012 ETL shootout-SSIS vs Powershell Suite 1 Suite 2 Suite 4 # SQLBITS

Agenda

• • • • Binary Data Full Text in SQL Server 2012 Basic Searches Semantic Search

Agenda

• • • • Binary Data Full Text in SQL Server 2012 Basic Searches Semantic Search

Binary Data

• Types of data – Structured (normal, RDBMS tables) – Semi-structured (XML) – Unstructured (BLOBs, music, images, documents)

Demo Binary Data

Binary Data

Binary Data

Binary Data

• • • Unstructured data in SQL Server – Notes, memos?

– XML – Varchar(max)/varbinary(max) Filestream Filetable

Filestream

• • • • Introduced in SQL Server 2008 Improves management of file-like data by integrating backup/restore/transactions Improves performance by storing the data in the file system.

Ex: AdventureWorks.Production.Document

FileTable

• • • • • New in SQL Server 2012 Built on Filestream Allows a folder to appear as a table Explorer style access to the table Avoids complex programming to access Filestream data.

Demo Filestream/Filetable

Agenda

• • • • Binary Data Full Text in SQL Server 2012 Basic Searches Semantic Search

Full Text in SQL Server 2012

• • • • • Major rewrite of Full Text Indexing and Search in SQL Server 2008.

FTS -> iFTS Process is now integrated inside SQL Server – Sqlservr.exe (searching) – Fdhost.exe (filters) Index stored as an internal table Backup/restore now integrated

Full Text in SQL Server 2012

• • • • • Performance increases – – Better scalability (350mm), parallelism, indexing Max full-text crawl range (CPU) – Master merge DOP New languages (Czech, Greek) New word breakers/stemmers Property Lists Customizable NEAR

Full Text in SQL Server 2012

• • • • Word breakers Stemmers Stoplists Thesaurus file

Full Text in SQL Server 2012

• • • Full Text Search Programming – – CONTAINS CONTAINSTABLE – FREETEXT – FREETEXTTABLE Language specific searches – multi-language – use UNION Some objects do not allow FTS

Agenda

• • • • Binary Data Full Text in SQL Server 2012 Basic Searches Semantic Search

iFilters

• iFilters – Filter to allow you to search the content of unstructured data.

– Standard format ( iFilter Interface ) – Basic Office 2007 filters included.

– Download pdf, Office 2010 filters

Searching Binary Data

• • Searching really requires Full Text Search subsystem.

Need iFilters to ignore the metadata

Searching Binary Data

• Property Lists – Allow searches of standard properties for documents • i.e. Title, Name, Author, etc.

– Can be varbinary/image or Filestream documents – Troubleshoot TF 7603

Demo Binary Data Search

Agenda

• • • • Binary Data Full Text in SQL Server 2012 Basic Searches Semantic Search

Semantic Search

• • • New in 2012 – V1.0

Find the meaning of the documents and use that for matching. Not just keywords

Semantic Search

Semantics (from Greek : sēmantiká, neuter plural of sēmantikós) [1][2] is the study of meaning . It focuses on the relation between signifiers, such as words , phrases , signs , and symbols , and what they stand for, their denotata .

Semantic Search

• • • • How does this work?

TF-IDF (term frequency - inverse document frequency) Document Similarity Index – Cosine similarity algorithm Based on “keyword distribution in the language”

Semantic Search

Semantic Search

• • • • SQL Server 2012 – Need to use semanticsdb from Microsoft – Set of phrases for each language – Hard coded (no learning!) Only unigrams in SQL Server 2012 Look for ngrams in the future Supported in query plans and extended events

Demo Semantic Search

Coming up…

Speaker

Bob Ward

Title

Windows Azure SQL Database Troubleshooting

Room

Theatre Chris Webb DAXMD: SSAS Multidimensional meets DAX and Power View Exhibition B Argenis Fernandez Lean and Mean: Running SQL Server on Windows Server Core Suite 3 Tim Mitchell Mark Broadbent Andre Kamman Cleaning Up Dirty Data with SSIS Moves Like Jagger: Upgrading to SQL Server 2012 ETL shootout-SSIS vs Powershell Suite 1 Suite 2 Suite 4 # SQLBITS

The End

• • • • Questions?

Please fill out your evaluations www.sqlservercentral.com/forums www.voiceofthedba.com/talks

References

• • • • • Full Text Search http://msdn.microsoft.com/en us/library/ms142571 What’s New - http://msdn.microsoft.com/en us/library/cc645577 Behavior Changes to Full Text Search http://msdn.microsoft.com/en us/library/ms143272.aspx

Breaking Changes in Full Text Search http://msdn.microsoft.com/en us/library/ms143709.aspx

Sp_fulltext_service http://msdn.microsoft.com/en us/library/ms175058.aspx

References

• • • • iFilter Interface http://msdn.microsoft.com/en us/library/ms691105%28v=vs.85%29.aspx

Office 2012 Filter Pack http://www.microsoft.com/en us/download/details.aspx?id=17062 How to register filter packs in SQL Server http://support.microsoft.com/kb/945934 Adobe PDF iFilter http://www.adobe.com/support/downloads/det ail.jsp?ftpID=2611

References

• • • • Find Property Set GUIDs and Property Integer IDs for Search Properties http://msdn.microsoft.com/en us/library/ee677618 Configure and Manage Word Breakers and Stemmers for Search http://msdn.microsoft.com/en us/library/ms142509 Configure and Manage Stopwords and Stoplists for Full-Text Search http://msdn.microsoft.com/en us/library/ms142551 Configure and Manage Thesaurus Files for Full-Text Search http://msdn.microsoft.com/en us/library/ms142491

References

• • • Semantic Search – http://msdn.microsoft.com/en us/library/gg492075 Beyond Relational – Semantic Search http://blogs.technet.com/b/andrew/archive/2 011/09/06/beyond-relational-semantic search-with-sql-server-filetable.aspx

MySemanticSearch – Codeplex http://mysemanticsearch.codeplex.com/

References

• • Full text and Semantic Search in SQL Server 2008 and 2012 http://channel9.msdn.com/Events/TechDays/ Techdays-2012-the Netherlands/2297?format=html5 http://users.cis.fiu.edu/~lzhen001/activities/K DD2011Program/docs/p213.pdf

Images