SQL/MED and More

Download Report

Transcript SQL/MED and More

SQL/MED AND MORE
Management of External Data
Database Seminar HS11/12
Overview

Introduction

SQL/MED

Linking PostgreSQL & MSSQL

Further Information (about SQL/MED)

Conclusion
Introduction (1/2)



Different Database Managemenent Systems
Each system has different benefits
Possible scenarios
…
…
Introduction (2/2)



SQL/MED gives new opportunities
Use other systems as needed
That’s possible? Really?
SQL/MED (1/3)

SQL/MED defined in ISO/IEC 9075-9:2003
 Management

of External Data
Two concepts
 Foreign
Data Wrappers
 Datalinks

At least 10 years old
 Not
very widespread
 Most “googled” information is PostgreSQL related
SQL/MED (2/3)
Foreign Data Wrappers

Access external data

FDW is a library
 Programming
language neutral
 Compile for different OS’s

Good idea – breakthrough?
 API
 Existing
technologies
AdvanceInitRequest
AllocDescriptor
AllocQueryContext
AllocWrapperEnv
Close
ConnectServer
FreeDescriptor
FreeExecutionHandle
FreeFSConnection
FreeQueryContext
FreeReplyHandle
FreeWrapperEnv
GetAuthorizationId
GetBoolVE
GetDescriptor
GetDiagnostics
GetDistinct
GetNextReply
GetNumBoolVE
GetNumChildren
GetNumOrderByElems
GetNumReplyBoolVE
GetNumReplyOrderBy
GetNumReplySelectElems
GetNumReplyTableRefs
GetNumRoutMapOpts
GetNumSelectElems
GetNumServerOpts
GetNumTableColOpts
GetNumTableOpts
GetNumTableRefElems
GetNumUserOpts
GetNumWrapperOpts
GetOpts
GetOrderByElem
GetReplyBoolVE
GetReplyCardinality
GetReplyDistinct
GetReplyExecCost
GetReplyFirstCost
GetReplyOrderElem
GetReplyReExecCost
GetReplySelectElem
GetReplyTableRef
GetRoutineMapping
GetRoutMapOpt
GetRoutMapOptName
GetSelectElem
GetSelectElemType
GetServerName
GetServerOpt
GetServerOptByName
GetServerType
GetServerVersion
GetSPDHandle
GetSQLString
GetSRDHandle
GetStatistics
GetTableColOpt
GetTableColOptByName
GetTableOpt
GetTableOptByName
GetTableRefElem
GetTableRefElemType
GetTableRefTableName
GetTableServerName
GetTRDHandle
GetUserOpt
GetUserOptByName
GetValExprColName
GetValueExpDesc
GetValueExpKind
GetValueExpName
GetValueExpTable
GetVEChild
GetWPDHandle
GetWrapperLibraryNam
e
GetWrapperName
GetWrapperOpt
GetWrapperOptByName
GetWRDHandle
InitRequest
Iterate
Open
ReOpen
SetDescriptor
TransmitRequest
SQL/MED (3/3)
Data links

Link files like cell values

DBMS becomes “manager”
 Only
process allowed to
change the file
 Integrity mechanism

Good idea – breakthrough?
 Very
OS heavy
 Existing technologies
Linking PostgreSQL & MSSQL (1/4)

Microsoft Linked Servers

SQL/MED: Foreign Data Wrappers
Linking PostgreSQL & MSSQL (2/4)
Microsoft Linked Servers

OLE DB
Very similar to Foreign Data Wrappers
 Connection to "wrappers" via interface


Related to ODBC



Not limited to SQL
C++ instead of C
Widespread


Many OLE DB providers
available
Supports ODBC
Linking PostgreSQL & MSSQL (3/4)
PostgreSQL Foreign Data Wrappers (1/2)


Using the OBDC_FDW extension
One time
CREATE FOREIGN DATA WRAPPER odbc_fdw
CREATE EXTENSION odbc_fdw;

odbc_fdw.so
LIBRARY 'odbc_fdw.so‘;
Each time
CREATE SERVER odbc_server
OPTIONS (dsn ‘…DSN…');
FOREIGN DATA WRAPPER odbc_fdw
passed to
FDW odbc_table (
CREATE FOREIGN
TABLE
db_id integer,
SELECT
db_name varchar(255)
)
SERVER odbc_server
OPTIONS (… sql_query 'select id, name from `dbo`.`table`' …);
passed to FDW
Linking PostgreSQL & MSSQL (4/4)
PostgreSQL Foreign Data Wrappers (2/2)


PostgreSQL proprietary API for FDWs
‘C’ Code

Method pointer in header
odbc_fdw_handler(PG_FUNCTION_ARGS)
{
FdwRoutine *fdwroutine = makeNode(FdwRoutine);
fdwroutine->PlanForeignScan = odbcPlanForeignScan;
fdwroutine->ExplainForeignScan = odbcExplainForeignScan;
fdwroutine->BeginForeignScan = odbcBeginForeignScan;
fdwroutine->IterateForeignScan = odbcIterateForeignScan;
fdwroutine->ReScanForeignScan = odbcReScanForeignScan;
fdwroutine->EndForeignScan = odbcEndForeignScan;
PG_RETURN_POINTER(fdwroutine);
}
Further Information about SQL/MED (1/4)

Query costs

Interesting applications
Further Information about SQL/MED
Query costs (1/2)

Consider the following tables
City
Employee
-
500

ID
Name
Gender
*
1..* -
ID
Name
Latitude
Longitude
Row count of a JOIN statement (all employees)
Best case
500 rows
 Worst case 100’000 rows


200
Best execution strategy
External system performs JOIN?
 Perform JOIN locally?

Further Information about SQL/MED
Query costs (2/2)

SQL Server
Costs
PlanForeignScan

1.0 $ per transferred row

0.1 $ per local join operation
FDW
External System

Strategy #A



SELECT * FROM Employee JOIN City
 100’000$
Strategy #B

Worst
Case
SELECT * FROM Employee
 500$

SELECT * FROM City
Scenario
 200$
Clear win for #A

Important to implement PlanForeignScan

Local JOIN 500x200 = 10’000$
Further Information about SQL/MED
Interesting applications

Extension www_fdw to query all Restful Webservices
CREATE SERVER google_server FOREIGN DATA WRAPPER www_fdw
OPTIONS (uri 'https://ajax.googleapis.com/search/web?v=1.0');
CREATE FOREIGN TABLE google_table (
title text, snippet text, link text,
q text)
SERVER google_server;
Field legend
• Response
• Request
select * from google_table where q =’cat dog’ limit 1;
title
|
snippet
|
link
-------------------+----------------------------------------------------CatDog – Wikipedia | CatDog is an American... | http://en.wikipedia...
Conclusion

Great concepts



Drawbacks which prevent the breakthrough




FDW: Accessing external data via standard interfaces
Datalink: Create secure links from tuples to files
Far too complex API
Existing technologies (Microsoft, Oracle)
Documentation
Do we really need it?

Most environments are based on 1 server technology



Use built-in "MED" (Linked Servers, DBLink)
Other ways to solve problems
Many years to stable release
Outlook

Relies on community
 Stable
wrappers needed
 Other DBMS need to push it

Uncertain future
The End

Questions?