Title of Presentation

Download Report

Transcript Title of Presentation

Michael Epprecht IT Pro Evangelist Microsoft Corporation [email protected]

http://blogs.technet.com/chitpro-en

Corruption does happen, mostly caused by IO subsystem problems People don’t realize they have corruption until too late Either they don’t know how to check for corruption or they miss the warning signs People don’t know what to do when they do have corruption, leading to: More data loss and downtime than necessary Monetary and even job losses Overall lowered perception of SQL Server Makes it harder to convince management that SQL Server is Enterprise capable If using SQL Server 2008 and Database Mirroring, Primary can get bad page from Partner if it is OK on the Partner.

Three types 823 (a hard I/O error) 824 (a soft I/O error) 825 (a read-retry error) Nice error messages in 2005+ Msg 824, Level 24, State 2, Line 1 SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x7232c940; actual: 0x720e4940). It occurred during a read of page (1:143) in database ID 8 at offset 0x0000000011e000 in file 'c:\sqlDB\CORRUPT.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

Logged in msdb.dbo.suspect_pages

Input into single-page restore operations

SQL Server allows pages to be ‘protected’ on disk from corruptions Allows fast detection of corruptions Set using ALTER DATABASE SET PAGE_VERIFY

SQL Server 6.0 onwards Possible for a page to partially written in the event of a power failure – i.e. a torn page Torn-page detection protects SQL Server against this Takes two-bits from each sector, stores them in the page header and writes an alternating bit pattern in each sector On subsequent read, if the pattern is disrupted, the page is torn Does not detect corruptions within a disk sector

New in SQL Server 2005 Per-page checksum Written as the very last thing SQL Server does on a physical write Checked as the very first thing SQL Server does on a physical read Provides the ‘smoking gun’ that the error is not due to SQL Server On by default for new databases in SQL Server 2005+ Checksum failures result in an 824 error CPU overhead of approx 2% Error-detecting, not error correcting Superset of torn-page detection

Checked when: Page is read normally Page is read during CHECKDB Page is read during BACKUP WITH CHECKSUM Page is contained within a checksum’d backup Be Aware: Not available for TEMPDB until SQL Server 2008 Switching it on doesn’t do anything until pages are written…

The only way to read all allocated pages in the database Use to force page checksums to be checked Choose between full checks and WITH PHYSICAL_ONLY Many algorithms to minimize runtime and run ONLINE since SQL Server 2000 Run it at least weekly on each database

By default, CHECKDB will: Only return the first 200 errors Return lots of info that’s distracting in a corruption situation Use the following command with only these options: DBCC CHECKDB (yourdb) WITH ALL_ERRORMSGS, NO_INFOMSGS If it’s taking longer than usual, that should mean that it found some corruption Check the error log for message 5268 from SQL Server 2005 SP2 onwards to see if it’s rescanning some data Most importantly, wait for it to complete

Restart SQL Server Just wastes time and delays getting back online Immediately jump to a last resort and cause data loss without working through options Running repair Rebuilding the transaction log Detach a suspect database It will fail to attach again – now the situation is even worse!

This is the 2nd worst state to be in

Have a documented plan Test the plan Practice the plan with everyone in the team Do you now how to restore your system databases too? Have you tried it?

Have Microsoft Support number handy

"Backup Strategy" means nothing...do you have a "Restore Strategy" too?

Test the backups...RESTORE DATABASE....WITH VERIFYONLY to test backup on another server Keep at least 2 full backups....you never know if you can't restore the last one Backup Transaction logs very often (10 minutes - 3 hours) Use Backup Compression where possible Don't backup to Data or Log drive, get the backup ASAP off the machine Backup system DB's too The reason you have backups is so you can restore them…

Best way to avoid data loss Not necessarily the best way to avoid downtime Depends what kind of backups are available Although backup compression in SQL Server 2008 helps… Plethora of options available Full database backup is a good starting point Series of transaction log backups as well is much better Remember: Backups have to exist to be useful Backups have to be valid to avoid data loss

Always use Minimum Permissions to do your daily work Don’t use SA account for anything Apply the latest hotfix and service pack If using SQL authentication, use password expiry and password complexity rules Don't give anyone but the DBA team sysadmin, not even developers Increase the number of error logs kept Audit failed access to objects

Disconnect from a SQL Server when you are not using it...mistakes happen if you select the wrong server Turn on failed login auditing at a minimal Don't allow applications to use SA or sysadmin role Use the lowest possible permissions for your application Use Windows Authentication where possible Use SQL Server 2008’s Policy Based Management to ensure all rules are adhered to.

Method to pass rogue SQL statements into SQL Server Allows a hacker to potentially access the rest of the network, probe the SQL Server or see data Runs under the authentication of whatever account the application uses Not just a SQL Server issue!

Physical Security Limiting access to the machine itself, backups, and copies of data Encryption of data files and backups – Transparent Data Encryption Authentication Logins – Windows Authentication, SQL Server Authentication Strong passwords, password expiration policies Endpoints – restrict connections by protocol, login, etc.

Authorization Separation of duties Permissions, users, roles, access through SPs or views only No direct access to tables No permissions directly to users; grant to roles and put users in roles Separation of data Instances, databases, schemas, views – or perhaps encrypt it with certificates or keys Principle of least privilege – from service accounts to users Auditing – tracking who did what when – Built into SQL Server 2008

What SQL Server Logs SQL Server Agent Logs Operating System Logs SQL Server Agent Jobs SQL Server Alerts (Severity of 11 and higher should be configured to send alerts. Very Critical are severity 19 (fatal) and higher) Disk and Database Free Space Performance

How System Center Operations Manager System Center Essentials Write your own scripts E-mail Alerts for SQL Server Agent Jobs In Windows 2008 Server, you can create alerts on OS events that can be sent to you via e mail.

Management Data Warehouse in SQL Server 200 8

Benchmark Disk Performance (SQLIOSim) Test Server Performance before you go live with a server Create a small standard benchmark Use it to compare different servers Run your benchmark when you think there are performance issues after go-live and compare to pre-go live Run test as part of HotFix, Cumulative Update or Service Pack validation Run test after major OS updates, Firmware, BIOS and driver updates.

Very useful, but don’t Misuse Autoshrink is BAD.

Send notifications from within the Plan. Don’t rely on the Job (What if you run the Plan manually)

Auto Create Statistics: On Auto Update Statistics: On Auto Shrink: Off Page Verify: Checksum Autogrowth: Use mainly for catching mistakes. File growth should be managed manually. Use fixed amount, not percentage growth.

Production databases should be set to FULL RECOVERY.

Indexes need to be rebuilt or reorganized regularly to minimize fragmentation and reduce wasted space.

Rebuild an index if it is heavily fragmented (>30%). In Enterprise Edition, can perform online. If Standard Edition, consider it an off-line job. This automatically updates statistics, do you don’t need to do this again.

Reorganize an index if it is not heavily fragmented (>5% and <= 30%). This is an online operation and doesn’t use a lot of resources. You must update statistics afterwards, as this is not automatically done for you.

Ideally, you should only rebuild or reorganize indexes that need rebuilding, especially for very large databases. Use sys.dm_db_index_physical_stats to identify what tables/indexes need to be rebuilt/reorganized.

If databases are small, or you don’t know how to identify and correct individual indexes that are fragmented, then consider running a weekly job to rebuild or reorganize all indexes in all of your user databases.

Catch Corruption before it catches you Your backups and plans are worthless, unless you have tested them, recently. The reason you have backups is so you can restore them.

Secure your platform and data before someone plublishes it on the Internet for you Automate, but don’t forget about your servers, let them tell you when they are sick Test your server before you let it loose Look after your databases, they are all you have

http://blogs.technet.com/chitpro-en

Swiss IT Pro Blog: http://blogs.technet.com/chitpro-en Schweizer IT Pro Blog: http://blogs.technet.com/chitpro-de Schweizer TechNet: http://technet.microsoft.com/de-ch/

check out these websites, blogs & more!

Presentations

TechDays: www.techdays.ch

MSDN Events: http://www.microsoft.com/switzerland/msdn/de/presentationfinder.mspx

MSDN Webcasts: http://www.microsoft.com/switzerland/msdn/de/finder/default.mspx

MSDN Events

MSDN Events: http://www.microsoft.com/switzerland/msdn/de/events/default.mspx

Save the date: Tech•Ed 2009 Europe, 9-13 November 2009, Berlin

MSDN Flash (our by weekly newsletter)

Subscribe: http://www.microsoft.com/switzerland/msdn/de/flash.mspx

MSDN Team Blog

RSS: http://blogs.msdn.com/swiss_dpe_team/Default.aspx

Developer User Groups & Communities

Mobile Devices: http://www.pocketpc.ch/ Microsoft Solutions User Group Switzerland: www.msugs.ch

.NET Managed User Group of Switzerland: www.dotmugs.ch

FoxPro User Group Switzerland: www.fugs.ch

check out these websites, blogs & more!

Presentations

TechDays: www.techdays.ch

TechNet Events

TechNet Events: http://technet.microsoft.com/de-ch/bb291010.aspx

Save the date: Tech•Ed 2009 Europe, 9-13 November 2009, Berlin

TechNet Flash (our by weekly newsletter)

Subscribe: http://technet.microsoft.com/de-ch/bb898852.aspx

Schweizer IT Professional und TechNet Blog

RSS: http://blogs.technet.com/chitpro-de/

IT Professional User Groups & Communities

SwissITPro User Group: www.swissitpro.ch

NT Anwendergruppe Schweiz: www.nt-ag.ch

PASS (Professional Association for SQL Server): www.sqlpass.ch

7. – 8. April 2010 Congress Center Basel

Premium Sponsoring Partners Classic Sponsoring Partners Media Partner