MySQL 5.5 and Scalability Ligaya Turmelle Senior Technical Support Engineer - MySQL ligaya.turmelle@oracle.com http://joind.in/1584

MySQL 5.5 and Scalability Ligaya Turmelle Senior Technical Support Engineer - MySQL [email protected] http://joind.in/1584

Transcript MySQL 5.5 and Scalability Ligaya Turmelle Senior Technical Support Engineer - MySQL [email protected] http://joind.in/1584

<Insert Picture Here>
MySQL 5.5 and Scalability
Ligaya Turmelle
Senior Technical Support Engineer - MySQL
[email protected]
http://joind.in/1584
The following is intended to outline our general
product direction. It is intended for information
purposes only, and may not be incorporated into any
contract. It is not a commitment to deliver any
material, code, or functionality, and should not be
relied upon in making purchasing decisions.
The development, release, and timing of any features
or functionality described for Oracle’s products
remains at the sole discretion of Oracle.
Areas to Cover
•
•
•
•
•
Semi-Synchronous Replication
Performance Schema
SIGNAL/RESIGNAL
More Partitioning Options
InnoDB - LOTS of InnoDB!
– Performance Improvements
– Scalability Improvements
<Insert Picture Here>
<Insert Picture Here>
Semi-Synchronous
Replication
http://dev.mysql.com/doc/refman/5.5/en/replication-semisync.html
History
• By default is asynchronous
– Master writes to binary log
– Slave connects and “pulls” contents of the binary log
• Master crashes
– No guarantee that a slave has all committed transactions
– Potential lost transactions
• Failover to slave
– May be missing transactions from master
– Loss of data integrity
What is Semi-Synchronous Replication?
• Alternative to asynchronous replication
• Midway point between asynchronous and fully
synchronous
• Doesn’t have to wait for all the slaves to *receive* the
event
– The master blocks, waiting for a single slave to acknowledge
the commit or timeout
• Doesn’t have to wait for all slaves to *commit* the
transaction
– The slave acknowledges after it receives all the events for the
transaction, write it to its relay logs, and flush to disk
• Provides improved data integrity at the cost of
performance
How does it works
• Best Case
– COMMIT
– Master blocks waiting for acknowledgment
– On acknowledgement the master returns to the session
• Worst Case
–
–
–
–
COMMIT
Master blocks
Timeout
Master reverts to asynchronous mode until at least one semisynchronous slave catches up
– Revert back to semi-sychronous mode
<Insert Picture Here>
Performance Schema
http://dev.mysql.com/doc/refman/5.5/en/performance-schema.html
Basic Info
• Monitoring MySQL Server execution at a *very* low
level.
– Used to inspect internal execution of the server at run time
• Focuses on performance data
• Uses the Performance Schema storage engine and
the Performance_Schema database
• Tables in Performance_Schema are views or
temporary tables.
• Available on all platforms but some limitations might
apply
• Activation causes no change in server behavior
• Average user has no need to mess with it - designed
for advanced users
<Insert Picture Here>
SIGNAL/RESIGNAL
http://dev.mysql.com/doc/refman/5.5/en/signal-resignal.html
SIGNAL
• Way to “return” an error.
• AKA exception-handling logic for stored procedures,
stored functions, triggers, events, and database
applications - think throw
• Allows you to control error number, SQLSTATE value,
message
• Can indicate errors, warnings, or “not found.”
• Requires no special permissions
CREATE PROCEDURE p (pval INT)
BEGIN
DECLARE specialty CONDITION FOR SQLSTATE '45000';
IF pval = 0 THEN
SIGNAL SQLSTATE '01000';
ELSEIF pval = 1 THEN
SIGNAL SQLSTATE '45000'
SET MESSAGE_TEXT = 'An error occurred';
ELSEIF pval = 2 THEN
SIGNAL specialty
SET MESSAGE_TEXT = 'An error occurred';
ELSE
SIGNAL SQLSTATE '01000'
SET MESSAGE_TEXT = 'A warning occurred', \
MYSQL_ERRNO = 1000;
SIGNAL SQLSTATE '45000'
SET MESSAGE_TEXT = 'An error occurred', MYSQL_ERRNO
= 1001;
END IF;
END;
RESIGNAL
• passes on error condition information available during
execution of a handler.
• Requires an active handler to execute
• Possible to both handle an error and return the error
information. Think catch/throw block.
• May change some or all information before passing it
on.
• No special permissions needed to use
• Multiple possible uses
– Alone
– With new signal information
– With a condition value and possibly new signal information
• Otherwise follows the rules of SIGNAL
RESIGNAL alone
CREATE PROCEDURE p ()
BEGIN
DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN
SET @error_count = @error_count + 1;
IF @a = 0 THEN RESIGNAL; END IF;
END;
DROP TABLE xx;
END
• Pass on error with no change
• Allows for the performance of additional actions in the
active handler, and then passing on the error
message without changing the original condition
information
RESIGNAL with New Signal Information
CREATE PROCEDURE p ()
BEGIN
DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN
SET @error_count = @error_count + 1;
IF @a = 0 THEN RESIGNAL SET MYSQL_ERRNO = 5;
END IF;
END;
DROP TABLE xx;
END
• Pass on error with changes
• Allows for the performance of additional actions in the
active handler, and can change any or all of the signal
information items
RESIGNAL with a Condition Value
and Possibly New Signal Information
CREATE PROCEDURE p ()
BEGIN
DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN
SET @error_count = @error_count + 1;
IF @a = 0 THEN RESIGNAL SQLSTATE '45000' SET
MYSQL_ERRNO=5; END IF;
END;
DROP TABLE xx;
END
• Push a new “condition” into the current error stack
• Similar effects of previous example, except instead of
rewriting the signal information it adds a new signal to
the top of the error stack
<Insert Picture Here>
Column Partitioning
Options
http://dev.mysql.com/doc/refman/5.5/en/partitioning-columns.html
Column Partitioning
• Variants on RANGE and LIST partitioning
– RANGE COLUMNS
– LIST COLUMNS
• Allows the use of multiple columns in partitioning keys
– All columns taken into account
• for placing rows in partitions
• for use in partition pruning.
• Support the use of non-integer columns
– All integer types
– DATE and DATETIME
– CHAR, VARCHAR, BINARY, and VARBINARY
• Great article at http://dev.mysql.com/techresources/articles/mysql_55_partitioning.html
Range Column
• Allows for multiple column values
• Major differences from just RANGE
– Does not accept expressions, only names of columns
– Accepts a list of one or more columns
– Not restricted to integer columns
• Go to the example
CREATE TABLE rc1 (a INT, b INT)
PARTITION BY RANGE COLUMNS(a, b) (
PARTITION p0 VALUES LESS THAN (5, 12),
PARTITION p3 VALUES LESS THAN (MAXVALUE, MAXVALUE)
);
mysql> INSERT INTO rc1 VALUES (5,10), (5,11), (5,12);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> SELECT PARTITION_NAME,TABLE_ROWS
->
FROM INFORMATION_SCHEMA.PARTITIONS
->
WHERE TABLE_NAME = 'rc1';
+--------------+----------------+------------+
| TABLE_SCHEMA | PARTITION_NAME | TABLE_ROWS |
+--------------+----------------+------------+
| p
| p0
|
2 |
| p
| p1
|
1 |
+--------------+----------------+------------+
List Column
• Again allows for multiple column values
• Allows you to use
– string types
– DATE,
– DATETIME
• Do not need to convert values to integers to work with
• Much easier to read
• Go to the example
CREATE TABLE customers_1 (
first_name VARCHAR(25),
last_name VARCHAR(25),
street_1 VARCHAR(30),
city VARCHAR(15),
renewal DATE
)
PARTITION BY LIST COLUMNS(city) (
PARTITION NorthEast VALUES IN('Boston', 'New York',
'Providence'),
PARTITION South VALUES IN('Miami', 'Atlanta', 'New
Orleans'),
PARTITION MidWest VALUES IN('Chicago', 'Houston',
'Denver'),
PARTITION West VALUES IN('Los Angeles', 'San
Francisco', 'Seattle')
);
<Insert Picture Here>
InnoDB
http://dev.mysql.com/doc/innodb-plugin/1.1/en/innodb-performance.html
System Mutex Library
• Originally used its own implementation of mutexes
and read/write locks with Pthreads
• From 1.0.3, changed to the GNU Compiler Collection
(GCC) for atomic memory access instead of using the
Pthreads approach previously used.
• Faster more efficient locking
• Greater scalability on multi-core platforms.
• Enabled out of the box on platforms where supported
• Freebie
Memory Allocator
• History:
– Originally built its own and protected by a single mutex
– Also included a wrapper around the system allocator (malloc
and free)
• also protected with a single mutex
– All bad for multi-core computers
• Great improvements have occurred since then
– Perform better and are more scalable than they were in the
past
• Most workloads benefit from using the new
schedulers.
• innodb_use_sys_malloc
Change Buffer
• Most relevant to IO bound systems
– Will cache changes to secondary index entries
– Can buffer inserts, delete-marking operations, and purges
• But...
– Resides in buffer pool - reducing memory to cache data
– Increases risk of a purge hold
• Can control whether InnoDB performs this buffering
using innodb_change_buffering.
– Multiple values available to buffer only what you want
Read Ahead
• Prefetch multiple pages in the buffer cache
asynchronously
• Only the Linear algorithm now
– Based on access pattern of pages, not just their number
• Can now control when InnoDB performs a read-ahead
operation with innodb_read_ahead_threshold
– Higher the value, the more strict the access pattern
• Efficiency can be found with
Innodb_buffer_pool_read_ahead_evicted <
Innodb_buffer_pool_read_ahead
Background IO Threads
• History
– 1 read, 1 write, 1 log, 1 insert buffer (non-Windows)
– Windows used innodb_file_io_threads (deprecated)
• Now configurable
– Number of background *helper* threads tasked with servicing
any read and write I/O on the data pages
• Makes InnoDB more scalable on high end systems
INNODB_IO_CAPACITY
• Controls the IO capacity of the InnoDB Master thread.
– It controls how many dirty pages are flushed and the size of
the ibuf contraction
– Also handles the purge but...
• Should be set to approximately the number of I/O
operations that the system can perform per second
Improved Purge Scheduling
• History
– One of the operations handled by the Master Thread
• Now moved to its own thread
– separates main database operations from maintenance work
happening in the background.
• Improves scalability
• May not see improvement on speed since it may hit
new types of contention
– lays groundwork for further tuning and potentially more then
one purge thread.
• Enable with innodb_purge_threads=1
Adaptive Dirty Page Flushing
• History
– Performed in background
– Performed by master thread
– Currently will aggressively flush buffer pool pages if
innodb_max_dirty_pages_pct percentage exceeded
• New algorithm
– Intent is to smooth overall performance, eliminating steep dips
in throughput
– Function of the number of dirty pages in the buffer cache and
the rate at which redo is being generated
• self adapting to sudden changes in workload
• better performance
– Turn on and off with innodb_adaptive_flushing
PAUSE in spin loops
• Most people won’t care about this
• But... if you are CPU bound - you will
– Increases overall performance with CPU-bound workloads
– Added benefit of minimizing power consumption during the
execution of the spin loops.
• Ugly details:
– If the spin loops are executed too quickly, system resources
are wasted, imposing a relatively severe penalty on
transaction throughput
– Uses a PAUSE instruction in its spin loops on all platforms
where such an instruction is available.
• Nothing needs to be done to take advantage of this
Faster Recovery
• A number of optimizations speed up certain steps of
the recovery that happens on the next startup after a
crash.
– Particularly scanning the redo log and applying the redo log
are faster.
• If you kept the size of your logfiles artificially low
because recovery took a long time (1 hour to 1 day),
you can consider increasing the logfile size.
• No action needed to get this
Buffer Cache - Scan Resistant
• Problem: with large scans it is possible to move all
your “hot” data out of the buffer pool.
– More IO loading the hot data back into the buffer pool
– Slow response times
• How to prevent it:
– Mixed OLTP type with periodic batch reporting queries which
result in large scans
• set innodb_old_blocks_time for the duration of the batch
run
– Scanning large tables that cannot fit entirely in the buffer pool
• set innodb_old_blocks_pct to a small value
• Always benchmark to verify the effectiveness
Multiple Buffer Pools
• Primarily useful for people with a large buffer pool
size
• Reduce the chances of encountering bottlenecks
trying to access the same buffer pool
• Pages are stored in or read from random buffer pools
• Each buffer pool manages its own internal structures
and is protected by its own buffer pool mutex
• innodb_buffer_pool_instances
• innodb_buffer_pool_size is divided up among all the
buffer pools
Multiple Rollback Segments
• History:
– A bottleneck on high-capacity systems - currently it can
handle a maximum of 1023 concurrent transactions that
change any data
• Limit on concurrent transactions is greatly expanded
• Improves
– Scalability - higher number of concurrent transactions
– Performance - less contention when different threads access
the rollback segments
• To enable
– Do a slow shutdown (innodb_fast_shutdown=0) before
upgrading
Log Sys Mutex
• Historically did double duty:
– Controlled access to internal data structures related to log
records and the LSN
– Controlled access to pages in the buffer pool that are
changed when a mini-transaction is committed.
• Now broken out into separate mutexs
• Improves performance
• Nothing to enable - it is free
Questions?