Category Archives: MySQL

MySQL IF() statement

MySQL IF() statement is an usual if else statement we used, just that the syntax is different, we shall conform the MySQL standard to write a MySQL if statement. Its used when we wanna check if condition in our sql statement. Example below:
1.$sql="SELECT * FROM users WHERE DATEDIFF(CURDATE(),IF('register_date'=NULL,CURDATE(),'register_date'))='1'";

MySQL IF() statement holds 3 expressions, IF(expr1, expr2, expr3). expr1 used to check condition, if expr1 TRUE, then return expr2, else return expr3. The example above interpreted as get all the users records who register yesterday, where difference between today and register date equals to 1 day. When it performs CURDATE() function, it will checks if register date is NULL, if its NULL then return CURDATE(), else return register date. So, for those register date equals to NULL, it will return today, CURDATE() and hence results in return empty result or force the sql statement not be executed, as DATEDIFF(CURDATE(),CURDATE())<>1.

Multiple Buffer Pools in MySQL 5.5

In our work to improve MySQL scalability we tested many opportunities to scale the MySQL Server and the InnoDB storage engine. The InnoDB buffer pool was often one of the hottest contention points in the server. This is very natural since every access to a data page, UNDO page, index page uses the buffer pool and even more so when those pages are read from disk and written to disk.

As usual there were two ways to split the buffer pool mutex, one way is to split it functionally into different mutexes protecting different parts. There have been experiments in particular on splitting out the buffer pool page hash table, the flush list. Other parts that have been broken out in experiments are the LRU list, the free list and other data structures internally in the buffer pool. Additionally it is as usual possible to split the buffer pool into multiple buffer pools. Interestingly one can also combine using multiple buffer pools with splitting the buffer pool mutex into smaller parts. The advantage of using multiple buffer pools is that it is very rare that it is necessary to grab multiple mutexes for the buffer pool operation which quickly becomes the case when splitting the buffer pool into multiple mutex protection areas.

After working on scalability improvements in MySQL and InnoDB I noted that all the discussion was around how to split the buffer pool mutex and no dicsussion centered around how to make multiple buffer pools out of the buffer pool. I decided to investigate how difficult it would be to make this change. I quickly realised that it needed a thorugh walk through of the code. It required a code check that required checking about 150 methods and their interaction. This sounds like a very big task, but fortunately the InnoDB code is well structured and have fairly simple dependencies between its methods. After this walk through of the buffer pool code one quickly found that there were 3 different ways of getting hold of the buffer pool, one method was to calculate it using the space id and page id. This is the normal method in most methods used in the external buffer pool interface. However there were numerous occasions where we only had access to the block or page data structure and it would be a bit useless to recalculate the hash value in every method that needed access to the buffer pool data structure. So it was decided to leave a reference to the buffer pool in every page data structure. There were also a few occasions where one needed to access all buffer pools.

The analysis proved that most of the accesses to the buffer pool was completely independent of other accesses to the buffer pool for other pages. InnoDB uses read-ahead and neighbour writes in the IO operations that are started from the buffer pool. These always operate on an extent of 64 pages. Thus it made sense to map the pages of 64 pages into one buffer pool to avoid having to operate on multiple buffer pools on every IO operation.

With these design ideas there were only a few occasions where it was necessary to operate on all buffer pools. One such operation was when the log required knowledge of the page with the oldest LSN of the buffer pool. Now this operation requires looping over all buffer pools and checking the minimum LSN of each buffer pool instance. This is a fairly rare operation so isn’t a scalability issue.

The other operation with requirement to loop over all pages needed a bit more care, this operation is the background operation flushing buffer pool pages to disk. A couple of problems needs consideration here. First it is necessary to flush pages regularly from all buffer pool instances, secondly it’s still important to flush neighbours. Given that many disks are fairly slow, it can be problematic to spread the load in this manner to many buffer pools. This is an important consideration when deciding how many buffer pool instances to configure.

The default number of buffer pool is one and for most small configurations with less than 8 cores it’s mostly a good idea not to increase this value. If you have an installation that uses 8 cores or more one should also pay attention to the disk subsystem that is used. Given that InnoDB often writes up to 64 neighbours in each operation and that the flushing should happen each second, it makes sense to have a disk subsystem capable of having 500 IO operations per second to use 8 buffer pool instances. This can be set in the innodb_io_capacity configuration variable. One SSD drive should be capable of handling this, two fast hard drives or 3 slow ones.

In our experiments we have mostly used 8 buffer pools, more buffer pools can be useful at times. The main problem with many buffer pools is related to the IO operations. It is important to have a balanced IO load in the MySQL server.

Our analysis of using multiple buffer pool instances have shown some interesting facts. First the accesses to the buffer pools is in no way evenly spread out. This is not surprising given that e.g. the root page of an index is a very frequently accessed page. So using sysbench with only one table, there will obviously be much more accesses to certain buffer pool instances. Our experiments shows that in sysbench using 8 buffer pools, the hottest buffer pool receives about one third of all accesses. Given that sysbench is a worst case scenario for the multiple buffer pool case, this means that most applications that tend to use more tables and more indexes should have a much more even load on the buffer pools.

So how much does multiple buffer pools improve the scalability of the MySQL Server. The answer is as usual dependent on application, OS, HW and so forth. But some general ideas can be found from our experiments. In sysbench using a load which is entirely held in main memory, so the disk is only used for flushing data pages and logging, in this system the multiple buffer pools can provide up to 10% improvement of the throughput in the system. In dbStress, the benchmark Dimitri uses, we have seen all the way up to 30% improvement. The reason here is most likely that dbStress uses more tables and have avoided many other bottlenecks in the MySQL Server and thus the buffer pool was a worse bottleneck in dbStress compared to sysbench. From the code it is also easy to see that the more IO operations the buffer pool performs, the more the buffer pool mutex will be acquired and also often held for a longer time. One such example is the search for a free page on the LRU list every time a read is performed into the buffer pool from the disk.

Furthermore the use of multiple buffer pool opens up for many more improvements and also it doesn’t remove the possibility to split the buffer pool mutex even more.

Another manner of displaying the importance of using multiple buffer pools is the mutex statistics on the buffer pool mutex. With one buffer pool the buffer pool had about 750k accesses per second in a sysbench test where the MySQL Server had access to 16 cores. 50% of those accesses met a mutex already held, so it’s obvious that the InnoDB mutex subsystem is very well aligned with the buffer pool mutex which have very short duration which makes spinning waiting for it very fruitful. Anyways a mutex which is held 50% of the time makes the buffer pool mutex a limiting factor of the MySQL Server. Quite a few threads will often spend time in the queue waiting for the buffer pool mutex. So splitting the buffer pool into 8 instances even in sysbench means that the hottest buffer pool receives about one third of the 750k accesses so should be held about 17% of the time. Our later experiments shows that the hottest buffer pool mutexes are now held up to about 14-15% of the time. So the theory matches the real world fairly well. This means that the buffer pool is still a major factor in the MySQL Scalability equation but is now more on par with the other bottlenecks in the MySQL Server.

The development project of multiple buffer pools happened at a time when the MySQL and InnoDB teams could start working together. I was impressed by the willingness to cooperate and the competence in the InnoDB team that made it possible to introduce multiple buffer pools into MySQL 5.5. Our cooperation has continued since then and this has led to improvements in productivity on both parts. So for you as a MySQL user this spells good times going forward.

Ubuntu Upstart for automatic MySQL start and stop

Here at Recorded Future we use Ubuntu (running on Amazon EC2), but so far we have not explored Ubuntu Upstart that much. During the holidays I made an effort to get acquainted with Upstart and to implement proper MySQL start and stop with it.

If you do not know Upstart, this is the way you start and stop services in Ubuntu, and it serves the same purpose as the old /etc/init.dscripts, but are a bit more structured and powerful. That said, Upstart is regrettably far from complete, although the functionality is much better and Upstart has some cool features, some things do not work that well. For one thing, documentation, where it exists, is useless, at best. Secondly, there is very limited ability to test and develop Upstart scripts. And this is made worse by the fact that the documentation is so bad. Another thing is that Upstart insist on stopping services, by default, by sending a brutal kill signal. Not good for databases, mostly.

In the /etc/init directory are the Upstart scripts you have. In difference to the old init.d scripts, you cannot disable a service in Upstart curenntly. If it is in /etc/init it will be started at system start. That’s it. And this is something that I am sure will be fixed, but for now, again, is something we have to live with. Upstart scripts have the suffix .conf (don’t ask me why), so the default MySQL Upstart script, for example, is called /etc/init/mysql.conf.

In an Upstart script, there are Stanzas that determine what to do. Like the exec Stanza that runs a program for example. And you may then ask, when is it run? Startup? Shutdown? And the answer is startup. For shutting things down, as I said before, Upstart will by default just send a kill -9 signal.

The minimal startup script you can have, and this actually works in a reasonable way, is to just have one line with an exec stanza, like this:
exec /usr/bin/mydaemon
Which will start the daemon. For stopping the daemon, Upstart will send a -9 signal to the started process by default, and nothing more is needed in the Upstart script.

For MySQL, we need to make things a bit more complicated. The default mysql.conf Upstart script really is not good. For one thing, it will not do a controlled shutdown of MySQL (this is possible even if Upstart will eventually send a kill -9 anyway). Secondly, this script assumes that what we use is a standard Ubunty installed MySQL distribution, so if you have installed MySQL in /usr/bin/mysql5147 or somethings like that, you are out of luck.

So what I wanted to create was an Upstart script for MySQL that fullfilled these requirements:

  1. Starts MySQL automatically.
  2. Waits for MySQL to be available before exiting.
  3. Be configurable to support different MySQL install locations, data directories etc.
  4. Do a clean shutdown of MySQL when stopping the MySQL services.

Before I show you what I ended up with, I want to comment on the points 2 and 4 above. With Upstart, you can define a script or command to run just before or after a services has been started or stopped, and this is what I use to wait for MySQL to become available, and to send a SIGTERM to the MySQL Server when stopping (which will do a clean MySQL shutdown).

So here we go, a complete MySQL Upstart script, the way I want it to work:

#
# MySQL Service for Recorded Future
#
description     "MySQL Server"
author          "Anders Karlsson, Recorded Future"

start on (net-device-up
          and local-filesystems
          and runlevel [2345])
stop on runlevel [016]

expect fork
kill timeout 2

# Set variables.
env MYSQL_ETC=/etc/mysql
env MYSQL_PIDFILE=/var/run/mysql.pid
env MYSQL_HOME=/usr/local/mysql5.5
env MYSQL_INSTANCE=my
umask 007

exec $MYSQL_HOME/bin/mysqld_safe --defaults-file=$MYSQL_ETC/$MYSQL_INSTANCE.cnf >> /tmp/x.out &

post-start script
    loop=600
# Wait for MySQL to start.
    while [ $loop -gt 0 ]; do
        if $MYSQL_HOME/bin/mysqladmin --defaults-file=$MYSQL_ETC/$MYSQL_INSTANCE.cnf ping; then
            break
        fi
        loop=$(($loop - 1))
        sleep 1
    done
    exit 0
end script

# Send a soft SIGTERM to MySQL before Upstart will kill it.
# A Sigterm to mysqld will cause a controlled shutdown.
pre-stop script
    exec kill -SIGTERM `cat $MYSQL_PIDFILE`

# Wait for MySQL to end. Flushing buffers and all.
    loop=600
    while [ $loop -gt 0 ]; do
# If the pidfile is found, then continue waiting.
        if [ -e $MYSQL_PIDFILE ] ; then
            loop=$((loop - 1))
            sleep 1
            continue
        fi
        break
    done
end script

To be honest, this is not what I create for all our MySQL servers. Instead I used this to create a chef template, chef is what we use for configuration management here (see http://www.opscode.com/ for more on chef), and here it is put to good ude to generate an Upstart script for MySQL. The above is just an example.

Optimizing Queries on Cluster

On the cluster mailing list (cluster@lists.mysql.com , thread “slow selects”) there was recently a very good example of how to optimize queries on Cluster. Thanks to Nick Keefen for raising this problem and Jeff Sturm for the answer how to solve it!

In short the problem is that the Optimizer in the MySQL server does not get adequate statistics from the data nodes about table sizes, indexes etc. This makes the Optimizer clueless in some cases how to order tables in joins, and also in some cases which is the best indexes to use.

So here is the problem that Nick highlighted:

When the tables are stored in MyISAM:
mysql> SELECT v.vid_id, v.vid_title, u.user FROM phpfox_user AS u JOIN phpfox_videos AS v ON(v.vid_userid = u.id) limit 3;
3 rows in set (0.32 sec)

phpfox_user is about 100000 rows and phpfox_videos is 170000 rows large

Trying to run the same query on my cluster machine, i get
3 rows in set (20.47 sec)

Why is this? Let’s look at the EXPLAIN of the queries:

MyISAM explain:

mysql> explain SELECT v.vid_id, v.vid_title, u.user FROM phpfox_user AS u JOIN phpfox_videos AS v ON(v.vid_userid = u.id) ORDER BY v.vid_time DESC LIMIT 3;
+—-+————-+——-+——–+—————+———
+———+———————+——–+—————-+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+—-+————-+——-+——–+—————+———
+———+———————+——–+—————-+
| 1 | SIMPLE | v | ALL | vid_userid | NULL | NULL | NULL | 135025 | Using filesort |
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | nktest.v.vid_userid | 1 | |
+—-+————-+——-+——–+—————+———
+———+———————+——–+—————-+

NDB explain:

mysql> explain SELECT v.vid_id, v.vid_title, u.user FROM phpfox_user AS u JOIN phpfox_videos AS v ON(v.vid_userid = u.id) ORDER BY v.vid_time DESC LIMIT 3;
+—-+————-+——-+——+—————+———–
-+———+—————–+——-+————————
———+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+—-+————-+——-+——+—————+———–
-+———+—————–+——-+————————
———+
| 1 | SIMPLE | u | ALL | PRIMARY | NULL | NULL | NULL | 82124 | Using temporary; Using filesort |
| 1 | SIMPLE | v | ref | vid_userid | vid_userid | 4 | bb2_phpfox.u.id | 1 | |
+—-+————-+——-+——+—————+———–
-+———+—————–+——-+————————
———+

Note the additional ‘Using temporary’ clause for NDB. Nearly all of the 20 seconds taken by the query fall at ‘copying to temporary table’.

The problem is that the Optimizer has changed the order (in case of NDB) how tables should be joined!

Jeff Sturm replied to Nick with a remedy:

Looks like the query optimizer is putting the tables in the wrong order.
(This isn’t uncommon in my experience, as NDB seems to have less
information than other engines to optimize queries at runtime.)

Try modifying your query with STRAIGHT_JOIN syntax, i.e.:

SELECT v.vid_id, v.vid_title, u.user FROM phpfox_videos AS v
STRAIGHT_JOIN phpfox_user AS u ON v.vid_userid = u.id ORDER BY
v.vid_time DESC LIMIT 3;

Features added to MySQL 5.5

Improved scalability on multi-core CPUs. The trend in hardware development now is toward more cores rather than continued increases in CPU clock speeds, which renders “wait until CPUs get faster” a nonviable means of improving database performance. Instead, it is necessary to make better use of multiple cores to maximally exploit the processing cycles they make available. MySQL 5.5 takes advantage of features of SMP systems and tries to eliminate bottlenecks in MySQL architecture that hinder full use of multiple cores. The focus has been on InnoDB, especially locking and memory management.

InnoDB I/O subsystem changes enable more effective use of available I/O capacity.

Several modifications improve operation of MySQL Server on Solaris.

There is better access to execution and performance information. Diagnostic improvements include DTrace probes, expanded SHOW ENGINE INNODB STATUS output, and a new status variable.

Support for an interface for semisynchronous replication: A commit performed on the master side blocks before returning to the session that performed the transaction until at least one slave acknowledges that it has received and logged the events for the transaction. Semisynchronous replication is implemented through an optional plugin component. See Section 16.3.8, “Semisynchronous Replication”

Support for the SQL standard SIGNAL and RESIGNAL statements. See Section 12.7.8, “SIGNAL and RESIGNAL”.

Support for Performance Schema, a feature for monitoring MySQL Server execution at a low level. See Chapter 20, MySQL Performance Schema.

Support for additional Unicode character sets: utf16, utf32, and utf8mb4. These character sets support supplementary Unicode characters; that is, characters outside the Basic Multilingual Plane (BMP). See Section 9.1.10, “Unicode Support”.

Enhancements to XML functionality, including a new LOAD XML statement.

Two new types of user-defined partitioning are supported: RANGE COLUMNS partitioning is an extension to RANGE partitioning; LIST COLUMNS partitioning is an extension to LIST partitioning. Each of these extensions provides two enhancements to MySQL partitioning capabilities:

It is possible to define partitioning ranges or lists based on DATE, DATETIME, or string values (such as CHAR or VARCHAR).

You can also define ranges or lists based on multiple column values when partitioning tables by RANGE COLUMNS or LIST COLUMNS, respectively. Such a range or list may refer to up to 16 columns.

For tables defined using these partitioning types, partition pruning can now optimize queries with WHERE conditions that use multiple comparisons between (different) column values and constants, such as a = 10 AND b > 5 or a < “2005-11-25″ AND b = 10 AND c = 50.

For more information, see Section 17.2.1, “RANGE Partitioning”, and Section 17.2.2, “LIST Partitioning”.

It is now possible to delete all rows from one or more partitions of a partitioned table using the ALTER TABLE … TRUNCATE PARTITION statement. Executing the statement deletes rows without affecting the structure of the table. The partitions named in the TRUNCATE PARTITION clause do not have to be contiguous.

Key caches are now supported for indexes on partitioned MyISAM tables, using the CACHE INDEX and LOAD INDEX INTO CACHE statements. In addition, a key cache can be defined for and loaded with indexes from an entire partitioned table, or for one or more partitions. In the latter case, the partitions are not required to be contiguous.

The TO_SECONDS() function is added. This function converts a date or datetime expression to a number of seconds since the year 0. You may use this function in partitioning expressions, and partition pruning is supported for tables defined using such expressions.

The following constructs are obsolete and have been removed in MySQL 5.5. Where alternatives are shown, applications should be updated to use them.

The log_bin_trust_routine_creators system variable (use log_bin_trust_function_creators).

The myisam_max_extra_sort_file_size system variable.

The record_buffer system variable (use read_buffer_size).

The sql_log_update system variable.

The table_type system variable (use storage_engine).

The FRAC_SECOND modifier for the TIMESTAMPADD() function.

The TYPE table option to specify the storage engine for CREATE TABLE or ALTER TABLE (use ENGINE).

The SHOW TABLE TYPES SQL statement (use SHOW ENGINES).

The SHOW INNODB STATUS and SHOW MUTEX STATUS SQL statements (use SHOW ENGINE INNODB STATUS SHOW ENGINE INNODB MUTEX).

The SHOW PLUGIN SQL statement (use SHOW PLUGINS).

The LOAD TABLE … FROM MASTER and LOAD DATA FROM MASTER SQL statements (use mysqldump or mysqlhotcopy to dump tables and mysql to reload dump files).

The BACKUP TABLE and RESTORE TABLE SQL statements (use mysqldump or mysqlhotcopy to dump tables and mysql to reload dump files).

TIMESTAMP(N) data type: The ability to specify a display width of N (use without N).

The –default-character-set and –default-collation server options (use –character-set-server and –collation-server).

The –delay-key-write-for-all-tables server option (use –delay-key-write=ALL).

The –enable-locking and –skip-locking server options (use –external-locking and –skip-external-locking).

The –log-bin-trust-routine-creators server option (use –log-bin-trust-function-creators).

The –log-long-format server option.

The –log-update server option.

The –master-xxx server options to set replication parameters (use the CHANGE MASTER TO statement instead): –master-host, –master-user, –master-password, –master-port, –master-connect-retry, –master-ssl, –master-ssl-ca, –master-ssl-capath, –master-ssl-cert, –master-ssl-cipher, –master-ssl-key.

The –safe-show-database server option.

The –skip-symlink and –use-symbolic-links server options (use –skip-symbolic-links and –symbolic-links).

The –sql-bin-update-same server option.

The –warnings server option (use –log-warnings).

The –no-named-commands option for mysql (use –skip-named-commands

The –no-pager option for mysql (use –skip-pager).

The –no-tee option for mysql (use –skip-tee).

The –position option for mysqlbinlog (use –start-position).

The –all option for mysqldump (use –create-options).

The –first-slave option for mysqldump (use –lock-all-tables).

The –config-file option for mysqld_multi (use –defaults-extra-file).

The –set-variable=var_name=value and -O var_name=value general-purpose options for setting program variables (use –var_name=value).

Efficient Use of concat_ws in Mysql query

Try out the following queries with concat_ws function: many a times we need to display Concatenated values for First name and Last name OR concatenated values for Addresses; generally we do this in PHP but that can be done in MySQL query as well

Syntax: concat_ws(‘separator’, ‘string 1’, ‘string 2’, ‘string 3’, ‘string N’, ……)

1) SELECT concat_ws(‘ ‘, ‘Firstname’, ‘Lastname’)

2) SELECT concat_ws(‘<br>’, ‘Address Line 1′, ‘Address Line 2′, ‘City’, ‘State’, ‘Zipcode’, ‘Country’)

3) SELECT concat_ws(‘<br>’, ‘Address Line 1′, NULL, ‘City’, ‘State’, ‘Zipcode’, ‘Country’)

Note: in the above examples, instead of the string values you can use the table fields.

IMPORTANT POINT to be noted in Query #3 is that if any of the values in the list is having NULL value then this function simply ignores that value and returns the concatenated values of the remaining values with the separator specified; in the above example the separator is a string ‘<br>’ it can be any string.

Where and how to use:

SELECT concat_ws(‘ ‘, u.firstname, u.lastname) as fullname, concat_ws(‘<br>’, u.address1, u.address2, u.city_name, s.state_name, u.zipcode, c.country_name) as fulladdress

FROM user_master u

LEFT JOIN state_master s ON s. state_id =u. state_id

LEFT JOIN country_master c ON c. countries_id =u.countries_id

WHERE user_id = ‘123’

Isn’t it helpful???

Why to use MySQL as database?

1. High Performance
A unique storage-engine architecture allows database professionals to configure the MySQL database server specifically for particular applications, with the end result being amazing performance results. Whether the intended application is a high-speed transactional processing system or a high-volume web site that services a billion queries a day, MySQL can meet the most demanding performance expectations of any system. With high-speed load utilities, distinctive memory caches, full text indexes, and other performance-enhancing mechanisms, MySQL offers all the right ammunition for today’s critical business systems.

2. Scalability and Flexibility
The MySQL database server provides the ultimate in scalability, sporting the capacity to handle deeply embedded applications with a footprint of only 1MB to running massive data warehouses holding terabytes of information. Platform flexibility is a stalwart feature of MySQL with all flavors of Linux, UNIX, and Windows being supported. And, of course, the open source nature of MySQL allows complete customization for those wanting to add unique requirements to the database server.

3. Robust Transactional Support
MySQL offers one of the most powerful transactional database engines on the market. Features include complete ACID (atomic, consistent, isolated, durable) transaction support, unlimited row-level locking, distributed transaction capability, and multi-version transaction support where readers never block writers and vice-versa. Full data integrity is also assured through server-enforced referential integrity, specialized transaction isolation levels, and instant deadlock detection.

4. High Availability
Rock-solid reliability and constant availability are hallmarks of MySQL, with customers relying on MySQL to guarantee around-the-clock uptime. MySQL offers a variety of high-availability options from high-speed master/slave replication configurations, to specialized Cluster servers offering instant failover, to third party vendors offering unique high-availability solutions for the MySQL database server.

5. Web and Data Warehouse Strengths
MySQL is the de-facto standard for high-traffic web sites because of its high-performance query engine, tremendously fast data insert capability, and strong support for specialized web functions like fast full text searches. These same strengths also apply to data warehousing environments where MySQL scales up into the terabyte range for either single servers or scale-out architectures. Other features like main memory tables, B-tree and hash indexes, and compressed archive tables that reduce storage requirements by up to eighty-percent make MySQL a strong standout for both web and business intelligence applications.

6. Strong Data Protection
Because guarding the data assets of corporations is the number one job of database professionals, MySQL offers exceptional security features that ensure absolute data protection. In terms of database authentication, MySQL provides powerful mechanisms for ensuring only authorized users have entry to the database server, with the ability to block users down to the client machine level being possible. SSH and SSL support are also provided to ensure safe and secure connections. A granular object privilege framework is present so that users only see the data they should, and powerful data encryption and decryption functions ensure that sensitive data is protected from unauthorized viewing. Finally, backup and recovery utilities provided through MySQL and third party software vendors allow for complete logical and physical backup as well as full and point-in-time recovery.

7. Comprehensive Application Development
One of the reasons MySQL is the world’s most popular open source database is that it provides comprehensive support for every application development need. Within the database, support can be found for stored procedures, triggers, functions, views, cursors, ANSI-standard SQL, and more. For embedded applications, plug-in libraries are available to embed MySQL database support into nearly any application. MySQL also provides connectors and drivers (ODBC, JDBC, etc.) that allow all forms of applications to make use of MySQL as a preferred data management server. It doesn’t matter if it’s PHP, Perl, Java, Visual Basic, or .NET, MySQL offers application developers everything they need to be successful in building database-driven information systems.

8. Management Ease
MySQL offers exceptional quick-start capability with the average time from software download to installation completion being less than fifteen minutes. This rule holds true whether the platform is Microsoft Windows, Linux, Macintosh, or UNIX. Once installed, self-management features like automatic space expansion, auto-restart, and dynamic configuration changes take much of the burden off already overworked database administrators. MySQL also provides a complete suite of graphical management and migration tools that allow a DBA to manage, troubleshoot, and control the operation of many MySQL servers from a single workstation. Many third party software vendor tools are also available for MySQL that handle tasks ranging from data design and ETL, to complete database administration, job management, and performance monitoring.

9. Lowest Total Cost of Ownership
By migrating current database-drive applications to MySQL, or using MySQL for new development projects, corporations are realizing cost savings that many times stretch into seven figures. Accomplished through the use of the MySQL database server and scale-out architectures that utilize low-cost commodity hardware, corporations are finding that they can achieve amazing levels of scalability and performance, all at a cost that is far less than those offered by proprietary and scale-up software vendors. In addition, the reliability and easy maintainability of MySQL means that database administrators don’t waste time troubleshooting performance or downtime issues, but instead can concentrate on making a positive impact on higher level tasks that involve the business side of data.

10.Easy support.

You can get easy support from thousands of mysql administrators worldwide. Sometimes for free and sometimes for very negligible charges. You can contact me anytime for mysql support at nilesh.pawar@kaizenz.com or visit at http://www.kaizenz.com

Introducing the MySQL Librarian

The MySQL Librarian is a collection of community-generated and cross referenced content related to MySQL. It’s a place where the community, collaboratively, builds and maintains MySQL content.

The idea started two years ago. During the MySQL conference, the blog posts were going fast and furiously down the screen. There were so many blog posts on Planet MySQL that their average life was 1 hour or less. Some of them went to the second page without having enjoyed a single minute of top page exposure. And then there were the presentation files, hidden or forgotten on some site, and never to be seen again, especially when you need them months later. After the conference, between travel, catching up with email, and whatever happens next, you lose track of most the interesting content that was generated by some of the best community members in the ecosystem. And the same happens when you go on vacation for two weeks. When you come back, catching up with the good stuff is hard. You should read hundred of less than interesting posts to get the important ones.
Improving the search on Planet MySQL helps a lot. But it won’t let you easily find an article that was published in a blog that is not aggregated, because the author writes about MySQL only once in a while. And the planet won’t let you find the presentations about interesting stuff, unless the accompanying text and tags reflect your search query.

The Librarian changes it all. The good content from conferences, stray presentations, videos, articles, can all be referenced in one place.
We started planning this tool in November 2008. Its implementation required a radical change of the Planet MySQL code, with months of thankless work to refactoring the existing features with a more flexible infrastructure. The small visible changes that appeared on Planet MySQL from February to June 2009 are the tip of the iceberg of a huge code change (kudos to Dups, who had the vision and the perseverance to tackle the task). If you were wondering why planetmysql.org started redirecting to planet.mysql.com, here’s the reason. We needed a single login, which could only be achieved by taking the planet under the main domain.

Click here to view more information on this.

Backend process for create table in MyIsam by MySQL Expert

When you say:

CREATE TABLE Table1 ...

MySQL creates files named Table1.MYD (“MySQL Data”), Table1.MYI (“MySQL Index”), and Table1.frm (“Format”). These files will be in the directory:

///

For example, if you use Linux, you might find the files in the /usr/local/var/test directory (assuming your database name is test). if you use Windows, you might find the files in the \mysql\data\test\ directory.

Let’s look at the .MYD Data file (MyISAM SQL Data file) more closely. There are three possible formats — fixed, dynamic, and packed. First, let’s discuss the fixed format.

  • Page SizeUnlike most DBMSs, MySQL doesn’t store on disk using pages. Therefore you will not see filler space between rows. (Reminder: This does not refer to BDB and InnoDB tables, which do use pages).
  • Record HeaderThe minimal record header is a set of flags:
    • “X bit” = 0 if row is deleted, = 1 if row is not deleted
    • “Null Bits” = 0 if column is not NULL, = 1 if column is NULL
    • “Filler Bits” = 1

The length of the record header is thus:

(1 + number of NULL columns + 7) / 8 bytes

After the header, all columns are stored in the order that they were created, which is the same order that you would get from SHOW COLUMNS.

Here’s an example. Suppose you say:

CREATE TABLE Table1 (column1 CHAR(1), column2 CHAR(1), column3 CHAR(1));
INSERT INTO Table1 VALUES ('a', 'b', 'c');
INSERT INTO Table1 VALUES ('d', NULL, 'e');

A CHAR(1) column takes precisely one byte (plus one bit of overhead that is assigned to every column — I’ll describe the details of column storage later). So the file Table1.MYD looks like this:

Hexadecimal Display of Table1.MYD file

F1 61 62 63 00 F5 64 00 66 00              ... .abc..d e.

Here’s how to read this hexadecimal-dump display:

  • The hexadecimal numbers F1 61 62 63 00 F5 64 20 66 00 are byte values and the column on the right is an attempt to show the same bytes in ASCII.
  • The F1 byte means that there are no null fields in the first row.
  • The F5 byte means that the second column of the second row is NULL.

(It’s probably easier to understand the flag setting if you restate F5 as 11110101 binary, and (a) notice that the third flag bit from the right is on, and (b) remember that the first flag bit is the X bit.)

There are complications — the record header is more complex if there are variable-length fields — but the simple display shown in the example is exactly what you’d see if you looked at the MySQL Data file with a debugger or a hexadecimal file dumper.

So much for the fixed format. Now, let’s discuss the dynamic format.

The dynamic file format is necessary if rows can vary in size. That will be the case if there are BLOB columns, or “true” VARCHAR columns. (Remember that MySQL may treat VARCHAR columns as if they’re CHAR columns, in which case the fixed format is used.) A dynamic row has more fields in the header. The important ones are “the actual length”, “the unused length”, and “the overflow pointer”. The actual length is the total number of bytes in all the columns. The unused length is the total number of bytes between one physical record and the next one. The overflow pointer is the location of the rest of the record if there are multiple parts.

For example, here is a dynamic row:

03, 00             start of header
04                 actual length
0c                 unused length
01, fc             flags + overflow pointer
****               data in the row
************       unused bytes
                  <-- next row starts here)

In the example, the actual length and the unused length are short (one byte each) because the table definition says that the columns are short — if the columns were potentially large, then the actual length and the unused length could be two bytes each, three bytes each, and so on. In this case, actual length plus unused length is 10 hexadecimal (sixteen decimal), which is a minimum.

As for the third format — packed — we will only say briefly that:

  • Numeric values are stored in a form that depends on the range (start/end values) for the data type.
  • All columns are packed using either Huffman or enum coding.

For details, see the source files /myisam/mi_statrec.c (for fixed format), /myisam/mi_dynrec.c (for dynamic format), and /myisam/mi_packrec.c (for packed format).

Note: Internally, MySQL uses a format much like the fixed format which it uses for disk storage. The main differences are:

  1. BLOB values have a length and a memory pointer rather than being stored inline.
  2. “True VARCHAR” (a column storage which will be fully implemented in version 5.0) will have a 16-bit length plus the data.
  3. All integer or floating-point numbers are stored with the low byte first. Point (3) does not apply for ISAM storage or internals.

MySQL Storage Engines by MySQL Expert

Introduction

Data in MySQL is stored in files (or memory) using a variety of different techniques. Each of these techniques employs different storage mechanisms, indexing facilities, locking levels and ultimately provides a range of different functions and capabilities. By choosing a different technique you can gain additional speed or functionality benefits that will improve the overall functionality of your application.

For example, if you work with a large amount of temporary data, you may want to make use of the MEMORY storage engine, which stores all of the table data in memory. Alternatively, you may want a database that supports transactions (to ensure data resilience).

Each of these different techniques and suites of functionality within the MySQL system is referred to as a storage engine (also known as a table type). By default, MySQL comes with a number of different storage engines pre-configured and enabled in the MySQL server. You can select the storage engine to use on a server, database and even table basis, providing you with the maximum amount of flexibility when it comes to choosing how your information is stored, how it is indexed and what combination of performance and functionality you want to use with your data.

This flexibility to choose how your data is stored and indexed is a major reason why MySQL is so popular; other database systems, including most of the commercial options, support only a single type of database storage. Unfortunately the ‘one size fits all approach’ in these other solutions means that either you sacrifice performance for functionality, or have to spend hours or even days finely tuning your database. With MySQL, we can just change the engine we are using.

In this article, we’re not going to concentrate on the technical aspects of the different storage engines (although we will inevitably have to look at some of these elements); instead we will concentrate on how and where these different engines can be best employed. To achieve this, we’ll have to look at some of the fundamental issues before moving on to the specifics of each engine type.

Determining Available Engines

You can determine a list of engines by using the show engines command within MySQL (assuming a MySQL server version later than 4.1.2).

Code:
mysql> show engines; +------------+---------+------------------------------------------------------------+ | Engine | Support | Comment | +------------+---------+------------------------------------------------------------+ | MyISAM | DEFAULT | Default engine as of MySQL 3.23 with great performance | | HEAP | YES | Alias for MEMORY | | MEMORY | YES | Hash based, stored in memory, useful for temporary tables | | MERGE | YES | Collection of identical MyISAM tables | | MRG_MYISAM | YES | Alias for MERGE | | ISAM | NO | Obsolete storage engine, now replaced by MyISAM | | MRG_ISAM | NO | Obsolete storage engine, now replaced by MERGE | | InnoDB | YES | Supports transactions, row-level locking, and foreign keys | | INNOBASE | YES | Alias for INNODB | | BDB | NO | Supports transactions and page-level locking | | BERKELEYDB | NO | Alias for BDB | | NDBCLUSTER | NO | Clustered, fault-tolerant, memory-based tables | | NDB | NO | Alias for NDBCLUSTER | | EXAMPLE | NO | Example storage engine | | ARCHIVE | NO | Archive storage engine | | CSV | NO | CSV storage engine | +------------+---------+------------------------------------------------------------+ 16 rows in set (0.01 sec)

The listing shows the full list of available database engines, and whether the support is available in the current database server.

For versions of MySQL earlier than 4.1.2, use

Code:
mysql> show variables like "have_%"; +------------------+----------+ | Variable_name | Value | +------------------+----------+ | have_bdb | YES | | have_crypt | YES | | have_innodb | DISABLED | | have_isam | YES | | have_raid | YES | | have_symlink | YES | | have_openssl | YES | | have_query_cache | YES | +------------------+----------+ 8 rows in set (0.01 sec)

You can configure the available engines within a MySQL installation by changing the options to the configure script. If you are using a pre-packaged MySQL binary distribution then most of the commonly used engines are included. Note, however, that if you want access to some of the more unusual types (particularly the CSV, ARCHIVE and BLACKHOLE engines) then you may want to build your MySQL database by hand.

Using an Engine

There are a number of ways you can specify the storage engine to use. The simplest method, if you have a preference for a engine type that fits most of your database needs to set the default engine type within the MySQL configuration file (using the option storage_engine or when starting the database server by supplying the –default-storage-engine or –default-table-type options on the command line).

More flexibility is offered by allowing you specify the storage engine to be used MySQL, the most obvious is to specify the engine type when creating the table:

Code: SQL
CREATE TABLE mytable (id int, title char(20)) ENGINE = INNODB

You can also alter the storage engine used for an existing table:

Code: SQL
ALTER TABLE mytable ENGINE = MyISAM

However, you should be careful when altering table types in this way as making a modification to a type that does not support the same indexes, field types or sizes may mean that you lose data. If you specify a storage engine that doesn’t exist in the current database then a table of type MyISAM (the default) is created instead.

Differentiating the Engines

In order to make a decision about which engine to choose, we first need to think about the different core functionality provided in each engine that allows us to differentiate between them. We can generally divide up the core functionality into four areas; the supported field and data types, locking types, indexing and transactions. Some engines have unique functionality that can also drive your decision, we’ll be taking a closer look at these specifics in a moment.

Field and Data Types

Although all of the engines support the common data types, i.e., integers, reals and character based storage, not all engines support other field types, particularly the BLOB (binary large object) or TEXT types. Other engines may support only limited character widths and data sizes.

These limitations, while directly affecting the information you store may also have a related effect to the types of searches you perform, or the indexes you create on that information. In turn, these differences can affect the performance and functionality of your application as you may have to make decisions about functionality based on the storage engine choice you make for the type of data you are storing.

Locking Types

Locking within database engines defines how access and updates to information are controlled. When an object in the database is locked for updating, other processes cannot modify (or in some cases read) the data until the update has completed.

Locking not only affects how many different applications can update the information in the database, it can also affect queries on that data. The reason for this is that the queries may be accessing data that may be being altered or updated. In general, such delays are minimal. The bulk of the locking mechanism is devoted to preventing multiple processes updating the same data. Since both additions (INSERT statements) and alterations (UPDATE statements) to the data require locking, you can imagine that multiple applications using the same database can have a significant impact.

Locks are supported by different storage engines at different object levels, and these levels affect the concurrency of access to the information. Three different levels are supported, table locking, block locking and row locking. Table locking is most commonly supported and is the locking provided in MyISAM. It locks an entire table during an update. This will limit the number of applications that are updating a specific table to just one, and this can affect heavily used multi-user databases because it introduces delays into the update process.

Page level locking is used by the Berkeley DB storage engine and locks data according to the page (8Kb) of information that is being uploaded. When performing updates across a range of locations within the database, the locking is not a problem, but because adding rows involves locking the final 8Kb of the data structure, adding large numbers of rows, particularly of small data, can be a problem.

Row level locking provides the best concurrency; only individual rows within a table are locked, which means that many applications can be updating different rows of the same table without causing a lock situation. Only the InnoDB storage engine supports row level locking.

Indexing

Indexing can dramatically increase the performance when searching and recovering data from the database. Different storage engines provide different indexing techniques and some may be better suited for the type of data you are storing.

Some storage engines simply do not support indexing at all either because they use the indexing of the underlying tables (in the MERGE engine for example) or because the data storage method does not allow indexing (FEDERATED or BLACKHOLE engines).

Transactions

Transactions provide data reliability during the update or insert of information by enabling you to add data to the database, but only to commit that data when other conditions and stages in the application execution have completed successfully. For example, when transferring information from one account to another you would use transactions to ensure that both the debit from one account and the credit to the other completed successfully. If either process failed, you could cancel the transaction and the changes would be lost. If the process completed,then we would confirm it by committing the changes.

Storage Engines

MyISAM

The MyISAM engine is the default engine in most MySQL installations and is a derivative of the original ISAM engine type supported in the early versions of the MySQL system. The engine provides the best combination of performance and functionality, although it lacks transaction capabilities (use the InnoDB or BDB engines) and uses table-level locking.

Unless you need transactions, there are few databases and applications that cannot effectively be stored using the MyISAM engine. However, very high-performance applications where there are large numbers of data inserts/updates compared to the number of reads can cause performance problems for the MyISAM engine. It was originally designed with the idea that more than 90% of the database access to a MyISAM table would be reads, rather than writes.

With table-level locking, a database with a high number of row inserts or updates becomes a performance bottleneck as the table is locked while data is added. Luckily this limitation also works well within the restrictions of a non-transaction database.

MyISAM Summary
Name MyISAM
Introduced v3.23
Default install Yes
Data limitations None
Index limitations 64 indexes per table (32 pre 4.1.2); Max 16 columns per index
Transaction No
Locking level Table

MERGE

The MERGE engine type allows you to combine a number of identical tables into a single table. You can then execute queries that return the results from multiple tables as if they were just one table. Each table merged must have the same table definition.

The MERGE table is particularly effective if you are logging data directly or indirectly into a MySQL database and create an individual table per day, week or month and want to be able to produce aggregate queries from multiple tables. There are limitations to this however, you can only merge MyISAM tables and the identical table definition restriction is strictly enforced. Although this seems like a major issue, if you had used one of the other table types (for example InnoDB) then the merge probably wouldn’t be required.

MERGE Summary
Name MERGE
Introduced v3.23.25
Default install Yes
Data limitations Underlying tables must be MyISAM
Index limitations N/A
Transaction No
Locking level Table

MEMORY

The MEMORY storage engine (previously known as the HEAP storage engine) stores all data in memory; once the MySQL server has been shut down any information stored in a MEMORY database will have been lost. However, the format of the individual tables is kept and this enables you to create temporary tables that can be used to store information for quick access without having to recreate the tables each time the database server is started.

Long term use of the MEMORY storage engine is not generally a good idea, because the data could so easily be lost. However, providing you have the RAM to support the databases you are working on, use of MEMORY based tables is an efficient way of running complex queries on large data sets and benefiting from the performance gains.

The best way to use MEMORY tables is to use a SELECT statement to select a larger data set from your original, disk-based, tables and then sub-analyse that information for the specific elements you want. I’ve used this technique in the past to extract a month worth of web log data, actually from tables using the ARCHIVE storage engine, and then run the queries on specific URLs, sites and other focus points.

MEMORY Summary
Name MEMORY (HEAP, deprecated)
Introduced 1.0 (only known as MEMORY since 4.1)
Default install Yes
Data limitations BLOB and TEXT types not supported
Index limitations None
Transaction No
Locking level Table

EXAMPLE

The EXAMPLE engine is actually a programming example of a storage engine that can be used as the basis for other engines within the MySQL system. It does not support data inserts and isn’t a practical engine for any form of database access. It is, however, a good guide to how to develop your own storage engine, and is therefore an effective guide for programmers.

EXAMPLE Summary
Name EXAMPLE
Introduced v4.1.3
Default install No
Data limitations N/A
Index limitations N/A
Transaction N/A
Locking level N/A

FEDERATED

The FEDERATED storage engine (added in MySQL 5.03) enables you to access data from remote MySQL database (other databases may be supported in the future) as if it were a local database. In effect, the MySQL server acts as a proxy to the remote server, using the MySQL client access library to connect to the remote host, execute queries and then reformat the data into the localized format.

In essence, it is a way for a server, rather than a client, to access a remote database and can be an effective way of combining data from multiple hosts or of copying specific data from remote databases into local tables without the use of data exports and imports.

FEDERATED Summary
Name FEDERATED
Introduced v5.0
Default install No
Data limitations Limited by remote database
Index limitations N/A
Transaction No
Locking level No

ARCHIVE

The ARCHIVE storage engine supports only the INSERT and SELECT statements, but does support most of the MySQL field types. Information stored in an ARCHIVE storage engine table is compressed and cannot be modified and so ARCHIVE tables are perfect for storing log data (which you don’t want to be able to change) or information that is no longer in active use (for example, old invoicing or sales data).

While the information is stored very efficient, care should be taken when accessing data stored in the ARCHIVE tables. Because the information is compressed, selects have to read the entire table, and that also means decompressing the information. This can obviously increase the time taken to perform complex searches and retrievals. If you are performing a large number of queries on the information in these tables it may be easier to temporarily copy your data to another, uncompressed, data type such as MyISAM.

ARCHIVE Summary
Name ARCHIVE
Introduced v4.1.3
Default install No
Data limitations Data can only be inserted (no updates)
Index limitations N/A
Transaction No
Locking level N/A

CSV

The CSV storage engine stores data not in a binary format, but in the form a CSV (Command Separated Values) file. Because of this, there are limitations to the data stored. It is not an efficient method for storing large volumes of data, or larger data types like BLOB, although such types are supported. There is also no indexing. However, because the data is stored in the CSV format it is exceedingly portable; these CSV files generated can easily be imported into many different software packages, including Excel, OpenOffice and database systems like Access or FileMaker.

In general, the CSV engine is impractical as a general database engine. It is, however, probably the most effective and easiest method for data exchange. What makes it so convenient is that we can use SELECT and INSERT statements to create the database, which in turn means that we can easily produce CSV files based on queries of other data.

With some careful work, the CSV storage engine can also be used as an effective way of getting information into MySQL. Here, you can create the tables first, shutdown the MySQL server, copy over CSV files that you have exported from Excel, Access or another database, and you can then import the data and copy it over to MyISAM or InnoDB tables.

CSV Summary
Name CSV
Introduced v4.1.4
Default install No
Data limitations None
Index limitations Indexing is not supported
Transaction No
Locking level Table

BLACKHOLE

Strange though it may seem, the BLACKHOLE engine does not actually store any data. Although you can create tables and indexes, all SQL statements that would add or update information to the database are executed without actually writing any data. The database structure is retained, however, and you can create any indexes on the (non-existent) information that you want.

Although this seems like a futile exercise, it does allow you to test out database structures and play with table definitions without actually creating any data. Even more useful, however, is that SQL statements on BLACKHOLE databases are written to the binary log, and therefore are replicated to slave databases.

You can use this functionality to update one or more slaves directly without writing any local data. There are a number of potential uses for this functionality. One such use I have employed in past is to write log data to a BLACKHOLE table, which is then echoed to two slaves. Because the write is instantaneous (there are no local disk files or indexes to update), I can maintain a high logging rate, and rely on the binary logging and slave replication to distribute the data. An extension of this, as suggested in the MySQL manual, is to use filtering to control the distribution of records to the slaves.

BLACKHOLE Summary
Name BLACKHOLE
Introduced 4.1.11
Default install No
Data limitations No data is stored, but statements are written to the binary log (and therefore distributed to slave databases)
Index limitations N/A
Transaction No
Locking level N/A

ISAM

The ISAM storage engine was the original engine type available with versions of MySQL up until MySQL 3.23, when the MyISAM storage engine was introduced. ISAM has a number of different limitations that make it impractical as a database engine. These include the storage format, which is native to the platform (and therefore not portable between systems), a maximum table size of just 4GB and limited text searching facilities. Indexes are also more limited. Since MyISAM is supported on the same platforms as ISAM, and provides better compatibility, portability and performance.

ISAM is included for backwards compatibility, you certainly shouldn’t use ISAM for new databases, use MyISAM instead.

ISAM Summary
Name ISAM
Introduced v1.0
Default install Yes
Data limitations Limited maximum database size (4GB)
Index limitations Maximum 16 indexes per table, 16 parts per key
Transaction No
Locking level Table

Berkeley DB (BDB)

The Berkeley DB (or BDB) engine is based on the technology provided by the Berkeley DB storage system developed by SleepyCat software. BDB is a hash based storage mechanism, and the keys to the hash values are stored very efficiently. This makes the recovery of information–especially when accessed directly using a unique key incredibly quick, and by far the quickest of the available database types. Recovering full records is even quicker if you the data is short enough to be stored with the unique key (i.e., under 1024 bytes long). BDB is also one of only two types of storage engine that support transactions.

BDB is, however, limited in other ways. Although it uses page locking, locking only 8192 bytes of a table, rather than the entire table, during an update this can cause problems if you are performing a large number of updates in the same page (for example, inserting many rows). There is unfortunately no way round this. Sequential data access–for example a large quantity of rows matching non-indexed data–can be a lot slower because the data needs to be scanned row by row.

Recovery of information with BDB tables can also be a problem. Data in BDB is stored in a combination of the key index, the data file and binary data logs. A loss of data in any of these sections, even just one of the data logs, can make the data in the database totally unrecoverable.

Where BDB shines therefore is in locations where you can access specific blocks of data by a unique key that does not frequently change. I’ve successfully used BDB tables in the past to store look up information for data like categories or option lists where the small size and unique key structure make it quick and easy to recover information that is not often changed from its initial definition.

Berkeley DB (BDB) Summary
Name BDB
Introduced v3.23.34a
Default install No
Data limitations None
Index limitations Max 31 indexes per table, 16 columns per index;max key size 1024 bytes
Transaction Yes
Locking level Page (8192 bytes)

InnoDB

The InnoDB Engine is provided by Innobase Oy and supports all of the database functionality (and more) of MyISAM engine and also adds full transaction capabilities (with full ACID (Atomicity, Consistency, Isolation, and Durability) compliance) and row level locking of data.

The key to the InnoDB system is a database, caching and indexing structure where both indexes and data are cached in memory as well as being stored on disk. This enables very fast recovery, and works even on very large data sets. By supporting row level locking, you can add data to an InnoDB table without the engine locking the table with each insert and this speeds up both the recovery and storage of information in the database.

As with MyISAM, there are few data types that cannot effectively be stored in an InnoDB database. In fact, there are no significant reasons why you shouldn’t always use an InnoDB database. The management overhead for InnoDB is slightly more onerous, and getting the optimization right for the sizes of in-memory and on disk caches and database files can be complex at first. However, it also means that you get more flexibility over these values and once set, the performance benefits can easily outweigh the initial time spent. Alternatively, you can let MySQL manage this automatically for you.

If you are willing (and able) to configure the InnoDB settings for your server, then I would recommend that you spend the time to optimize your server configuration and then use the InnoDB engine as the default.

InnoDB Summary
Name InnoDB
Introduced v3.23 (source only), v4.0 (source and binary)
Default install No
Data limitations None
Index limitations None
Transaction support Yes (ACID compliant)
Locking level Row

Summary

As you may have been able to conclude from the above summary of the different storage engines available, there are few reasons not to use either the MyISAM or InnoDB engine types. MyISAM will do in most situations, but if you have a high number of updates or inserts compared to your searches and selects then you will get better performance out of the InnoDB engine. To get the best performance out of InnoDB you need to tweak the parameters for your server, otherwise there is no reason not to use it.

But if both MyISAM and InnoDB are so great, why even consider using the other engine types! Simply because they provide specific functionality that is not otherwise available. The MERGE engine is an exceedingly effective way of querying data from multiple, identically defined, tables. The MEMORY engine is the best way to perform a large number of complex queries on data that would be inefficient to search on a disk based engine. The CSV engine is a great way to export data that could be used in other applications. BDB is excellent for data that has a unique key that is frequently accessed.

Some of these are possible to do with InnoDB (our separate MyISAM logs, for example, could be combined into a single InnoDB table), but the flexibility to choose an engine type that suits you and the data you are working with is what differentiates MySQL from other solutions.