Sunday, February 11, 2018

Fun with Bugs #61 - On MySQL Bug Reports I am Subscribed to, Part III

Since my previous post on this topic I've subscribed to 19 more MySQL bugs, so it's time for yet another review of these reports. I am trying to pick up important, funny or hard to process reports every day, and here is the list of the most interesting ones starting from the latest:
  • Bug #89607 - "MySQL crash in debug, PFS thread not handling singals." We have a patch contributed by Robert Golebiowski.
  • Bug #89583 - "no rpm build instructions from source git tree". As Simon Mudd put it:
    "... The sources are supposed to be in the git tree so just tell me how I can use that to produce the files needed to make the rpm SOURCES.

    Without that information only Oracle can build src rpms and no-one else can repeat the process they use and any changes required by people building rpms can't be done in a separate branch of a cloned copy of github.com/mysql/mysql-server. That would be better as from there I can much more easily provide PRs which should improve the MySQL sources."
    I hope one day Oracle will start to share all the code for open source MySQL, properly, including all test cases and detailed, repeatable build instructions. Having proper, up to date and working build instructions that let one end up with binaries built the same way as vendor builds them is not an easy task in general, and KB articles like this is just a step towards the goal...
  • Bug #89559 - "P_S recording wrong digest/digest_text for select statements using views". Everybody knows how much I like Performance Schema, but I like to notice bugs in it even more This is a funny bug that is still "Open" (even though it was noticed by my dear friend Sinisa Milivojevic already) for some reason.
  • Bug #89534 - "Crash during innodb recovery when working with encryption". Yet another bug report (this time "Verified", in 5.7.21) from Robert Golebiowski. Purge thread running during startup may cause problems.
  • Bug #89519 - "Documentation for SSL/TLS and replication is incomplete for 8.0". In this report  Daniël van Eeden noted that --ssl-verify-server-cert option is not documented.
  • Bug #89444 - "8.0.4rc --initialize-insecure is prohibitively slow". I've noted the same immediately while working on tests for my planned presentation at FOSDEM, but I do not care much about MySQL 8. Fortunately,  Roel Van de Paar cares, so we have a "Verified" bug report about this new behavior.
  • Bug #89441 - "Foreign keys constraints ignored after RENAME TABLE". I am sure we'll see many regressions related to the new data dictionary in MySQL 8. This bug report by Carlos Salguero highlights one of them.
  • Bug #89430 - "Release notes are missing important CVE fixes". Oracle and transparency in anything related to security issues seems to be historically incompatible things. Still, Roel Van de Paar has a hope they may improve their release notes... In the meantime check this great document with all the relevant details.
  • Bug #89375 - "Parallel replication always fails with specific workload from sysbench". It's always great to see public bug report from Oracle employee. Thank you, Frederic Descamps, for sharing your findings with community!
  • Bug #89372 - "Using --no-dd-upgrade seems to have no effect". This bug report by Geert Vanderkelen is still "Open" for some reason.
  • Bug #89367 - "Storing result in a variable(UDV) causes query on a view to use derived tables". Nice optimizer bug reported by Jaime Sicam.
  • Bug #89331 - "Changing lock/release order in group commit causes a deadlock". I think there are at most 10 people in the world who understands all the details of this bug report by Aliaksei Sandryhaila (and I am not one of them). The goal they are trying to achieve is interesting:
    "We are implementing START TRANSACTION WITH CONSTISTENT INNODB SNAPSHOT functionality in 8.0, and want to change it to first taking the next stage's lock, then releasing the previous stage's lock. The implementation of this feature for 5.6 is here: https://github.com/facebook/mysql-5.6/commit/c2b8ced ..."
    I see active discussion with Oracle developer in that report, so I am sure they'll figure out how to proceed.
  • Bug #89272 - "Binlog and Engine become inconsistent when binlog cache file gets out of space". Great finding by Yoshinori Matsunobu and Jeff Jiang. If you ever had "Errcode: 28 - No space left on device" on master while doing some large transactions, make sure to check if slave is in sync. It may be NOT.
  • Bug #89267 - "Unable to access 8.0.4 server after starting on top of 5.7.20 database". I am sure we'll get a lot more bug reports when users actively start to upgrade. For now check comments for possible workaround.
  • Bug #89247 - "Deadlock with MTS when slave_preserve_commit_order = ON." There are many problems with parallel replication/multi-threaded slaves, and Jean-François Gagné probably knows more about them than anyone else. Check also his another bug report about this feature, Bug #89141 - "Error in Group Replication caused by bad Write Set tracking."
  • Bug #89126 - "create table panic on innobase_parse_hint_from_comment". Nice bug report and patch from Yan Huang.
  • Bug #89101 - "MySQL server hang when gtid_mode=on and innodb_thread_concurrency>0". This is a second reason for me to think twice from now on before suggesting to limit innodb_thread_concurrency ever again. Great bug report by Seunguck Lee and MTR test case (and argumentation) by Przemyslaw Malkowski! This is a masterpiece of bug reporting art, Sinisa had to give up...
  • Bug #89098 - "Adding an auto_increment column to existing table creates gaps". Really weird behavior noticed by Riccardo Pizzi. I hope it is truly repeatable literally (but I had not checked myself).
That's all new bug reports in my list of subscriptions. Next time I hope to review few older ones that were not presented in this series yet.

Sunday, January 28, 2018

On InnoDB's Online DDL

I am completing my preparations for the upcoming FOSDEM talk, and one of last things I wanted to do in frames of them is a quick review of known bugs and problems in current (as in MySQL 5.7 GA) implementation of so called "online" DDL for InnoDB tables.

In my previous post I already shared my view on another important InnoDB feature, persistent statistics. Unlike that, I do not really hate online DDL. I just try to avoid it if possible and use tools like pt-online-schema-change or gh-ost instead. Not because it is not documented properly (the documentation is quite detailed, there are still things to clarify though) or does not work as designed, but mostly because the term "online" (if we understand it as "not blocking", or "without blocking/affecting the application and read/write operations to the table being changed is available") is a bit misleading (it is more like "less blocking" or "blocking for shorter periods of time", faster and in-place, sometimes), and because it does not work the way one might expect in any kind of replication setups.

To be more specific:
  • Replication ignores LOCK=NONE :) Slave will only start to apply "concurrent" DML after commit, and this leads to a huge replication lag.
  • In too many cases the entire table is rebuilt (data are (re-)written), in place or by creating a copy, while notable writes in the process of running ALTER TABLE are really required only if we are introducing stricter constraints (and even in this case we can just validate the table, return error if some row does not satisfy new constraint, too long to fit, for example, and then change metadata if all rows are OK) or adding new indexes (that in any case can not be used until they are built).
  • The online log has to be kept (in memory or in temporary file). There is one such log file for each index being created or table being altered. Manual says:
    "This log file stores data inserted, updated, or deleted in the table during the DDL operation. The temporary log file is extended when needed by the value of innodb_sort_buffer_size, up to the maximum specified by innodb_online_alter_log_max_size. If a temporary log file exceeds the upper size limit, the ALTER TABLE operation fails and all uncommitted concurrent DML operations are rolled back. Thus, a large value for this option allows more DML to happen during an online DDL operation, but also extends the period of time at the end of the DDL operation when the table is locked to apply the data from the log."
    The problem is that the size depends on the concurrent DML workload and is hard to predict. Note also "when the table is locked" above to understand how much "online" is this...
There are also bugs, and I'd like to discuss some of them:
  • Bug #82997, "Online DDL fails with". There are not enough public details to be sure with what exactly, but maybe the problems (several are reported) happen when the table altered has generated column. if this is really so, the bug may be fixed in MySQL 5.7.19+.
  • Bug #73196, "Allow ALTER TABLE to run concurrently on master and slave". I can not put this better than Andrew Morgan did it in this verified feature request:
    "With online ALTER TABLE it is possible for the DDL operation to run for many hours while still processing DML on that same table. The ALTER TABLE is not started on the slave until after it has completed on the master and it will again take many hours to run on the slave. While the DDL runs on the slave, it is not possible for it to process any transactions which followed the ALTER TABLE on the master as they may be dependent on the changes that were made to the table's schema. This means that the slave will lag the master by many hours while the ALTER TABLE runs and then while it catches up on the backlog of DML sent from the master while that was happening."
    Both pt-osc and gh-ost resolve this problem, as they take replication topology into account and can throttle changes on master if needed. See also this documentation request by Daniël van Eeden, Bug #77619 , that lists more limitations of "online" DDL, and check how it helped to clarify them here.
  • Bug #67286, "InnoDB Online DDL hangs". It ended up as "Not a bug", but there is a good explanation of exclusive metadata lock set by the "online" ALTER in the comments:
    "The final (short) phase of ALTER where the internal data dictionary is updated requires exclusive access. That's why the ALTER was blocked by the active transaction having a shared lock on the table."
    I once studied similar (and even simpler) case in a lot of details with gdb, see this blog post. I've clearly see MDL_EXCLUSIVE lock request for simple ALTER TABLE ... STATS_AUTO_RECALC=1 that (according to the manual) "permits concurrent DML". Other manual page clarifies:
    "In most cases, an online DDL operation on a table waits for currently executing transactions that are accessing the table to commit or roll back because it requires exclusive access to the table for a brief period while the DDL statement is being prepared. Likewise, the online DDL operation requires exclusive access to the table for a brief time before finishing. Thus, an online DDL statement also waits for transactions that are started while the DDL is in progress to commit or roll back before completing."
    Dear MySQL Oracle developers, just remove "In most cases" (or clarify it), and this would be fair enough!
  • Bug #84004, "Manual misses details on MDL locks set and released for online ALTER TABLE". That's my documentation request I filed after spending some time tracing metadata locks usage in gdb. My request is simple (typos corrected):
    "Describe all kinds of metadata locks used by MySQL, their interactions and order of acquisition and release for most important SQL statements, including (but not limited to) all kinds of online ALTER TABLE statements for InnoDB tables."
  • Bug #68498, "can online ddl for innodb be more online?". This report by Mark Callaghan that refers to this detailed study is still "Verified". Based on the comments to that blog post, it is "enough online", but the details of implementation were not clearly documented at the moment. Check for the details and clarifications in the comments!
  • Bug #72109, "Avoid table rebuild when adding or removing of auto_increment settings". The bug report from Simon Mudd is still "Verified".
  • Bug #57583, "fast index create not used during "alter table foo engine=innodb"". The bug is still "Verified" and I can not tell from the manual if this is implemented in MySQL 5.7 or not.
  • Bug #83557, "Can't use LOCK=NONE to drop columns in table with virtual columns" - nice "Verified" bug report by Monty Solomon.
  • Bug #70790, "ALTER TABLE REBUILD PARTITION SHOULD NOT PREVENT DML IN UNAFFECTED PARTITIONS". My former colleague in Oracle Arnaud Adant simply asked to provide proper and reasonable support of online DDL for partitioned tables. This bug report is still "Verified", but at least we have a separate manual page now that explains the details and limitations of online DDL with partitioned tables (most of Arnaud's requests are still NOT implemented).
  • Bug #81819, "ALTER TABLE...LOCK=NONE is not allowed when FULLTEXT INDEX exists". As Marko Mäkelä explains in the last comment of this "Verified" feature request:
    "However, LOCK=NONE is never supported when a FULLTEXT INDEX exists on the table. Similarly, LOCK=NONE is not supported when SPATIAL INDEX (introduced in MySQL 5.7) exist. Speaking as the author of WL#6255 which implemented ALTER TABLE...LOCK=NONE for InnoDB B-tree indexes in MySQL 5.6, I share the bug reporter's disappointment."
To summarize, online DDL in MySQL 5.7 is surely far above and beyond "fast index creation", but there is still a lot of room from improvements. Real solution (that allows to perform ALTER TABLE fast and without unnecessary changes/writes to data in way more cases) may come with real data dictionary in MySQL and support for multiple table versions there, or from ideas like those implemented in MDEV-11369, "Instant add column for InnoDB", and expressed in MDEV-11424, "Instant ALTER TABLE of failure-free record format changes". Until that all is implemented I'd prefer to rely on good old tools like pt-osc

In any case we speak about backward incompatible changes to the way MySQL works and stores data now.

On InnoDB's Persistent Optimizer Statistics

As I put it in recent Facebook post, one of MySQL features that I truly hate is InnoDB's persistent statistics. I think I should clarify this statement. It's great to have a way to save statistics in between server restarts, to have better control on the way it is estimated (even on a per table basis), set it explicitly, check it with simple SELECT. These all are great additions since MySQL 5.6.2+ that I truly appreciate (even if I may not be happy with some implementation details). They helped to make plans for queries against InnoDB more predictable and allow (with some efforts applied) MySQL query optimizer to really work as "optimizer" instead of "randomizer" or "pessimizer" (as some people called it) for InnoDB tables.

What I hate about it mostly is the way innodb_stats_auto_recalc is implemented, and the fact that it is enabled by default since MySQL 5.6.6+ or so. Here is why:
  1. Even if one enables automatic statistics recalculation, she can not be sure that statistics is correct and up to date. One still really has to run ANALYZE TABLE every time after substantial changes of data to be sure, and this comes with a cost (that Percona tried to finally overcome with the fix to lp:1704195 that appeared in their Percona Server 5.7.20-18+). Or enjoy whatever bits of statistics (taken in the process of background recalculation) may be present at the moment and the resulting execution plans...
  2. The details on automatic statistics recalculation are not properly documented (if only in some comments to some bug reports). This changes to better with time (thanks to continue pressure from MySQL community, including your truly, in a form of bug reports), but still most of MySQL users are far from understanding why something happens or NOT happens when this feature is involved.
  3. Implementation introduced background thread (that does dirty reads) to do recalculation, and separate transactions against InnoDB tables where statistics is stored. This complicates implementation, analysis in gdb etc, and introduced more bugs related to coordination of work performed by this thread and other background and user threads.
  4. Recently nobody from Oracle cares much to fix bugs related to this feature.
Let me try to illustrate and prove the points above with some MySQL bug reports (as usual). Many of these bugs are still "Verified" and not fixed as of recent release of recent GA version, MySQL 5.7. The order is somewhat random:
  • Bug #70741, "InnoDB background stats thread is not properly documented" - that's one of my requests to improve documentation. Some more details were added and the bug is closed, but make sure to read the entire comment "[26 Nov 2013 13:41] Vasil Dimov" if you want to understand better how it all works.
  • Bug #70617, "Default persistent stats can cause unexpected long query times" - this is one of bugs that led me to filing the previous documentation request. Check comments by Vasil Dimov there that he made before closing it as "Not a bug"... His comments are the best documentation of the way feature is implemented that I've seen in public. Make your own conclusions.
  • Bug #78289, "Bad execution plan with innodb_stats_persistent enabled" - note that the problem started after pt-osc was applied (to overcome the problems with another feature I hate, "online" ALTER TABLE, most likely). This utility applies RENAME to the table that is altered at the last stage, and as a result statistics just disappears and you have either to wait until it is calculated again in the background, or run ANALYZE... Surely this is "Not a bug".
  • Bug #80178 and Bug #78066 are about cases when SHOW INDEXES may still give wrong results while (with persistent statistics automatic recalculation disabled) one expects the same values we see in the tables where statistics is stored, or just correct ones. Both bugs are still "Verified", even though from the comment in the latter one may assume that the problem may be fixed in recent MySQL 5.7.x.
  • Bug #75428, "InnoDB persistent statistics not persistent enough". The counter of updated rows since last recalculation does not survive restarts, and 10% threshold is not configurable, so if server restarts often and table is big enough, we may get statistics never updated. Still "Verified".
  • Bug #72368, "Empty/zero statistics for imported tablespace until explicit ANALYZE TABLE". Still "Verified", but may be fixed in versions newer than 5.6. Importing tablespace was NOT a reason for automatic statistics recalculation to ever happen for the table...
  • Bug #84940, "MySQL Server crash possibly introduced in InnoDB statistics calculation". This regression bug in 5.6.35 and 5.7.17 was quickly fixed in the next releases, but still caused troubles for some time.
  • Bug #82184, "Table status inconsistent, requires ANALYZE TABLE executed twice". As Andrii Nikitin stated himself, "Most probably second ANALYZE is needed to give some time to purge thread remove old versions of the rows.", in case the table has huge blobs. The bug is still "Verified".
  • Bug #71814, "Persistent stats activity conflicts with mysqldump import of same info". The bug is "Closed" without any reason stated in public (what a surprise...). Note the following comment by Shane Bester (who actually verified and explained the bug):
    "Personally, I don't like that mysqldump dumps the content of these tables that should be auto-generated."
    He had also suggested a workaround to disable persistent statistics (SET GLOBAL innodb_stats_auto_recalc=0; SET GLOBAL innodb_stats_persistent=0;) before importing a dump. The problem here is a race condition between the importing of mysql database and the background statistics thread that can insert rows into the table between the CREATE and LOCK TABLE in the dump. See Bug #80705, "Mysqlpump in default configuration does not work", also, with a clear request: "Do not dump innodb_index_stats or innodb_table_stats". Something to think about.
  • Bug #84654, "Cardinality reset to 0 with persistent statistics when AUTO_INCREMENT touched".  Still "Verified".
  • Bug #84287, "row inserts, statement updates, persistent stats lead to table scans+ lag slaves". It seems automatic recalculation of statistics on slave is not triggered by inserting more rows via row-based replication events. Still "Verified".
  • Bug #82969 , "InnoDB statistics update may temporarily cause wrong index cardinalities". This bug (still "Verified") is my all times favorite race condition in the implementation of persistent statistics by Oracle (well, this one and Bug #82968 that is fixed at least in recent 5.7.x and in MariaDB).
 There are also bugs related to other details of InnoDB persistent statistics implementation:
  • Bug #78401, "ANALYZE TABLE" may assign temporary values to table stats during its execution". Statistics is not updated atomically, it is first reset and then recaclulated. Still "Verified".
  • Bug #86926, "The field table_name (varchar(64)) from mysql.innodb_table_stats can overflow." - this may be not enough for partitioned table, as partition names may be longer. Still "Verified".
  • Bug #67179, "mysql system tables innodb_table_stats,slave_master_info not accessible on clean". This was a famous bug during early days of MySQL 5.6 that affected many users who tried to upgrade. You may still need this file from it one day, to re-create missing InnoDB tables in the mysql database.
  • Bug #80986, "innodb complains about innodb_table_stats even if persistent stats disabled". Still "Verified".
  • Bug #86927, "Renaming a partitioned table does not update mysql.innodb_table_stats.". Fixed recently in MySQL 5.7.21 and 8.0.4.
  • Bug #84455 - the topic of this bug report is different and not relevant, but Shane Bester noted the following in the error log uploaded:
    [Warning] InnoDB: A transaction id in a record of table `mysql`.`innodb_table_stats` 
    is newer than the system-wide maximum.
    
    This is both suspicious and scary. May be related to the way background thread works.
  • Bug #74747, "Failing assertion: index->id == btr_page_get_index_id(page) btr0cur.cc line 899". Yes, this is a debug assertion only provoked explicitly, but note what is written in the error log before it happens:
    InnoDB: Cannot save table statistics for table "db1"."t1": Too many concurrent transactions
    It means background thread opens a separate transaction (no surprise, but still worth to note).
  • Bug #86702, "please disable persistent stats on the mysql.gtid_executed table". This is a valid and verified request to remove the related overhead for this "system" InnoDB table (as it was correctly done for several others).
So, the implementation of InnoDB's Persistent Optimizer Statistics is far from perfect or well documented. One may ask what I'd suggest instead? I often think and state that only engine-independent persistent statistics (in MariaDB style) should exist, and this should be recalculated only by explicit ANALYZE TABLE statement, maybe with more options to set sample size and other details than we have now. No background threads, nothing automatic until automated by the DBA (for this I'd appreciate a package like Oracle's dbms_stats).

This kind of idea is usually not well accepted. One of recent (valid) complains by Domas here were "No, thanks, don't need more MDL holders." and "I prefer lockless versioned stats, when it gets to what I prefer."

Some of the problems mentioned above may be resolved in MySQL 8 (or not) with its atomic data dictionary operations. Other idea presented in MDEV-15020 is to store statistics with data in the same .ibd file. We shell see what may happen, but current implementation, even though it improved a lot since early MySQL 5.6 days, is hardly long term acceptable.

Tuesday, January 16, 2018

Fun with Bugs #60 - On Some Memory Leaks, Replication and Other Bugs Fixed in MySQL 5.7.21

Oracle had formally released MySQL 5.7.21 yesterday. I do not bother any more to study MySQL release notes carefully and completely, but during a quick review today I've noted several interesting items I'd like you to pay attention to.

I am historically interested in InnoDB implementation details, so I could not miss Bug #87619 - "InnoDB partition table will lock into the near record as a condition in the use ". This was a regression bug in 5.7+, probably caused by new implementation of partitioning in InnoDB.

Another interesting bug is Bug #86927 - "Renaming a partitioned table does not update mysql.innodb_table_stats.", by Jean-François Gagné. It was yet another bug in InnoDB's persistent statistics (that I truly hate). What makes it especially interesting to me, though, is that it's the first public bug report I noted that mentioned MySQL 9.0.0 release as a target for the fix:
"Fixed as of the upcoming 5.7.21, 8.0.4, 9.0.0 release"
So, it's clear that back in October 2017 Oracle had already got a separate branch for upcoming MySQL 9.0.x! It also probably means that MySQL 8.0.x GA is coming really soon.

There are bug reports that are worth reading for technical reasons, others - only if you want to get some fun. Bug #86607 - "InnoDB crashed when master thread evict dict_table_t object" is agood example that covers both cases. Good to know the crash is fixed, but, please, make sure to read all comments there.

In this release I've noted fixes to several public bugs reported by Shane Bester. The first one of them is Bug #86573 - "foreign key cascades use excessive memory". Check how he used memory instrumentation in Performance Schema to demonstrate the problem! In Bug #86482 - "innodb leaks memory, performance_schema file_instances #sql-ib3129987-252773.ibd", he used similar approach to show potential memory leak in the Performance Schema itself ! Yet another bug that mentions 9.0.0 as a target version for the fix, among others... 

Bug #78048 - "INNODB Full text Case sensitive not working", is here both because I recently started to notice problems related to InnoDB FULLTEXT indexing, again (first time was soon after it was introduced), and because it has an MTR  test case contributed by Sveta Smirnova.


XA transactions support had always been problematic in MySQL  (still "Verified" Bug #87526 by Sveta Smirnova is one of recent examples how incomplete or useless it can be, see also MDEV-14593). Check the following bugs fixed in MySQL 5.7.21 if you use XA transactions:
  • Bug #87393 - "xa rollback with wrong xid will be recorded into the binlog". It was reported by HongXiang Jiang, who had also contributed a patch.
  • Bug #83295 - "replication error occurs, use xa transaction(one phase)". Yet another XA transactions problem reported by Hiroyuki Itoh and then confirmed by many affected users. Nice to see it fixed.
There are many fixes in MySQL 5.7.21 related to memory leaks. Two bug reports of this kind were from Przemyslaw Malkowski:
  • Bug #85371 - "Memory leak in multi-source replication when binlog_rows_query_log_events=1". Again, memory instrumentation of Performance Schema was used to demonstrate the problem. Vlad Lesin, also from Percona, contributed the patch for this bug.
  • Bug #85251 - "Memory leak in master-master GTID replication with sync_relay_log_info". Here Vlad Lesin, who had contributed the patch, also used Massif for the detailed analysis.
To summarize, I start to miss memory instrumentation in Performance Schema in MariaDB 10.x... This is a really useful feature.

I usually care about optimizer bugs, and these two attracted my attention:
  • Bug #87207 - "select distinct with secondary key for 'Using index for group-by' bad results". This nice optimizer regression bug was found by Shane Bester. As a workaround, while you do not use 5.7.21, you can try to set optimizer_switch='use_index_extensions=off'. I'd keep it that way by default...
  • Bug #72854 - "Extremely slow performance with outer joins and join buffer". I am happy to see this old optimizer bug reported by Sergey Petrunya from MariaDB finally fixed.
You can find a lot more details, including usual references to MySQL bug reports that are still private, in the Release Notes. Keep reading and consider upgrade :)

Tuesday, January 2, 2018

Fun with Bugs #59 - On MySQL Bug Reports I am Subscribed to, Part II

New Year (that starts on Monday!) gives a good opportunity to change something in our lives, start doing something new, better or different. Let's assume I failed with all these so far, as I am again posting about MySQL bugs here.

Since my previous post on this topic I've subscribed to 15 more MySQL bugs, and being on a combination of public holidays and vacation now gives me a good opportunity to review these bug reports.

Here they are, starting from the most recent:
  • Bug #89065 - "sync_binlog=1 on a busy server and slow binary log filesystem stalls slaves". I do not remember seeing multiple threads in "Finished reading one binlog; switching to next binlog" state, but it would be interesting to see this bug report processed properly.
  • Bug #89051 - "EXPLAIN output is different for same content depending when index was added". The way optimizer decides on "range" vs "ref" access is always interesting. Here, based on a recent comment by Øystein Grøvlen, the bug is actually that "Cost values are not correct when optimizer switch from ref-access to range-access in order to use more key parts".
  • Bug #88914 - "Potential null pointer dereference at pointer node->undo_recs (row0purge.cc)". It's funny to see many bugs becoming private as "security" ones and, at the same time, this bug, where reporter suspects it is exploitable, being "Open" and ignored for more than two weeks...
  • Bug #88891 - "Filtered replication leaves GTID holes with create database if not exists". I can not even explain how much I "like" all kinds of GTIDs I have to deal with, especially such a long lists of GTIDs that may be created in cases described in this report.
  • Bug #88863 - "COUNT(*) can sometimes return bogus value". Now, this is a really funny bug! It must be some race condition, and I'd really prefer to see this bug fixed soon.
  • Bug #88844 - "MySQL crash with InnoDB assertion failure in file pars0pars.cc". Nice crash (that I've never seen before) quickly reproduced by Shane Bester.
  • Bug #88834 - "Uneven slowdown on systems with many users". What can be better to speed up connection than checking the list of users one by one, especially when there are thousands of users?
  • Bug #88827 - "innodb uses too much space in the PK for concurrent inserts into the same table". As Mark Callaghan put it:
    "I am not sure my reproduction details will ever satisfy Sinisa but I don't mind if you don't fix this because I care more about MyRocks today and this bug makes MyRocks look better."
    We (Facebook's MySQL, MariaDB and Percona server users) do have MyRocks, but why poor Oracle MySQL users should suffer? Let's hope Sinisa Milivojevic will process the bug fast, with all the details clarified there :)
  • Bug #88791 - "Binary log for generated column contains new value as WHERE clause, not old value". Generated columns and binary logging, what could went wrong?
  • Bug #88764 - ""ALTER TABLE MODIFY..." takes time even if leaving table as is". Any simple test cases they come to my mind do NOT let to reproduce this problem, but I feel some potential as soon as more exotic cases like partitioning or data directory settings are considered. Let's wait for bug reporter to clarify.
  • Bug #88720 - "Inconsistent and unsafe FLUSH behavior in terms of replication". Nice summary of problems from Przemyslaw Malkowski. One more reason for me to hate GTIDs, honestly.
  • Bug #88694 - "MySQL accepts wildcard for database name for table level grant but won't use it". One more problem with privileges reported by Daniël van Eeden.
  • Bug #88674  - "Regression CREATE TBL from 5.7.17 to 20 (part #2: innodb_file_per_table = OFF)." and Bug #88673 - "Regression CREATE TBL from 5.7.17 to 20 (part #1: innodb_file_per_table = ON)." - these two bugs were reported by Jean-François Gagné and clearly show some things that are done wrong by Oracle when fixing (private, "security") bugs...
  • Bug #88671 - "ALL + BNL chosen over REF in query plan with a simple SELECT + JOIN". In this case optimizer (probably) does not take costs into account properly when choosing block nested loop join vs usual "ref" index access. Maybe just a matter of missing/wrong statistics also. It would be interesting to find out eventually.
  • Bug #88666 - "I_S FILES : all rows are displayed whatever the user privileges". Yet another bug report from Jocelyn Fournier. I am actually surprised with a number of bugs related to privileges that I notice recently.
  • Bug #88633 - "Auto_increment value on a table is less than max(id) can happen". It seems only MySQL 5.7.x is affected, but not 5.6.x.
  • Bug #88623 - "Modifying virtually generated column is not online". May be by design, but still surprising.
  • Bug #88534 - "XA may lost prepared transaction and cause different between master and slave". XA transactions with MySQL is still a sure way to disaster, I think. See Bug #87526 also that should appear in the next part of this series...
Stay tuned to more posts about MySQL bugs from me in the New Year of 2018!

Saturday, December 9, 2017

Fun with Bugs #58 - Bug of the Day From @mysqlbugs

In 2013 I had a habit of writing about MySQL bugs on Facebook almost every day. Typical post looked like this one, link to the bug and few words of wondering with a bit of sarcasm.
By the way, check last comments in Bug #68892 mentioned there - the problem of LOST_EVENTS in master's binary log and a way to workaround it still valid as of MySQL 5.7.17.
At that time I often got private messages from colleagues that Facebook is a wrong media for this kind of posts, these posts make MySQL look "buggy" etc, and eventually I was shut up (for a year or so) in a more or less official way by combined efforts of my employer at that time and Oracle MySQL officials.

A year later I started to write about MySQL bugs again on Facebook, but these posts were not regular any more, and maybe became a bit less sarcastic. Years later, I agree that Facebook should better be used for sharing photos of cats, or nice places, or family members, or even for hot political discussions, as these things annoy people much less than MySQL bugs. So, the need for different media for short annoying messages about MySQL bugs is clear to me. I do think the right media is Twitter (it is annoying by itself), and I am present there as @mysqlbugs since December 2012 actually. I am not a big fan of it historically and used to open it mostly to share information about yet another my blog post, but recently (after they allowed to write longer messages there) I decided to pay more attention to it (until its gone entirely and replaced by something else). So, I post a link to one MySQL bug there every day for a week or so, with the #bugoftheday tag. I quickly noticed that the tag was used by others to share photos of nice insects/bugs, but I don't mind for my posts to end up among those, and I am ready to share some of my related photos as well.

To summarize, I am going to write short messages about MySQL bugs regularly on Twitter. I try to write about some bug I recently notices just because it is new, improperly handled or was involved in some customer issue that I worked on. Let me share 5 last bugs mentioned there, so you can decide if it makes sense for you to follow me or #bugoftheday:
  • Bug #88827 - "innodb uses too much space in the PK for concurrent inserts into the same table". Interesting finding by Mark Callaghan during his recent benchmarking efforts. Still "Open".
  • Bug #88788 - "log_bin is not considered correctly an makes binary logging unusable!". This report by Oli Sennhauser (quickly declared a duplicate of my older Bug #75507) got a really nice number, and it's also interesting how efforts to name hosts in a nicely structured manner may play against poor DBA...
  • Bug #88776 - "Issues found by PVS-Studio static analyzer". Related post with the detailed analysis of problems found by the tool was mentioned few days ago by somebody from MariaDB on Slack, so I immediately noted that the bug comes from the same author, Sergey Vasiliev.
  • Bug #88765 - "This bug reporting form has a ridiculously short character limit for the bug syn". The bug report about MySQL bugs database itself. I also hit the limit there way more often than I'd like to. Discussion continues, but I feel that this is not going to be fixed...
  • Bug #87526 - "The output of 'XA recover convert xid' is not useful". One of our customers hit this problem recently. I can only agree with Sveta Smirnova here, "Output of XA RECOVER should show ID which can be used in XA COMMIT statement."
So, if the list above seems useful or interesting, please, follow me on Twitter. Let's continue to have regular fun with MySQL bugs, now using (hopefully) a more appropriate media!

As a side note, if you are interested in older bugs opened this day years ago and still "Verified", please, check this great page by Hartmut! You may find really great reports like Bug #79581- "Error 1064 on selects from Information Schema if routine name has '\0'", by Sveta Smirnova. This bug is 2 years old today...

Saturday, December 2, 2017

Using strace for MySQL Troubleshooting

I'd say that strace utility is even more useful for MySQL DBAs than lsof. Basically, it is a useful general purpose diagnostic and debugging tool for tracing system calls some process makes and signals it receives. The name of each system call, its arguments and its return value are printed to stderr or to the file specified with the -o option.

In context of MySQL strace is usually used to find what files mysqld process accesses, and to check the details about any I/O errors. For example, if I'd really want to try to verify Bug #88479 - "Unable to start mysqld using a single config file (and avoiding reading defaults)" by Simon Mudd, I'd just run mysqld from some 5.7.x version as a command argument for strace. On my Ubuntu 14.04 netbook I have the following files:
openxs@ao756:~/dbs/5.7$ ls -l /usr/my.cnf
-rw-r--r-- 1 root root 943 лип 19  2013 /usr/my.cnf
openxs@ao756:~/dbs/5.7$ ls -l /etc/my.cnf
-rw-r--r-- 1 root root 260 чер 24 20:40 /etc/my.cnf
openxs@ao756:~/dbs/5.7$ ls -l /etc/mysql/my.cnf
-rw-r--r-- 1 root root 116 лют 26  2016 /etc/mysql/my.cnf
openxs@ao756:~/dbs/5.7$ bin/mysqld --version
bin/mysqld  Ver 5.7.18 for Linux on x86_64 (MySQL Community Server (GPL))
So, what happens if I try to run mysqld --defaults-file=/etc/mysql/my.cnf, like this:
openxs@ao756:~/dbs/5.7$ strace bin/mysqld --defaults-file=/etc/mysql/my.cnf --print-defaults 2>&1 | grep 'my.cnf'
stat("/etc/mysql/my.cnf", {st_mode=S_IFREG|0644, st_size=116, ...}) = 0
open("/etc/mysql/my.cnf", O_RDONLY)     = 3
openxs@ao756:~/dbs/5.7$
It seems we proved that only the file passed as --defaults-file is read (if it exists). By default other locations are also checked in a specific order (note that return codes are mapped to symbolic errors when possible):
openxs@ao756:~/dbs/5.7$ strace bin/mysqld --print-defaults --print-defaults 2>&1 | grep 'my.cnf'
stat("/etc/my.cnf", {st_mode=S_IFREG|0644, st_size=260, ...}) = 0
open("/etc/my.cnf", O_RDONLY)           = 3
stat("/etc/mysql/my.cnf", {st_mode=S_IFREG|0644, st_size=116, ...}) = 0
open("/etc/mysql/my.cnf", O_RDONLY)     = 3
stat("/home/openxs/dbs/5.7/etc/my.cnf", 0x7ffd68f0e020) = -1 ENOENT (No such file or directory)
stat("/home/openxs/.my.cnf", 0x7ffd68f0e020) = -1 ENOENT (No such file or directory)
openxs@ao756:~/dbs/5.7$
If we think that --print-defaults may matter, we can try without it:
openxs@ao756:~/dbs/5.7$ strace bin/mysqld --defaults-file=/etc/mysql/my.cnf 2>&1 | grep 'my.cnf'stat("/etc/mysql/my.cnf", {st_mode=S_IFREG|0644, st_size=116, ...}) = 0
open("/etc/mysql/my.cnf", O_RDONLY)     = 3
^C
openxs@ao756:~/dbs/5.7$
Last example also shows how one can terminate tracing with Ctrl-C.

Now, let me illustrate typical use cases with (surprise!) some public MySQL bug reports where strace was important to find or verify the bug:
  • Bug #20748 - "Configuration files should not be read more than once". In this old bug report Domas Mituzas proved the point by running mysqld as a command via strace. He filtered out lines related to opening my.cnf file with egrep and made it obvious that one file may be read more than once. The bug is closed long time ago, but it is not obvious that all possible cases are covered, based on further comments...
  • Bug #49336 - "mysqlbinlog does not accept input from stdin when stdin is a pipe". Here bug reporter shown how run mysqlbinlog as a command under strace and redirect output to the file with -o option.
  • Bug #62224 - "Setting open-files-limit above the hard limit won't be logged in errorlog". This minor bug in all versions before old 5.6.x (that still remains "Verified", probably forgotten) was reported by Daniël van Eeden. strace allowed him to show what limits are really set by setrlimit() calls. The fact that arguments of system calls are also shown matters sometimes.
  • Bug #62578 - "mysql client aborts connection on terminal resize". This bug report by Jervin Real from Percona is one of my all time favorites! I clearly remember how desperate customer tried to to dump data and load them on a remote server with a nice command line, and after waiting for many days (mysqldump was run for a data set of 5TB+, don't ask, not my idea) complained that they got "Lost connection to MySQL server during query" error message and loading failed. Surely, the command was run from the terminal window on his Mac and in the process of moving it here and there he just resized the window  by chance... You can see nice example of strace usage with MySQL Sandbox in the bug report, as well as some outputs for SIGWINCH signal. Note that the bug is NOT fixed by Oracle in MySQL 5.5.x (and never happened in 5.6+), while Percona fixed it since 5.5.31.
  • Bug #65956 - "client and libmysqlclient VIO drops connection if signal received during read()". In this yet another 5.5 only regression bug that was NOT fixed in 5.5.x by Oracle (until in frames of Bug #82019 - "Is client library supposed to retry EINTR indefinitely or not" the patch was contributed for a special case of it by Laurynas Biveinis from Percona, that in somewhat changed form allowed to, hopefully, get some fix in 5.5.52+), the problem was noted while trying to attach strace to running client program (with -p option).
  • Bug #72340 - "my_print_defaults fails to parse include directive if there is no new line". Here bug reporter used strace (with -f option to trace child/forked processes and -v option get verbose output) to show that file name is NOT properly recognized by my_print_defaults. The bug is still "Verified".
  • Bug #72701 - "mysqlbinlog uses localtime() to print events, causes kernel mutex contention". Yet another nice bug report by Domas Mituzas, who had used -e stat options of strace to trace only stat() calls and show how many times they are applied to /etc/localtime.The bug is fixed since 5.5.41, 5.6.22 and 5.7.6 removed related kernel mutex contention.
  • Bug #76627 - "MySQL/InnoDB mix buffered and direct IO". It was demonstrated with strace that InnoDB was opening each .ibd file twice (this is expected as explained by Marko Mäkelä) in different modes (and this was not any good). The bug is fixed in recent renough MySQL server versions.
  • Bug #80020 - "mysqlfrm doesn't work with 5.7". In this bug report Aleksandr Kuzminsky had to use strace to find out why mysqlfrm utility failed mostly silently even with --verbose option.
  • Bug #80319 - ""flush tables" semantics deviate from manual". Here Jörg Brühe proved with strace that close() system call is not used for individual InnoDB table t when FLUSH TABLES t is executed. manual had to be clarified.
  • Bug #80889 - "PURGE BINARY LOGS TO is reading the whole binlog file and causing MySql to Stall". By attaching strace to running mysqld process and running the command it was shown that when GTID_MODE=OFF the entire file we purge to was read. No reason to do this, really. Note how return value of open(), file descriptor, was further used to track reads from this specific file.
  • Bug #81443 - "mysqld --initialize-insecure silently fails with --user flag". As MySQL DBA, you should be ready to use strace when requested by support or developers. Here bug reporter was asked to use it, and this revealed a permission problem (that was a result of the bug). No need to guess what may go wrong with permissions - just use strace to find out!
  • Bug #84708 - "mysqld fills up error logs with [ERROR] Error in accept: Bad file descriptor". Here strace was used to find out that "When mysqld is secured with tcp_wrappers it will close a socket from an unauthorized ip address and then immediately start polling that socket"... The bug is fixed in MySQL 5.7.19+ and 8.0.2+, so take care!
  • Bug #86117 - "reconnect is very slow". I do not care about MySQL Shell, but in this bug report Daniël van Eeden used -r option of strace to print a relative timestamp for every system call, and this was it was clear how much time was spent during reconnect, and where it was spent. Very useful!
  • Bug #86462 - "mysql_ugprade: improve handling of upgrade errors". In this case strace allowed Simon Mudd to find out what exact SQL statement generated by mysql_upgrade failed.
The last but not the least, note that you may need root or sudo privileges to use strace. On my Ubuntu 14.04 this kind of messages may appear even if I am the user that owns mysqld process (same with gdb):
openxs@ao756:~/dbs/maria10.2$ strace -p 2083
strace: attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
To summarize, strace may help MySQL DBA to find out:

  • what files are accessed by the mysqld process or related utilities, and in what order
  • why some MySQL-related command (silently) fails or hangs
  • why some commands end up with permission denied or other errors
  • what signals MySQL server and tools get
  • what system calls could took a lot of time when something works slow
  • when files are opened and closed, and how much data are read from the files
  • where the error log and other logs are really located (we can look for system calls related to writing to stderr, for example)
  • how MySQL really works with files, ports and sockets
It also helps to find and verify MySQL bugs, and clarify missing details in MySQL manual.

There other similar tools for tracing system calls (maybe among other things) on Linux that I am going to review in this blog some day. Performance impact of running MySQL server under this kind of tracing is also a topic to study.