Saturday, 8 April 2017

MySQL Partition Pruning

Recently, we learned an expensive lesson about MySQL partition pruning. There, it is better to share it here so that others will not repeat our mistake.

Background


In our system, there is a big stats table that does not have primary key and indexes. This table is partitioned, but the lack of indexes often causes the full partition or even full table scan when query. To make things worse, the system still continues writing to this table, making it slower every day.

To fix performance issue, we want to clean the legacy data and add new indexes. However, this is not easy because the table is too big. Therefore, we chose the long approach by migrating only the wanted data from this old table to a new table with proper schema.

Partition by hash


It would have been fine if we only did what we originally intended to do. However, we changed the partition type for convenient and that made the new table slower.

In the original table, the partition is based on a timestamp column that represents the time as a number of hours from epoch. For example, the first second of the year 2017 in GMT is 1483228800 seconds from epoch. To get the number of hours, we divide the number by 3600 to get 1483228800 div 3600) = 412008.

Because of the partition by range type, we need to have a maintenance script that creates the monthly partition for next year. This way of partition is not very ideal because the partition size is big and not even. Hence, we converted monthly to weekly partition but too lazy to define each range and switched from partition by range to partition by hash.

This is a short version of how hash definition will look like if we do the partition by range
PARTITION BY RANGE (hour_epoch)
(PARTITION pOct2016 VALUES LESS THAN (419304),
 PARTITION pNov2017 VALUES LESS THAN (420024) ENGINE = InnoDB,
 PARTITION pDec2017 VALUES LESS THAN (420768) ENGINE = InnoDB,
 PARTITION pMax VALUES LESS THAN MAXVALUE ENGINE = InnoDB)
And this is how the partition definition will look like if we do partition by hash
partition by hash (hour_epoch div 168) partitions 157;
The partition by hash type did more than just shorten the syntax. MySQL will try to split records evenly by applying modulo function to select a partition. However, to make the duration of one partition one week, we divide hour_epoch number by 168 to effectively get week_epoch.

With the new table schema, we were happy with smaller partitions, shorter description, and more indexes.

Performance issue


Because of the huge volume of data, we could not fully migrate data to the new schema to verify performance. We only did the preliminary performance test with the data of 2 weeks and did not detect any performance issue. However, in the final testing, we were surprised to observe mixed result. Most of the queries are faster as expected, but some are slower.

After investigating, we realized that instead of scanning only a few partitions, MySQL does the full table scanning for time range query. It is even stranger that this behavior only happens with the date range smaller than 3 weeks. Totally surprised by this result, we overcame our procrastination to read up MySQL document carefully and realize why.

"For tables that are partitioned by HASH or [LINEAR] KEY, partition pruning is also possible in cases in which the WHERE clause uses a simple = relation against a column used in the partitioning expression"

As the document clearly explained, the partition pruning only works with the equal condition for partition by hash type.  However, we did not detect this issue earlier because of the query optimizer will auto convert range condition to equal condition if the number of distinct values in between of the range condition is short enough. Unfortunately, in our early test, the data of 2 weeks is short enough for the query optimizer to hide the problem from us.

Solution


After learning about the issue, we struggled to find a way to fix the performance issue. There are 2 proposed solutions

  • Trick the query optimizer to do the work by splitting a big range to multiple small ranges, each fit one partition. In this way, the query optimizer will work on each individual small ranges.
  • Rebuild the schema again with the proper partition type. 
The first solution is quick but dirty while the second solution is too time-consuming. Eventually, we almost decided to launch the new table with the first solution until finding a quick way to implement the second solution.

We have dug through MySQL document and learned that re-parititioning is basically a copy and paste operation. However, MySQL also has another command that allows us to do some partition change without too much effort.
ALTER TABLE pt
    EXCHANGE PARTITION p
    WITH TABLE nt;

In this command, MySQL allows us to exchange partition between a table and a partition of another table. Even when this is not a direct exchange between 2 partitions of 2 tables, it is just a matter of inconvenience to do one more middle swap to a temp table.

This is how our partition swapping looks like

ALTER TABLE origin_table EXCHANGE PARTITION p1 WITH TABLE temp_table;
ALTER TABLE final_table EXCHANGE PARTITION p1 WITH TABLE temp_table;

Even though this is not as fast as you may guess as MySQL will do a row by row validation to ensure every record of temp table is elligible for storing in the final table partition. If we use MySQL 5.7, this validation can be turned off by adding "WITHOUT VALIDATION" to the end of the second command.

Because we use Aurora, which only support MySQl 5.6, it still took us 2 days to fully update the partition type. However, this would have been one month if we do not use partition exchange.

Fortunately, we managed to recover from the mistake this time. We hope that you learn from our mistake and do remember to read the document carefully before using any fancy method.



Monday, 2 January 2017

Retrospective

Some folks asked me before that which Agile practice is the most important and my immediate answer is Retrospective. From my own experience, Retrospective plays the biggest role in the success of Agile practicing. Unfortunately, it may not necessarily be a popular practice. This is a bit sad because after trying Agile in different organizations, I see no practice that shows value as early and obviously as the Retrospective. Moreover, it is one of the easiest practice to adopt because it does not require discipline to practice regularly. It can be practiced as little as once a year and still be able to bring the differences.

Why retrospective is so important

Stay true to "agile" spirit

Unless you are hiding a rock, it is hard to ignore the debate about "Agile" versus "agile". Lots of developers are upset with the fact that agile is being seen as a set of ruleset rather than mindset. Unfortunately, trying to adopt agile by following the ruleset may lead to a rigid mindset, which is reversed to Agile manifesto.

Retrospective is not vulnerable to this problem because it is the most flexible practice in Agile. Retrospective stays true to the agile spirit by not specifying the method but only the purpose and benefit of the activity. Therefore, it leaves the team with freedom to conduct the activity in whatever ways that fit. The rule followers still can have it their ways with many techniques available but in general, this practice is very personal. While Planning, User Stories, Backlog and Iteration practices may look pretty the same everywhere, Retrospective is always very unique. Because each team has its own problems and members, following the same format still leads to different outcomes.

First step toward improvement

It is quite obviously that in order to improve, we need to see our weakness and limits. This logic should apply not only to software development but to any other aspect of life as well. Therefore, one of the first thing that one should do before introducing any change is spending time learning about the characteristic of each individual and the dynamic of the team.

The traditional method to understand team through psychology test is overrated. It tends to make teams fall into common stereotypes. It is not that psychology test is a waste of time but in reality, it works better for the individual, especially when the subject of the test is willing to collaborate. Therefore, psychology test is better to be a method of collecting feedback and improvement measurement.

For collecting insights about team dynamic, Retrospective is a more effective method because it is less intrusive. People are normally more comfortable when we ask less and let them talk more about what they are concerning about. Fortunately, that is exactly what Retrospective is about.

Keep a close look at the team well-being

The days where developers need to pray to get a decent job have passed. Nowadays, the demand for good developers is so high that most of the companies turn to headhunters to recruit talents. Hence, it is not only challenging to get more talents, but also to retain talents.

We may not be able to do much if this is paycheck competition. However, job changing is rarely purely paycheck driven. It can be very emotionally difficult to leave a job you love and a caring environment. Therefore, if the leader keeps a close eye on the team and each individual, there will be much greater chance to shield the team from lucrative offers.

How to run retrospective

As mentioned above, a good Retrospective is one that let people voice out their inner concern and thinking. Therefore, anything resembles form filling or interview is counterproductive. The better suggestion should be a flexible format. Retrospective itself need to be interesting and intimate enough to put people in a comfortable zone. Our ultimate goal is to let people share more so that the team can improve.

An effective facilitator needs to know how to stir up the conversation when it goes quiet and be silent when people are having a deep reflection. Any context switching is helpful as well. For example, a retrospective session can be out of office, far from the boss, enjoyable with coffee.

The last thing you need to remember about retrospective is to never ever take any discipline action from what you have learned in retrospective. Otherwise, it will be rightfully viewed as a betrayal of trust. Honestly, this will be the worst thing that can happen to the team.

So, if your team has not had an out of the box, open minded retrospective session for sometimes, please find an opportunity to bring the team to a nice place. I believe you and your team will have a good time.