- Posted by Mike on October 11th, 2007 filed in Technical
In response to my last post, someone asked in the comments what the deal was with MySQL on PowWeb. It’s also been asked on the PowWeb forums just shy of a billion times. I figured I’d take this opportunity to write a blog post that sums up the issues and what we’re doing about them.
First, let me say, from a high level, that we know MySQL is a problem. A rather large one, at that. If you call support, and the level one person you speak to won’t acknowledge the issue, ask to speak to their supervisor. Like I said last time, I have very little control over things at the Tier1 level. I feel your pain, and it bugs me to read some of the things I see people saying about our first level of support. As with everything else, it’s a constant struggle to improve.
I’ve only been here for a few short months now (though it seems like forever), and MySQL has been an issue the entire time. I don’t deny that, my boss doesn’t deny it, and his boss doesn’t, either. Let me start by summing up the problem:
PowWeb is a community that is rich in knowledge and programming skill. That means that a lot of you PowWeb folks are running MySQL-backed applications, and even writing your own. That’s great, and I encourage you to continue doing so. The issue comes in when we’ve got too many things happening at once on the same MySQL server. When there are too many connections on the server, it queues up any new connections until one frees up.
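To make the queuing behavior concrete, here is a toy Python model of a server with a fixed number of connection slots. It is only a sketch of the behavior described above, not how MySQL is actually implemented, and the class name, method names, and slot count are all made up for illustration.

```python
from collections import deque


class ConnectionPool:
    """Toy model of a server with a fixed number of connection slots.

    Hypothetical illustration only; not MySQL's real implementation.
    """

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.active = set()      # clients currently holding a slot
        self.waiting = deque()   # clients queued in arrival order

    def connect(self, client):
        """Give the client a slot if one is free; otherwise queue it."""
        if len(self.active) < self.max_connections:
            self.active.add(client)
            return True
        self.waiting.append(client)
        return False

    def disconnect(self, client):
        """Free the client's slot and hand it to the longest-waiting client."""
        self.active.discard(client)
        if self.waiting and len(self.active) < self.max_connections:
            self.active.add(self.waiting.popleft())
```

With two slots, a third client sits in `waiting` until one of the first two disconnects. The more scripts holding slots at once, the longer everyone behind them waits, which is the slowness people are seeing.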
This was an enormous problem when we were running MySQL 4 on all of the servers, as it didn’t limit connections at all. The first step in the right direction was upgrading to MySQL 5, which you all know we did a few weeks/months ago. Now that all the servers are running MySQL 5, we can limit how many connections each username can hold open at once. This has helped, somewhat, and is a good start.
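The idea behind a per-username limit can be sketched in a few lines of Python. This is an illustration of the concept only: the class is hypothetical, and the limit of 2 is invented for the example, not PowWeb’s real setting.

```python
from collections import Counter


class PerUserLimiter:
    """Hypothetical sketch: cap the number of open connections per username."""

    def __init__(self, max_per_user):
        self.max_per_user = max_per_user
        self.open_connections = Counter()  # username -> open connection count

    def connect(self, username):
        """Allow the connection unless this user is already at the cap."""
        if self.open_connections[username] >= self.max_per_user:
            return False
        self.open_connections[username] += 1
        return True

    def disconnect(self, username):
        """Release one of the user's connections, if any are open."""
        if self.open_connections[username] > 0:
            self.open_connections[username] -= 1


limiter = PerUserLimiter(max_per_user=2)  # made-up limit for illustration
limiter.connect("hog")     # allowed
limiter.connect("hog")     # allowed
limiter.connect("hog")     # refused: "hog" is at the cap
limiter.connect("polite")  # allowed: other users are unaffected
```

The point is that one busy account can no longer eat every slot on the server; it gets refused at its own cap while everyone else keeps connecting.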
The next step is being beta tested right now. According to one of our NetOps guys, our MySQL workload is about 20% writes and 80% reads. He had the idea that doubling the number of platters (hard drives) in the MySQL server would decrease the amount of data that needs to be read from any one platter. We rolled this out onto one server already, which is our guinea pig, so to speak. (No, it isn’t a PowWeb server that we’re testing on.) So far, the results have been extremely promising. Kudos to NetOps for coming up with this.
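Here is some back-of-the-envelope arithmetic for why extra drives could help a read-heavy workload. It rests on a big assumption that is not confirmed anywhere in this post: that the drives are mirrored, so every write goes to every drive while reads are split evenly between them. The function name and the figure of 1,000 operations are made up for illustration.

```python
def per_drive_load(total_ops, read_percent, drives):
    """Estimate the operations each drive handles.

    Assumes mirroring (an assumption, not a confirmed setup):
    every write hits every drive, and reads are split evenly.
    """
    reads = total_ops * read_percent / 100
    writes = total_ops - reads
    return writes + reads / drives


# 1,000 operations at 80% read, 20% write
print(per_drive_load(1000, 80, 1))  # 1000.0 on a single drive
print(per_drive_load(1000, 80, 2))  # 600.0 per drive with two drives
```

Under those assumptions, doubling the drives takes about 40% of the work off each one. The writes don’t get any cheaper, but since reads are most of the traffic, that is where the win would come from.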
Since the test was so successful, we went ahead and placed orders for the new drives for all of the other MySQL servers, and are waiting for them to arrive at our data center(s) to be installed. This will get MySQL to a point where it’s usable for everyone. There should be a huge improvement across the entire farm of MySQL servers, and everyone should be much happier with performance.
I’m not entirely clear on how they’re setting up these new drives, or how this is going to get around the number of connections that MySQL allows. I didn’t take the time to ask. Once I saw how much of an improvement it made on the server we’re testing it on, I didn’t care about the details.
Ideally, I’d love to tell you the exact time and date you can expect these changes to be in place. Realistically, we don’t have an exact date. All I can say is that it’ll be done as soon as humanly possible.
If you do, however, need to call in to support about slowness with your site, and you think it’s MySQL related, here’s my suggestion: When you get someone on the phone (or chat, or e-mail support), let them know immediately what the issue is. For example:
You: Hello Support Rep #41573, my site seems to be running extremely slow right now. I’m running a PHP based application with a MySQL backend.
Support Rep #41573: Just a moment, sir/ma’am, I’ll get someone to look into the MySQL server you’re on now.
I’m sure this goes without saying, and you all provide as much detail as possible to the Tier1 reps when you call in. However, there are still a large number of people who call in and don’t give that information. That one little sentence, “I’m running a PHP based application with a MySQL backend,” will save the Tier1 rep from running half a dozen tests (and keeping you on hold the entire time), and they’ll skip right to checking the MySQL server. This will ultimately result in someone fixing whatever issue is causing the slowness. (Usually it’s someone hogging resources on the MySQL server.)
I hope my summation helps, somewhat. I know it certainly doesn’t fix the issue, and doesn’t show any promise of having the issue fixed yesterday. But I assure you, as soon as we can, we’re going to fix this. For good.