What is happening with ReCom

Urgh, I can’t even find a good way of explaining this on ReCom as the thing just won’t load for me, so I guess I’ll have to do this here for now. If you’re a geek and you want to help me resolve this problem, read on.

Currently ReCom looks like this:

There seems to be a problem with the MySQL server, sorry for the inconvenience.

We should be back shortly.

What is actually happening behind the scenes can be explained very quickly by my correspondence with tech support:

Intermittent connectivity to MySQL database

Ticket Details
Ticket ID: LYS-909847
Department: Support Division
Status: Open
Priority: High
Created On: 06 May 2007 10:14 AM
Last Update: 18 May 2007 11:43 AM

SH Account Information
Domain Name: www.recom.org
Account Type: Shared Hosting

Ong Jiin Joo Posted On: 06 May 2007 10:14 AM


Dear Sir/Madam,

Recently, our website www.recom.org is experiencing intermittent connectivity to the database. This has been carrying on for 3 full days and we have not seen it resolve automatically.

Please investigate and get back to us urgently. This has been affecting our loyal customers seriously and we would like to solve the issue. Nothing has changed in our software so we don’t believe it is a software error. It is more likely that the database connection pool has been used up or something.

Please help!

Thanks,
-Jiin Joo

Nate H. Posted On: 06 May 2007 10:30 AM


Hello,

MySQL was up and running, however, not properly. I have restarted this service which has corrected this issue. Your site is now resolving properly without errors. Please let us know if we can further assist you in this matter.

Regards,
Nate

Ong Jiin Joo Posted On: 07 May 2007 10:01 AM


Hi Nate,

The problem went away after you restarted it, but now it’s back again. Can you please verify that it is the same problem and identify the root cause?

Very much appreciate it.

Thanks,
-Jiin Joo

Nate H. Posted On: 07 May 2007 10:36 AM


Hello,

The root cause of this was a high amount of MySQL connections. I have restarted the service again, which will bump those users on the databases of the server that are no longer active, but that are still connected due to persistent connections enabled in someone’s script. Unfortunately, there is not much we can do about this, however, it should not be a recurring issue. I apologize that this has happened twice within the past few days.

Regards,
Nate

Ong Jiin Joo Posted On: 08 May 2007 05:34 AM


Nate,

No luck. The behavior continues on our end.

We would like to help you identify whether it is our own scripts that are hogging all the connections. Can you dive in and find out this info? As I’m not a developer for this application, I need to gather all the information I can in order to advise the team. For example, it would help if you can identify the offending application and contact the owner of that application (if it happens to be recom.org, you should look for me), and either throttle that application at the connection pool level or increase the connection pool size if there are no hardware limitations.

This has been 5 days or more – shouldn’t it ring a bell in your data center already? I mean, there must be other applications running on this shared infrastructure right?

HELP~~~!!!

Thank you for your understanding.
-Jiin Joo

Mike Kr. Posted On: 08 May 2007 06:25 AM


Hello,

The problem is your script is hitting the maximum number of simultaneous user connections to MySQL (the global limit is not being reached which is why only your site is affected). The current user limit is 25. Here is a list:

root@sh78 [/home/recom24/public_html]# mysqladmin processlist | grep recom24
| 1158 | recom24_recom | localhost | recom24_main2 | Sleep | 69914 | | |
| 28836 | recom24_recom | localhost | recom24_main2 | Sleep | 24603 | | |
| 30340 | recom24_recom | localhost | recom24_main2 | Sleep | 23361 | | |
| 30681 | recom24_recom | localhost | recom24_main2 | Sleep | 23108 | | |
| 30724 | recom24_recom | localhost | recom24_main2 | Sleep | 23062 | | |
| 31423 | recom24_recom | localhost | recom24_main2 | Sleep | 22566 | | |
| 31465 | recom24_recom | localhost | recom24_main2 | Sleep | 22542 | | |
| 31936 | recom24_recom | localhost | recom24_main2 | Sleep | 22155 | | |
| 32382 | recom24_recom | localhost | recom24_main2 | Sleep | 21721 | | |
| 33182 | recom24_recom | localhost | recom24_main2 | Sleep | 20980 | | |
| 34834 | recom24_recom | localhost | recom24_main2 | Sleep | 19218 | | |
| 35271 | recom24_recom | localhost | recom24_main2 | Sleep | 18740 | | |
| 36316 | recom24_recom | localhost | recom24_main2 | Sleep | 17728 | | |
| 36963 | recom24_recom | localhost | recom24_main2 | Sleep | 17210 | | |
| 38499 | recom24_recom | localhost | recom24_main2 | Sleep | 15814 | | |
| 38727 | recom24_recom | localhost | recom24_main2 | Sleep | 15617 | | |
| 39002 | recom24_recom | localhost | recom24_main2 | Sleep | 15395 | | |
| 39129 | recom24_recom | localhost | recom24_main2 | Sleep | 15281 | | |
| 39262 | recom24_recom | localhost | recom24_main2 | Sleep | 15151 | | |
| 39265 | recom24_recom | localhost | recom24_main2 | Sleep | 15160 | | |
| 39266 | recom24_recom | localhost | recom24_main2 | Sleep | 15145 | | |
| 44598 | recom24_recom | localhost | recom24_main2 | Sleep | 9121 | | |
| 44630 | recom24_recom | localhost | recom24_main2 | Sleep | 9087 | | |
| 53130 | recom24_recom | localhost | recom24_main2 | Sleep | 747 | | |
| 53137 | recom24_recom | localhost | recom24_main2 | Sleep | 724 | |

This is usually caused by improper usage of persistent connections in PHP scripts. I have restarted the server again to clear out all of the stale connections, but this is an issue you should bring up with your developers.

Regards,
Mike

Okay, anyone got it? If not, read on.

ReCom runs phpNuke, in a highly modified version because of the large feature set that earlier active members required. Today the bulk of the traffic goes to the Forum and other associated communication tools. Recently the number of visitors (members and non-members alike) hitting the site has been increasing, to the extent that we’re now seeing a very noticeable peak load during scholarship season.

Unfortunately, due to the nature of shared hosting, each application is only allowed up to 25 connections to the database. Now these are concurrent connections, which means that when you visit the site, you don’t actually hold one of the connections hostage; you acquire one when you click on something and keep it only until the page is loaded. Under normal circumstances, the connection is then released back into the connection pool so that other requests can grab it.
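For the geeks, the acquire-and-release behaviour can be sketched in a few lines. This is only a toy model in Python, not ReCom’s PHP code: the class `ConnectionLimit` and both helper functions are made-up names, and the only figure taken from the ticket is the cap of 25.

```python
from contextlib import ExitStack, contextmanager


class ConnectionLimit:
    """Toy model of a cap on simultaneous database connections."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.in_use = 0

    @contextmanager
    def connection(self):
        # Take a slot for the duration of one page load, or fail if none is free.
        if self.in_use >= self.max_connections:
            raise RuntimeError("too many connections")
        self.in_use += 1
        try:
            yield
        finally:
            # Give the slot back once the page has been rendered.
            self.in_use -= 1


def load_pages_one_after_another(limit, page_loads):
    """Each page load borrows a connection and returns it before the next one starts."""
    served = 0
    for _ in range(page_loads):
        with limit.connection():
            served += 1
    return served


def load_pages_at_once(limit, page_loads):
    """All page loads hold their connection at the same moment."""
    served = refused = 0
    with ExitStack() as stack:
        for _ in range(page_loads):
            try:
                stack.enter_context(limit.connection())
                served += 1
            except RuntimeError:
                refused += 1
    return served, refused
```

With a limit of 25, `load_pages_one_after_another(limit, 1000)` serves all 1000 page loads, while `load_pages_at_once(limit, 30)` serves 25 and refuses 5. In this model the cap only bites when connections are held at the same moment.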

However, there’s a catch, which is the way PHP implements persistent connections. I’ll leave the academic exercise of understanding what persistent connections are to you. Our initial speculation was that our version of phpNuke uses them, i.e. when a page is done loading, the connection is not released by the application but handed to whichever thread handles the next page for another user. This is usually done to improve performance, as any Database Administrator (DBA) or Computer Science (CS) student would be able to tell you that blocking while you wait for a DB connection from the connection pool is usually a very expensive operation.

There is a downside to doing this: when you’re actually _not_ using the connection, the PHP engine might hold on to it for a specific timeout period before releasing it. So there is a chance that MySQL’s performance is actually worse, as none of the connections are available.
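To see why holding on to idle connections can hurt, here is a deliberately simplified Python model. It assumes that requests arrive strictly one at a time and are handed round-robin to a fixed set of web server workers, and that with persistent connections each worker keeps its connection open after the page is done. The function name and the worker count of 30 are made up for illustration; only the limit of 25 comes from the ticket.

```python
def simulate(requests, workers, limit, persistent):
    """Return (peak open connections, refused requests) for sequential traffic."""
    open_by_worker = set()  # workers currently holding an open connection
    peak = 0
    refused = 0
    for i in range(requests):
        worker = i % workers  # round-robin, never two requests at once
        if worker not in open_by_worker:
            if len(open_by_worker) >= limit:
                # No slot left, even though nobody is actively using one.
                refused += 1
                continue
            open_by_worker.add(worker)
        peak = max(peak, len(open_by_worker))
        if not persistent:
            # Ordinary connection: closed as soon as the page is finished.
            open_by_worker.discard(worker)
    return peak, refused
```

In this model, `simulate(300, 30, 25, persistent=False)` never has more than one connection open and refuses nothing, while `simulate(300, 30, 25, persistent=True)` climbs to 25 open connections and refuses 50 requests, although no two requests ever overlap.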

However, Luke has painstakingly gone through the code, and ReCom does not use persistent connections. This means that the snapshot that tech support gave us can only mean one thing: that at any one time there are actually more than 25 concurrent users loading a page. From a system architecture perspective, this is ridiculous. I run systems for thousand-man companies that load more than just a simple forum page, and yet they only need 5 concurrent connections. Or, to quote an example from the MySQL website:

As with all other configuration rules-of-thumb, the answer is “It depends.” While the optimal size depends on anticipated load and average database transaction time, the optimum connection pool size is smaller than you might expect. If you take Sun’s Java Petstore blueprint application for example, a connection pool of 15-20 connections can serve a relatively moderate load (600 concurrent users) using MySQL and Tomcat with response times that are acceptable.

(FYI, the Petstore application is just a generic application. Think of buying pets online on a typical e-commerce site: it should have a higher browsing rate than ReCom, where people tend to stop and read instead of looking at pet pictures.)
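The reason a small pool can serve many users is a back-of-envelope rule known as Little’s law: the average number of connections in use is the rate of page loads multiplied by how long each page holds its connection. A minimal sketch in Python follows; the function names are made up and the numbers in the usage note are purely hypothetical, not measurements from ReCom.

```python
import math


def connections_needed(page_loads_per_second, seconds_holding_connection):
    """Average number of connections in use, rounded up (Little's law)."""
    return math.ceil(page_loads_per_second * seconds_holding_connection)


def max_page_loads_per_second(connection_limit, seconds_holding_connection):
    """Highest average page-load rate a given connection limit can sustain."""
    return connection_limit / seconds_holding_connection
```

For example, 40 page loads per second that each hold a connection for 0.25 seconds need about 10 connections on average, and a limit of 25 connections at 0.25 seconds per page tops out at roughly 100 page loads per second. These are averages, so short bursts can still exceed the limit.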

I don’t want to be the one to “declare” that ReCom is really getting more popularity than it can handle, because I haven’t done the diagnostics myself (yet?), but I do think the ReCom Anchors owe our many loyal members an explanation. This is as detailed as I can be for now, and with this I’m begging for volunteers to help me with this issue. Basically we need to either track down exactly how phpNuke closes connections, in case there are alternate code paths in the system that cause connections to be left open (*gasp*), or ascertain (i.e. collect statistics on) the right sizing for the servers required to handle the load. There is also a remote possibility that our web hosting company simply sucks (e.g. it runs a buggy version of MySQL or PHP), but with no facts at hand we can’t decide on the best course of action yet.
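As a starting point for collecting statistics, here is a minimal Python sketch that summarises rows in the format tech support pasted from `mysqladmin processlist`. For a row in the Sleep state, the Time column is the number of seconds the connection has been idle. The function name and the one-hour threshold for calling a connection stale are assumptions for illustration.

```python
def summarize_processlist(text, user, stale_after_seconds=3600):
    """Count sleeping connections for one MySQL user and how long they have idled."""
    idle_times = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Expected layout: | Id | User | Host | db | Command | Time | State | Info |
        if len(cells) < 7 or not cells[1].isdigit() or not cells[6].isdigit():
            continue  # skip headers, separators and anything unparseable
        if cells[2] != user or cells[5] != "Sleep":
            continue
        idle_times.append(int(cells[6]))
    return {
        "sleeping": len(idle_times),
        "stale": sum(1 for t in idle_times if t >= stale_after_seconds),
        "longest_idle_seconds": max(idle_times, default=0),
    }
```

Fed the listing from the ticket with `user="recom24_recom"`, this would report how many of our connections are asleep and for how long, which is the kind of number we need before deciding between fixing code and buying more capacity.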

Thank you for reading.

P.S. I know there are a lot of alternative ideas, e.g. let’s switch to another CMS, let’s visit another forum, etc. Let me assure you that since this is a technical issue, we will treat it as a technical issue, or should I say, we should treat it as a technical issue. Moreover, ReCom’s platform is a multi-year investment, and is not easily replicated by just “downloading” another CMS and pressing the play button. I’ll organize system architecture classes if you want to understand more. 😛


3 Responses

  1. Haha there were a couple of interviews run in the newspapers this year on the Malaysians who got into top universities in the US, and some of them mentioned ReCom during the interviews so that might have caused the traffic to spike! Hahaha

  2. A short-term solution would be to upgrade your hosting service, perhaps moving it to a dedicated server or something? A long-term solution would, alas, involve going through the code to optimise it. (Just two cents from a half-assed webmaster who doesn’t actually know all that much about PHP/mySQL.)

  3. We’ve moved 3 times since inception 3 years ago. It is a non-partisan platform with no big $$$ sponsors, so we’re limited in getting more iron. Optimization is a huge tradeoff when you’re using an open source software. You either stick close to the main branch (to pickup useful updates from the community) or put it upon yourself to do everything right. Problem is – ReCom is neither here nor there at this stage…
