Error monitoring/handling on webservers

We have a web server that we're about to launch a number of applications onto. They will all share database and memcached servers, but each application has its own MySQL database, and all memcached keys are prefixed per application.

Possible scenario:

If a memcached server in our cluster goes down, we want someone (the operations/system admin) to be contacted automatically by email, iPhone push notification, or any other appropriate channel.

If we were to install 150 identical applications for our customers on our servers and a memcached server died, all 150 applications would each detect this and contact our system admin, who would most certainly start thinking about a new job where he or she isn't woken up by 150 messages at 4:15 in the morning.

Possible solution:

One idea is to set up an external error-handling server that receives each error as an HTTP POST (sent via cURL, for example) and stores the message according to its severity. On receiving a report, it would check whether the same memcached server has already been reported as offline; if so, there would be no need to spam the system admin with additional reminders.
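To make the deduplication idea concrete, here is a minimal sketch in Python (chosen only for brevity; the same logic ports directly to PHP). The class name `AlertDeduplicator`, the `resend_after` interval, and the resource name `memcached-01` are all hypothetical, and the state lives in memory, whereas a real collector would need persistent storage and an HTTP endpoint in front of it.

```python
import time
from typing import Callable, Dict


class AlertDeduplicator:
    """Collects error reports and notifies at most once per failing resource.

    Hypothetical helper for illustration: state is kept in memory only.
    """

    def __init__(
        self,
        notify: Callable[[str], None],
        resend_after: float = 3600.0,  # assumed reminder interval, in seconds
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._notify = notify
        self._resend_after = resend_after
        self._clock = clock
        self._last_sent: Dict[str, float] = {}  # resource -> time of last alert

    def report(self, resource: str, message: str) -> bool:
        """Record a report; return True if a notification was actually sent."""
        now = self._clock()
        last = self._last_sent.get(resource)
        if last is not None and now - last < self._resend_after:
            return False  # already reported recently, stay quiet
        self._last_sent[resource] = now
        self._notify(f"{resource}: {message}")
        return True

    def resolve(self, resource: str) -> None:
        """Mark a resource as healthy again so the next failure alerts."""
        self._last_sent.pop(resource, None)


# 150 applications report the same dead server; the admin hears about it once.
sent = []
dedup = AlertDeduplicator(notify=sent.append)
for app_id in range(150):
    dedup.report("memcached-01", f"connection refused (reported by app {app_id})")
```

Keying on the failing resource rather than on the reporting application is what collapses the 150 reports into one notification.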

The questions:

  • What's a good approach to handling errors like this?
  • How do the big players in the industry handle this?

Thanks!

Best answer

You might consider using an open source monitoring framework such as Hyperic so you don't need to reinvent the wheel.

Hyperic can monitor many aspects of your system out of the box, and it's pretty easy to plug in new monitoring points. It provides rule-based alerting, and you can configure which types of alerts fire once only until reset and which fire every time the condition occurs.

I have not used it to monitor a PHP app (though I presume it can), but I have used it very successfully to monitor a Java app and its associated MySQL DB.

Other answers

Well, I think your problem is best solved outside of the application.

You want to monitor physical servers and the services running on them. I'd recommend something like http://www.nagios.org/ or http://www.opennms.org/. Set it up to watch each memcached server, MySQL server, Apache, etc., and send notifications on state changes (down, low resources, recovery, etc.).
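As a rough illustration of what "notify on state change" means, the following Python sketch polls a list of targets from one central place and only speaks up when a target flips between up and down. It is not how Nagios or OpenNMS are implemented; `tcp_check`, `StateChangeMonitor`, and the host name in the usage note are hypothetical, and a plain TCP connect is only a crude stand-in for a real service check.

```python
import socket
from typing import Callable, Dict, Iterable, Tuple

Target = Tuple[str, int]  # (host, port)


def tcp_check(target: Target, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection(target, timeout=timeout):
            return True
    except OSError:
        return False


class StateChangeMonitor:
    """Polls targets and notifies only on up/down transitions."""

    def __init__(
        self,
        check: Callable[[Target], bool],
        notify: Callable[[str], None],
    ) -> None:
        self._check = check
        self._notify = notify
        self._up: Dict[Target, bool] = {}  # last known state per target

    def poll(self, targets: Iterable[Target]) -> None:
        for target in targets:
            is_up = self._check(target)
            # Assumption: a target is treated as healthy before its first poll.
            was_up = self._up.get(target, True)
            if is_up != was_up:
                state = "RECOVERY" if is_up else "DOWN"
                self._notify(f"{state}: {target[0]}:{target[1]}")
            self._up[target] = is_up
```

A scheduler (cron, or a loop with a sleep) would call something like `StateChangeMonitor(tcp_check, print).poll([("cache1.example", 11211)])` every minute or so. Because the check runs in one place instead of inside each application, a dead memcached server produces one DOWN message and later one RECOVERY message, regardless of how many applications depend on it.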
