Information Is Everything

Software-as-a-Service developers are rightly worried about uptime and performance.  “How will my system perform when I deploy it?”  “What will happen during peak periods?”  “How will my system scale?”

Without the right tools we cannot manage performance or uptime.  We need information to make the right decisions and then to check if those decisions were indeed right after all.

That’s why we have employed the best tools to manage the C Infinity private cloud.  Let’s start with our virtualisation platform.  If your virtual machine suddenly requires more resources, our monitoring system will detect this within 180 seconds.  The monitoring system is integrated with our virtualisation management system, so the management system is informed of the issue and immediately initiates a move of the virtual machine to hardware that is better suited to hosting it.  There is no downtime and no manual intervention is required.  Any alert that was raised is resolved automatically.
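For the techies, the flow above can be sketched in a few lines of Python.  This is only an illustration, not our production code: the `VirtualisationManager` class, the 90% threshold and the "most spare capacity" placement policy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    vm: str
    reason: str
    resolved: bool = False


@dataclass
class VirtualisationManager:
    """Hypothetical stand-in for a virtualisation management system."""
    placement: dict       # VM name -> host name
    free_capacity: dict   # host name -> spare CPU percent (assumed metric)

    def migrate(self, vm: str) -> str:
        # Pick the host with the most spare capacity (illustrative policy only).
        target = max(self.free_capacity, key=self.free_capacity.get)
        self.placement[vm] = target
        return target


def handle_sample(vm, cpu_percent, manager, threshold=90.0):
    """Raise an alert for a starved VM, move it, then resolve the alert."""
    if cpu_percent < threshold:
        return None  # healthy sample: nothing to do
    alert = Alert(vm=vm, reason=f"CPU at {cpu_percent:.0f}%")
    manager.migrate(vm)     # the move happens without manual intervention
    alert.resolved = True   # the alert clears once the move is done
    return alert


# Made-up hosts and numbers, purely for illustration.
manager = VirtualisationManager(
    placement={"web01": "hostA"},
    free_capacity={"hostA": 5.0, "hostB": 60.0},
)
alert = handle_sample("web01", 97.0, manager)
print(alert, manager.placement)
```

The point of the sketch is the ordering: detection raises the alert, the management system acts on it, and the alert is closed by the same automated path that opened it.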

What about performance issues caused by growth?  Tracking these normally requires labour-intensive baseline charts to be captured and compared with other reports over time.  It’s a slow, time-consuming process that usually isn’t done.  Then, when a system grows and suddenly performs slowly, we don’t know exactly why.  Resources are added by guesswork until the system performs well again.  This is both expensive and too late.  The system has already been slow for the end users, who may have become alienated and taken their business elsewhere.

The C Infinity monitoring system monitors not only health but also performance.  All of this is stored in a reporting database that retains just over one year of data.  That allows us to run reports comparing how a system is behaving now with X users against how it was performing a year ago with Y users.
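As a rough sketch of that kind of comparison, the Python below takes an average CPU figure and a user count for each period and works out how the load per user has changed.  The function names and the sample numbers are made up for illustration; they are not taken from our reporting database.

```python
def cpu_per_user(avg_cpu_percent: float, users: int) -> float:
    """Average CPU percent consumed per user."""
    if users <= 0:
        raise ValueError("users must be positive")
    return avg_cpu_percent / users


def compare_periods(then: tuple, now: tuple) -> dict:
    """Compare two (avg_cpu_percent, users) snapshots taken a year apart."""
    then_cpu, then_users = then
    now_cpu, now_users = now
    return {
        "user_growth": now_users / then_users,
        "cpu_growth": now_cpu / then_cpu,
        # Above 1.0 means each user costs more CPU than a year ago.
        "cost_per_user_change": (
            cpu_per_user(now_cpu, now_users) / cpu_per_user(then_cpu, then_users)
        ),
    }


# Hypothetical figures: Y = 500 users a year ago, X = 1000 users now.
result = compare_periods(then=(20.0, 500), now=(45.0, 1000))
print(result)
```

If the per-user cost is creeping upwards, the system is scaling worse than linearly, which is worth knowing well before the end users notice.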

In fact, we’ve just run a series of reports for all of our hosting customers who are using our monitoring system.  Here’s the CPU utilisation of one web hosting company:



In the report you can see the average CPU utilisation from January 1st until December 15th.  There are spikes to high utilisation rates but the standard deviation clearly shows that those spikes are very brief.  That’s quite acceptable.

We also ran a report on available memory (RAM).  This report is also from that web hosting company’s server and shows the available megabytes over the year:



This machine has 2GB of RAM and hosts a good number of web sites.  As you can see, it has averaged around 80% utilisation.  There has been a bit of a drop in available memory lately, but that’s because December is a busy time for them.
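Because the report plots available megabytes rather than a percentage, it helps to see how the two relate.  The short Python sketch below does the arithmetic for a 2GB machine; the function names are ours for this example, and it assumes 1GB = 1024MB.

```python
TOTAL_MB = 2 * 1024  # the 2GB machine from the report, assuming 1GB = 1024MB


def utilisation_percent(available_mb: float, total_mb: float = TOTAL_MB) -> float:
    """Convert an 'available megabytes' reading into percent of RAM in use."""
    if not 0 <= available_mb <= total_mb:
        raise ValueError("available_mb must be between 0 and total_mb")
    return (total_mb - available_mb) / total_mb * 100


def available_mb_at(utilisation: float, total_mb: float = TOTAL_MB) -> float:
    """Inverse: how many megabytes are free at a given utilisation percent."""
    return total_mb * (1 - utilisation / 100)


print(round(available_mb_at(80)))       # free MB at 80% utilisation
print(round(utilisation_percent(512)))  # percent used when 512MB is free
```

So 80% utilisation on this machine corresponds to roughly 410MB showing as available on the chart.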

Armed with this information both we and this customer can accurately predict what their resource requirements will be going into the future.

For us techies, those are nice reports.  What about something the business owner cares about?  Uptime is the big thing there.  Our monitoring has that covered too.  Consider a website.  What is website uptime?  Is it that the services are running?  Is it that the operating system is healthy?  No; website uptime means that the website is available and running correctly for the end users.  There is no way to monitor this on the server itself.  This requires a view from somewhere else.  Our monitoring system checks websites on a regular basis from a different location.  This confirms that they are up and running.
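A minimal sketch of such an outside-in check, using only the Python standard library, might look like the following.  It is not our monitoring product: the `website_is_up` function, the "expected text" test and the example URL are assumptions made for illustration, and the script would need to run from a machine other than the web server itself.

```python
import urllib.error
import urllib.request


def fetch(url: str, timeout: float = 10.0):
    """Return (status_code, body_text) for a GET request."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body


def website_is_up(url: str, expected_text: str, fetch_page=fetch) -> bool:
    """True only if the page loads AND contains the content users should see."""
    try:
        status, body = fetch_page(url)
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, HTTP error status, etc.
        return False
    return status == 200 and expected_text in body


if __name__ == "__main__":
    # Hypothetical site and marker text; replace with your own.
    print(website_is_up("https://www.example.com/", "Example Domain"))
```

Checking for expected content, rather than just a response, is what catches the case where the services are healthy but the website is still broken for visitors.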



Above is a report on a monitored website.  This website has had some “downtime”.  Some of it was planned maintenance, e.g. security patching, application maintenance by developers, etc., and some of it was genuine downtime.  For whatever reason, the website stopped working even though the services were healthy.  Our advanced techniques alerted both us and the SaaS company in question and an immediate response was initiated.  In this case, the website has had near-perfect uptime over the course of the year.
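To show why the split between planned and genuine downtime matters, here is a small Python sketch of one common way to calculate an uptime figure, where planned maintenance is excluded from the time the site was expected to be available.  This convention and the sample hours are assumptions for illustration, not the exact formula or data behind the report.

```python
def uptime_percent(total_hours: float, planned_hours: float,
                   unplanned_hours: float) -> float:
    """Uptime as a share of the hours the site was scheduled to be available."""
    scheduled = total_hours - planned_hours  # maintenance windows don't count
    if scheduled <= 0:
        raise ValueError("no scheduled hours left after planned maintenance")
    if not 0 <= unplanned_hours <= scheduled:
        raise ValueError("unplanned_hours must fit within the scheduled hours")
    return (scheduled - unplanned_hours) / scheduled * 100


# Hypothetical year: 8760 hours, 12 of planned maintenance, 2 of real downtime.
print(f"{uptime_percent(8760, 12, 2):.3f}%")
```

Counting planned maintenance as downtime would understate how reliable the site was for its users, so it is worth agreeing on the convention before comparing figures.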
