Following on from the post earlier that referred to “points of inflection” in your performance curves, which we also discuss over here in the comments on the excellent Dynatrace blog, I thought it was worth expanding this idea further.
Technically, a “point of inflection” is:
“a point on a curve at which the curvature (second derivative) changes sign. The curve changes from being concave upwards (positive curvature) to concave downwards (negative curvature), or vice versa”
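To make the formal definition concrete, here is a minimal sketch that looks for a sign change in the discrete second difference of evenly spaced samples. The function name `inflection_indices` and the sample numbers are my own illustration, not measurements from any real system, and real data would need smoothing before a check like this is useful.

```python
def inflection_indices(ys):
    """Return indices of ys where the discrete second difference changes sign.

    Assumes evenly spaced samples. Each reported index is the first sample
    whose curvature has the new sign; exact zeros are skipped.
    """
    found = []
    prev_sign = 0
    for i in range(1, len(ys) - 1):
        # Second difference: a stand-in for the second derivative.
        d2 = ys[i - 1] - 2 * ys[i] + ys[i + 1]
        sign = (d2 > 0) - (d2 < 0)
        if sign == 0:
            continue
        if prev_sign != 0 and sign != prev_sign:
            found.append(i)
        prev_sign = sign
    return found


# Hypothetical cumulative capacity as servers are added: the gain per
# server grows (1, 2, 3, 4) and then shrinks (3, 2, 1).
capacity = [0, 1, 3, 6, 10, 13, 15, 16]
print(inflection_indices(capacity))
```

In this made-up series the curve switches from concave upwards to concave downwards at index 4, which is where each extra server starts to add less than the one before it.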
For our purposes we use it to mean “the point at which things change from a positive scaling trend to a negative scaling trend”, and we can use this in two ways – for the scaling capacity curve, and for the actual application performance curve.
In the earlier blog post we were talking about the scaling capacity curve – the point where the “return on investment” from horizontal or vertical scaling begins to diminish or, to use the Excel example discussed in the Dynatrace blog comments, where it falls off a cliff…
“A simple analogy is Excel’s 65K row limit… you are “linearly scaling” up to 65K and then at 65K+1 (the “point of inflection”) it all breaks. Now you can upgrade to Excel 2007 and increase that limit 16x… but in a web world this equates to updating a major version of your database or web server and from personal experience of having updated both across a large web farm this involves lots of planning, application testing, extra “swing” servers etc. This takes quite some time. Time whilst your application performance is suffering and your customers are going elsewhere…”
The application performance curve is how the current application release (and the hosting platform) scales under load. By definition this is a “performance snapshot” at a particular moment in time, which is why you need to do, at a minimum, some baseline performance testing for every release. I have seen performance decimated by a “minor point release” because of a “trivial” database change that messed up the indexing and execution plan.
You can see on the graph below that the green line showing the number of successful tests starts to trend downwards at about the 18 minute mark, despite the load (purple line) still increasing. Concurrently the warnings (yellow) and errors (red) start to rise.
The system has reached saturation.
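If you have the raw numbers behind a graph like this, spotting the saturation point can be sketched in a few lines. This is only an illustration under my own assumptions: the function name `saturation_index`, the `sustain` parameter and the sample figures are hypothetical and are not taken from the test run shown here.

```python
def saturation_index(load, successes, sustain=2):
    """Return the index where successes start falling while load still rises.

    A drop only counts once it has persisted for `sustain` consecutive
    intervals, so a single noisy sample does not trigger it. Returns None
    if no such run is found. Assumes both lists have the same length.
    """
    run = 0
    for i in range(1, len(load)):
        if load[i] > load[i - 1] and successes[i] < successes[i - 1]:
            run += 1
            if run == sustain:
                return i - sustain + 1  # first sample of the falling run
        else:
            run = 0
    return None


# Hypothetical per-interval samples: concurrent users and successful tests.
load = [100, 200, 300, 400, 500, 600]
successes = [95, 190, 280, 270, 240, 150]
print(saturation_index(load, successes))
```

With these made-up samples the successful tests peak at index 2 and the sustained decline begins at index 3, even though the load keeps climbing.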
You can see this even more clearly in the results breakdown table below – the green “successful” tests decline from 19 minutes onwards, until at about 37 minutes (now under a constant load of 1000 concurrent users) there are no successful tests and only errors and warnings.
The difficulty, of course, is extrapolating the application performance curve (the “moment in time” snapshot that you can directly measure) into the scaling capacity curve (which, as the Excel example shows, might not be immediately obvious until you hit the boundary).
This is where your technical architect, system architect, database architect or CTO really earns their money – by understanding the limitations of the current system (application and platform) so that you have plenty of time to build a bridge to your new architecture before you fall off that cliff.