The library contains multiple controller types, each implementing a different throttling algorithm (or method). Although all throttling controllers essentially limit the number of HTTP requests coming through, each of them is designed to solve different technical and business problems. It is therefore essential to analyse the options and choose the one that best fits your needs.

Short Average - limits the number of calls to a specified number of requests per configured time interval. For example, when the threshold is configured as 20 calls per 2000 ms, then regardless of the number of incoming requests, only 20 calls per 2 seconds will be allowed on average; this is equivalent to 60 calls per 6 seconds or 600 calls per minute. At the same time, all 20 calls are allowed through regardless of whether they are spread evenly across the 2 seconds or arrive within a 35-millisecond interval. Although this type of throttling has a smoothing effect on the system load pattern, preventing "spikes" is not its main function; rather, it ensures that spikes are limited and that the overall number of calls coming through does not exceed the configured rate on average.

This is arguably the most useful, and thus most popular, type of throttling algorithm. It solves multiple problems at once without burdening the API client with accommodating the API's requirements. It is useful when you need to limit spikes and create a more or less consistent load on the system; it provides load predictability with the least negative effect on the client. It is one of the best methods for cutting off unexpected floods of requests.
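
The behaviour described for Short Average can be illustrated with a minimal sketch. It is not the library's implementation: it assumes a simple fixed-window counter, and the class name ShortAverageThrottle, the allow method and the explicit now_msec argument are hypothetical names chosen for illustration. It shows a burst of 30 requests arriving within 30 ms against a threshold of 20 calls per 2000 ms.

```python
class ShortAverageThrottle:
    """Hypothetical fixed-window counter: at most max_threshold calls per window."""

    def __init__(self, max_threshold, time_interval_msec):
        self.max_threshold = max_threshold
        self.time_interval_msec = time_interval_msec
        self.window_start = None  # start time of the current window, in ms
        self.count = 0            # calls allowed so far in the current window

    def allow(self, now_msec):
        # Open a new window when none exists or the current one has expired.
        if (self.window_start is None
                or now_msec - self.window_start >= self.time_interval_msec):
            self.window_start = now_msec
            self.count = 0
        # Calls are allowed until the threshold is reached, however tightly
        # they are packed inside the window.
        if self.count < self.max_threshold:
            self.count += 1
            return True
        return False


throttle = ShortAverageThrottle(max_threshold=20, time_interval_msec=2000)

# 30 requests arriving one millisecond apart.
burst = [throttle.allow(now_msec=t) for t in range(30)]
allowed_in_burst = sum(burst)

# The next window opens 2000 ms after the first request.
allowed_after_window = throttle.allow(now_msec=2000)
```

In this sketch the first 20 requests of the burst pass and the remaining 10 are blocked until the next window opens, which matches the "spikes are limited, not prevented" behaviour described above.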

Long Average - limits the number of calls to a specified number of requests per configured duration. This method limits the overall number of calls within a long period of time; it is also known as applying a penalty to the client for excessive use of the application. By and large this algorithm is very similar to the "Short Average" discussed above, but this throttle "punishes" a client that exceeds its threshold by denying it service for a certain time. The duration of the penalty depends on the number of excessive requests the client has made.
This type of throttling is well suited either to systems that require recovery time after heavy load or to a variety of business cases. One example is access to a log-in page.
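
The penalty idea behind Long Average can be sketched as follows. The text does not say how the penalty is computed, so this sketch assumes that each excessive request adds a fixed amount of blocked time; the class name LongAverageThrottle, the penalty_per_excess_msec parameter and the explicit now_msec argument are all hypothetical and are not taken from the library.

```python
class LongAverageThrottle:
    """Hypothetical sketch: each excessive request lengthens the penalty."""

    def __init__(self, max_threshold, duration_msec, penalty_per_excess_msec):
        self.max_threshold = max_threshold
        self.duration_msec = duration_msec
        # Assumed rule: a fixed amount of penalty per excessive request.
        self.penalty_per_excess_msec = penalty_per_excess_msec
        self.window_start = None
        self.count = 0
        self.blocked_until = 0  # time (ms) at which the penalty ends

    def allow(self, now_msec):
        # While the penalty is active, every further request extends it.
        if now_msec < self.blocked_until:
            self.blocked_until += self.penalty_per_excess_msec
            return False
        # Open a new counting window when none exists or the old one expired.
        if (self.window_start is None
                or now_msec - self.window_start >= self.duration_msec):
            self.window_start = now_msec
            self.count = 0
        if self.count < self.max_threshold:
            self.count += 1
            return True
        # First excessive request: block until the window ends plus one penalty.
        self.blocked_until = (self.window_start + self.duration_msec
                              + self.penalty_per_excess_msec)
        return False


throttle = LongAverageThrottle(max_threshold=3, duration_msec=1000,
                               penalty_per_excess_msec=500)

# Five requests in quick succession: three allowed, two excessive.
results = [throttle.allow(now_msec=t) for t in (0, 1, 2, 3, 4)]
blocked_until_after_burst = throttle.blocked_until

# Service resumes once the accumulated penalty has elapsed.
allowed_after_penalty = throttle.allow(now_msec=2000)
```

With these numbers the two excessive requests push the end of the penalty to 2000 ms, one second past the end of the counting window. A client that keeps retrying during the penalty would push it further, which is the "punishing" effect described above.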

Linear throttling - limits the number of calls by enforcing a consistent delay between consecutive requests. The delay equals the average interval per request, calculated as the configured timeIntervalMsec / maxThreshold.
All excessive requests are blocked. This type of controller is best suited to cases where the underlying resource exposed through the API must receive a limited and uniform load.
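
The delay calculation for Linear throttling can be sketched as below. The parameter names mirror timeIntervalMsec and maxThreshold from the text; the class name LinearThrottle, the allow method and the explicit now_msec argument are hypothetical, and the sketch is an illustration under those assumptions rather than the library's implementation. With 20 calls per 2000 ms, the enforced delay is 2000 / 20 = 100 ms.

```python
class LinearThrottle:
    """Hypothetical sketch: enforce a minimum delay between allowed requests."""

    def __init__(self, max_threshold, time_interval_msec):
        # Delay between consecutive requests: timeIntervalMsec / maxThreshold.
        self.min_delay_msec = time_interval_msec / max_threshold
        self.last_allowed = None  # time (ms) of the last allowed request

    def allow(self, now_msec):
        # Allow the request only if enough time has passed since the last
        # allowed one; otherwise block it.
        if (self.last_allowed is None
                or now_msec - self.last_allowed >= self.min_delay_msec):
            self.last_allowed = now_msec
            return True
        return False


throttle = LinearThrottle(max_threshold=20, time_interval_msec=2000)
min_delay = throttle.min_delay_msec

# Requests arriving at irregular times (in ms).
arrivals = [0, 35, 99, 100, 150, 200]
decisions = [throttle.allow(now_msec=t) for t in arrivals]
```

Here the requests at 0, 100 and 200 ms pass, while those at 35, 99 and 150 ms are blocked because they arrive less than 100 ms after the previous allowed request.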

Consider the three timelines below, each representing one of the throttling algorithms above. A red time marker denotes a blocked API request; a marker of any other color denotes a successful one. The timelines visualize the logical differences between the controllers.

service_timeline.png


Last edited Jun 3, 2014 at 4:21 AM by lennygran, version 4