Sledgeworx Software

Software, Projects, Consulting

    © Nicholas Sledgianowski 2022

    Load testing Overview

    [This is a draft from my in-progress handbook on running software in production]

    Load testing helps us estimate how much load our system can handle. That estimate helps us decide how much hardware we need, whether we need to switch to a more performant architecture, and how to prepare for a peak event like a big sale.
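
    As a rough illustration of the hardware question, here is a minimal sizing sketch. The per-host capacity of 200 req/s and the `target_utilization` parameter are assumptions for the example, not measurements from a real system.

```python
import math


def hosts_needed(peak_rps, per_host_rps, target_utilization=1.0):
    """Estimate how many hosts are needed to serve peak_rps.

    per_host_rps is the capacity a load test measured for one host.
    target_utilization (0 < value <= 1) leaves headroom: 0.7 means we only
    plan to run each host at 70% of its measured capacity.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    usable_rps = per_host_rps * target_utilization
    return math.ceil(peak_rps / usable_rps)


# Hypothetical numbers: a 1000 req/s peak and hosts measured at 200 req/s.
print(hosts_needed(1000, 200))       # no headroom
print(hosts_needed(1000, 200, 0.7))  # plan to run hosts at 70%
```

    The answer is only as good as the per-host number, which is what the load testing techniques in the rest of this post are meant to produce.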

    Natural vs Synthetic traffic 

    When we load test, the traffic has to come from somewhere. If we want to load test in preparation for a 10x peak of traffic during a sale, we need to get 10x our regular traffic from somewhere. An easy way to do it is to write a script that makes requests to our website. This is called synthetic traffic. It’s typically very predictable or targets a subset of the system’s functionality. Developing synthetic traffic takes real engineering time, but it is conceptually simple: you just write some code and prepare some test accounts.
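
    A script like that can be quite small. The sketch below uses only the Python standard library. The function names and the localhost URL are made up for illustration, and the request function is passed in so the same runner can drive any endpoint.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def run_synthetic_load(send_request, total_requests, concurrency):
    """Call send_request() total_requests times from a pool of threads.

    send_request is any zero-argument callable that performs one request
    and returns True on success.
    Returns (successes, failures, elapsed_seconds).
    """
    def attempt(_):
        try:
            return bool(send_request())
        except Exception:
            # A raised error (timeout, connection reset) counts as a failure.
            return False

    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(attempt, range(total_requests)))
    elapsed = time.monotonic() - start

    successes = sum(results)
    return successes, total_requests - successes, elapsed


def fetch_home_page():
    # Hypothetical target; point this at a test environment you own.
    with urllib.request.urlopen("http://localhost:8080/", timeout=5) as resp:
        return resp.status == 200
```

    Calling `run_synthetic_load(fetch_home_page, 1000, 20)` would send 1000 requests with up to 20 in flight at once. Dividing the request count by the elapsed time gives the achieved req/s.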

    Natural traffic is real traffic generated by customers who want to use your product. This traffic is great for testing your service because it is a 100% real example of your load. The issue with natural traffic is that it cannot be scaled easily. If you have a non-peak load of 100 req/s (requests per second), a load test is typically preparing you for a situation where you get 1000 req/s. Do you really want to use customers to test whether your service works with 10x as many users? No, because that is expensive and you’d rather have those customers buy things on a working site.

    Squeeze tests

    A squeeze test is a way to use natural traffic to load test your services. You modify your load balancer to gradually route all of your traffic to a single host. This only works if you run a multi-host system, and it carries the risk of causing an outage for actual customers. The last limitation is that the load you can put onto that host is capped by your current natural traffic. If you have 100 req/s of natural traffic, a squeeze test will only allow you to test 100 req/s on a single host. That’s great if you need 3 hosts to serve 100 req/s, but if one host can handle 200 req/s a squeeze test can’t find that limit. At bigger scales squeeze tests are very effective for evaluating how much a host can handle.
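
    To make the ceiling concrete, here is a small sketch of the load the squeezed host sees at each step. It assumes traffic starts evenly split and is shifted in equal increments. Real load balancers are configured in their own ways, so treat `squeeze_schedule` as an illustration of the arithmetic only.

```python
def squeeze_schedule(total_rps, host_count, steps):
    """Return the req/s the target host receives at each step of a squeeze.

    Starts from an even split (total_rps / host_count) and rises linearly
    until the target host receives all of total_rps.
    """
    if host_count < 2:
        raise ValueError("a squeeze test needs at least two hosts")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    even_share = total_rps / host_count
    increment = (total_rps - even_share) / steps
    return [even_share + increment * i for i in range(steps + 1)]


# 100 req/s of natural traffic across 4 hosts, squeezed in 3 steps.
print(squeeze_schedule(100, 4, 3))  # [25.0, 50.0, 75.0, 100.0]
```

    The last value can never exceed the total natural traffic, which is why a host that can take 200 req/s will not be pushed to its limit by 100 req/s of customers.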

    Replay Traffic

    Replay traffic is when you record actual customer requests and then replay them later to load test your services. You need a way to flag replay traffic so that it doesn’t actually change customer accounts or place orders. This can be done with an extra field that basically says “replay:true”, or you can run replay tests in a beta environment that is distinct from production.
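
    A minimal sketch of the flag approach follows. The request shape, the `place_order` handler, and the in-memory `orders` list are hypothetical stand-ins for whatever your service actually uses.

```python
def mark_as_replay(recorded_request):
    """Return a copy of a recorded request tagged as replay traffic."""
    return {**recorded_request, "replay": True}


def place_order(request, orders):
    """Handle an order request, skipping the side effect for replay traffic.

    orders is a plain list standing in for the real order store.
    """
    if request.get("replay") is True:
        # Replay traffic exercises the handler but must not create an order.
        return {"status": "ok", "replayed": True}
    orders.append(request["item"])
    return {"status": "ok", "replayed": False}


orders = []
recorded = {"customer": "c-123", "item": "keyboard"}
place_order(recorded, orders)                  # real request: order stored
place_order(mark_as_replay(recorded), orders)  # replay: nothing stored
print(orders)  # ['keyboard']
```

    The catch with this design is that every code path with a side effect has to check the flag, which is one reason a separate beta environment can be the simpler option.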

    Artificial Traffic 

    Artificial traffic is generated programmatically. It is easy to get started writing artificial traffic, but the hard part is typically covering all of your test cases. You quickly end up with lots of code to model all the possible uses of your system.
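
    One common way to keep that modeling manageable is a weighted mix of scenarios. The sketch below is an assumption-laden example: the scenario names and weights are invented, and in practice you would derive them from your own traffic logs.

```python
import random

# Hypothetical traffic mix: 70% browsing, 20% search, 10% checkout.
SCENARIO_WEIGHTS = {"browse_catalog": 70, "search": 20, "checkout": 10}


def build_request_plan(weights, total_requests, seed=0):
    """Pick a scenario name for each request according to the weights.

    A fixed seed makes the plan repeatable, so two load test runs send the
    same sequence of scenarios.
    """
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=total_requests)


plan = build_request_plan(SCENARIO_WEIGHTS, 1000)
print({name: plan.count(name) for name in SCENARIO_WEIGHTS})
```

    Each scenario name would then map to a function that performs the actual requests, so adding a new use of the system means adding one function and one weight.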

    You would think that if you had integration or end-to-end tests you could just run them really fast for load testing. If you are in that situation, take advantage of it, but load testing often has slightly different needs than integration tests that verify functionality. One problem that pops up when you try to repurpose your integration tests is that you need more user accounts to hit your req/s target.

    Test Accounts

    These are just artificially made accounts that are used as the identity of scripted artificial traffic. Many systems have different behavior for customers when they are logged in vs using the website anonymously. Depending on what you are preparing for you might want to focus on one or the other. You may also want to configure your test accounts in various ways. 

    At my first job we ‘created’ test accounts by filling out an Excel spreadsheet with the accounts we needed and sending it to the IT department to actually create the accounts. Unfortunately, the IT department couldn’t set all of the properties we needed, so we still had to manually configure our test accounts once we got them.

    Test accounts sometimes expire. At my current job test accounts usually expire after 30 days. We use these accounts constantly, so account expiration meant our integration tests were failing every week on some new account that had expired. Keep your eye on expiration times. You don’t want to spend weeks preparing 10,000 test accounts for the big load test, and then have them all expire because the date was pushed back a week.
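
    A quick check before the test date can catch this. The sketch below assumes you can export each account’s expiry date into a dict. The account ids and dates are invented for the example.

```python
from datetime import date, timedelta


def accounts_expiring_before(accounts, test_date, buffer_days=0):
    """Return ids of accounts that expire on or before test_date + buffer.

    accounts maps account id -> expiry date. buffer_days adds slack in case
    the load test gets pushed back.
    """
    cutoff = test_date + timedelta(days=buffer_days)
    return [acct_id for acct_id, expires_on in accounts.items()
            if expires_on <= cutoff]


# Hypothetical pool: accounts created on different days with a 30-day expiry.
pool = {
    "load-001": date(2020, 2, 1) + timedelta(days=30),
    "load-002": date(2020, 2, 10) + timedelta(days=30),
}
# The test is planned for March 1st; check with a one-week buffer.
print(accounts_expiring_before(pool, date(2020, 3, 1), buffer_days=7))
```

    Running this with a buffer as long as a plausible delay tells you which accounts to recreate before anyone is waiting on them.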

    Test account pools and generators 

    Having pools of test accounts is a common pattern. These are the accounts we prepared for feature A, those are the accounts for feature B, etc. It’s a good strategy, but it can get away from you as you start to have dozens of account pools for different purposes.

    Test account generators are either scripts or, preferably, APIs that create test accounts on request. Ideally, you want the ability to create a test account with any set of attributes your website supports. A big trap for these systems is when they allow you to create accounts programmatically, except for xyz functionality which has to be configured by hand. The goal is to be able to set up accounts automatically on demand. If you can do that, test accounts can become ephemeral and you don’t have to worry about them expiring.
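
    Here is one way the ephemeral pattern might look, assuming an account API with create and delete operations. `InMemoryAccountService` is a stand-in written for this example. A real generator would call your own account system instead.

```python
import uuid
from contextlib import contextmanager


class InMemoryAccountService:
    """Stand-in for a real account API; stores accounts in a dict."""

    def __init__(self):
        self.accounts = {}

    def create(self, attributes):
        account_id = f"test-{uuid.uuid4().hex[:8]}"
        self.accounts[account_id] = dict(attributes)
        return account_id

    def delete(self, account_id):
        self.accounts.pop(account_id, None)


@contextmanager
def ephemeral_account(service, **attributes):
    """Create a test account for the duration of a with-block, then delete it."""
    account_id = service.create(attributes)
    try:
        yield account_id
    finally:
        # Runs even if the test body raises, so accounts do not pile up.
        service.delete(account_id)


service = InMemoryAccountService()
with ephemeral_account(service, plan="premium", region="us") as account_id:
    print(account_id, service.accounts[account_id])
print(len(service.accounts))  # 0
```

    Because each test creates exactly the account it needs and removes it afterwards, there is no pool to maintain and nothing left around long enough to expire.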

    When to load test 

    The best times to load test are in preparation for peak events and when launching new features. 

    Peak events are times that you know your system will experience high traffic. It could be a marketing campaign launching on March 14th. It could be that you have seasonal traffic leading up to the 4th of July. You might have a new enterprise client onboarding 10,000 employees next month. 

    Feature launches can affect the performance of your system. Doing a bit of load testing before launching new functionality is a useful way to avoid gotchas.

    Running Load Tests

    Depending on which type of load test you are running, you will want to prepare differently. For a lightweight test like a squeeze test, you might just configure it to happen in your deployment pipeline. If it’s a big test for peak, you might want to organize a Gameday.

    Gamedays

    For big load tests you want to organize a bit ahead of time. It’s important to notify people who are on call or supporting the website so that they know what’s going on. If you don’t let them know, people may end up panicking trying to find out where all this load is coming from. “Are we under a DDoS attack?!”

    You want to avoid running your big load tests during busy times for your website. That might mean the load test happens at night. Try not to start too late or your people will be working after midnight. 

    Load tests like this will often find the weakest link in the system. If Service C is an essential part of the chain and can only handle 200 req/s, the load test can’t go past that. So you may end up hitting a failure point, ending the load test early, and going back to fix the weak point. Then you need to set up another Gameday.
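
    The weakest-link idea can be written down in a couple of lines. The sketch assumes every request passes through each service once, and the capacities for Service A and Service B are invented for illustration.

```python
def find_bottleneck(capacities_rps):
    """Return (service_name, capacity) for the lowest-capacity service.

    When every request has to pass through every service, the whole chain
    can serve no more than its slowest member.
    """
    name = min(capacities_rps, key=capacities_rps.get)
    return name, capacities_rps[name]


# Service C's 200 req/s is from the example above; A and B are hypothetical.
chain = {"Service A": 800, "Service B": 500, "Service C": 200}
print(find_bottleneck(chain))  # ('Service C', 200)
```

    After Service C is fixed, the next Gameday would be expected to stop at the next-lowest capacity, so plan for more than one round.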

    Posted on February 9, 2020; January 13, 2020. Author: nsledgianowski. Categories: Software Engineering. Tags: account pool, gameday, load testing, peak, squeeze test, stress test, traffic
