A new company is trying to make businesses’ software better by getting them to break it on purpose. Gremlin emerged from stealth today with the mission of making chaos engineering more accessible to companies around the globe.
Chaos engineering practices emerged from high-scale internet companies like Amazon and Netflix, which needed to know their systems could stand up to massive loads without breaking. To that end, engineers built software that deliberately stresses systems in various ways, with the aim of finding where they might break during a real-world incident. Armed with that knowledge, companies can fix problems before they cause critical outages.
Gremlin wants to scale that practice by providing services that let companies set up stress tests with the press of a button. Using Gremlin’s services, businesses can set up agents that simulate things like intermittent network issues, servers with overtaxed processors, and DNS failures. Administrators can schedule those issues and others, or have agents randomly kick in on a particular environment during a specified range of time.
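To make the "randomly kick in during a specified range of time" idea concrete, here is a minimal Python sketch of how such a schedule could be planned. It is not Gremlin's API or implementation: the fault labels, the `PlannedFault` class, and the `plan_random_fault` function are all hypothetical names invented for illustration, and the sketch only picks a fault and a start time without injecting anything.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical labels mirroring the fault examples in the article;
# these are illustrative names, not Gremlin identifiers.
FAULT_TYPES = ("network_latency", "cpu_stress", "dns_failure")


@dataclass(frozen=True)
class PlannedFault:
    environment: str  # e.g. "staging"
    fault: str        # one of FAULT_TYPES
    start: datetime   # when the fault would begin


def plan_random_fault(environment, window_start, window_end, rng=None):
    """Pick one fault type and a random start time inside the window."""
    if window_end <= window_start:
        raise ValueError("window_end must be after window_start")
    if rng is None:
        rng = random.Random()
    window_seconds = (window_end - window_start).total_seconds()
    # Uniformly random offset from the start of the window.
    offset = timedelta(seconds=rng.uniform(0, window_seconds))
    fault = rng.choice(FAULT_TYPES)
    return PlannedFault(environment, fault, window_start + offset)


if __name__ == "__main__":
    start = datetime(2017, 1, 1, 9, 0)
    plan = plan_random_fault("staging", start, start + timedelta(hours=8))
    print(plan)
```

Passing a seeded `random.Random` as `rng` makes the plan reproducible, which can help when a team wants to replay the same experiment after fixing a weakness.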
Gremlin cofounder and CEO Kolton Andrus said in an interview with VentureBeat that he recommends companies start with an internal testing or staging environment, where the blast radius of a failure triggered by the software’s stress tests would be fairly limited. As businesses harden their systems, they can extend Gremlin’s reach. If something goes sideways, the software has an undo button that lets customers put things back the way they were.
Netflix, one of…