Challenge
Moving from a monolith to microservices in 2014 "solved a problem on the development side, but it pushed that problem to the infrastructure team," says Kevin Lynch, Staff Engineer on the Site Reliability team at Squarespace. "The infrastructure deployment process on our 5,000 VM hosts was slowing everyone down."
Solution
The team experimented with container orchestration platforms and found that Kubernetes "answered all the questions that we had," says Lynch. The company began running Kubernetes in its data centers in 2016.
Impact
Since Squarespace moved to Kubernetes, in conjunction with modernizing its networking stack, deployment time has been reduced by almost 85%. Previously, a VM deployment would take half an hour; now, says Lynch, "someone can generate a templated application, deploy it within five minutes, and have actual instances containerized, running in our staging environment at that point." Because of that, "productivity time is the big cost saver," he adds. "When we started the Kubernetes project, we had probably a dozen microservices. Today there are twice that in the pipeline being actively worked on." Kubernetes has also improved resilience: "If a node goes down, it's rescheduled immediately and there's no performance impact."
Behind the scenes, though, the company's monolithic Java application was making it hard for its developers to keep improving the platform. So in 2014, the company decided to "go down the microservices path," says Kevin Lynch, Staff Engineer on Squarespace's Site Reliability team. "But we were always deploying our applications in vCenter VMware VMs [in our own data centers]. Microservices solved a problem on the development side, but it pushed that problem to the infrastructure team. The infrastructure deployment process on our 5,000 VM hosts was slowing everyone down."
After experimenting with another container orchestration platform and "breaking it in very painful ways," Lynch says, the team began experimenting with Kubernetes in mid-2016 and found that it "answered all the questions that we had." Deploying it in the data center rather than the public cloud was their biggest challenge, and at the time, not a lot of other companies were doing that. "We had to figure out how to deploy this in our infrastructure for ourselves, and we had to integrate it with our other applications," says Lynch.
At the same time, Squarespace's Network Engineering team was modernizing its networking stack, switching from a traditional layer-two network to a layer-three spine-and-leaf network. "It mapped beautifully with what we wanted to do with Kubernetes," says Lynch. "It gives us the ability to have our servers communicate directly with the top-of-rack switches. We use Calico for