Dyn is a cloud-based Internet Performance Management (IPM) company that provides unrivaled visibility into and control over cloud and public internet resources. Dyn’s platform monitors, controls, and optimizes applications and infrastructure through Data, Analytics, and Traffic Steering, ensuring traffic gets delivered faster, safer, and more reliably than ever.
In many organizations, temporary fixes become permanent solutions. At Dyn, for example, our technical operations teams had a manual process in place that relied on a script that wrote data to a file. This file was then emailed to relevant people in the organization, and it was someone’s task to check it sporadically and alert everyone when necessary.
“We’ve all been there. These ‘temporary’ solutions, which initially took 10 minutes to deploy, sometimes last for years in the company because properly monitoring every single thing consumes a large amount of time and resources.”
Before we began to use Logz.io, we would manually parse daily, unreliable hardware status reports to find symptoms of potential hardware faults before they occurred. This consumed our staff’s time, and we determined that our practices simply weren’t efficient and weren’t going to scale with our growing needs and our ultimate focus on delivering reliable services.
As a growing company going through tough regulatory certification processes, it was critical for us to have a scalable log analytics system in place. We’ve always been fans of open source products, and of the ELK Stack in particular, so when we met Logz.io at DevOpsDays Boston, we were thrilled to discover a platform that lets us enjoy the benefits of the open source ELK Stack while taking away the headache of deploying and maintaining it.
“We were also happy to find that Logz.io is not only SOC 2 certified but also fitted to handle our scale and our continued growth. Logz.io provides us not only with the ELK Stack but also with the necessary training and support. This was critical for us when we deployed the product throughout our organization.”
Since we already had an ELK Stack in place, the migration was smooth. It took us only a few sprint cycles to ship our files to Logz.io and generate visualizations and alerts based on the content of those files.
Logz.io quickly became a service that we use across all parts of our organization. We currently have more than 130 users who log in constantly. Users include our security groups, developers, product managers, Network Operations and Engineering teams, and technical support teams.
The first two teams to be trained were our NOC and Sustaining Engineering teams, the ones who most commonly looked through logs by hand. We then brought in the engineering teams. Lastly, our customer service teams were trained and began building dashboards and customer-specific searches for fast issue resolution.
We are now alerting on a critical facet of our business that we previously had limited insight into. Our Network Operations team now has a more reliable, scalable methodology built around log ingestion and analysis to detect critical system degradations and failures before a minor issue becomes a potentially impacting incident. Once our logs were ingested into Logz.io, we could easily parse the syslog messages from our 18 global data centers. We can also set up alerts that proactively notify us when reliable operating conditions are compromised.
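To illustrate the kind of syslog parsing and alerting described above, here is a minimal Python sketch. It is not Dyn's or Logz.io's implementation. The RFC 3164-style line format, the sample messages, the host names, the severity cutoff, the threshold, and the function names `parse_line` and `hosts_to_alert` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed RFC 3164-style layout: "<PRI>Mmm dd hh:mm:ss host tag: message"
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^:\s]+): "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of syslog fields, or None if the line does not match."""
    match = SYSLOG_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    pri = int(fields.pop("pri"))
    # The PRI value encodes facility * 8 + severity.
    fields["facility"] = pri // 8
    fields["severity"] = pri % 8
    return fields


def hosts_to_alert(lines, max_severity=3, threshold=2):
    """Return hosts with at least `threshold` messages at or above 'error'.

    Lower severity numbers are more urgent (0 = emergency, 3 = error).
    """
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None and record["severity"] <= max_severity:
            counts[record["host"]] += 1
    return sorted(host for host, n in counts.items() if n >= threshold)


# Hypothetical sample messages, invented for this example.
SAMPLE = [
    "<3>Oct 11 22:14:15 edge01 kernel: EDAC MC0: 1 CE memory read error",
    "<3>Oct 11 22:15:02 edge01 kernel: ata1: hard resetting link",
    "<6>Oct 11 22:15:09 edge02 sshd[412]: Accepted publickey for ops",
    "not a syslog line",
]

print(hosts_to_alert(SAMPLE))  # ['edge01']
```

In a hosted setup, this kind of parsing and thresholding would typically live in the log platform's parsing and alert rules rather than in a standalone script. The sketch only shows the underlying idea: structure each message, then alert on a per-host count of severe events.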