How Bazaarvoice Monitors 5.4 Billion Monthly Page Views with Logz.io

About Bazaarvoice

Bazaarvoice connects brands and retailers to consumers so that every shopping experience feels personal. From search and discovery to purchase and advocacy, Bazaarvoice’s solutions reach in-market shoppers, personalize their experiences, and give them the confidence to buy. Each month in the Bazaarvoice Network, more than 900 million consumers view and share authentic content including reviews, questions and answers, and social photos across 5,700 brand and retail websites. Across the network, Bazaarvoice captures billions of shopper signals monthly – data that powers high-efficiency digital advertising and personalization with unmatched relevance.

Wanted – a flexible and scalable logging solution!

Log data lies at the base of an elaborate and extensive monitoring system that Bazaarvoice built to keep tabs on the company’s various services. This system produces a large and growing number of events per day, all of which are tracked and logged by various teams at the company.

Bazaarvoice’s logging system experiences large, periodic spikes in log volume. Under the pricing model of the team’s previous logging tool, these spikes exacted an increasingly high cost from the company. Bazaarvoice began searching for a more cost-efficient logging solution and, after evaluating various alternatives, chose Logz.io as the company’s primary logging solution.

The key considerations in the selection process were the flexible overage plan that suited the nature of the company’s logging system, and the knowledge that the solution was scalable enough to handle the expected growth in log volume and stay cost-efficient at the same time. The fact that the solution was based on the ELK Stack and open source technology, coupled with the hands-on and dedicated support received by the team during the initial proof of concept process, made Logz.io the only viable option.   

“Logz.io stood out to us in the log analytics space for its scalability and flexibility, and the fact that it’s built on a popular open source platform means we can take advantage of a huge set of skills without needing to build anything proprietary,” says Joseph Poirier, Director of the Cloud Platform team at Bazaarvoice.

Distributed and high-volume logging

Bazaarvoice services run on an architecture that is distributed by nature. More than 50 different services, written primarily in Java but also in Scala and Python and deployed on thousands of Amazon EC2 instances, are logged and monitored using Logz.io. These services serve billions of page views a month and, as such, generate a huge amount of log data.

Log data is shipped into Logz.io using a log aggregation tool that was developed in-house and that was designed to give Bazaarvoice full control of the logging process. A number of grok parsing rules were configured to process the log data, resulting in well-structured log messages that are easier to query and analyze in Kibana.
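To illustrate what a parsing rule of this kind does, here is a minimal sketch in Python. Grok patterns are built on regular expressions with named captures, so a regex with named groups approximates the idea. The log line format, the field names, and the `parse_line` helper are all hypothetical; the source does not describe Bazaarvoice’s actual formats or rules.

```python
import re

# Hypothetical log format: "<ISO timestamp> <LEVEL> [<service>] <message>".
# This is an assumption for illustration, not Bazaarvoice's real format.
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?)\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<service>[^\]]+)\]\s+"
    r"(?P<message>.*)"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Hypothetical sample line.
sample = "2017-03-01T12:00:05Z ERROR [file-transfer] Upload failed for client=acme"
parsed = parse_line(sample)
print(parsed)
```

Once a line is split into fields such as `level` and `service`, a query can filter on those fields directly instead of searching the raw text.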

For reasons of data privacy and compliance, Bazaarvoice makes use of Logz.io’s multiple availability zones, keeping E.U. data in Europe and segregating it from other regions in the world.   

Improved troubleshooting and root cause analysis

Bazaarvoice services are developed by different development teams, each using Logz.io to search the generated log data for specific strings. The teams built Kibana visualizations and dashboards to monitor metrics such as service response times and to spot an abnormal number of response errors.

Image: A Kibana dashboard used to monitor Bazaarvoice’s file transfer service.

Joseph Poirier describes one example in which these dashboards played a key role in identifying and resolving an issue:

“After a difficult database migration for one of our production-supporting services, we wanted to make sure our applications were behaving normally. Our Logz.io dashboard helped us identify a spike in errors that wasn’t happening before – and exactly which line of code it was coming from. Moreover, we could drill down to see that it was only happening for one specific client. After fixing the problem, we watched in real time to verify the exception was no longer occurring.”

Using Logz.io’s built-in alerting mechanism, Bazaarvoice also put in place a notification system based on service-specific thresholds. An abnormal rate of errors, for example, triggers an alert and sends a notification to the relevant team. Both PagerDuty and HipChat are used as pre-configured endpoints for receiving and handling the log-based alerts triggered in Logz.io.
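The logic of a service-specific threshold can be sketched in a few lines of Python. This is not how Logz.io alerts are configured; it only illustrates the idea of comparing an observed error rate against a per-service limit and routing the result to an endpoint. The service names, threshold values, and the `check_service` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    max_error_rate: float  # fraction of requests allowed to fail
    endpoint: str          # hypothetical label for the notification target


# Hypothetical per-service configuration; the real values are not in the source.
THRESHOLDS = {
    "file-transfer": Threshold(max_error_rate=0.05, endpoint="pagerduty"),
    "reviews-api": Threshold(max_error_rate=0.01, endpoint="hipchat"),
}


def check_service(service, total, errors):
    """Return an alert dict if the error rate exceeds the threshold, else None."""
    threshold = THRESHOLDS.get(service)
    if threshold is None or total == 0:
        return None
    rate = errors / total
    if rate > threshold.max_error_rate:
        return {"service": service, "error_rate": rate, "notify": threshold.endpoint}
    return None


print(check_service("file-transfer", total=1000, errors=80))
print(check_service("file-transfer", total=1000, errors=10))
```

Keeping the thresholds in one table per service means each team can tune its own limit without touching the checking logic.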

Monitoring data utilization

Bazaarvoice ships approximately 60 GB of log data to Logz.io a day, while past data bursts have resulted in overages of up to 400 GB. The company therefore required a method for gauging how much data is being shipped at any given time.

With the help of Logz.io’s Support and Customer Success team, a data utilization monitoring system was put in place. It includes a dedicated Kibana dashboard and an alerting mechanism that fires on any unexpected spike in log volume. The system is based on utilization metrics shipped by Logz.io, which include the data volume used so far as well as the expected data volume at the current indexing rate.
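The relationship between used volume, indexing rate, and expected volume can be shown with a small worked example in Python. The source does not say how Logz.io computes its expected-volume metric, so the linear extrapolation below is an assumption, as are the function names and the 1.5x tolerance. Only the 60 GB daily baseline comes from the figures above.

```python
def projected_daily_volume_gb(used_gb, rate_gb_per_hour, hours_elapsed):
    """Extrapolate end-of-day volume, assuming the current rate holds (an assumption)."""
    remaining_hours = max(0.0, 24.0 - hours_elapsed)
    return used_gb + rate_gb_per_hour * remaining_hours


def is_spike(projected_gb, baseline_gb=60.0, tolerance=1.5):
    """Flag a projection above baseline * tolerance; the tolerance is hypothetical."""
    return projected_gb > baseline_gb * tolerance


# Normal day: 30 GB by noon at 2.5 GB/hour projects to the usual 60 GB.
normal = projected_daily_volume_gb(used_gb=30.0, rate_gb_per_hour=2.5, hours_elapsed=12.0)

# Burst: the same 30 GB by noon, but now indexing at 10 GB/hour.
burst = projected_daily_volume_gb(used_gb=30.0, rate_gb_per_hour=10.0, hours_elapsed=12.0)

print(normal, is_spike(normal))
print(burst, is_spike(burst))
```

Alerting on the projection rather than on the volume already used gives a team time to react before a burst turns into an overage.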

End result

Currently, more than 80 users across multiple development teams at Bazaarvoice use Logz.io to debug and monitor more than 50 of the company’s distributed services serving billions of end-users a month.

The volume of data involved, and the distributed sources generating it, required a logging solution that is not only cost-efficient, flexible and scalable, but also provides analysis capabilities enhancing visibility into the data.

Logz.io answers these needs for Bazaarvoice.  

Joseph Poirier, Director of Cloud Platform at Bazaarvoice, sums it up: “In a complex environment like ours, having a tool that gives us complete visibility and intelligent insights is invaluable to ensuring our operations run smoothly.”
