Like everyone else, my life for the last few months has become a never-ending stream of video calls. With Zoom calls (and the occasional Skype, Google Meet, or Microsoft Teams call) becoming the norm, I've noticed that the fans on my MacBook have been kicking in and sounding like a tiny jet trying to take off.
Zoom and Chrome are demolishing me.
Unfortunately, Activity Monitor isn't the best for working out what's triggering it directly, especially as I have terrible tab management in Chrome.
Zoom is definitely taking up a chunk of it. Many work-from-home colleagues and friends are getting the dreaded "Your CPU usage is affecting meeting quality" from the conferencing application. Add to this the immense bandwidth video conferencing uses (especially with multiple video feeds coming to your machine), and it's no wonder a lot of people see a slowdown.
The excess at-home demand for internet access over the last few months, no matter where you live, is also affecting that video quality as well. I cannot count the number of meetings in which at least one person has had to drop off the call.
But there are mitigation strategies, which might be more pertinent to you depending on your machine and the quality of your web connection.
And as I work at an observability company, I may as well work out why, using the tools I have access to. Logs would be great for seeing what's going on, but they don't tell you how much system resource is being consumed. My running theory is that it's the CPU on fire that's triggering this.
Shipping Zoom Metrics to Logz.io
That leads me to believe that we should use metrics to see what's going on on my machine. In order to do that, I need to install the Metricbeat collector on my MacBook.
Since we’re talking Macbooks, let’s go through it the Homebrew way.
brew install elastic/tap/metricbeat-full
To install the -oss version, enter the following:
brew install elastic/tap/metricbeat-oss
Next you'll have to configure the yaml file. If you installed with Homebrew, it will be found at /usr/local/etc/metricbeat/metricbeat.yml.
If you want info for installing and configuring on another platform, check out our Metricbeat tutorial.
To get started, you'll need to change Metricbeat's main configuration file (<path to metricbeat>metricbeat.yml) to point its data at the Logz.io listener, using the Logz.io SSL certificate so information flows securely. The first step is adding your metrics shipping token (found here, towards the bottom of the page); we do this under the General section:
#================================ General =====================================
fields:
  logzio_codec: json
  token: <INSERT YOUR TOKEN HERE>
fields_under_root: true
Next, you'll need to tell Metricbeat where to send the data. To do this, configure the Logstash output to point at the Logz.io listener, using the SSL certificate we provide for a secure connection.
To grab the certificate, run the following command in your terminal:
sudo curl https://raw.githubusercontent.com/logzio/public-certificates/master/TrustExternalCARoot_and_USERTrustRSAAAACA.crt --create-dirs -o /etc/pki/tls/certs/COMODORSADomainValidationSecureServerCA.crt
Then you'll want to configure your output for the correct region; in my case that's listener.logz.io, and the Logstash endpoint uses port 5015.
#================================ Outputs =====================================
output.logstash:
  hosts: ["listener.logz.io:5015"]
  ssl.certificate_authorities: ['/etc/pki/tls/certs/COMODORSADomainValidationSecureServerCA.crt']
Once you enable the system module, you're ready to send data to Logz.io.
sudo metricbeat modules enable system  # Enables the system module
metricbeat run                         # Starts the collection of data
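Before leaving it running, it's worth letting Metricbeat check its own setup; the binary ships with self-test subcommands (this assumes the Homebrew install put metricbeat on your PATH):

```shell
# Validate the yaml we just edited
metricbeat test config

# Check the connection to the Logz.io listener (including the SSL certificate)
metricbeat test output
```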
To make this run automatically on startup, you'll need to set it to launch at login, using something like Automator.
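Alternatively, since we installed via Homebrew, you can let launchd manage it through brew services (a sketch, assuming you installed the elastic/tap/metricbeat-full formula):

```shell
# Start metricbeat now and relaunch it automatically at login
brew services start elastic/tap/metricbeat-full
```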
Once you've configured this and you can see system metrics appearing in your Metrics account (https://app.logz.io/#/dashboard/grafana/), it's time to make some modifications to the defaults.
To collect metrics for a specific application on my Mac, I initially assumed I'd have to add a new custom metricset and module to Metricbeat. You can find a guide to custom modules here; I even got as far as putting together some totally untested Golang to gather the metrics.
As it turns out, that's total overengineering: the system module does handle all the application processes on your machine, it just filters most of them out by default.
Normally, the preconfigured system module configuration (<path to metricbeat>modules.d/system.yml) only sends the top five processes by CPU and memory consumption, which isn't representative when you have a million Google Chrome Helper (Renderer) processes running on your machine at a time. It usually looks like this by default:
# Module: system
# Docs: https://www.elastic.co/guide/en/beats/metricbeat/7.7/metricbeat-module-system.html

- module: system
  period: 10s
  metricsets:
    - cpu
    - load
    - memory
    - network
    - process
    - process_summary
    - socket_summary
    #- entropy
    #- core
    #- diskio
    #- socket
    #- service
    #- users
  process.include_top_n:
    by_cpu: 5      # include top 5 processes by CPU
    by_memory: 5   # include top 5 processes by memory

- module: system
  period: 1m
  metricsets:
    - filesystem
    - fsstat
  processors:
  - drop_event.when.regexp:
      system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'

- module: system
  period: 15m
  metricsets:
    - uptime

#- module: system
#  period: 5m
#  metricsets:
#    - raid
#  raid.mount_point: '/'
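To see just how badly a top-five cutoff fits a Chrome-heavy machine, you can count the helper processes yourself; the bracketed [G] is a small trick that stops grep from counting its own process:

```shell
ps ax | grep -c "[G]oogle Chrome Helper"
```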
That means commenting out the “top_n” limitation and collecting everything.
# process.include_top_n:
#   by_cpu: 5      # include top 5 processes by CPU
#   by_memory: 5   # include top 5 processes by memory
processes: ['.*']
process.include_top_n.enabled: false
The processes line uses a regex wildcard, forcing the system module to report everything. The process.include_top_n.enabled: false line disables the sorting and limiting of the processes sent via the system module.
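If you'd rather not ship every process, the processes list accepts narrower regular expressions too; you can sanity-check a pattern against some candidate names with grep (a sketch, the process names here are just examples):

```shell
# Keep only Zoom and Chrome processes, drop everything else
printf 'zoom.us\nGoogle Chrome H\nmetricbeat\n' | grep -E 'zoom|Google Chrome'
# prints:
#   zoom.us
#   Google Chrome H
```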
Make sure to leave the metricsets in place; you'll want to see the overall system usage as well as the process-specific details.
With metrics in Logz.io we can start having a poke at the output…
Analyzing Zoom Metrics in Logz.io
Now that we have metric data available in the Metrics section of the Logz.io dashboard, we can start using Grafana to analyse it. When you first go to Grafana, you'll see the welcome dashboard, which shows the overall number of metric data points coming into the system. This isn't what we're looking for, but it does tell you it's all working.
What we're after in this case is the System Metrics dashboard; you can find it by clicking the Logz.io Dashboards button and scrolling down to System Metrics. Once there, you'll be presented with this dashboard:
Now we can see what's happening. Looking at the spikes in CPU, can you tell when I had a Zoom call? My memory consumption is pretty consistent, thanks to the number of Chrome instances, my IDE, and a couple of VMs I have running nearly constantly; the CPU my user is triggering, however, is definitely spiky just when I have to chat.
Zoom Metrics for CPU and Memory Usage
Let's have a look at how much of those spikes Zoom is responsible for. We could go straight to Grafana's Explore, but I personally want to see the application-specific data alongside the overall system information for better context. To do this, I'm going to duplicate the System Metrics dashboard and modify it to our current needs.
The fastest way to duplicate a Grafana dashboard is to go to the settings in the top navigation bar and click "Save As…". Once you're on your new dashboard, you'll see the "Add Panel" option at the top; this will pop up a new panel with an "Add Query" option.
You can see that my queries are both using process.name="zoom.us", which limits us to just Zoom metrics. I've then used Sum of system.process.cpu.total.norm.pct and Sum of system.process.memory.rss.pct to show the CPU consumption and memory consumption respectively.
You can definitely tell when the application is active. The memory spikes show when it opened, but the big killer is the huge spikes in CPU utilisation when my calls are on. Just a note: to get the percentage fields to display better (as they're decimals of 1), I used the Script field within the Options drop-down of the metric (a script along the lines of _value * 100) to scale them to a percentage out of 100.
Just for fun, I created a replica of this panel, but for Google Chrome, by changing the query to process.name="Google Chrome H". And you can definitely tell when I'm opening more and more tabs to follow along with my meetings (I personally find screen sharing hard to follow sometimes, so I go to the raw data to read at my own rate).
Chrome tabs actually aren't too intensive on CPU individually, but when you start having this many, they definitely show up in the memory consumption.
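If you want to sanity-check that cumulative memory figure outside of Grafana, you can sum the resident set size of every Chrome helper straight from ps (a rough sketch; rss is reported in kilobytes here, so we divide to get megabytes):

```shell
ps -axo rss,comm | grep "[G]oogle Chrome H" | awk '{ sum += $1 } END { print sum / 1024 " MB" }'
```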
You can very quickly see how easy it is to interrogate the data within Grafana to find out why certain things are happening. In my case, I can definitely see that Zoom is a huge contributor to the CPU utilisation that triggers my MacBook's fans to go bonkers trying to stay cool. We can also see how easy it is to display different data sets separately to get a decent side-by-side view of what's happening. With this data in mind, I should definitely start closing tabs during meetings, or even close some other background apps… but I doubt that'll happen.