This article explores integrating Google Pub/Sub with the ELK Stack, the world's most popular open source log analysis platform, for deeper analysis and investigation.

The integration described here includes the following steps: creating some sample data in Google Stackdriver, exporting this data to Google Pub/Sub, and using pubsubbeat, a community beat created by the folks at Google, to pull the data into the ELK Stack (or Logz.io).

The article assumes you have an ELK deployment running (Logstash is not required) or a Logz.io account, as well as a Google Cloud account and project. Instructions on installing the ELK Stack can be found in our ELK Guide. You can sign up for a 14-day free trial of Logz.io here.

What is Google Pub/Sub?

Google Pub/Sub is Google Cloud's message queuing service based, as the name implies, on the publish-subscribe model. The idea behind Google Pub/Sub, and other messaging middleware services, is to decouple the sender and receiver components in an application's architecture, making communication between them more resilient and secure.

Google Pub/Sub can be used for various use cases. For example, a website processing purchases could publish an order to a Pub/Sub topic to which a number of services are subscribed, each responsible for a different process (asynchronous workflows). Another example could be an application that sends notifications to different services based on a specific event being triggered (event-driven workflows).

Google Cloud users logging their cloud environment can use Google Pub/Sub to write log messages to multiple endpoints, each subscribed to the same topic.

Step 1: Generating some sample logs

Before we begin building our pipeline, we need some demo data to play around with. If you are already streaming data with Pub/Sub, skip to step 3.

For the sake of this tutorial, I'm going to use Google Cloud Functions to run a simple function that simulates request logs. We will then stream these logs from Google Stackdriver to a Pub/Sub topic, which we will subscribe to and pull from using pubsubbeat.

The function source code:
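A minimal sketch of such a function, assuming an HTTP-triggered Node.js function (named helloworld here) that writes one simulated request log entry per invocation; the field names are illustrative:

    // index.js
    // HTTP-triggered Cloud Function that emits a fake request log entry.
    // Anything written with console.log is picked up by Stackdriver Logging.
    exports.helloworld = (req, res) => {
      const entry = {
        method: req.method,
        path: req.path || '/',
        userAgent: req.get('user-agent'),
        status: 200,
        latencyMs: Math.floor(Math.random() * 500)
      };
      console.log(JSON.stringify(entry));
      res.status(200).send('OK');
    };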

And the package.json (for defining dependencies):
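For a demo function like the sketch above, the package.json can stay minimal; nothing beyond built-in modules is required (name and version are placeholders):

    {
      "name": "helloworld",
      "version": "1.0.0",
      "dependencies": {}
    }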


Using the HTTP trigger type, we can generate logs upon request by calling the URL.
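For example, assuming the function was deployed as helloworld, a request like the following triggers it (substitute your own region and project ID):

    curl https://<region>-<project-id>.cloudfunctions.net/helloworld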

Opening Google Stackdriver, I’m going to enter a filter to show only logs for the function we just created and called.
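A filter along these lines does the job, assuming the function name used above:

    resource.type="cloud_function"
    resource.labels.function_name="helloworld"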


Step 2: Exporting logs to Google Pub/Sub

Now that we have our demo log messages, we can start configuring our data pipeline into the ELK Stack.

First, we will create what is called a Stackdriver export sink. Stackdriver sinks support a number of export destinations, including Google Cloud Storage (a subject for a future article), Google BigQuery, and the service we will use for this pipeline, Google Pub/Sub.

To begin the process, click the Create Export button above the logs and edit your export. Give the new sink a name, select Cloud Pub/Sub as your sink service and then under Sink Destination either select an existing Pub/Sub topic or create a new one.


Clicking Create Sink creates the new export sink, and you will see a message informing you that a new Google IAM service account has been created with the correct permissions to write to the Pub/Sub topic.

Make note of this service account; you will need it in the next step.


Your next step is to subscribe to the topic.

This is easily done from the Google Pub/Sub console. Select the newly created topic and click Create Subscription at the top of the page. Enter a subscription name (e.g. logstash) and leave the default delivery type selected as Pull.


Hit create to subscribe to the topic.

Step 3: Downloading your IAM service key

If your Logstash instance is deployed outside Google Cloud, as is the case here, you will need to download a key for the IAM service account. If you are running Logstash on a Google Compute Engine machine, you can use Application Default Credentials and skip this step.

This is done via the Service accounts page in the Google IAM Console.

Locate the service account created when creating the export sink, open the options menu on the right-hand side, and select Create key. Leave the default setting as-is and create the key to download it to your machine.


Remember the location of this file as you will need to define its path for the beat configuration file.

Step 4: Installing and configuring pubsubbeat

As explained above, to pull messages from Google Pub/Sub into the ELK Stack, we will be using a community beat called pubsubbeat created by the kind folks at Google.

Beats are a family of log shippers used for shipping different types of data from a variety of platforms into the ELK Stack. Written in Go and more lightweight than other log aggregators used for shipping data into ELK (e.g., Logstash, Fluentd), beats are a reliable and easy-to-use shipping method.

First download the correct release version for your OS from the releases page:
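The releases are published in the pubsubbeat GitHub repository; for example, on Linux (the version and file name below are placeholders, so copy the exact link from the releases page):

    wget https://github.com/GoogleCloudPlatform/pubsubbeat/releases/download/<version>/pubsubbeat-<version>-linux-x86_64.tar.gz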

Extract the tar.gz file and cd into the directory:
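Assuming the same file name as above:

    tar -xzf pubsubbeat-<version>-linux-x86_64.tar.gz
    cd pubsubbeat-<version>-linux-x86_64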

Next, configure the beat’s YAML configuration file:
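A minimal sketch of the relevant pubsubbeat.yml settings, assuming the beat's options live under a pubsubbeat section, Elasticsearch is listening locally, and the project, topic, subscription, and key path values below are replaced with your own:

    pubsubbeat:
      project_id: my-gcp-project
      topic: my-stackdriver-topic
      subscription.name: logstash
      credentials_file: /path/to/service-account-key.json
      json.enabled: true
      json.add_error_key: true

    output.elasticsearch:
      hosts: ["localhost:9200"]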

Just to clarify:

  • credentials_file – the path to the service account key
  • project_id – your Google Cloud project ID
  • topic – your Google Pub/Sub topic name
  • subscription.name – your Google Pub/Sub subscription name
  • json.enabled – set to true to enable JSON decoding of the messages from Pub/Sub
  • json.add_error_key – set to true to add an error message when JSON objects are not parsed correctly

Save your configuration file, and before you start the beat give it root permissions:
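For example, by changing ownership of the configuration file to root:

    sudo chown root:root pubsubbeat.yml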

Start the beat with:
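For example (-c points at the configuration file and -e logs to stderr so you can watch the output):

    sudo ./pubsubbeat -e -c pubsubbeat.yml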

The beat will start and display debugging info. Stream some data by generating some logs (I created a cron job to call my function via the HTTP endpoint), and within a few minutes you should see a new 'pubsubbeat-*' index created in Elasticsearch.


Step 5: Shipping to Logz.io

If you are using Logz.io, you can ship data from Google Pub/Sub by applying some small tweaks to your pubsubbeat.yml file.

First, you will need to download an SSL certificate to use encryption:
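At the time of writing, Logz.io's Beats instructions used a certificate download along these lines (check the current Logz.io documentation for the up-to-date certificate location):

    sudo curl https://raw.githubusercontent.com/logzio/public-certificates/master/COMODORSADomainValidationSecureServerCA.crt --create-dirs -o /etc/pki/tls/certs/COMODORSADomainValidationSecureServerCA.crt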

Next, open Logz.io and retrieve your account token (under Settings → Account Settings).

Make the following adjustments to the pubsubbeat.yml configuration file:
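A sketch of the typical Beats-to-Logz.io output settings; the token placeholder is your account token, and the listener address and certificate path are those used in Logz.io's documentation at the time of writing:

    # Remove or comment out the output.elasticsearch section used earlier;
    # Beats allow only one active output at a time.
    fields:
      logzio_codec: json
      token: <ACCOUNT-TOKEN>
    fields_under_root: true

    output.logstash:
      hosts: ["listener.logz.io:5015"]
      ssl:
        certificate_authorities: ['/etc/pki/tls/certs/COMODORSADomainValidationSecureServerCA.crt']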

Start the beat again and the Stackdriver logs will be pulled from Pub/Sub and displayed in Logz.io within a few seconds.


Step 6: Analyzing Pub/Sub data in Kibana

The messages pulled from Google Pub/Sub in this case are formatted in JSON, so basic parsing of the fields is applied out of the box. Depending on the type of data you are streaming from Pub/Sub, though, you might want to parse out other elements in your messages to make analysis more efficient, by outputting the data to Logstash instead of directly into Elasticsearch.

The advantage of streaming data from Google Pub/Sub into the ELK Stack lies in the rich analysis and visualization capabilities it provides.

Start by adding some fields to the main display area from the list of available fields on the left — this gives us some visibility into the data shipped.


Kibana can be used to perform basic and advanced querying based on Lucene syntax. In our example we are shipping Google Cloud Functions logs and can use the following query to search for function invocations:
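The exact query depends on how the decoded Stackdriver fields are named in your index, but a simple Lucene phrase query for the standard Cloud Functions execution message is a reasonable starting point (you can scope it to a specific field, such as textPayload, once you see the parsed fields in Discover):

    "Function execution started"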


After saving the search, we can use it to create a simple metric visualization that counts how many invocations occurred.


Or perhaps the number of function invocations over time.


Endnotes

This article focused on building a data pipeline from Google Pub/Sub into the ELK Stack, an integration that provides you with the option to drill down deeper into your data for further investigation and analysis.

Of course, the analysis examples above are a very basic demonstration of what can be done with Kibana, and future articles will explore more advanced analysis.

It's worth pointing out that there is also a Logstash input plugin that can be used to pull from Google Pub/Sub. Beats are more lightweight, though, and if you do not require advanced parsing, they are the preferred option over Logstash.
