Analyzing Runkeeper Data with the ELK Stack


I know.

2017 is behind us and the time for summaries is over.

But indulge me.

I’d like to give 2018 a healthy start by combining two of my favorite pastimes, running and analyzing random data sets, with a simple workflow for analyzing sports activity.

If you’re an athlete of any kind, you’re most likely tracking your activities using an application. Personally, I use Runkeeper to track my runs, and the occasional bike ride or hike, but there are many other applications that do the job quite well.

Most of these applications allow you to export your activity data, which means it can be analyzed using a data analysis tool of your choice.

Guess what tool I’m going to use?

Exporting your Runkeeper data

The first step, of course, is to export the activity data.

This will be different in each application, but in Runkeeper it is done from the application’s Settings page. There, all you have to do is click the Export Data tab, select the time period you want to analyze, and then hit the Export Data button.

After a short while, a ‘runkeeper-data-export-*.ZIP’ file is generated; to download it, just click the Download Now button.


Unzip the file.

In the uncompressed folder, you will find a ‘cardioActivities.csv’ file, which contains all your cardio activities – runs, walks, rides, etc. For the sake of simplicity, I renamed it ‘runkeeper.csv’.

Shipping into ELK

Our next step is to ship the data into ELK. To do this, we will need to configure Logstash to process the CSV file and ship it to Elasticsearch.

Taking a look at the file, you’ll see that it consists of 12 columns. These columns will translate into fields when indexed in Elasticsearch.

Since I did not measure my heart rate, use a route name, or enter any notes for the different activities, I’m going to delete these columns. The GPX File column is also of no interest, and I’m going to rename the Type column ‘Activity’. Last but not least, I have no need for the header line defining the column names, since we will use Logstash to define these.

Our future fields are therefore: Date, Activity, Distance (km), Duration, Average Pace, Average Speed (km/h), Calories Burned, Climb (m).
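
If you prefer not to do this cleanup by hand in a spreadsheet, something along these lines works from the command line. This is only a sketch: it assumes csvkit is installed, and the column names to drop are taken from my export, so check them against yours first.

# Drop the unused columns and strip the header line (the column names here are
# assumptions - compare them against your own cardioActivities.csv)
csvcut -C "Route Name,Average Heart Rate (bpm),Notes,GPX File" cardioActivities.csv \
  | tail -n +2 > runkeeper.csv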

Logstash configuration file

The Logstash configuration file (/etc/logstash/conf.d/runkeeper-01.conf) uses the file input plugin to read the CSV file. To process it, we are using the csv filter plugin, as well as the mutate filter for converting some of the fields into numbers, and the date filter to define our timestamp field. A local Elasticsearch instance is defined as the output.

input {
  file {
    path => "/home/ubuntu/runkeeper.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
    csv {
      separator => ","
      columns => ["Date","Activity","Distance(km)","Duration","Average Pace","Average Speed (km/h)","Calories Burned","Climb(m)"]
    }
    mutate {
      # distance and speed contain decimal values, so convert them to floats
      convert => {
        "Distance(km)" => "float"
        "Calories Burned" => "integer"
        "Average Speed (km/h)" => "float"
        "Climb(m)" => "integer"
      }
    }
    date {
     match => [ "Date" , "MM/dd/yy hh:mm", "MM/dd/yy h:mm"]
     remove_field => [ "Date" ]
    }
}

output {
  elasticsearch { 
    hosts => ["localhost:9200"] 
  }
}
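
With the configuration in place, start Logstash and point it at the file. A minimal example, assuming a standard package installation (adjust the paths to your own setup):

sudo /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/runkeeper-01.conf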

Once Logstash processes the file, the data should be indexed into separate Elasticsearch indices, one per Runkeeper activity date (the date filter sets @timestamp to the activity date, and the Elasticsearch output writes to a daily logstash-* index by default).

To verify, cURL Elasticsearch with:

curl -XGET 'localhost:9200/_cat/indices?v&pretty'

You should see a list of the indices that were created:

health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   logstash-2017.11.23 B_rI8jimQxeuJq_Ekstakw   5   1          1            0      8.7kb          8.7kb
yellow open   logstash-2017.11.26 1ktPMaI5S12DnLNi3ys_tg   5   1          1            0      8.7kb          8.7kb
yellow open   logstash-2017.11.16 oaQP1mX7RCuX_TL8fhjMJA   5   1          1            0      8.7kb          8.7kb
yellow open   logstash-2017.06.19 YgAVNjRnQ46WVS_uV_UjHg   5   1          2            0     16.3kb         16.3kb
yellow open   logstash-2017.09.10 tQHOAMFTQxWoCjXGEOL3Dw   5   1          1            0      8.7kb          8.7kb
yellow open   logstash-2017.01.16 Izixk0EvTfC33EgBKPX97g   5   1          1            0      8.7kb          8.7kb
yellow open   logstash-2017.10.08 202CivWCQ72FBVe06gwxEg   5   1          1            0      8.7kb          8.7kb
yellow open   logstash-2017.04.06 6rU9Kym2TrapjxfWcxqv6w   5   1          1            0      8.7kb          8.7kb
yellow open   logstash-2017.10.01 c5Fg_G--TjeuERbAyhe9nQ   5   1          1            0      8.7kb          8.7kb
yellow open   logstash-2017.01.10 xdk1bEhrRPWBTd5rZYoH0g   5   1          1            0      8.7kb          8.7kb
yellow open   logstash-2017.08.21 CaogzquKTWSFvAIf6i8r5g   5   1          1            0      8.7kb          8.7kb
...
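
You can also pull back a single document to confirm that the fields were parsed as expected (a quick check; your values will of course differ):

curl -XGET 'localhost:9200/logstash-*/_search?size=1&pretty'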

In Kibana, you can now define the index pattern as logstash-* and view your activity, looking a year back.


Adding some of the fields into the main display area, we get a nice overview of all the activities tracked by Runkeeper in 2017.


Shipping into Logz.io

If you’re using Logz.io, a few modifications are required to the Logstash configuration file. We need to add the Logz.io user token (retrieved from the Settings page, in the Logz.io UI) and define Logz.io as an output destination.

The resulting configuration should look something like this:

input {
  file {
    path => "/home/ubuntu/runkeeper.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
    csv {
      separator => ","
      columns => ["Date","Activity","Distance(km)","Duration","Average Pace","Average Speed (km/h)","Calories Burned","Climb(m)"]
    }
    mutate {
      # distance and speed contain decimal values, so convert them to floats
      convert => {
        "Distance(km)" => "float"
        "Calories Burned" => "integer"
        "Average Speed (km/h)" => "float"
        "Climb(m)" => "integer"
      }
    }
    date {
     match => [ "Date" , "MM/dd/yy hh:mm", "MM/dd/yy h:mm"]
     remove_field => [ "Date" ]
    }
    mutate {
      add_field => { "token" => "tWMKrePSAcfaBSTPKLZeEXGCeiVMpuHb" }
    }
}

output {
  tcp {
    host => "listener.logz.io"
    port => 5050
    codec => json_lines
    }
}

Analyzing the Runkeeper data

Now that the data is indexed and parsed correctly, it’s time for some fun.

Using a series of simple Kibana visualizations, we can build a nice dashboard that gives us an overview into our activities.

For example, a simple pie chart visualization shows that my favorite activity is…running!

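If you’d rather query Elasticsearch directly, the same breakdown can be retrieved with a terms aggregation. This is just a sketch; it assumes the default dynamic mapping, which exposes the string field as ‘Activity.keyword’:

curl -XGET 'localhost:9200/logstash-*/_search?size=0&pretty' -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "by_activity": {
      "terms": { "field": "Activity.keyword" }
    }
  }
}'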

A metric visualization gives us the number of activities conducted over the year – 95!


How about the average distance covered per activity? Another metric visualization puts it at 7.347 km.

A bar chart visualization depicts the different activities per month.


Let’s take a look at the change in distance covered over time. To do this, we’ll use a line chart.


If you must ask, that huge drop in June was due to a knee injury. In the bar chart above, you can see it corresponds to my switching from running to walking.

And to summarize it all, we can use a data table aggregating metrics per month.

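The same monthly rollup can be pulled straight from Elasticsearch with a date_histogram aggregation and a few sub-aggregations. Again, just a sketch, using the field names from the Logstash configuration above:

curl -XGET 'localhost:9200/logstash-*/_search?size=0&pretty' -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "per_month": {
      "date_histogram": { "field": "@timestamp", "interval": "month" },
      "aggs": {
        "total_distance_km": { "sum": { "field": "Distance(km)" } },
        "total_calories": { "sum": { "field": "Calories Burned" } }
      }
    }
  }
}'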

Endnotes

Adding these visualizations, and others, into one dashboard, we get a nice overview of the past year.


Yes, analyzing your sports activity with Logstash, Elasticsearch and Kibana is akin to bringing a gun to a swordfight. But while not a typical use case for the ELK Stack, it’s a nice and extremely simple data analysis exercise, especially if you’re a keen runner like me.

It was also a nice opportunity to get acquainted with some of the new improvements in Kibana 6, soon to be introduced in Logz.io as well. Kudos to the folks at Elastic for some great work on improving the UX and UI, especially the dashboarding experience.

I wish you all a great, and active, 2018!
