Kinesis is a managed, high-performance, large-capacity service for real-time processing of streaming data. Prominent users include Netflix, Comcast and Major League Baseball. It is designed to ingest data from multiple sources at the same time and to scale processing on EC2 instances. AWS Kinesis logs come from its Data Streams feature, one of the two main Kinesis services along with Kinesis Data Firehose (there are also Kinesis Data Analytics and Kinesis Video Streams). It is modelled after, and designed to be an alternative to, Apache Kafka.
While Kafka is considered more durable (as an open-source application, its configuration is ultimately up to the developer), it also requires you to manage clusters manually, whereas Amazon Kinesis is fully managed by AWS. The flip side is that Kinesis isn’t for on-prem applications. So if you’re looking to save time and personnel resources, and have already gone all-in on the cloud (and AWS in particular), Kinesis is likely a better option than Kafka.
As a fully managed service, Kinesis limits how long data is stored: retention defaults to 24 hours and can be configured up to a maximum of seven days. Amazon manages all uptime, and all data going through Data Streams is automatically replicated across Availability Zones.
Producers send data to be ingested into AWS Kinesis Data Streams. Each stream is divided into shards, and each shard accepts writes of up to 1 MB per second and 1,000 records per second. Output is then sent onward to consumers.
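To make the shard limits concrete, here is a minimal sketch that estimates how many shards a stream would need for a given write load. The function name and the choice to treat 1 MB as 1024 * 1024 bytes are our own assumptions for illustration; check the current AWS quotas before sizing a real stream.

```python
import math

# Per-shard write limits quoted above.
# Assumption: "1 MB" is treated as 1024 * 1024 bytes here.
SHARD_BYTES_PER_SEC = 1024 * 1024
SHARD_RECORDS_PER_SEC = 1000


def shards_needed(records_per_sec: float, avg_record_bytes: float) -> int:
    """Estimate the shard count for a steady write load (hypothetical helper)."""
    # A stream must satisfy both the record-count and the byte-rate limit,
    # so the stricter of the two decides the shard count.
    by_count = records_per_sec / SHARD_RECORDS_PER_SEC
    by_bytes = (records_per_sec * avg_record_bytes) / SHARD_BYTES_PER_SEC
    return max(1, math.ceil(max(by_count, by_bytes)))


if __name__ == "__main__":
    # Many small records: the record-count limit dominates.
    print(shards_needed(records_per_sec=2500, avg_record_bytes=200))
    # Fewer, larger records: the byte-rate limit dominates.
    print(shards_needed(records_per_sec=500, avg_record_bytes=4096))
```

The estimate ignores bursts and read throughput, so treat it as a lower bound rather than a sizing recommendation.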
Shipping AWS Kinesis Logs to Logz.io
1. Create a new Lambda function
This Lambda function will consume a Kinesis data stream and send the logs to Logz.io in bulk over HTTP.
Open the AWS Lambda Console, and click Create Function. Choose Author from scratch, and use this information:
Name: We suggest adding the log type to the name (in this case, obviously, “Kinesis”), but any name is acceptable.
Runtime: Choose Python 3.7
Role: Use a role that has AWSLambdaKinesisExecutionRole permissions.
Click Create Function (bottom right corner of the page). After a few moments, you’ll see configuration options for your Lambda function. You’ll need this page later on, so keep it open.
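For a rough idea of what such a function does with each invocation, here is a minimal sketch, not the shipper's actual code. Lambda delivers Kinesis records with their payload base64-encoded under Records[].kinesis.data; the helper names and the one-JSON-object-per-line body layout (with "message" and "type" fields) are assumptions made for illustration.

```python
import base64
import json


def extract_messages(event: dict) -> list:
    """Decode the base64 payload of every Kinesis record in a Lambda event."""
    messages = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        # Replace undecodable bytes instead of failing the whole batch.
        messages.append(payload.decode("utf-8", errors="replace"))
    return messages


def to_bulk_body(messages, log_type="kinesis") -> str:
    """Join messages into a newline-delimited JSON body (assumed layout)."""
    lines = [json.dumps({"message": m, "type": log_type}) for m in messages]
    return "\n".join(lines)


if __name__ == "__main__":
    # A hand-built event containing only the fields this sketch reads.
    sample_event = {
        "Records": [
            {"kinesis": {"data": base64.b64encode(b"first log line").decode()}},
            {"kinesis": {"data": base64.b64encode(b"second log line").decode()}},
        ]
    }
    print(to_bulk_body(extract_messages(sample_event)))
```

Batching all records from one invocation into a single body is what keeps the number of HTTP requests low.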
2. Zip the source files
Clone the Kinesis Stream Shipper – Lambda project from GitHub to your computer, and zip the Python files in the src/ folder.
git clone https://github.com/logzio/logzio_aws_serverless.git \
&& cd logzio_aws_serverless/python3/kinesis/ \
&& mkdir -p dist/python3/shipper && cp -r ../shipper/shipper.py dist/python3/shipper \
&& cp src/lambda_function.py dist \
&& cd dist/ \
&& zip logzio-kinesis lambda_function.py python3/shipper/*
You’ll upload logzio-kinesis.zip in the next step.
3. Upload the zip file and set environment variables
In the Function code section of Lambda, find the Code entry type list. Choose Upload a .ZIP file from this list.
Click Upload, and choose the zip file you created earlier (logzio-kinesis.zip).
In the Environment variables section, set your Logz.io account token, URL, and log type, and any other variables that you need to use.
TOKEN: <<YOUR LOGZ.IO TOKEN>>
FORMAT: text #or_json
Notes: FORMAT can be text or json. Set COMPRESS to true if you want to compress logs before sending.
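As an illustration of how a function might read these variables, here is a minimal sketch. The variable names TOKEN, FORMAT and COMPRESS come from the list above; the helper names, the defaults (text, no compression) and the use of gzip are our assumptions, not a description of the shipper's internals.

```python
import gzip
import os


def load_settings(env=None) -> dict:
    """Read shipper settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    token = env.get("TOKEN")
    if not token:
        raise ValueError("TOKEN environment variable is required")
    # Assumed default: plain text when FORMAT is not set.
    fmt = env.get("FORMAT", "text").strip().lower()
    if fmt not in ("text", "json"):
        raise ValueError("FORMAT must be 'text' or 'json'")
    # Assumed default: no compression unless COMPRESS is exactly "true".
    compress = env.get("COMPRESS", "false").strip().lower() == "true"
    return {"token": token, "format": fmt, "compress": compress}


def encode_body(body: str, compress: bool) -> bytes:
    """Encode the request body, gzip-compressing it when requested."""
    raw = body.encode("utf-8")
    return gzip.compress(raw) if compress else raw


if __name__ == "__main__":
    settings = load_settings({"TOKEN": "example-token", "COMPRESS": "true"})
    print(settings)
    print(len(encode_body("log line\n" * 100, settings["compress"])))
```

Failing fast on a missing TOKEN or an unknown FORMAT surfaces configuration mistakes on the first invocation instead of after logs silently go missing.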
4. Configure the function
In Basic settings, we recommend 512 MB of memory and a Timeout setting of 1:00 (1 min 0 sec).
5. Set the Kinesis event trigger
On the Add triggers list to the left of the Designer panel, choose Kinesis. Below the Designer, notice the Configure triggers panel and choose the Kinesis stream you want your Lambda function to watch. Then click Add and Save at the top of the page.
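If you would rather script this step than click through the console, the same trigger can be created with the AWS SDK. The sketch below assumes boto3 is installed and credentials are configured; the helper names, the stream ARN and the function name are placeholders, and the batch size and starting position are illustrative choices rather than recommendations.

```python
def build_trigger_request(stream_arn: str, function_name: str,
                          batch_size: int = 100,
                          starting_position: str = "LATEST") -> dict:
    """Build the arguments for Lambda's create_event_source_mapping call."""
    # This sketch only supports the two positions that need no timestamp.
    if starting_position not in ("LATEST", "TRIM_HORIZON"):
        raise ValueError("starting_position must be LATEST or TRIM_HORIZON")
    if not 1 <= batch_size <= 10000:
        raise ValueError("batch_size must be between 1 and 10000")
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": starting_position,
        "BatchSize": batch_size,
        "Enabled": True,
    }


def create_trigger(request: dict) -> dict:
    """Send the request to AWS (requires boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK
    return boto3.client("lambda").create_event_source_mapping(**request)


if __name__ == "__main__":
    # Placeholder ARN and function name; replace with your own before calling
    # create_trigger(request).
    request = build_trigger_request(
        "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream",
        "logzio-kinesis-shipper",
    )
    print(request)
```

LATEST starts reading from new records only, while TRIM_HORIZON starts from the oldest record still retained in the stream.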
6. Check Logz.io for your logs
Logs will not instantaneously appear in your Logz.io account (nor in the open-source version of ELK, we might add). After a few minutes, they should appear in Kibana.