How many times have we heard of an organization's entire database being breached and downloaded by hackers? The irony is that the organization often isn't aware of anything until, a few months later, the hacker is selling the database details on the dark web.
Even though such organizations implement decent security controls, what they lack is a continuous security monitoring policy. This is one of the most common gaps you will find in startups and mid-sized organizations.
In this article, we will show you how to choose the right log monitoring tool to implement a continuous security monitoring policy.
You are reading an excerpt from the book Enterprise Cloud Security and Governance, written by Zeal Vora.
Log monitoring is considered part of the de facto list of things that need to be implemented in an organization. It gives us visibility into various events through a single central solution, so we don't end up running less or tail on every log file of every server.
In the following screenshot, we have performed a new search with the keyword "not authorized to perform", and the log monitoring solution has shown us such events in a nice graphical way, along with the actual logs, which span several days:
Thus, if we want to see how many permission-denied events occurred last Wednesday, it becomes a two-minute job if we have a central log monitoring solution with search functionality. This makes life much easier and allows us to detect anomalies and attacks much faster than with the traditional approach.
Choosing the right log monitoring tool is a very important decision that needs to be taken by the organization. There are both commercial offerings and open source offerings available today, but the amount of effort that each of them requires varies a lot.
I have seen many commercial offerings, such as Splunk and ArcSight, being used in large enterprises, including national-level banks. On the other hand, there are also open source offerings, such as the ELK Stack, that are gaining popularity, especially after Filebeat was introduced.
At a personal level, I really like Splunk, but it gets very expensive when you have a lot of data being generated. This is one of the reasons why many startups and mid-sized organizations combine commercial offerings with open source offerings such as the ELK Stack.
Having said that, we need to understand that if you decide to go with the ELK Stack and have a large amount of data, then you would ideally need a dedicated person to manage it.
It is also worth mentioning that AWS has a basic level of log monitoring capability available through CloudWatch.
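As a rough illustration of this capability, the following boto3 sketch creates a CloudWatch Logs metric filter that counts occurrences of the "not authorized to perform" phrase we searched for earlier. The log group name, filter name, and metric namespace here are hypothetical placeholders, not values from the book:

```python
import boto3

logs = boto3.client("logs")

# Hypothetical log group receiving CloudTrail or application logs.
LOG_GROUP = "security/cloudtrail"

# Count every log event containing the phrase "not authorized to perform",
# which is what IAM returns on a permission-denied API call.
logs.put_metric_filter(
    logGroupName=LOG_GROUP,
    filterName="permission-denied-events",
    filterPattern='"not authorized to perform"',
    metricTransformations=[
        {
            "metricName": "PermissionDeniedCount",
            "metricNamespace": "SecurityMonitoring",  # hypothetical namespace
            "metricValue": "1",
        }
    ],
)
```

Once such a metric exists, a CloudWatch alarm can be attached to it so that a spike in permission-denied events notifies the security team.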
There will always be many sources from which we need to monitor logs. Since it would be difficult to cover every individual source, we will discuss two primary ones in turn: VPC flow logs and AWS Config.
VPC flow logs is a feature that allows us to capture information about the IP traffic that goes to and from the network interfaces within a VPC.
VPC flow logs help both in troubleshooting why certain traffic is not reaching the EC2 instances and in understanding exactly which traffic is accepted and which is rejected.
VPC flow logs can also be enabled at the level of an individual network interface of an EC2 instance. This allows us to monitor how many packets are accepted or rejected for a specific EC2 instance, such as one running in the DMZ.
By default, VPC flow logs are not enabled, so we will go ahead and enable them within our VPC:
Before we go ahead and create our first flow log, we also need to create the CloudWatch log group into which the VPC flow log data will be delivered.
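A minimal boto3 sketch of these two steps might look like the following; the log group name, VPC ID, and IAM role ARN are hypothetical placeholders that you would replace with your own values:

```python
import boto3

logs = boto3.client("logs")
ec2 = boto3.client("ec2")

# Step 1: create the CloudWatch log group that will receive the flow logs.
logs.create_log_group(logGroupName="vpc-flow-logs")  # hypothetical name

# Step 2: enable flow logs on the VPC, capturing both accepted and rejected
# traffic. ResourceType can also be 'Subnet' or 'NetworkInterface' if you
# want finer-grained capture, such as a single instance's interface.
ec2.create_flow_logs(
    ResourceType="VPC",
    ResourceIds=["vpc-0123456789abcdef0"],  # hypothetical VPC ID
    TrafficType="ALL",                      # ACCEPT, REJECT, or ALL
    LogGroupName="vpc-flow-logs",
    # IAM role that lets the flow logs service write to CloudWatch Logs.
    DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/flow-logs-role",
)
```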
CloudWatch gives us the capability to filter the captured records based on certain expressions. Since each flow log record ends with the action taken (ACCEPT or REJECT) followed by the log status (OK), we can find all the rejected packets by creating a simple search for REJECT OK in the search bar, and CloudWatch will give us all the traffic that was rejected, as shown in the following image:
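The same filter can also be applied programmatically. As a minimal sketch, assuming the hypothetical vpc-flow-logs log group created above, the following boto3 snippet pulls every rejected record:

```python
import boto3

logs = boto3.client("logs")

# Page through the log group and keep only records containing REJECT.
paginator = logs.get_paginator("filter_log_events")
for page in paginator.paginate(
    logGroupName="vpc-flow-logs",  # hypothetical log group from earlier
    filterPattern="REJECT",
):
    for event in page["events"]:
        print(event["message"])
```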
Plain text data is good, but it is not very appealing and does not give you deep insights into what exactly is happening. It is therefore always preferable to send these logs to a log monitoring tool that can provide such insights.
In my case, I have used Splunk to get an overview of the logs in our environment. When we look into the VPC flow logs, we see that Splunk gives us great detail in a very nice GUI and also maps the IP addresses to the locations from which the traffic is coming:
The following image is a capture of the VPC flow logs being sent to the Splunk dashboard for analyzing traffic patterns:
AWS Config is a great service that allows us to continuously assess and audit the configuration of AWS resources.
With AWS Config, we can see exactly what configuration has changed from the previous week to today for resources such as EC2 instances, security groups, and many more.
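This history can also be pulled programmatically. The following boto3 sketch fetches the recorded configuration history of a security group; the resource ID is a hypothetical placeholder:

```python
import boto3

config = boto3.client("config")

# Fetch the recorded configuration history of a security group.
history = config.get_resource_config_history(
    resourceType="AWS::EC2::SecurityGroup",
    resourceId="sg-0123456789abcdef0",  # hypothetical resource ID
)

# Each configuration item is a snapshot taken when the resource changed.
for item in history["configurationItems"]:
    print(item["configurationItemCaptureTime"], item["configurationItemStatus"])
```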
One interesting feature of Config is that it allows us to set compliance checks, as shown in the following screenshots. We can see that there is one rule that is failing and is considered non-compliant, which is the CloudTrail rule.
There are two important features that the Config service provides:
Once they are enabled and you have associated Config rules accordingly, you will see a dashboard similar to the following screenshot:
In the preceding screenshot, on the left-hand side, Config gives details of the resources present in your AWS account; in the right-hand column, Config shows whether those resources are compliant or non-compliant according to the rules that have been set.
Let's look at how we can get started with the AWS Config service and obtain the dashboards and compliance checks that we saw in the previous screenshot:
For demo purposes, I decided to disable the CloudTrail service; if we then look at the Config dashboard, it says that one rule check has failed:
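Beyond the dashboard, the same compliance state can be queried from code. A minimal boto3 sketch that lists every rule currently evaluating as non-compliant (it assumes nothing beyond Config rules being enabled) might look like this:

```python
import boto3

config = boto3.client("config")

# List all Config rules that are currently NON_COMPLIANT, such as a
# CloudTrail rule failing after CloudTrail has been disabled.
response = config.describe_compliance_by_config_rule(
    ComplianceTypes=["NON_COMPLIANT"],
)

for rule in response["ComplianceByConfigRules"]:
    print(rule["ConfigRuleName"], rule["Compliance"]["ComplianceType"])
```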
Instead of graphs, Config can also show the resources in a tabular manner if we want to inspect the Config rules along with their associated names. This is illustrated in the following diagram:
AWS Config also allows us to evaluate the configuration changes that have been made to resources. This is a great feature that lets us see how a resource looked a day, a week, or even months ago.
This feature is particularly useful during incident investigations, when one might want to see exactly what changed before the incident took place; it helps things move much faster.
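As a hedged sketch of what such an investigation might look like in code, the snippet below compares the two most recent configuration snapshots of a hypothetical security group and prints the top-level fields that differ:

```python
import json

import boto3

config = boto3.client("config")

# Fetch the two most recent configuration snapshots of the resource;
# by default Config returns items newest-first.
history = config.get_resource_config_history(
    resourceType="AWS::EC2::SecurityGroup",
    resourceId="sg-0123456789abcdef0",  # hypothetical resource ID
    limit=2,
)

items = history["configurationItems"]
if len(items) == 2:
    # The 'configuration' field is a JSON string of the resource state.
    newer = json.loads(items[0]["configuration"])
    older = json.loads(items[1]["configuration"])

    # Report top-level fields whose values differ between the snapshots.
    for key in sorted(set(newer) | set(older)):
        if newer.get(key) != older.get(key):
            print(f"{key}: {older.get(key)!r} -> {newer.get(key)!r}")
```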
In order to evaluate the changes, we will need to perform the following steps:
We covered log monitoring and how to choose the right log monitoring tool for a continuous security monitoring policy. Check out the book Enterprise Cloud Security and Governance to learn how to build resilient cloud architectures for tackling data disasters with ease.