In this tutorial, we will learn how to use chatbots to assist in network operations. As we move toward intelligent operations, another area to focus on is mobility. It's good to have a script to perform configurations, remediations, or even troubleshooting, but someone still has to be present to monitor, initiate, or even execute those programs or scripts.
Nokia's MIKA is a good example of a chatbot that operations personnel can use for network troubleshooting and repair. According to Nokia's blog, MIKA responds with alarm prioritization information based on the realities of the individual network, and also compares the current situation to the whole service history of past events from this network and others, in order to identify the best solution for the current problem.
Let's create a chatbot to assist in network operations. For this use case, we will use a widely used chat application, Slack. Building on the intelligent data analysis capabilities of Splunk, we will look at some user chat interactions with the chatbot that give insight into the environment.
This tutorial is an excerpt from a book written by Abhishek Ratan titled Practical Network Automation - Second Edition. This book will acquaint you with the fundamental concepts of network automation and help you improve your data center's robustness and security.
The code for this tutorial can be found on GitHub.
As we have our web framework deployed, we'll leverage the same framework to interact with the Slack chatbot, which in turn will interact with Splunk. It can also interact directly with network devices so we can initiate some complex chats, such as rebooting a router from Slack if need be. This eventually gives mobility to an engineer who can work on tasks from anywhere (even from a cellphone) without being tied to a certain location or office.
To create a chatbot, here are the basic steps:
A crucial step here is that once we enter the URL that accepts chat messages, Slack needs to verify that URL. Verification involves Slack sending a challenge value to the endpoint and the API endpoint sending the same value back, as a string or as JSON. If Slack receives the same value back, it confirms that the endpoint is authentic and marks it as verified. This is a one-time process; any change to the API URL means repeating this step.
Here is the Python code in the Ops API framework that responds to this specific query:
import falcon
import json

# Methods of the Falcon resource class (the full class is shown later)
def on_get(self,req,resp):
    # Handles GET request
    resp.status=falcon.HTTP_200 # Default status
    # A dict is used because a set literal is not JSON serializable
    resp.body=json.dumps({"message":"Server is Up!"})

def on_post(self,req,resp):
    # Handles POST request
    print("In post")
    data=req.bounded_stream.read()
    try:
        # Authenticating end point to Slack
        data=json.loads(data)["challenge"]
        # Default status
        resp.status=falcon.HTTP_200
        # Send challenge string back as response
        resp.body=data
    except (ValueError,KeyError,TypeError):
        # No challenge in the payload: URL already verified
        resp.status=falcon.HTTP_200
        resp.body=""
This validates the endpoint: if Slack sends a challenge, the handler responds with the same challenge value, which confirms that it is the right endpoint for the Slack channel to send chat data to.
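The verification handshake described above can also be sketched as a small standalone function that is easy to test without a web server. This is a minimal sketch, not the book's code: the function name `challenge_reply` is hypothetical, and the payload shape (a JSON object whose `type` is `url_verification` and which carries a `challenge` string) is an assumption based on Slack's Events API that should be checked against the current Slack documentation.

```python
import json

def challenge_reply(raw_body):
    """Return the challenge string to echo back to Slack, or None.

    raw_body is the raw POST body (bytes or str). None means the request
    is not a URL verification request and should be handled as a chat event.
    """
    try:
        payload = json.loads(raw_body)
    except ValueError:
        return None  # body is not valid JSON
    if not isinstance(payload, dict):
        return None
    if payload.get("type") != "url_verification":
        return None
    challenge = payload.get("challenge")
    return challenge if isinstance(challenge, str) else None


# Example verification request body (all values are made up for illustration)
sample = b'{"token": "abc", "challenge": "3eZbrw1aB", "type": "url_verification"}'
print(challenge_reply(sample))
```

A handler could call this first and, when it returns a string, write that string to the response body and stop processing.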
The core API framework code that responds to specific chat messages, performs the following actions:
The code is as follows:
import falcon
import json
from channel import channel_connect,set_data

class Bot_BECJ82A3V():
    def on_get(self,req,resp):
        # Handles GET request
        resp.status=falcon.HTTP_200 # Default status
        # A dict is used because a set literal is not JSON serializable
        resp.body=json.dumps({"message":"Server is Up!"})

    def on_post(self,req,resp):
        # Handles POST request
        print("In post")
        resp.status=falcon.HTTP_200 # Default status
        resp.body=""
        try:
            data=json.loads(req.bounded_stream.read())
        except ValueError:
            # Not a JSON payload, nothing to process
            return
        if not isinstance(data,dict):
            return
        # Authenticating end point to Slack
        if "challenge" in data:
            # Send challenge string back as response
            resp.body=data["challenge"]
            return
        event=data.get("event",{})
        if event.get("bot_id")=="BECJ82A3V":
            print("Ignore message from same bot")
            return
        print(data)
        try:
            # Get the channel and data information
            channel=event["channel"]
            text=event["text"]
        except KeyError:
            # Event without a channel or text, nothing to reply to
            return
        # Authenticate Agent to access Slack endpoint
        token="xoxp-xxxxxx"
        # Set parameters
        print(text)
        set_data(channel,token,resp)
        # Process request and connect to slack channel
        channel_connect(text)
        return

# falcon.API instance, callable from gunicorn
# (Falcon 3.0 and later call this class falcon.App)
app=falcon.API()
# instantiate the bot resource class
Bot3V=Bot_BECJ82A3V()
# map the /slack URL to the bot resource
app.add_route("/slack",Bot3V)
Performing a channel interaction response: This code interprets specific chats that are sent to the chatbot in the chat channel. It then sends the reply, with the specific user or channel ID and the authentication token, to the Slack API at https://slack.com/api/chat.postMessage. This ensures that the reply is shown on the specific channel from which the chat originated. As a sample, we will use the chat to encode or decode a specific value.
For example, if we write encode username[:]password, it returns the string encoded as a base64 value.
Similarly, if we write decode <encoded string>, the chatbot returns the <username/password> after decoding the encoded string.
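The two chat commands described above boil down to a base64 round trip. The following is a minimal sketch of that round trip on its own, outside the chatbot; the helper names `encode_credentials` and `decode_credentials` are hypothetical and not part of the book's code. Note that base64 is a reversible encoding rather than encryption, so it hides a value from casual view but does not protect it.

```python
import base64

SEPARATOR = "[:]"  # separator between username and password used in the chat

def encode_credentials(username, password):
    """Join username and password with the separator and base64-encode them."""
    raw = f"{username}{SEPARATOR}{password}"
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")

def decode_credentials(encoded):
    """Decode a base64 string and split it into (username, password)."""
    raw = base64.b64decode(encoded).decode("utf-8")
    if SEPARATOR not in raw:
        raise ValueError("expected username[:]password")
    # Split on the first separator only, so the password may contain "[:]"
    username, password = raw.split(SEPARATOR, 1)
    return username, password

token = encode_credentials("abhishek", "password123")
print(token)
print(decode_credentials(token))
```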
The code is as follows:
import json
import requests
import base64
from splunk_alexa import alexa

channl=""
token=""
resp=""

def set_data(Channel,Token,Response):
    # Store the Slack channel, token and Falcon response for later calls
    global channl,token,resp
    channl=Channel
    token=Token
    resp=Response

def send_data(text):
    # Post the reply to the channel from which the chat originated
    print(channl)
    # json.dumps escapes quotes and newlines in the reply text
    payload=json.dumps({"channel":channl,"text":text})
    reply=requests.post("https://slack.com/api/chat.postMessage",data=payload,headers={"Content-type": "application/json","Authorization": "Bearer "+token})
    print(reply.text)
    print(reply.status_code)
    return reply

def channel_connect(text):
    print(text)
    arg=text.split(' ')
    print(str(arg))
    path=arg[0].lower()
    if path in ["decode","encode"]:
        if len(arg)<2 or arg[1]=="":
            print("Please enter a string to "+path)
            send_data("<"+path+"> argument cannot be empty")
            return
        print("decode api")
        deencode(arg,text)
        return
    # Anything else is treated as a question for Splunk
    text=""
    try:
        result=alexa(arg,resp)
        for i in result:
            print(i)
            for j in i.values():
                print(j)
                text=text+' '+j
    except Exception:
        text=""
    if text=="":
        text="None"
    send_data(text)

def deencode(arg,text):
    decode=arg[1]
    if decode=='--help':
        send_data("encode/decode <encoded_string>")
        return
    if arg[0].lower()=="encode":
        if '[:]' in decode:
            encoded=base64.b64encode(decode.encode('utf-8'))
            send_data("Encoded string: "+encoded.decode('utf-8'))
        else:
            send_data("sample string format username[:]password")
        return
    try:
        creds=base64.b64decode(decode).decode("utf-8")
    except ValueError:
        print("problem while decoding String")
        send_data("Error decoding the string. Check your encoded string.")
        return
    if '[:]' not in creds:
        print("the encoded base64 is not in standard format username[:]password")
        send_data("encoded string is not in standard format, use username[:]password")
        return
    print("[:] substring exists in the decoded base64 credentials")
    # split based on the first match of "[:]"
    username,password=creds.split('[:]',1)
    temp_dict={'output':{'username':username,'password':password},'status':'success','identifier':"",'type':""}
    print(temp_dict)
    send_data("<username> "+username+" <password> "+password)
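A reply text that contains double quotes or newlines has to be escaped before it is placed in the JSON body sent to chat.postMessage. The following is a minimal sketch of building that request with `json.dumps`, which handles the escaping; the function name `build_post_message` and the channel ID `C0123456` are hypothetical, and the token is the same placeholder used in this tutorial.

```python
import json

SLACK_POST_URL = "https://slack.com/api/chat.postMessage"

def build_post_message(channel, text, token):
    """Return (url, headers, body) for a chat.postMessage call."""
    headers = {
        "Content-type": "application/json; charset=utf-8",
        "Authorization": "Bearer " + token,
    }
    # json.dumps escapes quotes, backslashes and newlines inside text
    body = json.dumps({"channel": channel, "text": text})
    return SLACK_POST_URL, headers, body

url, headers, body = build_post_message(
    "C0123456", 'Status: "rtr1" is down', "xoxp-xxxxxx"
)
print(body)
# To send it (needs the requests package and a valid token):
# requests.post(url, data=body, headers=headers)
```

Keeping the request construction separate from the network call makes it possible to check the payload without contacting Slack.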
The following code queries the Splunk instance in response to a particular chat with the chatbot. The chat can ask for any router whose management interface (Loopback45) is currently down. A user can also ask for all routers on which the management interface is up. The English question is matched against known keywords and converted into a Splunk query and, based upon the response from Splunk, the status is returned to the Slack chat.
Let us see the code that returns the result to the Slack chat:
from splunkquery import run

# URL-encoded Splunk search; {op} is %3D (=) or %21%3D (!=)
QUERY="search%20index%3D%22main%22%20earliest%3D0%20%7C%20dedup%20interface_name%2Crouter_name%20%7C%20where%20interface_name%3D%22Loopback45%22%20%20and%20interface_status{op}%22up%22%20%7C%20table%20router_name"

def alexa(data,resp):
    # data is either the chat sentence or the list of its words
    if isinstance(data,str):
        string=data.split(' ')
    else:
        string=data
    search=string[0:-1]
    param=string[-1].lower()
    print("param"+param)
    match_dict={0:"routers management interface",1:"routers management loopback"}
    for no in range(2):
        # Every keyword of the phrase must appear in the chat sentence
        test=[word in search for word in match_dict[no].split(' ')]
        print(test)
        print(no)
        if False in test:
            continue
        if param=="up":
            query=QUERY.format(op="%3D")
        elif param=="down":
            query=QUERY.format(op="%21%3D")
        else:
            return "None"
        return run(query,resp)
    # No phrase matched the chat sentence
    return "None"
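The following Splunk queries fetch the status. The first returns the routers on which Loopback45 is up, and the second returns the routers on which it is not: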
index="main" earliest=0 | dedup interface_name,router_name | where interface_name="Loopback45" and interface_status="up" | table router_name
index="main" earliest=0 | dedup interface_name,router_name | where interface_name="Loopback45" and interface_status!="up" | table router_name
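The code passes these searches to Splunk in percent-encoded form. As a minimal sketch of how the readable queries above map to that form, the standard library's `urllib.parse.quote` can produce it; the function name `build_search` is hypothetical, and the leading `search` keyword is taken from the encoded strings used in the code.

```python
from urllib.parse import quote, unquote

def build_search(status_is_up):
    """Return the URL-encoded Splunk search for Loopback45 up or not up."""
    operator = "=" if status_is_up else "!="
    readable = (
        'search index="main" earliest=0 '
        "| dedup interface_name,router_name "
        f'| where interface_name="Loopback45" and interface_status{operator}"up" '
        "| table router_name"
    )
    # safe="" also encodes characters such as "/" that quote() keeps by default
    return quote(readable, safe="")

up_query = build_search(True)
down_query = build_search(False)
print(up_query)
print(unquote(down_query))
```

Building the encoded string from the readable query keeps the two forms from drifting apart when the search is changed.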
Let's see the end result of chatting with the chatbot and the responses being sent back based on the chats.
The encoding/decoding example is as follows:
As we can see here, we sent a chat with the encode abhishek[:]password123 message. This chat was sent as a POST request to the API, which encoded the value to base64 and responded with Encoded string: <encoded string>. In the next chat, we passed the encoded string with the decode option. The API function decodes it and responds to the Slack chat with the username abhishek and the password password123.
Let's see the example of the Splunk query chat:
For this query, we have shut down the Loopback45 interface on rtr1. The scheduled discovery of those interfaces through the Python script has already put this data into Splunk. When asked which management interface (Loopback45) is down, the chatbot responds with rtr1. The Slack chat, On which routers the management interface is down, is passed to the API, which, upon receiving this payload, runs the Splunk query to get the stats. The return value (in this case, rtr1) is given back as a response in the chat.
Similarly, the reverse query, On which routers the management interface is up, queries Splunk and shares back the response rtr2, rtr3, and rtr4 (as the interfaces on all these routers are up).
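Chat users rarely type a question exactly the same way twice, so the step that turns a sentence into an up or down request benefits from some normalization. The following is a minimal sketch of that step, assuming the same keyword phrases as the tutorial's code; the function name `parse_question` is hypothetical, and stripping punctuation (so that a trailing question mark is tolerated) is an addition that the original code does not make.

```python
import string

# Keyword sets that identify a management interface question
PHRASES = [
    {"routers", "management", "interface"},
    {"routers", "management", "loopback"},
]

def parse_question(text):
    """Return "up" or "down" if text asks about the management interface, else None."""
    # Lowercase each word and strip punctuation such as a trailing "?"
    words = [w.strip(string.punctuation).lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return None
    if not any(phrase.issubset(words) for phrase in PHRASES):
        return None
    status = words[-1]
    return status if status in ("up", "down") else None

print(parse_question("On which routers the management interface is down?"))
```

The returned value could then select the matching Splunk search, while `None` signals that the chatbot did not understand the question.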
This chat use case can be extended to ensure that full end-to-end troubleshooting can occur using a simple chat. Extensive cases can be built using various backend functions, starting from a basic identification of problems to complex tasks, such as remediation based upon identified situations.
In this tutorial, we implemented some real-life use cases and looked at techniques to perform troubleshooting using a chatbot. The use cases gave us insight into performing intelligent remediation as well as performing audits at scale, which are key challenges in the current environment.
To learn how to automate your own network without any hassle while leveraging the power of Python, check out our book Practical Network Automation - Second Edition.