As an example of unstructured data, we have pulled some sample server logs from a public source and saved them in a text file. A quick look at the raw data shows what unstructured data looks like, so that we can recognize it in the future:
# Import our data manipulation tool, Pandas
import pandas as pd
# Create a pandas DataFrame from some unstructured server logs
logs = pd.read_table('../data/server_logs.txt', header=None, names=['Info'])
# header=None specifies that the first line of the file is the first data point, not a column name
# names=['Info'] sets the column name in our DataFrame for easier access
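To see what header=None and names actually change, here is a minimal sketch that can be run without the log file. The three log lines are invented purely for illustration (they are not taken from server_logs.txt), and the variable names sample_text, sample_logs, and with_default_header are hypothetical. It assumes the lines contain no tab characters, since read_table splits on tabs by default.

```python
import io

import pandas as pd

# Hypothetical log lines, invented for illustration only
sample_text = (
    "2024-01-01 00:00:01 INFO service started\n"
    "2024-01-01 00:00:05 WARN disk usage at 91 percent\n"
    "2024-01-01 00:00:09 ERROR connection refused\n"
)

# With header=None, every line is treated as data, and names supplies the column label
sample_logs = pd.read_table(io.StringIO(sample_text), header=None, names=['Info'])
print(sample_logs.shape)          # three rows, one column named 'Info'

# Without header=None, pandas uses the first line as the column name,
# so that line is no longer part of the data
with_default_header = pd.read_table(io.StringIO(sample_text))
print(with_default_header.shape)  # only two rows remain
```

Because each line has no tab to split on, the whole line lands in a single column, which is why one column name is enough here.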
We created a pandas DataFrame called logs that holds our server logs. To take a look, let's call the .head() method to view the first few rows:
# Look at the first 5...