Formulating a plan of action
Having inspected the GDELT schemas, we now need to decide which data we are going to use, and justify that choice against our hypotheses. This is a critical stage with many areas to consider; at the very least, we need to:
Ensure that our hypotheses are clear so that we have a known starting point
Ensure that we are clear about how we are going to implement the hypotheses, and determine an action plan
Ensure that we use enough appropriate data to meet our action plan, and scope the data usage so that we can produce a conclusion within a given time frame. For example, using all GDELT data would be great, but is probably not reasonable unless a large processing cluster is available. On the other hand, using one day is clearly not enough to gauge any patterns over time
Formulate a plan B in case our initial results are not conclusive
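The scoping point in the list above (more than one day, less than the whole archive) can be made concrete by fixing an explicit date window before any data is fetched. The following is a minimal sketch only: the function name `scoped_days`, the one-month window, and the `YYYYMMDD` day labels are illustrative assumptions, not something prescribed by GDELT or by this chapter.

```python
from datetime import date, timedelta


def scoped_days(start: date, end: date) -> list[str]:
    """Return a YYYYMMDD label for every day from start to end, inclusive.

    Hypothetical helper: it only enumerates the days in scope so the
    size of the planned data set is known up front.
    """
    if end < start:
        raise ValueError("end must not precede start")
    n_days = (end - start).days + 1
    return [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(n_days)]


# Assumed scope for illustration: one calendar month, a middle ground
# between a single day and the full data set.
days = scoped_days(date(2016, 1, 1), date(2016, 1, 31))
print(len(days), days[0], days[-1])
```

Writing the window down in this way makes it straightforward to widen or narrow the scope later, for instance when falling back to a plan B.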
Our second hypothesis is about the detail of the events; for the purposes of clarity, in this chapter...