When we're working in a distributed environment, it is sometimes necessary to share information across nodes so that all of them operate on consistent values. Spark handles this case by providing two kinds of shared variables: read-only variables (broadcast variables) and write-only variables (accumulators, which worker tasks can only add to). Because no shared variable is both readable and writable on the workers, Spark also drops the consistency requirement, and the hard work of managing this situation falls on the developer's shoulders. Usually, a solution is quickly reached, as Spark is flexible and adaptive.
Sharing variables across cluster nodes
Read-only broadcast variables
Broadcast variables are variables shared by the driver node; that is, the node running the IPython notebook...