Introducing the stream package
The stream package is built around two major components:
- Data stream data connects to data streams.
- Data stream task performs a data mining task on the data stream.
Together they form an extensible framework for working with data in motion.
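As a rough illustration of how the two components fit together, the sketch below feeds a simulated stream into a clustering task. It assumes that DSD_Gaussians (a simulator) and DSC_DStream (a grid-based clusterer, used here as one example of a data stream task) are available in the installed version of stream. The variable names and parameter values are illustrative and not taken from the text.

```r
library(stream)

set.seed(42)

# Data stream data: a simulated source with 3 Gaussian clusters
# in 2 dimensions (illustrative values for this sketch).
src <- DSD_Gaussians(k = 3, d = 2)

# Data stream task: a clustering task; gridsize is an arbitrary choice here.
task <- DSC_DStream(gridsize = 0.1)

# Pull 500 points from the source and let the task process them.
update(task, src, n = 500)

# Inspect how many clusters the task currently maintains.
nclusters(task)
```

The source and the task are created independently and only meet in the update() call, which is what lets either side be swapped for another implementation.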
Let us quickly look at the major components inside this framework:
Let us look at the individual boxes in the subsequent sections.
Data stream data
Data stream data (DSD) is an abstraction layer which connects to any streaming data source (of course, with some small hacks, which we will see as we progress). The stream package provides several DSD implementations.
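To make the abstraction concrete, here is a minimal sketch that wraps an ordinary in-memory data frame so it can be read like a stream. It assumes DSD_Memory and get_points behave as documented in the installed version of stream. The iris data set and the variable names are chosen only for illustration.

```r
library(stream)

# Wrap an in-memory data frame so it can be read like a stream.
# iris ships with R; its first four columns are numeric.
mem <- DSD_Memory(iris[, 1:4])

# A DSD is read by asking for the next n points.
first_batch <- get_points(mem, n = 5)
next_batch  <- get_points(mem, n = 5)  # continues where the first call stopped

nrow(first_batch)
```

The same get_points() call is how the other DSD implementations in the package are read.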
Let us look at them:
DSD as a static simulator
As a simulator, DSD can generate static streams as well as streams with drift. When we are developing algorithms for streaming data, this simulator gives us a convenient source of test data.
Let us see how DSD can be leveraged as a data simulator:
> library(stream, quietly = TRUE)
> set.seed(100)
> gaussian.stream <...