Reading a dataset
Reading a dataset with Petastorm is straightforward. In this section, we demonstrate how to load a Petastorm dataset into two widely used deep learning frameworks, TensorFlow and PyTorch:
- To load our Petastorm datasets, we use the petastorm.reader.Reader class, which implements the iterator interface, so we can go over the samples efficiently with plain Python. A petastorm.reader.Reader instance can be created using the petastorm.make_reader factory method:

    import matplotlib.pyplot as plt
    from petastorm import make_reader

    with make_reader('hdfs://some_dataset') as reader:
        for sample in reader:
            print(sample.id)
            plt.imshow(sample.image1)
- The following code example shows how we can stream a dataset into the TensorFlow Examples class, which, as we have seen before, is a named tuple with the keys being the ones specified in the Unischema of...
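The access pattern described in the steps above (a reader that is both a context manager and an iterator, yielding named tuples whose fields can be read by name) can be illustrated without a real dataset. The following is a minimal sketch using only the Python standard library; it is not Petastorm code. ToyReader, Sample, and read_ids are hypothetical names introduced for illustration, and the two fields id and image1 are assumed to match the earlier example rather than taken from an actual Unischema.

```python
from collections import namedtuple

# Hypothetical schema: in Petastorm the field names would come from the Unischema.
Sample = namedtuple("Sample", ["id", "image1"])


class ToyReader:
    """Stand-in for a dataset reader: a context manager that is also an iterator."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # A real reader would release its resources here; this toy only sets a flag.
        self.closed = True
        return False  # do not swallow exceptions

    def __iter__(self):
        return self

    def __next__(self):
        # Raises StopIteration once the rows are exhausted, which ends a for loop.
        return next(self._rows)


def read_ids(rows):
    """Collect the id field of every sample, accessing fields by name."""
    with ToyReader(rows) as reader:
        return [sample.id for sample in reader]


# Two tiny in-memory rows standing in for stored samples.
rows = [
    Sample(id=0, image1=[[0, 1], [1, 0]]),
    Sample(id=1, image1=[[1, 1], [0, 0]]),
]
print(read_ids(rows))
```

Because each sample is a named tuple, sample.id and sample.image1 work the same way as in the make_reader loop, and the with block guarantees that the reader is closed even if the loop body raises an exception.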