We use synchronous ETL for demonstration purposes, but asynchronous ETL is preferable in production, where a single slow ETL component can become a performance bottleneck. In DL4J, we load data from disk using DataSetIterator. An iterator can load data from disk or from memory, or it can load data asynchronously. Asynchronous ETL uses an asynchronous loader that runs in the background: using multithreading, a loader thread feeds data to the GPU/CPU while other threads take care of compute tasks. In the following recipe, we will perform asynchronous ETL operations in DL4J.
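The background-loader idea can be sketched in plain Java, without DL4J. The following is a minimal illustration, not DL4J's actual implementation: the class names AsyncPrefetchIterator and AsyncEtlDemo, the queue size of 2, and the "compute" step (doubling each value) are all assumptions made for this example. A loader thread fills a bounded queue while the calling thread consumes items from it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical wrapper that prefetches items from a source iterator on a background thread. */
class AsyncPrefetchIterator<T> implements Iterator<T> {
    // Unique sentinel that marks the end of the source data.
    private static final Object END = new Object();

    private final BlockingQueue<Object> buffer;
    private Object next; // item waiting to be handed out, END, or null if not fetched yet

    AsyncPrefetchIterator(Iterator<T> source, int queueSize) {
        buffer = new ArrayBlockingQueue<>(queueSize);
        // Loader thread: reads ahead of the consumer. Assumes the source never
        // yields null; handling of exceptions thrown by the source is omitted.
        Thread loader = new Thread(() -> {
            try {
                while (source.hasNext()) {
                    buffer.put(source.next()); // blocks while the buffer is full
                }
                buffer.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "etl-prefetch");
        loader.setDaemon(true);
        loader.start();
    }

    @Override
    public boolean hasNext() {
        if (next == null) {
            next = take(); // blocks until the loader has produced something
        }
        return next != END;
    }

    @Override
    @SuppressWarnings("unchecked")
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T item = (T) next;
        next = null;
        return item;
    }

    private Object take() {
        try {
            return buffer.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for data", e);
        }
    }
}

class AsyncEtlDemo {
    /** Loads the values 0..batches-1 in the background and "computes" on each by doubling it. */
    static List<Integer> run(int batches) {
        List<Integer> source = new ArrayList<>();
        for (int i = 0; i < batches; i++) {
            source.add(i);
        }
        Iterator<Integer> async = new AsyncPrefetchIterator<>(source.iterator(), 2);
        List<Integer> results = new ArrayList<>();
        while (async.hasNext()) {
            results.add(async.next() * 2); // stand-in for the compute step
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // prints [0, 2, 4, 6, 8]
    }
}
```

The bounded queue is the key design choice in this sketch: it lets the loader run ahead of the consumer, but only by a fixed number of items, so memory use stays capped when loading is faster than compute.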
Using asynchronous ETL
How to do it...
- Create asynchronous iterators with asynchronous prefetch:
MultiDataSetIterator asyncIterator = new AsyncMultiDataSetIterator...