Creating efficient active ML pipelines
As we saw in the previous chapter, an efficient active ML pipeline runs end to end. The active ML algorithm needs to access the unlabeled data, select the most informative frames, and then send them seamlessly to the labeling platform. These steps need to run one after another, automatically, to reduce manual intervention.
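To make the three steps concrete, here is a minimal sketch of such a pipeline in Python. It is an illustration under stated assumptions, not a reference implementation: the names `Frame`, `run_active_ml_pipeline`, `list_unlabeled`, `score_informativeness`, `send_to_labeling`, and `budget` are all hypothetical, and the storage backend, scoring function, and labeling platform are passed in as plain callables rather than tied to any particular service.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Frame:
    """A single unlabeled frame, identified by its storage key (hypothetical type)."""
    key: str


def run_active_ml_pipeline(
    list_unlabeled: Callable[[], Iterable[Frame]],
    score_informativeness: Callable[[Frame], float],
    send_to_labeling: Callable[[List[Frame]], None],
    budget: int,
) -> List[Frame]:
    """Run the three pipeline steps back to back, with no manual hand-off."""
    if budget < 0:
        raise ValueError("budget must be non-negative")

    # Step 1: access the unlabeled data.
    frames = list(list_unlabeled())

    # Step 2: rank frames by informativeness (highest first) and keep
    # at most `budget` of them.
    ranked = sorted(frames, key=score_informativeness, reverse=True)
    selected = ranked[:budget]

    # Step 3: hand the selected frames to the labeling platform.
    if selected:
        send_to_labeling(selected)
    return selected


if __name__ == "__main__":
    # In-memory stand-ins for the storage, the model, and the labeling platform.
    scores = {"a.jpg": 0.2, "b.jpg": 0.9, "c.jpg": 0.5}
    sent: List[Frame] = []
    selected = run_active_ml_pipeline(
        list_unlabeled=lambda: [Frame(k) for k in scores],
        score_informativeness=lambda f: scores[f.key],
        send_to_labeling=sent.extend,
        budget=2,
    )
    print([f.key for f in selected])  # ['b.jpg', 'c.jpg']
```

Passing each step in as a callable is a design choice made for this sketch: it keeps the orchestration logic independent of any storage or labeling service, so every step can be swapped for an in-memory stand-in and exercised on its own.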
Moreover, it is essential to test this pipeline to ensure that each step works properly. An example of a cloud-hosted active ML pipeline would be as follows:
- Unlabeled data is stored in an AWS S3 bucket.
- An active ML algorithm runs on an EC2 instance that can access the S3 bucket.
- The results of the active ML run are saved in a dedicated S3 bucket that is linked to the labeling platform used for the project.
- The final step of the active ML run is to link the selected frames to the labeling platform and...