Exploring output parameters
Output parameters let you configure what happens to your processed data after the result lands in the output repository. You can have the output placed in an external S3 repository or configure an egress.
s3_out
The s3_out parameter enables your Pachyderm pipeline to write output to an S3 repository instead of the standard pfs/out. This parameter requires a Boolean value. To access the output repository, you have to use an S3 protocol address, such as s3://<output-repo>. The output repository still has the same name as your pipeline.
The following code shows how to define an s3_out parameter in YAML format:
s3_out: true
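For context, here is a minimal sketch of where this line might sit in a complete pipeline specification. The pipeline name, image, command, and input repository are hypothetical placeholders, not values from this chapter; only the s3_out line is the setting described above.

```yaml
# Hypothetical pipeline specification; all names and values below are placeholders.
pipeline:
  name: my-pipeline                  # the output repository takes this name
transform:
  image: my-registry/my-image:1.0    # placeholder container image
  cmd: ["python3", "/process.py"]    # placeholder command
input:
  pfs:
    repo: my-input-repo              # placeholder input repository
    glob: "/*"
s3_out: true                         # write output over S3 instead of pfs/out
```

With a specification like this, the description above implies that the pipeline code would address its output as s3://my-pipeline, since the output repository shares the pipeline's name.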
Here's how to do the same in JSON format:
"s3_out": true
Now, let's learn about egress.
egress
The egress parameter enables you to specify an external location for your output data. Pachyderm supports Amazon S3 (the s3:// protocol), Google Cloud Storage (the gs:...