The PDI MongoDB GridFS Output Step
The BSON document size in MongoDB is limited to 16 MB. If you want to store larger files and/or different file types, you can use GridFS, which splits each file into chunks that are stored as separate documents. In some cases, storing large files in MongoDB may be more efficient than storing them in a filesystem, for example, when the filesystem limits the number of files in a directory, or when you need to access only some portions of a large file without loading the whole file into memory.
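To make the idea of reading only a portion of a large file more concrete, here is a minimal conceptual sketch in Python. It does not connect to MongoDB and is not how the PDI step is implemented; it only mimics the way GridFS divides a file into numbered chunks (255 KiB each by default) so that a byte range can be served from just the chunks that overlap it. The function names split_into_chunks and read_range are hypothetical and exist only for this illustration.

```python
# Conceptual sketch only: imitates GridFS-style chunking in memory.
# Nothing here talks to MongoDB; all function names are hypothetical.

CHUNK_SIZE = 255 * 1024  # GridFS default chunk size (255 KiB)


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Split bytes into numbered chunk documents, similar in shape to
    the documents GridFS keeps in its chunks collection."""
    return [
        {"n": index, "data": data[offset:offset + chunk_size]}
        for index, offset in enumerate(range(0, len(data), chunk_size))
    ]


def read_range(chunks, start, length, chunk_size=CHUNK_SIZE):
    """Return `length` bytes beginning at byte offset `start`,
    touching only the chunks that overlap the requested range."""
    if length <= 0:
        return b""
    first = start // chunk_size                 # first chunk needed
    last = (start + length - 1) // chunk_size   # last chunk needed
    needed = b"".join(c["data"] for c in chunks[first:last + 1])
    skip = start - first * chunk_size           # offset inside first chunk
    return needed[skip:skip + length]


if __name__ == "__main__":
    payload = bytes(range(256)) * 4096          # 1 MiB of sample data
    chunks = split_into_chunks(payload)
    # Read 5 bytes that straddle the boundary between chunk 0 and chunk 1.
    print(len(chunks), read_range(chunks, CHUNK_SIZE - 2, 5))
```

In this sketch, a 1 MiB payload is split into five chunks, and the read that crosses the first chunk boundary joins only two of them. A real GridFS client does the equivalent by querying only the required chunk documents.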
SPEC INDIA has contributed the MongoDB GridFS Output step to the Pentaho community. It is available under a GPL license on GitHub at https://github.com/SPECUSA/MongoDBGridfs.
Getting ready
To get ready for this recipe, you will again need to start Spoon, your ETL development environment, and make sure that the MongoDB server is running with the data from the previous chapters.
How to do it…
Perform the following steps to use the MongoDB GridFS Output step:
Let's install the MongoDB GridFS Output step:
From the menu bar of Spoon, select Help and then...