Time for action – enhancing the films file by converting rows to columns
In this tutorial, we will work with a file that contains a list of French movies of all time. Each movie is described across several rows. This is how it looks:
...
Caché
Year: 2005
Director: Michael Haneke
Cast: Daniel Auteuil, Juliette Binoche, Maurice Bénichou

Jean de Florette
Year: 1986
Genre: Historical drama
Director: Claude Berri
Produced by: Pierre Grunstein
Cast: Yves Montand, Gérard Depardieu, Daniel Auteuil

Le Ballon rouge
Year: 1956
Genre: Fantasy | Comedy | Drama
...
The information in the file would be easier to process if the rows belonging to each movie were merged into a single row. To do that, carry out the following steps:
Download the movies.txt file from the book's website (www.packtpub.com/support).
Create a transformation and read the file with a Text file input step.
In the Content tab of the Text file input configuration step, put : as a separator. Also, uncheck the Header and No empty...
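
To make the goal concrete outside the tool, here is a minimal Python sketch of the same idea: each row is split on the first : and the rows of one movie are collected into a single record. This is not how the Text file input step works internally. The names split_row and merge_movies are hypothetical, and the sketch assumes that a row without a : is a movie title and that every row with a : is a "label: value" attribute of the most recent title.

```python
# Illustrative sketch only: plain Python, not Kettle's implementation.
# Assumption: a row without ":" is a movie title; a row with ":" is a
# "label: value" attribute of the most recent title.

SAMPLE = """\
Caché
Year: 2005
Director: Michael Haneke
Cast: Daniel Auteuil, Juliette Binoche, Maurice Bénichou

Jean de Florette
Year: 1986
Genre: Historical drama
Director: Claude Berri
Produced by: Pierre Grunstein
Cast: Yves Montand, Gérard Depardieu, Daniel Auteuil
"""


def split_row(line):
    """Split a row on the first ':' into two fields.

    A row without ':' yields (text, None), i.e. the second field is missing.
    """
    first, sep, second = line.partition(":")
    return first.strip(), (second.strip() if sep else None)


def merge_movies(text):
    """Merge the rows belonging to each movie into one dict per movie."""
    movies = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty rows, if the file has any
        first, second = split_row(line)
        if second is None:
            movies.append({"Title": first})  # a new movie starts here
        elif movies:
            movies[-1][first] = second  # attach the attribute to the current movie
    return movies


if __name__ == "__main__":
    for movie in merge_movies(SAMPLE):
        print(movie)
```

Note that a title that itself contains a : would be misread as an attribute under this assumption, so a real file may need a stricter rule for recognizing title rows.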