So far we have looked only at the training data, but we also need data for testing our models. Two more files serve that purpose: a test dataset and a RUL dataset. Importing them takes two additional steps:
- Importing the test data: Reusing the schema defined for the training set, the test set is imported and saved to a table called engine_test:
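# Location of the test file
file_location = "/FileStore/tables/test_FD001.txt"
# Read the space-delimited file, reusing the schema from the training import
df = spark.read.option("delimiter", " ").csv(file_location,
                                             schema=schema,
                                             header=False)
# Save the result as a table, replacing engine_test if it already exists
df.write.mode("overwrite").saveAsTable("engine_test")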
- Importing the RUL dataset: Next, the remaining useful life (RUL) dataset is imported and saved to a table as well:
file_location = "/FileStore/tables/RUL_FD001.txt"
RULschema = StructType([StructField("RUL", IntegerType...
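To see how the two imports fit together, here is a minimal plain-Python sketch. It assumes that the RUL file holds one value per engine in the test set, listed in order of engine id, and that the first two fields of each test row are the engine id and the cycle number. The sample rows and the helper names `last_cycle_per_unit` and `attach_rul` are hypothetical and exist only for illustration; they are not part of the notebook.

```python
# Hypothetical sample rows in the space-delimited layout of the test file:
# engine id, cycle number, then further readings (shortened to two here).
test_rows = [
    "1 1 0.0023 0.0003",
    "1 2 -0.0027 -0.0003",
    "2 1 -0.0009 0.0004",
]

# Hypothetical RUL rows: assumed to be one value per engine, ordered by engine id.
rul_rows = ["112", "98"]


def last_cycle_per_unit(rows):
    """Return {engine id: last observed cycle} for space-delimited rows."""
    last = {}
    for row in rows:
        fields = row.split()
        unit, cycle = int(fields[0]), int(fields[1])
        last[unit] = max(cycle, last.get(unit, 0))
    return last


def attach_rul(last_cycles, rul_values):
    """Pair each engine, in id order, with the RUL value at the same position."""
    units = sorted(last_cycles)
    if len(units) != len(rul_values):
        raise ValueError("expected one RUL value per engine")
    return {
        unit: (last_cycles[unit], int(rul))
        for unit, rul in zip(units, rul_values)
    }


paired = attach_rul(last_cycle_per_unit(test_rows), rul_rows)
```

Each entry in `paired` maps an engine id to its last observed cycle and the RUL value at the matching position, which is the pairing a model's prediction at the end of each test series would be compared against under these assumptions.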