Hello, I am trying to load a saved pipeline in Spark using PipelineModel.
from pyspark.ml import Pipeline, PipelineModel
from pyspark.ml.feature import StringIndexer

selectedDf = reviews.select("reviewerID", "asin", "overall")
# Make pipeline to build recommendation
reviewerIndexer = StringIndexer(
    inputCol="reviewerID",
    outputCol="intReviewer"
)
productIndexer = StringIndexer(
    inputCol="asin",
    outputCol="intProduct"
)
pipeline = Pipeline(stages=[reviewerIndexer, productIndexer])
pipelineModel = pipeline.fit(selectedDf)
transformedFeatures = pipelineModel.transform(selectedDf)
pipeline_model_name = './' + model_name + 'pipeline'
pipelineModel.save(pipeline_model_name)
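This code successfully saves the model to the file system, but the problem is that I can't load this pipeline to use it on other data. When I try to load the model with the following code, I get this error.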
pipelineModel = PipelineModel.load(pipeline_model_name)
Traceback (most recent call last):
  File "/app/spark/load_recommendation_model.py", line 12, in <module>
    sa.load_model(pipeline_model_name, recommendation_model_name, user_id)
  File "/app/spark/sparkapp.py", line 142, in load_model
    pipelineModel = PipelineModel.load(pipeline_model_name)
  File "/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 311, in load
  File "/spark/python/lib/pyspark.zip/pyspark/ml/pipeline.py", line 240, in load
  File "/spark/python/lib/pyspark.zip/pyspark/ml/util.py", line 497, in loadMetadata
  File "/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 1379, in first
ValueError: RDD is empty
What is the problem? How can I fix it?
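One thing that may be worth checking, assuming the job runs on a cluster and `./` points at a local filesystem rather than shared storage such as HDFS: the traceback fails inside `loadMetadata`, which reads the first line of the text files under the model's `metadata` directory. If that directory has no non-empty `part-*` file on the machine doing the load, the read comes back empty. The sketch below uses only the Python standard library; `metadata_part_files` and `looks_loadable` are hypothetical helper names, and it only inspects local paths.

```python
import glob
import os


def metadata_part_files(model_dir):
    """Return the non-empty part files under <model_dir>/metadata (local paths only)."""
    pattern = os.path.join(model_dir, "metadata", "part-*")
    return sorted(p for p in glob.glob(pattern) if os.path.getsize(p) > 0)


def looks_loadable(model_dir):
    """True if the metadata directory holds at least one non-empty part file."""
    return len(metadata_part_files(model_dir)) > 0
```

If `looks_loadable(pipeline_model_name)` returns `False` on the machine that runs `PipelineModel.load`, that would suggest the metadata was written somewhere else (for example on a worker node), in which case saving to a path that every node can see may help.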