Clustered Spark fails to write _delta_log via a Notebook without granting the Notebook data access?

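TLDR: Why does my write to the Delta table fail unless my Jupyter Notebook has access to the data location, contrary to my expectation that Spark should handle the writes independently of Jupyter's data access?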

I've set up a PySpark Jupyter Notebook connected to a Spark cluster, where the Spark instance is intended to perform writes to a Delta table. However, I'm observing that the Spark instance fails to complete the writes if the Jupyter Notebook doesn't have access to the data location. Repo for reproducibility. Specific PR that reproduces the bug.

Setup:

version: "3"
services:
  spark:
    image: com/data_lake_spark:latest
    # Spark service configuration details...

  spark-worker-1:
    # Configuration details...

  spark-worker-2:
    # Configuration details...

  jupyter:
    image: com/data_lake_notebook:latest
    # Jupyter Notebook service configuration details...

Spark session configuration:

# Spark session setup...


# Write initial test data to Delta table
owner_df.write.format("delta").mode("overwrite").save(delta_output_path)

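Removing Jupyter's access to the /data directory in the Docker Compose configuration results in a DeltaIOException when attempting to write to the Delta table. However, if Jupyter is given access to the /data directory, the write succeeds.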

Error message:

Py4JJavaError: An error occurred while calling o56.save.
: org.apache.spark.sql.delta.DeltaIOException: [DELTA_CANNOT_CREATE_LOG_PATH] Cannot create file:/data/delta_table_of_dog_owners/_delta_log
    at org.apache.spark.sql.delta.DeltaErrorsBase.cannotCreateLogPathException(DeltaErrors.scala:1534)
    at org.apache.spark.sql.delta.DeltaErrorsBase.cannotCreateLogPathException$(DeltaErrors.scala:1533)
    at org.apache.spark.sql.delta.DeltaErrors$.cannotCreateLogPathException(DeltaErrors.scala:3203)
    at org.apache.spark.sql.delta.DeltaLog.createDirIfNotExists$1(DeltaLog.scala:443)

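I expected Spark to handle the writes independently of Jupyter's data access. I'm looking for insights or suggestions to resolve this issue. Any guidance would be appreciated.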

Answers

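In your Docker Compose file, when you define a volume, Docker binds that volume to a specific path inside the container. If you don't define a volume for the Jupyter Notebook that matches the data location, the notebook cannot interact with that part of the file system.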

So Spark does handle the writes independently, but it needs access to the /data directory. As you correctly mentioned, if you don't explicitly link the containers through a common filesystem, the write will fail with a DeltaIOException.

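To summarize, when you run operations that read and write data, both the Spark cluster and the Jupyter Notebook need access to the same file system location; otherwise you will run into permission issues.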
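One way to confirm this from the notebook is a small pre-flight check run before the write. The sketch below assumes the notebook hosts the Spark driver (client mode), which is consistent with the stack trace failing in DeltaLog.createDirIfNotExists. The helper name driver_can_create_log is hypothetical and is not part of Spark or Delta Lake; it only inspects the local file system of the process that runs it, and it does not create anything.

```python
import os
from pathlib import Path


def driver_can_create_log(table_path: str) -> bool:
    """Return True if this process could create <table_path>/_delta_log.

    Hypothetical helper: it walks up from the target directory to the
    nearest path that already exists and checks that it is a directory
    this process may write into. Nothing is created on disk.
    """
    target = Path(table_path) / "_delta_log"
    # Nearest existing ancestor (the root always exists for absolute paths).
    parent = next(p for p in [target, *target.parents] if p.exists())
    # Creating an entry needs write and search permission on the directory.
    return parent.is_dir() and os.access(parent, os.W_OK | os.X_OK)
```

In a notebook container without the /data mount, a call such as driver_can_create_log("/data/delta_table_of_dog_owners") would be expected to return False for a non-root user, because the nearest existing parent is then the root directory.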

This configuration will fail:

services:
  spark:
    image: com/data_lake_spark:latest
    # Spark service configuration details...
    # Filesystem 1

  jupyter:
    image: com/data_lake_notebook:latest
    # Jupyter Notebook service configuration details...
    # Filesystem 2

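Adding the following volumes to the containers should fix the issue. I believe you have already done this, which matches the successful write you observed.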

services:
  spark:
    image: com/data_lake_spark:latest
    # Spark service configuration details...
    # Volume
    volumes:
      - ./data:/data

  jupyter:
    image: com/data_lake_notebook:latest
    # Jupyter Notebook service configuration details...
    # Volume
    volumes:
      - ./data:/data
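To check a Compose configuration for this kind of mismatch, a rough sketch follows. It assumes the Compose file has already been parsed into a Python dict (for example with a YAML library, which is not shown here), and the helper names mount_sources and all_share_mount are hypothetical. It handles the short "source:target" volume syntax and the long mapping syntax, but not Windows-style paths that contain a drive colon.

```python
from typing import Optional


def mount_sources(compose: dict, target: str = "/data") -> dict[str, Optional[str]]:
    """Map each service name to the source mounted at `target`, or None."""
    result: dict[str, Optional[str]] = {}
    for name, service in (compose.get("services") or {}).items():
        source = None
        for volume in (service or {}).get("volumes") or []:
            if isinstance(volume, str):
                # Short syntax: "source:target" or "source:target:mode".
                parts = volume.split(":")
                if len(parts) >= 2 and parts[1] == target:
                    source = parts[0]
            elif isinstance(volume, dict) and volume.get("target") == target:
                # Long syntax: {"type": ..., "source": ..., "target": ...}.
                source = volume.get("source")
        result[name] = source
    return result


def all_share_mount(compose: dict, target: str = "/data") -> bool:
    """True if every service mounts the same source at `target`."""
    sources = set(mount_sources(compose, target).values())
    return len(sources) == 1 and None not in sources


# Illustrative dicts mirroring the two configurations in this answer.
failing = {"services": {"spark": {}, "jupyter": {}}}
working = {
    "services": {
        "spark": {"volumes": ["./data:/data"]},
        "jupyter": {"volumes": ["./data:/data"]},
    }
}
```

With these illustrative dicts, all_share_mount(failing) is False and all_share_mount(working) is True. The check only compares mount declarations; it says nothing about file permissions inside the containers.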



