Convert Spark DataFrame to AWS Glue DynamicFrame

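I tried converting my Spark DataFrame to a DynamicFrame so that I can write it out as glueparquet files, but I am getting this error:

'DataFrame' object has no attribute 'fromDF'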

My code relies heavily on Spark DataFrames. Is there a way to convert a Spark DataFrame to a DynamicFrame so I can write it out as glueparquet? If so, could you please provide an example and point out what I'm doing wrong below?

Code:

# importing libraries

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

glueContext = GlueContext(SparkContext.getOrCreate())

# updated 11/19/19 for error caused in error logging function

spark = glueContext.spark_session

from pyspark.sql import Window
from pyspark.sql.functions import col
from pyspark.sql.functions import first
from pyspark.sql.functions  import date_format
from pyspark.sql.functions import lit,StringType
from pyspark.sql.types import *
from pyspark.sql.functions import substring, length, min,when,format_number,dayofmonth,hour,dayofyear,month,year,weekofyear,date_format,unix_timestamp


base_pth = "s3://test/"

bckt_pth1 = base_pth + "test_write/glueparquet/"


test_df = glueContext.create_dynamic_frame.from_catalog(
                 database = "test_inventory",
                 table_name = "inventory_tz_inventory").toDF()

test_df.fromDF(test_df, glueContext, "test_nest")


glueContext.write_dynamic_frame.from_options(frame = test_nest,
                                             connection_type = "s3",
                                             connection_options = {"path": bckt_pth1 + "inventory"},
                                             format = "glueparquet")

Error:

'DataFrame' object has no attribute 'fromDF'
Traceback (most recent call last):
  File "/mnt/yarn/usercache/livy/appcache/application_1574556353910_0001/container_1574556353910_0001_01_000001/pyspark.zip/pyspark/sql/dataframe.py", line 1300, in __getattr__
    "'%s' object has no attribute '%s'" % (self.__class__.__name__, name))
AttributeError: 'DataFrame' object has no attribute 'fromDF'

Answers

fromDF is a class function of DynamicFrame, not a method of the Spark DataFrame. Here is how you can convert a DataFrame to a DynamicFrame:

from awsglue.dynamicframe import DynamicFrame

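# Assign the result: the third argument only names the frame, it does not create a variable
test_nest = DynamicFrame.fromDF(test_df, glueContext, "test_nest")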
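
To illustrate why the call in the question fails, here is a minimal sketch in plain Python. The two classes below are hypothetical stand-ins written only for this example (they are not the real pyspark or awsglue classes), and passing None for the context is an assumption made so the snippet runs outside a Glue job. It shows that a class function must be called on the class that defines it, and that the returned object has to be assigned.

```python
# Stand-in classes for illustration only; NOT the real pyspark/awsglue classes.
class DataFrame:
    """Hypothetical stand-in for a Spark DataFrame: it defines no fromDF."""

    def __init__(self, rows):
        self.rows = rows


class DynamicFrame:
    """Hypothetical stand-in for a Glue DynamicFrame."""

    def __init__(self, rows, name):
        self.rows = rows
        self.name = name

    @classmethod
    def fromDF(cls, dataframe, glue_ctx, name):
        # Called on the class itself; builds and returns a new DynamicFrame.
        return cls(dataframe.rows, name)


test_df = DataFrame([{"id": 1}])

# Calling fromDF on the DataFrame instance fails, as in the question.
try:
    test_df.fromDF(test_df, None, "test_nest")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Calling it on the DynamicFrame class works; the result must be assigned.
test_nest = DynamicFrame.fromDF(test_df, None, "test_nest")
```

In the real job, the same pattern applies: the frame returned by DynamicFrame.fromDF is what gets passed as frame to write_dynamic_frame.from_options.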

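Just to consolidate the answers for Scala users too, here is how to transform a Spark DataFrame into a DynamicFrame (the fromDF method does not exist in the Scala API of DynamicFrame):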

import com.amazonaws.services.glue.DynamicFrame  
val dynamicFrame = DynamicFrame(df, glueContext)

I hope it helps!

# Import Dynamic DataFrame class
from awsglue.dynamicframe import DynamicFrame

#Convert from Spark Data Frame to Glue Dynamic Frame
dyfCustomersConvert = DynamicFrame.fromDF(df, glueContext, "convert")

#Show converted Glue Dynamic Frame
dyfCustomersConvert.show()



