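How to fix AWS logs not being written to CloudWatch after certain steps
I have an AWS job that reads CSV files in PySpark and writes its logs to AWS CloudWatch. The logs are written during the initial steps, but after a certain step nothing more is logged.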
for i,key in enumerate(keys):
    source_path = "s3://"+source_bucket+"/"+key
    if conf_format =='csv' or conf_format == 'text':
        #source_df = spark.read.option("header",conf_header).option("inferSchema","true").option("delimiter",conf_delimiter).csv(source_path)
        #source_df = spark.read.option("header",conf_delimiter).csv(source_path)
        validated_df_2=schema_validation(source_path)
        source_df_2=validated_df_2.filter(validated_df_2.valid_rec == "Y")
        print("printing source_df")
        source_df=source_df_2.drop(source_df_2.valid_rec)
        print("printing source_df after drop column")
        source_df.printSchema()
        source_df.show(5)
    elif conf_format =='json':
        source_df = spark.read.option("multiline","true").json(source_path)
    elif conf_format =='avro':
        source_df = spark.read.format("com.databricks.spark.avro").load(source_path)
    if i==0:
        target_df = source_df
    else:
        target_df = target_df.union(source_df)
ct_before = target_df.count()
#remove rows in which every column is null (DataFrames are immutable, so keep the result)
target_df = target_df.na.drop("all")
#Convert column names to lower case
lower_df = target_df.toDF(*[c.lower() for c in target_df.columns])
#Convert slash into hyphen in column name
col_df = lower_df.toDF(*list(map(lambda col : col if '/' not in col else col[1:].replace('/','-'),lower_df.columns)))
#Remove spaces from column names
final_df = col_df.toDF(*(c.replace(' ','') for c in col_df.columns))
#remove duplicates
col = final_df.columns
col1 = final_df.columns[0]
print(col)
print(col1)
win = Window.partitionBy(final_df.columns).orderBy(col1)
df_with_rn = final_df.withColumn("row_num",f.row_number().over(win))
df_with_rn.createOrReplaceTempView("t_stage")
deduped_df = spark.sql(""" select * from t_stage where row_num = 1
""")
delta_df = deduped_df.drop(deduped_df.row_num)
print("show delta df schema and data")
delta_df.printSchema()  # output up to this line appears in CloudWatch; after this there are no logs and the job runs forever
delta_df.show(5)
# have more than 1000 lines
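One way to narrow this down: `printSchema()` only inspects the query plan and does not run a Spark job, whereas `show(5)` is an action that has to execute the whole lineage (the union, the window and the dedup query). So the job may still be working on `show(5)` rather than having lost its logs. Below is a minimal sketch of a step logger that prints a flushed line before and after each action, so the last line in CloudWatch shows which step is still running. `logged_step` is a hypothetical helper name, not part of the original job, and it assumes that the job's stdout is what ends up in CloudWatch.

```python
import sys
import time
from contextlib import contextmanager


@contextmanager
def logged_step(name, stream=sys.stdout):
    """Print a flushed START/END line around a block of code.

    Hypothetical helper: if START appears in the log without a matching
    END, the wrapped step is still running (or failed before finishing).
    """
    start = time.monotonic()
    # flush=True pushes the line out immediately instead of leaving it in a buffer
    print(f"START {name}", file=stream, flush=True)
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        print(f"END {name} ({elapsed:.1f}s)", file=stream, flush=True)
```

In the job it would wrap each action, for example `with logged_step("delta_df.show"): delta_df.show(5)`. If `START delta_df.show` is the last line logged, the time is going into computing the window over every column rather than into logging.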
Any help is appreciated.