How to dynamically add zero padding
mock_data = [('TYCO',' 1303','13'),('EMC',' 120989 ','123'),('VOLVO ','102329 ','1234'),('BMW','1301571345 ',' '),('FORD','004','21212')]
df = spark.createDataFrame(mock_data,['col1','col2','col3'])
+-------+------------+-----+
|   col1|        col2| col3|
+-------+------------+-----+
|   TYCO|        1303|   13|
|    EMC|     120989 |  123|
| VOLVO |     102329 | 1234|
|    BMW| 1301571345 |     |
|   FORD|         004|21212|
+-------+------------+-----+
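I need to trim col2 and then, based on its length (10 minus the length of the trimmed col2), dynamically pad col3 with leading zeros. Finally, col2 and col3 should be concatenated.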
from pyspark.sql.functions import length, trim
df2 = df.withColumn('length_col2', 10 - length(trim(df.col2)))
+-------+------------+-----+-----------+
|   col1|        col2| col3|length_col2|
+-------+------------+-----+-----------+
|   TYCO|        1303|   13|          6|
|    EMC|     120989 |  123|          4|
| VOLVO |     102329 | 1234|          4|
|    BMW| 1301571345 |     |          0|
|   FORD|         004|21212|          7|
+-------+------------+-----+-----------+
Expected output
+-------+------------+-----+----------+
|   col1|        col2| col3|    output|
+-------+------------+-----+----------+
|   TYCO|        1303|   13|1303000013|
|    EMC|     120989 |  123|1209890123|
| VOLVO |     102329 | 1234|1023291234|
|    BMW| 1301571345 |     |1301571345|
|   FORD|         004|21212|0040021212|
+-------+------------+-----+----------+
Solution
What you are looking for is the rpad function from pyspark.sql.functions, as documented here => https://spark.apache.org/docs/2.3.0/api/sql/index.html
See the solution below:
%pyspark
mock_data = [('TYCO',' 1303','13'),('EMC',' 120989 ','123'),('VOLVO ','102329 ','1234'),('BMW','1301571345 ',' '),('FORD','004','21212')]
df = spark.createDataFrame(mock_data,['col1','col2','col3'])
df.createOrReplaceTempView("input_df")
spark.sql("SELECT *,concat(rpad(trim(col2),10,'0'),col3) as OUTPUT from input_df").show(20,False)
And the result:
+-------+------------+-----+---------------+
|col1   |col2        |col3 |OUTPUT         |
+-------+------------+-----+---------------+
|TYCO   | 1303       |13   |130300000013   |
|EMC    | 120989     |123  |1209890000123  |
|VOLVO  |102329      |1234 |10232900001234 |
|BMW    |1301571345  |     |1301571345     |
|FORD   |004         |21212|004000000021212|
+-------+------------+-----+---------------+
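Note that this query right-pads col2 to 10 characters before appending col3, so OUTPUT comes out longer than the 10 characters shown in the expected output. A minimal sketch that should match the expected output, assuming the zeros belong in front of col3 and that the pad length is computed inside a SQL expression (so it can vary per row), might look like this. The names PAD_EXPR, with_output and pad_rule are made up for illustration; pad_rule is only a plain-Python mirror of the rule for checking single values without Spark.

```python
# Assumption: the zeros go between col2 and col3, i.e. col3 is left-padded
# to whatever width remains after the trimmed col2 (10 - length).
PAD_EXPR = "concat(trim(col2), lpad(trim(col3), 10 - length(trim(col2)), '0'))"


def with_output(df):
    """Add an 'output' column to a DataFrame with string columns col2 and col3."""
    # Imported here so the rest of this snippet can be used without Spark installed.
    from pyspark.sql.functions import expr
    return df.withColumn("output", expr(PAD_EXPR))


def pad_rule(col2, col3, width=10):
    """Plain-Python mirror of PAD_EXPR for the sample values (hypothetical helper)."""
    left = col2.strip(" ")    # trim(col2)
    right = col3.strip(" ")   # trim(col3)
    room = width - len(left)  # 10 - length(trim(col2))
    # Left-pad with zeros up to `room` characters; cut to `room` if already longer.
    return left + right.rjust(room, "0")[:max(room, 0)]


if __name__ == "__main__":
    mock_data = [('TYCO', ' 1303', '13'), ('EMC', ' 120989 ', '123'),
                 ('VOLVO ', '102329 ', '1234'), ('BMW', '1301571345 ', ' '),
                 ('FORD', '004', '21212')]
    for _, c2, c3 in mock_data:
        print(pad_rule(c2, c3))
```

With a Spark session available, with_output(df).show(20, False) would apply the same expression to the DataFrame from the question. Rows where the trimmed col2 is already longer than 10 characters are not covered by this sketch.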