
Exporting a Spark DataFrame to CSV


The steps below assume Jupyter with the sparkmagic kernel.

%%configure -f
{"jars": ["/user/olaf.kido/spark-csv_2.10-1.5.0.jar", "/user/olaf.kido/commons-csv-1.6.jar"]}

# On Spark 1.6
from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)
df = sqlContext \
.sql("select * from source_table") \
.coalesce(1)

# Save to HDFS (Spark creates a directory at this path holding the part file)
df\
.write\
.mode("overwrite")\
.format("com.databricks.spark.csv")\
.save("/user/kidokim509/data.csv")

# With PySpark, if pandas is installed on the DN (toPandas() collects all rows into driver memory)
df.\
toPandas().\
to_csv("data.csv")
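
On Spark 2.0 or later the CSV source is built in, so the extra spark-csv jars should not be needed. The following is a minimal sketch under that assumption. The helper name `write_single_csv` and the `header` option are illustrative additions, not part of the original setup.

```python
def write_single_csv(df, path, header=True):
    """Write df as a single CSV part file under the directory `path`.

    Hypothetical helper; assumes Spark 2.0+ where DataFrameWriter.csv exists.
    """
    (
        df.coalesce(1)                           # one partition -> one part file
        .write
        .mode("overwrite")                       # replace existing output at `path`
        .option("header", str(header).lower())   # "true" or "false"
        .csv(path)
    )

# Usage inside a Spark 2 session (assumes a `spark` session object is available):
# write_single_csv(spark.sql("select * from source_table"), "/user/kidokim509/data.csv")
```

As with the spark-csv version, the target path ends up as a directory containing one part file rather than a single flat file.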
This post is licensed under CC BY 4.0 by the author.