Don't read,understand it: coalesce() vs repartition()

Thursday, 25 April 2019

coalesce() vs repartition()

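Both coalesce() and repartition() are used to change the number of partitions of an RDD. repartition() always does a full shuffle; to avoid a full shuffle when reducing the number of partitions, we can use coalesce().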

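// create an RDD with 15 partitions
val rdd1 = sc.parallelize(1 to 1000, 15)
rdd1.partitions.length

// reduce to 5 partitions; the second argument (shuffle) is false, so no full shuffle
val rdd2 = rdd1.coalesce(5, false)
rdd2.partitions.length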

output
=====
Int = 15
Int = 5

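For example, if we are going from 100 partitions down to 10, repartition() shuffles all the data across the cluster to build the 10 new partitions. coalesce() achieves the same partition count by merging existing partitions, so the 10 resulting partitions claim the resources that are already available and most of the data stays where it is.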
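To see the two calls side by side, here is a minimal sketch, assuming Spark is on the classpath and a local master is fine for the experiment. The object name `CoalesceVsRepartition`, the method `run`, and the `local[2]` master are my own choices for illustration, not something from the post above. It only compares partition counts; it does not measure the shuffle itself.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical helper object, for illustration only.
object CoalesceVsRepartition {

  // Returns the partition counts observed for each variant, in order.
  def run(): Seq[Int] = {
    // Assumption: a local master with 2 threads is enough for this demo.
    val conf = new SparkConf()
      .setAppName("coalesce-vs-repartition")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd1 = sc.parallelize(1 to 1000, 15)

      // coalesce with shuffle = false merges existing partitions (no full shuffle).
      val merged = rdd1.coalesce(5, shuffle = false)

      // repartition always shuffles; it is coalesce(n, shuffle = true) underneath.
      val reshuffled = rdd1.repartition(5)

      // Without a shuffle, coalesce cannot increase the number of partitions.
      val notGrown = rdd1.coalesce(30, shuffle = false)

      // repartition can increase the number of partitions, at the cost of a shuffle.
      val grown = rdd1.repartition(30)

      Seq(rdd1, merged, reshuffled, notGrown, grown).map(_.partitions.length)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit =
    println(run().mkString(", "))
}
```

Running `main` should print `15, 5, 5, 15, 30`. Note the fourth value: asking coalesce() for 30 partitions without a shuffle leaves the RDD at 15, which is why repartition() is the one to use when the partition count has to go up.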
