
I am trying to run a Spark Scala application from the head node of an Azure HDInsight cluster with the command

spark-submit --class com.test.spark.Wordcount SparkJob1.jar wasbs://containername@<storageaccountname>/sample.sas7bdat wasbs://containername@<storageaccountname>/sample.csv

I am getting the below exception:

Caused by: java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD

The same jar file works if I invoke it from Azure Data Factory. Am I missing some configuration with the spark-submit command?

vidyak

1 Answer


Normally this is caused by a type-conversion issue in your code, or by a mismatch between the Spark/Scala version your jar was built against and the version running on the cluster. There is a similar SO thread, How to fix java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List to field type scala.collection.Seq?, which has been answered; you can refer to it and check your code and build configuration to resolve the issue.
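One common trigger for this particular `ClassCastException` is packaging Spark itself into the application jar, so that the driver and executors deserialize RDD internals with conflicting class definitions. A sketch of how the sbt build might declare Spark as `provided` instead, so the cluster's own Spark libraries are used at runtime (the version numbers here are placeholders; match them to your HDInsight cluster's Spark and Scala versions):

```scala
// build.sbt -- hypothetical versions; align with the cluster's Spark/Scala runtime
name := "SparkJob1"

scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // "provided" keeps Spark out of the assembled jar, avoiding
  // classloader conflicts with the cluster's Spark installation
  "org.apache.spark" %% "spark-core" % "2.3.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.3.0" % "provided"
)
```

If the jar was assembled with Spark bundled in, rebuilding it with `provided` scope and resubmitting via spark-submit is worth trying, since Azure Data Factory may be launching the job against a different classpath than your manual spark-submit does.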

Peter Pan