7

I am trying to fully utilize kryo serialization for spark. Setting

.set("spark.kryo.registrationRequired", "true")

This will let me know which classes need to be registered. I have registered about 40 classes, some of my classes and some of spark's classes. I followed Require kryo serialization in Spark (Scala) post to register/set everything up.

I am now running into the following and cannot figure out how to register it in scala. Has anyone solved this issue?

I have tried a bunch of different combinations including:

kryo.register(classOf[Array[Array[Byte]]])
conf.set("classesToRegister", "classOf[Array[Array[Byte]]]")
conf.registerKryoClasses(Array(classOf[Array[Array[Byte]]]))

I found an unanswered post https://mail-archives.apache.org/mod_mbox/spark-user/201603.mbox/%3CCAHCfvsSyUpx78ZFS_A9ycxvtO1=Jp7DfCCAeJKHyHZ1sugqHEQ@mail.gmail.com%3E stating the same problem.

java.lang.RuntimeException: com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Class is not registered: byte[][]
Note: To register this class use: kryo.register(byte[][].class);
Serialization trace:
buffers (org.apache.spark.sql.columnar.CachedBatch)
at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:585)
at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:568)
at org.apache.spark.serializer.KryoSerializationStream.writeObject(KryoSerializer.scala:158)
at org.apache.spark.serializer.SerializationStream.writeAll(Serializer.scala:153)
at org.apache.spark.storage.BlockManager.dataSerializeStream(BlockManager.scala:1190)
at org.apache.spark.storage.BlockManager.dataSerialize(BlockManager.scala:1199)
at org.apache.spark.storage.MemoryStore.getBytes(MemoryStore.scala:191)
at org.apache.spark.storage.BlockManager.doGetLocal(BlockManager.scala:480)
at org.apache.spark.storage.BlockManager.getBlockData(BlockManager.scala:302)
at org.apache.spark.network.netty.NettyBlockRpcServer$$anonfun$2.apply(NettyBlockRpcServer.scala:57)
at org.apache.spark.network.netty.NettyBlockRpcServer$$anonfun$2.apply(NettyBlockRpcServer.scala:57)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:57)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:114)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:87)
at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:101)
at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Community
  • 1
  • 1
John Engelhart
  • 279
  • 4
  • 13

2 Answers2

18
conf.registerKryoClasses(Array( Class.forName("[[B"))) 

should work

Yuval Itzchakov
  • 146,575
  • 32
  • 257
  • 321
Harel Gliksman
  • 734
  • 1
  • 7
  • 19
  • 1
    This is helpful. I had the same problem and this helped me fix it. However, I don't understand how [[B corresponds to byte[][]. I know this is an old post, but if someone could provide a deeper explanation it would be helpful. – Tim Ryan Sep 29 '16 at 19:29
  • 2
    @TimRyan See https://docs.oracle.com/javase/6/docs/api/java/lang/Class.html#getName%28%29 – Reinstate Monica Nov 03 '16 at 19:29
0

I know that extremely old question but for those who might look for a solution for any array of class:

case class Person(name: String)

conf.registerKryoClasses(Array(
   classOf[Array[Array[Person]]]
))

Or in case of this specific question:

conf.registerKryoClasses(Array(
   classOf[Array[Array[Byte]]]
))
Noam Shaish
  • 1,613
  • 2
  • 16
  • 37
  • it didn't work @Noam Shaish, i have `val personList: Array[Person] = (1 to 100000).map(value=> Person(value+"",value)).toArray`, and `.registerKryoClasses(Array( classOf[Array[Array[Person]]] ))` didn't worked, it says ` Caused by: java.lang.IllegalArgumentException: Class is not registered: KyroExample$Person Note: To register this class use: kryo.register(KyroExample$Person.class);` . I have the full question if you can have a look https://stackoverflow.com/questions/59099104/kryo-serialization-not-registering-even-after-registering-the-class-in-conf – supernatural Nov 29 '19 at 06:10