Some Thoughts about Variance in Scala
Some time ago (so long ago!), a then-fledgling Scala developer asked me to explain how covariance, contravariance, and invariance work in Scala. Like me, he came from Java, so grasping all the technicalities of variance in Scala wasn’t straightforward for him (just like it...
Apache Spark - Does Shuffling Occur When Count Actions are Performed?
While Apache Spark’s level of abstraction eases the development of jobs running on distributed data, it’s not always easy to figure out how to optimize them, or how to avoid common pitfalls. A well-known source of performance issues is shuffling. Shuffling is a process of data redistribution across partitions; when...
The Shaded Documentation of Maven Shade Plugin
Shit happens. Especially when dealing with many dependencies in a complex system, shit happens. As Java developers, we usually don’t deal with all the intricacies of versioning, because deprecated methods and classes are simply annotated as @Deprecated but kept for backward compatibility. Yet, there...
How To Delete Spark Job Server's Temporary Files
My colleagues and I are working on a system that should interoperate with Apache Spark. The idea behind this interoperability is to send Spark Job Server a JAR package containing classes that, whenever invoked from the system, should deal with Spark. The issue is, it sometimes happens that a new...
Why Am I Looping Through Options??
For those like me who started with imperative programming, functional programming may seem like a minefield, especially when familiar concepts gain different meanings. Recently, I had the opportunity to learn a couple of things about Scala, and I fell in love with Options: they deal with the infamous billion-dollar mistake...