Majid Fatemian is a senior software/data engineer at Red Ventures. With more than a decade of experience, he has contributed to the architecture, implementation, and performance optimization of high-traffic online services in distributed environments.
You can find him on Twitter at @majidfn.
Session in English - Beginner
The volume of data keeps growing, and traditional storage and tools cannot keep up. Distributed technologies make it possible to combine commodity hardware in distributed environments to achieve faster and more reliable performance. Apache Spark, a distributed processing engine, has simplified distributed computing in many ways. In this talk we will cover the essentials of Apache Spark and how it is used.
Session in English - Intermediate
Before doing any data science, machine learning or AI, you need to get your data right. As the volume of data grows, having a reliable, available and scalable data pipeline becomes a challenge.
In this talk we will share what we learned from running a data pipeline on AWS infrastructure using technologies such as Apache Spark, gRPC, and Protocol Buffers.