Majid Fatemian is a senior software/data engineer at Red Ventures. With more than a decade of experience, he has contributed to the architecture, implementation, and performance optimization of high-traffic online services in distributed environments.
You can find him on Twitter at @majidfn.
English session - Beginner
The volume of data keeps growing, and traditional storage and query tools cannot keep up. Distributed technologies like Hadoop and MapReduce have made it possible to use commodity hardware in distributed environments to achieve faster and more reliable performance. Since its inception, Apache Spark, a distributed processing engine, has simplified distributed computing in many ways. In this talk we will cover
English session - Intermediate
Before doing any data science, machine learning, or AI, you need to get your data right. As the volume of data grows, keeping a data pipeline reliable, available, and scalable becomes a challenge.
In this talk we will share what we have learned from running a data pipeline on AWS infrastructure using technologies like Apache Spark, gRPC, and Protocol Buffers.