Practical Big Data with Spark workshop

by Vladimir Bacvanski

From Wednesday, May 17, 2017 to Friday, May 19, 2017

Price: 1,700.00 Euro

Residenza di Ripetta
Via di Ripetta 231
00186 Roma (RM)

Description

In this workshop, you will learn how to tame Big Data with Apache Spark. Spark is the fastest-growing Big Data system and provides a solid foundation for processing large volumes of data. We introduce the key concepts of Spark, its architecture, and its development model. We show how to use the several APIs provided by Spark to ingest data, process it in parallel on a cluster, and control the various aspects of its execution. This is a practical, hands-on course with numerous exercises. Expect to spend about 50% of your time developing. If you are a bit rusty when it comes to programming, don’t worry: we provide full guidance and worked-out solutions. Background: this course and its predecessors have been used to train Big Data professionals in leading financial technology organizations worldwide and have been delivered as tutorials at top industry conferences.

What you will learn

Upon completion, students will be able to:

  • Identify key parts of Spark architecture and their roles
  • Ingest data and process it in parallel
  • Develop batch and streaming applications for Spark
  • Work with SQL for Spark
  • Understand the key concepts behind the Machine Learning and GraphX libraries in Spark

Main Topics

  • Spark Architecture
  • Scala for Spark: a Crash Course
  • Developing with RDDs
  • Developing Spark Applications with Scala APIs
  • Test-Driven Development (TDD) with Spark
  • Spark Streaming
  • Spark SQL and DataFrames
  • Overview of MLlib, GraphX, and other Spark APIs