
Description

This Spark training course is designed to help you acquire the skills needed to process and analyze big data with the Apache Spark framework. Over three days, you will learn how to use Spark to develop high-performance processing applications and how to integrate Spark into a Hadoop environment.

Materials

  • Documentation and online resources
  • Code examples and use cases
  • Technical support and hands-on training
     

Day 1: Introduction to Spark

  • Introduction to Spark and its role in big data processing
  • How Spark works and its main components (Spark Core, Spark SQL, Spark Streaming, Spark ML)
  • Using Spark Shell to prototype operations
  • Setting up a cluster architecture and hosting options

Day 2: Using Spark for data processing

  • Using RDD (Resilient Distributed Datasets) to process data
  • Managing RDD and MapReduce operations
  • Using Spark SQL to analyze relational data
  • Using Spark Streaming for real-time data processing

Day 3: Application development with Spark

  • Designing an application using Spark
  • Application configuration and compilation
  • Using Spark ML for machine learning
  • Optimizing performance with Spark
Audience

  • Developers
  • Architects
  • System administrators
  • DevOps

Prerequisites

  • Basic knowledge of a Unix system
  • Knowledge of Scala or Python
  • A background in statistics

Objectives

  • Understand the basic principles of Spark and its components
  • Learn how to use Spark for big data processing
  • Develop high-performance processing applications with Spark
  • Integrate Spark into a Hadoop environment

We design, build and support digital products for clients who want to make a positive impact in their industry. By combining creativity with technology, we develop solutions that help our clients grow, and we strengthen those relationships through continuous improvement, maintenance, support and hosting services.
