Scala for Data Science

Take our free course

with Dr Priya Dev

Audience:
Anyone interested in Scala for Data Science

Time to complete:
6 - 8 hours

Available in:
English

Apache Spark™ is a fast and general engine for large-scale data processing, with built-in modules for streaming, SQL, machine learning and graph processing. This course shows how to use Spark’s machine learning pipelines to fit models and search for optimal hyperparameters using a Spark cluster.

Course Syllabus

This course has 5 modules.

  1. Module 1 - Basic statistics and data types
  2. Module 2 - Preparing data
  3. Module 3 - Feature engineering
  4. Module 4 - Fitting a model
  5. Module 5 - Pipelines and grid search

Pre-requisites

  1. Completion of the Introduction to Scala and Spark Overview for Scala Analytics courses
  2. Experience with Java (preferred), Python, or another object-oriented language
  3. A general understanding of machine learning