Scala with Data Science

A Thorough Introduction

The Scala with Data Science path is designed to give experienced data developers and data scientists the know-how to confidently start programming in Scala for data science tasks.

The courses give students a solid understanding of the fundamentals of the language, the tooling, and the development process, as well as a good appreciation of the more advanced features. Students who already have Scala programming experience may find these courses useful refreshers, but no previous knowledge of Scala is assumed.

Featured Courses

Introduction to Scala

Summary

This introduction to Scala course was created by Typesafe as part of our Data Science learning path. It is designed to give experienced data developers and data scientists the know-how to confidently start programming in Scala for data science tasks. The course gives students a solid understanding of the fundamentals of the language, the tooling, and the development process, as well as a good appreciation of the more advanced features. Students who already have Scala programming experience may find this course a useful refresher, but no previous knowledge of Scala is assumed.

Prerequisites

Prerequisites for students taking this Scala course:

1. Experience with Java (preferred), Python, or another object-oriented language

2. No previous Scala knowledge is required

3. No previous experience with data science concepts is required; these concepts will be explained as needed

Objectives of this Scala learning path

1. Become a competent user of Scala

2. Know and be able to apply the functional programming style in Scala

3. Know how to use fundamental Scala tools

4. Become confident enough to start using Scala in production environments

Spark Overview for Scala Analytics

The “Spark Overview for Scala Analytics” course covers the history of Spark, how to build applications with Spark, the fundamentals of RDDs and DataFrames, and other advanced Spark topics. Apache Spark™ is a fast and general engine for large-scale data processing, with built-in modules for streaming, SQL, machine learning, and graph processing. After finishing this class, students will be prepared to use the core RDD and DataFrame APIs to perform analytics on datasets.
This course is meant to be an overview of Spark and its associated ecosystem. For a deeper understanding of Spark, we recommend that students take the Spark Fundamentals courses I and II.

Scala for Data Science

Apache Spark™ is a fast and general engine for large-scale data processing, with built-in modules for streaming, SQL, machine learning and graph processing. This course shows how to use Spark’s machine learning pipelines to fit models and search for optimal hyperparameters using a Spark cluster.

Big Data University also offers a vast number of courses on various other analytics, big data, and data science topics. View our complete course catalog.

What is Big Data University?

An IBM community initiative, Big Data University provides education on big data, data science, and analytic technologies, taught by experts through hands-on exercises and interactive videos. Best of all, it’s completely free.