Apache Spark is an open source processing engine built around speed, ease of use, and analytics. If you have large amounts of data that require low-latency processing that a typical MapReduce program cannot provide, Spark is the alternative. Spark can run up to 100 times faster than MapReduce for iterative algorithms or interactive data mining. It provides in-memory cluster computing for lightning-fast speed and supports Java, Scala, and Python APIs for ease of development.
Spark combines SQL, streaming, and complex analytics seamlessly in the same application to handle a wide range of data processing scenarios. Spark runs on Hadoop, on Mesos, in standalone mode, or in the cloud. It can access diverse data sources such as HDFS, Cassandra, HBase, or S3.
Big Data University has been chosen by IBM as one of the issuers of badges as part of the IBM Open Badge program. Share your achievements through LinkedIn, Facebook, Twitter, and more!
Big Data University uses the services of Pearson VUE Acclaim to assist in the administration of the IBM Open Badge program. By enrolling in this course, you agree to Big Data University sharing your details with Pearson VUE Acclaim strictly for the purpose of issuing your badge upon completion of the badge criteria.
This is the second course in the Big Data University Spark curriculum, and it expands on concepts discussed in Spark Fundamentals. This course covers Spark’s architecture and goes in depth on how data is distributed and tasks are parallelized. Students will gain a better understanding of how to optimize their data for joins, use Spark’s memory caching, and apply the more advanced operations available in the API.
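To give a flavor of why data distribution matters for joins, here is a minimal conceptual sketch in plain Python. It is not Spark code and does not use Spark's API; the function names (`hash_partition`, `join_partition`, `partitioned_join`) and the sample data are hypothetical, chosen only for illustration. The idea it shows is that when two keyed datasets are split with the same hash function and the same number of partitions, matching keys land in the same partition index, so each pair of partitions can be joined independently.

```python
from collections import defaultdict


def hash_partition(records, num_partitions):
    """Split (key, value) pairs into partitions by the hash of the key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def join_partition(left, right):
    """Inner-join two partitions that hold the same range of keys."""
    index = defaultdict(list)
    for key, value in left:
        index[key].append(value)
    return [(key, (lv, rv)) for key, rv in right for lv in index.get(key, [])]


def partitioned_join(left_records, right_records, num_partitions=4):
    """Join two datasets partition by partition.

    Both sides use the same partitioning, so no record has to be
    compared against a record from a different partition index.
    """
    left_parts = hash_partition(left_records, num_partitions)
    right_parts = hash_partition(right_records, num_partitions)
    joined = []
    for left, right in zip(left_parts, right_parts):
        joined.extend(join_partition(left, right))
    return joined


# Hypothetical sample data: (user_id, name) and (user_id, item).
users = [(1, "ana"), (2, "bo"), (3, "cy")]
orders = [(1, "book"), (3, "pen"), (3, "ink"), (4, "cup")]
result = sorted(partitioned_join(users, orders))
```

In this sketch each loop iteration in `partitioned_join` touches only one pair of partitions, which is what would allow the work to be spread across separate workers in a distributed setting.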
This course was developed with the support of:
MetiStream, Inc. (metistream.com), experts in Apache Spark implementations and training
IBM Analytics (ibm.com/analytics) helps you make better decisions by gleaning new insights from the volume and variety of big data.
An IBM community initiative, Big Data University is the world’s best education on big data. Learn about big data, data science and analytic technologies from experts using hands-on exercises and interactive videos. Best of all, it’s completely free.