From the course: Apache Spark Deep Learning Essential Training


Apache Spark ecosystem


- [Instructor] If you head over to the Apache Spark website, you can see some of the reasons that people are using it. It's fast, it's easy to develop in because it has several language APIs, meaning that you can access it using Scala, Java, Python, R, or SQL, and it has its own ecosystem. Just so you know, Apache Spark is written in Scala. The developers of Apache Spark wanted it to be fast to work on their large datasets, but when they were working on their big data projects, many of the scripting languages didn't fit the bill. Because Spark runs on a Java virtual machine, or JVM, it lets you call into other Java-based data storage systems, such as Cassandra, HDFS, and HBase. Spark runs locally as well as in clusters, on-premises or in the cloud. It runs on top of Hadoop YARN, Apache Mesos, or in standalone mode. Although Spark is designed to work on large distributed clusters, you can also use it on a single machine in local mode. The means of accessing these different modules is via the Spark Core…
