From the course: Amazon Web Services: Data Services

Understand Hadoop jobs and libraries

- [Instructor] As we think more about Hadoop, a core concept is something called a job, which is a processing task that runs on top of the underlying file storage. A Hadoop job comes with tools to monitor job execution overhead and console-based tools for MapReduce tasks, and the EMR implementation in Amazon includes alarms and logs. So, EMR is a partially managed implementation of the Hadoop ecosystem, and it's conceptually similar to some of the other partially managed data solutions that we've looked at in this course, such as RDS for relational data and DynamoDB for NoSQL. In other words, you are paying Amazon to do some of the management here. Now, that being said, as mentioned, you could use a plain vanilla implementation, and then you would probably use some vendor's tools to get the alarms, logs, and so on. I generally tend to use EMR because it's the simplest to set up and monitor,…
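
To make the idea of a job as a processing task concrete, here is a minimal sketch of the map, shuffle, and reduce phases of a word count written in plain Python. It is only an illustration under stated assumptions: it does not use Hadoop or EMR APIs, the function names (map_phase, shuffle_phase, reduce_phase, run_job) are hypothetical, and a list of strings stands in for files on the underlying storage.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run the whole hypothetical job in memory over an iterable of lines.

    A real Hadoop job would read its input from distributed storage and
    spread the map and reduce work across many nodes; this sketch does
    everything in a single process.
    """
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Stand-in for input files on the underlying storage.
    sample = ["hadoop runs jobs", "jobs run on hadoop storage"]
    print(run_job(sample))
```

The monitoring, alarms, and logs mentioned in the transcript sit around a job like this one; they are not modeled in the sketch.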
