1. Which of the following partitions the key space ?
Correct Answer is : Partitioner
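Note on question 1: the idea of partitioning the key space can be sketched in a few lines. This is a plain-Python illustration, not Hadoop's API; the function name `partition_for` and the choice of CRC32 as the hash are assumptions made for the example. The point is only that each key maps deterministically to one of the reduce partitions, so equal keys always reach the same reducer.

```python
import zlib


def partition_for(key: str, num_reduce_tasks: int) -> int:
    """Map a key to a partition index in [0, num_reduce_tasks)."""
    if num_reduce_tasks <= 0:
        raise ValueError("num_reduce_tasks must be positive")
    # CRC32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % num_reduce_tasks


keys = ["apple", "banana", "cherry", "apple"]
partitions = [partition_for(k, 4) for k in keys]
```

The two occurrences of "apple" land in the same partition, which is the property a reducer relies on.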
3. ____________ is a generalization of the facility provided by the MapReduce framework to collect data output by the Mapper or the Reducer.
Correct Answer is : OutputCollector
3. Point out the wrong statement :
Correct Answer is : None of the mentioned
4. __________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution.
Correct Answer is : JobConf
5. The ___________ executes the Mapper/Reducer task as a child process in a separate JVM.
Correct Answer is : TaskTracker
6. Maximum virtual memory of the launched child-task is specified using :
Correct Answer is : mapred
7. Which of the following parameters is the threshold for the accounting and serialization buffers ?
Correct Answer is : io.sort.spill.percent
8. ______________ is the percentage of memory relative to the maximum heap size in which map outputs may be retained during the reduce.
Correct Answer is : mapred.job.reduce.input.buffer.percent
9. ____________ specifies the number of segments on disk to be merged at the same time.
Correct Answer is : io.sort.factor
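Note on question 9: the effect of the merge factor can be estimated with a small sketch. It assumes a simplified model in which every round merges groups of up to `merge_factor` segments into one segment each; the real framework schedules its merges differently, so treat the result as a rough estimate. The function name `merge_rounds` is hypothetical.

```python
def merge_rounds(num_segments: int, merge_factor: int) -> int:
    """Rounds needed to reduce num_segments to one under the simplified model."""
    if num_segments < 1:
        raise ValueError("num_segments must be at least 1")
    if merge_factor < 2:
        raise ValueError("merge_factor must be at least 2")
    rounds = 0
    remaining = num_segments
    while remaining > 1:
        # Each round turns every group of up to merge_factor segments into one.
        remaining = (remaining + merge_factor - 1) // merge_factor
        rounds += 1
    return rounds


rounds_for_40_segments = merge_rounds(40, 10)  # 40 -> 4 -> 1
```

Under this model, a larger merge factor means fewer rounds over the data, at the cost of more segments open at the same time.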
10. Point out the correct statement :
Correct Answer is : The number of sorted map outputs fetched into memory before being merged to disk
11. Map outputs larger than ___ percent of the memory allocated to copying map outputs are written directly to disk without first staging through memory.
Correct Answer is : 25
12. Jobs can enable task JVMs to be reused by specifying the job configuration :
Correct Answer is : mapred.job.reuse.jvm.num.tasks
13. Point out the wrong statement :
Correct Answer is : None of the mentioned
14. During the execution of a streaming job, the names of the _______ parameters are transformed.
Correct Answer is : mapred
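Note on question 14: in a streaming job the dots in these parameter names become underscores, so mapred.job.id is exposed to the mapper or reducer script as mapred_job_id. A minimal sketch of a script looking up such a value follows; the helper names `streaming_env_name` and `read_job_param` are hypothetical, and the sketch assumes the values are read from the process environment.

```python
import os


def streaming_env_name(param: str) -> str:
    """Convert a dotted parameter name to its underscore form."""
    return param.replace(".", "_")


def read_job_param(param: str, default=None):
    """Look up a job parameter from the environment of a streaming task."""
    return os.environ.get(streaming_env_name(param), default)


job_id_variable = streaming_env_name("mapred.job.id")
```

Outside a running streaming task the variable is not set, so `read_job_param` returns the supplied default.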
15. The standard output (stdout) and error (stderr) streams of the task are read by the TaskTracker and logged to :
Correct Answer is : ${HADOOP_LOG_DIR}/userlogs
16. ____________ is the primary interface by which user-job interacts with the JobTracker.
Correct Answer is : JobClient
17. The _____________ can also be used to distribute both jars and native libraries for use in the map and/or reduce tasks.
Correct Answer is : DistributedCache
18. __________ is used to filter log files from the output directory listing.
Correct Answer is : OutputLogFilter
19. The split size is normally the size of an ________ block, which is appropriate for most applications.
Correct Answer is : HDFS
20. Point out the correct statement :
Correct Answer is : The minimum split size is usually 1 byte, although some formats have a lower bound on the split size
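Note on questions 19 and 20: the relationship between block size, minimum split size and maximum split size can be written as a single expression, max(min_size, min(max_size, block_size)). The Python sketch below restates that rule; the function name `compute_split_size` and its default arguments are assumptions made for illustration, with the minimum defaulting to 1 byte as in question 20.

```python
MB = 1024 * 1024


def compute_split_size(block_size: int, min_size: int = 1,
                       max_size: int = 2**63 - 1) -> int:
    """Clamp the block size between the minimum and maximum split sizes."""
    return max(min_size, min(max_size, block_size))


# With default bounds the split size equals the block size.
default_split = compute_split_size(64 * MB)
# Raising the minimum above the block size forces larger splits.
larger_split = compute_split_size(64 * MB, min_size=128 * MB)
# Lowering the maximum below the block size forces smaller splits.
smaller_split = compute_split_size(64 * MB, max_size=32 * MB)
```

With the default bounds the result is simply the block size, which is why the block size is the normal split size.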