Hadoop interview questions part 35

Take as many assessments as you can to improve and validate your skill rating.

Total Questions: 20

1. Which of the following partitions the key space ?

Correct Answer is : Partitioner
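
As a rough illustration of what "partitioning the key space" means, the sketch below mimics the hash-modulo scheme used by Hadoop's default HashPartitioner. It is written in Python rather than Java, the function name get_partition is hypothetical, and zlib.crc32 is only a stand-in for Java's hashCode(), so the partition numbers will not match those of a real cluster.

```python
import zlib


def get_partition(key: str, num_reduce_tasks: int) -> int:
    """Return the reduce-task index for a key (hash-modulo sketch)."""
    # Assumption: crc32 stands in for Java's key.hashCode(). It is stable
    # across runs, unlike Python's built-in hash() for strings.
    h = zlib.crc32(key.encode("utf-8"))
    # Mask to a non-negative 31-bit value (the Java code masks with
    # Integer.MAX_VALUE), then take the remainder.
    return (h & 0x7FFFFFFF) % num_reduce_tasks
```

Every record with the same key lands in the same partition, which is what lets a single reducer see all values for that key.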

2. ____________ is a generalization of the facility provided by the MapReduce framework to collect data output by the Mapper or the Reducer

Correct Answer is : OutputCollector

3. Point out the wrong statement :

Correct Answer is : None of the mentioned

4. __________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution.

Correct Answer is : JobConf

5. The ___________ executes the Mapper/Reducer task as a child process in a separate JVM.

Correct Answer is : TaskTracker

6. Maximum virtual memory of the launched child-task is specified using :

Correct Answer is : mapred

7. Which of the following parameters is the threshold for the accounting and serialization buffers ?

Correct Answer is : io.sort.spill.percent

8. ______________ is the percentage of memory relative to the maximum heap size in which map outputs may be retained during the reduce.

Correct Answer is : mapred.job.reduce.input.buffer.percent

9. ____________ specifies the number of segments on disk to be merged at the same time.

Correct Answer is : io.sort.factor
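
To make the role of a merge factor concrete, here is a simplified sketch of a multi-pass merge in which no more than `factor` sorted segments are combined at a time. The function name merge_segments is hypothetical, the segments are in-memory lists rather than files on disk, and Hadoop's real merge chooses its groups differently, so treat this only as an illustration of the idea.

```python
import heapq


def merge_segments(segments, factor):
    """Merge sorted segments, combining at most `factor` of them per group.

    Returns the fully merged list and the number of passes that were needed.
    """
    if factor < 2:
        raise ValueError("factor must be at least 2")
    segments = [list(s) for s in segments]
    passes = 0
    while len(segments) > 1:
        merged = []
        # Each group of up to `factor` segments is merged into one segment.
        for i in range(0, len(segments), factor):
            group = segments[i:i + factor]
            merged.append(list(heapq.merge(*group)))
        segments = merged
        passes += 1
    return (segments[0] if segments else []), passes
```

With four segments, a factor of 2 needs two passes while a factor of 4 needs one, which is why a larger factor can reduce the number of intermediate merges at the cost of more segments open at once.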

10. Point out the correct statement :

Correct Answer is : The number of sorted map outputs fetched into memory before being merged to disk

11. Map outputs larger than ___ percent of the memory allocated to copying map outputs are written directly to disk without first staging through memory.

Correct Answer is : 25
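
The Hadoop MapReduce tutorial describes this limit as the point above which a fetched map output bypasses memory and is written directly to disk. The check can be sketched as below; the function and parameter names are hypothetical and the byte counts are made up for illustration.

```python
def goes_straight_to_disk(map_output_bytes, copy_buffer_bytes, limit=0.25):
    """Return True if a fetched map output should skip the in-memory buffer.

    `limit` is the fraction of the copy buffer that a single map output
    may occupy (25 percent, per the quiz answer).
    """
    return map_output_bytes > limit * copy_buffer_bytes
```

For a 100 MB copy buffer, a 30 MB map output would go to disk, while a 10 MB output would be staged in memory first.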

12. Jobs can enable task JVMs to be reused by specifying the job configuration :

Correct Answer is : mapred.job.reuse.jvm.num.tasks

13. Point out the wrong statement :

Correct Answer is : None of the mentioned

14. During the execution of a streaming job, the names of the _______ parameters are transformed.

Correct Answer is : mapred
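
The transformation in question replaces dots with underscores, so mapred.job.id becomes mapred_job_id and mapred.jar becomes mapred_jar, and the values are exposed to the streaming script as environment variables. A minimal sketch of reading such a value from a Python mapper or reducer follows; the helper names are hypothetical, and the optional environ argument exists only so the helper can be tried outside a real streaming job.

```python
import os


def streaming_env_name(param: str) -> str:
    """Convert a job parameter name to its streaming environment name."""
    # Dots are not valid in most shells' variable names, so they
    # become underscores.
    return param.replace(".", "_")


def get_job_param(param, default=None, environ=None):
    """Look up a job parameter in the environment of a streaming task."""
    env = os.environ if environ is None else environ
    return env.get(streaming_env_name(param), default)
```

Inside a streaming mapper, get_job_param("mapred.job.id") would return the job ID if the framework has set it, and None otherwise.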

15. The standard output (stdout) and error (stderr) streams of the task are read by the TaskTracker and logged to :

Correct Answer is : ${HADOOP_LOG_DIR}/userlogs

16. ____________ is the primary interface by which user-job interacts with the JobTracker.

Correct Answer is : JobClient

17. The _____________ can also be used to distribute both jars and native libraries for use in the map and/or reduce tasks.

Correct Answer is : DistributedCache

18. __________ is used to filter log files from the output directory listing.

Correct Answer is : OutputLogFilter

19. The split size is normally the size of an ________ block, which is appropriate for most applications.

Correct Answer is : HDFS

20. Point out the correct statement :

Correct Answer is : The minimum split size is usually 1 byte, although some formats have a lower bound on the split size
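
The relationship between block size and the minimum and maximum split sizes is commonly described by the formula max(minSize, min(maxSize, blockSize)), which is how FileInputFormat computes the split size. The sketch below restates that formula in Python; the function name, parameter names, and default values are assumptions chosen for illustration.

```python
def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Return the split size given a block size and optional bounds.

    With the default bounds (1 byte minimum, a very large maximum) the
    result is simply the block size.
    """
    return max(min_size, min(max_size, block_size))
```

With the defaults, a 128 MB block yields 128 MB splits. Raising the minimum above the block size produces larger splits, and lowering the maximum below it produces smaller ones.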
