Hadoop Interview Questions - 35

1. Which of the following partitions the key space ?

A. Partitioner

B. Compactor

C. Collector

D. All of the mentioned

Correct Answer is : Partitioner
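
As an illustration of what partitioning the key space means, here is a minimal sketch of hash-based partitioning, the approach taken by Hadoop's default HashPartitioner: the key's hash, with the sign bit cleared, modulo the number of reduce tasks. The helper names `java_string_hash` and `hash_partition` are hypothetical and exist only for this example; the hash function imitates Java's String.hashCode() so the arithmetic resembles what a Java job would see for string keys.

```python
def java_string_hash(s: str) -> int:
    """Imitate Java's String.hashCode() (BMP characters) as a signed 32-bit int."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # wrap to 32 bits like a Java int
    return h - (1 << 32) if h >= (1 << 31) else h


def hash_partition(key: str, num_reduce_tasks: int) -> int:
    """Map a key to a reducer index in the range [0, num_reduce_tasks)."""
    # Clearing the sign bit keeps the modulo result non-negative.
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reduce_tasks


if __name__ == "__main__":
    for key in ["apple", "banana", "cherry"]:
        print(key, "->", hash_partition(key, 4))
```

Every record with the same key lands in the same partition, which is what lets a single reducer see all values for that key.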

2. ____________ is a generalization of the facility provided by the MapReduce framework to collect data output by the Mapper or the Reducer

A. OutputCompactor

B. OutputCollector

C. InputCollector

D. All of the mentioned

Correct Answer is : OutputCollector

3. Point out the wrong statement :

A. It is legal to set the number of reduce-tasks to zero if no reduction is desired

B. The outputs of the map-tasks go directly to the FileSystem

C. The MapReduce framework does not sort the map-outputs before writing them out to the FileSystem

D. None of the mentioned

Correct Answer is : None of the mentioned

4. __________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution.

A. JobConfig

B. JobConf

C. JobConfiguration

D. All of the mentioned

Correct Answer is : JobConf

5. The ___________ executes the Mapper/Reducer task as a child process in a separate JVM.

A. JobTracker

B. TaskTracker

C. TaskScheduler

D. None of the mentioned

Correct Answer is : TaskTracker

6. The maximum virtual memory of the launched child-task is specified using the property ______.{map|reduce}.child.ulimit :

A. mapv

B. mapred

C. mapvim

D. All of the mentioned

Correct Answer is : mapred

7. Which of the following parameters is the threshold for the accounting and serialization buffers ?

A. io.sort.spill.percent

B. io.sort.record.percent

C. io.sort.mb

D. None of the mentioned

Correct Answer is : io.sort.spill.percent
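
To make the threshold concrete, the following simplified sketch computes how full the map-side sort buffer gets before a spill to disk begins. It assumes the classic defaults of io.sort.mb = 100 and io.sort.spill.percent = 0.80, and it ignores the split between accounting and serialization space; the function name `spill_threshold_bytes` is hypothetical.

```python
def spill_threshold_bytes(io_sort_mb: int = 100, spill_percent: float = 0.80) -> int:
    """Approximate buffer fill level (in bytes) at which a background spill starts."""
    buffer_bytes = io_sort_mb * 1024 * 1024
    return round(buffer_bytes * spill_percent)


if __name__ == "__main__":
    # With the assumed defaults, a spill starts at roughly 80 MB of buffered output.
    print(spill_threshold_bytes() / (1024 * 1024), "MB")
```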

8. ______________ is the percentage of memory relative to the maximum heap size in which map outputs may be retained during the reduce.

A. mapred.job.shuffle.merge.percent

B. mapred.job.reduce.input.buffer.percent

C. mapred.inmem.merge.threshold

D. io.sort.factor

Correct Answer is : mapred.job.reduce.input.buffer.percent

9. ____________ specifies the number of segments on disk to be merged at the same time.

A. mapred.job.shuffle.merge.percent

B. mapred.job.reduce.input.buffer.percent

C. mapred.inmem.merge.threshold

D. io.sort.factor

Correct Answer is : io.sort.factor

10. Point out the correct description of mapred.inmem.merge.threshold :

A. The number of sorted map outputs fetched into memory before being merged to disk

B. The memory threshold for fetched map outputs before an in-memory merge is finished

C. The percentage of memory relative to the maximum heap size in which map outputs may not be retained during the reduce

D. None of the mentioned

Correct Answer is : The number of sorted map outputs fetched into memory before being merged to disk

11. Map output larger than ___ percent of the memory allocated to copying map outputs is written directly to disk without first staging through memory.

A. 10

B. 15

C. 25

D. 35

Correct Answer is : 25
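
The rule behind this answer is that, during the shuffle, a map output larger than 25 percent of the memory allocated to copying map outputs is written directly to disk rather than being staged in memory. A minimal sketch of that decision follows; the function name and the byte figures are hypothetical and chosen only for illustration.

```python
def goes_straight_to_disk(map_output_bytes: int,
                          copy_buffer_bytes: int,
                          max_fraction: float = 0.25) -> bool:
    """Return True if a fetched map output is too large to stage in memory."""
    return map_output_bytes > max_fraction * copy_buffer_bytes


if __name__ == "__main__":
    copy_buffer = 400 * 1024 * 1024  # assumed 400 MB shuffle copy buffer
    for size_mb in (50, 100, 150):
        on_disk = goes_straight_to_disk(size_mb * 1024 * 1024, copy_buffer)
        print(size_mb, "MB ->", "disk" if on_disk else "memory")
```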

12. Jobs can enable task JVMs to be reused by specifying the job configuration :

A. mapred.job.recycle.jvm.num.tasks

B. mapissue.job.reuse.jvm.num.tasks

C. mapred.job.reuse.jvm.num.tasks

D. all of the mentioned

Correct Answer is : mapred.job.reuse.jvm.num.tasks

13. Point out the wrong statement :

A. The task tracker has local directory to create localized cache and localized job

B. The task tracker can define multiple local directories

C. The Job tracker cannot define multiple local directories

D. None of the mentioned

Correct Answer is : None of the mentioned

14. During the execution of a streaming job, the names of the _______ parameters are transformed.

A. vmap

B. mapvim

C. mapreduce

D. mapred

Correct Answer is : mapred
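
The transformation in question replaces the dots in a parameter name with underscores, so that a streaming mapper or reducer can read the value from its environment; for example, mapred.job.id becomes mapred_job_id. The sketch below shows how a Python streaming script might look a value up. The helper names `streaming_env_name` and `get_job_param` are hypothetical.

```python
import os


def streaming_env_name(param: str) -> str:
    """Convert a job parameter name to its streaming environment-variable form."""
    return param.replace(".", "_")


def get_job_param(param: str, environ=None, default=None):
    """Look up a job parameter in the (possibly injected) environment mapping."""
    environ = os.environ if environ is None else environ
    return environ.get(streaming_env_name(param), default)


if __name__ == "__main__":
    # Outside a real streaming task the variable is unset, so the default is printed.
    print(get_job_param("mapred.job.id", default="<not running under streaming>"))
```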

15. The standard output (stdout) and error (stderr) streams of the task are read by the TaskTracker and logged to :

A. ${HADOOP_LOG_DIR}/user

B. ${HADOOP_LOG_DIR}/userlogs

C. ${HADOOP_LOG_DIR}/logs

D. None of the mentioned

Correct Answer is : ${HADOOP_LOG_DIR}/userlogs

16. ____________ is the primary interface by which user-job interacts with the JobTracker.

A. JobConf

B. JobClient

C. JobServer

D. All of the mentioned

Correct Answer is : JobClient

17. The _____________ can also be used to distribute both jars and native libraries for use in the map and/or reduce tasks.

A. DistributedLog

B. DistributedCache

C. DistributedJars

D. None of the mentioned

Correct Answer is : DistributedCache

18. __________ is used to filter log files from the output directory listing.

A. OutputLog

B. OutputLogFilter

C. DistributedLog

D. DistributedJars

Correct Answer is : OutputLogFilter

19. The split size is normally the size of an ________ block, which is appropriate for most applications.

A. Generic

B. Task

C. Library

D. HDFS

Correct Answer is : HDFS

20. Point out the correct statement :

A. The minimum split size is usually 1 byte, although some formats have a lower bound on the split size

B. Applications may impose a minimum split size

C. The maximum split size defaults to the maximum value that can be represented by a Java long type

D. All of the mentioned

Correct Answer is : All of the mentioned
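
The interaction between the minimum split size, the maximum split size and the HDFS block size can be sketched with the commonly cited formula max(minimumSize, min(maximumSize, blockSize)). The function name `compute_split_size` and the sizes used below are assumptions made for illustration, not values taken from any particular cluster.

```python
JAVA_LONG_MAX = 2**63 - 1  # largest value a Java long can represent


def compute_split_size(block_size: int,
                       min_size: int = 1,
                       max_size: int = JAVA_LONG_MAX) -> int:
    """Split size under the formula max(min_size, min(max_size, block_size))."""
    return max(min_size, min(max_size, block_size))


if __name__ == "__main__":
    mb = 1024 * 1024
    print(compute_split_size(64 * mb) // mb)                     # defaults: one block
    print(compute_split_size(64 * mb, min_size=128 * mb) // mb)  # minimum forces larger splits
    print(compute_split_size(64 * mb, max_size=32 * mb) // mb)   # maximum forces smaller splits
```

With the default bounds the split size equals the block size, which is why the block size is the usual answer for "normal" split size.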
