| Snaprecruit.com

Interview questions based on skill:

Take as many assessments as you can to improve and validate your skill rating

Total Questions: 20

1. Data analytics scripts are written in ____________ .

Correct Answer is : PigLatin

2. If demux succeeds within ___ attempts, Chukwa archives the completed files.

Correct Answer is : three

3. Chukwa is ___________ data collection system for managing large distributed systems.

Correct Answer is : open source

4. Collectors write chunks to logs/*.chukwa files until a ___ MB chunk is reached.

Correct Answer is : 64

5. The _________ codec from Google provides modest compression ratios.

Correct Answer is : Snappy

6. Point out the correct statement :

Correct Answer is : The Snappy codec is integrated into Hadoop Common, a set of common utilities that supports other Hadoop subprojects

7. Which of the following compression is similar to Snappy compression ?

Correct Answer is : LZO

8. Which of the following supports splittable compression ?

Correct Answer is : LZO

9. Point out the wrong statement :

Correct Answer is : From a usability standpoint, LZO and Gzip are similar.

10. Which of the following is the slowest compression technique ?

Correct Answer is : Bzip2

11. Gzip (short for GNU zip) generates compressed files that have a _________ extension.

Correct Answer is : .gz

12. Which of the following is based on the DEFLATE algorithm ?

Correct Answer is : Gzip
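
The answers to questions 11 and 12 can be checked from the gzip stream itself. The sketch below is only an illustration using Python's standard library, not anything Hadoop-specific, and the sample payload is made up. It compresses some bytes the way a .gz file would be written, reads the header fields defined in RFC 1952, and then decodes the same stream with zlib to show that the body is ordinary DEFLATE data.

```python
import gzip
import zlib

# Made-up sample payload, repeated so there is something to compress.
payload = b"Chukwa collects logs from large distributed systems. " * 200

# gzip.compress returns the same kind of byte stream that is stored in a .gz file.
gz_bytes = gzip.compress(payload)

# A gzip stream starts with the magic bytes 0x1f 0x8b; the third byte is the
# compression method, and the value 8 means DEFLATE (RFC 1952).
magic = gz_bytes[:2]
method = gz_bytes[2]

# zlib decodes the stream when told to expect a gzip wrapper
# (wbits = 16 + MAX_WBITS), so the compressed body is plain DEFLATE.
restored = zlib.decompress(gz_bytes, wbits=16 + zlib.MAX_WBITS)

print(magic.hex(), method, restored == payload)  # prints: 1f8b 8 True
```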

13. __________ typically compresses files to within 10% to 15% of the best available techniques.

Correct Answer is : Bzip2
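
Questions 10 and 13 describe Bzip2 as slow but strong on compression ratio. A rough way to see the trade-off on your own data is to compress the same input with both codecs and compare size and time. The sketch below uses Python's standard gzip and bz2 modules; the helper name `measure` and the log-like sample input are assumptions made for illustration, and the numbers will vary with the input and the machine.

```python
import bz2
import gzip
import time


def measure(name, compress, decompress, data):
    """Compress data once, verify the round trip, and report size and time."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    assert decompress(packed) == data  # the round trip must be lossless
    return {"codec": name, "bytes": len(packed), "seconds": elapsed}


# Hypothetical sample input: repetitive log-like text, roughly 1 MB.
sample = b"".join(b"2014-01-01 INFO request %d served\n" % i for i in range(30000))

results = [
    measure("gzip", gzip.compress, gzip.decompress, sample),
    measure("bzip2", bz2.compress, bz2.decompress, sample),
]

for row in results:
    ratio = row["bytes"] / len(sample)
    print(f'{row["codec"]:6} {row["bytes"]:>8} bytes  ratio {ratio:.3f}  {row["seconds"]:.4f} s')
```

A single timed run is noisy, so treat the timings as a rough indication rather than a benchmark.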

14. The LZO compression format is composed of approximately __________ blocks of compressed data.

Correct Answer is : 256k
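
Questions 8 and 14 are connected: a format made of independent blocks can be split, because each block can be decompressed without reading the ones before it. The sketch below illustrates that idea only. It uses zlib from the Python standard library as a stand-in for LZO, which is not in the standard library, and it cuts the uncompressed input into 256 KiB pieces as a simplification. The function names and the in-memory index are hypothetical and do not reflect the real LZO file layout.

```python
import zlib

# Assumed block size: 256 KiB of uncompressed input per block.
BLOCK_SIZE = 256 * 1024


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress data as independent blocks and return (blob, index).

    index holds one (offset, length) pair per compressed block.
    """
    parts, index, offset = [], [], 0
    for start in range(0, len(data), block_size):
        packed = zlib.compress(data[start:start + block_size])
        index.append((offset, len(packed)))
        parts.append(packed)
        offset += len(packed)
    return b"".join(parts), index


def read_block(blob, index, n):
    """Decompress only block n, without touching any other block."""
    offset, length = index[n]
    return zlib.decompress(blob[offset:offset + length])


data = bytes(range(256)) * 4096  # 1 MiB of sample input, so 4 blocks
blob, index = compress_blocks(data)
third = read_block(blob, index, 2)
print(len(index), third == data[2 * BLOCK_SIZE:3 * BLOCK_SIZE])  # prints: 4 True
```

Because any block can be read on its own, separate workers could each take a different range of blocks, which is what makes a block-based format usable as split input.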

15. The Apache Crunch Java library provides a framework for writing, testing, and running ___________ pipelines.

Correct Answer is : MapReduce
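
For readers new to the term, the shape of a MapReduce pipeline can be sketched in a few lines. The example below is plain Python and does not use the Crunch API; the function names and input lines are made up for illustration. It shows the three stages a word count passes through: map, group by key, and reduce.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine each key's values into a single result (here, a sum)."""
    return {key: sum(values) for key, values in groups.items()}


lines = ["Crunch runs MapReduce pipelines", "MapReduce pipelines process data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["mapreduce"], counts["pipelines"], counts["crunch"])  # prints: 2 2 1
```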

16. Point out the correct statement :

Correct Answer is : A number of common Aggregator implementations are provided in the Aggregators class

17. For Scala users, there is the __________ API, which is built on top of the Java APIs.

Correct Answer is : Scrunch

18. The Crunch APIs are modeled after _________ , which is the library that Google uses for building data pipelines on top of their own implementation of MapReduce.

Correct Answer is : FlumeJava

19. Point out the wrong statement :

Correct Answer is : None of the mentioned

20. Crunch was designed for developers who understand __________ and want to use MapReduce effectively.

Correct Answer is : Java