Take as many assessments as you can to improve and validate your skill rating.
Total Questions: 20
1. Data analytics scripts are written in ____________ .
Correct Answer is : PigLatin
2. If demux is successful within ___ attempts, Chukwa archives the completed files.
Correct Answer is : three
3. Chukwa is ___________ data collection system for managing large distributed systems.
Correct Answer is : open source
4. Collectors write chunks to logs/*.chukwa files until a ___ MB chunk is reached.
Correct Answer is : 64
5. The _________ codec from Google provides modest compression ratios.
Correct Answer is : Snappy
6. Point out the correct statement :
Correct Answer is : The Snappy codec is integrated into Hadoop Common, a set of common utilities that supports other Hadoop subprojects
7. Which of the following compression is similar to Snappy compression ?
Correct Answer is : LZO
8. Which of the following supports splittable compression ?
Correct Answer is : LZO
9. Point out the wrong statement :
Correct Answer is : From a usability standpoint, LZO and Gzip are similar.
10. Which of the following is the slowest compression technique ?
Correct Answer is : Bzip2
11. Gzip (short for GNU zip) generates compressed files that have a _________ extension.
Correct Answer is : .gz
12. Which of the following is based on the DEFLATE algorithm ?
Correct Answer is : Gzip
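
To make questions 11 and 12 concrete, here is a minimal Python sketch using only the standard library. It is an illustration, not part of the quiz: the sample data and variable names are made up. It shows that a gzip stream is a DEFLATE payload inside a small header and trailer.

```python
import gzip
import zlib

# Hypothetical sample data, chosen only because it compresses well.
data = b"hadoop compression quiz " * 1000

# gzip.compress wraps a DEFLATE stream in a gzip header and trailer.
packed = gzip.compress(data)

# A gzip stream starts with the magic bytes 0x1f 0x8b, followed by a
# compression-method byte; the value 8 stands for DEFLATE.
magic = packed[:2]
method = packed[2]

# zlib decodes the same bytes when told to expect a gzip wrapper
# (wbits = 16 + MAX_WBITS), which shows the payload is plain DEFLATE.
restored = zlib.decompress(packed, wbits=16 + zlib.MAX_WBITS)

print(magic.hex(), method, restored == data)
```

Files written with `gzip.open("name.gz", "wb")` carry the same header, which is why tools identify them by content as well as by the `.gz` extension.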
13. __________ typically compresses files to within 10% to 15% of the best available techniques.
Correct Answer is : Bzip2
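
Questions 10 and 13 describe the trade-off between compression speed and ratio. The sketch below is one way to observe it locally with Python's standard `gzip` and `bz2` modules. The sample log lines and the `measure` helper are invented for illustration. Actual sizes and timings depend on the data and the machine, so no expected numbers are given.

```python
import bz2
import gzip
import time

def measure(name, compress, data):
    """Compress data once and record the output size and wall-clock time."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    return {"codec": name, "bytes": len(packed), "seconds": elapsed, "packed": packed}

# Hypothetical log-like input; real data will behave differently.
sample = b"".join(
    b"record %d: status=OK latency=%dms\n" % (i, i % 97) for i in range(50000)
)

results = [
    measure("gzip", gzip.compress, sample),
    measure("bzip2", bz2.compress, sample),
]

for r in results:
    ratio = r["bytes"] / len(sample)
    print(f'{r["codec"]:6s} {r["bytes"]:>9d} bytes  ratio={ratio:.3f}  {r["seconds"]:.4f}s')
```

Running it on a representative slice of your own data is a more reliable guide than any general ranking of codecs.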
14. The LZO compression format is composed of many smaller blocks of compressed data, each approximately __________ in size.
Correct Answer is : 256k
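
The block structure in question 14 is what makes splittable compression (question 8) possible: if each block is compressed independently, a reader can start at any block boundary. The Python sketch below illustrates the idea only. It uses `zlib` as a stand-in because the standard library has no LZO codec, and the function names and in-memory list of blocks are assumptions. It is not the real LZO file format or Hadoop's implementation.

```python
import zlib
from typing import List

BLOCK_SIZE = 256 * 1024  # roughly the block size mentioned in question 14

def compress_in_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Compress each fixed-size slice of data as an independent stream."""
    return [
        zlib.compress(data[start:start + block_size])
        for start in range(0, len(data), block_size)
    ]

def read_block(blocks: List[bytes], index: int) -> bytes:
    """Decompress one block without touching any of the others."""
    return zlib.decompress(blocks[index])

# 1 MiB of sample data, which splits into four 256 KiB blocks.
data = bytes(range(256)) * 4096
blocks = compress_in_blocks(data)

# A worker assigned the third block can decode it on its own.
third = read_block(blocks, 2)
print(len(blocks), len(third))
```

A single continuous compressed stream offers no such entry points, so one task has to read the whole file from the beginning.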
15. The Apache Crunch Java library provides a framework for writing, testing, and running ___________ pipelines.
Correct Answer is : MapReduce
16. Point out the correct statement :
Correct Answer is : A number of common Aggregator implementations are provided in the Aggregators class
17. For Scala users, there is the __________ API, which is built on top of the Java APIs
Correct Answer is : Scrunch
18. The Crunch APIs are modeled after _________ , which is the library that Google uses for building data pipelines on top of their own implementation of MapReduce.
Correct Answer is : FlumeJava
19. Point out the wrong statement :
Correct Answer is : None of the mentioned
20. Crunch was designed for developers who understand __________ and want to use MapReduce effectively.