Hadoop Interview Questions - 33

1. During merging, __________ now always checks the incoming segments for corruption before merging.

A. LocalWriter

B. IndexWriter

C. ReadWriter

D. All of the mentioned

Correct Answer is : IndexWriter

2. Heap usage during IndexWriter merging is also much lower with the new :

A. LucCodec

B. Lucene50Codec

C. Lucene20Cod

D. All of the mentioned

Correct Answer is : Lucene50Codec

3. Point out the wrong statement :

A. ConcurScheduler detects whether the index is on SSD or not

B. Memory index supports payloads

C. Auto-IO-throttling has been added to ConcurrentMergeScheduler, to rate limit IO writes for each merge depending on incoming merge rate

D. The default codec has an option to control BEST_SPEED or BEST_COMPRESSION for stored fields

Correct Answer is : ConcurScheduler detects whether the index is on SSD or not

4. PostingsFormat now uses a __________ API when writing postings, just like doc values.

A. push

B. pull

C. read

D. all of the mentioned

Correct Answer is : pull

5. New ____________ type enables indexing and searching of date ranges, particularly multi-valued ones.

A. RangeField

B. DateField

C. DateRangeField

D. All of the mentioned

Correct Answer is : DateRangeField

6. SolrJ now has first-class support for the __________ API.

A. Compactions

B. Collections

C. Distribution

D. All of the mentioned

Correct Answer is : Collections

7. ____________ Collection API allows for even distribution of custom replica properties.

A. BALANUNIQUE

B. BALANCESHARDUNIQUE

C. BALANCEUNIQUE

D. None of the mentioned

Correct Answer is : BALANCESHARDUNIQUE

8. ____________ can be used to generate stats over the results of arbitrary numeric functions.

A. stats.field

B. sta.field

C. stats.value

D. none of the mentioned

Correct Answer is : stats.field

9. Mahout provides ____________ libraries for common and primitive Java collections.

A. Java

B. Javascript

C. Perl

D. Python

Correct Answer is : Java

10. Point out the correct statement :

A. Mahout is distributed under a commercially friendly Apache Software license

B. Mahout is a library of scalable machine-learning algorithms, implemented on top of Apache Hadoop® and using the MapReduce paradigm

C. Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms

D. None of the mentioned

Correct Answer is : None of the mentioned

11. _________ does not restrict contributions to Hadoop-based implementations.

A. Mahout

B. Oozie

C. Impala

D. All of the mentioned

Correct Answer is : Mahout

12. Mahout provides an implementation of a ______________ identification algorithm which scores collocations using log-likelihood ratio.

A. collocation

B. compaction

C. collection

D. none of the mentioned

Correct Answer is : collocation

13. Point out the wrong statement :

A. ‘Taste’ collaborative-filtering recommender component of Mahout was originally a separate project and can run standalone without Hadoop

B. Integration of Mahout with initiatives such as the Pregel-like Giraph is actively under discussion

C. Calculating the LLR is very straightforward

D. None of the mentioned

Correct Answer is : None of the mentioned
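
Questions 12 and 13 refer to scoring collocations with the log-likelihood ratio (LLR). As a rough illustration of why the calculation is considered straightforward, the sketch below computes the standard G-squared statistic for a 2x2 contingency table of co-occurrence counts. The function name `llr` and the argument names `k11`..`k22` are chosen here for illustration and are assumptions; this is not Mahout's own code.

```python
import math


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G-squared) for a 2x2 table of counts.

    Assumed meaning of the cells (illustrative only):
      k11: both words occur together
      k12: first word occurs without the second
      k21: second word occurs without the first
      k22: neither word occurs
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = ((k11, 0, 0), (k12, 0, 1), (k21, 1, 0), (k22, 1, 1))

    total = 0.0
    for count, r, c in cells:
        # Empty cells contribute nothing (the limit of x * log(x) at 0 is 0).
        if count > 0:
            expected = rows[r] * cols[c] / n
            total += count * math.log(count / expected)
    return 2.0 * total


if __name__ == "__main__":
    print(llr(10, 10, 10, 10))    # independent counts score close to zero
    print(llr(100, 5, 5, 1000))   # strongly associated counts score high
```

A higher score means the observed counts are further from what independence between the two words would predict.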

14. The tokens are passed through a Lucene ____________ to produce NGrams of the desired length.

A. ShngleFil

B. ShingleFilter

C. SingleFilter

D. Collfilter

Correct Answer is : ShingleFilter
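
To make the idea in question 14 concrete, here is a minimal sketch of turning a token stream into word n-grams of a fixed length. It only illustrates the concept of shingling; it does not use Lucene and does not reproduce ShingleFilter's configuration options. The function name `shingles` is hypothetical.

```python
def shingles(tokens, n):
    """Return every run of n consecutive tokens, joined by a single space."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


if __name__ == "__main__":
    words = ["the", "quick", "brown", "fox"]
    print(shingles(words, 2))
```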

15. The _________ collocation identifier is integrated into the process that is used to create vectors from sequence files of text keys and values.

A. lbr

B. lcr

C. llr

D. lar

Correct Answer is : llr

16. ____________ generates NGrams and counts frequencies for ngrams, head and tail subgrams.

A. CollocationDriver

B. CollocDriver

C. CarDriver

D. All of the mentioned

Correct Answer is : CollocDriver

17. A key of type ___________ is generated which is used later to join ngrams with their heads and tails in the reducer phase.

A. GramKey

B. Primary

C. Secondary

D. None of the mentioned

Correct Answer is : GramKey

18. ________ phase merges the counts for unique ngrams or ngram fragments across multiple documents.

A. CollocCombiner

B. CollocReducer

C. CollocMerger

D. None of the mentioned

Correct Answer is : CollocCombiner
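
Questions 16 to 18 describe counting n-grams together with their head and tail subgrams, then merging those counts across documents. The sketch below shows that flow for bigrams in plain Python. It assumes, for illustration, that the head of a bigram is its first word and the tail is its second word. The function names `count_bigrams` and `merge_counts` are hypothetical and do not correspond to Mahout classes.

```python
from collections import Counter


def count_bigrams(tokens):
    """Count bigrams plus their head (first word) and tail (second word)."""
    ngrams, heads, tails = Counter(), Counter(), Counter()
    for first, second in zip(tokens, tokens[1:]):
        ngrams[(first, second)] += 1
        heads[first] += 1
        tails[second] += 1
    return ngrams, heads, tails


def merge_counts(per_document):
    """Sum the counts for identical keys across several documents."""
    merged = Counter()
    for counts in per_document:
        merged.update(counts)
    return merged


if __name__ == "__main__":
    doc1 = "new york is big".split()
    doc2 = "new york never sleeps".split()
    n1, h1, t1 = count_bigrams(doc1)
    n2, h2, t2 = count_bigrams(doc2)
    print(merge_counts([n1, n2]).most_common(1))
```

The merged bigram, head, and tail counts are the kind of inputs a 2x2 contingency table for LLR scoring would be built from.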

19. _______ can change the maximum number of cells of a column family.

A. set

B. reset

C. alter

D. select

Correct Answer is : alter

20. Point out the correct statement :

A. You can add a column family to a table using the method addColumn()

B. Using alter, you can also create a column family

C. Using disable-all, you can truncate a column family

D. None of the mentioned

Correct Answer is : You can add a column family to a table using the method addColumn()
