Hadoop interview questions part 45
Take as many assessments as you can to improve and validate your skill rating.
Total Questions: 20
1. The ________ method in the ModelCountReducer class “reduces” the values the mapper collects into a derived value
A. count
B. add
C. reduce
D. all of the mentioned
Correct Answer is :
reduce
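To illustrate what "reducing" mapper output into a derived value means, here is a minimal pure-Python sketch of the map, shuffle, and reduce steps for a count job. It is not the Java ModelCountReducer class the question refers to; the function names (map_phase, shuffle, reduce, run_job) and the input records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(records: Iterable[str]) -> List[Tuple[str, int]]:
    """Emit a (model, 1) pair for every non-blank input record."""
    return [(record.strip().lower(), 1) for record in records if record.strip()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group the mapper's values by key, as the framework does before reducing."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Collapse all values collected for one key into a single derived value."""
    return key, sum(values)


def run_job(records: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    grouped = shuffle(map_phase(records))
    return dict(reduce(key, values) for key, values in grouped.items())
```

Calling run_job on a list of model names returns one total per distinct name, which is the "derived value" the reduce step produces for each key.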
2. Which of the following works well with Avro?
A. Lucene
B. Kafka
C. MapReduce
D. None of the mentioned
Correct Answer is :
MapReduce
3. The __________ tools are used to generate proxy objects in Java so that the objects are easy to work with.
A. Lucene
B. Kafka
C. MapReduce
D. Avro
Correct Answer is :
Avro
4. Spark was initially started by ____________ at UC Berkeley AMPLab in 2009.
A. Mahek Zaharia
B. Matei Zaharia
C. Doug Cutting
D. Stonebraker
Correct Answer is :
Matei Zaharia
5. Point out the correct statement:
A. RSS abstraction provides distributed task dispatching, scheduling, and basic I/O functionalities
B. For cluster management, Spark supports standalone mode and Hadoop YARN
C. Hive SQL is a component on top of Spark Core
D. None of the mentioned
Correct Answer is :
For cluster management, Spark supports standalone mode and Hadoop YARN
6. ____________ is a component on top of Spark Core.
A. Spark Streaming
B. Spark SQL
C. RDDs
D. All of the mentioned
Correct Answer is :
Spark SQL
7. Spark SQL provides a domain-specific language to manipulate ___________ in Scala, Java, or Python.
A. Spark Streaming
B. Spark SQL
C. RDDs
D. All of the mentioned
Correct Answer is :
RDDs
8. Point out the wrong statement:
A. For distributed storage, Spark can interface with a wide variety, including Hadoop Distributed File System (HDFS)
B. Spark also supports a pseudo-distributed mode, usually used only for development or testing purposes
C. Spark has over 465 contributors in 2014
D. All of the mentioned
Correct Answer is :
All of the mentioned
9. ______________ leverages Spark Core fast scheduling capability to perform streaming analytics.
A. MLlib
B. Spark Streaming
C. GraphX
D. RDDs
Correct Answer is :
Spark Streaming
10. ____________ is a distributed machine learning framework on top of Spark
A. MLlib
B. Spark Streaming
C. GraphX
D. RDDs
Correct Answer is :
MLlib
11. ________ is a distributed graph processing framework on top of Spark.
A. MLlib
B. Spark Streaming
C. GraphX
D. All of the mentioned
Correct Answer is :
GraphX
12. GraphX provides an API for expressing graph computation that can model the __________ abstraction.
A. GaAdt
B. Spark Core
C. Pregel
D. None of the mentioned
Correct Answer is :
Pregel
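For readers unfamiliar with the Pregel abstraction, the following is a rough pure-Python sketch of its vertex-centric, superstep-based style, applied to single-source shortest paths. It does not use or imitate the GraphX API (which is Scala); the function name pregel_shortest_paths, the adjacency-list layout, and the max_supersteps cap are assumptions made for this illustration.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical graph layout: vertex -> list of (neighbor, edge weight).
# Every vertex, including ones with no outgoing edges, must appear as a key.
Graph = Dict[str, List[Tuple[str, float]]]


def pregel_shortest_paths(
    graph: Graph, source: str, max_supersteps: int = 50
) -> Dict[str, float]:
    """Compute shortest distances from source using message-passing supersteps."""
    dist = {vertex: math.inf for vertex in graph}
    inbox: Dict[str, List[float]] = {source: [0.0]}  # messages for the first superstep

    for _ in range(max_supersteps):
        if not inbox:  # no messages in flight: every vertex has halted
            break
        outbox: Dict[str, List[float]] = {}
        for vertex, messages in inbox.items():
            best = min(messages)  # combine incoming messages
            if best < dist[vertex]:  # vertex program: keep the shorter distance
                dist[vertex] = best
                for neighbor, weight in graph[vertex]:
                    # send the improved distance along each outgoing edge
                    outbox.setdefault(neighbor, []).append(best + weight)
        inbox = outbox  # messages are delivered at the start of the next superstep

    return dist
```

Each pass of the loop plays the role of one superstep: vertices only see messages sent in the previous pass, and the computation stops once no vertex sends anything.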
13. Spark architecture is ___________ times as fast as the Hadoop disk-based version of Apache Mahout and even scales better than Vowpal Wabbit.
A. 10
B. 20
C. 50
D. 100
Correct Answer is :
10
14. Users can easily run Spark on top of Amazon’s __________
A. Infosphere
B. EC2
C. EMR
D. None of the mentioned
Correct Answer is :
EC2
15. Point out the correct statement:
A. Spark enables Apache Hive users to run their unmodified queries much faster
B. Spark interoperates only with Hadoop
C. Spark is a popular data warehouse solution running on top of Hadoop
D. None of the mentioned
Correct Answer is :
Spark enables Apache Hive users to run their unmodified queries much faster
16. Spark runs on top of ___________, a cluster manager system which provides efficient resource isolation across distributed applications
A. Mesjs
B. Mesos
C. Mesus
D. All of the mentioned
Correct Answer is :
Mesos
17. Which of the following can be used to launch Spark jobs inside MapReduce?
A. SIM
B. SIMR
C. SIR
D. RIS
Correct Answer is :
SIMR
18. Point out the wrong statement:
A. Spark is intended to replace the Hadoop stack
B. Spark was designed to read and write data from and to HDFS, as well as other storage systems
C. Hadoop users who have already deployed or are planning to deploy Hadoop YARN can simply run Spark on YARN
D. None of the mentioned
Correct Answer is :
Spark is intended to replace the Hadoop stack
19. Which of the following languages is not supported by Spark?
A. Java
B. Pascal
C. Scala
D. Python
Correct Answer is :
Pascal
20. Spark is packaged with higher level libraries, including support for _________ queries.
A. SQL
B. C
C. C++
D. None of the mentioned
Correct Answer is :
SQL