MapReduce MCQs for CCAT | 20 Practice Questions with Answers

Q1.

MapReduce programming model consists of:

A. Map phase only
B. Reduce phase only
C. Map and Reduce phases
D. Sort phase only

Correct Answer: C — Map and Reduce phases

MapReduce has a Map phase, which transforms input records into intermediate key-value pairs, and a Reduce phase, which aggregates the values for each key.

Q2.

The Map function outputs:

A. Final results
B. Key-value pairs
C. Only keys
D. Only values

Correct Answer: B — Key-value pairs

The Map function processes each input record and emits intermediate key-value pairs, which are passed on to the Reduce phase.

Q3.

Shuffle and Sort phase occurs:

A. Before Map phase
B. Between Map and Reduce phases
C. After Reduce phase
D. Only if specified

Correct Answer: B — Between Map and Reduce phases

Shuffle and Sort transfers Map output to the Reducers and sorts it by key, so each Reducer sees its keys in order, with all values for a key grouped together.
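To make the grouping step concrete, here is a minimal pure-Python sketch of what shuffle and sort does to Map output. It is an in-memory illustration only, not Hadoop's implementation; the function name `shuffle_and_sort` is made up for this example.

```python
from collections import defaultdict

def shuffle_and_sort(map_output):
    """Group intermediate (key, value) pairs by key and order the keys."""
    grouped = defaultdict(list)
    for key, value in map_output:
        grouped[key].append(value)
    # Reducers see keys in sorted order, each with all of its values.
    return [(key, grouped[key]) for key in sorted(grouped)]

map_output = [("b", 1), ("a", 1), ("b", 1), ("c", 1), ("a", 1)]
reducer_input = shuffle_and_sort(map_output)
print(reducer_input)  # [('a', [1, 1]), ('b', [1, 1]), ('c', [1])]
```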

Q4.

In MapReduce, Combiner is:

A. Same as Reducer
B. A mini-reducer that runs on Map output
C. A file format
D. A compression algorithm

Correct Answer: B — A mini-reducer that runs on Map output

A Combiner is an optional local reducer that runs on each mapper's output before the shuffle, which cuts the amount of data sent over the network.
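The saving can be illustrated with a small pure-Python sketch. It assumes a word-count style job; `map_words` and `combine` are hypothetical helper names, and nothing here uses the Hadoop API.

```python
from collections import Counter

def map_words(line):
    """Emit (word, 1) for every word in the line."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Sum the values per key locally, before anything crosses the network."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

raw = map_words("to be or not to be")
combined = combine(raw)
print(len(raw), len(combined))  # 6 4
print(combined)  # [('be', 2), ('not', 1), ('or', 1), ('to', 2)]
```

Local pre-aggregation like this is only safe when the reduce operation is commutative and associative (sum, max, count), because the Reducer must get the same final result whether or not the Combiner ran.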

Q5.

Partitioner in MapReduce determines:

A. Number of Map tasks
B. Which Reducer gets which key
C. File split size
D. Compression type

Correct Answer: B — Which Reducer gets which key

The Partitioner determines which Reducer receives each key-value pair. The default is a hash of the key modulo the number of Reducers, so all pairs with the same key go to the same Reducer.
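A hash partitioner can be sketched in a few lines of Python. This is an illustration of the idea, not Hadoop's code: `partition` is a hypothetical name, and `zlib.crc32` stands in for the key's hash function so that the result is repeatable across runs.

```python
import zlib

def partition(key, num_reducers):
    """Map a key to a reducer index in the range [0, num_reducers)."""
    # crc32 is a stand-in hash; Python's built-in hash() for strings
    # changes between interpreter runs, so it is avoided here.
    return zlib.crc32(key.encode("utf-8")) % num_reducers

keys = ["apple", "banana", "apple", "cherry"]
assignments = [(k, partition(k, 3)) for k in keys]
print(assignments)  # both "apple" entries carry the same reducer index
```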

Q6.

Input to Map function is:

A. Key-value pair
B. Only file name
C. Only value
D. Entire file

Correct Answer: A — Key-value pair

Map receives one key-value pair at a time. For plain text input, the key is typically the byte offset of the line within the file and the value is the line's content.
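As a rough sketch of how such (offset, line) pairs could be produced, assuming line-oriented text input, the hypothetical `read_records` generator below walks a byte string and yields one pair per line. It imitates the idea only and does not use any Hadoop class.

```python
def read_records(data: bytes):
    """Yield (byte_offset, line_text) pairs, one per line of input."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # The key is where the line starts; the value drops the line ending.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)

data = b"hello world\nfoo bar\nbaz\n"
records = list(read_records(data))
print(records)  # [(0, 'hello world'), (12, 'foo bar'), (20, 'baz')]
```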

Q7.

InputFormat in MapReduce:

A. Compresses data
B. Defines how to read and split input files
C. Writes final output
D. Manages memory

Correct Answer: B — Defines how to read and split input files

InputFormat defines how input files are split into InputSplits and how each split is read as records for the Map tasks.

Q8.

Number of Map tasks is determined by:

A. Number of Reducers
B. Number of input splits
C. Cluster size
D. User specification only

Correct Answer: B — Number of input splits

The number of Map tasks equals the number of input splits, which depends on the input data size and the split size (by default, the HDFS block size).
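The arithmetic can be sketched as a ceiling division. This is a simplification for a single splittable file: it ignores details such as the small slack Hadoop allows on the last split and files that cannot be split (for example, some compressed formats). The name `num_map_tasks` is made up for this example.

```python
def num_map_tasks(file_size_bytes, split_size_bytes):
    """Estimate the number of splits (and so Map tasks) for one file."""
    # Integer ceiling division: a partial final split still needs a task.
    return (file_size_bytes + split_size_bytes - 1) // split_size_bytes

MB = 1024 * 1024
print(num_map_tasks(1024 * MB, 128 * MB))  # 8
print(num_map_tasks(300 * MB, 128 * MB))   # 3
```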

Q9.

Word Count MapReduce: Map phase emits:

A. (word, 1) for each word
B. (document, count)
C. (line_number, word)
D. (total_count, word)

Correct Answer: A — (word, 1) for each word

In Word Count, Map emits (word, 1) for each word occurrence; Reduce sums the counts per word.
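Word Count can be simulated end to end in plain Python. The sketch below is an in-memory illustration, not Hadoop code: `map_fn`, `reduce_fn`, and `word_count` are hypothetical names, the line index stands in for the input key, and lowercasing the words is an added assumption.

```python
from collections import defaultdict

def map_fn(_key, line):
    """Map: emit (word, 1) for each word occurrence in the line."""
    for word in line.lower().split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: sum the counts collected for one word."""
    return word, sum(counts)

def word_count(lines):
    grouped = defaultdict(list)
    for line_no, line in enumerate(lines):
        for word, one in map_fn(line_no, line):
            grouped[word].append(one)  # stands in for shuffle and sort
    return dict(reduce_fn(word, grouped[word]) for word in sorted(grouped))

result = word_count(["the quick brown fox", "the lazy dog", "The fox"])
print(result)
# {'brown': 1, 'dog': 1, 'fox': 2, 'lazy': 1, 'quick': 1, 'the': 3}
```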

Q10.

Speculative execution in MapReduce:

A. Runs tasks on failed nodes
B. Runs duplicate tasks to handle slow nodes
C. Predicts task output
D. Caches intermediate results

Correct Answer: B — Runs duplicate tasks to handle slow nodes

Speculative execution runs backup copies of slow-running tasks on other nodes so that stragglers do not delay the job; the result of whichever copy finishes first is used.

Q11.

Reduce function receives:

A. Single key-value pair
B. Key and iterator of all values for that key
C. Only values
D. Raw input data

Correct Answer: B — Key and iterator of all values for that key

Reduce receives a key and an iterator over all values associated with that key after shuffle/sort.

Q12.

OutputFormat in MapReduce:

A. Splits input files
B. Defines how to write output
C. Manages Map tasks
D. Handles network

Correct Answer: B — Defines how to write output

OutputFormat defines how the job's output (normally the Reduce output) is written: its format, location, and structure.

Q13.

Data locality in MapReduce means:

A. All data stored locally
B. Moving computation to where data resides
C. Data compression
D. Data replication

Correct Answer: B — Moving computation to where data resides

Data locality means scheduling computation on the nodes where the data is stored, rather than moving the data over the network.

Q14.

RecordReader in MapReduce:

A. Writes output records
B. Reads input split and generates key-value pairs
C. Sorts records
D. Compresses records

Correct Answer: B — Reads input split and generates key-value pairs

RecordReader reads an InputSplit and generates key-value pairs for the Map function.

Q15.

Default number of Reducers is:

A. 0
B. 1
C. Same as Mappers
D. Unlimited

Correct Answer: B — 1

The default number of Reducers is 1, but it can be configured based on data size and cluster capacity.

Q16.

Counters in MapReduce are used for:

A. Counting reducers
B. Tracking job statistics and metrics
C. File compression
D. Memory management

Correct Answer: B — Tracking job statistics and metrics

Counters track job statistics such as input and output records and bytes processed, as well as custom metrics defined by the application.

Q17.

DistributedCache in MapReduce provides:

A. In-memory caching
B. Read-only data distribution to all nodes
C. Write caching
D. Network caching

Correct Answer: B — Read-only data distribution to all nodes

DistributedCache copies read-only files (such as lookup tables) to the nodes that run the job's tasks before those tasks execute.

Q18.

Map-only job has:

A. Multiple Reducers
B. No Reduce phase
C. No Map phase
D. Only shuffle phase

Correct Answer: B — No Reduce phase

Map-only jobs set the number of Reducers to 0, so Map output is written directly as the job's output with no shuffle, sort, or aggregation.

Q19.

JobTracker in Hadoop 1.x was responsible for:

A. Data storage
B. Resource management and job scheduling
C. File splitting
D. Data compression

Correct Answer: B — Resource management and job scheduling

JobTracker managed cluster resources and scheduled jobs in Hadoop 1.x. In Hadoop 2.x, YARN split these duties between the ResourceManager and a per-application ApplicationMaster.

Q20.

Secondary sort in MapReduce:

A. Sorts by value as well as key
B. Sorts only keys
C. Sorts files
D. Sorts reducers

Correct Answer: A — Sorts by value as well as key

Secondary sort orders the values within each key as well as the keys themselves, typically by using composite keys and custom comparators.
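The composite-key idea can be sketched in plain Python: sort on (key, value) together, then group on the key alone. In Hadoop this usually also involves a custom partitioner and grouping comparator on the natural key; the sketch below only imitates the resulting order, and `secondary_sort` is a hypothetical name.

```python
from itertools import groupby
from operator import itemgetter

def secondary_sort(records):
    """Return (key, values) groups with both keys and values in order."""
    # Sorting (key, value) tuples orders by key first, then by value,
    # which is what a composite key achieves.
    composite_sorted = sorted(records)
    # Group on the natural key only, so each key keeps its sorted values.
    return [(key, [value for _, value in group])
            for key, group in groupby(composite_sorted, key=itemgetter(0))]

readings = [("2021", 30), ("2020", 25), ("2021", 10), ("2020", 5), ("2021", 20)]
print(secondary_sort(readings))
# [('2020', [5, 25]), ('2021', [10, 20, 30])]
```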
