Machine Learning Basics — Practice MCQs for CCAT

20 Questions | Section B: Programming | AI & Machine Learning

Practice 20 multiple-choice questions on Machine Learning Basics, designed for CDAC CCAT exam preparation. Each question is followed by the correct option and a short explanation.

Q1.
Machine Learning is a subset of:
A. Database Management
B. Artificial Intelligence
C. Network Security
D. Operating Systems

Correct Answer: B — Artificial Intelligence

Machine Learning is a branch of AI that enables systems to learn patterns from data rather than following explicitly programmed rules.

Q2.
Supervised learning requires:
A. Unlabeled data only
B. Labeled training data
C. No data
D. Only test data

Correct Answer: B — Labeled training data

Supervised learning uses labeled data where both input features and target outputs are provided.

Q3.
Which is an example of unsupervised learning?
A. Email spam classification
B. Customer segmentation/clustering
C. House price prediction
D. Image classification with labels

Correct Answer: B — Customer segmentation/clustering

Clustering groups similar data points without predefined labels - a classic unsupervised learning task.
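As a rough illustration of grouping without labels, here is a minimal one-dimensional k-means sketch. The `kmeans_1d` helper, the spend figures, and the starting centroids are all hypothetical choices made for this example; real clustering usually works on several features and picks starting centroids more carefully.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest centroid, then recompute centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # Index of the centroid closest to this value.
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # New centroid = mean of its cluster; keep the old one if the cluster is empty.
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical monthly spend per customer; note that no labels are supplied.
spend = [12, 15, 14, 90, 95, 88]
centroids, clusters = kmeans_1d(spend, centroids=[0, 100])
print(clusters)  # [[12, 15, 14], [90, 95, 88]]
```

The two groups (low spenders and high spenders) emerge from the data alone, which is what makes the task unsupervised.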

Q4.
Reinforcement learning involves:
A. Only labeled data
B. Agent learning through rewards and penalties
C. Only clustering
D. No learning

Correct Answer: B — Agent learning through rewards and penalties

Reinforcement learning trains agents to make decisions by receiving rewards or penalties for actions.
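The reward-and-penalty loop can be sketched with a very small agent that chooses between two actions. This is only an illustrative epsilon-greedy sketch: the `train_agent` function, the action names, the reward values, and the `epsilon` and `alpha` settings are assumptions made for the example, not part of any particular library.

```python
import random


def train_agent(rewards, steps=50, epsilon=0.2, alpha=0.5, seed=0):
    """Learn a value estimate per action from the rewards the environment returns."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in rewards}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(list(values))      # explore: try a random action
        else:
            action = max(values, key=values.get)   # exploit: pick the best-looking action
        reward = rewards[action]                   # environment responds with reward or penalty
        values[action] += alpha * (reward - values[action])  # move estimate toward the reward
    return values


# Hypothetical environment: "left" is penalized, "right" is rewarded.
values = train_agent({"left": -1.0, "right": 1.0})
best_action = max(values, key=values.get)
print(best_action)  # right
```

The agent is never told which action is correct; it only sees the reward that follows each choice and adjusts its estimates accordingly.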

Q5.
Classification in ML predicts:
A. Continuous values
B. Discrete categories/classes
C. Only numbers
D. Only text

Correct Answer: B — Discrete categories/classes

Classification predicts discrete categorical labels like spam/not spam, cat/dog, etc.

Q6.
Regression in ML predicts:
A. Categories
B. Continuous numerical values
C. Only binary outcomes
D. Only integers

Correct Answer: B — Continuous numerical values

Regression predicts continuous values like prices, temperatures, or sales figures.

Q7.
Overfitting occurs when:
A. Model performs poorly on training data
B. Model memorizes training data but fails on new data
C. Model is too simple
D. Dataset is too large

Correct Answer: B — Model memorizes training data but fails on new data

Overfitting happens when a model learns noise in training data, performing well on training but poorly on unseen data.
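One way to see this is to compare a model that memorizes every training point with a simpler rule. The sketch below is a toy setup under stated assumptions: the data, the single noisy label at x = 3, and the hand-picked threshold of 5 are all invented for illustration.

```python
def one_nn_predict(train, x):
    """Memorizing model: return the label of the single closest training point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def threshold_predict(x, threshold=5):
    """Simple model: predict class 1 when x is at or above a threshold."""
    return 1 if x >= threshold else 0


def accuracy(predict, data):
    """Fraction of (x, y) pairs for which predict(x) equals y."""
    return sum(predict(x) == y for x, y in data) / len(data)


# Toy data: the underlying rule is "x >= 5 means class 1"; (3, 1) is a noisy label.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
test = [(0.5, 0), (2.8, 0), (3.2, 0), (5.5, 1), (8.5, 1)]


def memorizer(x):
    return one_nn_predict(train, x)


print(accuracy(memorizer, train), accuracy(memorizer, test))                  # 1.0 0.6
print(accuracy(threshold_predict, train), accuracy(threshold_predict, test))  # 0.875 1.0
```

The memorizing model is perfect on the training set because it reproduces the noisy label too, and that same noise causes its errors on the unseen points near x = 3.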

Q8.
Underfitting indicates:
A. Model is too complex
B. Model is too simple to capture patterns
C. Perfect fit
D. Too much training

Correct Answer: B — Model is too simple to capture patterns

Underfitting occurs when the model is too simple to capture the underlying patterns in the data.

Q9.
Cross-validation is used for:
A. Data collection
B. Evaluating model performance on different data splits
C. Data storage
D. Network testing

Correct Answer: B — Evaluating model performance on different data splits

Cross-validation evaluates model performance by training and testing on different subsets of data.
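The splitting step of k-fold cross-validation can be sketched as below. The `k_fold_splits` helper is a hypothetical name for this example, and it uses contiguous folds without shuffling for readability; in practice the data is often shuffled first.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


splits = list(k_fold_splits(6, 3))
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```

Each sample lands in the test fold exactly once, and the model's scores across the folds are usually averaged.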

Q10.
A feature in machine learning refers to:
A. Output variable
B. Input variable used for prediction
C. Database table
D. Network node

Correct Answer: B — Input variable used for prediction

Features are input variables (attributes) that the model uses to make predictions.

Q11.
The training set is used to:
A. Evaluate final model
B. Train/fit the model parameters
C. Deploy the model
D. Store predictions

Correct Answer: B — Train/fit the model parameters

The training set is used to fit the model's parameters; the model learns patterns from this data.

Q12.
The test set is used to:
A. Train the model
B. Evaluate model on unseen data
C. Store training data
D. Clean data

Correct Answer: B — Evaluate model on unseen data

The test set measures how well the trained model generalizes to new, unseen data, so it must be kept out of training.
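A simple hold-out split that separates training data from test data might look like the sketch below. The `train_test_split` function here is a hand-written illustration (its name, the 25% test fraction, and the seed are assumptions), not the scikit-learn function of the same name.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows, then hold out a fraction of them as the test set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)   # seeded so the split is reproducible
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


train_rows, test_rows = train_test_split(range(8))
print(len(train_rows), len(test_rows))  # 6 2
```

The model is fitted on `train_rows` only; `test_rows` is touched once, at the end, to estimate performance on unseen data.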

Q13.
A decision tree splits data based on:
A. Random selection
B. Feature values to maximize information gain
C. Alphabetical order
D. File size

Correct Answer: B — Feature values to maximize information gain

Decision trees split data on feature values that best separate classes (maximize information gain/minimize impurity).
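Information gain can be computed as the parent node's entropy minus the weighted entropy of the child nodes. The sketch below assumes entropy as the impurity measure (Gini impurity is another common choice); the function names and the tiny label lists are made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]
perfect_split = information_gain(parent, ["yes", "yes"], ["no", "no"])
useless_split = information_gain(parent, ["yes", "no"], ["yes", "no"])
print(perfect_split, useless_split)  # 1.0 0.0
```

A tree-building algorithm would evaluate candidate splits this way and pick the one with the highest gain.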

Q14.
K-Nearest Neighbors (KNN) classifies based on:
A. Single nearest point
B. Majority vote of k nearest neighbors
C. All data points
D. Random selection

Correct Answer: B — Majority vote of k nearest neighbors

KNN classifies a point based on the majority class among its k nearest neighbors in feature space.
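The majority-vote idea can be sketched in a few lines. This example assumes Euclidean distance and k = 3; the `knn_predict` helper and the sample points are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to the query."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Each training item is (feature_vector, label).
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"), ((6, 6), "B"), ((7, 6), "B")]
print(knn_predict(train, (2, 2)))  # A
print(knn_predict(train, (6, 5)))  # B
```

For the query (6, 5), the three nearest neighbors are two "B" points and one "A" point, so the vote goes to "B".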

Q15.
Linear regression finds:
A. Clusters in data
B. Best-fit line to predict continuous values
C. Classification boundaries
D. Anomalies

Correct Answer: B — Best-fit line to predict continuous values

Linear regression fits a line (or, with several features, a hyperplane) that minimizes the error between predictions and actual values, most commonly the sum of squared errors (ordinary least squares).
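For a single feature, the least-squares line has a closed-form solution. The sketch below assumes ordinary least squares with one input variable; `fit_line` and the sample points are illustrative only.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Points generated from y = 2x + 1, so the fit should recover that line.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

A prediction for a new input is then `slope * x + intercept`.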

Q16.
Logistic regression is used for:
A. Regression only
B. Binary classification
C. Clustering
D. Dimensionality reduction

Correct Answer: B — Binary classification

Despite its name, logistic regression is a classification method: it predicts the probability of the positive class and assigns a label by thresholding that probability (commonly at 0.5).
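The prediction step of logistic regression passes a linear score through the sigmoid function to get a probability. In the sketch below the weight and bias are hypothetical values chosen by hand, standing in for parameters that would normally be learned from labeled data.

```python
from math import exp


def sigmoid(z):
    """Map any real number to the range (0, 1)."""
    return 1 / (1 + exp(-z))


def predict_proba(x, weight, bias):
    """Probability of the positive class for a single feature x."""
    return sigmoid(weight * x + bias)


def predict_class(x, weight, bias, threshold=0.5):
    """Turn the probability into a 0/1 label using a decision threshold."""
    return 1 if predict_proba(x, weight, bias) >= threshold else 0


# Hypothetical parameters: the decision boundary sits at x = 4 (1.5 * 4 - 6 = 0).
weight, bias = 1.5, -6.0
print(predict_class(2, weight, bias))  # 0
print(predict_class(6, weight, bias))  # 1
```

Changing the threshold trades off false positives against false negatives without retraining the model.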

Q17.
Accuracy in classification measures:
A. Only false positives
B. Percentage of correct predictions
C. Only false negatives
D. Training time

Correct Answer: B — Percentage of correct predictions

Accuracy is the ratio of correct predictions (both positive and negative) to total predictions.

Q18.
Precision measures:
A. All correct predictions
B. Proportion of true positives among predicted positives
C. Speed of prediction
D. Data size

Correct Answer: B — Proportion of true positives among predicted positives

Precision = True Positives / (True Positives + False Positives) - how many predicted positives are actually positive.

Q19.
Recall (Sensitivity) measures:
A. False positive rate
B. Proportion of actual positives correctly identified
C. Prediction speed
D. Model complexity

Correct Answer: B — Proportion of actual positives correctly identified

Recall = True Positives / (True Positives + False Negatives) - how many actual positives were correctly identified.
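The accuracy, precision, and recall definitions from Q17 to Q19 can be checked on a small worked example. The label lists below are invented for illustration, and the helper name `confusion_counts` is an assumption of this sketch.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, and true negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


# Hypothetical ground truth and predictions for 10 samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + fn + tn)   # correct predictions / all predictions
precision = tp / (tp + fp)                   # of predicted positives, how many are right
recall = tp / (tp + fn)                      # of actual positives, how many were found
print(tp, fp, fn, tn)  # 2 1 2 5
```

Here accuracy is 0.7, precision is 2/3, and recall is 0.5, which shows that the three metrics can diverge on the same predictions.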

Q20.
Bias-variance tradeoff means:
A. More data is always better
B. Reducing bias increases variance and vice versa
C. Simpler models are always better
D. Complex models never overfit

Correct Answer: B — Reducing bias increases variance and vice versa

Decreasing bias (making model more complex) typically increases variance (sensitivity to training data), and vice versa.