Machine Learning Basics - Practice MCQs for C-CAT

50 Questions Section B: Programming Artificial Intelligence

Machine Learning Basics Question Bank for C-CAT

Topic-wise Machine Learning Basics MCQs for CDAC C-CAT preparation with answers and explanations.

Q1.
Machine Learning is a subset of:
A. Database Management
B. Operating Systems
C. Network Security
D. Artificial Intelligence

Correct Answer: D - Artificial Intelligence

Machine Learning is a branch of AI that enables systems to learn from data without being explicitly programmed.

Q2.
Supervised learning requires:
A. Unlabeled data only
B. Labeled training data
C. No data
D. Only test data

Correct Answer: B - Labeled training data

Supervised learning uses labeled data where both input features and target outputs are provided.

Q3.
Which is an example of unsupervised learning?
A. Email spam classification
B. House price prediction
C. Customer segmentation/clustering
D. Image classification with labels

Correct Answer: C - Customer segmentation/clustering

Clustering groups similar data points without predefined labels - a classic unsupervised learning task.

Q4.
Reinforcement learning involves:
A. Only labeled data
B. Agent learning through rewards and penalties
C. Only clustering
D. No learning

Correct Answer: B - Agent learning through rewards and penalties

Reinforcement learning trains agents to make decisions by receiving rewards or penalties for actions.

Q5.
Classification in ML predicts:
A. Discrete categories/classes
B. Continuous values
C. Only numbers
D. Only text

Correct Answer: A - Discrete categories/classes

Classification predicts discrete categorical labels like spam/not spam, cat/dog, etc.

Q6.
Regression in ML predicts:
A. Categories
B. Only integers
C. Only binary outcomes
D. Continuous numerical values

Correct Answer: D - Continuous numerical values

Regression predicts continuous values like prices, temperatures, or sales figures.

Q7.
Overfitting occurs when:
A. Model performs poorly on training data
B. Dataset is too large
C. Model is too simple
D. Model memorizes training data but fails on new data

Correct Answer: D - Model memorizes training data but fails on new data

Overfitting happens when a model learns noise in training data, performing well on training but poorly on unseen data.

Q8.
Underfitting indicates:
A. Model is too complex
B. Perfect fit
C. Model is too simple to capture patterns
D. Too much training

Correct Answer: C - Model is too simple to capture patterns

Underfitting occurs when the model is too simple to capture the underlying patterns in the data.

Q9.
Cross-validation is used for:
A. Evaluating model performance on different data splits
B. Data collection
C. Data storage
D. Network testing

Correct Answer: A - Evaluating model performance on different data splits

Cross-validation evaluates model performance by training and testing on different subsets of data.

Q10.
Feature in machine learning refers to:
A. Output variable
B. Input variable used for prediction
C. Database table
D. Network node

Correct Answer: B - Input variable used for prediction

Features are input variables (attributes) that the model uses to make predictions.

Q11.
Training set is used to:
A. Evaluate final model
B. Train/fit the model parameters
C. Deploy the model
D. Store predictions

Correct Answer: B - Train/fit the model parameters

The training set is used to fit the model: its parameters are learned from the patterns in this data.

Q12.
Test set is used to:
A. Train the model
B. Evaluate model on unseen data
C. Store training data
D. Clean data

Correct Answer: B - Evaluate model on unseen data

The test set measures how well the trained model generalizes to new, unseen data.

Q13.
Decision tree splits data based on:
A. Random selection
B. File size
C. Alphabetical order
D. Feature values to maximize information gain

Correct Answer: D - Feature values to maximize information gain

Decision trees split data on feature values that best separate classes (maximize information gain/minimize impurity).
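
As a rough illustration of the information gain idea in Q13, the sketch below computes Shannon entropy for a parent node and for two candidate splits. The function names and the tiny spam/ham label lists are made up for this example; real decision tree implementations may use other impurity measures such as Gini.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    total = len(parent)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(parent) - weighted

# Hypothetical labels: a split that separates the classes vs. one that does not.
parent = ["spam", "spam", "ham", "ham"]
perfect_gain = information_gain(parent, ["spam", "spam"], ["ham", "ham"])
useless_gain = information_gain(parent, ["spam", "ham"], ["spam", "ham"])
print(perfect_gain, useless_gain)
```

A tree learner would prefer the first split, since it removes all uncertainty about the class, while the second leaves each child as mixed as the parent.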

Q14.
K-Nearest Neighbors (KNN) classifies based on:
A. Majority vote of k nearest neighbors
B. Single nearest point
C. All data points
D. Random selection

Correct Answer: A - Majority vote of k nearest neighbors

KNN classifies a point based on the majority class among its k nearest neighbors in feature space.
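
The majority-vote rule in Q14 can be sketched in a few lines. This is a minimal version assuming Euclidean distance and a small made-up 2-D dataset; `knn_predict` is a hypothetical helper name, not a library function.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up data: three "cat" points near (1, 1) and two "dog" points near (5, 5).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9)]
labels = ["cat", "cat", "cat", "dog", "dog"]
prediction = knn_predict(points, labels, (1.1, 1.0), k=3)
print(prediction)
```

Because distance drives the vote, KNN is sensitive to feature scale, which is one reason scaling is usually applied first.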

Q15.
Linear regression finds:
A. Clusters in data
B. Anomalies
C. Classification boundaries
D. Best-fit line to predict continuous values

Correct Answer: D - Best-fit line to predict continuous values

Linear regression fits a line (or hyperplane) that minimizes the error between predictions and actual values.
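
For a single feature, the best-fit line from Q15 has a closed-form least-squares solution. The sketch below assumes one input variable and uses made-up numbers that lie exactly on y = 2x + 1; `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data generated from y = 2x + 1 with no noise.
areas = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(areas, prices)
print(slope, intercept)
```

With noisy data the recovered slope and intercept would only approximate the underlying relationship; the line still minimizes the sum of squared errors.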

Q16.
Logistic regression is used for:
A. Regression only
B. Dimensionality reduction
C. Clustering
D. Binary classification

Correct Answer: D - Binary classification

Despite its name, logistic regression is a classification method: it predicts the probability that an input belongs to a class, most commonly in binary problems.

Q17.
Accuracy in classification measures:
A. Only false positives
B. Only false negatives
C. Percentage of correct predictions
D. Training time

Correct Answer: C - Percentage of correct predictions

Accuracy is the ratio of correct predictions (both positive and negative) to total predictions.

Q18.
Precision measures:
A. All correct predictions
B. Speed of prediction
C. Proportion of true positives among predicted positives
D. Data size

Correct Answer: C - Proportion of true positives among predicted positives

Precision = True Positives / (True Positives + False Positives) - how many predicted positives are actually positive.

Q19.
Recall (Sensitivity) measures:
A. False positive rate
B. Proportion of actual positives correctly identified
C. Prediction speed
D. Model complexity

Correct Answer: B - Proportion of actual positives correctly identified

Recall = True Positives / (True Positives + False Negatives) - how many actual positives were correctly identified.
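
The accuracy, precision, and recall definitions from Q17 to Q19 can be checked on a small example. The labels below are invented for illustration, and `confusion_counts` is a hypothetical helper, not part of any library.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, FN, TN for a binary problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return tp, fp, fn, tn

# Made-up ground truth and predictions (1 = positive, 0 = negative).
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 1, 0, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / (tp + fp + fn + tn)   # correct predictions / all predictions
precision = tp / (tp + fp)                   # true positives / predicted positives
recall = tp / (tp + fn)                      # true positives / actual positives
print(accuracy, precision, recall)
```

Here the three metrics differ, which shows why accuracy alone can hide how a model treats the positive class.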

Q20.
Bias-variance tradeoff means:
A. Reducing bias increases variance and vice versa
B. More data is always better
C. Simpler models are always better
D. Complex models never overfit

Correct Answer: A - Reducing bias increases variance and vice versa

Decreasing bias (making model more complex) typically increases variance (sensitivity to training data), and vice versa.

Q21.
Which type of machine learning uses labeled data for training?
A. Unsupervised Learning
B. Reinforcement Learning
C. Supervised Learning
D. Semi-supervised Learning

Correct Answer: C - Supervised Learning

Supervised Learning uses labeled training data (input-output pairs) to learn a mapping function. The model is trained on known correct answers and then makes predictions on new, unseen data.

Q22.
K-Means is an example of which type of learning?
A. Unsupervised Learning
B. Supervised Learning
C. Reinforcement Learning
D. Transfer Learning

Correct Answer: A - Unsupervised Learning

K-Means is an unsupervised learning algorithm used for clustering. It partitions data into K clusters based on similarity, without requiring labeled training data.
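
A minimal sketch of the K-Means loop described in Q22, restricted to 1-D data for brevity. The initial centroids are supplied by hand here, which is an assumption of this example; practical implementations usually choose them randomly or with a seeding heuristic.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain K-Means on 1-D data, starting from the given initial centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Made-up data with two obvious groups; no labels are used anywhere.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(values, centroids=[0.0, 5.0])
print(centroids, clusters)
```

The algorithm only alternates between assigning points and recomputing means, so the grouping emerges from similarity alone.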

Q23.
Overfitting in machine learning occurs when:
A. The model performs well on both training and test data
B. The model has too few parameters
C. The model performs poorly on both training and test data
D. The model performs well on training data but poorly on test data

Correct Answer: D - The model performs well on training data but poorly on test data

Overfitting occurs when a model learns the training data too well, including noise and outliers, resulting in excellent training accuracy but poor generalization to new, unseen data.

Q24.
Which algorithm is commonly used for classification tasks?
A. Linear Regression
B. K-Means Clustering
C. Principal Component Analysis
D. Decision Tree

Correct Answer: D - Decision Tree

Decision Trees are commonly used for classification tasks. They split data into branches based on feature values and make predictions at leaf nodes. They can also be used for regression.

Q25.
What is the purpose of cross-validation in machine learning?
A. To increase training speed
B. To select the best programming language
C. To reduce the dataset size
D. To evaluate model performance and reduce overfitting

Correct Answer: D - To evaluate model performance and reduce overfitting

Cross-validation is a technique to evaluate model performance by splitting data into multiple folds, training on some folds and testing on others. It does not reduce overfitting by itself; it helps detect overfitting and provides a more reliable performance estimate, which in turn guides model and hyperparameter choices.

Q26.
Linear Regression is used to predict:
A. Categorical values
B. Boolean values
C. Continuous values
D. Complex numbers

Correct Answer: C - Continuous values

Linear Regression predicts continuous numerical values by finding the best-fit linear relationship between input features and the output variable. For example, predicting house prices based on area.

Q27.
Which metric is NOT commonly used for classification evaluation?
A. Accuracy
B. Precision
C. Mean Squared Error
D. F1-Score

Correct Answer: C - Mean Squared Error

Mean Squared Error (MSE) is a regression metric, not a classification metric. Classification tasks typically use Accuracy, Precision, Recall, and F1-Score for evaluation.

Q28.
What does the 'K' in K-Nearest Neighbors represent?
A. Number of features
B. Number of clusters
C. Number of nearest neighbors to consider
D. Number of iterations

Correct Answer: C - Number of nearest neighbors to consider

In KNN, 'K' represents the number of nearest neighbors to consider when making a classification or regression prediction. A new data point is classified based on the majority class of its K nearest neighbors.

Q29.
Which technique is used to prevent overfitting by adding a penalty term to the loss function?
A. Normalization
B. Regularization
C. Standardization
D. Augmentation

Correct Answer: B - Regularization

Regularization prevents overfitting by adding a penalty term to the loss function. L1 (Lasso) and L2 (Ridge) regularization are common techniques that discourage overly complex models.
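
To make the "penalty term" in Q29 concrete, the sketch below adds an L2 (Ridge) or L1 (Lasso) penalty to a mean squared error loss. The function names, the weight vector, and the penalty strength `lam` are all illustrative assumptions.

```python
def mse(weights, rows, targets):
    """Mean squared error of a linear model without a bias term."""
    errors = [
        sum(w * x for w, x in zip(weights, row)) - t
        for row, t in zip(rows, targets)
    ]
    return sum(e * e for e in errors) / len(errors)

def ridge_loss(weights, rows, targets, lam):
    """MSE plus an L2 penalty: lam * sum of squared weights."""
    return mse(weights, rows, targets) + lam * sum(w * w for w in weights)

def lasso_loss(weights, rows, targets, lam):
    """MSE plus an L1 penalty: lam * sum of absolute weights."""
    return mse(weights, rows, targets) + lam * sum(abs(w) for w in weights)

# Made-up data that the weights [2, -1] fit exactly, so the MSE part is zero.
rows = [[1.0, 0.0], [0.0, 1.0]]
targets = [2.0, -1.0]
weights = [2.0, -1.0]
ridge = ridge_loss(weights, rows, targets, lam=0.1)
lasso = lasso_loss(weights, rows, targets, lam=0.1)
print(ridge, lasso)
```

Even with a perfect fit the penalized loss is non-zero, so an optimizer is pushed toward smaller weights and, in effect, simpler models.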

Q30.
In a confusion matrix, what does a 'True Positive' represent?
A. Predicted positive, actually positive
B. Predicted positive, actually negative
C. Predicted negative, actually positive
D. Predicted negative, actually negative

Correct Answer: A - Predicted positive, actually positive

A True Positive (TP) occurs when the model correctly predicts the positive class. The actual label is positive, and the model's prediction is also positive.

Q31.
Which of the following is an ensemble learning method?
A. Linear Regression
B. K-Means
C. Random Forest
D. PCA

Correct Answer: C - Random Forest

Random Forest is an ensemble learning method that builds multiple decision trees and combines their predictions, by majority vote for classification or by averaging for regression. It reduces overfitting and typically improves accuracy compared to a single decision tree.

Q32.
What is the bias-variance tradeoff?
A. Tradeoff between underfitting and overfitting
B. Tradeoff between model speed and accuracy
C. Tradeoff between training and testing
D. Tradeoff between precision and recall

Correct Answer: A - Tradeoff between underfitting and overfitting

The bias-variance tradeoff is the balance between underfitting (high bias, low variance) and overfitting (low bias, high variance). A good model balances both to minimize total error.

Q33.
Which algorithm uses the concept of 'margin' to find the optimal hyperplane?
A. Decision Tree
B. K-Means
C. Naive Bayes
D. Support Vector Machine

Correct Answer: D - Support Vector Machine

Support Vector Machine (SVM) finds the optimal hyperplane that maximizes the margin between two classes. The data points closest to the hyperplane are called support vectors.

Q34.
Naive Bayes classifier is based on which theorem?
A. Pythagorean Theorem
B. Bayes' Theorem
C. Central Limit Theorem
D. Fermat's Last Theorem

Correct Answer: B - Bayes' Theorem

Naive Bayes is based on Bayes' Theorem with the 'naive' assumption that features are conditionally independent given the class. It calculates posterior probability for classification.

Q35.
What is the purpose of feature scaling in machine learning?
A. To increase the number of features
B. To bring all features to the same scale
C. To remove irrelevant features
D. To add noise to the data

Correct Answer: B - To bring all features to the same scale

Feature scaling normalizes the range of features so that no single feature dominates the model due to its scale. Common methods include Min-Max scaling and Standardization (z-score).
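
The two scaling methods named in Q35 can be sketched as follows. The helper names and the sample values are assumptions for illustration; the standardization uses the population standard deviation.

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Z-score scaling: subtract the mean, divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Made-up feature column, e.g. ages or prices on an arbitrary scale.
feature = [10.0, 20.0, 30.0, 40.0, 50.0]
scaled = min_max_scale(feature)
z_scores = standardize(feature)
print(scaled)
print(z_scores)
```

Min-Max scaling fixes the range, while standardization fixes the mean at 0 and the standard deviation at 1; which one suits a task depends on the algorithm and on how outliers should be treated.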

Q36.
Which of the following is a dimensionality reduction technique?
A. Random Forest
B. Backpropagation
C. Gradient Descent
D. Principal Component Analysis

Correct Answer: D - Principal Component Analysis

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms high-dimensional data into a lower-dimensional space while preserving maximum variance.

Q37.
In gradient descent, the learning rate determines:
A. The direction of the gradient
B. The size of each step toward the minimum
C. The number of features
D. The number of training examples

Correct Answer: B - The size of each step toward the minimum

The learning rate controls the size of the steps taken during gradient descent. A learning rate that is too large may overshoot the minimum, while one that is too small results in slow convergence.
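
The effect of the learning rate in Q37 is easy to see on a toy problem. The sketch below minimizes f(x) = x², whose gradient is 2x; the function, starting point, and the three learning rates are arbitrary choices for illustration.

```python
def gradient_descent(learning_rate, start=10.0, steps=20):
    """Minimize f(x) = x**2 by repeatedly stepping against the gradient 2*x."""
    x = start
    for _ in range(steps):
        gradient = 2 * x
        x -= learning_rate * gradient
    return x

well_tuned = gradient_descent(0.1)    # moves steadily toward the minimum at 0
too_small = gradient_descent(0.001)   # barely moves in 20 steps
too_large = gradient_descent(1.1)     # overshoots further on every step
print(well_tuned, too_small, too_large)
```

With the largest rate each update jumps past the minimum by more than the previous distance, so the iterate grows instead of shrinking.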

Q38.
What is the difference between classification and regression?
A. Classification predicts continuous values; regression predicts categories
B. Both predict continuous values
C. Classification predicts categories; regression predicts continuous values
D. Both predict categories

Correct Answer: C - Classification predicts categories; regression predicts continuous values

Classification predicts discrete categorical labels (e.g., spam/not spam), while regression predicts continuous numerical values (e.g., house price). They are both supervised learning tasks.

Q39.
Which of the following causes underfitting?
A. An overly complex model
B. Too many features
C. Too much training data
D. An overly simple model

Correct Answer: D - An overly simple model

Underfitting occurs when the model is too simple to capture the underlying patterns in the data. It results in high bias and poor performance on both training and test data.

Q40.
What does the R-squared (R²) score measure?
A. Proportion of variance explained by the model
B. Classification accuracy
C. Number of outliers in data
D. Distance between clusters

Correct Answer: A - Proportion of variance explained by the model

R-squared measures the proportion of variance in the dependent variable that is explained by the independent variables. A value of 1 indicates a perfect fit, while 0 means the model does no better than always predicting the mean of the dependent variable.
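
R² from Q40 can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses made-up targets and two hypothetical sets of predictions to show the two reference values.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

actual = [3.0, 5.0, 7.0, 9.0]                             # made-up target values
perfect = r_squared(actual, [3.0, 5.0, 7.0, 9.0])         # predictions match exactly
mean_only = r_squared(actual, [6.0, 6.0, 6.0, 6.0])       # always predict the mean
print(perfect, mean_only)
```

Predictions that are worse than the mean baseline make the ratio exceed 1, so the score can drop below zero.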

Q41.
Logistic Regression is used for:
A. Regression problems only
B. Clustering problems
C. Classification problems
D. Dimensionality reduction

Correct Answer: C - Classification problems

Despite its name, Logistic Regression is used for binary classification. It uses the sigmoid function to output probabilities and classifies data points into one of two categories.
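
A small sketch of how the sigmoid in Q41 turns a linear score into a probability and then a class. The weights, bias, and 0.5 threshold are assumed values for illustration; in practice the weights are learned from data.

```python
import math

def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(features, weights, bias, threshold=0.5):
    """Return (probability of the positive class, predicted class 0 or 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(z)
    return probability, int(probability >= threshold)

# Hypothetical one-feature model with weight 1.5 and bias -1.0.
prob_high, class_high = predict_class([2.0], [1.5], -1.0)   # linear score z = 2.0
prob_low, class_low = predict_class([0.0], [1.5], -1.0)     # linear score z = -1.0
print(prob_high, class_high, prob_low, class_low)
```

The output stays a probability until the threshold is applied, which is what lets the same model trade precision against recall by moving that threshold.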

Q42.
Which method splits data into training and testing sets only once?
A. Hold-out Validation
B. K-Fold Cross Validation
C. Leave-One-Out Cross Validation
D. Stratified Sampling

Correct Answer: A - Hold-out Validation

Hold-out validation splits the dataset into training and testing sets only once (typically 70-30 or 80-20). It is simpler, but the estimate depends heavily on which samples land in each set, so it is less reliable than cross-validation.

Q43.
Bagging in ensemble learning stands for:
A. Backward Aggregating
B. Batch Aggregating
C. Bootstrap Aggregating
D. Binary Aggregating

Correct Answer: C - Bootstrap Aggregating

Bagging stands for Bootstrap Aggregating. It creates multiple subsets of the training data using bootstrapping (sampling with replacement), trains a model on each, and combines predictions.
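
The bootstrap-then-aggregate idea in Q43 can be sketched with a deliberately trivial stand-in model that just predicts the most common label in its sample. The labels, model count, and seed are made up; a real bagging ensemble would train an actual learner, such as a decision tree, on each sample.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in range(len(data))]

def majority_label(labels):
    """Stand-in 'model': predicts the most common label it was given."""
    return Counter(labels).most_common(1)[0][0]

def bagged_prediction(labels, n_models=25, seed=0):
    """Train one stand-in model per bootstrap sample, then take a majority vote."""
    rng = random.Random(seed)
    votes = [majority_label(bootstrap_sample(labels, rng)) for _ in range(n_models)]
    return majority_label(votes), votes

training_labels = ["spam"] * 8 + ["ham"] * 2   # made-up training labels
prediction, votes = bagged_prediction(training_labels)
print(prediction, Counter(votes))
```

Individual bootstrap samples differ from one another, but aggregating many votes smooths out that variation.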

Q44.
What is the curse of dimensionality?
A. Data becomes sparse in high-dimensional space
B. The model has too few parameters
C. Training takes too long
D. The dataset is too small

Correct Answer: A - Data becomes sparse in high-dimensional space

The curse of dimensionality refers to the phenomenon where data becomes increasingly sparse as the number of features (dimensions) increases, making it harder for algorithms to find patterns.

Q45.
Which loss function is commonly used for binary classification?
A. Mean Squared Error
B. Binary Cross-Entropy
C. Mean Absolute Error
D. Hinge Loss

Correct Answer: B - Binary Cross-Entropy

Binary Cross-Entropy (Log Loss) is the standard loss function for binary classification. It measures the difference between predicted probabilities and actual binary labels.
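
A minimal sketch of the Binary Cross-Entropy loss from Q45, averaged over examples. The clipping constant `eps` is an assumption added to avoid taking log(0); the labels and probabilities are invented.

```python
import math

def binary_cross_entropy(labels, probabilities, eps=1e-12):
    """Average of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)   # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

labels = [1, 0]                                                # made-up binary labels
confident_right = binary_cross_entropy(labels, [0.9, 0.1])     # good probabilities
confident_wrong = binary_cross_entropy(labels, [0.1, 0.9])     # bad probabilities
print(confident_right, confident_wrong)
```

The loss grows sharply when the model is confident and wrong, which is the behaviour that makes it suitable for training probabilistic classifiers.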

Q46.
In k-fold cross-validation, if k = 5, how many times is the model trained?
A. 5
B. 1
C. 10
D. 25

Correct Answer: A - 5

In 5-fold cross-validation, the data is split into 5 folds. The model is trained 5 times, each time using 4 folds for training and 1 fold for testing. The results are then averaged.
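
The fold bookkeeping described in Q46 can be sketched with index lists. This version assumes contiguous, unshuffled folds and a hypothetical dataset of 10 samples; `k_fold_indices` is an illustrative name.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

splits = list(k_fold_indices(10, 5))
for train, test in splits:
    # A model would be fitted on `train` and scored on `test` here.
    print(len(train), test)
```

Each sample appears in exactly one test fold, so the loop body runs k times and every sample is used for evaluation once.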

Q47.
What does the term 'epoch' mean in machine learning training?
A. A single training example
B. One complete pass through the entire training dataset
C. A single weight update
D. The learning rate value

Correct Answer: B - One complete pass through the entire training dataset

An epoch refers to one complete pass through the entire training dataset during model training. Multiple epochs are typically needed for the model to converge to optimal parameters.

Q48.
Which of the following is a type of unsupervised learning?
A. Clustering
B. Regression
C. Classification
D. Object Detection

Correct Answer: A - Clustering

Clustering is a type of unsupervised learning that groups similar data points together without using labeled data. K-Means, DBSCAN, and Hierarchical clustering are common methods.

Q49.
Precision in a classification model is defined as:
A. TP / (TP + FN)
B. TP / (TP + FP)
C. TN / (TN + FP)
D. (TP + TN) / Total

Correct Answer: B - TP / (TP + FP)

Precision = TP / (TP + FP). It measures the proportion of predicted positives that are actually positive. High precision means fewer false positives.

Q50.
What is the main advantage of Gradient Boosting over Bagging?
A. It builds models sequentially, correcting errors of previous models
B. It is faster
C. It uses fewer resources
D. It requires no hyperparameter tuning

Correct Answer: A - It builds models sequentially, correcting errors of previous models

Gradient Boosting builds models sequentially, where each new model focuses on correcting the errors of the previous ensemble. This often leads to better accuracy than bagging methods.
