Machine Learning Basics Question Bank for C-CAT
Topic-wise Machine Learning Basics MCQs for CDAC C-CAT preparation with answers and explanations.
Correct Answer: D - Artificial Intelligence
Machine Learning is a branch of AI that enables systems to learn from data without being explicitly programmed for each task.
Correct Answer: B - Labeled training data
Supervised learning uses labeled data where both input features and target outputs are provided.
Correct Answer: C - Customer segmentation/clustering
Clustering groups similar data points without predefined labels - a classic unsupervised learning task.
Correct Answer: B - Agent learning through rewards and penalties
Reinforcement learning trains agents to make decisions by receiving rewards or penalties for actions.
Correct Answer: A - Discrete categories/classes
Classification predicts discrete categorical labels like spam/not spam, cat/dog, etc.
Correct Answer: D - Continuous numerical values
Regression predicts continuous values like prices, temperatures, or sales figures.
Correct Answer: D - Model memorizes training data but fails on new data
Overfitting happens when a model learns noise in training data, performing well on training but poorly on unseen data.
Correct Answer: C - Model is too simple to capture patterns
Underfitting occurs when the model is too simple to capture the underlying patterns in the data.
Correct Answer: A - Evaluating model performance on different data splits
Cross-validation evaluates model performance by training and testing on different subsets of data.
Correct Answer: B - Input variable used for prediction
Features are input variables (attributes) that the model uses to make predictions.
Correct Answer: B - Train/fit the model parameters
The training set is used to fit the model's parameters: the model learns its patterns from this data.
Correct Answer: B - Evaluate model on unseen data
The test set is used to evaluate how well the trained model generalizes to new, unseen data.
Correct Answer: D - Feature values to maximize information gain
Decision trees split data on feature values that best separate classes (maximize information gain/minimize impurity).
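As a rough illustration of "maximize information gain", the sketch below computes entropy and the gain of two candidate splits. The function names and the tiny spam/ham label lists are hypothetical, and entropy is only one of several impurity measures a tree might use.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(parent, left, right):
    """Entropy reduction obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# Hypothetical labels at a node before splitting.
parent = ["spam", "spam", "ham", "ham"]

# A split that separates the classes perfectly versus one that does not help.
pure_gain = information_gain(parent, ["spam", "spam"], ["ham", "ham"])
mixed_gain = information_gain(parent, ["spam", "ham"], ["spam", "ham"])
print(pure_gain, mixed_gain)
```

A tree-building routine would evaluate many candidate splits this way and keep the one with the largest gain.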
Correct Answer: A - Majority vote of k nearest neighbors
KNN classifies a point based on the majority class among its k nearest neighbors in feature space.
Correct Answer: D - Best-fit line to predict continuous values
Linear regression fits a line (or hyperplane) that minimizes the error between predictions and actual values.
Correct Answer: D - Binary classification
Despite its name, logistic regression is used for binary classification: it predicts the probability that an input belongs to the positive class.
Correct Answer: C - Percentage of correct predictions
Accuracy is the ratio of correct predictions (both positive and negative) to total predictions.
Correct Answer: C - Proportion of true positives among predicted positives
Precision = True Positives / (True Positives + False Positives) - how many predicted positives are actually positive.
Correct Answer: B - Proportion of actual positives correctly identified
Recall = True Positives / (True Positives + False Negatives) - how many actual positives were correctly identified.
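The accuracy, precision, and recall formulas can be checked on a small example. The sketch below counts the four confusion-matrix cells by hand; the label lists are made up for illustration and the helper name is hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) for a binary problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

# Hypothetical ground truth and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + tn + fn)  # correct predictions / all predictions
precision = tp / (tp + fp)                  # true positives / predicted positives
recall = tp / (tp + fn)                     # true positives / actual positives
print(tp, fp, tn, fn, accuracy, precision, recall)
```

Here TP = 2, FP = 2, TN = 3, FN = 1, so the three metrics come out different from one another, which is why they are reported separately.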
Correct Answer: A - Reducing bias increases variance and vice versa
Decreasing bias (by making the model more complex) typically increases variance (sensitivity to the particular training data), and vice versa.
Correct Answer: C - Supervised Learning
Supervised Learning uses labeled training data (input-output pairs) to learn a mapping function. The model is trained on known correct answers and then makes predictions on new, unseen data.
Correct Answer: A - Unsupervised Learning
K-Means is an unsupervised learning algorithm used for clustering. It partitions data into K clusters based on similarity, without requiring labeled training data.
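A minimal sketch of the K-Means loop on one-dimensional data, assuming hand-picked starting centroids (real implementations usually choose them randomly or with a seeding heuristic). The function name and the sample points are hypothetical.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate assignment and update steps on 1-D data."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its old centroid
                centroids[i] = sum(members) / len(members)
    return centroids, clusters

# Hypothetical unlabeled data with two obvious groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids, clusters)
```

No labels are supplied at any point; the grouping comes only from distances between points.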
Correct Answer: D - The model performs well on training data but poorly on test data
Overfitting occurs when a model learns the training data too well, including noise and outliers, resulting in excellent training accuracy but poor generalization to new, unseen data.
Correct Answer: D - Decision Tree
Decision Trees are commonly used for classification tasks. They split data into branches based on feature values and make predictions at leaf nodes. They can also be used for regression.
Correct Answer: D - To evaluate model performance and reduce overfitting
Cross-validation is a technique to evaluate model performance by splitting data into multiple folds, training on some folds and testing on others. It helps detect overfitting and provides a more reliable performance estimate.
Correct Answer: C - Continuous values
Linear Regression predicts continuous numerical values by finding the best-fit linear relationship between input features and the output variable. For example, predicting house prices based on area.
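For a single input feature, the best-fit line has a closed-form least-squares solution. The sketch below applies it to the house-price example; the area and price figures are invented so that the relationship is exactly linear, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: price is exactly 3 units per unit of area.
areas = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(areas, prices)
predicted = slope * 90 + intercept  # continuous prediction for an unseen area
print(slope, intercept, predicted)
```

With several features the same idea generalizes to fitting a hyperplane, usually solved with matrix methods or gradient descent.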
Correct Answer: C - Mean Squared Error
Mean Squared Error (MSE) is a regression metric, not a classification metric. Classification tasks typically use Accuracy, Precision, Recall, and F1-Score for evaluation.
Correct Answer: C - Number of nearest neighbors to consider
In KNN, 'K' represents the number of nearest neighbors to consider when making a classification or regression prediction. A new data point is classified based on the majority class of its K nearest neighbors; for regression, the prediction is typically the average of their target values.
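A minimal sketch of KNN classification, assuming Euclidean distance and a plain majority vote. The training points, labels, and function name are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    top_k = [label for _, label in distances[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Hypothetical 2-D training data with two well-separated classes.
train_points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

print(knn_predict(train_points, train_labels, (2, 2), k=3))
print(knn_predict(train_points, train_labels, (7, 7), k=3))
```

An odd K is a common choice for two-class problems because it avoids tied votes.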
Correct Answer: B - Regularization
Regularization prevents overfitting by adding a penalty term to the loss function. L1 (Lasso) and L2 (Ridge) regularization are common techniques that discourage overly complex models.
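To make the "penalty term" concrete, the sketch below adds an L2 or L1 penalty to the mean squared error of a linear model. The weights, data, and the strength `lam` are hypothetical values, and the exact scaling of the penalty varies between textbooks and libraries.

```python
def mse(weights, xs, ys):
    """Mean squared error of a linear model without an intercept."""
    predictions = [sum(w * x for w, x in zip(weights, row)) for row in xs]
    return sum((p - y) ** 2 for p, y in zip(predictions, ys)) / len(ys)

def ridge_loss(weights, xs, ys, lam):
    """MSE plus an L2 penalty: lam * sum of squared weights."""
    return mse(weights, xs, ys) + lam * sum(w ** 2 for w in weights)

def lasso_loss(weights, xs, ys, lam):
    """MSE plus an L1 penalty: lam * sum of absolute weights."""
    return mse(weights, xs, ys) + lam * sum(abs(w) for w in weights)

# Hypothetical data that the weights [1.0, 2.0] fit exactly.
xs = [[1.0, 2.0], [2.0, 1.0]]
ys = [5.0, 4.0]
weights = [1.0, 2.0]

print(mse(weights, xs, ys))
print(ridge_loss(weights, xs, ys, lam=0.1))
print(lasso_loss(weights, xs, ys, lam=0.1))
```

Even with zero data error the regularized losses are positive, so the optimizer is pushed toward smaller weights.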
Correct Answer: A - Predicted positive, actually positive
A True Positive (TP) occurs when the model correctly predicts the positive class. The actual label is positive, and the model's prediction is also positive.
Correct Answer: C - Random Forest
Random Forest is an ensemble learning method that builds multiple decision trees and merges their predictions. It reduces overfitting and improves accuracy compared to a single decision tree.
Correct Answer: A - Tradeoff between underfitting and overfitting
The bias-variance tradeoff is the balance between underfitting (high bias, low variance) and overfitting (low bias, high variance). A good model balances both to minimize total error.
Correct Answer: D - Support Vector Machine
Support Vector Machine (SVM) finds the optimal hyperplane that maximizes the margin between two classes. The data points closest to the hyperplane are called support vectors.
Correct Answer: B - Bayes' Theorem
Naive Bayes is based on Bayes' Theorem with the 'naive' assumption that features are conditionally independent given the class. It calculates posterior probability for classification.
Correct Answer: B - To bring all features to the same scale
Feature scaling normalizes the range of features so that no single feature dominates the model due to its scale. Common methods include Min-Max scaling and Standardization (z-score).
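Both scaling methods can be sketched in a few lines. The income values are hypothetical, and the code assumes the feature is not constant (otherwise the denominators would be zero).

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Z-score: subtract the mean, divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Hypothetical feature measured on a large scale.
incomes = [20000, 40000, 60000, 100000]

scaled = min_max_scale(incomes)
z_scores = standardize(incomes)
print(scaled)
print(z_scores)
```

After min-max scaling the values lie between 0 and 1; after standardization they have mean 0 and standard deviation 1.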
Correct Answer: D - Principal Component Analysis
Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms high-dimensional data into a lower-dimensional space while preserving maximum variance.
Correct Answer: B - The size of each step toward the minimum
The learning rate controls the size of the steps taken during gradient descent. A learning rate that is too large may overshoot the minimum, while one that is too small results in slow convergence.
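The effect of the learning rate is easy to see on the toy function f(x) = x^2, whose gradient is 2x. This is only an illustrative sketch; the starting point, step counts, and rates are arbitrary choices.

```python
def gradient_descent(start, learning_rate, steps):
    """Minimize f(x) = x**2 by repeatedly stepping against the gradient 2*x."""
    x = start
    for _ in range(steps):
        gradient = 2 * x
        x -= learning_rate * gradient
    return x

start = 10.0
well_tuned = gradient_descent(start, learning_rate=0.1, steps=50)    # approaches 0
too_small = gradient_descent(start, learning_rate=0.001, steps=50)   # barely moves
too_large = gradient_descent(start, learning_rate=1.1, steps=50)     # overshoots and diverges
print(well_tuned, too_small, too_large)
```

With rate 1.1 each step jumps past the minimum to a point farther away on the other side, so the iterates grow instead of shrinking.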
Correct Answer: C - Classification predicts categories; regression predicts continuous values
Classification predicts discrete categorical labels (e.g., spam/not spam), while regression predicts continuous numerical values (e.g., house price). They are both supervised learning tasks.
Correct Answer: D - Too simple model
Underfitting occurs when the model is too simple to capture the underlying patterns in the data. It results in high bias and poor performance on both training and test data.
Correct Answer: A - Proportion of variance explained by the model
R-squared measures the proportion of variance in the dependent variable that is explained by the independent variables. A value of 1 indicates a perfect fit, while 0 indicates that the model does no better than always predicting the mean of the target.
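A short sketch of the R-squared computation as 1 minus the ratio of residual to total sum of squares. The target values and predictions are hypothetical; the last case shows that a model worse than the mean predictor gives a negative value under this definition.

```python
def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical targets and three sets of predictions.
y_true = [3, 5, 7, 9]
perfect = r_squared(y_true, [3, 5, 7, 9])         # every prediction exact
mean_only = r_squared(y_true, [6, 6, 6, 6])       # always predicts the mean
worse_than_mean = r_squared(y_true, [9, 7, 5, 3]) # trend reversed
print(perfect, mean_only, worse_than_mean)
```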
Correct Answer: C - Classification problems
Despite its name, Logistic Regression is used for binary classification. It uses the sigmoid function to output probabilities and classifies data points into one of two categories.
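The sigmoid step can be sketched as follows. The weights, bias, and 0.5 threshold are hypothetical values chosen for illustration, not parameters learned from data.

```python
import math

def sigmoid(z):
    """Map any real number to the interval (0, 1).

    Note: very large negative z would overflow math.exp in this naive form.
    """
    return 1 / (1 + math.exp(-z))

def predict_class(features, weights, bias, threshold=0.5):
    """Return (probability of the positive class, predicted label 0 or 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(z)
    return probability, int(probability >= threshold)

# Hypothetical single-feature model.
weights, bias = [1.5], -1.0

print(predict_class([2.0], weights, bias))  # z = 2.0, probability above 0.5
print(predict_class([0.0], weights, bias))  # z = -1.0, probability below 0.5
```

The model itself outputs a probability; the class label comes from comparing that probability with a threshold.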
Correct Answer: A - Hold-out Validation
Hold-out validation splits the dataset into training and testing sets only once (typically 70-30 or 80-20). It is simpler than cross-validation, but its estimate depends on which samples happen to land in the single test split, so it is less reliable.
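A minimal sketch of a single 80-20 hold-out split, assuming the data can be shuffled freely (no time ordering or grouping). The function name, seed, and toy dataset are hypothetical.

```python
import random

def hold_out_split(data, test_fraction=0.2, seed=0):
    """Shuffle once, then cut the data into (train, test)."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset of 10 sample identifiers.
samples = list(range(10))
train, test = hold_out_split(samples, test_fraction=0.2)
print(train, test)
```

Changing the seed changes which samples are held out, which is the source of the variability mentioned above.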
Correct Answer: C - Bootstrap Aggregating
Bagging stands for Bootstrap Aggregating. It creates multiple subsets of the training data using bootstrapping (sampling with replacement), trains a model on each, and combines predictions.
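The bootstrap-then-aggregate idea can be sketched with a deliberately trivial base model that just predicts the mean of its sample. The data, seed, and all function names are hypothetical; a real ensemble would train something like a decision tree on each sample.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def train_mean_model(sample):
    """Toy base learner: always predicts the mean of its training sample."""
    mean = sum(sample) / len(sample)
    return lambda x: mean

def bagging_fit(data, train, n_models, seed=0):
    """Train one model per bootstrap sample."""
    rng = random.Random(seed)
    return [train(bootstrap_sample(data, rng)) for _ in range(n_models)]

def bagging_predict(models, x):
    """Aggregate by averaging the individual predictions."""
    outputs = [model(x) for model in models]
    return sum(outputs) / len(outputs)

# Hypothetical target values.
data = [2.0, 4.0, 6.0, 8.0, 10.0]
models = bagging_fit(data, train_mean_model, n_models=25)
print(bagging_predict(models, x=None))
```

For classification the aggregation step would typically be a majority vote instead of an average.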
Correct Answer: A - Data becomes sparse in high-dimensional space
The curse of dimensionality refers to the phenomenon where data becomes increasingly sparse as the number of features (dimensions) increases, making it harder for algorithms to find patterns.
Correct Answer: B - Binary Cross-Entropy
Binary Cross-Entropy (Log Loss) is the standard loss function for binary classification. It measures the difference between predicted probabilities and actual binary labels.
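A sketch of the loss computation, averaged over samples. The labels and probabilities are hypothetical, and the small `eps` clipping is an assumption added here to avoid taking the logarithm of zero.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Average of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical labels with confident-correct and confident-wrong predictions.
y_true = [1, 0]
good_loss = binary_cross_entropy(y_true, [0.9, 0.1])
bad_loss = binary_cross_entropy(y_true, [0.1, 0.9])
print(good_loss, bad_loss)
```

Confident wrong predictions are penalized much more heavily than confident correct ones are rewarded.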
Correct Answer: A - 5
In 5-fold cross-validation, the data is split into 5 folds. The model is trained 5 times, each time using 4 folds for training and 1 fold for testing. The results are then averaged.
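The fold bookkeeping can be sketched with index lists alone. This version assumes no shuffling and a hypothetical dataset of 10 samples; the generator name is made up for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

Each sample appears in exactly one test fold, so the model is trained and evaluated 5 times and the 5 scores are averaged.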
Correct Answer: B - One complete pass through the entire training dataset
An epoch refers to one complete pass through the entire training dataset during model training. Multiple epochs are typically needed for the model to converge to optimal parameters.
Correct Answer: A - Clustering
Clustering is a type of unsupervised learning that groups similar data points together without using labeled data. K-Means, DBSCAN, and Hierarchical clustering are common methods.
Correct Answer: B - TP / (TP + FP)
Precision = TP / (TP + FP). It measures the proportion of predicted positives that are actually positive. High precision means fewer false positives.
Correct Answer: A - It builds models sequentially, correcting errors of previous models
Gradient Boosting builds models sequentially, where each new model focuses on correcting the errors of the previous ensemble. This often leads to better accuracy than bagging methods.
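The sequential error-correcting idea can be sketched for regression with squared error, where each new model is fitted to the residuals of the current ensemble. Everything here is a simplified assumption: one input feature, depth-1 trees (stumps) as base learners, a fixed learning rate of 0.5, and made-up data.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split (depth-1 regression tree) under squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def gradient_boost(xs, ys, rounds, learning_rate=0.5):
    """Start from the mean, then add stumps fitted to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

def training_mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Hypothetical nonlinear data.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 4, 9, 16, 25, 36]

for rounds in (0, 1, 20):
    print(rounds, training_mse(gradient_boost(xs, ys, rounds), xs, ys))
```

The training error shrinks as rounds are added because each stump targets what the previous ensemble still gets wrong; this says nothing by itself about error on unseen data.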