ANU UG/Degree 4th Sem (Y23) Machine Learning Using Python unit-wise important questions are now available. These questions are important for your semester exams and have been prepared by qualified faculty. Read them carefully to score good marks.
UNIT I INTRODUCTION TO MACHINE LEARNING
Review of Linear Algebra for machine learning; Introduction and motivation for machine learning; Examples of machine learning applications, Vapnik-Chervonenkis (VC) dimension, Probably Approximately Correct (PAC) learning, Hypothesis spaces, Inductive bias, Generalization, Bias variance trade-off.
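To make the bias-variance trade-off and generalization concrete while revising, the short sketch below is an illustrative example only (the toy sine-wave data, the noise level, and the polynomial degrees are assumptions, not part of the syllabus): it fits polynomials of increasing degree by least squares and prints training and test error, showing high bias (underfitting) at low degree and high variance (overfitting) at high degree.

```python
# Illustrative sketch: bias-variance trade-off on hypothetical noisy sine data.
# Low-degree fits underfit (high bias); high-degree fits overfit (high variance).
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a noisy sine wave split into train and test sets
x_train = np.sort(rng.uniform(0, 2 * np.pi, 30))
y_train = np.sin(x_train) + rng.normal(0, 0.3, x_train.size)
x_test = np.sort(rng.uniform(0, 2 * np.pi, 100))
y_test = np.sin(x_test) + rng.normal(0, 0.3, x_test.size)

for degree in (1, 3, 12):
    coeffs = np.polyfit(x_train, y_train, degree)   # least-squares polynomial fit
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Typically the training error keeps falling as the degree grows, while the test error falls and then rises again, which is exactly the trade-off the questions below ask you to explain.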
Short Answer Questions
- What role does linear algebra play in machine learning, and why is it essential for model formulation and computation?
- What are the main motivations behind developing and applying machine learning techniques?
- Provide an example of a machine learning application and discuss its impact on a specific industry or field.
- Define the Vapnik-Chervonenkis (VC) dimension and explain its significance in evaluating model complexity.
- Explain the concept of Probably Approximately Correct (PAC) learning and its role in assessing learning algorithms.
- Describe the concepts of hypothesis spaces and inductive bias, and explain how the bias-variance trade-off affects model generalization.
Long Answer Questions
- Discuss the importance of linear algebra in machine learning, including its role in algorithms such as principal component analysis (PCA) and support vector machines (SVM).
- Explain the key motivations behind machine learning, focusing on the need for automation, handling large datasets, and uncovering hidden patterns.
- Explore various machine learning applications across different industries, and analyze how these applications transform decision-making and operational efficiency.
- Provide a detailed explanation of the Vapnik-Chervonenkis (VC) dimension and Probably Approximately Correct (PAC) learning, discussing how these theoretical frameworks inform model complexity and reliability.
- Define hypothesis spaces and inductive bias in the context of machine learning, and discuss how these factors influence the learning process and model generalization.
- Elaborate on the bias-variance trade-off, explaining its impact on overfitting and underfitting, and discuss strategies for achieving optimal generalization in machine learning models.
UNIT II SUPERVISED LEARNING
Linear Regression Models: Least squares, single & multiple variables, Bayesian linear regression, gradient descent, Linear Classification Models: Discriminant function – Perceptron algorithm, Probabilistic discriminative model - Logistic regression, Probabilistic generative model – Naive Bayes, Maximum margin classifier – Support vector machine, Decision Tree, Random Forests
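The sketch below is a minimal, assumed implementation (the synthetic data, learning rate, and iteration count are arbitrary choices, not prescribed by the syllabus) of batch gradient descent for single-variable least-squares regression, which ties together the least squares objective and the gradient descent topic of this unit.

```python
# Minimal sketch: batch gradient descent for single-variable linear regression.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2.5 * x + 1.0 + rng.normal(0, 1.0, x.size)   # hypothetical data: y ~ 2.5x + 1

w, b = 0.0, 0.0    # slope and intercept
lr = 0.01          # learning rate (hyperparameter, chosen by hand here)

for _ in range(2000):
    y_hat = w * x + b
    error = y_hat - y
    # Gradients of the mean squared error with respect to w and b
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w = {w:.2f}, b = {b:.2f}")   # should approach 2.5 and 1.0
```

The same update rule generalizes to the multiple-variable case by replacing the scalars with a weight vector and a design matrix.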
Short Answer Questions
- Define the least squares method in linear regression and explain its primary objective.
- Compare single-variable and multiple-variable linear regression models.
- What is Bayesian linear regression and how does it differ from classical linear regression?
- Describe the role of gradient descent in optimizing linear regression models.
- Explain the perceptron algorithm and its use as a linear classification method.
- Contrast logistic regression (a probabilistic discriminative model) with Naive Bayes (a probabilistic generative model) in terms of their classification approaches.
Long Answer Questions
- Discuss the fundamental concepts of linear regression, including the least squares method, differences between single-variable and multiple-variable models, Bayesian linear regression, and the role of gradient descent in model optimization.
- Compare and contrast linear classification techniques by explaining the discriminant function and the perceptron algorithm, including their strengths and limitations.
- Analyze logistic regression as a probabilistic discriminative model, detailing its theoretical foundations, parameter estimation, and practical applications in classification.
- Explain the principles of Naive Bayes as a probabilistic generative model, discussing its assumptions, implementation, and scenarios where it performs effectively.
- Elaborate on the concept of maximum margin classifiers by discussing support vector machines, including how they determine decision boundaries and improve classification accuracy.
- Evaluate decision trees and random forests as supervised learning methods, focusing on their methodologies, advantages, limitations, and comparative performance in various classification tasks.
UNIT III ENSEMBLE TECHNIQUES AND UNSUPERVISED LEARNING
Combining multiple learners: Model combination schemes, Voting, Ensemble Learning - bagging, boosting, stacking, Unsupervised learning: K-means, Instance Based Learning: KNN, Gaussian mixture models and Expectation maximization.
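As a study aid for the K-means questions, here is a minimal sketch (the 2-D Gaussian blobs, the value of K, and the random initialisation are assumptions for illustration) of the basic assign-then-update loop that K-means repeats until the centroids stop moving.

```python
# Illustrative sketch: K-means on hypothetical 2-D data.
# Assumes K is known in advance and no cluster becomes empty during training.
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical data: three Gaussian blobs in 2-D
data = np.vstack([
    rng.normal(loc, 0.5, size=(50, 2))
    for loc in ([0, 0], [5, 5], [0, 5])
])

k = 3
centroids = data[rng.choice(len(data), k, replace=False)]   # random initialisation

for _ in range(100):
    # Assignment step: label each point with its nearest centroid
    distances = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    # Update step: move each centroid to the mean of its assigned points
    new_centroids = np.array([data[labels == j].mean(axis=0) for j in range(k)])
    if np.allclose(new_centroids, centroids):
        break
    centroids = new_centroids

print("final centroids:\n", centroids)
```

Gaussian mixture models replace the hard assignment step with soft responsibilities computed in the E-step of Expectation Maximization, which is the comparison the long answer questions ask for.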
Short Answer Questions
- What is ensemble learning, and how does the voting scheme work as a model combination method?
- Define bagging and explain its role in reducing variance in ensemble techniques.
- How does boosting improve model performance in an ensemble learning framework?
- What is stacking in ensemble learning, and how does it differ from bagging and boosting?
- Describe the basic objective of K-means clustering in unsupervised learning.
- How do Gaussian mixture models utilize the Expectation Maximization algorithm in clustering tasks?
Long Answer Questions
- Discuss various model combination schemes used in ensemble learning—including voting, bagging, boosting, and stacking—and explain how each method enhances predictive performance.
- Compare and contrast bagging and boosting, outlining their mechanisms, advantages, and potential drawbacks in ensemble methods.
- Explain the concept of stacking as an ensemble learning technique, detailing its structure, benefits, and challenges when integrating multiple learners.
- Provide a detailed explanation of the K-means clustering algorithm, including its step-by-step process, applications, and limitations in handling different data types.
- Elaborate on the principles of Gaussian mixture models and the role of the Expectation Maximization algorithm in clustering. Include a discussion on how these methods address data heterogeneity.
- Analyze the significance of instance-based learning methods, such as KNN, in the context of data analysis, and compare their use with model-based unsupervised techniques like Gaussian mixture models.
UNIT IV NEURAL NETWORKS
The toy example below is a hedged sketch, not a prescribed implementation (the XOR dataset, hidden-layer size, learning rate, and iteration count are all assumptions): a one-hidden-layer perceptron with ReLU hidden units and a sigmoid output, trained with manually derived backpropagation and gradient descent.

```python
# Minimal sketch: a one-hidden-layer perceptron trained on the XOR toy problem
# with ReLU hidden units, a sigmoid output, and hand-coded backpropagation.
import numpy as np

rng = np.random.default_rng(3)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.normal(0, 1.0, (2, 8)); b1 = np.zeros(8)   # input -> hidden
W2 = rng.normal(0, 1.0, (8, 1)); b2 = np.zeros(1)   # hidden -> output
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    # Forward pass
    h = np.maximum(0, X @ W1 + b1)   # ReLU hidden layer
    p = sigmoid(h @ W2 + b2)         # sigmoid output probability
    # Backward pass (binary cross-entropy loss)
    dz2 = (p - y) / len(X)           # gradient at the output pre-activation
    dW2 = h.T @ dz2; db2 = dz2.sum(axis=0)
    dh = dz2 @ W2.T
    dz1 = dh * (h > 0)               # ReLU derivative
    dW1 = X.T @ dz1; db1 = dz1.sum(axis=0)
    # Gradient descent updates
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

# Typically converges close to [0, 1, 1, 0]; the exact values depend on the seed
print(np.round(p.ravel(), 2))
```

Replacing the full-batch updates with updates on randomly sampled mini-batches turns this into stochastic gradient descent, and swapping ReLU for sigmoid in the hidden layer is a simple way to observe unit saturation.
Multilayer perceptron, activation functions, network training – gradient descent optimization – stochastic gradient descent, error backpropagation, from shallow networks to deep networks – unit saturation (aka the vanishing gradient problem) – ReLU, hyperparameter tuning, batch normalization, regularization, dropout
Short Answer Questions
- What is a multilayer perceptron, and how does it differ from a single-layer perceptron?
- Name two common activation functions and describe their roles in neural networks.
- What is gradient descent optimization, and why is it critical in training neural networks?
- How does stochastic gradient descent differ from standard gradient descent?
- What is error backpropagation, and how does it facilitate weight updates in a neural network?
- Define unit saturation (the vanishing gradient problem) and explain its impact on deep network training.
Long Answer Questions
- Discuss the architecture of a multilayer perceptron, including the role of activation functions such as ReLU, and explain how these components contribute to learning complex patterns.
- Explain the process of network training in neural networks, focusing on gradient descent optimization and stochastic gradient descent, and describe how these methods help minimize the error function.
- Elaborate on the error backpropagation algorithm, detailing its steps and how it propagates errors backward through the network to update weights effectively.
- Analyze the vanishing gradient problem (unit saturation) in deep neural networks, discussing its causes, the challenges it presents, and strategies to overcome it.
- Compare and contrast various activation functions (e.g., sigmoid, tanh, ReLU), discussing their mathematical properties and the implications for network performance and convergence.
- Discuss the importance of hyperparameter tuning, batch normalization, regularization, and dropout in neural network training, explaining how each technique improves generalization and mitigates overfitting.
UNIT V DESIGN AND ANALYSIS OF MACHINE LEARNING EXPERIMENTS
Guidelines for machine learning experiments, Cross Validation (CV) and resampling – K-fold CV, bootstrapping, measuring classifier performance, assessing a single classification algorithm and comparing two classification algorithms – t test, McNemar’s test, K-fold CV paired t test
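The sketch below is an illustrative example only (it assumes scikit-learn and SciPy are installed; the iris dataset and the two classifiers are arbitrary choices): it runs 10-fold cross-validation for two classifiers on identical folds and then applies a paired t-test to the per-fold accuracies, which is the K-fold CV paired t-test idea asked about below.

```python
# Illustrative sketch: K-fold CV for two classifiers plus a paired t-test.
from scipy.stats import ttest_rel
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# Per-fold accuracy for each classifier on identical folds
scores_lr = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
scores_nb = cross_val_score(GaussianNB(), X, y, cv=cv)

print("logistic regression mean accuracy:", scores_lr.mean().round(3))
print("naive Bayes mean accuracy:        ", scores_nb.mean().round(3))

# K-fold CV paired t-test: is the difference in per-fold accuracy significant?
t_stat, p_value = ttest_rel(scores_lr, scores_nb)
print(f"paired t-test: t = {t_stat:.3f}, p = {p_value:.3f}")
```

Because both classifiers are evaluated on the same folds, the paired test accounts for fold-to-fold variation, which is why it is preferred here over an unpaired t-test.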
Short Answer Questions
- What are the key guidelines for designing machine learning experiments, and why are they important for reproducibility and validity?
- Define cross-validation and explain its role in assessing model performance.
- What is K-fold cross-validation, and what are its main advantages over other resampling methods?
- How does bootstrapping differ from K-fold cross-validation in evaluating a classifier's performance?
- What are some common metrics used for measuring classifier performance, and why is it important to use multiple metrics?
- Explain the purpose of statistical tests such as the t-test and McNemar’s test in comparing classification algorithms.
Long Answer Questions
- Discuss the key guidelines for designing and analyzing machine learning experiments, including considerations for data splitting, parameter tuning, and ensuring experimental reproducibility.
- Explain the concept of cross-validation in detail, comparing K-fold cross-validation and bootstrapping. Highlight the advantages and limitations of each method.
- Describe various classifier performance metrics (such as accuracy, precision, recall, F1 score, and ROC-AUC). How do these metrics contribute to a comprehensive evaluation of a model?
- Elaborate on the process of assessing the performance of a single classification algorithm using cross-validation and resampling techniques. What are the potential pitfalls and how can they be mitigated?
- Compare and contrast the use of the t-test, McNemar’s test, and the K-fold CV paired t-test in evaluating and comparing the performance of two classification algorithms. Discuss their assumptions and applicability.
- Analyze the importance of resampling methods in machine learning experiments. How do techniques like cross-validation and bootstrapping improve the reliability and generalizability of model performance estimates?