Machine Learning Solved Question Paper – To Score Better – RGPV (CS-8003)

Introduction

This article ‘RGPV (CS-8003) – Machine Learning – Solved Question Paper – To Score Better’ has answers to the questions asked in the above-mentioned RGPV IT Semester 8 Machine Learning question paper.

The article covers the K-Means algorithm, time series, decision trees and boosting, linear quadratic regulation, independent component analysis, real-world ML, back propagation algorithms, support vector machines, steps in machine learning, Bayes theorem, hidden Markov models, reinforcement learning, association rules, principal component analysis, big data and MapReduce, common software for ML, subset selection, etc.

CS-8003(1)-CBGS
B.E. VIII Semester
Examination, December 2020
Choice Based Grading System (CBGS)
Machine Learning
Time : Three Hours
Maximum Marks : 70
Note: i) Attempt any five questions.
ii) All questions carry equal marks.

1.a) Explain K-Means algorithm with suitable example.

K-Means Algorithm:

  • Unsupervised clustering algorithm, where unsupervised means clustering is done without any labelled data.
  • Use: To solve clustering problems.
  • Unlike classification, which learns from labelled training data, clustering makes rules on its own.
  • The k in K-Means denotes the number of clusters to be formed out of the data, e.g.
    • k=5 means 5 clusters are required.
    • k=3 means 3 clusters are required, and so on.
  • Clusters are formed around centroids, i.e. points lying around a centroid come under one cluster.
  • The value of k in this algorithm is pre-determined.
  • Examples or real-world applications:
    • Sentence clustering, i.e. in NLP, to cluster sentences of the same type together, reducing the manual effort to do so.
    • Customer segmentation or market segmentation based on common attributes, i.e. clustering paying customers into one category, or clustering customers by region or area of interest, etc.
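
A minimal K-Means sketch in Python, assuming scikit-learn is available (the answer above does not name a library):

```python
# Minimal K-Means sketch using scikit-learn (library choice is an assumption).
import numpy as np
from sklearn.cluster import KMeans

# Toy 2-D points forming two rough groups.
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

# k=2 means two clusters are required.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X)
print(kmeans.labels_)           # cluster index assigned to each point
print(kmeans.cluster_centers_)  # the two learned centroids
```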

Source: javatpoint | analyticsvidhya | wikipedia


1.b) Discuss in briefly about time series in ML.

Time Series in Machine Learning:

  • It is part of predictive analytics.
  • A date or timestamp is the primary variable required.
  • Time series algorithms are used to forecast or predict output at future dates.
  • Example: Stock price prediction.
  • So, sufficient historical data is required to predict the future outcome.
  • Models used for time series forecasting:
    • ARMA: Autoregressive moving average
    • ARIMA: Autoregressive integrated moving average
  • Benefits:
    • Helps in analysing the systematic patterns over time
  • Applications:
    • Weather forecasting
    • Stock price forecasting
    • Sales forecasting etc.
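
A hedged ARIMA forecasting sketch in Python, assuming the statsmodels library and a synthetic series (both are illustrative assumptions, not from the sources):

```python
# Hypothetical ARIMA forecasting sketch (assumes statsmodels is installed).
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Synthetic daily "price" series with a mild upward trend plus noise.
series = np.cumsum(np.random.normal(0.5, 1.0, size=200))

# ARIMA(p=1, d=1, q=1): autoregressive, differencing, moving-average orders.
model = ARIMA(series, order=(1, 1, 1)).fit()
print(model.forecast(steps=5))  # forecast the next 5 time steps
```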

Source: tutorialspoint | tableau

2.a) Give detailed discussion on decision trees and boosting.

Decision Tree:

  • A tree-like flowchart structure is called a decision tree.
  • It has nodes and leaves, where a node tests a condition and a leaf represents the outcome after that condition is tested.
    • E.g. a node's test condition can be whether a number is >5 or not, so there are two possible outputs, True or False.
      • True if the number is >5.
      • False if the number is <=5.
    • Another example: on flipping a coin, the result will be Heads or Tails.
  • It is a classification algorithm.
  • It is a good algorithm for identifying the best route to reach a goal based on different if-else conditions.
  • Its advantage is that it's quite easy to understand and can give 2-3 different possible ways to reach the goal, with the probability of each one of them.
  • Although, many better algorithms are also available.
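
A minimal decision tree sketch, assuming scikit-learn (the sources do not name a library):

```python
# Minimal decision tree classifier sketch (library choice is an assumption).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Each internal node tests a condition on a feature; leaves hold class labels.
clf = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(clf.predict(X[:5]))  # predicted classes for the first five samples
```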

Source: geekforgeeks | wikipedia

Boosting in Decision Tree:

  • Boosting combines several weak learners or trees in series to form a strong classifier.
  • The benefit of combining weak learners is that each tree or learner tries to reduce the error from the previous tree or learner.
  • Boosting is a way to improve the accuracy of decision trees.
  • Examples:
    • Adaboost
    • Gradient Boosting
    • XGBoost
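
A hedged boosting sketch using scikit-learn's GradientBoostingClassifier, one of the examples listed above (the dataset is synthetic and illustrative):

```python
# Gradient boosting sketch: shallow trees added in series, each correcting
# the residual error of the ensemble built so far.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

gb = GradientBoostingClassifier(n_estimators=100, max_depth=2).fit(X_tr, y_tr)
print(gb.score(X_te, y_te))  # accuracy on the held-out test set
```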

Source: Indico | towardsdatascience

2.b) Explain linear quadratic regulation.

Linear Quadratic Regulation:

  • It relates to minimising the cost of a dynamic system, where:
    • The system dynamics are described by linear differential equations.
    • The cost is described by a quadratic function.
  • The reason linear quadratic regulation is important is that very few solutions are available for optimising dynamic problems, and the available ones are quite complex, whereas linear quadratic regulation is quite simple and easy compared with those solutions.
  • Refer to the general description part of the Wikipedia link below, where it is explained very nicely, i.e.:
    • How a machine's or process's cost function is minimised.
    • The importance of the human being who provides the weighting factors to optimise the cost function.
    • The factors the cost function depends on, like temperature, altitude, pressure, etc.
    • Benefits include a decrease in the work done by engineers, or manual effort, by optimising controller parameters to operate the system efficiently.
    • But although the engineer optimises the regulators or controllers, human intelligence is still required to operate the system optimally by providing optimal inputs.
  • The second link has all the mathematical calculations behind finding the LQR solution; a small numerical sketch follows below.
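
As a hedged illustration (not from the sources): for dynamics x' = Ax + Bu and quadratic cost J = ∫(xᵀQx + uᵀRu)dt, the optimal control is u = -Kx, and K can be computed numerically, assuming SciPy is available. The double-integrator system below is an illustrative assumption:

```python
# LQR sketch: solve the continuous-time algebraic Riccati equation with
# SciPy and form the optimal gain K (assumes scipy and numpy installed).
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])  # double-integrator dynamics
B = np.array([[0.0], [1.0]])
Q = np.eye(2)          # quadratic state-cost weight
R = np.array([[1.0]])  # quadratic control-cost weight

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P   # optimal feedback gain: u = -K x
print(K)
```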

Source: wikipedia | kostasalexis

3.a) Explain working principle of Independent components analysis.

Independent Components Analysis:

  • A method for separating a multivariate signal into additive subcomponents.
  • Assumption: The subcomponents are statistically independent of each other and are non-Gaussian.
  • Applications:
    • Face recognition
    • Stock market price prediction
    • Colour-based detection of the ripeness of tomatoes, etc.
  • While principal component analysis deals with principal components, ICA deals with independent components (source: ML | Independent Component Analysis).
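
A hedged ICA sketch using scikit-learn's FastICA (library choice is an assumption): two non-Gaussian sources are mixed additively and then recovered:

```python
# ICA sketch: unmix two additively mixed signals (illustrative example).
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)                       # source 1: sinusoid
s2 = np.sign(np.cos(3 * t))              # source 2: square wave (non-Gaussian)
S = np.c_[s1, s2]

A = np.array([[1.0, 0.5], [0.5, 1.0]])   # mixing matrix
X = S @ A.T                              # observed mixed signals

ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)             # recovered independent components
print(S_est.shape)
```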

To understand the mathematics behind it and other related information, refer to the Wikipedia article linked in the sources.

The second link also shows its usage visually with an example; refer to it if more information is required.

Source: Wikipedia | geekforgeeks

3.b) Give short notes on Real world ML.

Real World ML Applications:

  • Healthcare:
    • For analysing data on the effects of different doses of medicines.
  • Social media:
    • Friend suggestions on social media like Facebook, based on mutual connections.
  • Mobile unlock using face recognition.
  • Uber's self-driving project:
    • Uber has been collecting data for a long time and running its algorithms over it to build a reliable model for self-driving cars.
  • Natural Language Processing:
    • Extracting useful information from text data.
    • Finding spelling mistakes and improper sentence formations, e.g. Grammarly, or the spelling and sentence-correction suggestions in Gmail while writing mails.
    • Google's search algorithm, where even if someone types incorrect sentences or words they get correct results, or Google shows them correct result recommendations.
  • Speech-to-text conversion.
  • Voice-based smart assistants like Alexa, Siri, etc.
  • Route optimisation, as in Google Maps, where for going from point A to B it shows the best possible route based on minimum traffic and time.
  • Netflix's movie recommendation engine, which suggests titles based on the user's past history.

Source: hackr.io | simplilearn

4.a) Explain how back propagation algorithms helps in classification.

  • The back propagation algorithm is a method for training artificial neural networks, where a neural network has connected nodes and each connection has a different weight.
  • Traditional neural networks were fine, but over time improvements were made, and that is when back propagation came into being.
  • In back propagation, the weights are fine-tuned based on the error rate from the previous iteration, so with every new iteration the error decreases and the output of the model improves.
  • Types of back-propagation networks:
    • Static back-propagation
    • Recurrent back-propagation
  • Advantages:
    • Removes weighted links that have minimal effect.
    • Simple and easy to program.
  • Disadvantage:
    • Sensitive to noisy data.
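
A hedged NumPy sketch of back propagation on the XOR problem (purely illustrative; the sources give no code). Each iteration runs a forward pass, propagates the output error backwards, and nudges the weights to reduce it:

```python
# Back propagation sketch: one hidden-layer network trained on XOR.
import numpy as np

rng = np.random.default_rng(1)
# XOR inputs with a constant bias column appended.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(3, 4))   # input (+bias) -> 4 hidden units
W2 = rng.normal(size=(4, 1))   # hidden -> output
sigmoid = lambda z: 1 / (1 + np.exp(-z))

for _ in range(10000):
    h = sigmoid(X @ W1)                      # forward pass
    out = sigmoid(h @ W2)
    d_out = (out - y) * out * (1 - out)      # error gradient at the output
    d_h = (d_out @ W2.T) * h * (1 - h)       # error propagated backwards
    W2 -= 0.5 * h.T @ d_out                  # updates reduce the error
    W1 -= 0.5 * X.T @ d_h

print(out.round(2))  # should approach [[0], [1], [1], [0]]
```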

Source: guru99 | brainkart

4.b) Explain the steps in developing a machine learning algorithm.

Steps involved in developing a machine learning algorithm:

  • Data collection:
    • Collecting data from different sources if the data is already captured, or doing the necessary instrumentation to capture the data.
    • Building a pipeline to store the data in a database from which it can be fetched by the model code for training.
  • Data preparation:
    • Doing basic data analysis to remove noise and outliers from the data, as those can mess up the final results.
  • Model selection:
    • Selecting the model or algorithm that seems suitable for the problem statement.
  • Model training:
    • Partitioning the data into training and testing sets.
    • Training the model on the training set.
  • Model evaluation:
    • Testing the model by running it on the testing set and evaluating the model output.
  • Parameter tuning:
    • Fine-tuning the model parameters to improve the final results.
  • Final predictions using the model:
    • Making the final predictions on data, or deploying the model to production.
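
A compact sketch of several of these steps in Python, assuming scikit-learn (the dataset and model choices below are illustrative):

```python
# End-to-end sketch: data, split, model, tuning, evaluation.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Data collection / preparation.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Model selection + training.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Parameter tuning via cross-validated grid search.
grid = GridSearchCV(pipe, {"logisticregression__C": [0.1, 1.0, 10.0]}, cv=5)
grid.fit(X_tr, y_tr)

# Model evaluation on the held-out testing set.
print(grid.score(X_te, y_te))
```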

Source: simplilearn | analyticsindiamag

5.a) What is the goal of support vector machine? How to compute the margin?

Support Vector Machine Algorithm

  • Type: Supervised algorithm, i.e. it needs labelled data to train the model and create rules.
  • Usage: Classification as well as regression; in classification it classifies data into separate classes.
  • Visually, it creates a best-fit line or boundary or hyperplane between data points, separating them into different classes.
  • In future, when a new data point comes in, it gets classified onto one side of the boundary.
  • Applications: Text classification, image classification, face detection, handwriting recognition, etc.
  • Types:
    • Linear SVM: Used in cases where data can be classified linearly.
    • Non-Linear SVM: Used in cases where data can't be classified linearly.

Margin:

  • There can be multiple lines or hyperplanes that separate the data points into classes, but the line with the maximum margin is the best hyperplane.
  • The SVM algorithm helps in identifying that best hyperplane.
  • The margin is the distance between the hyperplane and the nearest data points of each class (the support vectors). For a hyperplane w·x + b = 0, with the nearest points scaled to satisfy |w·x + b| = 1, the margin is 2/||w||, so maximising the margin means minimising ||w||, as the sketch below computes.
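
A hedged sketch, assuming scikit-learn: fit a linear SVM and compute the margin width 2/||w|| from the learned coefficients:

```python
# Fit a linear SVM and compute its margin (illustrative blob data).
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=6)
clf = SVC(kernel="linear", C=1000).fit(X, y)

w = clf.coef_[0]
margin = 2 / np.linalg.norm(w)   # distance between the two margin boundaries
print(margin)
```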

Source: javatpoint | Support-vector machine – wikipedia | SVM: Difference between Linear and Non-Linear Models

5.b) Explain Bayes theorem.

Bayes Theorem:

  • A mathematical formula to find conditional probability, i.e. the probability of occurrence of a second event given that the first event has already happened.
  • Bayes theorem incorporates prior probability to find the probability of events that will happen later.
  • Mathematical formula: P(A|B) = [P(B|A) × P(A)] / P(B)
  • Application:
    • A solution method for puzzles.
    • Finding defective item rate in a company using multiple machines.
    • Finding cancer rate based on symptoms.
  • Example:
    • Probability of getting a tail on the 2nd toss when the result of the 1st toss was heads.
    • It is also used in finance to predict the probabilities of events occurring.
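
A hedged worked example of the formula above (the disease-testing numbers are illustrative assumptions):

```python
# Worked Bayes example: P(disease | positive test).
p_disease = 0.01               # prior: 1% prevalence
p_pos_given_disease = 0.99     # test sensitivity
p_pos_given_healthy = 0.05     # false-positive rate

# Total probability of a positive test.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # ~0.167
```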

Refer to the third article in the sources below to see real examples of how the calculation actually works.

Source: investopedia | A Gentle Introduction to Bayes Theorem for Machine Learning | cuemath | wikipedia

6.a) Explain Hidden Markov model.

Hidden Markov Model:

  • It's a statistical model for modelling sequential data.
  • The goal of this model is to uncover a hidden sequence or pattern that is not otherwise easily observable, and that is where this statistical approach comes into play.
  • Applications: Speech recognition, part-of-speech tagging, gene prediction, etc.

The second article mentioned in the sources below has a list of further areas where this machine learning model can be applied.
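
A minimal sketch of the HMM forward algorithm in NumPy, computing the likelihood of an observation sequence under a toy model (all states and probabilities below are illustrative assumptions):

```python
# HMM forward algorithm on a toy weather model (illustrative numbers).
import numpy as np

A = np.array([[0.7, 0.3],    # transition: rainy -> rainy/sunny
              [0.4, 0.6]])   #             sunny -> rainy/sunny
B = np.array([[0.9, 0.1],    # emission from rainy: umbrella / no umbrella
              [0.2, 0.8]])   # emission from sunny: umbrella / no umbrella
pi = np.array([0.5, 0.5])    # initial state distribution

obs = [0, 0, 1]              # observed: umbrella, umbrella, no umbrella

alpha = pi * B[:, obs[0]]    # forward variable at t=1
for o in obs[1:]:
    alpha = (alpha @ A) * B[:, o]
print(alpha.sum())           # P(observation sequence | model)
```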

Source: Hidden Markov Model: Simple Definition & Overview | Hidden Markov model

6.b) Discuss in brief elements of reinforcement learning.

Reinforcement Learning:

  • It learns by performing actions, where each action receives either positive or negative feedback.
  • So the model learns from this feedback, and hence it does not require prior labelled data.
  • Basically, three things happen in reinforcement learning, through which the machine explores the environment in which it operates:
    1. Taking an action
    2. Remaining in the same state or changing it
    3. Getting feedback
  • So, reinforcement learning is all about the trial-and-error method.
  • Types:
    • Positive reinforcement learning: Adding something to make some required behaviour occur again and again.
    • Negative reinforcement learning: Removing a negative condition to increase the occurrence of some behaviour again and again.
  • Example: A game-playing agent that gets a reward for winning and a penalty for losing, and gradually learns the moves that maximise its reward (see the sketch below).
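
A hedged Q-learning sketch on a 5-cell corridor environment (an illustrative assumption, not from the sources), showing action, state change, and feedback in code:

```python
# Q-learning: the agent gets +1 for reaching the right end of a corridor.
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.2
rng = np.random.default_rng(0)

for _ in range(500):
    s = 0
    while s != n_states - 1:
        # Epsilon-greedy action selection (exploration vs exploitation).
        a = rng.integers(n_actions) if rng.random() < eps else Q[s].argmax()
        s2 = max(0, s - 1) if a == 0 else s + 1        # state transition
        r = 1.0 if s2 == n_states - 1 else 0.0         # feedback
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

print(Q.argmax(axis=1))  # learned policy: mostly "move right"
```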

Source: Javatpoint | Difference between Reinforcement learning and Supervised learning

7.a) Explain different association rules with algorithms.

Association Rules:

  • Association rules are used to find correlations between variables in a dataset, i.e. how certain variables are correlated with each other and how a change in one impacts the other.
  • So, it is interesting to find the correlation between complex variables, or why some event has happened, i.e.:
    • What changed that led to this behaviour, and
      • How that can be avoided in future, or
      • How something can be replicated again and again by knowing its causation.
  • So it's an interesting method for:
    • Discovering patterns in data,
    • Classification of data,
    • Data interconnections, etc.
  • Common algorithms: Apriori, Eclat, and FP-Growth, which mine frequent itemsets and derive rules from measures like support and confidence (computed in the sketch below).
  • Applications:
    • In medicine, to find the effect of a medicine or the symptoms leading to some condition.
    • Market basket analysis, e.g. a person who buys milk also buys bread, eggs, etc.
    • Web usage mining, i.e. a person who clicks on the first button will most probably take journey x.
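
A small hedged sketch in plain Python computing support and confidence for one candidate rule, milk → bread, over toy basket data (numbers are illustrative):

```python
# Support and confidence for the rule "milk -> bread" on toy baskets.
baskets = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"milk", "eggs"},
    {"bread", "butter"},
]

n = len(baskets)
support_milk = sum("milk" in b for b in baskets) / n
support_both = sum({"milk", "bread"} <= b for b in baskets) / n

# Confidence of milk -> bread: P(bread | milk).
confidence = support_both / support_milk
print(support_both, confidence)   # 0.5 and ~0.67
```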

Source: Javatpoint | Wikipedia

7.b) Explain principle component analysis with algorithm.

Principal Component Analysis (PCA):

  • Principal component analysis is used to reduce the dimensionality of data while not losing the important information.
  • It is typically used when creating machine learning models where the dataset is very large, since a very large dataset leads to:
    1. More time required to train the model on the data
    2. More cost of data processing
    3. Unimportant data being an input for the model, so the final output is also not reliable, i.e. garbage in = garbage out.
  • So, to avoid the 3rd point mentioned above, principal component analysis is used (note: the other two points mentioned above are a side advantage of using PCA).
  • Limitation:
    • PCA relies on a linear model, so if the dataset has some non-linear patterns, it will not work on them.

The Wikipedia article mentioned in the sources below also covers its applications in intelligence testing, population genetics, etc., e.g. summarising data on variation in human gene frequencies across regions.
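
A hedged PCA sketch using scikit-learn (library and dataset choices are assumptions): reduce 4-D iris measurements to 2 principal components:

```python
# PCA sketch: dimensionality reduction from 4 features to 2 components.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                # (150, 2): dimensionality reduced
print(pca.explained_variance_ratio_)  # information retained per component
```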

Source: Wikipedia | Javatpoint

8.Write short notes on any two.
i) Big data and map reduce

  • MapReduce is a model for processing data in parallel across multiple nodes.
  • Problem with traditional / old systems:
    • Not suitable for processing large datasets, as they used to have one centralised server for processing as well as data storage.
  • How MapReduce works:
    • In MapReduce, one task is divided into many sub-tasks, each assigned to a different computer. Once processing is complete, the results are integrated into one.
  • Real-world applications:
    • Index building for Google and Yahoo search
    • Spam detection for Yahoo search and Facebook, etc.
  • Example:
    • MapReduce algorithm actions on data: tokenize > filter > count > aggregate counters. Refer to the link for a detailed explanation.
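
A toy single-process sketch of the map and reduce phases for word counting (illustrative only; a real MapReduce framework distributes these steps across nodes):

```python
# Word-count sketch mimicking MapReduce's map and reduce phases.
from collections import defaultdict

docs = ["big data big models", "map reduce maps data"]

# Map phase: tokenize each document and emit (word, 1) pairs.
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle/reduce phase: aggregate the counters per word.
counts = defaultdict(int)
for word, one in mapped:
    counts[word] += one
print(dict(counts))
```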

Source: tutorialspoint | analyticsvidhya

ii) Common software for ML.

Common software for machine learning includes: Python libraries like scikit-learn, TensorFlow, PyTorch, and Keras, and tools like Weka, KNIME, RapidMiner, R, and MATLAB.

iii) Subset selection

  • In datasets with many parameters or attributes, it is not possible to consider all of them while creating a model, as:
    • Not every parameter or attribute is important.
    • Considering all of them while training the model increases complexity.
    • More processing power is required while training the model, hence cost increases.
    • Most of the attributes are not relevant, so including them in the model makes no sense.
  • So, to avoid the above-mentioned problems, subset selection or feature selection is done.
  • So, before training the model on any data, it's better to do feature selection based on:
    • Feature relevance
    • Feature redundancy

There are ways to find feature relevance and redundancy; refer to the articles mentioned below for the same, and see the sketch that follows.
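
A minimal feature-selection sketch, assuming scikit-learn (an assumption; the sources do not name a tool): keep only the k most relevant features before training:

```python
# Subset selection sketch: keep the 2 most relevant of 4 features.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

print(X_selected.shape)         # (150, 2): only the 2 best features kept
print(selector.get_support())   # mask of which features were selected
```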

Source: geekforgeeks | javatpoint


Final Words:

So, hope this article “RGPV (CS-8003) – Machine Learning – Solved Question Paper – To Score Better” will help you get better marks in your Machine Learning exam. If yes, then please let us know in the comments how it helped you and what other questions you want us to answer.