Question Bank Python For Data Science

This document contains a question bank covering various topics related to data science and machine learning. It includes questions on Python libraries like NumPy, Pandas and Matplotlib, data types, data visualization, machine learning algorithms and concepts like regression, classification, clustering, dimensionality reduction and more. The questions range from short ones requiring a few words/lines of code to longer questions requiring detailed explanations.

Question Bank (PDS)

Short Type
1. Name some libraries commonly used in data science.
2. What is the advantage of Seaborn over Matplotlib?
3. What is the use of NumPy?
4. Differentiate between regression and classification.
5. Differentiate between data analysis and data analytics.
6. Write a command to read a CSV file in a Jupyter Notebook.
7. Write a command to import the required library for splitting the dataset into train and test
data.
8. Write the code to plot a graph between two variables x and y using Matplotlib.
9. What is the basic difference between a list and a tuple in Python?
10. Create a Pandas Series that contains the following data: 4, 8, 15, 16, 23, and 42. Then,
print the series.
11. Why do you need feature scaling?
12. Create an identity matrix of shape (5,5) using a NumPy function.
13. What is ‘K’ in the K-means algorithm?
14. Create a Pandas DataFrame that contains the following data:

Name      Age   Gender
Rama      25    Male
Laxman    22    Male
Sita      24    Female

15. Write code to create a 3*3 NumPy array containing only zeros, using a NumPy
array creation function.
16. What is the use of NumPy?
17. What is the difference between NumPy and Pandas?
18. Write a Python program to print the first 10 Fibonacci numbers using a while loop.
19. Which pandas functions are used to read an Excel file, a CSV file and a file from a web link?
Write the command for each type.
20. Given a Pandas DataFrame df with columns 'A', 'B', and 'C', write a Python function to
re-index the DataFrame with a new index that starts from 1 and increments by 2 for each
row.
21. Write a command to import the required library for splitting the dataset into train and test
data.
22. Create a list variable containing 10 elements, apply the pandas.Series
function to it, and print the result.
23. How can you split the data set into x and y variables using the ‘iloc’ command?
24. Which of the following is mutable in nature: Series, DataFrame, Panel?
25. Write the steps involved in implementing a machine learning algorithm.
26. What is a ‘DataFrame’ in pandas and how is it different from a pandas.Series? Explain with
an example.
27. Write the difference between supervised and unsupervised ML.
28. Write the difference between clustering and association in unsupervised ML.
29. List the different clustering methods available in unsupervised ML.
30. Write code to find the following characteristics of the variable num_array:
(i) shape (ii) size
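
As an illustration of the short coding questions above, question 18 could be approached along the following lines. This is only a minimal sketch: it assumes the sequence starts at 0 and 1 (some texts start at 1 and 1), and the function name first_fibonacci is a hypothetical choice, not part of the question.

```python
def first_fibonacci(n):
    """Return the first n Fibonacci numbers, assuming the sequence starts 0, 1."""
    numbers = []
    a, b = 0, 1
    # The while loop runs until n numbers have been collected.
    while len(numbers) < n:
        numbers.append(a)
        a, b = b, a + b  # advance to the next pair in the sequence
    return numbers


print(first_fibonacci(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```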

Focused
1. What are the different plots that can be drawn using the Seaborn library?
2. Write the operators associated with the list type in Python.
3. What is a confusion matrix, and how can you plot one using the sklearn library?
4. Write short notes on the swarm plot.
5. How can you split the data set into x and y variables using the ‘iloc’ command?
6. Explain the use of the following operators using an example:
a. /, %, //, *, **
7. Create a list of length 10 of your choice containing multiple types of data. Using a for loop,
print each element and its data type.
8. Which keyword is used to create a function? Create a function to return a list of odd
numbers in the range of 1 to 25.
9. What are the types of supervised learning available in ML?
10. What are the types of unsupervised learning available in ML?
11. What is Entropy?
12. What are the terminologies used in the Decision Tree algorithm?
13. What are Euclidean and Manhattan distances, and where are they used?
14. What are some common functions you can use to manipulate data in a Pandas
DataFrame? Can you give an example of when you might use one of these functions?
15. What are the methods available to find the distance between two points? Explain each
with a suitable example and formula.
16. What is data wrangling?
17. What are bias and variance, and what are their different combinations?
18. Name any five plots that we can plot using the Seaborn library. Also, state the uses of
each plot.
19. Describe the Iris data set.
20. What are overfitting and underfitting in ML?
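
For the distance questions above (13 and 15), the two measures can be sketched in plain Python. For points p and q, the Euclidean distance is the square root of the sum of squared coordinate differences, and the Manhattan distance is the sum of absolute coordinate differences. The function names and the sample points below are illustrative assumptions, not part of the question bank.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance: sqrt(sum((p_i - q_i)^2))."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan_distance(p, q):
    """Grid (city-block) distance: sum(|p_i - q_i|)."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical sample points chosen so the arithmetic is easy to check by hand.
p, q = (1, 2), (4, 6)
print(euclidean_distance(p, q))  # 5.0, since sqrt(3^2 + 4^2) = 5
print(manhattan_distance(p, q))  # 7, since |3| + |4| = 7
```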
Long Type
1. How can you evaluate a regression model using different metrics? Explain with
sklearn library code.
2. What are the methods associated with the list data type in Python?
3. What are the functions associated with the tuple data type in Python?
4. Give a brief discussion about all the data types used in Python.
5. What do local and global mean in relation to both variables and functions? Also explain
the use of the ‘global’ keyword.
6. Explain the call by value and call by reference methods in relation to the Python language.
7. What are the regression evaluation metrics? Explain with appropriate formula.
8. What is a generator function? Explain with an example.
9. What is a bar plot? Why is it used? Using the following data, draw a bar plot and a
horizontal bar plot.
import numpy as np
company = np.array(["Apple", "Microsoft", "Google", "AMD"])
profit = np.array([3000, 8000, 1000, 10000])

10. What is a box plot? Why is it used? Using the following data, draw a box plot.
box1 = np.random.normal(100, 10, 200)
box2 = np.random.normal(90, 20, 200)

11. What are the dimensionality reduction techniques used in ML? Explain PCA briefly.
12. Apply k-means clustering to the following data points.
A1 (2,10), A2(2,5), A3(8,4), B1(5,8), B2(7,5), B3(6,4), C1(1,2), C2(4,9).
Use Euclidean distance method.
13. Write the K-means algorithm.
14. What is data cleaning, and how is it related to NaNs?
15. Write the commands to plot the following data using the Matplotlib library with all details.
Semester = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Grade = [2, 4.5, 1, 2, 3.5, 2, 1, 2, 3, 2]
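
Question 12 above can be checked with a small pure-Python sketch of k-means using Euclidean distance. The question does not state the number of clusters or the starting centroids, so the code assumes k = 3 with A1, B1 and C1 as initial centroids (a common textbook choice); a different starting point may give different clusters. The function name k_means is hypothetical.

```python
import math

points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4),
    "B1": (5, 8), "B2": (7, 5), "B3": (6, 4),
    "C1": (1, 2), "C2": (4, 9),
}


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_means(points, initial_centroids, max_iter=100):
    """Assign each point to its nearest centroid, then recompute centroids,
    repeating until the assignments stop changing."""
    centroids = list(initial_centroids)  # copy so the caller's list is untouched
    labels = {}
    for _ in range(max_iter):
        new_labels = {
            name: min(range(len(centroids)),
                      key=lambda i: euclidean(p, centroids[i]))
            for name, p in points.items()
        }
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        for i in range(len(centroids)):
            members = [points[n] for n, c in labels.items() if c == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centroids


# Assumption: k = 3, starting from A1, B1 and C1.
labels, centers = k_means(points, [points["A1"], points["B1"], points["C1"]])
print(labels)
print(centers)
```

Under these assumptions the run settles on the clusters {A1, B1, C2}, {A3, B2, B3} and {A2, C1}.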
