Python Basic Interview Questions Compressed 1
Python is one of the most popular programming languages used by data scientists and
AIML professionals. This popularity is due to the following key features of Python:
Keywords in Python are reserved words that cannot be used as identifiers, such as function
names or variable names. They help define the structure and syntax of the language.
Python 3.7 has 35 keywords; the exact number can change between versions. The list for the
installed interpreter is available as keyword.kwlist. A list of the keywords is provided below:
Keywords in Python
as elif if or yield
assert else import pass
break except
Literals in Python refer to the data that is given in a variable or constant. Python has
various kinds of literals including:
Solution ->
tup1 = (1,"a",True)
tup2 = (4,5,6)
Concatenation of tuples means that we are adding the elements of one tuple at the end
of another tuple.
All you have to do is, use the ‘+’ operator between the two tuples and you’ll get the
concatenated result.
Code
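A minimal sketch of the concatenation described above, reusing tup1 and tup2 from the solution; the name tup3 is introduced here only for illustration.

```python
tup1 = (1, "a", True)
tup2 = (4, 5, 6)

# The + operator builds a new tuple; the two originals are left unchanged.
tup3 = tup1 + tup2
print(tup3)
```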
Ans: Functions in Python are blocks of organised, reusable code that perform a single,
related action. Functions improve the modularity of an application and allow a high degree
of code reuse. Python has a number of built-in functions like print(). However, it also
allows you to create user-defined functions.
To install Python, first go to Anaconda.org and click on “Download Anaconda”. Here, you
can download the latest version of Python. Once Python is installed, the next step is to
open an IDE and start coding in Python.
If you wish to learn more about the process, check out this Python Tutorial.
7. What is Python Used For?
Python is one of the most popular programming languages in the world today. Whether
you’re browsing through Google, scrolling through Instagram, watching videos on
YouTube, or listening to music on Spotify, all of these applications make use of Python
for their key programming requirements. Python is used across various platforms,
applications, and services such as web development.
8. How can you initialize a 5*5 numpy array with only zeroes?
Solution ->
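One way to do this, sketched with np.zeros; the variable name n1 is an assumption made for illustration.

```python
import numpy as np

# np.zeros takes the shape as a tuple; the default dtype is float64.
n1 = np.zeros((5, 5))
print(n1)
```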
9. What is Pandas?
Pandas is an open source Python library which has a very rich set of data structures for
data-based operations. With its features, pandas fits every kind of data operation,
whether in academics or in solving complex business problems. Pandas can deal with a
large variety of files and is one of the most important tools to have a grip on.
10. What are dataframes?
A pandas dataframe is a mutable data structure in pandas. It supports
heterogeneous data arranged across two axes (rows and columns).
Here df is a pandas data frame. read_csv() is used to read a comma delimited file as a
dataframe in pandas.
Series is a one-dimensional pandas data structure which can hold data of almost any type. It
resembles an Excel column. It supports multiple operations and is used for single
dimensional data operations.
Code
A pandas groupby is a feature supported by pandas which is used to split and group an
object. Like the SQL/MySQL/Oracle GROUP BY, it is used to group data by classes or entities,
which can be further used for aggregation. A dataframe can be grouped by one or more
columns.
Code
<code>import pandas as pd

df = pd.DataFrame({'Vehicle': ['Etios', 'Lamborghini', 'Apache200', 'Pulsar200'],
                   'Type': ["car", "car", "motorcycle", "motorcycle"]})
df</code>
Output
<code>df.groupby('Type').count()</code>
Output
Code
<code>df = pd.DataFrame()
bikes = ["bajaj", "tvs", "herohonda", "kawasaki", "bmw"]
cars = ["lamborghini", "masserati", "ferrari", "hyundai", "ford"]
df["cars"] = cars
df["bikes"] = bikes
df</code>
Output
Code
Two different data frames can be stacked either horizontally or vertically by the
concat() and join() functions in pandas. The older append() method was removed in
pandas 2.0; concat() covers the same use.
Concat works best when the dataframes have the same columns. It is used for
concatenation of data having similar fields and is basically vertical stacking of
dataframes into a single dataframe.
Join is used when we need to combine data from different dataframes which have
one or more common columns. The stacking is horizontal in this case.
16. What kind of joins does pandas offer?
Pandas has a left join, inner join, right join and an outer join.
Merging depends on the type and fields of the data frames being merged. If the data frames
have similar fields, the data is merged along axis 0; otherwise it is merged along axis 1.
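The four join types can be sketched with pd.merge and its how parameter. The frames left and right and the key column "id" are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical frames sharing the key column "id".
left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [20, 30, 40]})

inner = pd.merge(left, right, on="id", how="inner")       # ids present in both frames
outer = pd.merge(left, right, on="id", how="outer")       # ids present in either frame
left_join = pd.merge(left, right, on="id", how="left")    # every id from left
right_join = pd.merge(left, right, on="id", how="right")  # every id from right
print(inner)
```

Rows with no match on the other side are filled with NaN in the left, right and outer joins.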
18. Given the data frame below, drop all rows having NaN.
<code>df.dropna(inplace=True)
df</code>
Output
By using the head(5) function we can get the top five entries of a data frame. By default
df.head() returns the top 5 rows. To get the top n rows df.head(n) will be used.
20. How to access the last five entries of a data frame?
By using the tail(5) function we can get the last five entries of a dataframe. By default
df.tail() returns the last 5 rows. To get the last n rows df.tail(n) will be used.
Code
22. What are comments and how can you add comments in
Python?
1. Single-line comment
2. Multiple-line comment
Example
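A short sketch of both comment styles. Python has no dedicated multi-line comment syntax, so the two common conventions are shown; the variable x is only there to give the comments something to annotate.

```python
# This is a single-line comment.
x = 10  # a comment can also follow code on the same line

# For several lines, the usual approach is
# to start each line with '#'.

"""
A triple-quoted string that is not assigned to anything is evaluated
and discarded, so it is sometimes used as a multi-line comment.
"""
print(x)
```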
d={"a":1,"b":2}
25. Find out the mean, median and standard deviation of this
numpy array -> np.array([1,5,3,100,4,48])
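A possible solution using NumPy's built-in statistics functions; the variable names are assumptions made for illustration.

```python
import numpy as np

n1 = np.array([1, 5, 3, 100, 4, 48])

mean = np.mean(n1)
median = np.median(n1)
std = np.std(n1)  # population standard deviation (ddof=0) by default
print(mean, median, std)
```

Pass ddof=1 to np.std if the sample standard deviation is wanted instead.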
A classifier is used to predict the class of any data point. Classifiers are special
hypotheses that are used to assign class labels to any particular data points. A classifier
often uses training data to understand the relation between input variables and the
class. Classification is a method used in supervised learning in Machine Learning.
All the upper cases in a string can be converted into lowercase by using the method:
string.lower()
ex: string = 'GREATLEARNING'
print(string.lower())
o/p: greatlearning
We can use the capitalize() function to capitalize the first character of a string. It also
converts the remaining characters to lowercase, so the original string comes back unchanged
only if it already has that form.
There are various methods to remove duplicate elements from a list. The most
common one is converting the list into a set by using the set() function and using the
list() function to convert it back to a list, if required. Note that a set does not
preserve the original order of the elements.
ex: list0 = [2, 6, 4, 7, 4, 6, 7, 2]
list1 = list(set(list0))
print("The list without duplicates : " + str(list1))
o/p: The list without duplicates : [2, 4, 6, 7]
Recursion is a function calling itself one or more times in its body. One very important
condition for a recursive function is that it must terminate through a base case;
otherwise the calls never stop and Python eventually raises a RecursionError.
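A minimal sketch of a recursive function with a terminating base case. The function sum_to is a hypothetical example, not taken from the original text.

```python
def sum_to(n):
    # Base case: without it the calls would never stop.
    if n <= 0:
        return 0
    # Recursive case: the function calls itself on a smaller input.
    return n + sum_to(n - 1)

print(sum_to(4))
```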
List comprehensions are used for transforming one list into another list. Elements can be
conditionally included in the new list and each element can be transformed as needed. A
comprehension consists of an expression followed by a for clause, enclosed in brackets.
for ex: numbers = [i for i in range(1000)]
print(numbers)
The bytes() function returns a bytes object. It is used to convert objects into bytes
objects, or create empty bytes object of the specified size.
The map() function in Python is used for applying a function on all elements of a
specified iterable. It takes two parameters, function and iterable. The function is
taken as the first argument and then applied to all the elements of the iterable (passed
as the second argument). A map object, which is an iterator, is returned as a result.
def add(n):
    return n + n
number = (15, 25, 35, 45)
res = map(add, number)
print(list(res))
o/p: [30, 50, 70, 90]
The two static analysis tools used to find bugs in Python are Pychecker and Pylint.
Pychecker detects bugs in the source code and warns about its style and
complexity, while Pylint checks whether the module conforms to a coding standard.
Pass is a statement which does nothing when executed. In other words it is a null
statement. This statement is not ignored by the interpreter, but the statement results in
no operation. It is used when you do not want any command to execute but a statement
is required.
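Not all objects can be copied in Python, but most can. The “=” operator does not copy an
object; it only binds another name to the same object. To create a copy, use the copy
module: copy.copy() for a shallow copy or copy.deepcopy() for a deep copy.
ex: import copy
var = copy.copy(obj)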
Modules are the way to structure a program. Each Python program file is a module,
importing other attributes and objects. The folder of a program is a package of modules.
A package can have modules or subfolders.
In Python the object() function returns an empty object. New properties or methods
cannot be added to this object.
len() is used to determine the length of a string, a list, an array, and so on.
ex: text = "greatlearning"
print(len(text))
o/p: 13
Encapsulation means binding the code and the data together. A Python class for
example.
type() is a built-in method which either returns the type of the object or returns a new
type object based on the arguments passed.
ex: a = 100
type(a)
o/p: int
The split() function is used to split a string into shorter strings using a defined separator.
letters = "A,B,C"
n = letters.split(",")
print(n)
o/p: ['A', 'B', 'C']
Boolean: The Boolean data type is a data type that has one of two possible values i.e.
True or False. Note that ‘T’ and ‘F’ are capital letters.
String: A string value is a collection of one or more characters put in single, double or
triple quotes.
List: A list object is an ordered collection of one or more data items which can be of
different types, put in square brackets. A list is mutable and thus can be modified, we
can add, edit or delete individual elements in a list.
Frozen set: They are like a set but immutable, which means we cannot modify their
values once they are created.
Dictionary: A dictionary object associates a key with each value, and we can access each
value through its key. Since Python 3.7 a dictionary preserves insertion order. A collection
of such pairs is enclosed in curly brackets. For example {'First Name': 'Tom', 'last name': 'Hardy'}.
Note that number values, strings, and tuples are immutable, whereas list and dictionary
objects are mutable.
52. What is docstring in Python?
Ans. Python docstrings are the string literals enclosed in triple quotes that appear right
after the definition of a function, method, class, or module. These are generally used to
describe the functionality of a particular function, method, class, or module. We can
access these docstrings using the __doc__ attribute. Here is an example:
<code>def square(n):
    '''Takes in a number n, returns the square of n'''
    return n**2

print(square.__doc__)</code>
Python strings have no built-in reverse method. The usual approach is a slicing
operation with a step of -1:
str_reverse = string[::-1]
To check the Python version on macOS, press CMD + Space. This opens Spotlight. Here,
type “terminal” and press enter. In the terminal, type python --version or
python -V and press enter. This will return the Python version in the next line below the
command.
Yes. Python is case sensitive when dealing with identifiers. It is a case sensitive language.
Thus, variable and Variable would not be the same.
Code
groupby() in pandas can be used with multiple aggregate functions, some of which are
sum(), mean(), count() and std().
Data is divided into groups based on categories and then the data in these individual
groups can be aggregated by the aforementioned functions.
3. How to select columns in pandas and add them to a new
dataframe? What if there are two columns with the same
name?
If df is dataframe in pandas df.columns gives the list of all columns. We can then form
new columns by selecting columns.
If there are two columns with the same name then both columns get copied to the new
dataframe.
Code
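A small sketch of column selection. The data and the column names col1, col2 and col3 are placeholders chosen for illustration.

```python
import pandas as pd

# Hypothetical data; the column names are placeholders.
df = pd.DataFrame({"col1": [1, 2, 3],
                   "col2": ["A", "B", "C"],
                   "col3": [0.1, 0.2, 0.3]})
print(list(df.columns))

# Selecting with a list of names returns a new DataFrame.
new_df = df[["col1", "col3"]].copy()
print(new_df)
```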
Code
<code>d = {"col1": [1, 2, 3], "col2": ["A", "B", "C"]}
df = pd.DataFrame(d)
df.dropna(inplace=True)
df = df[df.col1 != 1]
df</code>
Output
6. Given the below dataset find the highest paid player in each
college in each team.
<code>df.groupby(["Team","College"])["Salary"].max()</code>
7. Given the above dataset find the min max and average salary
of a player collegewise and teamwise.
Code
<code>df.groupby(["Team","College"])["Salary"].agg([('max','max'),('min','min'),('count','count'),('avg','mean')])</code>
Output
8. What is reindexing in pandas?
Code
<code>from functools import reduce

sequences = [5, 8, 10, 20, 50, 100]
# named total rather than sum to avoid shadowing the built-in sum()
total = reduce(lambda x, y: x + y, sequences)
print(total)</code>
Ans. vstack() is a function to align rows vertically. All rows must have same number of
elements.
Code
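A brief sketch of np.vstack; the arrays n1 and n2 are illustrative values, not taken from the original.

```python
import numpy as np

n1 = np.array([10, 20, 30])
n2 = np.array([40, 50, 60])

# Stack the two 1-D arrays as the rows of a 2-D array.
stacked = np.vstack((n1, n2))
print(stacked)
```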
Spaces can be removed from a string in python by using strip() or replace() functions.
Strip() function is used to remove the leading and trailing white spaces while the
replace() function is used to remove all the white spaces in the string:
o/p: greatlearning
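Both calls can be sketched as follows; the input string is an assumption chosen so that replace() yields the output shown above.

```python
string = "  great learning  "

# strip() removes only leading and trailing whitespace.
stripped = string.strip()
# replace() removes every space character, including the inner one.
no_spaces = string.replace(" ", "")
print(stripped)
print(no_spaces)
```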
There are four basic file processing modes in Python: read-only (r), write-only (w),
read-write (r+) and append (a). These open the file in text mode by default, which can be
made explicit as “rt” for read-only, “wt” for write and so on. Similarly, a binary
file can be opened by adding “b” to the mode (“rb”, “wb”, “r+b”
and “ab”).
Pickling is the process of converting a Python object hierarchy into a byte stream for
storing it into a database. It is also known as serialization. Unpickling is the reverse of
pickling. The byte stream is converted back into an object hierarchy.
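A minimal round trip with the standard pickle module; the dictionary data is an illustrative object.

```python
import pickle

data = {"name": "example", "values": [1, 2, 3]}  # illustrative object

# Pickling: object hierarchy -> byte stream.
payload = pickle.dumps(data)
# Unpickling: byte stream -> equivalent object.
restored = pickle.loads(payload)
print(restored == data)
```

Only unpickle data from a trusted source, because loading a pickle can execute arbitrary code.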
15. How is memory managed in Python?
Memory management in Python involves a private heap containing all objects and
data structures. The heap is managed by the interpreter and the programmer does not
have direct access to it. The Python memory manager does all the memory allocation.
Moreover, there is an inbuilt garbage collector that recycles and frees memory for the
heap space.
Unittest is a unit testing framework in Python. It supports sharing of setup and shutdown
code for tests, aggregation of tests into collections, test automation, and independence
of the tests from the reporting framework.
To create an empty class we can use the pass command after the definition of the class
object. A pass is a statement in Python that does nothing.
Ans. Decorators are functions that take another function as an argument and modify its
behaviour without changing the function itself. These are useful when we want to
dynamically extend the functionality of a function without changing it. Here is an
example:
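One possible illustration, using the hypothetical names shout and greet; the original example is not preserved in the text.

```python
import functools

def shout(func):
    # Hypothetical decorator: upper-cases whatever the wrapped function returns.
    @functools.wraps(func)  # keeps the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper

@shout
def greet(name):
    return "hello, " + name

print(greet("world"))
```

The @shout line is shorthand for greet = shout(greet), so the body of greet itself stays untouched.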
From this dataset, how will you make a bar-plot for the top 5 states having maximum
confirmed cases as of 17-07-2020?
sol:
We start off by taking only the required columns with this command:
df = df[['Date', 'State/UnionTerritory', 'Cured', 'Deaths', 'Confirmed']]
Then, we go ahead and rename the columns:
df.columns = ['date', 'state', 'cured', 'deaths', 'confirmed']
After that, we extract only those records, where the date is equal to 17th July:
today = df[df.date == '2020-07-17']
Then, we go ahead and select the top 5 states with maximum no. of covid cases:
max_confirmed_cases = today.sort_values(by="confirmed", ascending=False)
max_confirmed_cases
top_states_confirmed = max_confirmed_cases[0:5]
Here, we are using seaborn library to make the bar-plot. “State” column is mapped onto
the x-axis and “confirmed” column is mapped onto the y-axis. The color of the bars is
being determined by the “state” column.
How can you make a bar-plot for the top-5 states with the most amount of deaths?
Sol:
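<code>max_death_cases = today.sort_values(by="deaths", ascending=False)
max_death_cases
# keep the top 5 rows, mirroring the confirmed-cases solution
top_states_death = max_death_cases[0:5]
sns.set(rc={'figure.figsize': (15, 10)})
sns.barplot(x="state", y="deaths", data=top_states_death, hue="state")
plt.show()</code>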
Code Explanation:
We start off by sorting our dataframe in descending order w.r.t the “deaths” column:
max_death_cases = today.sort_values(by="deaths", ascending=False)
max_death_cases
Next, we keep the top 5 rows, mirroring the confirmed-cases solution:
top_states_death = max_death_cases[0:5]
Then, we go ahead and make the bar-plot with the help of seaborn library:
sns.set(rc={'figure.figsize': (15, 10)})
sns.barplot(x="state", y="deaths", data=top_states_death, hue="state")
plt.show()
Here, we are mapping “state” column onto the x-axis and “deaths” column onto the y-
axis.
3. From this covid-19 dataset:
How can you make a line plot indicating the confirmed cases with respect to date?
Sol:
Code Explanation:
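We start off by extracting all the records where the state is equal to “Maharashtra”:
maha = df[df.state == 'Maharashtra']
Then, we make the line plot with the help of seaborn library:
sns.set(rc={'figure.figsize': (15, 10)})
sns.lineplot(x="date", y="confirmed", data=maha, color="g")
plt.show()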
Here, we map the “date” column onto the x-axis and “confirmed” column onto y-axis.
How will you implement a linear regression algorithm with “date” as independent
variable and “confirmed” as dependent variable. That is you have to predict the number
of confirmed cases w.r.t date.
Sol:
maha['date'] = maha['date'].map(dt.datetime.toordinal)
This is done because we cannot build the linear regression algorithm on top of the date
column.
Then, we go ahead and divide the dataset into train and test sets:
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.3)
lr = LinearRegression()
lr.fit(np.array(x_train).reshape(-1,1),np.array(y_train).reshape(-1,1))
lr.predict(np.array([[737630]]))
Sol:
model = Sequential()
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
y_pred = model.predict_classes(x_test)
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test,y_pred)
Build a decision tree classification model, where dependent variable is “Species” and
independent variable is “Sepal.Length”.
Sol:
(22+7+9)/(22+2+0+7+7+11+1+1+9)
Code explanation:
y = iris[['Species']]
x = iris[['Sepal.Length']]
Then, we go ahead and divide the data into train and test set:
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.4)
dtc = DecisionTreeClassifier()
dtc.fit(x_train,y_train)
y_pred=dtc.predict(x_test)
confusion_matrix(y_test,y_pred)
(22+7+9)/(22+2+0+7+7+11+1+1+9)
Sol:
Sol:
<code>import sys
import time
from bs4 import BeautifulSoup
import requests
import pandas as pd

# scheme added so that requests accepts the URL
url = 'https://cricbuzz.com'
try:
    # Fetch the page. This call might throw an exception if something goes wrong.
    page = requests.get(url)
except Exception as e:
    # this describes what to do if an exception is thrown
    error_type, error_obj, error_info = sys.exc_info()  # get the exception information
    print('ERROR FOR LINK:', url)  # print the link that caused the problem
    print(error_type, 'Line:', error_info.tb_lineno)  # print error info and line that threw the exception
    sys.exit(1)  # ignore this page. Abandon this and go back.
time.sleep(2)
soup = BeautifulSoup(page.text, 'html.parser')
links = soup.find_all('span', attrs={'class': 'w_tle'})
for i in links:
    print(i.text)
    print("\n")</code>
Sol:
Code Explanation:
Data
N_samples
Sample_size
Min_value
Max_value
import pandas as pd
import numpy as np
Then, we go ahead and create the first sub-plot for “Sampling distribution of bmi”:
plt.subplot(1,2,1)
sns.distplot(c.Mean)
plt.xlabel('data')
plt.ylabel('freq')
plt.subplot(1,2,2)
sns.distplot(data)
plt.xlabel('data')
plt.ylabel('freq')
plt.show()
Sol:
sol:
Sol:
Code solution:
import pandas as pd
iris = pd.read_csv("iris.csv")
iris.head()
Then, we will go ahead and extract the independent variables and dependent variable:
x = iris[['Sepal.Width', 'Petal.Length', 'Petal.Width']]
y = iris[['Sepal.Length']]
Following which, we divide the data into train and test sets:
lr = LinearRegression()
lr.fit(x_train, y_train)
y_pred = lr.predict(x_test)
mean_squared_error(y_test, y_pred)
Find the percentage of transactions which are fraudulent and not fraudulent. Also build
a logistic regression model, to find out if the transaction is fraudulent or not.
Sol:
Sol:
Sol:
Example
class Node(object):
    def __init__(self):
        self.x = 0
        self.y = 0
If class A inherits from B and C inherits from A, it’s called multilevel inheritance.
<code>class B(object):
    def __init__(self):
        self.b = 0

class A(B):
    def __init__(self):
        self.a = 0

class C(A):
    def __init__(self):
        self.c = 0</code>
1. How can you find the minimum and maximum values present
in a tuple?
Solution ->
We can use the min() function on top of the tuple to find out the minimum value present
in the tuple:
<code>tup1=(1,2,3,4,5) min(tup1) </code>
Output
Analogous to the min() function is the max() function, which will help us to find out the
maximum value present in the tuple:
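A sketch of the max() call on the same tuple; the name largest is introduced here for illustration.

```python
tup1 = (1, 2, 3, 4, 5)

# max() scans the tuple and returns its largest element.
largest = max(tup1)
print(largest)
```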
2. If you have a list like this -> [1,”a”,2,”b”,3,”c”]. How can you
access the 2nd, 4th and 5th elements from this list?
Solution ->
We will start off by creating a tuple which will comprise of the indices of elements which
we want to access:
Then, we will use a for loop to go through the index values and print them out:
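The two steps above can be sketched as follows. The names list1, indices and selected are assumptions; note that Python indices start at zero.

```python
list1 = [1, "a", 2, "b", 3, "c"]

# The 2nd, 4th and 5th elements sit at zero-based indices 1, 3 and 4.
indices = (1, 3, 4)
for i in indices:
    print(list1[i])

# The same selection collected into a list.
selected = [list1[i] for i in indices]
```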
Solution ->
<code>a.reverse() a</code>
Solution ->
<code>fruit["Apple"]=100 fruit</code>
Put the name of the key inside the square brackets and assign it a new value.
Solution ->
You can use the intersection() function to find the common elements between the two
sets:
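A sketch with two illustrative sets; the sets from the original question are not shown in the text, so a and b are assumptions.

```python
# Illustrative sets, not the ones from the original question.
a = {1, 2, 3}
b = {3, 4, 5}

common = a.intersection(b)  # equivalent to a & b
print(common)
```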
Solution ->
Code
We start off by initializing two variables ‘i’ and ‘n’. ‘i’ is initialized to 1 and ‘n’ is
initialized to ‘2’.
Inside the while loop, since the ‘i’ value goes from 1 to 10, the loop iterates 10 times.
Then, ‘i’ value is incremented and n*i becomes 2*2. We go ahead and print it out.
This process goes on until i value becomes 10.
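The loop described above might look like this. The list products is an addition made here so the results can be inspected; it is not part of the description.

```python
i = 1
n = 2
products = []  # collected only so the results can be inspected

while i <= 10:
    print(n * i)            # 2*1, then 2*2, and so on
    products.append(n * i)
    i = i + 1               # move to the next multiplier
```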
Solution ->
even_odd(5)
We see that, when 5 is passed as a parameter into the function, we get the output -> ‘5
is odd’.
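A possible definition of even_odd consistent with the output described above. Returning the message in addition to printing it is an assumption made so the result can be checked.

```python
def even_odd(x):
    # A number is even when dividing it by 2 leaves no remainder.
    if x % 2 == 0:
        message = str(x) + " is even"
    else:
        message = str(x) + " is odd"
    print(message)
    return message

even_odd(5)
```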
Solution ->
After that, we check,if ‘num’ is equal to zero, and it that’s the case, we print out ‘The
factorial of 0 is 1’.
On the other hand, if ‘num’ is greater than 1, we enter the for loop and calculate the
factorial of the number.
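The checks described above can be sketched as a function. Wrapping the logic in factorial(num) and rejecting negative input are assumptions, since the original code is not shown.

```python
def factorial(num):
    # Hypothetical wrapper around the steps described in the text.
    if num < 0:
        raise ValueError("factorial is not defined for negative numbers")
    if num == 0:
        print("The factorial of 0 is 1")
        return 1
    result = 1
    # Multiply together every integer from 1 up to num.
    for i in range(1, num + 1):
        result = result * i
    print("The factorial of", num, "is", result)
    return result

factorial(5)
```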
Solution ->
Below is the code to Check whether the given number is palindrome or not:
Then, we will enter a while loop which will go on until ‘n’ becomes 0.
Inside the loop, we will start off by dividing ‘n’ with 10 and then store the remainder in
‘dig’.
Then, we will multiply ‘rev’ with 10 and then add ‘dig’ to it. This result will be stored
back in ‘rev’.
Going ahead, we will divide ‘n’ by 10 and store the result back in ‘n’
Once the while loop ends, we will compare the values of ‘rev’ and ‘temp’. If they are equal,
we will print ‘The number is a palindrome’, else we will print ‘The number isn’t a
palindrome’.
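The steps above can be sketched as follows, keeping the names n, temp, rev and dig from the explanation. The function is_palindrome and the sample value 12321 are assumptions.

```python
def is_palindrome(n):
    temp = n  # keep the original value for the final comparison
    rev = 0
    while n > 0:
        dig = n % 10          # remainder of dividing by 10 is the last digit
        rev = rev * 10 + dig  # append that digit to the reversed number
        n = n // 10           # drop the last digit
    return temp == rev

number = 12321
if is_palindrome(number):
    print("The number is a palindrome")
else:
    print("The number isn't a palindrome")
```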
22
333
4444
55555
Solution ->
<code># 5 is the number of rows to print
for num in range(1, 6):
    for i in range(num):
        print(num, end=" ")  # print number
    # new line after each row to display pattern correctly
    print()</code>
We are solving the problem with the help of nested for loop. We will have an outer for
loop, which goes from 1 to 5. Then, we have an inner for loop, which would print the
respective numbers.
#
##
###
####
#####
Solution –>
#
##
###
####
#####
Solution –>
0
01
012
0123
01234
Solution –>
<code>def pattern_3(num):
    # initialising starting number
    number = 1
    # outer loop always handles the number of rows
    # let us use the inner loop to control the number
    for i in range(0, num):
        # re assigning number after every iteration
        # ensure the column starts from 0
        number = 0
        # inner loop to handle number of columns
        for j in range(0, i+1):
            # printing number
            print(number, end=" ")
            # increment number column wise
            number = number + 1
        # ending line after each row
        print("\r")

num = int(input("Enter the number of rows in pattern: "))
pattern_3(num)</code>
1
23
456
7 8 9 10
11 12 13 14 15
Solution –>
A
BB
CCC
DDDD
Solution –>
<code>def pattern_5(num):
    # initializing value of A as 65
    # ASCII value equivalent
    number = 65
    # outer loop always handles the number of rows
    for i in range(0, num):
        # inner loop handles the number of columns
        for j in range(0, i+1):
            # finding the ascii equivalent of the number
            char = chr(number)
            # printing char value
            print(char, end=" ")
        # incrementing number
        number = number + 1
        # ending line after each row
        print("\r")

num = int(input("Enter the number of rows in pattern: "))
pattern_5(num)</code>
A
BC
DEF
GHIJ
KLMNO
PQRSTU
Solution –>
#
##
###
####
#####
Solution –>
<code>def pattern_7(num):
    # number of spaces is a function of the input num
    k = 2*num - 2
    # outer loop always handles the number of rows
    for i in range(0, num):
        # inner loop used to handle the number of spaces
        for j in range(0, k):
            print(end=" ")
        # the variable holding information about number of spaces
        # is decremented after every iteration
        k = k - 1
        # inner loop reinitialized to handle the number of columns
        for j in range(0, i+1):
            # printing hash
            print("# ", end="")
        # ending line after each row
        print("\r")

num = int(input("Enter the number of rows: "))
pattern_7(num)</code>
Code
Code
Ans. To generate a random number, we use the random module of Python. Here are some
examples. To generate a floating-point number from 0 to 1, use random.random().
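A few calls from the standard random module, sketched with illustrative arguments; the names x, y and z are assumptions.

```python
import random

x = random.random()                 # float in the half-open range [0.0, 1.0)
y = random.randint(1, 6)            # integer from 1 to 6, both ends included
z = random.choice(["a", "b", "c"])  # one element picked from a sequence
print(x, y, z)
```

Calling random.seed() with a fixed value beforehand makes the sequence repeatable, which helps when testing.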