Crop Prediction System using Machine Learning
e-ISSN: 2348-4470
p-ISSN: 2348-6406
Abstract: India being an agricultural country, its economy depends predominantly on agricultural yield and
allied agro-industry products. Indian agriculture is largely influenced by rainfall, which is highly unpredictable.
Agricultural growth also depends on diverse soil parameters, namely nitrogen, phosphorus, potassium, crop rotation,
soil moisture and surface temperature, and on weather aspects such as temperature and rainfall. India is now
progressing rapidly in technical development, and technology can benefit agriculture by increasing crop
productivity, resulting in better yields for the farmer. The proposed project provides a solution for
smart agriculture by monitoring the agricultural field, which can assist farmers in increasing productivity to a great
extent. Weather forecast data such as temperature and rainfall, obtained from the IMD (India Meteorological Department),
together with a repository of soil parameters, give insight into which crops are suitable for cultivation in a particular area.
This work presents a system, in the form of an Android-based application, which uses data analytics techniques to predict the
most profitable crop under the current weather and soil conditions. The proposed system integrates the data obtained
from the soil repository and the weather department and applies a machine learning algorithm, Multiple Linear Regression,
to predict the most suitable crops for the current environmental conditions. This provides the farmer with
a variety of crops that can be cultivated. Thus, the project develops a system that integrates data from various
sources, data analytics and prediction analysis, which can improve crop yield and increase the profit margins of
farmers, helping them in the long run.
Keywords: Data analytics, prediction, machine learning, multiple linear regression, Android application.
I. INTRODUCTION
Agriculture is one of the most important occupations practiced in our country. It is the broadest economic sector and plays
an important role in the overall development of the country. About 60% of the land in the country is used for agriculture
to meet the needs of 1.2 billion people. Modernization of agriculture is therefore very important and can lead
the farmers of our country towards profit. [1]
Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain,
increasingly with the aid of specialized systems and software. [2] Earlier, yield prediction relied on
the farmer's experience with a particular field and crop. However, as conditions change rapidly,
farmers are forced to cultivate more and more crops. In this situation, many of them do not have enough
knowledge about the new crops and are not fully aware of the benefits of farming them. Farm
productivity can also be increased by understanding and forecasting crop performance under a variety of environmental
conditions. The proposed system therefore takes the location of the user as input. From the location, the soil
nutrients such as nitrogen, phosphorus and potassium are obtained. The processing step also takes into consideration two more
datasets: one obtained from the weather department, forecasting the weather expected in the current year, and the other
being static data. The static data comprises crop production figures and data on the demand for various crops, obtained from
various government websites. The proposed system applies a machine learning prediction algorithm, Multiple
Linear Regression, to identify patterns in the data and then processes them according to the input conditions. This in turn proposes
the most feasible crops for the given environmental conditions. Thus, the system requires only the location of the
user and suggests a number of profitable crops, giving the farmer a direct choice about which crop to cultivate.
As past-year production is also taken into account, the prediction is expected to be more accurate.
N. Hemageetha [8]
This paper discusses various data mining techniques such as Market Basket Analysis, Association Rule Mining, Decision
Trees, Classification and Clustering. It covers the data mining concept as a whole. Various data mining algorithms such as
the Naive Bayes classifier, J48 and K-Means are explained in this paper. It also describes classification of soil based on Naive
Bayes, genetic algorithms and Association Rule Mining. Finally, it covers clustering in soil databases. This paper helped
us in understanding and analyzing different data mining algorithms and classification mechanisms. This will prove to be
extremely beneficial while developing our project and will help in mining the dataset obtained from sensors deployed
remotely.
Awanit Kumar, Shiv Kumar [14]
This paper proposes a system for predicting the production of crops in the current year. To determine the crop
production, it uses the K-Means data mining algorithm. The system also uses a prediction mechanism in the form of fuzzy logic.
Fuzzy logic is a rule-based prediction logic in which a set of rules is applied to the land available for farming, rainfall and
production of crops. This paper gives a clear insight into how K-Means can be used to analyze data sets.
Similar to the set of rules they have applied in the form of fuzzy logic, we will apply a set of rules to predict which
crop will yield maximum profit, based on the previous years' crop prices and current soil and weather data.
To the best of our knowledge, no existing system recommends crops based on multiple factors such as the nitrogen, phosphorus and
potassium nutrients in the soil together with weather components such as temperature and rainfall. The proposed system
is an Android-based application that predicts the most profitable crop for the farmer. The user's
location is identified with the help of GPS. According to the user's location, the feasible crops for that location are
identified from the soil and weather databases. These crops are compared against the past-year production database to identify the
most profitable crop at the current location. After this processing is done on the server side, the result is sent to the user's
Android application. The previous production of the crops is also taken into account, which in turn leads to a more precise crop
recommendation. Location is the only input to the prediction system. Depending on the scenario and any additional
filters chosen by the user, the most productive crop is suggested.
Regression: Regression analysis is a predictive modelling technique which investigates the relationship between a
dependent variable (target) and one or more independent variables (predictors).
Linear Regression: Linear regression is a linear approach to modelling the relationship between a scalar dependent
variable y and one or more independent variables denoted X. The case of a single independent variable is
called simple linear regression. [9]
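As an illustration of the simple linear regression case, the following minimal sketch fits y = b0 + b1 x by ordinary least squares. The function name fit_simple_linear and the rainfall and yield figures are hypothetical and are not taken from the datasets used in this work.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares fit of y = b0 + b1 * x for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data: rainfall (mm) against yield (t/ha), lying on y = 1 + 0.002 x.
rainfall = [500.0, 750.0, 1000.0, 1250.0]
yield_t = [2.0, 2.5, 3.0, 3.5]
b0, b1 = fit_simple_linear(rainfall, yield_t)
print(b0, b1)
```

Because the illustrative points lie exactly on a line, the fitted intercept and slope recover the generating values up to floating-point rounding.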
Nonlinear Regression: Nonlinear regression is a form of regression analysis in which observational data are
modelled by a function which is a nonlinear combination of the model parameters and depends on one or more
independent variables. The data are fitted by a method of successive approximations. [10]
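To make the idea of fitting by successive approximations concrete, the sketch below applies Gauss-Newton iterations to the one-parameter model y = exp(b x). The choice of Gauss-Newton, the model, the starting value and the data are all assumptions made for illustration; nonlinear regression is not used by the proposed system.

```python
import math


def model(x, b):
    """Model that is nonlinear in its parameter b."""
    return math.exp(b * x)


def fit_exponential_rate(xs, ys, b_start, iterations=20, tol=1e-12):
    """Refine b by repeated Gauss-Newton updates until the step is negligible."""
    b = b_start
    for _ in range(iterations):
        # Residuals and derivative of the model with respect to b at the current estimate.
        r = [y - model(x, b) for x, y in zip(xs, ys)]
        j = [x * model(x, b) for x in xs]
        step = sum(ji * ri for ji, ri in zip(j, r)) / sum(ji * ji for ji in j)
        b += step
        if abs(step) < tol:
            break
    return b


# Hypothetical noise-free data generated with b = 0.5.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.exp(0.5 * x) for x in xs]
b_hat = fit_exponential_rate(xs, ys, b_start=0.3)
print(b_hat)
```

Each pass linearizes the model around the current estimate and solves the resulting linear least squares problem, which is why the estimate improves step by step rather than in one closed-form calculation.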
V. MULTIPLE LINEAR REGRESSION
The difference between simple and multiple linear regression is that multiple linear regression has more than one
independent variable, whereas simple linear regression has only one.
In this project, the Multiple Linear Regression algorithm is used to predict the crops. Multiple regression is an extension of
simple linear regression. It is used to predict the value of a variable based on the values of two or more
other variables. The variable we want to predict is called the dependent variable (or sometimes the outcome, target or
criterion variable). The variables we use to predict the value of the dependent variable are called the independent
variables (or sometimes the predictor, explanatory or regressor variables). For example, multiple regression could be used to
examine whether exam performance can be predicted from revision time, test anxiety, lecture attendance and
gender. Multiple regression also allows us to determine the overall fit (variance explained) of the model and the relative
contribution of each of the predictors to the total variance explained.
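The overall fit mentioned above is commonly summarized by the coefficient of determination, R squared, which is the proportion of the variance in the dependent variable that the model explains. The following is a minimal sketch of that calculation; the observed and predicted values are hypothetical.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot, the share of variance explained by the model."""
    mean_y = sum(observed) / len(observed)
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


# Hypothetical observed yields and the yields predicted by some fitted model.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.1, 2.9, 4.2, 4.8]
r2 = r_squared(observed, predicted)
print(r2)
```

A value close to 1 indicates that the predictors account for most of the variation in the observed values.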
Formulae:
A linear regression model that contains more than one predictor variable is called a multiple linear regression
model. The following is a multiple linear regression model with two predictor variables, $x_1$ and $x_2$:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$
Where,
$\beta_0, \beta_1, \beta_2, \ldots$ are the coefficients of the multiple linear regression,
$x_1, x_2, \ldots$ are the independent variables,
$\epsilon$ is the random error term.
The model is linear because it is linear in the parameters $\beta_0$, $\beta_1$ and $\beta_2$. The model describes a plane in the
three-dimensional space of $Y$, $x_1$ and $x_2$. The parameter $\beta_0$ is the intercept of this plane. Parameters $\beta_1$ and $\beta_2$ are referred
to as partial regression coefficients. Parameter $\beta_1$ represents the change in the mean response corresponding to a unit
change in $x_1$ when $x_2$ is held constant. Parameter $\beta_2$ represents the change in the mean response corresponding to a
unit change in $x_2$ when $x_1$ is held constant.
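A minimal sketch of how the coefficients of the two-predictor model could be estimated is given here. It solves the least squares normal equations with plain Gaussian elimination. The helper names, the choice of rainfall and temperature as predictors, and the data are hypothetical; the paper does not specify which estimation routine its implementation uses.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_linear(rows, ys):
    """Least squares estimate of [b0, b1, ..., bk] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical observations: x1 = rainfall (in units of 100 mm), x2 = temperature (deg C).
rows = [(5.0, 20.0), (7.0, 25.0), (9.0, 22.0), (6.0, 30.0), (8.0, 28.0)]
# Yields generated from Y = 1.0 + 0.3*x1 + 0.05*x2 with no error term.
ys = [1.0 + 0.3 * x1 + 0.05 * x2 for x1, x2 in rows]
b0, b1, b2 = fit_multiple_linear(rows, ys)
print(b0, b1, b2)
```

With noise-free data the estimates match the generating coefficients up to rounding, and b1 can be read as the change in yield per unit of x1 with x2 held constant, as described above.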
Input: Accurate crop prediction depends on numerous factors such as soil nutrients, weather and past crop
production. All these factors are location-dependent, and thus the location of the user is taken as the
input to the system.
Data Acquisition: Depending on the current user location, the system retrieves the soil properties of the respective area
from the soil repository. In a similar manner, weather parameters are extracted from the weather data set.
Data Processing: A crop is cultivable only if suitable conditions are met. These include a wide range of parameters related
to soil and weather. These constraints are compared and the suitable crops are determined. Multiple Linear Regression is used
by the system to predict the crop. The prediction is based on past crop production data, i.e. the actual
weather and soil parameters recorded in the past are identified and compared with current conditions, which makes the
prediction more accurate and practical.
Output: The most profitable crop is predicted by the system using the Multiple Linear Regression algorithm, and the user is
provided with multiple crop suggestions according to the duration of the crop.
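The data processing and output steps above can be sketched as a feasibility filter followed by a ranking. In this sketch the requirement ranges, the past production records and the use of past yield multiplied by price as a stand-in for profit are all hypothetical assumptions; the proposed system would draw these values from its repositories and use the regression model for the prediction itself.

```python
# Hypothetical (min, max) requirement ranges per crop; not real agronomic data.
REQUIREMENTS = {
    "rice": {"n": (80, 120), "p": (30, 60), "k": (30, 60), "temp": (20, 35), "rain": (1000, 2500)},
    "wheat": {"n": (60, 120), "p": (30, 60), "k": (30, 60), "temp": (10, 25), "rain": (300, 1000)},
    "maize": {"n": (80, 140), "p": (30, 60), "k": (30, 60), "temp": (18, 30), "rain": (500, 1200)},
    "millet": {"n": (40, 100), "p": (20, 50), "k": (20, 50), "temp": (25, 35), "rain": (300, 700)},
}

# Hypothetical past-year records: crop -> (yield in t/ha, price per tonne).
PAST_PRODUCTION = {
    "rice": (3.5, 22000),
    "wheat": (3.0, 20000),
    "maize": (4.0, 18000),
    "millet": (1.5, 25000),
}


def feasible_crops(conditions, requirements):
    """Keep the crops whose every requirement range contains the observed value."""
    return [
        crop
        for crop, ranges in requirements.items()
        if all(low <= conditions[name] <= high for name, (low, high) in ranges.items())
    ]


def rank_by_past_revenue(crops, past_production):
    """Sort crops by past yield * price, highest first (a stand-in for profit)."""
    return sorted(
        crops,
        key=lambda c: past_production[c][0] * past_production[c][1],
        reverse=True,
    )


# Hypothetical soil (N, P, K) and weather values looked up for the user's location.
conditions = {"n": 90, "p": 40, "k": 40, "temp": 22, "rain": 800}
feasible = feasible_crops(conditions, REQUIREMENTS)
suggestions = rank_by_past_revenue(feasible, PAST_PRODUCTION)
print(suggestions)
```

Returning a ranked list rather than a single crop mirrors the stated output, in which the farmer receives several suggestions to choose from.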
Set Theory:
S = {I, Fm, O, S, F}
I = {I1} …… Set of inputs
I1 = Location of user
Fm = { GetLocation(), GetAttributes(latitude, longitude),
GetSoil(), GetWeather(),
FeasibleCrop(soil, weather), PastProduction(),
ProfitableCrop(FeasibleCrops, PastProduction),
MaxProfitableCrops() } …… Set of functions
Where, soil: N, P, K components
weather: Temperature and rainfall values
O = {Crop predicted for given location} …… Set of outputs
S = Correct prediction for high production and profit …… Success condition
F = Failure in prediction due to incorrect training data …… Failure condition
Prediction: $\hat{Y} = X\hat{\beta}$
Result: $res = Y - \hat{Y}$
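The prediction and residual expressions can be sketched as follows: each row of the design matrix X is multiplied by the coefficient vector to give a predicted value, and the residual is the observed value minus the prediction. The coefficient values and observations here are hypothetical.

```python
def predict(design, beta):
    """Y_hat = X beta: one dot product per row of the design matrix."""
    return [sum(x * b for x, b in zip(row, beta)) for row in design]


def residuals(observed, predicted):
    """res = Y - Y_hat for each observation."""
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


# Hypothetical design matrix rows: [1 (intercept), x1, x2].
design = [[1.0, 5.0, 20.0], [1.0, 7.0, 25.0], [1.0, 9.0, 22.0]]
beta = [1.0, 0.3, 0.05]       # hypothetical fitted coefficients
observed = [3.6, 4.3, 4.9]    # hypothetical observed yields
y_hat = predict(design, beta)
res = residuals(observed, y_hat)
print(y_hat, res)
```

Small residuals on data that were not used for fitting suggest that the training data represent the location well, which relates to the failure condition F above.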
The proposed system takes into consideration data related to soil, weather and past-year production and suggests
the most profitable crops that can be cultivated under the given environmental conditions. As the system lists
all feasible crops, it helps the farmer decide which crop to cultivate. The system also takes into
consideration past production data, which helps the farmer gain insight into the demand for and the cost of various
crops in the market. As the system covers a wide range of crops, the farmer may learn about crops
that he or she has never cultivated before.
In the future, all farming devices can be connected over the internet using IoT. Sensors can be deployed on the farm
to collect information about current farm conditions, and devices can then adjust the moisture, acidity, etc.
accordingly. Farm vehicles such as tractors will be connected to the internet in the future and will pass data to the
farmer in real time about crop harvesting and any diseases the crops may be suffering from, helping the farmer take
appropriate action. Further, the most profitable crop can also be determined taking monetary factors and the inflation rate into account.
REFERENCES
[1] https://en.wikipedia.org/wiki/Agriculture
[2] https://en.wikipedia.org/wiki/Data_analysis
[4] M.R. Bendre, R.C. Thool, V.R. Thool, “Big Data in Precision Agriculture”, Sept 2015, NGCT.
[5] Monali Paul, Santosh K. Vishwakarma, Ashok Verma, “Analysis of Soil Behavior and Prediction of Crop Yield using
Data Mining approach”, 2015 International Conference on Computational Intelligence and Communication Networks.
[6] Abdullah Na, William Isaac, Shashank Varshney, Ekram Khan, “An IoT Based System for Remote Monitoring of Soil
Characteristics”, 2016 International Conference of Information Technology.
[7] Dr. N. Suma, Sandra Rhea Samson, S. Saranya, G. Shanmugapriya, R. Subhashri, “IOT Based Smart Agriculture
Monitoring System”, Feb 2017, IJRITCC.
[8] N. Hemageetha, “A survey on Application of Data Mining Techniques to Analyze the soil for agricultural purpose”,
2016 IEEE.
[9] https://en.wikipedia.org/wiki/Linear_regression
[10] https://en.wikipedia.org/wiki/Nonlinear_regression
[11] Dhivya B, Manjula, Siva Bharathi, Madhumathi, “A Survey on Crop Yield Prediction based on Agricultural Data”,
International Conference in Modern Science and Engineering, March 2017.
[12] Giritharan Ravichandran, Koteeshwari R S, “Agricultural Crop Predictor and Advisor using ANN for Smartphones”,
2016 IEEE.
[13] R. Nagini, Dr. T.V. Rajnikanth, B.V. Kiranmayee, “Agriculture Yield Prediction Using Predictive Analytic Techniques”,
2nd International Conference on Contemporary Computing and Informatics (ic3i), 2016.
[14] Awanit Kumar, Shiv Kumar, “Prediction of production of crops using K-Means and Fuzzy Logic”, IJCSMC, 2015.