Agricultural Data Analysis Using Machine Learning
Abstract: Agriculture is undoubtedly the largest livelihood provider in India and also contributes significantly to the
economy of our country. The technological factors affecting crop production include the practices used and
managerial decisions. Predicting the crop yield prior to harvest would therefore help farmers take appropriate steps. We
attempt to resolve this issue by building a user-friendly prediction system. The results of the prediction are suggested to the
farmer so that suitable changes can be made to improve the produce. There are different techniques or
algorithms which help to predict crop yield. By analyzing parameters such as location, soil nutrients, pH value, rainfall and
moisture, a potential solution can be obtained to overcome the situation faced by farmers. This paper focuses on the analysis
of agricultural data and on finding the optimal yield, to provide an insight before the actual crop production, using data mining
techniques and Machine Learning algorithms.
Keywords: Yield, Random Forest regressor, Decision Tree regressor, GDP, Digitalisation.
I. INTRODUCTION
Today, India is one of the leading producers in the world in the agriculture sector[1]. Agriculture is the broadest economic
sector and plays an outstanding role in the socio-economic fabric of India. Agriculture is an unusual business: crop production
is influenced by many climatic and economic factors. In Andhra Pradesh, essentially an agro-based economy, agriculture contributes
more than 29% of the GDP, as against 17% of the country's GDP. Periodical advice to farmers, either on improved agricultural
strategies or on advancements in the factors affecting crop production, may strengthen the state in the agriculture sector. Yield
prediction is one among these agricultural advancements. Due to innovations of this kind, agriculture is attracting the interest of
modern man. In the past, farmers used to predict their yield from previous experience[2]. Digitalisation in farming gives awareness
about cultivating crops at the right time and in the right place, even to young farmers. These kinds of advancements need
the use of data analytics. The system presented here is one that can be used to address yield prediction. The main objectives are:
1) To analyse different parameters (soil nutrients, rainfall, area, etc.).
2) To use machine learning techniques to predict crop yield.
3) To provide an easy-to-use user interface.
IJRTI2211033 International Journal for Research Trends and Innovation (www.ijrti.org) 207
© 2022 IJRTI | Volume 7, Issue 11 | ISSN: 2456-3315
diversity in factors useful for agriculture at district level. Periodical data about the crop, soil and water of a particular region is the
major focus of this study. The final dataset has been tabulated as in Table-1:
S. No   Feature      Description
1       Year         The year in which the crop will be cultivated; generally, the upcoming year
2       Season       One among Kharif, Rabi and Whole Year
3       Crop         Name of the crop
4       District     Name of the district
5       pH Level     Describes the nature of the soil
6       Nitrogen     Amount of nitrogen present
7       Potassium    Amount of potassium present
8       Phosphorus   Amount of phosphorus present
9       Rainfall     Expected rainfall in millimeters
10      Area         Area of field in hectares
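As a rough illustration of how one row with the ten features of Table-1 might be turned into numeric model input, the sketch below one-hot encodes the categorical features (Season, Crop, District) and passes the numeric ones through unchanged. The names `FieldRecord`, `one_hot` and `encode`, as well as the example crop and district values, are hypothetical and are not taken from the paper; the encoding actually used by the authors is not stated.

```python
from dataclasses import dataclass

SEASONS = ("Kharif", "Rabi", "Whole Year")  # the three values listed in Table-1


@dataclass
class FieldRecord:
    """One input row with the ten features of Table-1 (field names are illustrative)."""
    year: int
    season: str
    crop: str
    district: str
    ph: float
    nitrogen: float
    potassium: float
    phosphorus: float
    rainfall_mm: float
    area_ha: float


def one_hot(value, categories):
    """Return a 0/1 list marking the position of `value` in `categories`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def encode(record, crops, districts):
    """Turn a FieldRecord into a flat numeric feature vector."""
    numeric = [float(record.year), record.ph, record.nitrogen, record.potassium,
               record.phosphorus, record.rainfall_mm, record.area_ha]
    return (numeric
            + one_hot(record.season, SEASONS)
            + one_hot(record.crop, crops)
            + one_hot(record.district, districts))


# Hypothetical example values, not taken from the paper's dataset.
crops = ("Rice", "Maize")
districts = ("Guntur", "Krishna")
row = FieldRecord(2023, "Kharif", "Rice", "Guntur", 6.5, 80.0, 40.0, 35.0, 900.0, 2.0)
vector = encode(row, crops, districts)
print(len(vector))  # 7 numeric + 3 season + 2 crop + 2 district = 14
```

Tree-based models do not strictly need one-hot encoding, but a flat numeric vector of this kind is accepted by most regression libraries.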
The diagram below depicts the system architecture of our proposed system. The system can be divided into two modules:
one model predicts the optimal yield and the other model analyses the patterns in the dataset. The operation of these
models as a whole is specified in the diagram below.
IV. METHODS
In the implementation of this yield prediction system, Regression Analysis is used. Regression analysis is considered one of the
oldest and most widely used multivariate analysis techniques in the social sciences. Unlike other techniques, regression is an example of
dependence analysis, in which the variables are treated asymmetrically. In regression analysis, the objective is to obtain a prediction
of one variable, given the values of the others[8]. Random Forest and Decision Tree algorithms are generally used in
classification problems, but they can also be used in regression problems.
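The following is a minimal sketch of how the two regressors could be fitted and compared, assuming scikit-learn is used (the paper does not name its library). The data is synthetic: the feature ranges, the yield formula and the 80/20 split are invented for illustration and do not reproduce the paper's dataset or results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the numeric features of Table-1;
# the values and the yield formula below are invented for illustration only.
n = 500
X = np.column_stack([
    rng.uniform(5.0, 8.5, n),      # pH
    rng.uniform(20, 120, n),       # nitrogen
    rng.uniform(10, 80, n),        # potassium
    rng.uniform(10, 80, n),        # phosphorus
    rng.uniform(300, 1500, n),     # rainfall (mm)
    rng.uniform(0.5, 10, n),       # area (ha)
])
y = X[:, 5] * (2.0 + 0.002 * X[:, 4] + 0.01 * X[:, 1]) + rng.normal(0, 1.0, n)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "decision tree": DecisionTreeRegressor(random_state=0),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # For regressors, .score() returns the coefficient of determination R^2.
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"{name}: train R^2 = {scores[name][0]:.3f}, test R^2 = {scores[name][1]:.3f}")
```

An unpruned decision tree typically fits its training data almost perfectly, so the gap between the training and test scores is the figure to watch. A random forest averages many trees trained on bootstrap samples, which usually narrows that gap.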
V. EXPERIMENTAL RESULT
A. Decision Tree Regression
Applying the Decision Tree algorithm to the dataset resulted in a score of 100% on the training data and approximately 82% on the test data.
Fig-2 shows the accuracy of the Decision Tree algorithm on the data.
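The paper does not state which metric the reported percentages refer to. If they are coefficients of determination (R^2), a common default score for regression models, they could be computed as in the sketch below. The function name `r2_score` and the yield values are assumptions made for illustration only.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical yields, invented for illustration.
actual = [2.0, 4.0, 6.0, 8.0]
perfect = [2.0, 4.0, 6.0, 8.0]
rough = [4.0, 3.0, 6.0, 8.0]

print(r2_score(actual, perfect))  # 1.0
print(r2_score(actual, rough))    # 0.75
```

Under this reading, a score of 100% means the predictions reproduce the observed yields exactly, while lower scores mean that part of the variance in yield is left unexplained.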
Fig-5: Crops with a lower increase in production but an increasing price
VI. CONCLUSION
Both Decision Tree regression and Random Forest regression techniques are implemented on the input data to determine the
best-performing method. These methods are compared using performance metrics. According to the analysis of the metrics, both
algorithms work well, but Random Forest regression gives a better accuracy score on test data than Decision Tree regression.
The proposed work can also be extended to analyse the climatic conditions and other factors for the crop and to increase crop
production.
VII. REFERENCES
[1] https://www.investopedia.com/articles/investing/100615/4-countries-produce-most-food.asp
[2] http://www.fao.org/fileadmin/templates/ess/documents/meetings_and_workshops/GS_SAC_2013/Improving_methods_for_crops_estimates/Crop_Yield_Forecasting_Methods_and_Early_Warning_Systems_Lit_review.pdf
[3] https://ijaers.com/uploads/issue_files/3%20IJAERS-MAY-2017-60-Different%20Types%20of%20Data%20Mining%20Techniques.pdf
[4] https://docs.oracle.com/cd/B12037_01/datamine.101/b10698/4descrip.htm
[5] https://data.gov.in/
[6] http://www.apagrisnet.gov.in/2018/weekly/October/weekly_report_(Rabi)_06_21-11-18.pdf
[7] http://dataverse.icrisat.org/file.xhtml?fileId=1185&version=RELEASED version=.3
[8] https://www.sciencedirect.com/topics/medicine-and-dentistry/regression-analysis
[9] https://gdcoder.com/decision-tree-regressor-explained-in-depth/
[10] https://www.analyticsvidhya.com/blog/2020/12/lets-open-the-black-box-of-random-forests/
[11] https://journalofbigdata.springeropen.com/articles/10.1186/s40537-017-0077-4