The Crop Price Prediction Using Machine Learning: Preliminary Stage



Introduction
India is regarded as an agrarian nation because around 55% of its people rely on agriculture or closely related industries to make a living. Agriculture is an important component of the primary sector of India's economy (Ghutake, I., et al., 2021). Farmers experience significant losses due to barren soil, global climate change, an oversupply of products on the market, and numerous other uncertainties. To manage and sell produce at the optimum time, maximizing revenue and minimizing loss, it is imperative to predict future prices (Rachana, P. S., et al., 2019).
In the past, farmers relied on their experience with specific crops and fields to anticipate prices (Sabu, K. M., & Kumar, T. M., 2020). Suppose instead that historical data with the corresponding recorded prices are available; these data can then be used to project likely future prices. Farmers need accurate price forecasts because they use them to guide their marketing and production choices, which may affect their finances months down the road. The forecast in this work is based on data from official sources and employs machine learning methods. The price of a crop can also be predicted via data mining, which in general is the process of examining data from various angles and condensing it into useful information (Rohith, R., et al., 2020). Crop price forecasting is a significant agricultural issue: every farmer wants to know how much of the price he expects he will actually receive (Darekar, A., & Reddy, A. A., 2017). For models based on machine learning or deep learning, frequent updates can cause stability problems, because new price data are entered every day and their quality varies (Peng, Y. H., 2015). Other agriculture-based enterprises may find such a forecasting service useful for planning the cost-effective sourcing of raw materials (Shakoor, M. T., 2017). Using machine learning, this research aims to give farmers and those in related fields access, at any time, to predictions of expected price trends for different crops 12 months in advance. An ML model with high prediction accuracy could therefore be of great benefit to the farmers who feed us.
In one study, crop price predictions were made using machine learning techniques. The authors also provided a thorough analysis of each crop and projected a future scenario so that farmers could choose the best crop for their specific situation (Shakoor, M. T., et al., 2021).
Another work seeks to forecast the price and profit of a specified crop before sowing. Rainfall, Maximum-trade, Minimum Support Price, and Yield are the factors taken into account for price prediction, while crop price, yield, cultivation costs, and seed costs are the factors taken into account for profit prediction. The K-Nearest Neighbour (KNN) algorithm was used to forecast crop profit, and the Naive Bayes algorithm was used to forecast crop price (Ghutake, I., et al., 2021).
In another study, machine learning models and time-series models were used to forecast monthly arecanut prices in Kerala. An arecanut dataset with prices from 2007 to 2017 was used to compare the performance of the SARIMA model, the Holt-Winters seasonal method, and an LSTM neural network. The LSTM neural network was found to fit the data best (Guruprasad, R. B., et al., 2019).
Detailed forecasts for the upcoming six months were made in a further study. A supervised learning approach, the Decision Tree Regressor, was applied, with rainfall and crop price data used as predictors (Thomas, D., 2023).
In a study of paddy prices, future prices were predicted using the Auto Regressive Integrated Moving Average (ARIMA) model. The effectiveness of the model was evaluated by computing various goodness-of-fit metrics. According to the findings, the ARIMA model is the most effective model for predicting paddy prices (Dhanapal, R., et al., 2021).
In another system, data on the crops concerned were gathered from local markets and online questionnaires to form the dataset on which the machine learning models were trained. Algorithms such as Artificial Neural Networks, Partial Least Squares, and Autoregressive Integrated Moving Average were used to predict prices (Ikram, A., et al., 2022). According to results obtained with recent data, Partial Least Squares and Artificial Neural Networks outperform the other algorithms for both short- and long-term prediction.
A further study attempted to assist farmers in making decisions by rating the suitability of a crop for the area in question. Supervised machine learning methods, such as the K-nearest neighbor regression algorithm and decision tree learning, were used for prediction and ranking.

Research Methodology
The following methodology is applied to our research, and it is illustrated in Figure 1.
1. Data Collection: The first step is to collect historical data on crop prices and other relevant factors that can affect prices, such as weather, market demand, and supply chain information. The data can be obtained from various sources, including government agencies, commodity exchanges, and private data providers. The dataset includes the year, month, state, district, crop, and crop type.

2. Data Preprocessing: Once the data is collected, it needs to be pre-processed. This involves cleaning, transforming, and organizing the data to make it suitable for further analysis. The following are common steps for preprocessing crop price data.
3. Data Cleaning: Duplicate records are removed from the dataset, and rows with missing values are eliminated.

4. Transformation: The dataset includes categorical values. These categorical values are converted into numerical values.
5. Feature Selection: Feature selection involves choosing the features in the dataset that are most relevant to crop prices. This can be done using techniques such as correlation analysis.

6. Model Training: After feature selection, the Random Forest algorithm is trained on the pre-processed data. The algorithm builds multiple decision trees on randomly selected subsets of the data and combines them to generate the final prediction model. It uses bagging to improve the model's performance and feature importance scores to aid interpretability.

7. Model Evaluation: Once the model is trained, it needs to be evaluated to determine its accuracy and robustness. This can be done using metrics such as mean squared error and mean absolute error.
8. Prediction: After model evaluation, the trained Random Forest model can be used to predict future crop prices. The model takes the relevant features as input and generates a prediction for the crop price.
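
The methodology steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the implementation used in this research: the data are synthetic, the column names (`rainfall_mm`, `price`, and so on), the price rule, the correlation threshold of 0.05, and the 80/20 split are all hypothetical choices, and pandas and scikit-learn are assumed as the toolkit.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split


def make_toy_prices(n_rows=400, seed=0):
    """Data collection stand-in: build a synthetic price table (not real data)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "year": rng.integers(2015, 2023, n_rows),
        "month": rng.integers(1, 13, n_rows),
        "state": rng.choice(["Kerala", "Punjab", "Bihar"], n_rows),
        "crop": rng.choice(["rice", "wheat"], n_rows),
        "rainfall_mm": rng.uniform(0, 300, n_rows),
    })
    # Hypothetical price rule plus noise, only so the sketch has a learnable signal.
    base = np.where(df["crop"] == "rice", 30.0, 22.0)
    df["price"] = (base + 0.8 * (df["year"] - 2015)
                   + 0.01 * df["rainfall_mm"] + rng.normal(0, 1, n_rows))
    # Inject 10 missing values and 5 duplicate rows for the cleaning step to remove.
    df.loc[rng.choice(n_rows, 10, replace=False), "rainfall_mm"] = np.nan
    return pd.concat([df, df.iloc[:5]], ignore_index=True)


def preprocess(df, categorical_cols=("state", "crop")):
    """Data cleaning and transformation: drop duplicates and incomplete rows,
    then replace each categorical value with an integer code."""
    clean = df.drop_duplicates().dropna().reset_index(drop=True)
    for col in categorical_cols:
        clean[col] = clean[col].astype("category").cat.codes
    return clean


def select_features(df, target="price", threshold=0.05):
    """Feature selection: keep columns whose absolute Pearson correlation with
    the target exceeds the threshold (a crude filter for integer-coded categories)."""
    corr = df.corr()[target].drop(target).abs()
    return corr[corr > threshold].index.tolist()


def train_and_evaluate(df, features, target="price", seed=0):
    """Model training and evaluation: fit a Random Forest on 80% of the rows
    and report MSE and MAE on the held-out 20%."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[features], df[target], test_size=0.2, random_state=seed)
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return model, mean_squared_error(y_test, pred), mean_absolute_error(y_test, pred)


raw = make_toy_prices()
data = preprocess(raw)
features = select_features(data)
model, mse, mae = train_and_evaluate(data, features)
print(f"features={features} MSE={mse:.2f} MAE={mae:.2f}")

# Prediction: score one observation (here simply the first cleaned row).
new_row = data[features].iloc[[0]]
print("predicted price:", model.predict(new_row)[0])
```

With real price data, a split that respects time order (training on earlier months and testing on later ones) may give a more realistic estimate of forecasting error than the random split assumed here.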

Results and Analysis
This manuscript offers a succinct overview of crop yield forecasts made for the chosen region using multiple linear regression.The agricultural study of organic and nonorganic farming, timely crop cultivation, data on profits and losses, and analysis of local business land in a specific location are its main areas of concentration.
It concentrates on the real estate, organic, and inorganic data sets that will provide agricultural projections.The use of data mining techniques to forecast agricultural yields based on input factors has been demonstrated in this research.Numerous agro-climatic input characteristics have an impact on crop output.The input of climatic parameters into a system designed to predict crop yields shows a tendency for each crop to be predominately influenced by a specific climatic parameter.
To address numerous agricultural issues, this paper presents several applications of data mining.Because it combines the works of numerous authors in one location, scholars can use it to learn more about the status of data mining techniques and applications specific to the agriculture industry.
The prediction results of the algorithms we ran to compare retail crop price predictions are presented as bar charts in Figures 2 and 3. The graphs show the retail sale price of the rice crop across all the states for January of a given year.
From the experiments, we can conclude that the Random Forest algorithm can be an effective tool for crop price prediction in agriculture. The algorithm combines multiple decision trees to improve the accuracy and robustness of the prediction model, using bagging to enhance performance and feature importance scores to aid interpretability. By training the algorithm on historical data on crop prices and other relevant factors, stakeholders can use the model to predict future crop prices and make informed decisions about buying, selling, or storing their crops.
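
To make the bagging idea concrete, the sketch below trains several one-split regression trees (stumps) on bootstrap samples and averages their predictions. It is a simplified illustration, not the Random Forest used in the experiments: a real Random Forest grows deeper trees and also samples a random subset of features at each split, whereas this toy example has a single feature. The data, the function names `fit_stump` and `fit_bagged_stumps`, and the choice of 25 trees are all hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree on one feature by minimizing squared error."""
    best = None
    # Candidate thresholds exclude the largest x so both sides are never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        l_mean, r_mean = mean(left), mean(right)
        sse = (sum((y - l_mean) ** 2 for y in left)
               + sum((y - r_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    if best is None:
        # All x values identical: fall back to predicting the overall mean.
        overall = mean(ys)
        return lambda x: overall
    _, threshold, l_mean, r_mean = best
    return lambda x: l_mean if x <= threshold else r_mean


def fit_bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Bagging: train each stump on a bootstrap sample, then average predictions."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(tree(x) for tree in trees)


# Toy data: the price jumps from about 20 to about 30 once x reaches 10.
noise = random.Random(1)
xs = list(range(20))
ys = [(20 if x < 10 else 30) + noise.uniform(-1, 1) for x in xs]

predict = fit_bagged_stumps(xs, ys)
print("prediction at x=2:", round(predict(2), 2))
print("prediction at x=17:", round(predict(17), 2))
```

Because each stump sees a slightly different resample of the data, averaging them smooths out the noise that any single tree would fit, which is the source of the robustness attributed to the ensemble above.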
Using Random Forest for crop price prediction can also provide other benefits, such as improving market efficiency, reducing risks, and helping policymakers make informed decisions. However, the accuracy of the predictions depends on the quality and quantity of the data, the feature selection, and the model's training and evaluation. It is therefore crucial to collect high-quality data, use appropriate feature selection techniques, and regularly update and monitor the model to ensure its accuracy and relevance.

Conclusion
In summary, Random Forest can be a valuable tool for crop price prediction in agriculture, providing predictions that can help stakeholders make informed decisions and mitigate risks. Using Random Forest for crop price prediction can also lower risks, increase market efficiency, and support decision-making by policymakers. The precision of the forecasts, however, is contingent upon the quality and volume of the data, the feature selection process, and the training and assessment of the model. To maintain the accuracy and applicability of the model, it is therefore essential to gather high-quality data, apply suitable feature selection approaches, and update and monitor the model regularly.

Figure 1. Flow Chart of Methodology

Figure 2: The graph shows the retail price of all states