Machine learning for the management of agricultural soil data

In this work, we apply a variety of machine learning techniques to classify patterns composed of chemical measurements acquired from agricultural soils in the Indian state of Maharashtra. India devotes 60.5% of its land to agriculture, a sector that represents 11.3% of its Gross State Domestic Product. Machine learning methods are useful for designing cultivation plans, recommending fertilizers and predicting soil fertility. Specifically, we classify several village-wise fertility indices (organic carbon, phosphorus pentoxide, manganese and iron), soil nutrients (nitrous oxide, phosphorus pentoxide and potassium oxide), the preferable crop (bajra/soybean or irrigated/rainfed cotton), soil pH (slightly acidic/neutral/slightly alkaline/moderately alkaline) and soil type (light/medium). The classifiers, selected for their good performance in the study [1], belong to several families: bagging (with several decision tree base classifiers), boosting (AdaBoost.M1), decision trees (J48, and the Decorate ensemble of J48 and recursive partitioning trees), K-nearest neighbors, neural networks (extreme learning machine with and without Gaussian kernel, multi-layer perceptron, probabilistic neural network and radial basis function neural network), random and rotation forests, rule-based classifiers (hybrid decision table-naive Bayes and RIPPER) and the Gaussian kernel support vector machine. In total, we apply 20 classifiers to 10 classification problems. The random forest achieves the best results for 6 of the 10 problems, AdaBoost.M1 is the best for 2 problems, and the support vector machine and decision table-naive Bayes are each the best for 1 problem.

Keywords: agriculture, soil type, soil fertility, crop recommendation, classification, random forest.
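
As an illustration of the kind of experiment summarized above, the following is a minimal sketch of evaluating one of the listed classifier families (K-nearest neighbors) on a two-class problem such as soil type (light/medium). Everything in it is an assumption made for illustration: the data are synthetic, the two features are hypothetical stand-ins for chemical measurements, and 5-fold cross-validation is an assumed evaluation protocol, since the abstract does not state the one actually used. The function names `knn_predict` and `cross_val_accuracy` are likewise hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training patterns closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, folds=5, k=3):
    """Average accuracy over `folds` disjoint train/test splits."""
    scores = []
    for f in range(folds):
        # Every `folds`-th pattern (offset f) is held out for testing.
        test = data[f::folds]
        train = [item for i, item in enumerate(data) if i % folds != f]
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)


# Synthetic stand-in for soil patterns (NOT the data set of this work):
# two hypothetical chemical measurements per pattern, 40 patterns per class.
rng = random.Random(0)
centers = {"light": (0.0, 0.0), "medium": (5.0, 5.0)}
data = [
    ((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
    for label, (cx, cy) in centers.items()
    for _ in range(40)
]
rng.shuffle(data)

accuracy = cross_val_accuracy(data)
print(f"5-fold accuracy of 3-NN on synthetic data: {accuracy:.3f}")
```

Repeating such an evaluation for each classifier and each classification problem, and then counting how often each classifier ranks first, would give a summary of the form reported in the abstract. The accuracy printed here reflects only the well-separated synthetic data and says nothing about real soil measurements.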