Machine Learning Applied to Logistics Decision Making: Improvements to the Soybean Seed Classification Process

de Oliveira Quadras, Djonathan Luiz; Cavalcante, Ian; Kück, Mirko; Mendes, Lúcio Galvão; Frazzon, Enzo Morosini

doi:10.3390/app131910904

¹ Graduate Program in Production Engineering, Federal University of Santa Catarina, Florianopolis 88040-970, SC, Brazil

² Neosilos, Curitiba 81280-340, PR, Brazil

³ Faculty of Production Engineering, University of Bremen, 28359 Bremen, Germany

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(19), 10904; https://doi.org/10.3390/app131910904

Submission received: 10 August 2023 / Revised: 25 September 2023 / Accepted: 26 September 2023 / Published: 30 September 2023

(This article belongs to the Special Issue Design and Optimization of Manufacturing Systems, 2nd Edition)

Abstract

Soybean seed classification is a relevant and time-consuming process for Brazilian agribusiness cooperatives. This activity can generate queues and waiting times that directly affect logistics costs. This is the reason why it is so important to properly allocate resources, considering the most relevant factors that can influence their performance. This paper aims to present an approach to predicting the average lead time and waiting queue time for the soybean seed classification process, which supports the decision regarding the number of workers and machines to be deployed in the process. The originality of the paper relies on the applied approach, which combines discrete event simulation with machine learning algorithms in a real-world applied case. The approach comprises three steps: data collection to structure the simulation scenarios; simulation runs to generate artificial historical data; and machine learning applications to predict lead and queuing times. As a result, various scenarios using the data generated by machine learning were simulated, making it possible to choose the one that generated the best trade-off between performance, investments, and operational costs. The approach can be adapted to support the solution of different logistic-related decision-making problems that combine human and equipment resources.

Keywords:

machine learning; discrete event simulation; agribusiness 4.0

1. Introduction

The fourth industrial revolution, so-called Industry 4.0, has changed several sectors of the economy. The data generated in every process, which used to be a differentiator, is now necessary to maintain competitiveness in the market [1]. Besides the industrial scenario, agribusiness can also take advantage of the new technologies. Genetic engineering of plants, nanotechnology, biometric sensing, electrical agricultural machinery, computer vision integrating robotics with artificial intelligence (AI), and blockchain are examples of emerging technologies that have the potential to address challenges in all stages of the agricultural chain: pre-field, in-field, and post-field [2]. Agriculture 4.0 is based on principles such as data-driven decision-making, connectivity of devices, and growth of productivity, with added goals of adaptation to climate change and reduction of food waste [3,4,5]. Digital transformation can help farmers forecast the weather, control pests, and control temperature and moisture, among countless other applications [6,7].

Capacity planning has been approached by examining capacity investment decisions with a multiperiod model of the optimal one-time processing and storage capacity investment [8], and by studying lot-sizing and pricing when supply and price-sensitive demand are uncertain [9]. Moreover, ref. [10] developed a capacity planning model for transport to estimate the number of locomotives and shifts, the number of bins, and the delays to harvesting operations resulting from harvesters waiting for bin deliveries. Nevertheless, machine learning algorithms could excel when facing decision-making problems based on historical data [11]. According to [12], “machine learning describes the capacity of systems to learn from problem-specific training data to automate the process of analytical model building and solve associated tasks”. It can forecast, cluster, and classify data. Thus, it can generate knowledge that helps decision-makers with problems such as how many machines to buy, how many employees to hire, or how many products to manufacture in the next month [13].

Machine learning methods have been successfully applied to solve several problems in production and logistics, such as forecasting customer demands [14], predicting energy consumption [15,16], or making travel time predictions [17]. In agribusiness, applications of machine learning techniques include crop yield prediction [18], predicting soil properties [19], irrigation management [20], weather prediction [21], crop quality [22], harvesting [23], demand prediction [24], detecting vegetable diseases [25], and determining crop production [26]. For transportation, machine learning algorithms were applied for the vehicle routing problem [27,28,29] and to model the potential distribution of different plant species [30].

Simulation has also been applied to agribusiness: to understand factors influencing the long-term viability of an intermediated regional food supply network [31]; to analyze firms’ choices of spatial pricing policy [32]; to explore a hypothesis regarding the adoption of capabilities for entrepreneurship [33]; to model human behavior in food supply chains with asymmetric information about food quality and food safety [34]; to understand how technology, market dynamics, environmental change, and policy intervention affect a heterogeneous population of farm households and resources [35]; to analyze how adaptation affects the distribution of household food security and poverty under current climate and price variability [36]; to understand how government payments to enhance public values in social-ecological systems can contribute to the resilience of the system [37]; and to explore nutrient mitigation potentials of different policy instruments [38]. Additionally, some studies use simulation to outline different what-if scenarios, e.g., for farmer decision-making on crop choice, fertilizer, and pesticide usage [39] or to identify global change impacts on farmland abandonment and test policy and management options [40].

In 2020, Brazil’s agribusiness sector accounted for 26.7% of the country’s Gross Domestic Product (GDP). Within this sector, the farming segment alone contributed 7% to the nation’s overall economic output. Notably, soybean production has taken center stage in the modernization of Brazilian agriculture over recent decades [41]. In 2007, the Brazilian Ministry of Agriculture, Livestock, and Supply (MAPA) established the standard governing the classification of soybean seeds [42]. The precision of this classification process is of the utmost importance, as it serves a dual purpose. Firstly, it ensures that soybean products maintain the highest quality standards. Secondly, it plays a crucial role in meeting food safety regulations [43]. These regulations are imperative for upholding product integrity and safeguarding consumer health. As a result, this task necessitates a comprehensive and dedicated approach to guarantee the accurate pricing of the highest-quality soybean seeds.

While a cooperative operates as a non-profit organization, its foremost objective remains the equitable distribution of benefits among its members, thus underscoring the importance of achieving operational excellence [44]. Agro-industrial cooperatives confront notable challenges in managing their operations. Firstly, resource constraints often curtail their capacity to invest in advanced inventory management systems or expand their workforce, leading to member dissatisfaction stemming from issues like stockouts, overstocking, and delayed service. These issues directly impact the cooperative’s competitiveness, resulting in financial inefficiencies and a diminished ability to attract new members. Secondly, the unpredictable nature of agribusiness, influenced by factors such as seasonal demand, climatic variability, or inefficient crop management practices, creates demand spikes that strain operations, leading to prolonged lead times and customer waiting queues. This, in turn, curtails the cooperative’s bargaining power. Consequently, it becomes imperative to embrace new technologies that streamline manual processes, thus reducing cycle times and enabling the cooperative to effectively address these challenges.

Despite the wide application opportunities for simulation and machine learning in agriculture, no papers were found that use simulation to generate process performance data as input for a machine learning technique [45]. In addition, no paper considered using machine learning outputs to support investment in process improvement. The present paper presents a novel approach combining computational simulation and machine learning techniques to enhance decision-making for agribusiness logistics. The approach forecasts the average lead time and waiting queue time to help decision-makers determine the best number of workers to be hired and machines to be installed for the soybean seed classification process. To the best of the authors’ knowledge, this study represents the first endeavor to employ such an approach for assisting in decision-making. In particular, the case displays a method to determine the best configuration for the process, i.e., how many machines to implement and how many workers to hire. The paper is structured as follows: Section 2 presents the soybean seed classification process and describes the applied methodology. Section 3 shows the results of the conducted case study. Section 4 concludes the paper.

2. Materials and Methods

2.1. Scenario Description

The soybean seed classification process addressed in this study is based on the real-life case of a Brazilian agribusiness cooperative located in the state of Paraná. This process is performed twice: at the arrival and departure of grains.

To ensure transparency among all stakeholders, including farms and cooperatives, the current process stores physically classified samples for up to three days before they are discarded. However, there are still weaknesses in the process, such as the manual input of classification results into the system. This raises concerns among managers, who report attempted fraud and bribery every year.

As shown in Figure 1, the system comprises the following stages: Trucks arrive with a soybean grain load and wait in a queue until they are served. Once served, the worker performs the “Liming” step, during which samples are taken from different parts of the truck. The samples proceed to the “Quartering” stage, where the sample is homogenized (i.e., grains collected from different parts of the truck are mixed to form a single sample). Subsequently, the samples are separated into two parts for two parallel processes: the first involves the “Impurities Classification”, while the second involves both “Moisture Measurement” and “Injury Classification”. The “Impurities Classification” stage analyzes the presence of contaminants in the sample, such as small stones, branches, leaves, and dust. It is performed using a machine. In the subsequent process, “Moisture Measurement” assesses the humidity level of the sample, also carried out by a machine. “Injury Classification” is the stage where soybean grains are manually inspected by workers to identify the quantity of damaged grains, rotten grains, and unripe grains. Finally, a report is generated summarizing all the analyses conducted, and the truck exits the system.

Human vision cannot reliably detect subtle injuries, and a proper “Injury Classification” step requires cutting the seed and observing its health, leading to extended processing times (averaging 10 min) and lengthy truck queue times. Currently, workers perform a superficial visual analysis (averaging 3 min), which is highly inefficient and can lead to substantial fines for misclassification. Moreover, on days with a high influx of trucks, the workers do not perform the “Injury Classification” step at all, claiming that this activity requires a significant amount of time, increasing the total lead time, and consequently resulting in long queues and waiting times.

Thus, the company chose to invest in intelligent methods for carrying out the task. It considered using a machine capable of performing the “Injury Classification” step efficiently using computer vision, a technique that researchers have used to identify varieties or defects in different types of grains such as soybeans [43,46,47], wheat [48,49], corn [50,51], rice [52], and many others. The machine’s current lead time is already shorter than the ideal analysis time when performed by a worker, and the machine supplier anticipates that new software updates will halve this lead time.

Then, considering the different lead times for the same step, it is necessary to understand how the whole system will be affected when including the machine and changing parameters. Therefore, an approach combining computational simulation and machine learning techniques is applied to forecast the average lead time and waiting queue time to help decision-makers.

2.2. Methodological Procedure

The methodological procedure was split into three steps, as presented in Figure 2.

The study’s initial phase was conducted through on-site visits to the cooperative, focusing on data collection, process comprehension, and time measurement through discussions with employees. Then, historical data was gathered on truck arrivals, waiting times, and lead times. To mitigate potential biases, special care was given to consulting with the staff and ensuring cooperative representation. Staff consultations were conducted separately from data collection to prevent external influences on individual employee responses. Additionally, data was collected from various sites within the cooperative to provide a more comprehensive and unbiased assessment of its performance, reducing the risk of drawing conclusions based on a single site’s characteristics.

Following the data collection phase, the study embarked on the development of multiple scenarios to support the simulation process. Given the innovative nature of the machine intended for injury classification, the absence of prior research or studies elucidating its logistical impact on system performance necessitated the generation of artificial data through simulation. Within this context, the cooperative outlined three options for staffing levels and four for potential machine quantities, encompassing 1, 3, or 5 dedicated workers and 0, 1, 3, or 5 machines available for acquisition. Subsequently, the central objective revolved around determining the workforce composition (1, 3, or 5 employees to be hired) and machine deployment (0, 1, 3, or 5 machines to be purchased). This consideration involved exploring various permutations of the number of machines ($m$) and workers ($w$) while ensuring that $m \leq w$. In situations where there were more workers than machines ($m < w$), it was assumed that workers could simultaneously participate in the analysis process alongside the machines, eliminating the necessity for idle waiting periods. Moreover, it was firmly established that each worker could manage a single truck at a time, with the stipulation that once a worker initiated service for a truck, the process would proceed without interruption.
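The worker and machine options described above can be enumerated programmatically. The following is a minimal sketch, assuming only the constraint that machines never outnumber workers; the names `WORKER_OPTIONS`, `MACHINE_OPTIONS`, and `feasible_configurations` are hypothetical and do not come from the study.

```python
from itertools import product

# Options reported by the cooperative (workers to hire, machines to buy).
WORKER_OPTIONS = (1, 3, 5)
MACHINE_OPTIONS = (0, 1, 3, 5)


def feasible_configurations(workers=WORKER_OPTIONS, machines=MACHINE_OPTIONS):
    """Return all (w, m) pairs in which machines do not outnumber workers."""
    return [(w, m) for w, m in product(workers, machines) if m <= w]


configs = feasible_configurations()
for w, m in configs:
    print(f"workers={w}, machines={m}")
```

Each pair would then be combined with the remaining simulation inputs (for example, processing times) to form the full set of scenarios.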

In the second step, a computer simulation model was designed to evaluate all the described scenarios. A discrete event simulation model was developed in AnyLogic University v.8.5.2, as presented in Figure 3. The truck arrivals were input according to the historical data since 2017 provided by the cooperative. The simulation can explore different scenarios mixing real and virtual data, generating useful results for decision-making [53].

The simulation inputs are presented in Table 1. “Workers” and “Machines” determine the quantity $w$ of workers and $m$ of machines available for the process. The values for both can be 1, 3, or 5 available workers/machines for future scenarios (as suggested by the decision-makers). “Worker Wait” is a Boolean parameter. When $m < w$, “Worker Wait” takes “True” if the worker waits for a machine to finish the task to use it or “False” if the worker executes the task in parallel. “Human Time” and “Machine Time” are the lead times for a human or a machine, respectively, to execute the task. “Truck Arrivals” are all the truck arrivals in the system.

The case of simulating different scenarios by varying the number of workers without including the machine was also considered. In this case, 0 machines and 1, 3, or 5 workers were considered. It is important to note that this scenario does not directly reflect the real-world situation. In the simulated scenario, it is established that the workers must perform the classification manually with a lead time of 195 or 615 s, which is not currently the case in the real world where the “Injury Classification” is not even performed.

Additionally, all the machines are shared by all workers, as the worker needs to move between each of the machines or workstations to carry out the activities and complete the process. Two KPIs were collected: (i) truck queue time and (ii) lead time. The simulation results for each scenario constituted the database utilized in the subsequent stage, involving the application of machine learning. Thus, simulation was employed to generate historical data based on what-if scenarios, producing a robust historical foundation that was used as labels for the machine learning algorithms.
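To illustrate how the two KPIs arise from a discrete event model, the sketch below simulates a heavily simplified version of the process in plain Python. It is not the AnyLogic model used in the study: it assumes a single service stage, first-come, first-served order, and a given service time per truck, and it takes lead time to mean the time from start to end of service. The function name `simulate_classification` and the example values are hypothetical.

```python
import heapq


def simulate_classification(arrivals, service_times, n_workers):
    """Serve trucks first-come, first-served with n_workers parallel workers.

    arrivals and service_times are per-truck values in seconds.
    Returns two lists (queue times, lead times), ordered by arrival.
    """
    # Min-heap holding the time at which each worker becomes free.
    free_at = [0.0] * n_workers
    heapq.heapify(free_at)
    queue_times, lead_times = [], []
    for arrival, service in sorted(zip(arrivals, service_times)):
        # The truck starts when it has arrived and the earliest worker is free.
        start = max(arrival, heapq.heappop(free_at))
        finish = start + service
        heapq.heappush(free_at, finish)
        queue_times.append(start - arrival)
        lead_times.append(finish - start)
    return queue_times, lead_times


# Made-up example: three trucks arriving one second apart, 10 s of service each.
one_worker = simulate_classification([0, 1, 2], [10, 10, 10], n_workers=1)
three_workers = simulate_classification([0, 1, 2], [10, 10, 10], n_workers=3)
```

In this toy example the queue times shrink from 0, 9, and 18 s with one worker to zero with three workers, while the lead time per truck is unchanged.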

Finally, in the third step, a machine learning method was applied. This step was considered to understand how the different scenarios would impact the system in the following periods. The forecasting step was divided into two components: (1) predicting truck queue times and (2) estimating lead times. Both processes followed the procedural steps outlined in Figure 4.

Table 2 presents the input variables, including “Workers”, “Machines”, “Worker Wait”, “Human Time”, and “Machine Time”, which align with the input variables utilized in the simulation phase. Additionally, we introduce “Day”, “Month”, and “Day of Week” as time-based parameters linked to truck arrivals dating back to 2017. The incorporation of these temporal variables allows the model to capture potential seasonal patterns and trends that have evolved, enhancing its ability to learn and adapt to time-related dynamics. The Labels are the simulated truck queue time and lead time resulting from the simulation, considering all scenarios described in Table 1.
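As an illustration of how one training row might be assembled from these inputs, the sketch below derives the calendar features from a date using the Python standard library. The function name `build_feature_row`, the dictionary keys, and the 0/1 encoding of “Worker Wait” are assumptions made for this example, not the study’s actual data schema.

```python
from datetime import date


def build_feature_row(day, workers, machines, worker_wait, human_time, machine_time):
    """Combine scenario inputs with calendar features derived from the date."""
    return {
        "workers": workers,
        "machines": machines,
        "worker_wait": int(worker_wait),  # Boolean encoded as 0/1
        "human_time": human_time,         # seconds
        "machine_time": machine_time,     # seconds
        "day": day.day,
        "month": day.month,
        "day_of_week": day.weekday(),     # Monday = 0, Sunday = 6
    }


row = build_feature_row(date(2023, 1, 2), workers=3, machines=3,
                        worker_wait=False, human_time=195, machine_time=120)
```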

Algorithm 1 presents the pseudocode for the machine learning step. Initially, data scaling was performed to ensure that features with differing scales or units exerted an equitable influence during model training. Next, the data was partitioned into training and testing sets. In the following phase, machine learning models were trained on the training sets and subsequently tested. The algorithms considered included Linear Regression [54], Random Forest [55], Gradient Boosting [56], and Decision Tree [30], as they are well established in the literature [57,58]. Model selection was based on the root mean squared error (RMSE), a metric that expresses prediction accuracy in the units of the target variable. Because the errors are squared before averaging, RMSE penalizes large errors more heavily than small ones, while treating overestimation and underestimation of the same magnitude equally [59]. Finally, the model with the smallest RMSE was chosen for forecasting future scenarios.
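For reference, the standard definition of the RMSE is given below. The symbols are notation introduced here rather than taken from the paper: $n$ is the number of test samples, $y_i$ the simulated value of the target (queue time or lead time), and $\hat{y}_i$ the corresponding model prediction.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
```

Since each error enters through its square, positive and negative deviations of the same size contribute identically to the metric.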

Algorithm 1 Machine learning pseudocode.

Inputs:
- features <- Dataframe
- scenarios <- Dataframe (dates for the first 4 months of 2023 only)
- labels <- “queue_time” (double) or “lead_time” (double)
- training_models <- [“Linear Regression”, “Random Forest”,
“Gradient Boosting”, “Decision Tree”]

Output:
- forecasted_data <- “queue_time” (double) or “lead_time” (double)

# Scale features and labels
scaler <- initialize scaler
features <- scaler(features)
labels <- scaler(labels)

# Split into test and training sets
feature_train, feature_test, label_train, label_test <- split(features, labels)

# Train models and keep the one with the smallest test RMSE
old_rmse <- infinity
for model in training_models:
  train(model, feature_train, label_train)
  new_rmse <- test(model, feature_test, label_test)
  if new_rmse < old_rmse:
   best_model <- model
   old_rmse <- new_rmse

# Forecast
forecasted_data <- best_model(scenarios)
return forecasted_data
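The selection loop of Algorithm 1 can be sketched in runnable form as follows. This is an illustration only: to stay dependency-free it uses two toy stand-in models (`MeanModel` and `OneFeatureLinearModel`, both hypothetical) instead of the four algorithms named in the paper, it omits the scaling step, and the data are made up. Any object exposing `fit` and `predict`, such as the regressors in scikit-learn, could be passed to `select_best_model` in the same way.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


class MeanModel:
    """Baseline that always predicts the training mean (toy stand-in)."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class OneFeatureLinearModel:
    """Ordinary least squares on the first feature only (toy stand-in)."""

    def fit(self, X, y):
        xs = [row[0] for row in X]
        x_bar, y_bar = sum(xs) / len(xs), sum(y) / len(y)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (t - y_bar) for x, t in zip(xs, y))
        self.slope_ = sxy / sxx if sxx else 0.0
        self.intercept_ = y_bar - self.slope_ * x_bar
        return self

    def predict(self, X):
        return [self.intercept_ + self.slope_ * row[0] for row in X]


def split(X, y, test_fraction=0.25, seed=0):
    """Shuffle indices reproducibly and split into training and test parts."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    pick = lambda data, ids: [data[i] for i in ids]
    return pick(X, train_idx), pick(X, test_idx), pick(y, train_idx), pick(y, test_idx)


def select_best_model(models, X, y):
    """Fit every candidate; return (name, model, rmse) with the smallest test RMSE."""
    X_train, X_test, y_train, y_test = split(X, y)
    best = (None, None, math.inf)
    for name, model in models.items():
        model.fit(X_train, y_train)
        score = rmse(y_test, model.predict(X_test))
        if score < best[2]:
            best = (name, model, score)
    return best


# Toy data: queue time falls linearly with the number of workers (made-up numbers).
X = [[w] for w in (1, 2, 3, 4, 5, 6, 7, 8)]
y = [50.0 - 5.0 * row[0] for row in X]
best_name, best_model, best_rmse = select_best_model(
    {"mean": MeanModel(), "linear": OneFeatureLinearModel()}, X, y
)
```

Initializing the best score to infinity guarantees that the first candidate is always accepted, which is the role an initial value for `old_rmse` plays in the pseudocode.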

Following model training and selection, the forecasting model was applied to all days during the first four months of 2023. Subsequently, for queue time, the total waiting time was aggregated, while for lead time, the average was computed for each month. Finally, for both targets, the monthly averages were calculated.
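The aggregation just described can be sketched as follows, assuming the daily forecasts are available as `(date, queue_time, lead_time)` tuples in minutes. The function name `summarize_forecasts` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def summarize_forecasts(daily):
    """daily: iterable of (date, queue_time, lead_time) tuples, times in minutes.

    Returns (mean of the monthly total queue times,
             mean of the monthly average lead times).
    """
    queue_by_month = defaultdict(float)
    lead_by_month = defaultdict(list)
    for day, queue_time, lead_time in daily:
        key = (day.year, day.month)
        queue_by_month[key] += queue_time      # total waiting time per month
        lead_by_month[key].append(lead_time)   # collected for the monthly mean
    monthly_lead = [mean(values) for values in lead_by_month.values()]
    return mean(list(queue_by_month.values())), mean(monthly_lead)


# Made-up daily forecasts for illustration only.
forecasts = [
    (date(2023, 1, 1), 10.0, 6.0),
    (date(2023, 1, 2), 20.0, 8.0),
    (date(2023, 2, 1), 30.0, 7.0),
]
avg_queue, avg_lead = summarize_forecasts(forecasts)
```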

3. Results and Discussion

Table 3 presents the results for all the scenarios presented in Table 2. All the times are in minutes, and the table is sorted by queue time, ascending.

The results show that the variation in the lead time is considerably lower compared to the queuing time. Its values predominantly varied from 6 to 8 min, with a few exceptions close to 15 min. The queue time, on the other hand, presents a high variation, returning values lower than 1 min or greater than 270 days.

Table 4 presents the computed statistics for the predicted queue times, including averages, maximum values, and standard deviations, for each of the scenarios over the projected period. An observed trend reveals that a reduction in the number of machines and workers correlates with decreased system responsiveness during historically high-demand periods. This trend is corroborated by the observed escalation in standard deviation values. Indeed, as the average queue time increases, both maximum values and deviations follow suit. These findings underscore the nuanced relationship between resource allocation and system responsiveness in the context of fluctuating demand scenarios.

In contrast, when analyzing the statistics related to lead time in Table 5, a minimal variation can be observed across all values. However, particular attention is drawn to scenarios 43 and 44. In these scenarios, where classifications are conducted without the aid of machines and workers are required to manually split grains in half, a significant increase in total processing time is evident, nearly doubling the results obtained in previous scenarios.

Table 4 and Table 5 present the results, while Table 6 offers a correlation map between input variables, averages, maximum values, and standard deviations. Concerning lead time, there is a strong positive correlation with machine processing time (coefficient of 0.82). Additionally, fewer machines result in longer average and maximum times, highlighting the significant impact of machine quantity on service speed. Other factors did not show significant correlations.
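A correlation entry of this kind can be computed as sketched below. The paper does not state which correlation coefficient was used; this example assumes the Pearson coefficient, and the input values are made up rather than taken from Table 6.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


machine_time = [60, 60, 120, 120]      # seconds, made-up scenario inputs
avg_lead_time = [6.0, 6.2, 7.1, 6.9]   # minutes, made-up outputs
r = pearson(machine_time, avg_lead_time)
```

A value close to 1 indicates that lead time rises almost linearly with machine processing time across scenarios, and a value near 0 indicates no linear relationship.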

Regarding queue time, the number of workers has a more substantial impact than machine quantity. Increasing the number of workers leads to shorter waiting times for trucks. Surprisingly, both machine and worker processing times have limited impacts on queue time, contradicting expectations of a more significant influence. These findings indicate that service capacity is a more critical factor in queue time than the speed at which services are performed.

Besides the discussion above, a final aspect needs to be explored. The machine currently has a lead time of 2 min. However, according to the supplier, the new version of the software can carry out the task in half that time (1 min). Considering that the workers are not analyzing in parallel and that the machine may be improved, Table 7 shows the best alternatives for both scenarios.

Based on Table 7, the best decision is to use 3 machines, as it requires less investment and has the same efficiency as 5 machines. Additionally, considering that trucks can wait in the queue and that the machine can be improved, the best number of workers is 3 as well. Then, it can be concluded that in the best scenario, when the machine is improved, the best arrangement is 3 workers to 3 machines. Nevertheless, if the machine is not improved, then the best arrangement is 5 workers to 3 machines.
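The reasoning behind this choice, preferring the smaller configuration when performance is equivalent, can be expressed as a simple selection rule. The sketch below is one possible formalization and not the procedure used in the study: the 5% tolerance, the tie-breaking order (fewest machines, then fewest workers), the function name `cheapest_equivalent`, and the queue times are all assumptions.

```python
def cheapest_equivalent(results, tolerance=0.05):
    """Pick the configuration with the fewest machines, then the fewest workers,
    among those whose queue time is within `tolerance` (relative) of the best.

    results: dict mapping (workers, machines) -> average queue time in minutes.
    """
    best_queue = min(results.values())
    threshold = best_queue * (1 + tolerance)
    candidates = [cfg for cfg, q in results.items() if q <= threshold]
    return min(candidates, key=lambda cfg: (cfg[1], cfg[0]))


# Made-up queue times, not the values reported in Table 7.
example = {(3, 3): 1.02, (5, 3): 1.01, (5, 5): 1.00, (3, 1): 9.50}
choice = cheapest_equivalent(example)
```

With these illustrative numbers, the rule selects 3 workers and 3 machines, because the configurations with more resources improve the queue time by less than the tolerance.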

4. Conclusions

This paper presented a novel approach combining Discrete Event Simulation and machine learning to enhance decision-making in logistics. A systematic three-step methodology was employed. The initial phase involved conducting on-site visits to the cooperative to gather relevant data. Subsequently, a simulation model was developed to generate data encompassing various scenarios. Finally, in the third step, a machine learning approach was applied to predict both the queue time of trucks and the lead time. This comprehensive methodology enabled a structured and analytical approach to tackle the problem at hand. This study aimed to understand the logistical impact of selecting different scenarios to support decision-makers in determining the best configuration for the process. Therefore, in the real case of a Brazilian cooperative, the best number of workers to be hired and machines to be allocated to the Soybean Seeds Classification Process was calculated. The results confirm the expected strong connection between the queue time and the lead time, i.e., the higher the lead time, the higher the queue time. In addition, the queue time is affected mainly by the system capacity (number of workers), while the lead time is affected by machine capacity.

With the integration of the new machine into the system, the process gains accuracy, ensuring that the results of “Injury Classification” are truthful, mitigating the possibility of fraud and bribery, thereby generating greater reliability for both the company and its clients. Additionally, with this suggested configuration, the machine’s implementation should not result in significant queue time, which would render its utilization impractical. As a result, the process becomes more efficient and effective.

The utilization of machine learning presents significant potential by facilitating decision-making. This data-driven approach empowers decision-makers to provide informed recommendations for achieving the optimal configuration of the process, thereby enhancing efficiency and productivity. Moreover, the combination of discrete event simulation with machine learning highlights the capacity to leverage diverse techniques in addressing real-world challenges. For theoretical implications, this is the first article, to the best of the authors’ knowledge, to employ a combined approach of simulation and machine learning to generate data that aids decision-making regarding investments in process improvements.

Regarding the contributions inherent in this work, the first relates directly to the field of agriculture. Against the backdrop of the growing importance of the efficiency of agricultural systems, a method is proposed that aims to better plan the allocation of critical resources—in this case, directly in the classification process—that can be used by other similar operations, considering other types of grains or products, reducing logistical costs, waiting times, and lead times.

In terms of practical implications, this work has contributed to the capacity planning process. In the approach proposed in this research, the company’s managers obtained improved forecast data for future scenarios, supporting better decision-making in line with the company’s operational objectives. Although the results cannot be generalized to other companies as this is a specific case, companies can adapt this method to make accurate decisions based on hypothetical scenarios, benefiting from a simple and low-cost implementation process. Additionally, the model could encompass more forecasting models or include different data for training.

As a limitation of this research, the costs of hiring workers and purchasing machinery were not analyzed. Furthermore, the proposed approach was not compared to traditional methods such as Little’s law, the funnel model, or factory physics. These, then, are some recommendations for future directions of this research. Furthermore, the development of a framework that enhances the applied method to achieve generalization for other cases is recommended.

Author Contributions

Conceptualization, D.L.d.O.Q., I.C. and E.M.F.; methodology, D.L.d.O.Q., M.K. and L.G.M.; software, D.L.d.O.Q., M.K. and E.M.F.; validation, D.L.d.O.Q., M.K., I.C. and L.G.M.; formal analysis, D.L.d.O.Q., M.K., I.C., E.M.F. and L.G.M.; investigation, D.L.d.O.Q. and I.C.; resources, D.L.d.O.Q.; writing—original draft preparation, D.L.d.O.Q., I.C. and L.G.M.; writing—review and editing, D.L.d.O.Q., M.K., I.C., E.M.F. and L.G.M.; visualization, D.L.d.O.Q. and M.K.; supervision, M.K. and E.M.F.; project administration, I.C. and E.M.F.; funding acquisition, I.C. and E.M.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the German Research Foundation (DFG), grant number FR 3658/4-1, and by the Brazilian Coordination for the Improvement of Higher Education Personnel (CAPES), grant number 88881.364431/2019-01, in the scope of the Collaborative Research Initiative on Smart Connected Manufacturing program and by the National Council for Scientific and Technological Development (CNPq), grant number 424195/2021-6.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. The soybean seed classification process.

Figure 2. Methodological process.

Figure 3. Process simulation in AnyLogic.

Figure 4. Forecasting process.

Table 1. Simulation inputs for different scenarios.

Parameter	Values
Workers	[1, 3, 5]
Machines	[0, 1, 3, 5]
Worker Wait	[False, True]
Human Time	[195, 615] (s)
Machine Time	[60, 120] (s)
Truck Arrivals	All from 2017 until 2022
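
The parameter levels in Table 1 can be expanded into a scenario grid. The sketch below is a minimal illustration, not the authors' implementation: the function name `build_scenarios` and the dictionary keys are hypothetical, and the collapsing rule for zero machines is an assumption based on Table 3, where rows with 0 machines show "-" for Worker Wait and Machine Time. The resulting grid is the full set of distinct combinations and need not match the list of scenarios actually reported.

```python
from itertools import product

# Parameter levels copied from Table 1 (times in seconds).
WORKERS = [1, 3, 5]
MACHINES = [0, 1, 3, 5]
WORKER_WAIT = [False, True]
HUMAN_TIME_S = [195, 615]
MACHINE_TIME_S = [60, 120]


def build_scenarios():
    """Enumerate distinct parameter combinations (hypothetical helper).

    Assumption: with zero machines, Worker Wait and Machine Time have no
    effect, so they are set to None and duplicate combinations are dropped.
    """
    seen = set()
    scenarios = []
    for w, m, wait, human_s, machine_s in product(
        WORKERS, MACHINES, WORKER_WAIT, HUMAN_TIME_S, MACHINE_TIME_S
    ):
        if m == 0:
            wait, machine_s = None, None
        key = (w, m, wait, human_s, machine_s)
        if key in seen:
            continue
        seen.add(key)
        scenarios.append(
            {
                "workers": w,
                "machines": m,
                "worker_wait": wait,
                # Convert seconds to minutes, the unit used in Table 3.
                "human_time_min": human_s / 60,
                "machine_time_min": None if machine_s is None else machine_s / 60,
            }
        )
    return scenarios


scenarios = build_scenarios()
print(len(scenarios))
```

Converting to minutes maps the Table 1 levels of 195 s and 615 s to the 3.25 and 10.25 values that appear in the Worker Time column of Table 3.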

Table 2. Machine learning features.

Parameter	Values
Workers	[1, 3, 5]
Machines	[0, 1, 3, 5]
Worker Wait	[False, True]
Human Time	[195, 615] (s)
Machine Time	[60, 120] (s)
Day	All from 2017 until 2022
Month	All from 2017 until 2022
Day of the Week	From Monday

Table 3. Forecasts (time in minutes).

Scenario	Workers	Machines	Worker Wait	Machine Time	Worker Time	Queue Time	Lead Time
1	5	5	No	1	3.25	0.62	6.12
2	5	3	No	1	3.25	0.64	6.12
3	5	5	Yes	1	10.25	0.64	6.11
4	5	3	Yes	1	10.25	0.65	6.12
5	5	3	No	1	10.25	0.66	6.11
6	5	3	Yes	1	3.25	0.66	6.12
7	5	5	Yes	1	3.25	0.67	6.10
8	5	5	No	1	10.25	0.67	6.11
9	5	5	Yes	2	3.25	1.27	7.11
10	5	5	No	2	10.25	1.28	7.11
11	5	5	No	2	3.25	1.28	7.13
12	5	3	No	2	3.25	1.30	7.12
13	5	5	Yes	2	10.25	1.31	7.12
14	5	3	Yes	2	3.25	1.31	7.12
15	5	3	Yes	2	10.25	1.33	7.12
16	5	3	No	2	10.25	1.36	7.14
17	5	1	Yes	1	10.25	1.49	6.35
18	5	1	Yes	1	3.25	1.58	6.35
19	5	1	No	1	10.25	1.59	6.32
20	5	1	No	1	3.25	1.60	6.35
21	5	0	-	-	3.25	3.20	8.37
22	3	3	No	1	3.25	17.54	6.12
23	3	3	Yes	1	3.25	17.59	6.12
24	3	3	Yes	1	10.25	17.59	6.11
25	3	3	No	1	10.25	17.85	6.11
26	3	1	Yes	1	3.25	25.71	6.25
27	3	1	No	1	3.25	25.78	6.26
28	3	1	Yes	1	10.25	25.90	6.24
29	3	1	No	1	10.25	26.09	6.26
30	3	3	No	2	3.25	54.14	7.12
31	3	3	Yes	2	10.25	55.14	7.13
32	3	3	No	2	10.25	55.39	7.13
33	3	3	Yes	2	3.25	55.47	7.11
34	5	1	Yes	2	10.25	67.81	7.98
35	5	1	No	2	10.25	68.47	7.98
36	5	1	No	2	3.25	68.64	7.96
37	5	1	Yes	2	3.25	68.89	7.95
38	3	1	Yes	2	3.25	87.93	7.43
39	3	1	Yes	2	10.25	88.47	7.43
40	3	1	No	2	10.25	89.30	7.44
41	3	1	No	2	3.25	89.85	7.43
42	3	0	-	-	3.25	140.45	8.36
43	5	0	-	-	10.25	218.99	15.37
44	3	0	-	-	10.25	1255.33	15.37
45	1	1	Yes	1	3.25	9249.24	6.13
46	1	1	No	1	3.25	9250.02	6.12
47	1	1	Yes	1	10.25	9254.78	6.13
48	1	1	No	1	10.25	9262.24	6.12
49	1	1	Yes	2	3.25	17806.45	7.11

Table 4. Statistics for queue time forecasting (time in minutes).

Scenarios	Average	Max	Standard Deviation
1, 3, 7, 8	0.1	27.0	0.9
2, 4, 5, 6	0.1	27.7	0.9
9, 10, 11, 13	0.4	59.4	3.0
12, 14, 15, 16	0.4	63.1	3.2
17, 18, 19, 20	0.6	101.6	5.0
21	1.6	160.4	9.6
22, 23, 24, 25	4.4	312.3	20.4
26, 27, 28, 29	5.6	356.8	24.1
30, 31, 32, 33	10.5	474.4	36.4
34, 35, 36, 37	13.5	557.4	45.9
38, 39, 40, 41	15.5	574.5	48.5
42	25.7	677.9	68.4
43	37.4	829.1	92.2
44	957.4	7266.5	1918.4
45, 46, 47, 48	2051.3	14307.7	3781.4
49	3428.9	23684.1	6000.7

Table 5. Statistics for lead time forecasting (time in minutes).

Scenarios	Average	Max	Standard Deviation
1, 3, 7, 8, 22, 23, 24, 25, 45, 46, 47, 48	6.2	7.0	0.6
2, 4, 5, 6	6.2	8.1	0.6
26, 27, 28, 29	6.4	9.7	0.7
17, 18, 19, 20	6.6	12.4	0.9
9, 10, 11, 13, 30, 31, 32, 33, 49	7.2	8.0	0.6
12, 14, 15, 16	7.2	10.0	0.6
38, 39, 40, 41	7.6	12.7	0.8
21, 42	8.4	9.2	0.7
34, 35, 36, 37	8.9	17.4	2.2
43, 44	15.4	16.2	0.7

Table 6. Correlation map.

Feature	Lead Time (Avg)	Lead Time (Max)	Lead Time (Std)	Queue Time (Avg)	Queue Time (Max)	Queue Time (Std)
Nº Workers	−0.05	0.25	0.29	−0.45	−0.59	−0.56
Nº Machines	−0.43	−0.54	−0.26	−0.29	−0.35	−0.33
Worker Wait	−0.24	−0.10	0.04	−0.14	−0.13	−0.14
Worker Time	0.19	0.12	0.00	0.11	0.09	0.10
Machine Time	0.82	0.35	0.28	0.11	0.12	0.10

Color scale of the map: −1.00, −0.50, 0.00, 0.50, 1.00.
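
A correlation between a feature and a time statistic can be computed as sketched below. This is an illustration under stated assumptions: the caption of Table 6 does not name the coefficient, so Pearson correlation is assumed here, the helper `pearson` is hypothetical, and the input is only a six-row subset of Table 3. The result is therefore not expected to reproduce the values in Table 6.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Illustrative subset of Table 3: (machine time, lead time), both in minutes,
# for scenarios 1, 9, 17, 22, 30 and 34.
rows = [(1, 6.12), (2, 7.11), (1, 6.35), (1, 6.12), (2, 7.12), (2, 7.98)]
machine_time = [r[0] for r in rows]
lead_time = [r[1] for r in rows]

r = pearson(machine_time, lead_time)
print(f"machine time vs. lead time: r = {r:.2f}")
```

On this subset the coefficient is strongly positive, which points in the same direction as the Machine Time row of Table 6: longer machine times go together with longer lead times.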

Table 7. Best alternatives.

Machine Time (min)	Workers	Machines	Avg. Queue Time (min)	Avg. Lead Time (min)
2	5	5	1.27	7.11
2	5	3	1.31	7.12
2	3	3	55.47	7.11
1	5	3	0.66	6.12
1	5	5	0.67	6.10
1	5	1	1.58	6.35
1	3	3	17.59	6.12

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
