Identification of maize seed vigor based on hyperspectral imaging and deep learning

Abstract

Background

Seed vigor identification is critical to guaranteeing the quality and yield of maize. Although seeds with impaired vigor may germinate under normal conditions, planting under unfavorable conditions makes it difficult to produce healthy plants. Therefore, non-destructive and rapid detection of seed vigor using hyperspectral imaging (HSI) technology is crucial for improving crop production efficiency.

Methods

Hyperspectral images of maize seeds were acquired with an HSI system. The original spectra were preprocessed using Savitzky–Golay smoothing and multiplicative scatter correction, and the feature wavelengths were extracted using the successive projections algorithm (SPA). Discriminant models were constructed based on support vector machine (SVM), random forest (RF), artificial neural network (ANN), and a convolutional neural network with deformable convolution (CNN-DC).

Results

SVM, ANN, and CNN-DC discriminated well between maize seeds with different vigor levels, each with an accuracy above 70%. In SPA feature selection, the RMSE reached its minimum of 0.3406 when 49 variables were selected. The CNN-DC model outperformed the other models and reached the highest accuracy, 92.06%. This study demonstrates that deep learning (DL) combined with HSI has excellent potential for identifying seed vigor.

Conclusions

This study shows that the proposed method has excellent results for hyperspectral image data processing and can accurately identify maize seed vigor.

Background

Maize (Zea mays L.), one of the top three global food crops, has stronger environmental adaptability than conventional crops such as wheat and rice. Furthermore, corn is rich in nutrients and has a wide range of applications, such as production and industrial product manufacturing (Klopfenstein et al. 2013). Seed vigor is the capacity of a seed to germinate and develop into a robust seedling. It strongly affects the germination rate, uniformity, and disease resistance of the plant, making it a key indicator for assessing the quality of maize seed (Wang et al. 2020). Moreover, accurate identification of seed vigor before planting makes it possible to select high-quality seeds for cultivation, thereby improving the production and quality of corn and contributing to agricultural development and economic stability (Hao et al. 2020). It is difficult to accurately determine seed vigor using human senses (color, texture, and shape) (Qiu et al. 2018). Despite their reliability, traditional physiological and biochemical methods and germination tests require dedicated reagents, cause significant damage to the sample, and restrict the number of trials (Xu et al. 2018; Wang et al. 2021a). Emerging biotechnological approaches, such as protein electrophoresis and DNA molecular marker techniques, are costly and operationally demanding (Wang et al. 2022). Hence, developing a non-destructive and high-efficiency approach for seed viability identification has emerged as a popular research topic. Numerous researchers have worked on non-destructive inspection techniques for seeds, such as laser light scattering, hyperspectral imaging, machine vision, and nuclear magnetic resonance imaging. Among these, hyperspectral imaging (HSI) can acquire both spectral and image information from seeds and reflect their internal tissue structure and nutrient content.

Deep learning (DL) has become a popular research topic in data analysis in recent years, with successful applications in hyperspectral image processing (Ma et al. 2020). As a representative DL method, convolutional neural networks (CNNs) have found widespread application in the analysis of crop seed quality (Xia et al. 2019). Pang et al. (2020) evaluated the potential of the HSI technique (370.2–1042.3 nm) to identify and predict the viability of maize seeds, and a one-dimensional convolutional neural network (1D-CNN) achieved the best identification accuracy of 90.11%. Jin et al. (2022) utilized the near-infrared hyperspectral imaging (NIR-HSI) technique (900–1700 nm) to detect the viability and vigor of naturally aged seeds of three rice cultivars, and the CNN achieved the best classification result (more than 85%). Pang et al. (2021) used the HSI technique (370–1042 nm) to identify acacia seed vigor rapidly, and the CNN achieved an accuracy of more than 90%. Ma et al. (2020) effectively classified the viability of Japanese mustard seeds using the NIR-HSI technique (913–2519 nm), and the CNN's classification accuracy reached around 90%. In summary, these studies demonstrate the potential of CNNs for seed vigor detection and provide a useful reference for this work.

This paper uses HSI combined with CNN to recognize maize seed viability. It builds upon existing research and aims to delve deeper into the use of DL methods for recognizing the vitality of maize seeds (Xu et al. 2022). It also compares the recognition abilities of DL models and traditional machine learning models when processing large-scale, high-dimensional spectral feature information. The primary purpose of this research is to investigate the combination of HSI techniques and DL methods to identify the vigor of maize seeds. Its specific goals are to: (1) collect HSI data on corn seeds with different vigor levels and analyze their differential characteristics; (2) select a suitable preprocessing algorithm to decrease noise interference and enhance the validity of the spectral data; (3) reduce the dimensionality of the spectral features with a feature wavelength selection method and develop the seed vigor identification model; and (4) assess the performance of the models by comparing their results and determine the best discriminative model.

Methods

Maize seed samples

The maize cultivar selected for this trial was Zhengdan 958, and the test seeds were procured from the seed market (Haikou, Hainan). Agronomy experts carefully selected the seeds, separated them into seven groups, and then subjected them to artificial aging. One set was left without any treatment (healthy seeds), and the remaining six sets were subjected to thermal treatment in an artificial climate chamber at 50 °C and 80% relative humidity. After 1.5, 3, 4.5, 6, 7.5, and 9-h treatments, the samples were withdrawn and placed back at room temperature (26 °C). Next, 240 seeds of consistent size were randomly chosen from each of the seven sets for follow-up experiments (1680 samples in total).

Standard germination test

This study conducted a germination test on 40 randomly selected seeds based on International Seed Testing Association (ISTA) standards to verify the variations in the viability of different groups (Dadlani and Yadava 2023). The seeds were placed on sterilized filter paper and kept in a thermostatic incubator at 26 °C and 60% relative humidity. Throughout the seven days of germination, the experimenters inspected and recorded the seeds daily. Seeds with root lengths greater than 5 mm were considered vigorous and capable of normal germination (Qiu et al. 2018). Table 1 shows the germination test results for all seeds.

Table 1 Results of seed germination tests

Hyperspectral imaging system

This study applied the "GaiaSorter" hyperspectral imaging system to capture spectral image data from maize seed samples (Xu et al. 2022). The system consists of five units (Fig. 1). The "Image-λ" series hyperspectral camera (Imspector, SPECIM, Finland) includes an imaging spectrometer and CCD camera. The image resolution is 1392 × 1040, and the spectral resolution is 5 nm with 254 spectral bands from 900 to 1700 nm. The light source consisted of four 150 W halogen lamps (2900-ER + 9596-E, Illumination, USA) placed at a 45° vertical angle. The instrument was preheated by starting it 30 min before acquisition, and the sample seeds were placed endosperm side up when their hyperspectral images were collected.

Fig. 1 Hyperspectral imaging system

Hyperspectral image data acquisition

The obtained hyperspectral data include random noise and require black-and-white correction to reduce the effects of light source variations and dark currents (Esquerre et al. 2012). The camera lens was completely covered to acquire a black reference (\(I_{\text{d}}\)) with a reflectivity close to 0, and a white plastic plate with a reflectivity near 100% was used as the white reference (\(I_{\text{w}}\)). The raw image (\(I_{\text{r}}\)) is converted to a corrected image (\(R_{\text{c}}\)) according to Eq. 1 (Wang et al. 2021b).

$$R_{\text{c}} = \frac{I_{\text{r}} - I_{\text{d}}}{I_{\text{w}} - I_{\text{d}}} \tag{1}$$
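As an illustration of Eq. 1, the following is a minimal NumPy sketch of the black-and-white correction applied to a synthetic data cube. The function name `black_white_correct`, the `eps` guard against a zero denominator, and the cube dimensions are assumptions made for this example; the study itself performed this step in ENVI and MATLAB.

```python
import numpy as np

def black_white_correct(raw, dark, white, eps=1e-8):
    """Apply R_c = (I_r - I_d) / (I_w - I_d) element-wise (Eq. 1)."""
    raw = np.asarray(raw, dtype=np.float64)
    dark = np.asarray(dark, dtype=np.float64)
    white = np.asarray(white, dtype=np.float64)
    denom = white - dark
    # Assumption: replace near-zero denominators to avoid division by zero.
    denom = np.where(np.abs(denom) < eps, eps, denom)
    return (raw - dark) / denom

# Synthetic cube (rows x columns x bands); values are illustrative only.
rng = np.random.default_rng(0)
dark = np.full((4, 5, 226), 100.0)      # dark-current reference I_d
white = np.full((4, 5, 226), 4000.0)    # white reference I_w
true_reflectance = rng.uniform(0.1, 0.9, size=(4, 5, 226))
raw = dark + true_reflectance * (white - dark)   # simulated raw image I_r

corrected = black_white_correct(raw, dark, white)
print(corrected.shape, float(corrected.min()), float(corrected.max()))
```

Because the raw cube is simulated from a known reflectance, the corrected cube should recover that reflectance, which makes the sketch easy to check.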

To determine the region of interest (ROI), individual maize seeds were identified and extracted from the original images. This research selected 226 bands in the wavelength range of 959.3 nm to 1697.9 nm for analysis to minimize the impact of noise at both ends of the spectrum on viability identification. During spectral acquisition, the samples were susceptible to disturbances caused by factors such as stray light and seed structure. Therefore, this paper preprocessed the spectral data using Savitzky–Golay (SG) smoothing and multiplicative scatter correction (MSC). SG smoothing reduces random noise, and MSC corrects scatter-induced baseline and scaling differences, which improves the signal-to-noise ratio of the spectral data (Gerretzen et al. 2015).
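The two preprocessing steps can be sketched as follows, assuming spectra are stored as a samples × bands array. The SG window length (11) and polynomial order (2) are placeholder values, since the paper does not report them, and the mean spectrum is assumed as the MSC reference. The helper names `sg_smooth` and `msc` are hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter

def sg_smooth(spectra, window_length=11, polyorder=2):
    """Savitzky-Golay smoothing along the band axis (parameters are assumed)."""
    return savgol_filter(spectra, window_length=window_length,
                         polyorder=polyorder, axis=1)

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum s is regressed on the reference (s ~ slope * ref + intercept)
    and corrected as (s - intercept) / slope.
    """
    spectra = np.asarray(spectra, dtype=np.float64)
    if reference is None:
        reference = spectra.mean(axis=0)   # assumption: mean spectrum as reference
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(reference, s, deg=1)
        corrected[i] = (s - intercept) / slope
    return corrected

# Synthetic spectra: one base curve with per-sample gain, offset, and noise.
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0, 3, 226)) + 2.0
gains = rng.uniform(0.8, 1.2, size=(20, 1))
shifts = rng.uniform(-0.1, 0.1, size=(20, 1))
spectra = gains * base + shifts + rng.normal(0, 0.01, size=(20, 226))

preprocessed = msc(sg_smooth(spectra))
print(preprocessed.shape)
```

Smoothing is applied before MSC here only because that is the order in which the text names the two steps; the paper does not state the order explicitly.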

Multivariate data analysis

Traditional machine learning model

Support vector machine (SVM) is a supervised learning algorithm that aims to find an optimal hyperplane to classify data points into different categories (Zhang et al. 2021). This study uses a radial basis function (RBF) kernel for the SVM, with a gamma of 12 and a penalty factor of 100. The random forest (RF) is an ensemble learning algorithm that constructs each decision tree by randomly selecting features and samples to mitigate the risk of overfitting (You et al. 2020). In this paper, the maximum number of iterations for RF was set to 10 and the maximum depth to 6. An artificial neural network (ANN) is an algorithm that mimics the neural network of the human brain. It consists of multiple neurons that can learn to determine how much each input affects the output (Azarmdel et al. 2020). The proposed ANN architecture is 226-18-12-7 (Xu et al. 2022).
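For orientation, the reported hyperparameters could be expressed as below. This is only a sketch under several assumptions: scikit-learn is used for illustration (the paper names Python and TensorFlow but not the library behind each model), the RF "maximum number of iterations" is mapped to `n_estimators`, the ANN training settings (`max_iter`, `random_state`) are placeholders, and the data are synthetic rather than the maize spectra.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data: 7 classes, 226 spectral variables per sample.
rng = np.random.default_rng(0)
n_classes, n_per_class, n_bands = 7, 40, 226
centers = rng.normal(0.0, 1.0, size=(n_classes, n_bands))
X = np.vstack([c + 0.3 * rng.normal(size=(n_per_class, n_bands)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

models = {
    # RBF kernel, gamma = 12, penalty factor C = 100 (values from the text).
    "SVM": SVC(kernel="rbf", gamma=12, C=100),
    # Assumption: "maximum number of iterations" = number of trees.
    "RF": RandomForestClassifier(n_estimators=10, max_depth=6, random_state=0),
    # 226-18-12-7: input and output sizes are inferred from the data.
    "ANN": MLPClassifier(hidden_layer_sizes=(18, 12), max_iter=500, random_state=0),
}

for name, model in models.items():
    model.fit(X, y)
    print(name, "training accuracy:", model.score(X, y))
```

The printed training accuracies describe the synthetic data only and say nothing about the results in Table 2.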

Convolutional neural network model proposed

A convolutional neural network (CNN) is a deep learning algorithm that extracts features from images through convolutional and pooling layers and classifies them through fully connected layers, and it has become a novel tool for solving complex modeling tasks (Wang and Song 2023). This paper proposes a one-dimensional CNN architecture (CNN-DC) based on deformable convolutional structures for identifying corn seed vigor (Dai et al. 2017), as shown in Fig. 2. It contains several convolutional layers, pooling layers, batch normalization layers, dropout layers, and fully connected layers, together with a deformable convolution (DC) layer. The DC layer introduces offsets in the convolution operation to make the convolution kernels deformable. After the DC layer extracts features, the model uses convolutional, max-pooling, batch normalization, dropout, and other layers for further processing and optimization. Finally, the fully connected layer outputs the classification results. The model uses the "ELU" activation function and "He" initialization method and introduces L2 regularization to prevent overfitting. The training process uses cross-entropy as the loss function and the Adam algorithm for parameter updating.

Fig. 2 Architecture of CNN-DC model
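The idea behind a deformable convolution in one dimension can be illustrated with a short NumPy sketch: each kernel tap samples the input at its regular position plus a fractional offset, using linear interpolation. This is not the CNN-DC layer itself. In the real network the offsets are predicted by a learned layer, whereas here they are passed in directly, and the function name `deformable_conv1d` and the border clamping are assumptions of this example.

```python
import numpy as np

def deformable_conv1d(x, weights, offsets):
    """1D deformable convolution (valid padding, stride 1).

    x       : (L,) input signal
    weights : (K,) kernel taps
    offsets : (L - K + 1, K) fractional shift for every output position and tap
    y[i] = sum_k weights[k] * x(i + k + offsets[i, k]), with linear interpolation.
    """
    x = np.asarray(x, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    k = len(weights)
    n_out = len(x) - k + 1
    # Regular sampling grid of an ordinary convolution.
    base = np.arange(n_out)[:, None] + np.arange(k)[None, :]
    # Shifted sampling positions, clamped to the signal borders (assumption).
    pos = np.clip(base + offsets, 0, len(x) - 1)
    left = np.floor(pos).astype(int)
    right = np.minimum(left + 1, len(x) - 1)
    frac = pos - left
    sampled = (1.0 - frac) * x[left] + frac * x[right]
    return sampled @ weights

x = np.arange(10.0)
weights = np.array([1.0, 2.0, 3.0])

# Zero offsets reduce to an ordinary (cross-correlation style) convolution.
y_regular = deformable_conv1d(x, weights, np.zeros((8, 3)))
# A constant offset of +0.5 samples halfway between neighbouring points.
y_shifted = deformable_conv1d(x, weights, np.full((8, 3), 0.5))
print(y_regular)
print(y_shifted)
```

With all offsets set to zero the output matches a standard convolution, which shows that the deformable layer generalizes the ordinary one rather than replacing it.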

Software tools

Spectral extraction, preprocessing, and image segmentation of seed samples were implemented using ENVI 5.3 (NV5 Geospatial, Broomfield, USA) and MATLAB R2020a (MathWorks, Natick, USA). SVM, RF, ANN, and CNN-DC models were constructed in Python 3.8, running the TensorFlow framework on an NVIDIA GeForce RTX 3060 (GPU). Accuracy was used as the assessment metric for the models in this study (Wang et al. 2021b).

Results

Spectral characterization of seed vigor

Figure 3 shows the raw and averaged spectra of maize seeds with different viability levels in the 900–1700 nm wavelength range. Although the raw spectra overlapped heavily, the overall trend of the spectral profiles was similar. The spectral reflectance decreases sharply in the wavelength range of 1000–1500 nm, accompanied by the appearance of several distinct peaks and troughs. The spectral reflectance keeps increasing in the range of 1500–1700 nm. According to the available studies, spectra in the 900–1700 nm wavelength range reveal chemical information about components with oxygen, hydrogen, carbon, and nitrogen functional groups (Zhang et al. 2022). In particular, the absorption peaks around 1000 nm are associated with the O–H functional groups in water, those near 1200 nm with the C–H functional groups in carbohydrates, and those around 1500 nm with the N–H functional groups in proteins (Yang et al. 2017; Alhamdan and Atia 2017; Xu et al. 2020). In conclusion, maize seeds differ in spectral reflectance in several characteristic bands, and these differences can be used to discriminate seed viability.

Fig. 3 Maize seed spectra: a original, b average

Feature selection using successive projections algorithm

This study uses the successive projections algorithm (SPA) to select the characteristic wavelengths and thereby reduce the spectral dimensionality. The combination of variables corresponding to the minimum root mean square error of cross-validation (RMSECV) is selected as the set of characteristic wavelengths. Figure 4 shows the selection results of SPA. The RMSE reaches its minimum value of 0.3406 when the number of variables is 49.

Fig. 4 SPA algorithm to select feature wavelengths: a RMSE, b selected variables

This study utilized the SPA method to obtain 49 characteristic wavelengths from the preprocessed spectral data (226 variables), which accounted for 21.68% of the total wavelengths. They are 959, 973, 983, 987, 990, 993, 1014, 1038, 1068, 1072, 1075, 1078, 1082, 1092, 1105, 1115, 1119, 1126, 1136, 1139, 1142, 1156, 1166, 1169, 1173, 1183, 1199, 1206, 1269, 1272, 1282, 1286, 1292, 1309, 1328, 1335, 1341, 1374, 1407, 1417, 1423, 1452, 1459, 1468, 1523, 1542, 1571, 1574, 1698 nm.
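The projection step at the core of SPA can be sketched in NumPy as follows. Starting from one column, the algorithm repeatedly projects all columns onto the orthogonal complement of the last selected column and picks the one with the largest remaining norm, so that each new variable is minimally collinear with those already chosen. The sketch assumes a fixed starting variable and a fixed subset size. The cross-validated search over subset sizes (the RMSECV minimum described in the text) is omitted, and the function name `spa_select` and the synthetic data are hypothetical.

```python
import numpy as np

def spa_select(X, n_select, start=0):
    """Select n_select column indices of X by successive projections."""
    X = np.asarray(X, dtype=np.float64)
    Xp = X - X.mean(axis=0)          # column-centered working copy
    selected = [start]
    for _ in range(n_select - 1):
        v = Xp[:, selected[-1]].copy()
        # Remove from every column its component along the last selected column.
        Xp = Xp - np.outer(v, v @ Xp) / (v @ v)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0       # never pick the same variable twice
        selected.append(int(np.argmax(norms)))
    return selected

# Synthetic spectra: 100 samples x 226 variables (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 226))
selected = spa_select(X, n_select=49, start=0)
print(sorted(selected)[:10])
print(f"share of variables kept: {49 / 226:.2%}")
```

The last line reproduces the proportion quoted above for 49 of 226 variables.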

Vigor identification results of the spectral-based model

This study builds SVM, RF, ANN, and CNN-DC models based on feature wavelengths to evaluate the accuracy of the algorithms. It randomly partitions the dataset into a calibration set (1260) and a prediction set (420) in a 3:1 ratio and then uses a tenfold cross-validation approach to determine the mean accuracy as the final result. Figure 5 compares the accuracy and loss of the CNN-DC network on both the calibration and prediction sets. After only 50 epochs, the accuracy and loss reach 87.10% and 0.344, respectively, which indicates that HSI combined with DL can obtain good results for identifying maize seed vigor. In addition, the CNN-DC model only needs to process one-dimensional data, which makes it simple to train while still performing satisfactorily. As a result, the model can balance performance and hardware resource requirements, and it has promising prospects for practical production application and promotion.
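The partitioning described above can be sketched with NumPy index arrays. Two points are assumptions of this example rather than statements from the paper: the split is stratified by vigor class (the text only says "randomly"), and the tenfold cross-validation is run on the calibration set. The class size of 240 follows from 1680 samples in seven groups, and the helper names are hypothetical.

```python
import numpy as np

def stratified_split(labels, test_fraction, rng):
    """Split sample indices so every class keeps the same test fraction."""
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = int(round(test_fraction * len(idx)))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

def kfold_indices(n_samples, k, rng):
    """Return k (train, validation) index pairs over a shuffled range."""
    folds = np.array_split(rng.permutation(n_samples), k)
    return [(np.concatenate(folds[:i] + folds[i + 1:]), folds[i])
            for i in range(k)]

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(7), 240)    # 7 vigor groups, 1680 samples in total

cal_idx, pred_idx = stratified_split(labels, test_fraction=0.25, rng=rng)
folds = kfold_indices(len(cal_idx), k=10, rng=rng)
print(len(cal_idx), len(pred_idx), len(folds))
```

Averaging a model's validation accuracy over the ten folds would then give the mean accuracy that the text reports as the final result.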

Fig. 5 CNN-DC training process: a accuracy, b loss

The results of the discrimination of maize seeds with various vigor levels are shown in Table 2. SVM, ANN, and CNN-DC achieved relatively good results, with accuracy rates above 70%. Among them, CNN-DC achieved the best accuracy of 92.06%, exceeding all the other models. This indicates that CNN is more conducive to modeling spectral data and effectively utilizing feature information than traditional machine learning methods. The CNN-DC is effective in processing original high-dimensional data, which reflects the strong capability of deformable convolutional structures. The results show that the proposed network, by adding deformable convolutional operations to a CNN, can accommodate more complex feature shapes when identifying maize seed vigor.

Table 2 Results of maize seed vigor identification

Discussion

In the CNN-DC model, the DC layer learns sampling offsets for the convolution operation, which lets the convolution kernel adapt its sampling positions to the input feature map and thus enhances the model's ability to perceive local structures and represent input data. It is worth mentioning that seed aging is a complex physiological process. Existing studies have pointed out that in hot and humid environments, the activity of antioxidant enzymes within seeds decreases, resulting in a continuous accumulation of reactive oxygen species (ROS). This accumulation triggers lipid peroxidation, damage to protein synthesis, and DNA degradation, leading to seed inactivation (Wu et al. 2022). Seeds with low vigor cannot grow into robust seedlings when planted in the field, which reduces yield. As a result, the rapid and non-destructive characterization of corn seed vitality is significant for agricultural production.

The results for identifying spectral data based on characteristic wavelengths are satisfactory. The proposed method preserves the model's computational speed and does not damage the samples. The method could therefore be extended to seeds of different crops and could combine phenotypic and spectral feature information for identification. Furthermore, studies have shown that high-temperature and high-humidity climates may strongly affect seed vigor (Hao et al. 2020). Thus, it is essential to consider the influence of local climate on maize seed vigor when planning breeding in tropical regions. In future studies, it is necessary to establish the relationship between seed ROS content and non-destructive testing, which will more effectively help researchers clarify the physiological changes in seed vigor (Xing et al. 2023). In addition, the reliability of the spectral-based analysis needs further improvement, as does the development of portable, low-cost, hand-held seed quality inspection instruments.

Conclusions

This study used HSI technology and deep learning to identify the vigor of maize seeds. Maize seeds with different vigor levels were prepared by artificial aging, and their hyperspectral image data were then collected. The samples showed similar trends in spectral reflectance but differed at specific characteristic wavelengths. When comparing the recognition performances of SVM, RF, ANN, and CNN-DC on the spectral dataset, CNN-DC had the best discrimination rate (92.06%), which is superior to the other models. The findings demonstrated that the proposed approach has excellent results in hyperspectral image data processing and can accurately identify maize seed vigor. Future studies will distinguish maize seed vigor across various varieties, years, and geographies, establishing a comprehensive seed spectral image database.

Availability of data and materials

The authors do not have permission to share data.

Abbreviations

HSI: Hyperspectral imaging
DL: Deep learning
CNN: Convolutional neural network
1D-CNN: One-dimensional convolutional neural network
NIR-HSI: Near-infrared hyperspectral imaging
ISTA: International Seed Testing Association
ROI: Region of interest
SG: Savitzky–Golay
MSC: Multiplicative scatter correction
SVM: Support vector machine
RBF: Radial basis function
RF: Random forest
ANN: Artificial neural network
CNN-DC: CNN architecture based on a deformable convolution
SPA: Successive projections algorithm
RMSE: Root mean square error


Acknowledgements

Not applicable.

Funding

This study was supported by the National Key R&D Program of China (2023YFD2000400) and the National Talent Foundation Project of China (T2019136).

Author information

Contributions

RBY, PX, YFP, and DQC contributed to conceptualization; PX, LXF, and SMY contributed to investigation; RBY, PX, and LXF contributed to methodology; PX contributed to software; PX contributed to writing—original draft, data curation, and visualization; RBY, PX, and SMY contributed to validation; RBY, PX, YFP, DQC, and SMY contributed to formal analysis; RBY, PX, and LXF contributed to writing—review and editing; RBY contributed to resources, supervision, funding acquisition, and project administration. All authors reviewed the data and approved the final version of the manuscript.

Corresponding author

Correspondence to Ranbing Yang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors have declared no conflicts of interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Xu, P., Fu, L., Pan, Y. et al. Identification of maize seed vigor based on hyperspectral imaging and deep learning. Bull Natl Res Cent 48, 84 (2024). https://doi.org/10.1186/s42269-024-01239-6
