Permanent pastures identification in Portugal using remote sensing and multi-level machine learning

Morais, Tiago G.; Domingos, Tiago; Falcão, João; Camacho, Manuel; Marques, Ana; Neves, Inês; Lopes, Hugo; Teixeira, Ricardo F. M.

doi:10.3389/frsen.2024.1459000

ORIGINAL RESEARCH article

Front. Remote Sens., 22 October 2024

Sec. Land Cover and Land Use Change

Volume 5 - 2024 | https://doi.org/10.3389/frsen.2024.1459000

Tiago G. Morais¹,²*

Tiago Domingos¹

João Falcão³

Manuel Camacho³

Ana Marques³

Inês Neves³

Hugo Lopes³

Ricardo F. M. Teixeira¹

¹MARETEC − Marine, Environment and Technology Centre, LARSyS, Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal
²VirtuaCrop, Lda., Lisbon, Portugal
³Instituto de Financiamento da Agricultura e Pescas (IFAP), Lisboa, Portugal

Introduction: The Common Agricultural Policy (CAP) is a vital policy framework implemented by the European Union to regulate and support agricultural production within member states. The Land Parcel Identification System (LPIS) is a key component that provides reliable land identification for administrative control procedures. On-the-spot checks (OTSC) are carried out to verify compliance with CAP requirements, typically relying on visual interpretation or field visits. However, the CAP is embracing advanced technologies to enhance its efficiency.

Methods: This study focuses on using Sentinel-2 time series data and a two-level approach involving recurrent neural networks (RNN) and convolutional neural networks (CNN) to accurately identify permanent pastures.

Results: In the first step, using RNN, the model achieved an accuracy of 68%, a precision of 36%, a recall of 97% and an F1-score of 52%. The high recall indicates the model’s ability to identify nearly all the true positive parcels (correctly identified permanent pasture parcels) and to minimize the false negative parcels (non-identified permanent pasture parcels), while the low precision reflects the difficulty in distinguishing between permanent pastures and other similar land covers (such as temporary pastures and shrublands). In the second step, it was possible to distinguish the permanent pasture parcels from the others. The obtained results improved significantly from the first to the second step. Using CNN, an accuracy of 93%, a precision of 89%, and a recall of 98% were achieved for the “Permanent pasture” class. The F1-score was 94%, indicating a balanced measure of the model’s performance.

Discussion: The integration of advanced technologies in the CAP’s control mechanisms, as demonstrated, has the potential to automate the verification of farmers’ declarations and subsequent subsidy payments.

1 Introduction

The Common Agricultural Policy (CAP) is a comprehensive policy framework implemented by the European Union to support and regulate agricultural production within its member states (Heyl et al., 2021). With a substantial budget, the CAP plays a crucial role in ensuring the stability and sustainability of Europe’s agricultural sector (Heyl et al., 2021; European Commission, 2024). A significant portion of this budget is managed and controlled through the Integrated Administration and Control System (IACS) (Commission, 2023). The IACS serves as the backbone of the CAP’s financial management and control mechanisms. It encompasses various tools and procedures to ensure compliance with the CAP’s rules and regulations. One of the key components of the IACS is the Land Parcel Identification System (LPIS) (European Court of Auditors, 2016). The LPIS acts as a reliable reference for land identification and serves as the basis for several administrative control procedures. Within the CAP’s control system, on-the-spot checks (OTSC) are carried out by the paying agency to verify the accuracy of the information provided by farmers and to ensure that they are complying with the CAP requirements (European Court of Auditors, 2016; European Commission, 2023). These checks are carried out on an annual basis but are limited to approximately 5% of all agricultural holdings. The selection process for these checks is typically based on visual interpretation of recent aerial or satellite images or through field visits. By using these methods, authorities can verify the accuracy of the declared agricultural areas and activities. However, the CAP is continuously evolving to adopt modern technologies that enhance its efficiency and cost-effectiveness (European Court of Auditors, 2022). In this regard, there is a growing recognition of the role that advanced technologies can play in improving the IACS. 
For instance, Copernicus Sentinel satellites and other Earth Observation (EO) data are increasingly utilized to strengthen the control mechanisms for area-based payments.

The application of remote sensing and machine learning techniques for the identification of permanent pastures poses some unique challenges compared to other land uses (Allen et al., 2018; Navarro et al., 2019). Permanent pastures, characterized by a diverse mix of grass species and varying phenological stages, exhibit high spectral and spatial heterogeneity (O’Mara, 2012). This complexity makes it more challenging to accurately classify and differentiate them from other land cover types using remote sensing data (Reinermann et al., 2020; Morais et al., 2021). Additionally, the dynamic nature of permanent pastures, which can undergo changes in vegetation density and species composition over time, further complicates their identification (Thrippleton et al., 2021). Furthermore, the spectral similarity between permanent pastures and other vegetation types, such as temporary crops or natural grasslands, can lead to misclassification (Ali et al., 2016; Amin et al., 2024).

Machine learning algorithms applied to Earth observation data, although powerful, require large and diverse training datasets to accurately classify land cover types (Tong et al., 2020; Tassi et al., 2021; Morais et al., 2022). To address these difficulties, the incorporation of time-series data from satellite bands and vegetation indices has been used to enhance the discrimination of permanent pastures from other crops (Zhu et al., 2016; Zhong et al., 2019; Vilar et al., 2020). Common machine learning models such as random forests and support vector machines have shown commendable performance in land cover classification tasks (Duro et al., 2012; Pflugmacher et al., 2019; Phiri et al., 2020), but they encounter limitations when handling the temporal dynamics inherent in time-series data (Sahu et al., 2023). For instance, when a random forest model is employed on a remotely-sensed monthly time series, each combination of month and band is treated as an independent input, so the ordering of the observations in time is not used. Conversely, recurrent neural networks (RNNs) capture the temporal dynamics and patterns present in time series data (Reichstein et al., 2019). Their architecture enables RNNs to learn and remember important information over extended periods, allowing them to capture patterns that occur across multiple time steps (Reichstein et al., 2019; Zhong et al., 2019). Another family of models is convolutional neural networks (CNNs) (LeCun et al., 1989; O’Shea and Nash, 2015), which, in contrast to random forests and support vector machines, incorporate spatial patterns that those models cannot adequately capture. CNNs are specifically designed to leverage spatial features and capture intricate patterns through convolutional layers, making them well-suited for tasks that require spatial analysis and recognition (Tong et al., 2020; Trenčanová et al., 2022).

By leveraging advancements in remote sensing and machine learning, the CAP’s IACS can more accurately assess policy implementation, relying less on costly human and computational resources (López-Andreu et al., 2022). The integration of these technologies offers the potential to streamline administrative control procedures, enhance cross-checking capabilities, and optimize on-the-spot checks by the paying agency, ultimately contributing to the overall efficiency and effectiveness of the CAP. For example, in mainland Portugal, an operational monitoring system for the identification and classification of annual crops has already been proposed in the context of CAP’s IACS (Navarro et al., 2021). This system utilizes Sentinel-2 data and machine learning algorithms, applying a hierarchical approach. Vizzari et al. (2024) developed a Random Forest approach to classify annual crops in the Lake Trasimeno area (Italy). Similarly, in the Navarra region (Spain), multiple approaches have also been proposed to validate annual crops from CAP declarations. González-Audícana et al. (2020) proposed the use of a Random Forest algorithm to classify crop types based on NDVI time series data. Sitokonstantinou et al. (2018) concluded that Random Forest was outperformed by Support Vector Machines in the classification of annual crops using Sentinel-2 time series data, utilizing individual bands and vegetation indices. Additionally, Campos-Taberner et al. (2021) found that, for the same task in the Valencia region (Spain), RNN outperformed random forest models. Papadopoulou et al. (2023) also found that neural network models performed better than Random Forest models in the classification of annual crops in the Prefecture of Serres region (Greece).

Our study is the first to propose a model for identifying permanent pastures in Portugal using machine learning and Sentinel-2 time series data, leveraging this technology for the region’s specific agricultural landscape. To achieve this, we adopted a two-step approach. In the first step, we employed a recurrent neural network (RNN) that utilized the monthly time series of Sentinel-2 data. For each parcel, we extracted the mean values of each band and vegetation indices for each date of the time series, which were then fed into the RNN. This allowed us to capture the temporal patterns and changes in vegetation over time. In the second step, we employed a convolutional neural network (CNN) that operated on the monthly time series data at the parcel level. By treating each monthly image as an input, we trained the CNN to analyse the spatial patterns within each parcel, specifically focusing on those introduced by landscape features, such as the presence and extent of shrubs. This two-level approach enabled us to leverage both the temporal and spatial information present in the Sentinel-2 data, enhancing the accuracy of permanent pasture identification.

2 Material and methods

2.1 Study area and sampling design

This research focused on mainland Portugal. Spanning approximately from 36.98° N to 42.14° N in latitude and 6.19° W to 9.53° W in longitude, it encompasses a wide range of geographical features and climatic conditions. Portugal’s unique location along the Atlantic Ocean and the Mediterranean Sea gives rise to a diverse climate across the country, influencing its natural landscapes and environmental characteristics. Mainland Portugal has several climatic regions, as classified by the Köppen climate classification system (Rubel and Kottek, 2010). The northern and central parts of the country predominantly fall under the Csb climate category, characterized by mild, wet winters and warm, dry summers. The southern coastal areas experience a Csa climate, marked by hot, dry summers and mild, relatively wet winters. In the interior regions, the transition from a Csa to a Csb climate occurs due to increased continentality, resulting in colder winters and greater temperature variations.

2.2 Land parcel information and remote sensing data

In Portugal, the Land Parcel Identification System (LPIS) is managed by the National Paying Agency for Agriculture and Fisheries (IFAP, 2021). The LPIS is updated annually and provides information on the geographic location and delimitation of each agricultural parcel, as well as the land use type declared by farmers during their Common Agricultural Policy (CAP) subsidy application. For this study, we considered the 2020, 2021, and 2022 versions of the LPIS provided by IFAP. The ground truth data used in this study was collected by IFAP through on-the-spot checks (OTSC) conducted each year (IFAP, personal communication).

One notable source of Earth Observation data is the Sentinel-2 satellite mission (ESA, 2015; ESA, 2023). Sentinel-2, developed by the European Space Agency (ESA), provides high-resolution, multispectral imagery of Earth’s surface. It consists of a constellation of two identical satellites, Sentinel-2A and Sentinel-2B, operating in a sun-synchronous orbit. Sentinel-2 carries a multispectral instrument with 13 spectral bands, ranging from the visible to the shortwave infrared spectrum. This wide spectral coverage enables detailed monitoring of various land and coastal areas, including vegetation health, land use, urban development, and water quality. The spatial resolution of Sentinel-2 imagery is impressive, with bands offering 10-m, 20-m, and 60-m resolutions, allowing for detailed analysis and mapping at different scales. The frequent revisit time of Sentinel-2, with a global coverage every 5 days, facilitates monitoring of rapidly changing phenomena such as vegetation growth, deforestation, and natural disasters (Morais et al., 2021; Venter and Sydenham, 2021; Dusseux et al., 2022). Furthermore, the free and open data policy associated with Sentinel-2 has democratized access to high-quality Earth observation data in Europe, enabling researchers, scientists, and policymakers around the world to benefit from its rich and diverse information (Ali et al., 2016).

Here, Sentinel-2 time series were used, considering nine spectral bands, i.e., B2 (blue), B3 (green), B4 (red), B5-B7 (red-edge), B8 (near infrared), and B11-B12 (short-wave infrared) (ESA, 2015) and one vegetation index, the Normalized Difference Vegetation Index (NDVI) (Rouse Jr et al., 1973). Bands 1 (Coastal Aerosol), 9 (Water Vapor), and 10 (Short-Wave Infrared - Cirrus) were not considered as they are primarily used for atmospheric correction and cloud detection, rather than being directly relevant to vegetation analysis or land surface monitoring in this context (ESA, 2015; Phiri et al., 2020). Thus, the selected spectral bands and NDVI reflect a methodical approach designed to capture a wide range of vegetation characteristics, from leaf structure and chlorophyll content to water stress and biomass, enabling a robust analysis of vegetation over time (Phiri et al., 2020).
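As an illustration of the vegetation index used here, the following minimal Python sketch computes NDVI from the near-infrared (B8) and red (B4) reflectances. The function name `ndvi` and the treatment of zero denominators as missing values are assumptions made for this example and are not taken from the processing chain used in the study.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    # Pixels where NIR + Red is zero (e.g. no-data) are returned as NaN
    # instead of raising a division warning (an assumption of this sketch).
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Hypothetical reflectances for three pixels: B8 (NIR) and B4 (red).
b8 = np.array([0.40, 0.30, 0.00])
b4 = np.array([0.10, 0.25, 0.00])
print(ndvi(b8, b4))
```

Dense green vegetation reflects strongly in the near infrared and absorbs red light, so higher values indicate more vigorous vegetation cover.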

Only Sentinel-2 images with less than 30% cloud cover were used. Gaps left by discarded cloudy images were filled by linear interpolation, following Inglada et al. (2015). Additionally, in the retained images, cloudy pixels were identified using the corresponding Sentinel-2 “quality_cloud_confidence” masks and were also filled by temporal linear interpolation. Inglada et al. (2015) demonstrated that temporal linear interpolation provided the best balance between accuracy and processing time for handling cloudy data. In mainland Portugal, some regions have long periods of cloud cover, which limits the use of remote sensing data such as Sentinel-2. For example, in 2018, for tile 29TNE, there were only 8 cloud-free Sentinel-2 images (cloud cover lower than 30%) between April and September, whereas in 2019 the number of cloud-free images increased to 13 (Navarro et al., 2021). To deal with this limitation, we followed the same approach as others (Navarro et al., 2019; Navarro et al., 2021; Catalão et al., 2022), i.e., linear interpolation between two cloud-free images. However, this introduces error in the data and consequently in the obtained results.
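The gap-filling idea can be sketched for a single pixel and band as follows. This is only an illustration under stated assumptions: the function name `fill_cloudy`, the use of acquisition day as the time axis, and the behaviour at the ends of the series (where `numpy.interp` holds the nearest cloud-free value constant) are choices made for this example, not details reported in the text.

```python
import numpy as np

def fill_cloudy(days, values, cloudy):
    """Replace cloudy observations by linear interpolation in time.

    days   : acquisition days (increasing), shape (n,)
    values : reflectance of one pixel and band, shape (n,)
    cloudy : boolean flags, True where the observation is cloudy
    """
    days = np.asarray(days, dtype=float)
    values = np.asarray(values, dtype=float)
    clear = ~np.asarray(cloudy, dtype=bool)
    if not clear.any():
        # Nothing to interpolate from: return an all-NaN series.
        return np.full(values.shape, np.nan)
    filled = values.copy()
    # np.interp interpolates linearly between the surrounding clear dates
    # and keeps the first/last clear value outside their range.
    filled[~clear] = np.interp(days[~clear], days[clear], values[clear])
    return filled

# Hypothetical series: the second observation is flagged as cloudy.
print(fill_cloudy([0, 10, 20, 30], [0.2, 9.9, 0.4, 0.5],
                  [False, True, False, False]))
```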

Monthly composite images were generated by integrating data from multiple time points within each month. This approach allowed for the extraction of representative information while minimizing the impact of short-term fluctuations caused by factors like weather events, cloud cover, atmospheric disturbances, or sensor noise. Monthly composites provide a more stable dataset for long-term monitoring, capturing essential trends while reducing the influence of temporary anomalies (Xu et al., 2019). We used the median to obtain the composite because it is more robust than the mean to outliers and extreme values, which can arise from residual cloud cover, atmospheric disturbances, or sensor noise. This ensures a more reliable and accurate representation of the typical conditions in each time period. The resulting composite images were subsequently employed for the identification and mapping of permanent pastures.
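A per-pixel median composite of this kind can be sketched in a few lines of NumPy. The function name `monthly_median`, the array layout (dates, rows, columns, bands), and the use of NaN to mark masked pixels are assumptions of this example rather than details given in the text.

```python
import numpy as np

def monthly_median(stack, months):
    """Per-pixel median composite for each calendar month.

    stack  : array of shape (n_dates, height, width, n_bands),
             with NaN where a pixel is masked (assumed convention)
    months : month number (1-12) of each acquisition, shape (n_dates,)
    Returns a dict mapping month -> (height, width, n_bands) composite.
    """
    stack = np.asarray(stack, dtype=float)
    months = np.asarray(months)
    composites = {}
    for month in np.unique(months):
        # nanmedian ignores masked (NaN) observations within the month.
        composites[int(month)] = np.nanmedian(stack[months == month], axis=0)
    return composites

# Hypothetical one-pixel, one-band series: the value 0.9 in January
# behaves like a residual-cloud outlier and does not affect the median.
stack = np.array([0.1, 0.2, 0.9, 0.5]).reshape(4, 1, 1, 1)
composites = monthly_median(stack, [1, 1, 1, 2])
print(composites[1][0, 0, 0], composites[2][0, 0, 0])
```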

Figure 1 displays the spatial distribution of the utilized parcels across mainland Portugal. A comprehensive dataset comprising a total of 153,883 parcels was used for this study.

Figure 1. Spatial distribution of the parcels used to identify permanent pastures in mainland Portugal.

2.3 Classification procedure and accuracy assessment

Figure 2 provides an overview of the procedure employed to identify permanent pasture parcels. This procedure consists of two main steps: the first step utilizes recurrent neural networks (RNN), while the second step employs a convolutional neural network (CNN).

Figure 2. Graphical representation of the process used in this work.

In the first step, the Sentinel-2 time series was transformed into a tabular format. For each parcel and month, the mean values of each band and the vegetation index were extracted from the time series. So, the data structure for each parcel was represented as 1 × 10 × 12, where 10 represents the number of bands (including individual bands and NDVI) and 12 denotes the number of months in a year. The RNN architecture incorporated multiple LSTM layers with decreasing units (256, 128, 64, 32, 16, and 8) and dropout layers with a rate of 0.2 after each LSTM layer. The size of these layers was chosen through an iterative procedure, optimizing for performance and generalization. Stacking LSTM layers with decreasing units allows the network to gradually compress information, making it easier for the model to focus on essential features while reducing overfitting risks. This hierarchical reduction is effective in many deep learning applications, particularly those involving time-series or sequence data (Smagulova and James, 2020). The addition of dropout layers after each LSTM layer serves as a regularization technique to mitigate overfitting, ensuring the network generalizes well to unseen data (Salehin and Kang, 2023). The architecture concluded with a dense layer comprising 16 units and a rectified linear activation function (ReLU), followed by a final dense layer with a sigmoid activation function. The model was compiled using the binary cross-entropy loss function, the Adam optimizer, and the accuracy metric. The ReLU activation function is commonly used in hidden layers of neural networks due to its simplicity and effectiveness. This helps prevent the vanishing gradient problem and allows the model to learn complex patterns (Vargas et al., 2021). Sigmoid is used in the final layer when binary classification is involved. 
It maps the output to a range between 0 and 1, making it suitable for predicting probabilities, particularly for binary outcomes (Dubey et al., 2022). The binary cross-entropy loss function is commonly used in binary classification tasks. It measures the difference between the predicted probability and the actual label, penalizing incorrect predictions (Ramos et al., 2018). Adam is an efficient optimization algorithm that adjusts learning rates adaptively based on first- and second-order moments of gradients (Dubey et al., 2022). The goal of this first step was to weed out parcels that, using the tabular format only, can be clearly discarded as not being pasture. Therefore, after training, in order to minimize the occurrence of false negative parcels (i.e., parcels that are permanent pasture but are incorrectly classified as non-pasture), the classification threshold was reduced to 0.20. Although this adjustment decreased the accuracy and precision of the model in the first step, it maximized the probability that all permanent pastures were correctly classified as such. This threshold of 0.20 was defined iteratively through validation experiments, where it was found to ensure that more than 95% of the permanent pastures were correctly identified.
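The first-step network described above could be sketched in Keras as follows. This is a minimal sketch, not the authors' code: the function name `build_rnn`, the input layout (12 monthly time steps with 10 features each), and the single-unit output layer are assumptions, while the layer sizes, dropout rate, activations, loss, optimizer, and metric follow the text.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_rnn(n_months=12, n_features=10):
    """Stacked LSTM classifier with decreasing units (sketch)."""
    units = [256, 128, 64, 32, 16, 8]
    model = keras.Sequential()
    # Assumed layout: one sequence of monthly feature vectors per parcel.
    model.add(keras.Input(shape=(n_months, n_features)))
    for i, n_units in enumerate(units):
        is_last = i == len(units) - 1
        # All but the last LSTM layer return the full sequence so that
        # the next LSTM layer still receives one vector per month.
        model.add(layers.LSTM(n_units, return_sequences=not is_last))
        model.add(layers.Dropout(0.2))
    model.add(layers.Dense(16, activation="relu"))
    # Single sigmoid unit (assumed) giving the probability of pasture.
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model

model = build_rnn()
model.summary()
```

After training, the predicted probabilities would be compared against 0.20 rather than the usual 0.5, for example `is_candidate = model.predict(x)[:, 0] >= 0.20`, where `x` is a hypothetical array of shape (parcels, 12, 10). Whether the comparison is inclusive is not stated in the text.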

Subsequently, only the parcels classified as potentially permanent pasture proceeded to the second step. Here, the Sentinel-2 data was employed as a time series of images. For each parcel, the Sentinel-2 images were clipped based on the parcel boundaries and divided into 16 × 16 pixel images with at least 40% of their area falling within the parcel boundaries. As a result, the data structure for each parcel becomes 1 × 12 × N × (16 × 16 × 10), where 12 represents the number of months in a year, N represents the number of 16 × 16 images clipped, and 10 represents the number of bands (including individual bands and NDVI). In this step, each image per parcel was treated independently. Thus, there is no temporal dependency between images of the same parcel.
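The tiling rule can be illustrated with the sketch below, which cuts one monthly image into 16 × 16 tiles and keeps those with at least 40% of their pixels inside the parcel. The function name `extract_patches`, the non-overlapping grid aligned to the top-left corner of the clipped image, and the dropping of incomplete edge tiles are assumptions of this example; the text does not specify how the grid was laid out.

```python
import numpy as np

def extract_patches(image, mask, size=16, min_inside=0.40):
    """Cut an image into size x size tiles that sufficiently overlap a parcel.

    image : array (height, width, channels) for one month
    mask  : boolean array (height, width), True inside the parcel
    Returns an array of shape (N, size, size, channels).
    """
    mask = np.asarray(mask, dtype=bool)
    height, width = mask.shape
    patches = []
    for row in range(0, height - size + 1, size):
        for col in range(0, width - size + 1, size):
            # Fraction of the tile's pixels that fall inside the parcel.
            inside = mask[row:row + size, col:col + size].mean()
            if inside >= min_inside:
                patches.append(image[row:row + size, col:col + size, :])
    if not patches:
        return np.empty((0, size, size, image.shape[2]), dtype=image.dtype)
    return np.stack(patches)

# Hypothetical 32 x 32 clip with 10 channels and an irregular parcel mask.
image = np.zeros((32, 32, 10))
mask = np.zeros((32, 32), dtype=bool)
mask[:, :16] = True       # left half fully inside the parcel
mask[:16, 16:23] = True   # top-right tile is 7/16 = 43.75% inside
print(extract_patches(image, mask).shape)
```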

The CNN model had the following architecture: it started with a 2D convolutional layer comprising 32 filters of size 2 × 2, accepting input images with 16 × 16 pixels and 11 channels. An activation function, ReLU, was applied to introduce non-linearity. Subsequently, a max pooling layer with a pool size of 2 × 2 was added to downsample the spatial dimensions. This process was repeated with two additional convolutional layers, with 32 and 64 filters of size 2 × 2, respectively, each followed by ReLU activation and max pooling. The resulting feature maps were then flattened into a 1D vector and passed through a dense layer with 64 units, followed by ReLU activation. To mitigate overfitting, a dropout layer with a rate of 0.5 was introduced. Finally, a dense layer with a single unit and sigmoid activation was appended to produce the final binary classification output. The model was compiled using the binary cross-entropy loss function and the Adam optimizer.
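To make the effect of the three convolution and pooling stages concrete, the sketch below traces the spatial size of the feature maps through the network. It assumes unpadded ("valid") convolutions with stride 1 and pooling with stride equal to the pool size, which are the Keras defaults; the text does not state the padding or strides, so the resulting sizes are illustrative only. The function name `trace_shapes` is hypothetical.

```python
def trace_shapes(side=16, filters=(32, 32, 64), kernel=2, pool=2):
    """Trace feature-map sizes through conv + max-pool stages.

    Assumes 'valid' convolutions with stride 1 and max pooling with
    stride equal to the pool size (not stated in the text).
    Returns the list of (layer, height, width, channels) and the
    length of the flattened vector fed to the dense layer.
    """
    shapes = []
    for n_filters in filters:
        side = side - kernel + 1   # unpadded convolution shrinks the map
        shapes.append(("conv", side, side, n_filters))
        side = side // pool        # pooling halves it, rounding down
        shapes.append(("pool", side, side, n_filters))
    flattened = side * side * filters[-1]
    return shapes, flattened

shapes, flattened = trace_shapes()
for layer in shapes:
    print(layer)
print("flattened length:", flattened)
```

Under these assumptions the 16 × 16 input shrinks to a 1 × 1 map with 64 channels before the dense layer, so the number of input channels affects only the weights of the first convolution.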

The accuracy assessment procedure was the same for both steps. Regarding model training, validation, and testing, the total dataset was randomly divided into three parts: 70% for training, 15% for validation, and 15% for testing. The training set was used to train the models, while the validation set provided an unbiased evaluation of the model’s performance and aided in fine-tuning the model’s hyperparameters. The test set was used after the model had been fully trained to assess its performance on completely unseen data. Multiple evaluation metrics were assessed, including the confusion matrix, overall accuracy, precision, recall, and the F1-score. The performance of the classification was evaluated using the output of the second step model.
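The random 70/15/15 partition can be sketched as below. The study reports using scikit-learn for data partition; this example uses plain NumPy instead to stay self-contained, and the function name `split_indices`, the seed, and the rounding of the subset sizes are assumptions.

```python
import numpy as np

def split_indices(n_samples, train=0.70, val=0.15, seed=0):
    """Randomly split sample indices into train/validation/test sets.

    The test set receives whatever remains after the training and
    validation fractions have been assigned.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train * n_samples))
    n_val = int(round(val * n_samples))
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])

# Using the number of parcels reported in the text.
train_idx, val_idx, test_idx = split_indices(153_883)
print(len(train_idx), len(val_idx), len(test_idx))
```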

All methods were implemented in Python 3.9.12, using multiple packages: NumPy 1.18.5 to handle all the data processing, scikit-learn 1.0.24 for data partition, Keras 2.9 to construct the neural network models, and TensorFlow 2.7 as the backend for Keras.

3 Results

3.1 Descriptive analysis of the parcel information and remote sensing data

In total, we utilized 153,883 unique parcels that underwent on-the-spot checks for this study. Approximately 30% of the total number of parcels were classified as permanent pasture, amounting to 46,687 parcels. These permanent pasture parcels are distributed across the study area, with a concentration predominantly observed in the southern part of mainland Portugal, specifically in the Alentejo region, constituting around 30% of the total. This is the region of the country with the highest overall pasture area.

Figure 3 demonstrates that there is minimal disparity in reflectance values between permanent pastures and other land cover types across all spectral bands. This phenomenon can be attributed to the temporal similarity shared by permanent pastures and other land cover classes, such as temporary pastures or shrublands. However, Bands 6, 7, 8, 11 and 12 exhibit subtle, but relevant, variations in reflectance values across the year. For instance, in Band 6 and Band 7, while the overall trends are closely aligned between permanent pastures and other land cover types, there are observable differences during the early part of the year, particularly between November and February. In Band 8, permanent pastures tend to display slightly higher reflectance values compared to other land cover classes, particularly in the early part of the year. In Band 11, for example, reflectance values for permanent pastures diverge slightly from other land cover types between June and September. Similarly, Band 12 shows a growing disparity between July and September, where permanent pastures exhibit higher reflectance values compared to other land cover classes.

Figure 3. Temporal variation of the used bands (individual and NDVI) from the Sentinel-2 time series. In each subplot, the blue and orange lines represent the averages for permanent pastures and other land classes, respectively. The range of variation for each is indicated by the dark and light blue bands. Oct - October; Nov - November; Dec - December; Jan - January; Feb - February; Mar - March; Apr - April; May - May; Jun - June; Jul - July; Ago - August.

3.2 Permanent pasture identification

Table 1 presents the confusion matrix of step one (using RNN). In this step, the classification threshold was reduced to 0.2 in an effort to minimize false negative cases. The model became more sensitive and captured a higher proportion of actual positive samples, i.e., those belonging to the permanent pasture class. This adjustment aimed to increase the recall, which indicates the model’s ability to correctly identify positive cases. For example, the model correctly identifies 97% of the actual permanent pasture samples (accuracy of 68%, a precision of 36% and an F1-score of 52%). However, this approach may also lead to a higher number of false positives (e.g., shrubland parcels), as the model becomes more prone to classifying samples as permanent pasture even if they belong to the other class. There were 13,634 non-pasture parcels misclassified as permanent pasture. Consequently, the precision of the model was affected, with an accuracy for the permanent pasture class equal to 46%. The precision was about 36%, indicating that out of all samples predicted as “Permanent pasture,” only 36% are actually permanent pasture. The F1-score, which considers both precision and recall, was 52% for the “Permanent pasture” class.
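The trade-off described above can be reproduced on a toy example. The labels and scores below are made up for illustration and are not data from the study; the function name `precision_recall_f1` and the inclusive comparison with the threshold are likewise assumptions.

```python
import numpy as np

def precision_recall_f1(y_true, probs, threshold):
    """Precision, recall and F1 of the positive class at a given threshold."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(probs, dtype=float) >= threshold
    tp = int(np.sum(y_true & y_pred))    # pasture predicted as pasture
    fp = int(np.sum(~y_true & y_pred))   # non-pasture predicted as pasture
    fn = int(np.sum(y_true & ~y_pred))   # pasture that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Made-up labels (1 = permanent pasture) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probs = [0.90, 0.60, 0.35, 0.25, 0.45, 0.30, 0.22, 0.10, 0.05, 0.15]
for threshold in (0.5, 0.2):
    p, r, f1 = precision_recall_f1(y_true, probs, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, "
          f"recall={r:.2f}, F1={f1:.2f}")
```

On these made-up scores, lowering the threshold from 0.5 to 0.2 raises recall from 0.50 to 1.00 while precision drops from 1.00 to 0.57. The same F1 formula applied to the reported precision of 36% and recall of 97% gives 2 × 0.36 × 0.97 / (0.36 + 0.97) ≈ 0.525, in line with the reported F1-score of 52%.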

Table 1. Confusion matrix on the test set for permanent pasture identification in step 1 (using recurrent neural networks).

In the OTSC conducted by IFAP, the actual land cover of all parcels was collected. Consequently, it became feasible to identify the specific true land cover of the parcels that were falsely classified as positives. These parcels primarily comprised temporary pastures (30%), shrublands (23%), and fallow land (8%).

Table 2 presents the confusion matrix of step two (using CNN). In this classification step the threshold was kept at 0.5. The model achieved an accuracy of 93%, which means that 93% of the samples were correctly classified. The precision for the permanent pasture class was 89%, indicating that out of all the samples predicted as permanent pasture, 89% of them were actually permanent pasture. The recall was 98% for the permanent pasture parcels. This means that 98% of the actual permanent pasture parcels were correctly identified by the model. The F1-score was 94%, indicating a balanced measure of the model’s performance, considering both false positives and false negatives. In this classification step the number of false negatives and false positives was considerably reduced.

Table 2. Confusion matrix on the test set for permanent pasture identification in step 2 (using convolutional neural networks).

Figure 4 presents an example of the application of the proposed procedure to identify permanent pastures at the parcel level. The area displayed is one of the regions in the test set (i.e., it was not used to train the model). This region was chosen because it is particularly challenging for the algorithm as it exhibits a high density of parcels. In this area, non-permanent pasture parcels (light green) are predominant. Although the identification procedure performed well in this area, its accuracy was lower than the general performance on the test set, achieving 71% compared to 93%. Some parcels were misclassified, particularly those where permanent pasture was not detected (indicated by pink parcels), representing one of the lowest accuracy cases in the country.

Figure 4. Example of the identified permanent pastures and other areas at the parcel level.

4 Discussion

In this paper, we leveraged the advances in Earth Observation (EO) data, specifically Sentinel-2 data, in conjunction with machine learning methods for the identification of permanent pastures in Portugal. To enhance the identification process, a two-step approach was employed. In the first step, an RNN was used to perform an initial identification of permanent pastures as well as similar land cover classes, such as temporary pastures and shrublands. In the second step, a CNN was employed to differentiate between permanent pastures and other land cover classes, achieving good performance, with all evaluation metrics exceeding 85%.

To our knowledge, this study is one of the few specifically focused on the identification and classification of ‘permanent pastures,’ distinguishing them from other pasture types such as annual pastures and shrub pastures. In contrast, most existing models primarily target annual crops, and when pastures are included in land cover classification schemes, they are often aggregated into broader categories. This distinction is crucial for agricultural monitoring and policy applications, as permanent pastures play a different ecological and agricultural role compared to other types of vegetation. When comparing our approach with Navarro et al. (2021), who also conducted a study in Portugal, their focus was on verifying farmer declarations under the CAP using Random Forest and Support Vector Machine models. Their model achieved about 97% compliance accuracy, but the classification covered only annual crops and did not include any type of pastures. Our results, in contrast, achieved an accuracy of 93% and a recall of 98% for permanent pastures. In comparison with Campos-Taberner et al. (2021), who also used neural networks, their bi-directional Long Short-Term Memory (Bi-LSTM) network achieved a high overall accuracy of 97.5% in classifying various land uses, including abandoned lands. Like Navarro et al. (2021), their study did not focus on separating different types of pastures. Finally, when comparing our study with models that use traditional machine learning techniques, such as those by Sitokonstantinou et al. (2018) and Vizzari et al. (2024), both of which employed Support Vector Machines and Random Forest classifiers, we observed that these studies often focused only on annual crops. While they achieved strong classification metrics (e.g., Sitokonstantinou reported a kappa coefficient of 0.87, and Vizzari reported accuracies up to 89%), these models did not differentiate permanent pastures from other pasture types.

The Sentinel-2 time series was used to identify permanent pastures in mainland Portugal. Sentinel-2 has a considerably finer spatial resolution, up to 10 m, than other satellites such as Landsat-8, which has a spatial resolution of 30 m (Zhong et al., 2019). However, in Portugal, parcel sizes vary greatly, and a considerable number of parcels are small or very small (less than 0.1 ha). Our model performed worst in areas with small parcels. In these cases, the spatial resolution of Sentinel-2 is too coarse to be applied effectively. To overcome this limitation, high-resolution satellite imagery such as Pleiades or GEOSat, which offer sub-meter spatial resolution, can be used as a viable alternative (Catalão et al., 2022). Nonetheless, acquiring such data incurs significant costs, which hinders their application in large areas like mainland Portugal. Furthermore, orthophoto maps derived from aerial images can also provide higher spatial resolution (Costa et al., 2020), which is essential for accurately identifying permanent pastures in very small parcels. However, orthophoto maps lack the temporal resolution necessary to monitor intra-annual (within-year) patterns, as they are available at most once a year, whereas satellites pass over the same region multiple times throughout the year. The use of super-resolved images may be one strategy for moving forward and combining high spatial and high temporal resolution.
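The resolution limit discussed above can be made concrete with simple arithmetic: the number of whole-pixel equivalents covering a parcel is its area divided by the pixel area. The sketch below is only illustrative; the function name is hypothetical, and the 0.5 m value is an assumed example of sub-meter imagery rather than the specification of any particular sensor.

```python
def pixels_per_parcel(area_ha: float, resolution_m: float) -> float:
    """Upper bound on the number of pixels covering a parcel of a given area."""
    area_m2 = area_ha * 10_000  # 1 ha = 10,000 m^2
    return area_m2 / resolution_m**2


parcel_area_ha = 0.1  # the "very small" parcel threshold mentioned in the text
for label, resolution_m in [
    ("Sentinel-2 (10 m)", 10.0),
    ("Landsat-8 (30 m)", 30.0),
    ("sub-meter imagery (assumed 0.5 m)", 0.5),
]:
    count = pixels_per_parcel(parcel_area_ha, resolution_m)
    print(f"{label}: about {count:.1f} pixels")
```

A 0.1 ha parcel therefore corresponds to roughly ten Sentinel-2 pixels at most, and in practice fewer usable ones, since pixels along the parcel boundary mix signal from neighbouring land covers.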

Few studies have focused specifically on identifying permanent pastures and grassland systems, as most research has concentrated on broader areas (Phiri et al., 2020; D’Andrimont et al., 2021). Previous investigations conducted in Portugal have primarily emphasized crops and broad agroforestry systems (Allen et al., 2018; Navarro et al., 2021). It is noteworthy that the accuracy achieved in the present study surpasses that typically reported in the literature for studies with similar objectives. For instance, Phiri et al. (2020) reviewed the use of Sentinel-2 satellite imagery for land cover and land use classification and found that the overall accuracy for object-based classification ranged between 61% and 98% among the reviewed papers. In comparison with average performances, the present study attains higher accuracy levels. In the context of Portugal, Allen et al. (2018) developed a model for land cover classification encompassing three municipalities in the Alentejo region, yielding an accuracy of 63% specifically for the identification of the agroforestry land cover class.

The procedure presented for identifying permanent pastures is a valuable tool for automating the verification of farmers’ declarations and the subsequent subsidy payments. However, additional validation is necessary to ensure its robustness and reliability. This validation should include testing the procedure in years other than those used in the present study (2019, 2020, and 2021). Moreover, it is important to acknowledge that the study area was limited to mainland Portugal. Further developments are required to cover the entire country, including the Madeira and Azores archipelagos. These regions have conditions that differ significantly from those found on the mainland (Gil and Abadi, 2015; Shrestha et al., 2019).

5 Conclusion

We have developed a novel model that integrates advanced remote sensing and machine learning techniques to significantly improve the precision of identifying permanent pasture parcels in Portugal. The key innovation of this study lies in the two-step approach, which leverages the strengths of both RNNs and CNNs in combination with Sentinel-2 time series data. In the first step, the RNN demonstrated its capacity to capture the temporal dynamics of the data, although it faced challenges in distinguishing between permanent pastures and similar land cover types. The critical novelty is the second step, where the CNN’s ability to capture spatial features led to a substantial improvement in classification performance, achieving 93% accuracy, 89% precision, 98% recall, and a 94% F1-score for the “Permanent pasture” class. This two-tiered model represents a novel approach to automated land cover classification within the CAP framework. It addresses the longstanding challenge of distinguishing permanent pastures from other land covers, and it can be applied to large areas or at country scale in a computationally efficient manner. Our method not only enhances the accuracy of pasture identification but also offers a scalable, cost-effective solution for integrating advanced technologies into the CAP’s control mechanisms, streamlining subsidy verification and supporting more efficient agricultural management.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

TM: Conceptualization, Formal Analysis, Methodology, Software, Visualization, Writing–original draft, Writing–review and editing. TD: Supervision, Writing–original draft, Writing–review and editing. JF: Conceptualization, Writing–original draft, Writing–review and editing. MC: Conceptualization, Writing–original draft, Writing–review and editing. AM: Conceptualization, Writing–original draft, Writing–review and editing. IN: Conceptualization, Writing–original draft, Writing–review and editing. HL: Conceptualization, Writing–original draft, Writing–review and editing. RT: Conceptualization, Methodology, Supervision, Writing–original draft, Writing–review and editing.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work was supported by Fundação para a Ciência e Tecnologia through project "GrassData - Development of algorithms for identification, monitoring, compliance checks and quantification of carbon sequestration in pastures" (DSAIPA/DS/0074/2019) and CEECIND/00365/2018 (R. Teixeira). The work was also supported by FCT/MCTES (PIDDAC) through projects UIDB/50009/2020, UIDP/50009/2020, and LA/P/0083/2020.

Conflict of interest

Author TM was employed by VirtuaCrop, Lda.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Ali, I., Cawkwell, F., Dwyer, E., Barrett, B., and Green, S. (2016). Satellite remote sensing of grasslands: from observation to management. J. Plant Ecol. 9, 649–671. doi:10.1093/jpe/rtw005