1 Introduction

Habitat loss and fragmentation caused by extensive changes in land cover are among the main drivers of biodiversity decline (Foley et al. 2005; Tscharntke et al. 2012). The different anthropogenic matrices resulting from fragmentation processes may both limit and promote the movement of organisms among remnant habitat patches (Pflüger and Balkenhol 2014; Spear et al. 2010; Taylor et al. 1993), therefore facilitating or constraining gene flow across the landscape (Storfer et al. 2007; Taylor et al. 1993). When the surrounding matrix limits the movement of individuals, gene flow between populations is reduced, resulting in increased genetic drift, inbreeding, and eventual loss of genetic diversity (Kennedy et al. 2013; Manel et al. 2003). These isolated populations may become more vulnerable to demographic variability, inbreeding depression, and environmental stochasticity (Manel et al. 2003), increasing their extinction probability. Social bees have particular biological attributes, including their sex-determination system and colony reproduction (Cook and Crozier 1995), making this group extremely sensitive to anthropogenic matrices (Goulson et al. 2010; Jaffé et al. 2019; Jha and Kremen 2013). Their haplodiploid sex-determination system results in lower effective population size and, consequently, reduced heterozygosity (Cook and Crozier 1995; Packer and Owen 2001). When the effective population size is low and inbreeding is high, the production of sterile diploid males increases, jeopardising the genetic diversity of bee colonies and the future evolutionary potential of their populations (Zayed 2009).

Stingless bees are critical pollinators for a wide range of wild plants and crops and are therefore of paramount importance to sustaining tropical biodiversity and food security (Grüter 2020; Slaa et al. 2006). In this diverse social bee group, colonies are perennial and composed of a few hundred to thousands of workers and a single once-mated queen (Grüter 2020; Michener 1974; Vollet-Neto et al. 2018). Daughter colonies are founded by fission, wherein the virgin queen and a swarm of workers leave the parental colony for a new nest site, mostly in tree cavities (Grüter 2020; Michener 1974). Since this is a gradual process, due to the long-term connection between parental and daughter colonies for nest resources, colony dispersal occurs over short distances and at low frequencies (Slaa 2006). Once a colony is established, it remains at the same nest site for several years since the egg-laying queen cannot fly, and queen replacement events are frequent (Vollet-Neto et al. 2018).

Due to these biological features, stingless bees may be much more sensitive to forest fragmentation than solitary bees (Brosi et al. 2008; Brosi 2009). Under the negative environmental impacts of human activities in the tropics, formerly continuous forests are fragmented and reduced to smaller and more widely spaced fragments, hampering bee movements from one fragment to another (Davis et al. 2010; Jha and Kremen 2013; Landaverde-González et al. 2017). This is especially true in landscapes where forests are composed of small, isolated fragments, such as those found in rainforests. The Atlantic Forest is one of the world’s most highly threatened biodiversity conservation hotspots (Myers et al. 2000) and has undergone continuous deforestation for more than five centuries through several economic cycles (Ribeiro et al. 2009). Despite the extreme difficulty of finding wild nests within rainforests, Atlantic Forest is an ideal system to study wild bee populations experiencing habitat fragmentation and degradation, mainly because more than 80% of the forest remnants are smaller than 50 ha, with a 1.4-km median distance from each other (Armenteras et al. 2017; Ribeiro et al. 2009).

Tetragonisca angustula, one of the most common and abundant stingless bee species of the Atlantic Forest, is widely distributed across the Neotropical region from southern Brazil (Camargo and Pedro 2013). Due to its tiny body size (4–5 mm length), T. angustula uses a wide variety of nesting sites, with different volumes and shapes of cavities, and it commonly founds new colonies in disturbed habitats (Silva et al. 2021; Slaa 2006), including urban environments where it is very successful (Velez-Ruiz et al. 2013). Although T. angustula has a wide geographical distribution and provides essential pollination services to many tropical flowering plants, few studies have assessed its genetic response in the face of environmental changes (Jaffé et al. 2016b; Francisco et al. 2017) to ensure its conservation. Indeed, landscape genomic approaches have not previously been used to evaluate how this stingless bee species responds to environmental changes. Furthermore, the use of landscape genetic tools to understand the role of resistance surface in wild stingless bee populations, enhancing or limiting gene flow, is still scarce (Jaffé et al. 2016ab; Landaverde-González et al. 2017).

In this context, relying on single nucleotide polymorphisms (SNPs) and landscape genomic tools, we investigated the genetic consequences of anthropogenic landscape changes for T. angustula. Specifically, we assessed (1) whether genetic diversity is influenced by forest cover and (2) whether gene flow responds to landscape resistance. Considering the biological attributes of our study species, we predicted that (1) reductions in forest amount would decrease the availability of nesting sites, resulting in small effective population sizes, and therefore tend to maintain lower genetic diversity in their populations, and (2) decrease in forest amount would result in low gene flow, due to the reduced connectivity among forested areas.

2 Materials and methods

2.1 Study area and bee sampling

We carried out this study in the Southeastern Atlantic Forest, in the Cantareira-Mantiqueira corridor region of São Paulo State, Brazil (Figure 1), in fragments embedded in a matrix of pasture, crops, forestry, and urban and degraded areas (Joly et al. 2014). To sample bees, we selected eight forest fragments over an area of 40 km2 (Figure 1), and we randomly searched for wild colonies of T. angustula and set trap nests (i.e., 280 sets of traps spaced at least 50 m apart) in tree trunks. The landscapes around the fragments consisted mainly of the agricultural matrix, pasture, and urban environments, and the forest amount around the forest fragments varied from 16 to 69% (when considering a landscape of 5 km radius). Between 2015 and 2017, we sampled 185 T. angustula workers from 46 nests (~4 workers/nest) and stored them in cryogenic tubes with absolute alcohol at –22 °C.

Figure 1.
figure 1

Original and remaining distribution of the Atlantic Forest throughout Brazil (a) and sampling sites in São Paulo State (b). Tetragonisca angustula, a native Neotropical stingless bee: worker (c); standing guards at the nest entrance tube (d); forager collecting nectar on the flower (e).

2.2 Genotyping and SNPs identification

We extracted genomic DNA from the whole insects, following the protocol described by Brito et al. (2009), and the amount of DNA was quantified in 1.5% agarose gels. Then, we prepared genomic genotyping-by-sequencing (GBS) libraries with two restriction enzymes, following the protocol described by Poland et al. (2012) (See Methods in the supplementary material). First, we used one “rare-cutter” enzyme (PstI) and one “common-cutter” enzyme (MspI) to digest the genomic DNA. After digestion, the fragmented DNA was ligated to adapters containing specific barcode sequences for each of 185 samples. The restriction-ligation products were then pooled, and the GBS library was enriched through PCR. We validated and quantified the libraries by qPCR in a real-time thermocycler model CFX 384 (BioRad), using a KAPA Library Quantification kit (KAPA Biosystems). Next, the GBS library was sequenced in an Illumina NextSeq 500 equipment with a 150-base pair single read-configuration, using a Mid Output v2 kit.

To detect SNPs markers, we used Stacks v.1.42 software (Catchen et al. 2011). The raw sequences were separated by samples according to unique barcodes, truncated to 140 bp, and screened for quality control by removing sequences with ambiguous bases and low quality, using the module process_radtags. As the T. angustula genome is not available, we used the ustacks module to detect loci. We used the following combinations of parameters: minimum sequencing depth (–m) ≥ 3, the maximum distance between stacks (–M) = 2, and maximum distance between primary and secondary sequences (–N) = 2. Next, we used the cstacks module to create a reference catalogue of common loci, allowing a maximum of 2 differences between stacks (–n) of different individuals. The module sstacks matched individual loci with those present in the Catalogue. Finally, we applied the “correction module” rxstacks module to eliminate loci with low probability values (–lnl_lim -10.0). Marker identification was performed with the population module, retaining only one SNP per sequenced tag, with a minimum depth of 3 × sequencing.

We used the R package r2vcftools (https://github.com/nspope/r2vcftools) — a wrapper for VCFtools (Danecek et al. 2011) — to perform additional quality control on the genotype data. First, we removed individuals with more than 30% missing data and repeated individuals, prioritising those that showed the lowest proportion of missing SNPs and retaining one bee per nest (n = 46 nests). Then, we filtered the data by depth (considering a minimum of 30 and a maximum of 220), linkage disequilibrium (L.D., r2 < 0.6), and Hardy–Weinberg equilibrium (HWE, p < 0.0001; see sample filtering script at: https://github.com/rojaff/r2vcftools_basics). To select a subset of markers that are likely neutral, we run a genome scan for selection based on the FST statistics to exclude all outlier loci. First, the FST statistics were computed from the ancestral allele frequencies using the snmf function from LEA package v2.0 (Frichot and François 2015), and FST values were converted into absolute z-scores. Next, the estimated z-scores were recalibrated before applying a test for neutrality at each locus by using the genomic inflation factor (λ) to consider the existence of population structure. Finally, to correct for multiple testing, a false discovery rate of q = 0.05 was applied (Benjamini and Hochberg 1995; François et al. 2016; see example script here http://membres-timc.imag.fr/Olivier.Francois/LEA/files/LEA_snmf.html). After excluding all outlier loci, 745 neutral and independent loci were used to perform the genetic analysis at the cluster and individual levels (Figure S1).

2.3 Genetic structure analyses

To assess population genetic structure, three complementary genetic clustering approaches were performed: snmf function, using the LEA package v2.0 (Frichot and François, 2015); Admixture models (Alexander et al. 2009); and discriminant analysis of principal components (DAPC), using the Adegenet package (Jombart 2008). Then, to estimate the genetic diversity of each genetic cluster, we used the GenDiv function from the R package r2vcftools, and we calculated observed heterozygosity (HO), expected heterozygosity (HE), nucleotide diversity (π), and inbreeding coefficient (FIS). Effective population size (Ne) employing the linkage disequilibrium method was estimated using NeEstimator 2.0.1, using a minimum allele frequency value of 0.05 (Do et al. 2014) and assuming single mating (for a review of reproductive biology in stingless bees, see Vollet-Neto et al. 2018). Additionally, we analyzed the fine-scale spatial genetic structure (SGS) among individuals. SGS represents the mean of relatedness (coefficient of coancestry; fij) between pairs of individuals for a series of 10 classes of geographic distances. These classes were defined by optimizing the proportion of individuals participating in each one. Confidence intervals were built for each distance class by permutating the individuals and generating random coancestries. The coefficient of coancestry was calculated using the estimator of Loiselle et al. (1995) using the SPAGeDi software (Hardy and Vekemans 2002).

To test the null hypothesis of migration drift equilibrium Tajima’s D was calculated and used as an indication of population expansion if negative and as an indication of population contraction if positive (Tajima 1989; see a sample script at: https://github.com/rojaff/TajimaD).

2.3.1 Objective 1: To assess whether genetic diversity is influenced by forest cover

To assess at what scale the forest amount influences genetic diversity, we calculated the percentage of forest cover surrounding each nest at different buffer sizes (radii of 200 m, 500 m, 750 m, and 1 km). By testing different buffer sizes, we could understand at what scale the genetic diversity is most influenced by forest cover (Millette and Keyghobadi 2015; Wiens 1989). Buffer sizes were considered the maximum foraging distance of T. angustula, estimated at 951 m (Araujo et al. 2004).

To estimate genetic diversity at the individual level, we calculated heterozygosity (H) using the package r2vcftools. We also estimated the inbreeding coefficient (f) with the same package as a measure of reduction in genetic diversity due to mating among relatives. Logit transformations were performed to normalise H. These parameters were modelled as a function of forest amount at different buffer sizes, using a linear mixed model by the lmer function of the nlme package (Pinheiro et al. 2014); the genetic clusters were included as random effects in the model. We generated univariate models and selected the best model, using the model selection approach and the model.sel function of the MuMIn package (Barton 2018). The sample‐size‐corrected Akaike information criterion (AICc) was used to determine the most plausible models (ΔAICc < 2) to explain the observed pattern (Zuur et al. 2009). We visually checked if model assumptions were met by plotting residuals versus fitted values and by checking for residual autocorrelation. To account for spatial autocorrelation, we then used the lme function from the nlme package (Pinheiro et al. 2014). We compared autocorrelation models for different correlation structures (no autocorrelation, linear, exponential, Gaussian, rational, and spherical quadratics). The models were selected by AICc values, using the model.sel function.

2.3.2 Objective 2: To assess whether gene flow responds to landscape resistance

To evaluate the effect of landscape features on gene flow, we assessed isolation by resistance (IBR). For this, we used the individual pairwise genetic relatedness as a proxy for gene flow and modelled it in resistance distances estimated from different resistance surfaces. Each resistance surface corresponded to the landscape features hypothesised to affect gene flow. For each resistance surface, a value of resistance was assigned for each pixel that represented how much that feature limited the movement of the species. The variables used to infer the landscape features influencing the T. angustula gene flow were: land cover (the surface cover on the ground, like vegetation, urban infrastructure, water, bare soil or other), forest cover (percentage of forest cover), terrain roughness (irregularity in elevation of a surface), elevation, and climatic variables. To avoid the use of highly correlated variables, we selected a set of orthogonal variables that explained most of the climatic variation across our study area by performing a principal component analysis (PCA) using the extracted values from all 19 WorldClim bioclimatic layers (retrieved from http://worldclim.org). All variables were standardised before performing the PCA. We then selected the three variables [temperature seasonality (66%), minimum temperature of the coldest month (20%), and precipitation in the wettest quarter (6%)] that showed the strongest correlation with the first, second and third PCA axes.

To evaluate the effect of elevation and terrain roughness on gene flow, we used a digital elevation map (DEM, retrieved from USGS Earth Explorer) and the terrain roughness index in QGIS 2.18.0 to build a roughness raster. We assumed that pixels with high values represented the highest resistance to gene flow for each bioclimatic, elevation, and roughness raster. We obtained a raster of forest cover (University of Maryland, http://earthenginepartners.appspot.com/science-2013-global-forest/download.html) in which each pixel contained a value for forest cover ranging from 0 to 100%. We thereby tested whether forest cover limits or promotes gene flow. We used the actual raster values to test for the forest in limiting gene flow. To test for forest-promoting gene flow, we used inverted forest-cover raster values by subtracting 100 from each pixel (Jaffé et al. 2016a, b).

The high‐resolution land-use map from Mapbiomas (the year 2016, http://mapbiomas.org) was cropped to the extent of the study area and reclassified into six classes: agriculture, annual and semi-perennial crops, forest, pasture, urban area, and water body. This classification allowed us to remove low-frequency classes (bare soil and forestry) and group others (for instance, annual and semi-perennial crops) since after cropping images, the representation of rare classes becomes evident and may interfere with the analysis (Turner and Gardner 2015). To avoid any bias when determining resistance values for land-use classes, we optimised a resistance surface for land cover using the R package ResistanceGA (Peterman et al. 2014). Based on the genetic algorithm and model-selection approach using MLPE models, ResistanceGA optimises the resistance surface using genetic distance data (here, we used Yang’s genetic relatedness among pairwise individuals). This analysis returns resistance values for each landscape feature used to build the resistance surface for land cover (Peterman et al. 2014, 2018).

We built a resistance-surface map representing the null model of isolation by distance, attributing 0.1 to each pixel from the land-use map. We cropped all rasters (resistance surfaces) to the extent of the study areas. We converted the resolution to 900 m (raster resolution obtained in WorldClim) to enable map comparison and variable influence. The projection applied to all rasters was the Universal Transverse Mercator (UTM), and spatial analyses were performed using the R package raster (Hijmans 2014). Finally, we used Circuitscape v.4.0 (McRae and Beier 2007) to estimate pairwise resistance distances between sampling locations for all resistance surfaces. Since the program does not accept zero resistance values, all rasters with zero values were replaced by 0.0001.

To determine the influence of landscape on gene flow, we modelled the response variable (Yang’s genetic relatedness among pairwise individuals) as a function of resistance distances obtained by Circuitscape for each explanatory variable (elevation, terrain roughness, temperature, precipitation, land cover, and forest cover). We tested for three different rasters representing forest cover (forest cover, forest cover inverted, and land cover) and used model selection to choose which would be included in the final model. Therefore, to determine which of the three rasters best explained the genetic relatedness, we performed a model-selection approach based on the Akaike information criterion (AIC), and we selected the variable with ΔAIC < 2 (Table S1). The variable that best explained forest cover was included in the global model. To test gene flow concerning landscape variables, we first tested models accounting and not accounting for genetic structure, using linear mixed-effect models (lmm) and generalised least-squares models (gls), respectively, in the R package nlme (Pinheiro et al. 2014). For both models, we implemented the MLPE correlation structure (https://github.com/nspope/corMLPE), and for the lmm models, we used genetic clusters as a random effect. Then, we used model-selection criteria to identify which model best explained the genetic relatedness. Since the lmm model considers the use of random effect (random =  ~ 1|genetic clusters), we used it to build the global model. All explanatory variables were standardised to allow comparisons among them. After creating the global model, we used the dredge function in the R package MuMIn v1.42.1 to build models with all combinations of variables (see sample script at: https://github.com/rojaff/dredge_mc). The set of best models (∆AIC ≤ 2) was compared to reduced models without predictor variables, using likelihood-ratio tests (LRT, α = 0.05), and all models were visually validated by plotting residual vs fitted values and by checking for residual autocorrelation.

3 Results

3.1 Genetic structure

We identified 3.573 SNPs in T. angustula, and after filtering for quality, depth, L.D., HWE, and candidate adaptive loci, we obtained 745 neutral loci from 46 colonies (1 bee/nest). Using different methods to determine population structure, we detected two genetic clusters (K = 2), one composed of 38 and the other of 8 individuals (Figure 2). HO, HE, and FIS estimates were similar in both genetic clusters, while π was higher in cluster 1. When estimating Ne for cluster 1 with the same number of individuals as cluster 2, the Ne result did not differ between groups (Table I). The values of Tajima’s D indicated a recent contraction in cluster 1, which was not different from zero 0 in cluster 2 (Table I). We found a positive spatial autocorrelation in pairwise genetic relatedness for up to 10 km, after which it became negative (Figure 3).

Figure 2.
figure 2

Tetragonisca angustula assigned to two genetic clusters, according to Admixture software, over a land-use map (Mapbiomas) in the Southeastern Atlantic Forest.

Figure 3.
figure 3

Autocorrelograms of the mean of the coefficient of coancestry between pairs of individuals of Tetragonisca angustula for different distance intervals. The continuous line shows the coefficient of coancestry and the dashed line the 95% confidence interval of the expected values for a pattern of random distribution of genotypes.

3.2 Effects of landscape features on genetic diversity and gene flow

While forest cover showed a significant and negative effect on the heterozygosity within a 500-m radius, we did not find a significant association between forest cover and inbreeding (Tables II and S2). Optimizing the land cover resistance surface revealed that forest was the class with the lowest resistance to gene flow, while urban environments had the highest resistance (1.00 and 2057.6, respectively; Table S4 and Figure S3). Besides that, we found that resistance distances estimated from climatic variables and elevation had the same effect on gene flow (Table III), suggesting that geographic distance was the simplest model to explain gene flow as much as the other resistance distance (Table S3). Although landscape features did not significantly affect genetic relatedness, the relationship between land use and relatedness showed a negative trend (Figure S2). To better assess this relationship, further studies with more sampling sites should be performed.

4 Discussion

This is the first study to investigate the genetic consequences of contemporary human‐altered landscapes for tropical social bees, sampled exclusively from wild nests and based on hundreds of independent genetic markers. Our results show that T. angustula individuals are structured into two genetic clusters, with moderate genetic diversity and low inbreeding levels. Furthermore, our analyses indicate fine-scale spatial genetic structure within a 10-km scale. Whereas forest amount affects the heterozygosity of T. angustula at a 500-m scale, inbreeding was not influenced by this land cover type. Finally, geographic distance explained gene flow as much as resistance distance based on climatic variables and elevation, suggesting that T. angustula can disperse across heterogeneous landscapes.

Consistent with previous genetic studies (Baitala et al. 2006; Francisco et al. 2014, 2017; Oliveira et al. 2004; Santiago et al. 2016; Stuchi et al. 2014), our data show that T. angustula has genetically structured populations, which are likely shaped by life-history traits, such as short-distance dispersal capabilities, female philopatry, habitat generalist and wide-diet-breadth, and local environmental features (Francisco et al. 2017; Lichtenberg et al. 2017; Silva et al. 2021). Our analyses reveal that populations across human-altered landscapes were assigned to two distinct genetic clusters. Still, they were similar in terms of effective population size (when considering the same sample size), inbreeding and heterozygosities. We found a positive Tajima’s D value in cluster 1, suggesting a sudden population contraction that may be linked to the loss of suitable habitat. On the other hand, Tajima’s D was not different from zero in cluster 2, resulting from a lack of power due to limited sampling. A significant positive spatial autocorrelation in genetic relatedness for up to 10 km goes against previous microsatellite-based studies for other stingless bee species (Jaffé et al. 2016b; Landaverde-González et al. 2017). These findings are important to determine the occurrence range of related T. angustula colonies and provide useful guidelines to avoid long-distance colony transportation and safeguard genetic composition and local adaptations of wild bee populations.

4.1 Objective 1: To assess whether genetic diversity is influenced by forest cover

Our results show that forest cover influences genetic diversity in T. angustula. However, unexpectedly, forest cover negatively affected the genetic diversity at a 500-m radius, and neither of the models tested was significant for other geographic scales nor for inbreeding. Despite the positive trend observed at greater scales, we did not find any effect, suggesting that our analysis at 500-m radius is underpowered, most likely due to the small sampling size. The decreasing genetic diversity at a smaller geographical scale is most likely due to the lower habitat amount and edge effects. Relationships between landscape context and population outcomes are inferred at the spatial scale most relevant to a particular species-landscape interaction (Jackson and Fahrig 2014). In this sense, in a multi-scale analysis for T. angustula, it might be that with a small amount of forest cover, the availability of a nesting sites is limited, and areas with reduced forest cover restrict colonisation events. Therefore, small-sized and/or low-quality habitats lead to more isolated populations with smaller effective sizes and reduced genetic diversity (Kelemen and Rehan 2021; Steffan-Dewenter and Schiele 2008; Zayed 2009). Thus, we infer that the quality of trees at a 500-m radius does not provide enough resources to maintain a large T. angustula population with high genetic diversity.

The similar effective population sizes of both genetic clusters suggest a recent colonisation by T. angustula, indicating that the populations did not have time to accumulate high genetic variability. The low inbreeding is presumably related to the limited dispersal capability of T. angustula (Araujo et al. 2004; Santos et al. 2016), but also might indicate the start of inbreeding depression.

If inbreeding is indeed starting to occur in these T. angustula populations, there will be profound implications for their conservation. Due to their life-history traits, small sized-populations may be more detrimental for stingless bees (Alves et al. 2011; Zayed 2009). The complementary sex determination system is one factor that makes the effects of small population size stronger in stingless bees, in which matched matings will result in increased production of unviable or sterile diploid males, imposing significant costs at either colony and population level (Zayed 2009; Cook and Crozier 1995). High diploid male production further reduces effective population size and increases population genetic load that can lead populations to be extinct within a few generations (Zayed 2009). Therefore, any landscape feature that may favour inbreeding avoidance in native eusocial bees would be selectively advantageous and an effective mechanism to break any potential extinction risk (Mueller et al. 2012).

Even though genetic diversity change might be a slow process, the time lag depends on the nature of landscape changes, species dispersal capacity and its effective population size (Kelemen and Rehan 2021). Through dispersal, regional population dynamics are linked to local patterns of diversity, and the temporal scale of the process determines the spatial scale of this linkage (Jackson and Fahrig, 2014). In this sense, the decreased genetic diversity observed in T. angustula may be a response to severe changes in the spatial pattern of forest cover over time, which have been intensified in recent decades (Armenteras et al. 2017), and we may now be observing just a part of this process.

Although previous studies on tropical bees (Jaffé et al. 2016a, b; Soro et al. 2017) did not detect any significant association between landscape features and genetic diversity parameters, in temperate regions, altitude significantly influenced allele richness of Bombus flavifrons and B. sylvicola populations (Koch et al. 2017), reinforcing the discussion about geographic differences explaining bee responses from tropical and temperate regions (Kelemen and Rehan 2021). These differences may be an artifact of historical differences in land-use since temperate regions have suffered deforestation centuries ago, while deforestation in the tropics began in the last century (William 2003). Thus, it is possible that these responses can be found in the current days in temperate population bees, while in tropical regions, these changes are only just starting to affect these species (Jaffé et al. 2019). However, the incidence of time-lag effects on the response of genetic diversity to habitat loss is nevertheless more difficult to assess since different metrics respond at different rates (Carvalho et al. 2019). In addition, the studied bees in temperate regions are mainly ground-nesting species and may have different responses to landscape changes.

4.2 Objective 2: To assess whether gene flow responds to landscape resistance

Contradicting our second hypothesis, we found that geographic distance explained gene flow in T. angustula as much as landscape resistance, suggesting that landscape features are not limiting bee movement across the study landscape. It is possible that both variables are highly correlated and then had the same effect on explaining gene flow. Due to low dispersal capacity of T. angustula (Araujo et al. 2004; Santos et al. 2016), geographic distance played an essential predictor of gene flow. Although it is well established that land cover changes threaten bee populations (Pots et al. 2010; Grab et al. 2019; Renauld et al. 2016; Warzecha et al. 2016), landscape resistance did not limit gene flow in T. angustula, showing that this pollinator can maintain gene flow across heterogeneous and fragmented landscapes, as long as the geographic distance is adequate to allow bee movement from one forest patch to another.

In bees, population genetic differentiation may be influenced by geographic distance, body size, and degree of social structure (López‐Uribe et al. 2019). Even though there is a controversy regarding body size and geographic distance (Jaffé et al. 2016b), earlier evidence showed that body size strongly impacts population genetic structure, where larger‐bodied bees exhibit lower genetic structure than smaller‐bodied bees (López‐Uribe et al. 2019). For example, due to its tiny body size, T. angustula cannot fly long distances, exhibiting lower levels of gene flow and, therefore, increased population genetic structure. Moreover, T. angustula is a habitat generalist stingless bee, founding its colonies in various nesting sites with different cavity sizes (Brosi et al. 2008; Slaa, 2006).

Our results highlight the importance of conserving T. angustula populations within complex heterogeneous landscapes, including agricultural lands. This bee species can be regarded as rescue pollinators in human-altered landscapes since it can survive in degraded habitats, as suggested for other stingless bees, such as Trigona spinipes (Jaffé et al. 2016a, b). This ability to persist in heavily altered landscapes allows it to maintain large effective population sizes, avoids inbreeding and promotes genetically diverse populations across landscapes (Allendorf et al. 2012; Jaffé et al. 2016a).

So far, different organisms have shown that their gene flow is influenced by various factors, including forest cover, land cover, elevation, roads, rivers, and climatic conditions. Responses vary significantly across species (Monteiro et al. 2019). Urban areas showed the highest resistance distance among the tested land cover classes, while forest cover had the lowest resistance distance to T. angustula. This might be probably due to low frequencies of urban area class in the sampled landscape biasing the power of analysis (Turner and Gardner 2015). Another strategy to clarify our findings would be sampling in urban areas to observe genetic patterns in urbanised populations. The urban area has been considered one of the key drivers of impact to reduced gene flow in other studied bees, mainly in solitary species (Dreier et al. 2014; Jha and Kremen 2013; López-Uribe et al. 2015). Elevation, roughness, precipitation and temperature did not explain any pattern of gene flow in T. angustula. This could be due to the size of our sampled area, where these features did not vary significantly, since it does not encompass the full distribution range of this bee species. However, elevation and temperature are essential features to investigate the potential influence of landscape on gene flow patterns in tropical native bees (Jaffé et al. 2016a, b, 2019) and in some bumble bees in temperate regions (Koch et al. 2017; Jackson et al. 2020).

Our results showed that forest cover had the lowest resistance to gene flow among T. angustula populations and that gene flow is limited by geographical distance. Therefore, forest loss may increase population differentiation due to habitat fragmentation processes that result in small, isolated populations, which may lead to two main biological implications.

First, the small effective number of breeding queens and males in the population and the high energy costs required to disperse through large impervious covers limit gene flow in a broader landscape context. Second, it is believed that the limited queen dispersal capacity prevents colonisation of new habitat patches within a few hundred meters (Francisco et al. 2017), and males play an important role in contributing to gene flow between populations over local geographical scales (Santos et al. 2016). Third, even though the dispersal of T. angustula males is well studied, whether virgin queens disperse as opportunistic queens is still unknown. In this case, if the dispersal distance of the queen is shorter as in males, and if the distance between the studied forest fragments increases, in the near future, we may even witness a halt in gene flow.

This is the fifth study with tropical wild stingless bees to address landscape components in order to determine patterns of genetic structure (Jaffé et al. 2016a, b2019); Landaverde-González et al. 2017). Consistent with previous genetic studies on T. angustula (Francisco et al. 2017; Jaffé et al. 2016a, b), our results show that isolation by resistance was not detected. Among 18 stingless bee species, the gene flow was limited by habitat loss only in the Melipona subnitida (Jaffé et al. 2016b, 2019). This species, unlike T. angustula, relies mainly on tree cavities (in some regions, they also nest in termite nests) to find its colonies, which constrains colony dispersal even more than habitat generalist stingless bee species.

Even though we had a smaller sample size than previous studies, which could have influenced the results on gene flow, the sample size of T. angustula used by Jaffé et al. (2016b) was 15 times larger than ours. Still, they did not find any evidence of isolation by resistance. Hence, the response to landscape resistance indicates to be species-specific. Importantly, although we sampled bees from wild nests, we cannot rule out the possibility that human-mediated nest movement changed the population’s genetic structure in the past. For example, Jaffé et al. (2016a, b) acknowledged anthropogenic nest transportation across large areas to enhance gene flow in T. angustula. By contrast, recent human land cover changes in temperate regions lead to a reduction in the gene flow of Bombus vosnesenskii (Jha 2015) and elevation may have limited gene flow in B. bifarius (Lozier et al. 2013).

Moreover, the gene flow of both bumblebee species was limited by environmental niche resistance (Jackson et al. 2018), bumblebees have large body size and found nests alone, that allows extensive range dispersal. Signatures of selection by environment were observed in B. vancouverensis and M. subnitida in tropical regions, showing trends of genetic diversity loss and limitation of gene flow across landscapes for different pollinator species. In this sense, studies using this approach with T. angustula are necessary to further understand how landscape affects it at an evolutionary scale and ensure its conservation. Even though studies using landscape genomics on bees are increasing, they are still scant (e.g., Jackson et al. 2018, 2020; Jaffé et al. 2016a, b, 2019; Jha and Kremen 2013; Jha 2015; Landaverde-González et al. 2017, 2018; López-Uribe et al. 2015; Lozier et al. 2013; Suni and Brosi 2012), and our study reinforces that the influence of the landscape must be tracked to fully understand the processes that shape genetic diversity and gene flow of native pollinators.

5 Availability of data

Data will be made available from the corresponding author on reasonable request.

Table I Genetic diversity estimates for Tetragonisca angustula
Table II Effect of forest cover on individual heterozygosity (H) and inbreeding coefficient (f)
Table III Summary of model selection of all MLPE models for Tetragonisca angustula in the Atlantic Forest. All models had inter-individual genetic relatedness as the response variable, the different resistance distances as predictor variables, and genetic clusters as random effect