Remote sensing of three‐dimensional coral reef structure enhances predictive modeling of fish assemblages

LiDAR (light detection and ranging) allows for the quantification of three‐dimensional seascape structure, which is an important driver of coral reef communities. We hypothesized that three‐dimensional LiDAR‐derived covariables support more robust models of coral reef fish assemblages, compared to models using 2D environmental co variables. Predictive models of coral reef fish density, diversity, and biomass were developed using linear mixed effect models. We found that models containing combined 2D and 3D covariables outperformed models with only 3D covariables, followed by models containing only 2D covariables. Areas with greater three‐dimensional structure provide fish more refuge from predation and are crucial to identifying priority management locations that can potentially enhance reef resilience and recovery. Two‐dimensional seascape metrics alone do not adequately capture the elements of the seascape that drive reef fish assemblage characteristics, and the application of LiDAR data in this work serves to advance seascape ecology theory and practice in the third dimension.


Introduction
Coral reefs are among the most diverse and complex ecosystems in the world and provide valuable ecosystem services (Moberg and Folke 1999). Coral reefs worldwide are experiencing increased global stressors, from more frequent and severe mass bleaching events (Berkelmans et al. 2004;Hughes et al. 2017), as well as a suite of local anthropogenic stressors. It is essential to develop methods of assessing coral reef communities across broad geographic areas to support urgent management actions, without the time constraints and high costs associated with in situ field surveys. Remote sensing was first utilized to study coral reef ecosystems when the Landsat multispectral sensor was applied in the 1970s (Smith et al. 1975). Since the first application of remote sensing in a coral reef environment, the technology and capabilities of remote sensors have significantly progressed and are becoming increasingly important for mapping, monitoring and understanding global marine systems.
Mapping and identifying structurally complex habitat that represents important biodiversity hotspots have significant management implications for the coastal communities that rely on the many services provided by healthy reef fish populations (e.g., fisheries food security, tourism, culture, etc.). The recent advent of remote sensing technology that provides the ability to incorporate vertical habitat structure shows promise for the advancement of landscape analysis (Turner 2005) ecology. LiDAR (light detection and ranging) is an active remote sensor that allows for quantification and spatial analysis of structurally complex habitats (Lefsky et al. 2002). LiDAR has been applied in the marine environment to map coral reef structure (Storlazzi et al. 2003), measure reef habitat complexity (Brock et al. 2004) and was also found to demonstrate strong predictive relationships with several measures of fish assemblage structure .
Remotely sensed data can inform predictive maps and models of reef fish assemblages that highlight critical habitats of high conservation value and represent an essential step in advancing conservation planning, ocean zoning and marine reserve evaluation (Pittman et al. 2007b;Mellin et al. 2009). The two main types of remotely sensed data utilized in predictive modeling are (1) two-dimensional [2D] categorical maps (e.g., benthic habitat maps) and (2) three-dimensional [3D] continuously varying surfaces (e.g., digital elevation models). A majority of predictive modeling studies have been conducted using environmental variables based on 2D categorical maps of the marine environment Pittman et al. 2004Pittman et al. , 2007a. Until recently, this has limited seascape ecology to a two-dimensional science focused on the spatial patterning of planar surfaces which represents an oversimplification of the true variability in structural complexity that exists in nature (Lausch et al. 2015). However, there has been a shift toward applying continuously varying 3D surfaces to derive environmental variables that characterized physical complexity to support predictive modeling (Pittman et al. 2009;Knudby et al. 2010;Darling et al. 2017;Stamoulis et al. 2018).
Moving from 2D categorical maps to 3D surface morphometrics, which incorporate vertical structure and habitat complexity, shows great promise for the advancement of landscape pattern quantification (Turner 2005;McGarigal & Cushman 2005;Pittman et al. 2009;Wedding et al. 2011). Understanding the world from a 3D perspective offers great potential to advance our knowledge of landscape patterns and processes (Davies and Asner 2014) and can expand our understanding of the relationship between organisms and the geometry of the seascape. Pittman et al. (2007b) utilized both 2D and 3D data to explain faunal responses to seascape structure and found that when fine-scale 3D topographic complexity was integrated with the 2D seascape metrics, it became the single best predictor. This indicates that an integrated 2D and 3D seascape model (patch-mosaiccontinuum model) captures more of the ecologically meaningful seascape heterogeneity (Pittman et al. 2007b). There is a need to evaluate integrated 2D and 3D seascape models in order to and determine if these models support more robust predictive modeling of fish communities when compared to the classic 2D metric-based approach. Here, we hypothesized that three-dimensional LiDAR-derived covariables support more robust spatial predictive modeling of coral reef fish assemblage structure when coupled with models using 2D environmental variables derived from benthic habitat maps. To test this hypothesis, we applied statistical models that contain both sets of 2D and 3D covariables and compared them with models with only 2D covariables and only 3D covariables.

Study sites
Fish surveys were conducted at 625 transect study sites in the Main Hawaiian Islands between 2002 and 2005 ( Fig. 1). These locations represented a range of management regimes (Fig. 2), human population pressures, and shoreline exposures. In addition, these areas have a wide range of habitat types that ranged from extensive coral dominated, large boulder habitat, and seagrass beds. These study sites were chosen because they represent a variety of habitat complexities and habitat types. Furthermore, sites were randomly selected across the complete network of no-take marine-protected areas (marine life conservation districts) and sampling occurred both inside and outside of the protected areas.

Remotely sensed data
Benthic habitat maps at each study site were created by the NOAA Biogeography Branch using panchromatic sharpened multispectral imagery obtained from the Quickbird Satellite (DigitalGlobe Inc.). Benthic habitats were manually delineated from the imagery by photo interpretation at a scale of 1:6000 and a with a minimum mapping unit of 0.4 hectares (Coyne et al. 2003).
Bathymetric data were collected by the US Army Corps of Engineers using the Scanning Hydrographic Operational Airborne LiDAR Survey (SHOALS) system in the Main Hawaiian Islands between 1999 and 2000. The SHOALS system collected bathymetric and topographic soundings using infrared (1064 nm) and blue-green (532 nm) scanning laser pulses that generally collect data at a horizontal spot density of 4 m with a vertical accuracy of AE 20 cm and a horizontal accuracy of AE1.5 m (Irish and Lillycrop 1999). The depth range for the SHOALS sensor was approximately 1-40 m in locations with optimal water clarity.

Fish assemblage data
Fish assemblages were assessed using standard underwater visual belt transect survey methods (Brock 1954(Brock , 1982. within 2.5 m to either side of the centerline (125 m 2 transect area). The total length of fishes were estimated to the nearest centimeter. Field surveys were conducted using a stratified random sampling design (Friedlander et al. 2007a,b). The habitat strata [sand (UCS), colonized (CHB), macroalgae (MAC), and uncolonized hard bottom habitats (UCH)] were based on NOAA's Biogeography Branch benthic habitat maps (Coyne et al. 2003) (Fig. 2). This stratified random sampling methodology was used to guide the sampling design and account for variation in fish assemblages that may be influenced by the benthic habitat at the study site Christensen et al. 2003;Friedlander et al. 2003a,b).

Fish assemblage characteristics
Density, species diversity, and biomass were calculated to characterize the fish assemblage. Density represented the total number of fishes per square meter on transects. Species diversity was calculated from the Shannon-Weaver Diversity Index (Ludwig and Reynolds 1988): H' = S (p i ln p i ), where p i is the proportion of all individuals counted that were of species i. Length estimates of fishes from visual censuses were converted to weight using the following length-weight conversion: W = aSL b , where the parameters a and b are constants for the allometric growth equation and SL is the standard length in millimeters and W is weight in grams. Total length was converted to standard length by multiplying standard length to total lengthfitting parameters obtained from FishBase (www.fishbase. org). Length-weight fitting parameters were available for 150 species commonly observed on visual fish transects in Hawaii (Hawaii Cooperative Fishery Research Unit unpublished data). These data were supplemented with information from other published and web-based sources. When length-weight information did not exist for a given species, the parameters from similar bodied congeners were used. All biomass estimates were converted to metric tons per hectare (t ha À1 ) to facilitate comparisons with other studies in Hawaii.

Remotely sensed environmental variables
The benthic habitat maps allowed for characterization of the seascape surrounding a 100 m radius buffer around the transect locations. A 100 m buffer was chosen based on previous marine landscape ecology studies that have demonstrated strong relationships within this scale across the seascape Mellin et al. 2007;Pittman et al. 2007a,b). Within the 100 m buffer, the habitat richness and area (m 2 ) of (1) coral; (2) coralline algae; (3) macroalgae; (4) turf; and (5) sand/mud were calculated in MATLAB software. LiDAR data were interpolated using Inverse Distance Weighting in ArcGIS Spatial Analyst. Bathymetric grids were created at a grid cell size of 4 m. All subsequent geomorphic metrics were derived from these 4-m bathymetric grids using (1) bathymetric variance; (2) rugosity; (3) slope of slope; and (4) fractal dimension. Bathymetric variance was derived from the bathymetric grids using a Matlab moving window analysis. Rugosity was derived from the bathymetric grids using the Benthic Terrain Modeler for ArcGIS, where the raster cell values reflected the ratio of the seascape surface area to the planimetric area determined in a neighborhood analysis (Jenness 2003(Jenness , 2004Lundblad et al. 2006). The rugosity value represented the ratio of surface area to the planimetric area. Slope and slope of slope were derived from the bathymetric grids using the ArcGIS Spatial Analyst extension slope function, where the raster cell values represented the maximum rate of maximum slope change between neighboring cells and were calculated in degrees. Aspect was calculated using the ArcGIS Spatial Analyst extension aspect function. Fractional dimension was calculated in Matlab using a single pass technique (Eastman 1985). The formula is based on calculated slopes as follows:

Statistical analysis
Spatial predictive models of fish density, diversity and biomass were developed using linear mixed effect models ( Pinheiro and Bates 2009;Zuur et al. 2009). Data exploration was carried out following the protocol described in Zuur et al. (2010). Cleveland dot plots were used to determine whether there were outliers present in covariates. Collinearity was investigated between covariates using the Pearson correlation coefficients, multi-panel scatter plots and variance inflation factors (VIF) (Montgomery and Peck 1992). The initial statistical analysis indicated that there was a location effect for density, diversity and biomass. Therefore, we fit a linear mixed effect model with location and used as a random intercept. In order to allow for spatial correlation in the models, a spherical correlation structure was included. Parameters of these models (range and nugget) were selected from variogram residuals. Models for biomass indicated considerable heterogeneity, and therefore, the variance was modeled using variance covariates (Pinheiro and Bates 2009;Zuur et al. 2009).
The first set of predictive models was run using all 2D and 3D covariables combined (Table 1). Three models were run and included Model 1a (2D and 3D covariables), Model 1b (2D covariables only), and Model 1c (3D covariables only) ( Table 2). A likelihood ratio test was applied to compare Model 1a versus Model 1b. The second set of predictive models were run using only 2D covariables from the benthic habitat maps and the third using only 3D covariables. To test whether incorporating 3D LiDAR metrics enhances model performance, the three models were compared using likelihood ratio tests and Akaike information criteria (AIC) (Burnham and Anderson 2002). Model validation of the optimum model was applied by inspecting homogeneity (plotting residual vs. fitted values) and independence (variogram of residuals and plotting residuals vs. each covariate). Calculations were computed using R software, including the nlme package (Pinheiro et al. 2018).
The Cleveland dot plots indicated the need for a square root transformation of four covariates (e.g., rugosity, fractal dimension, area of coralline algae, variance in depth).
Based on VIFs and Pearson correlation coefficients, five covariates were removed from the set of covariates (e.g., fractal dimension, area of coralline algae, slope, variance in depth, and rugosity). To investigate whether 3D variables provided more robust predictive models of fish assemblage structure, the following three linear mixed effect models were applied and compared: Density ij is the square root transformed density for observation j at location i, a i is the random intercept for location i (it is assumed to be normally distributed with mean 0 and variance r 2 location), and e ij are the residuals (they are assumed to be normally distributed with mean 0 and variance r 2 , and they are also spatially correlated). A spherical correlation structure was adopted for the residuals; range and nugget were set to 400 m and 0.8, respectively.
A similar model was applied to biomass and diversity; range and nugget were 400 and 0.75 and 500 and 0.70, respectively. For biomass, we modeled the heterogeneity by using management, habitat as well as management and habitat combined as variance covariates for the residual variance r 2 . Where the first variance structure allowed for different spread per management regime, the second variance structure allowed for different spread per habitat, and the third variance structure models different spread per management and habitat combined. Models were  compared using the AIC. The null hypothesis in this comparison was whether the regression parameters (b1, b2, b3) from the 3D variables in Model 1a are equal to zero (H0: b1 = b2 = b3 = 0) and the alternative is that they were not equal to zero (HA: b1 6 ¼ b2 6 ¼ b3 6 ¼ 0).

Results
The results of the likelihood ratio test indicated that we could reject the null hypothesis at the 5% level for fish density (L = 28.28, d.f. = 3, P < 0.0001), fish biomass (L = 21.63, d.f. = 3, P < 0.0001), and fish diversity (L = 24.61, d.f. = 3, P < 0.0001). We also applied the likelihood ratio test to compare Model 1a versus Model 1c. The null hypothesis in this comparison was whether the regression parameters (b4, b5, b6, b7, b8) from the 2D variables in Model 1a are equal to zero and the alternative is that they are not equal to zero. Results indicated that there was weak evidence to reject this null hypothesis for fish density (L = 13.66, d.f. = 5, P = 0.02), biomass ((L = 16.96, d.f. = 5, P = 0.005), and diversity (L = 39.51, d.f. = 5, P < 0.0001), and the AIC values (Table 2, 3, 4) confirmed these results. Tables 5, 6, 7 show the estimated parameters, standard errors, t-values, and P-values of all the parameters in Model 1a. Model validation was applied on M1a, and there were no issues with the homogeneity of the residuals and independence, indicating that there was no need for a more complicated model (e.g., additive mixed effect model). Results for fish diversity indicate that the slope of slope (e.g., habitat complexity) was highly significant, aspect and sand are significant, and there was an effect of management and habitat. The effect of the slope of slope and aspect was positive, and sand was negative. Also, habitat levels of soft bottom habitat were considerably lower than colonized hard bottom.
The AIC for fish biomass indicated that it was necessary to model the heterogeneity using the management and habitat combined variance structure. Results for fish biomass demonstrate that depth, slope of slope, algae, turf, and sand was significant, and there was an effect of management and habitat. The effect of slope of slope was positive, and depth was negative. The effect of management indicates that closure to fishing results in higher  biomass of fishes. Results for fish diversity indicated that the slope of slope (e.g., habitat complexity) was highly significant, sand was significant, and there was an effect of management and habitat. The effect of the slope of slope and aspect was positive, and sand was negative.

Discussion
This is the first study to directly compare model performance and demonstrate that three-dimensional LiDARderived environmental covariables, in combination with 2D covariables, support more robust spatial predictive modeling of coral reef fish assemblage structure compared to simpler 2D models alone. The models containing combined 2D and 3D covariables outperformed models with only 3D covariables, followed by models containing only 2D covariables. Two-dimensional landscape metrics alone do not adequately capture the intricate patterns and organism relationships with multidimensional spatial heterogeneity that exist (McGarigal & Cushman 2005). A number of other studies have characterized the seascape structure in the third dimension using multibeam data (Lundblad et al. 2006;Young et al. 2010), coastal relief models (Dunn & Halpin 2009), satellite remote sensing (Purkis et al. 2008), and LiDAR (Brock et al. 2004;Brock et al. 2006;Kuffner et al. 2007;Pittman et al. 2007a,b;Purkis et al. 2008;). This movement toward 3D metrics represents a fundamentally different approach for landscape ecologists to measure the seascape in contrast to the classic approach of mapping habitat cover in 2D from a remotely sensed image.
Of the 3D covariables in our study, the slope of slope (e.g., a measure of habitat complexity) was significant in the models for fish density and biomass. The slope of slope represents the rate of change of slope in the bathymetric raster and characterized habitat complexity in a coral reef environment. Areas with extensive three-dimensional structure and habitat complexity provide fish more refuge from predation and these areas often have greater fish density, diversity, and biomass and can be used to define productive habitats for reef fisheries (Stamoulis et al. 2018). Pittman et al. (2009) similarly used LiDAR data for predictive modeling of coral reef fish species diversity and abundance at multiple spatial scales and found that slope of slope was the single best predictor. These findings are also similar to those of Knudby et al. (2010), who found that depth range (a proxy for habitat complexity) was the most critical covariable in a majority of the fish habitat models implemented during a comparative modeling study. Establishing predictive relationships between remotely sensed data and fish assemblage structure, at a regional scale in Hawaii, will lay the foundation for the scaling-up and expansion of predictive fisheries mapping to inform resource management decisions across broader geographic areas.
A spherical correlation structure was adopted and applied to the mixed-effect models in this study, to account for the spatial dependence in transects that were closer than 400 to 500 m. The spatial dependence identified in all of the fish assemblage data ranged from 400 m (density and biomass) to 500 m for diversity is an essential consideration for future sampling design. This finding suggests that future sampling of reef fish assemblage data should be conducted with a minimum of 500 m distance to assure that the samples are independent.
Management was also a significant factor in all of the models in our study. Management represents protection from fishing, and locations that were complete no-take marine reserves in our study harbored higher fish biomass values. In addition, the effect of management indicates that closure to fishing results in higher densities, biomass, and diversity of fishes. Many of these study sites were impacted by the 2014-2016 coral bleaching event, and the State of Hawai'i subsequently pledged through a commitment at the International Union for Conservation of Nature (IUCN) World Conservation Conference to effectively manage 30 % of its nearshore waters by 2030 (Jouffray et al. 2019;Stamoulis et al. 2018;Ige 2016). Creating marine protected areas in locations with complex 3D reef structure should be a management priority as complexity can provide some resilience to coral bleaching and potentially enhance the recovery of reef fish biomass ( McClanahan et al. 2011;Graham et al. 2015;MacNeil et al. 2015;Darling et al. 2017). The application of LiDAR data in this work serves to advance landscape ecology theory and application in the third dimension by demonstrating that more robust predictive models are derived from the integration of both 2D and 3D covariables. Future tropical and temperate reef fish predictive modeling studies could benefit from including a combination of both 2D and 3D covariables. LiDAR-derived habitat complexity can also provide a cost-effective spatial approach for forecasting the potential impacts to fish communities through changes in reef complexity at local (e.g., hurricanes, direct anthropogenic stressors) and regional spatial-temporal scales (e.g., climate-induced thermal stress and ocean acidification) . Major findings could relate directly to the spatial scale of the analysis (Estes et al. 2018) and LiDAR provides the flexibility to explore relationships between the seascape and fish assemblage structure at multiple scales. Furthermore, seascape ecology studies could benefit from integrating data in the fourth dimension where available, by including a temporal component in order to capture the dynamic and multidimensional nature of marine systems (Wedding et al. 2016). Seascape ecologists can increasingly conduct sophisticated, multidimensional spatial analyses by utilizing big data and harnessing the potential of the geospatial data revolution in order to address many of the complex questions in the field of seascape ecology (Pittman 2018).