
1 Introduction

The Sensailor is an unmanned surface vehicle originally developed by the Polytechnic University of Catalonia [1]. The main objective of this work is the development of a more robust navigation system. To fulfill it, an NVIDIA Jetson Nano embedded system is used to obtain data from a Lidar sensor and a camera for obstacle recognition. The same embedded computer then processes the information and generates new trajectories to be followed by the control system. This increases the Sensailor's versatility, allowing it to operate in different unknown environments and to generate new feasible trajectories in real time.

2 Methodology

2.1 Autonomous Mapping and Localization

The autonomous sailboat moves in a highly changing environment. The integrity of the unmanned vehicle must therefore be ensured by estimating its position and orientation and by mapping its surroundings. Information from multiple sensors is collected and fused, merging the data obtained from a camera with the Lidar point cloud. In the first instance, a preprocessing layer is created to filter the images and the Lidar point cloud. To improve the effectiveness of obstacle detection, the sequence of steps shown in Fig. 1 has been stipulated [1].

Fig. 1. Module diagram for obstacle avoidance.

The first stage is eliminating the reflection generated on the water surface, since it interferes with the detection and tracking of moving objects on the surface. The input image is therefore converted to grayscale, and the mean threshold function [2] converts it into a binary image, Fig. 2 (a–b). The 9-square-grid (3 × 3 neighborhood) method is then applied to detect the contour: a pixel is marked as a contour point when its eight neighbors are not all identical to it.
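As an illustrative sketch of this stage (not the Sensailor's actual implementation), the mean threshold and the 9-square-grid contour test could be written as follows, assuming OpenCV and NumPy and a hypothetical input frame:

```python
import cv2
import numpy as np

def binarize_mean_threshold(image_bgr):
    """Convert an input frame to grayscale and binarize it using the mean
    intensity as the global threshold (the 'mean threshold' function)."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, gray.mean(), 255, cv2.THRESH_BINARY)
    return gray, binary

def contour_mask_9_grid(binary):
    """Mark a pixel as a contour point when its 3x3 (9-square grid)
    neighborhood is not uniform, i.e. at least one neighbor differs."""
    h, w = binary.shape
    contour = np.zeros_like(binary)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = binary[y - 1:y + 2, x - 1:x + 2]
            if not np.all(patch == binary[y, x]):
                contour[y, x] = 255
    return contour

frame = cv2.imread("water_surface.jpg")   # hypothetical input frame
gray, binary = binarize_mean_threshold(frame)
contours = contour_mask_9_grid(binary)
```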

Fig. 2. a) Grayscale image. b) Binary image. c) Original image. d) Image with noise. e) Image with Gaussian filter.

The noise present in the images taken by the USV also affects the detection of obstacles. If an image has poor quality, denoising is performed, Fig. 2 (c–e); otherwise, this step is omitted. To determine the quality of the image, metrics such as MSE (Mean Square Error), PSNR (Peak Signal to Noise Ratio), and SSIM (Structural Similarity Index) are used; some results are shown in Fig. 3 (a–e). In addition, if the noise must be removed because of an unacceptable distortion, the BM3D (Block-Matching and 3D filtering) algorithm is applied.
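A minimal sketch of this quality check, assuming the decision is made from MSE and PSNR against a reference frame; the 25 dB threshold is an assumed value, not one reported here:

```python
import numpy as np

def mse(reference, test):
    """Mean Square Error between two images of equal shape."""
    return np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)

def psnr(reference, test, max_value=255.0):
    """Peak Signal to Noise Ratio in dB; higher means less distortion."""
    error = mse(reference, test)
    if error == 0:
        return float("inf")
    return 10.0 * np.log10((max_value ** 2) / error)

# Assumed decision rule: denoise only when the PSNR falls below ~25 dB.
PSNR_THRESHOLD_DB = 25.0

def needs_denoising(reference, test):
    return psnr(reference, test) < PSNR_THRESHOLD_DB
```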

Fig. 3. a) Original image. b) Impulsive noise with MSE: 10590 and PSNR: 18.11. c) Gaussian noise with MSE: 619.06 and PSNR: 20.21. d) Impulsive noise suppression with MSE: 279.18 and PSNR: 23.67. e) Noise suppression with Gaussian filter with MSE: 238.15 and PSNR: 24.36.

The Watershed algorithm is applied to segment complex images by detecting contours and separating overlapping objects within the image. For the detection of the sea–sky line, the Otsu method [3] is included, which computes the threshold that minimizes the dispersion within each segment while maximizing it between segments (Fig. 4). In addition, candidate horizons are extracted with the Hough transform, and a single horizon is chosen through probabilistic intervals [4].
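The sea–sky line step could be sketched as below with OpenCV; the Canny edge step and the horizontality-based selection rule are assumptions standing in for the probabilistic interval selection of [4]:

```python
import cv2
import numpy as np

def detect_horizon(gray):
    """Segment sky and sea with Otsu's threshold, then extract candidate
    horizon lines with the probabilistic Hough transform."""
    # Otsu chooses the threshold that minimizes intra-class variance
    # (equivalently, maximizes inter-class variance).
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Edges of the sky/sea mask are candidate horizon pixels.
    edges = cv2.Canny(mask, 50, 150)

    # Probabilistic Hough transform returns candidate line segments.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=gray.shape[1] // 3, maxLineGap=20)
    if lines is None:
        return None

    # Assumed selection rule: keep the longest, most horizontal segment.
    def score(line):
        x1, y1, x2, y2 = line[0]
        length = np.hypot(x2 - x1, y2 - y1)
        slope_penalty = abs(y2 - y1) / (abs(x2 - x1) + 1e-6)
        return length - 100.0 * slope_penalty

    return max(lines, key=score)[0]
```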

Fig. 4. Sky and sea line identification.

It was decided to merge the camera images with the Lidar point cloud: the camera provides high-resolution data, while the Lidar provides more accurate depth information.

The relationship between an arbitrary world coordinate system and the image pixel coordinates [5] is estimated by taking the center of the lens as the origin of the camera frame. The x and y axes are parallel to the two sides of the image plane, and the z axis, which represents the optical axis of the lens, is perpendicular to the image plane, as shown in Eq. (1).

$${Z}_{c}\left[\begin{array}{c}x\\ y\\ 1\end{array}\right]= K\left[\begin{array}{c}{x}_{c}\\ {y}_{c}\\ {z}_{c}\end{array}\right]=\left[\begin{array}{ccc}{f}_{x}& 0& {c}_{x}\\ 0& {f}_{y}& {c}_{y}\\ 0& 0& 1\end{array}\right]\left[\begin{array}{c}{x}_{c}\\ {y}_{c}\\ {z}_{c}\end{array}\right]= \left[\begin{array}{ccc}{f}_{x}& 0& {c}_{x}\\ 0& {f}_{y}& {c}_{y}\\ 0& 0& 1\end{array}\right]\left(\left[\begin{array}{ccc}{R}_{11}& {R}_{12}& {R}_{13}\\ {R}_{21}& {R}_{22}& {R}_{23}\\ {R}_{31}& {R}_{32}& {R}_{33}\end{array}\right]\left[\begin{array}{c}{X}_{w}\\ {Y}_{w}\\ {Z}_{w}\end{array}\right]+\left[\begin{array}{c}{t}_{x}\\ {t}_{y}\\ {t}_{z}\end{array}\right]\right)$$
(1)

To relate the image data from the camera to the three-dimensional point cloud produced by the Lidar [6], a transformation matrix, Eq. (2), is created that maps 3D points to 2D image points.

$$\left(\begin{array}{c}u\\ v\\ 1\end{array}\right)=\left(\begin{array}{cccc}{f}_{u}& 0& {u}_{0}& 0\\ 0& {f}_{v}& {v}_{0}& 0\\ 0& 0& 1& 0\end{array}\right)\left(\begin{array}{cc}R& t\\ 0& 1\end{array}\right)\left(\begin{array}{c}x\\ y\\ z\\ 1\end{array}\right)=\left(\begin{array}{cccc}{m}_{11}& {m}_{12}& {m}_{13}& {m}_{14}\\ {m}_{21}& {m}_{22}& {m}_{23}& {m}_{24}\\ {m}_{31}& {m}_{32}& {m}_{33}& {m}_{34}\end{array}\right)\left(\begin{array}{c}x\\ y\\ z\\ 1\end{array}\right)$$
(2)
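A sketch of the projection of Eq. (2), assuming the intrinsics K and the Lidar-to-camera extrinsics R, t are already calibrated; the numeric values are placeholders, not the calibration used on the Sensailor:

```python
import numpy as np

def project_lidar_to_image(points_xyz, K, R, t):
    """Project Nx3 Lidar points into pixel coordinates using the
    3x4 projection matrix M = K [R | t] of Eq. (2)."""
    M = K @ np.hstack([R, t.reshape(3, 1)])                     # 3x4 projection matrix
    homogeneous = np.hstack([points_xyz, np.ones((points_xyz.shape[0], 1))])
    projected = (M @ homogeneous.T).T                           # Nx3 homogeneous pixels
    # Keep only points in front of the camera and normalize by depth.
    in_front = projected[:, 2] > 0
    uv = projected[in_front, :2] / projected[in_front, 2:3]
    return uv

# Placeholder calibration values (assumptions, not the paper's calibration).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, -0.1, 0.0])
pixels = project_lidar_to_image(np.random.rand(100, 3) * 10.0, K, R, t)
```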

Obstacles were detected with YOLOv3 following certain rules. Some areas of the image are selected, and those regions are labeled according to their position. The model uses a convolutional neural network that extracts features from the image and predicts the location and category of each object. The prediction is contrasted with the label, and the loss function is obtained by evaluating the deviation between the real scenario and the one predicted by the network (Fig. 4). The point cloud obtained from the Lidar is continuously processed, obtaining the point cloud clusters that allow the obstacle to be detected (Fig. 5).
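A sketch of running YOLOv3 with OpenCV's DNN module; the config and weight file names are placeholders, and the confidence and NMS thresholds are assumed values:

```python
import cv2
import numpy as np

# Placeholder paths: any trained YOLOv3 Darknet model can be loaded this way.
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
layer_names = net.getUnconnectedOutLayersNames()

def detect_obstacles(frame, conf_threshold=0.5, nms_threshold=0.4):
    """Run YOLOv3 on a frame and return bounding boxes of detected objects."""
    h, w = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    boxes, confidences = [], []
    for output in net.forward(layer_names):
        for detection in output:
            confidence = float(detection[5:].max())
            if confidence > conf_threshold:
                cx, cy, bw, bh = detection[:4] * np.array([w, h, w, h])
                boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
                confidences.append(confidence)
    keep = cv2.dnn.NMSBoxes(boxes, confidences, conf_threshold, nms_threshold)
    return [boxes[i] for i in np.array(keep).flatten()] if len(keep) else []
```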

Fig. 5. Obstacle recognition with YOLOv3.

Considering that weather conditions generate scattered interference points in the Lidar point cloud, a conditional elimination filter [7] is used to analyze the neighborhood of each point by calculating the Euclidean distance to its neighbors [8]. If a point has very few neighbors, it is considered atypical interference and removed.
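The conditional elimination filter could be approximated by a radius-based neighbor test, sketched here with SciPy; the radius and minimum neighbor count are assumed tuning parameters:

```python
import numpy as np
from scipy.spatial import cKDTree

def radius_outlier_filter(points_xyz, radius=1.0, min_neighbors=3):
    """Remove scattered interference: a point is kept only if it has at least
    `min_neighbors` other points within `radius` (Euclidean distance)."""
    tree = cKDTree(points_xyz)
    neighbors = tree.query_ball_point(points_xyz, r=radius)
    # Each neighbor list includes the point itself, hence the strict comparison.
    keep = np.array([len(n) > min_neighbors for n in neighbors])
    return points_xyz[keep]
```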

The fusion of camera and Lidar data is performed by projecting the target point cloud cluster onto the image, computing its two-dimensional bounding box, and matching it with the image bounding boxes to combine the information from both sensors.
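A minimal sketch of this association step, assuming the projected cluster and the camera detections are matched by bounding box overlap (IoU); the overlap threshold is an assumed value:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def match_cluster_to_detection(cluster_box, detection_boxes, min_iou=0.3):
    """Associate a projected Lidar cluster with the camera detection whose
    bounding box overlaps it the most (threshold is an assumed value)."""
    best = max(detection_boxes, key=lambda b: iou(cluster_box, b), default=None)
    return best if best is not None and iou(cluster_box, best) >= min_iou else None
```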

$${J}_{A}={J}_{U}+\frac{L\cdot \mathrm{sin}\left(\theta \cdot \frac{\pi }{180}\right)}{111\cdot \mathrm{cos}\left({W}_{U}\cdot \frac{\pi }{180}\right)}\qquad {W}_{A}={W}_{U}+\frac{L\cdot \mathrm{cos}\left(\theta \cdot \frac{\pi }{180}\right)}{111}$$
(3)

The USV detects each obstacle together with its distance and bearing relative to the autonomous sailboat, while the electronic compass and inertial navigation provide the sailboat's latitude, longitude, and heading. With Eq. (3), the longitude "J" and latitude "W" of the obstacle are found, where the subscript "U" denotes the autonomous sailboat and "A" the obstacle, considering that the latitude changes by one degree approximately every 111 km.
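Eq. (3) can be evaluated as in the following sketch; the example coordinates are arbitrary:

```python
import math

def obstacle_position(lon_usv, lat_usv, distance_km, bearing_deg):
    """Estimate the obstacle longitude/latitude (Eq. 3) from the USV position,
    the measured range L (in km) and the bearing angle theta (in degrees),
    using the approximation of ~111 km per degree of latitude."""
    theta = math.radians(bearing_deg)
    lon_obstacle = lon_usv + (distance_km * math.sin(theta)) / (111.0 * math.cos(math.radians(lat_usv)))
    lat_obstacle = lat_usv + (distance_km * math.cos(theta)) / 111.0
    return lon_obstacle, lat_obstacle

# Example: obstacle 0.5 km away at a bearing of 45 degrees from the sailboat.
print(obstacle_position(2.19, 41.38, 0.5, 45.0))
```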

2.2 Planning

The workspace is discretized and represented by a matrix in which the cell size guarantees that the sailboat always fits inside a cell. The coordinates of the obstacles are redefined in terms of the equivalent cells of the matrix. The free cells in which the sailboat can move are called safe zones; non-safe zones are those that contain an obstacle, or a portion of one, or that lie on the limits of the workspace. To guarantee safe navigation of the sailboat, the rules stipulated in the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs) must also be considered [9].
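A minimal sketch of this discretization, assuming square cells and obstacle positions already expressed in workspace coordinates; the sizes used are placeholders:

```python
import numpy as np

def build_occupancy_grid(workspace_size_m, cell_size_m, obstacle_points):
    """Discretize the workspace into a matrix whose cell size fits the sailboat;
    cells containing an obstacle or lying on the workspace limits are marked
    unsafe (1), the remaining cells are safe zones (0)."""
    n = int(np.ceil(workspace_size_m / cell_size_m))
    grid = np.zeros((n, n), dtype=np.uint8)
    grid[0, :] = grid[-1, :] = grid[:, 0] = grid[:, -1] = 1   # workspace limits
    for x, y in obstacle_points:
        i, j = int(y // cell_size_m), int(x // cell_size_m)
        if 0 <= i < n and 0 <= j < n:
            grid[i, j] = 1
    return grid

# Example: 100 m x 100 m workspace, 5 m cells, two detected obstacles.
grid = build_occupancy_grid(100.0, 5.0, [(22.0, 48.0), (61.5, 30.2)])
```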

The trajectories are optimized so that they correspond to the shortest safe path. Special attention must be paid to redundant movements, preventing loops or passing through the same segment several times without justification. For this, a combination of two trajectory planning algorithms was used, as shown in Fig. 6.

Fig. 6. Trajectory planning general algorithm.

The A* search algorithm evaluates the matrix by classifying its cells as occupied or free. It relies on an admissible heuristic function that considers the best possible scenario: the absence of obstacles, in which the motion can be carried out along a single straight line. The heuristic h is therefore the Euclidean distance between the evaluated cell and the goal cell. Each cell also has an associated value g_i, the number of steps that must be taken from the origin cell to reach it. Adding both terms gives the cost f_i of each cell. The algorithm chooses the cell with the lowest cost to update its current position. If the trajectory reaches a cell c_i with all its paths blocked, the algorithm returns to the previous position c_{i-1}, discards the option of moving to c_i, and selects the adjacent cell with the second-lowest cost. These actions are repeated until the program finds a path that reaches the destination through the cells with the lowest possible cost [10] (Fig. 7).
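A compact sketch of the grid search with the Euclidean heuristic described above; it uses a standard priority-queue formulation rather than the explicit backtracking wording of the text:

```python
import heapq
import math

def a_star(grid, start, goal):
    """A* over an occupancy grid (0 = free, 1 = occupied) with a Euclidean
    heuristic h and step cost g; returns the cells of the lowest-cost path,
    or None if the goal is unreachable."""
    rows, cols = len(grid), len(grid[0])
    open_set = [(0.0, start)]
    g = {start: 0.0}
    parent = {start: None}
    while open_set:
        _, current = heapq.heappop(open_set)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1),
                       (-1, -1), (-1, 1), (1, -1), (1, 1)]:
            ni, nj = current[0] + di, current[1] + dj
            if 0 <= ni < rows and 0 <= nj < cols and grid[ni][nj] == 0:
                step = math.hypot(di, dj)
                if g[current] + step < g.get((ni, nj), float("inf")):
                    g[(ni, nj)] = g[current] + step
                    parent[(ni, nj)] = current
                    h = math.hypot(goal[0] - ni, goal[1] - nj)
                    heapq.heappush(open_set, (g[(ni, nj)] + h, (ni, nj)))
    return None
```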

Fig. 7. a) A* algorithm flowchart. b) Potential fields algorithm flowchart.

Since the system requires constant updating of the sampled space, the trajectory generation system is complemented with an algorithm based on potential fields. Unlike the A* algorithm, it can be executed in real time, continuously updating the trajectory according to the motion of the obstacles. The direction of movement of the vehicle is given by the direction of the resultant force, i.e. the sum of the attractive and repulsive forces [11]. In this way, the final trajectory followed by the USV stays close to the optimal route generated by the A* algorithm, while the potential fields layer updates it efficiently in real time as required (Fig. 8).
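A sketch of one potential fields update, with conventional attractive and repulsive force expressions; the gains, influence radius, and step size are assumed values:

```python
import numpy as np

def potential_field_step(position, goal, obstacles,
                         k_att=1.0, k_rep=100.0, influence=10.0, step=0.5):
    """One update of the potential fields planner: the USV moves along the
    resultant of an attractive force toward the goal and repulsive forces away
    from obstacles within the influence radius (gains are assumed values)."""
    position, goal = np.asarray(position, float), np.asarray(goal, float)
    force = k_att * (goal - position)                  # attraction to the goal
    for obstacle in obstacles:
        diff = position - np.asarray(obstacle, float)
        d = np.linalg.norm(diff)
        if 0 < d < influence:
            # Repulsion grows as the obstacle gets closer.
            force += k_rep * (1.0 / d - 1.0 / influence) * diff / d ** 3
    norm = np.linalg.norm(force)
    return position if norm == 0 else position + step * force / norm

# Example: advance one step toward (50, 50) while avoiding an obstacle at (20, 22).
print(potential_field_step((18.0, 20.0), (50.0, 50.0), [(20.0, 22.0)]))
```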

Fig. 8. a) Obstacle coordinate system with respect to the USV. b) Generated trajectory example.

3 Conclusions

In this research, a solution to the autonomous navigation of a sailboat is demonstrated. It was possible to detect obstacles by segmenting and labeling images. So that the sailboat can perceive any object close to it, the image quality was corrected and noise was reduced. A fast response, up to 30 FPS, was obtained when implementing YOLO to tag obstacles in real time. With the location of an obstacle available, the sailboat automatically traces an avoidance route: first, it uses the A* search algorithm to generate the shortest safe path, and then, for real-time navigation, this is complemented with the potential fields algorithm. It was assumed that the GPS provided correct coordinates without error; therefore, in future research, a Kalman filter could be added to refine the estimate of the USV's location.