Cross-Field Road Markings Detection Based On Inverse Perspective Mapping
Department of Geomatics, National Cheng Kung University, No. 1, University Rd., Tainan 701, Taiwan;
[email protected]
* Correspondence: [email protected]
Abstract: With the rapid development of the autonomous vehicle industry, there has been a dramatic
proliferation of related research, in which road markings detection is an important issue. When there
are no public open data for a field, we must collect road markings data and label them manually
ourselves, which requires a huge amount of labor and time. Moreover, object detection often
encounters the problem of small object detection: the detection accuracy often decreases when the
detection distance increases, primarily because distant objects on the road take up few pixels in the
image and object scales vary depending on different distances and perspectives. To solve the issues
mentioned above, this paper utilizes a virtual dataset and an open dataset to train the object detection
model and performs cross-field testing on Taiwan roads. In order to make the model more robust and
stable, the data augmentation method is employed to generate more data; the limited dataset is
therefore enlarged through data augmentation and homography transformation of the images.
Additionally, Inverse Perspective Mapping is performed on the input images to transform them into
the bird’s eye view, which solves the “small objects at far distance” problem and the “perspective
distortion of objects” problem so that the model can clearly recognize the objects on the road. Testing
the model on the front-view images and bird’s eye view images also shows a remarkable improvement
in accuracy of 18.62%.
Keywords: road markings; object detection; cross-field; inverse perspective mapping; deep learning
of High-Definition Maps, the vector layers consist of the actual shape of the road markings
as a polygon. The research is based on establishing the geometric outline objects of the
High-Definition Map to set the pixel-level road markings detection.
In order to recognize the objects precisely, the Artificial Intelligence (AI) decision model
needs to learn a large amount of image data that include different weather conditions to
improve the robustness of the model. However, collecting and labeling data costs
human resources and time; thus, an efficient data collection method should be determined.
Previous studies on the detection of objects mostly focus on objects on the upper part of
the road such as road signs and traffic signals, and less on road markings on the road
surface. Since the detection targets are located on the road, only the area of the road
environment is required. Generally, this kind of research will carry out image processing
in the preprocessing stage. Common practices for removing irrelevant backgrounds include
extracting regions of interest, transforming the front view into a bird’s eye view, and so on, which can reduce
the influence of environmental conditions and unnecessary background feature learning so
that the deep-learning model can achieve better performance.
Image data with ground truth is an important key for determining the accuracy of
object detection. If the cost of data labeling can be reduced, high-quality road image
recognition and application can be improved. However, data collection takes a long
time and costs a large amount of money. For instance, an intelligent vehicle equipped
with industrial cameras in the field where the route has been surveyed and planned
acquires the data for model training. After data collection, the self-collected data need to be
labeled manually, which is a huge amount of additional labor and takes a large quantity of time.
Furthermore, there is no public road markings dataset in Taiwan, which causes difficulties in the
use of data for related research. Additionally, one of the most crucial tasks for autonomous
vehicles is to detect the lane and road markings accurately. In current methods, when the
detection distance increases, the detection accuracy of objects frequently declines. This is
primarily because distant objects on the road take up few pixels in the image and because
object scales vary depending on different distances and perspectives. In light of these
concerns, this paper expects to solve the problem of reduced detection accuracy resulting
from small objects and the difficulties of data collection.
To solve the problems mentioned above, the research utilizes a virtual dataset and an
open dataset to train the object detection model and cross-field testing in the field of Taiwan
roads without spending a lot of money on collecting data and extra labor to annotate the
target objects in the image. The mixed dataset will be augmented to increase the amount of
data in the limited dataset by the method of data augmentation, such as flip, contrast, and
brightness adjustment. Furthermore, this paper utilizes Inverse Perspective Mapping (IPM)
to produce bird’s eye view images of the scene from the front-view image plane. Compared
to the front-view images for testing, the transformed images eliminate the perspective
distortion so that the accuracy of object detection can be improved.
The contributions of the paper are listed below:
1. A virtual dataset mixed with an open dataset is viable for cross-field detection when
training them together and testing them on the real dataset.
2. The method of data augmentation is employed to increase the amount of data in the
limited dataset without extra labor to annotate the target objects in the image.
3. The integration of Inverse Perspective Mapping (IPM) to transform input images
into a bird’s eye view is a key element of this work; it addresses the perspective
distortion of road markings and significantly improves detection accuracy.
4. Testing Mask R-CNN on the front-view images and the bird’s eye view images
increases the mAP from 60.04% to 78.66%, which is a significant improvement in
model accuracy.
The following section is the organization of the paper. In Section 2, literature reviews
of object detection algorithms and road markings detection are introduced. In Section 3,
the methodology of the proposed approach is elaborated in detail. Section 4 evaluates the
proposed method through experiments and analyzes the results. In Section 5, the conclusions
and future work of the paper are presented.
2. Related Work
Some previous research works are reviewed in this section, which consists of
two major sections: object detection algorithms and research on road markings detec-
tion. The following will elaborate on the development of the object detection algorithm,
including the one-stage model and two-stage model. In addition, some existing research
on road markings will also briefly be reviewed.
than the previous versions. YOLOv4 was released by Alexey Bochkovskiy et al., enhancing
various parts of YOLOv3. In addition to maintaining the speed, the detection accuracy was
significantly strengthened.
Mask R-CNN, which extends Faster R-CNN, is a landmark instance segmentation model.
Influenced by Mask R-CNN, Daniel Bolya et al. designed a one-stage instance segmentation
model that integrates the merits of Mask R-CNN and YOLO, namely YOLACT [11]. YOLACT is a
real-time instance segmentation model that performs object detection tasks accurately at a
high computing speed. YOLACT
splits the complex instance segmentation process into two simple parallel tasks, generating
prototype masks and predicting mask coefficients per instance. For each instance, the
prototype masks are multiplied by the corresponding predicted mask coefficients and summed.
Subsequently, the instances are filtered according to the bounding box and the
threshold value so that each instance obtains a high-quality
mask. SOLO [12] directly segments the instance mask, which is a box-free approach. SOLO
considers a method that introduces the concept of instance categories to predict the class
of object instances. To distinguish the object instances based on their center locations and
object sizes, the approach transforms the instance segmentation issue into a classification
issue, imitating the idea of semantic segmentation to predict the class of each pixel. Its
accuracy in experiments on the COCO dataset surpasses that of Mask R-CNN.
SOLO version 2 [13], published by the same author, improves the mask learning and NMS
(Non-Maximum Suppression) approach, which not only enhances the accuracy but also
realizes the real-time requirements.
distortion of road markings. The presented method can obtain good performance with less
computation even though it is a two-stage model. In summary, the deep-learning-based
approach is more robust and more stable than the traditional feature extraction approach
and can be applied to different scenarios with higher accuracy.
3. Methodology
In this section, the overall proposed method will be introduced first. Figure 1 provides
an apparent visualization of the overall framework of the proposed method. The procedure
is divided into two phases, the training phase and the testing phase. Owing to the
inconvenience of collecting data, the training data are composed of two open datasets
for cross-field detection. Additionally, in order to focus on the objects of the road surface,
the mixed dataset will be processed in the training phase. The images will be cropped to
remove the irrelevant background, and afterward the Inverse Perspective Mapping (IPM)
will be applied to project to the perspective of the bird’s eye view, which is favorable for
the object detection model. For the testing phase, IPM will be performed on the testing
data, projecting the images to the bird’s eye view.
Figure 1. Illustration of the proposed method.

Figure 2. The overall pipeline of the proposed method.
The points in the world are (X, Y, Z) coordinates that are turned into homogeneous
form. After multiplying by two matrices, the homogeneous coordinates of the pixel in
the image are calculated. Hence, the product of the two matrices is the camera matrix
shown in (3), which is a 3 × 4 matrix with 12 entries, denoted as C_ij. The elements C_ij
represent the intrinsic and extrinsic parameters of the camera, where i is the row index
and j is the column index. λ is an arbitrary scalar scale factor that scales the (x′, y′, z′)
coordinates. The (x′, y′) are divided by z′; therefore, the λ disappears. Multiplying the
coordinates by the matrix with any arbitrary scale factor will give exactly the same result.
Since the scale factor is arbitrary, the element C(3, 4) will be set to 1.

$$
\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}
= \lambda
\begin{bmatrix} C_{11} & C_{12} & C_{13} & C_{14} \\ C_{21} & C_{22} & C_{23} & C_{24} \\ C_{31} & C_{32} & C_{33} & C_{34} \end{bmatrix}
\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}
\tag{3}
$$
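To make the projection in (3) concrete, the short sketch below (our own illustration, not code from the paper) multiplies a homogeneous world point by a hypothetical 3 × 4 camera matrix and removes the arbitrary scale by dividing by z′; the matrix entries are invented for demonstration only.

```python
import numpy as np

# Hypothetical 3 x 4 camera matrix C (intrinsics times extrinsics); the values
# are illustrative only, since the datasets used here provide no camera parameters.
C = np.array([
    [800.0,   0.0, 320.0, 0.0],
    [  0.0, 800.0, 240.0, 0.0],
    [  0.0,   0.0,   1.0, 0.0],
])

# A world point (X, Y, Z) written in homogeneous form.
X_world = np.array([2.0, 1.5, 10.0, 1.0])

# Equation (3): homogeneous image coordinates, defined only up to a scale factor.
x_h = C @ X_world                      # (x', y', z')

# Dividing by z' cancels the arbitrary scale and yields pixel coordinates.
u, v = x_h[0] / x_h[2], x_h[1] / x_h[2]
print(u, v)                            # 480.0 360.0
```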
The matrix (4) can be simplified to consider the camera projection of a point on a
plane of the world space; therefore, the Z value of the points will be set to 0 and the
third column of the matrix will be multiplied by 0. After removing the corresponding row of
the vector, it is a 3 × 3 system called the planar homography H. It maps the coordinates of the
points in a plane to the points in the image. Two-dimensional coordinates of a point in the
world can be converted by using this simple matrix into the coordinates of a point in the image.

$$
\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix}
= \begin{bmatrix} C_{11} & C_{12} & C_{14} \\ C_{21} & C_{22} & C_{24} \\ C_{31} & C_{32} & 1 \end{bmatrix}
\begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix}
= \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix}
\begin{bmatrix} X_w \\ Y_w \\ 1 \end{bmatrix}
\tag{4}
$$
The eight equations are used to estimate the eight degrees of freedom in the homog-
raphy matrix, while h33 is set to 1. The hij is shifted to another reference frame, and the
coordinates of the 2D image are mapped according to (5), which will be calculated by
the least-squares method. After applying homography transformation to the front-view
images, the images will be projected onto the bird’s eye view. The four yellow points are
decided along the roadside, so the ROI will focus on the road area.
In this research, the homography transformation is applied as part of the data augmentation
used to train the model. For the testing phase, the homography matrix is also applied to the
testing data to transform the different viewing angles into the perspective of the bird’s eye
view. In conclusion, IPM transforms 2D images into other 2D images of the same planar
surface through homography matrix multiplications, so that the front-facing images from the
monocular camera are projected onto top-view images.
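As a concrete illustration of the least-squares estimation described above, the sketch below (our own, not the paper’s code) stacks the eight linear equations obtained from four point correspondences with h33 fixed to 1 and solves them with NumPy; the four correspondences are hypothetical placeholders.

```python
import numpy as np

def estimate_homography(src_pts, dst_pts):
    """Estimate the 3x3 planar homography H (with h33 = 1) mapping src_pts to
    dst_pts by solving the 2N linear equations in a least-squares sense."""
    A, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1)
        # v = (h21*x + h22*y + h23) / (h31*x + h32*y + 1)
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

# Hypothetical correspondences: four road-side points in the front view (src)
# and the rectangle they should occupy in the bird's eye view (dst).
src = [(420, 520), (860, 520), (1180, 700), (120, 700)]
dst = [(200, 0), (600, 0), (600, 600), (200, 600)]
H = estimate_homography(src, dst)

# Mapping a front-view pixel: homogeneous multiplication, then division by z'.
p = H @ np.array([500.0, 600.0, 1.0])
print(p[:2] / p[2])
```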
Considering the simple features and monotonous color of the road markings, they
do not contain diverse structural features for the object detection model. Accordingly, this
paper adopts the flip and color space adjustment (brightness and contrast) to the training
dataset. Flip is one of the effective methods and has proven to be useful to improve the
performance of the deep-learning model. Furthermore, color space adjustment is the easiest
and most common technique to change the luminance of images. On the road surface
environment, the diversity of lighting conditions and weather conditions have an impact
on the accuracy of the model, and hence, data augmentation is a vital technique to vary the
images by the color space adjustment. The images are flipped using horizontal flipping
and vertical flipping, which are common methods in geometric transformation. Moreover,
brightness adjustment is implemented to the training data to convert the brightness-related
channel depending on the value setting. Therefore, it can make the images slightly brighter
or darker to enhance the lighting conditions of images. Contrast adjustment is also one of
the data augmentation techniques that rescales the range of intensity values in the images.
The contrast is the ratio between the brightest and darkest areas of the images. The larger
the ratio, the more gradations from black to white, which makes the object or the boundary
in the images more distinguishable. Therefore, the contrast of the white road markings
on the black asphalt road should be enhanced, which will improve the visual perception.
Ultimately, four sets of image copies were generated by the data augmentation techniques,
which increased the amount of data from the original images without extra time costs.
into two stages: finding the region proposals first and then identifying them. The first
stage has the same first layer as Faster R-CNN, called Region Proposal Network (RPN).
RPN consists of two branches. One of the branches determines the probability of whether
the anchor contains an object or not. The other branch is responsible for calculating the
offsets ( x, y, w, h) between the anchors and ground truth. The output of RPN will be the
input into the ROI Align layer. ROI Align is a modification proposed to improve localization
accuracy: instead of the quantized output that ROI pooling uses to obtain a fixed size, it
samples virtual pixel values with bilinear interpolation from the nearest pixels, which solves
the misalignment caused by the two quantization steps in the ROI pooling operation.
Therefore, the accuracy of the bounding box location makes
obvious progress. In the second stage, apart from predicting classes and bounding box
locations, a branch of the fully convolutional network is added. A corresponding binary
mask is predicted for each ROI to indicate whether a given pixel is part of the target or not.
The overall architecture of Mask R-CNN is depicted in Figure 4.
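The difference between quantized ROI pooling and ROI Align comes down to how feature values are sampled. The NumPy sketch below is a conceptual illustration of that sampling step only (one bilinear sample per output bin, whereas the actual layer averages several samples per bin); it is not the implementation used in the paper.

```python
import numpy as np

def bilinear(feat, y, x):
    """Bilinearly interpolate a 2D feature map `feat` at continuous (y, x)."""
    h, w = feat.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    return (feat[y0, x0] * (1 - dy) * (1 - dx) + feat[y0, x1] * (1 - dy) * dx +
            feat[y1, x0] * dy * (1 - dx) + feat[y1, x1] * dy * dx)

def roi_align(feat, roi, out_size=(7, 7)):
    """Sample an out_size grid inside roi = (y1, x1, y2, x2), given in
    continuous feature-map coordinates, without any quantization."""
    y1, x1, y2, x2 = roi
    ph, pw = out_size
    bin_h, bin_w = (y2 - y1) / ph, (x2 - x1) / pw
    out = np.empty(out_size)
    for i in range(ph):
        for j in range(pw):
            # One sampling point at the centre of each bin.
            out[i, j] = bilinear(feat, y1 + (i + 0.5) * bin_h,
                                 x1 + (j + 0.5) * bin_w)
    return out

feature_map = np.random.rand(32, 32)
print(roi_align(feature_map, (3.2, 5.7, 17.9, 21.4)).shape)   # (7, 7)
```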
4. Experimental Evaluations
In the first place, the experimental settings and evaluation metrics for the proposed
method will be described. Secondly, this section elucidates the detailed information of the
datasets in the experiment. Finally, some experiments and results of the proposed method
are presented at the end of the section.
Items Specification
CPU Intel i9-9900 3.5 GHz 10 cores
Memory DDR4 2400 MHz 16 GB × 4
GPU NVIDIA GeForce RTX2080 Ti
Operating System Linux Ubuntu 18.04
Libraries Python3.6, Tensorflow-gpu-1.12, CUDA 9.1
The following is an empirical setup for the model’s hyper-parameters: the number
of steps per epoch is the total training samples divided by the batch size, where the total
epochs are 100, and the learning rate is 0.001. Other training settings are shown in Table 2.
Items Specification
Number classes 8 (including background)
Steps per epoch Training samples/batch size
Epochs 100
Learning rate 0.001
Detection minimum confidence 0.9
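The settings in Table 2 map directly onto a model configuration. The sketch below assumes the widely used matterport Mask R-CNN implementation (the `mrcnn` package), which matches the TensorFlow 1.x / Python 3.6 environment listed above but is not explicitly named in the paper, so treat it as an illustrative guess rather than the authors’ actual training script; the batch size is likewise an assumption.

```python
from mrcnn.config import Config
import mrcnn.model as modellib

class RoadMarkingConfig(Config):
    """Training configuration mirroring Table 2."""
    NAME = "road_markings"
    NUM_CLASSES = 8                    # 7 road-marking classes + background
    LEARNING_RATE = 0.001
    DETECTION_MIN_CONFIDENCE = 0.9
    IMAGES_PER_GPU = 2                 # assumed batch size; not stated in the paper
    STEPS_PER_EPOCH = 13925 // 2       # training samples (after augmentation) / batch size

config = RoadMarkingConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="./logs")

# dataset_train / dataset_val would be mrcnn Dataset subclasses built from the
# mixed Ceymo + SVA annotations (omitted here), after which training would run as:
# model.train(dataset_train, dataset_val,
#             learning_rate=config.LEARNING_RATE, epochs=100, layers="all")
```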
To evaluate the performance of the proposed method, the widely used evaluation
metric Mean Average Precision (mAP) was employed to evaluate the model. The mAP is
related to the four metrics of IoU, Precision, Recall, and AP. The following describes each
metric and its formula:
1. IoU (Intersection over Union): Equation (8) describes the overlap ratio between the
ground-truth bounding boxes and the predicted bounding boxes. The higher the overlap
ratio, the higher the accuracy of the predicted target object position. Typically, the
predefined threshold is 0.5.

$$
IoU = \frac{\text{Area of Overlap}}{\text{Area of Union}}
\tag{8}
$$
2. Precision: Precision (9) is the proportion of positively predicted objects that are correct,
where a true positive (TP) is a predicted object that matches a ground-truth object, and a
false positive (FP) is a positively predicted object that is actually false.

$$
Precision = \frac{TP}{TP + FP}
\tag{9}
$$
3. Recall: Recall (10) is the proportion of actual objects that the model predicts correctly,
where a false negative (FN) occurs when the model predicts as negative an object that is
actually positive.

$$
Recall = \frac{TP}{TP + FN}
\tag{10}
$$
5. mAP (Mean Average Precision): the average of the AP over every class. The mAP, as
shown in (13), is a principal quantitative measurement for object detection.

$$
mAP = \frac{1}{N}\sum_{i=1}^{N} AP_i
\tag{13}
$$
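The sketch below (our own illustration, not the evaluation code used in the paper) shows how these quantities fit together: IoU for axis-aligned boxes, precision and recall from greedy matching at an IoU threshold of 0.5, and mAP as the plain mean of per-class AP values. Real benchmarks such as PASCAL VOC additionally interpolate the precision-recall curve to obtain each AP, which is omitted here, and the per-class AP numbers are made up.

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(pred_boxes, gt_boxes, iou_thr=0.5):
    """Greedy one-to-one matching of predictions to ground-truth boxes."""
    matched, tp = set(), 0
    for p in pred_boxes:
        best = max(range(len(gt_boxes)), key=lambda j: iou(p, gt_boxes[j]), default=None)
        if best is not None and best not in matched and iou(p, gt_boxes[best]) >= iou_thr:
            matched.add(best)
            tp += 1
    fp, fn = len(pred_boxes) - tp, len(gt_boxes) - tp
    precision = tp / (tp + fp) if pred_boxes else 0.0
    recall = tp / (tp + fn) if gt_boxes else 0.0
    return precision, recall

# Equation (13): mAP is the mean of per-class AP values (illustrative numbers only).
ap_per_class = {"straight_arrow": 0.82, "left_arrow": 0.75, "pedestrian_crossing": 0.79}
print(float(np.mean(list(ap_per_class.values()))))
```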
4.2. Dataset
In this section, two datasets (Ceymo and SVA datasets) for training and the Taiwan
dataset for testing will be introduced. Two open datasets were collected in Sri Lanka and
in the virtual world, Grand Theft Auto V (GTAV), respectively. The Ceymo dataset consists of
2099 images for training and 788 for testing belonging to eleven classes, providing polygon,
bounding box annotations, and pixel-level segmentation masks. In order to satisfy our
research demand, we only retained seven classes and selected 2172 images with annotations.
Additionally, the Surrounding Vehicles Awareness (SVA) dataset was collected from the
virtual world, GTAV, simulating real-world scenarios under abundant weather conditions
and different illuminations. We selected 1771 images from the SVA dataset and labeled
them into six classes. For the SVA dataset, the whole labeling procedure was performed
manually by the image annotation tool, VGG Image Annotator (VIA). As for the Ceymo
dataset, it provides label files annotated by Labelme, which is a different format from
VIA; therefore, the label files from the Ceymo dataset need to be transformed into the
format of VIA in view of uniformity. The total images within the two datasets are 3943.
Figure 5a,b depict some examples from each dataset. Subsequently, the datasets were mixed
into one dataset and were randomly divided into training sets and validation sets of each
dataset with a proportion of 7:3, containing, in total, 2785 training data and 1158 validation
data, respectively.
Figure 5. Example of images from different datasets.
The testing data were derived from YouTube videos in the field of the Taiwan road
scene. Figure 5c shows some examples of images from the Taiwan dataset. The data
contain diverse scenarios at different times during the day, including sunny, rainy, and
cloudy. The testing data consist of 582 images in total. Figure 6 presents the seven classes
in the Taiwan data for model prediction, including straight arrow, left arrow, right arrow,
special lane, straight left arrow, straight right arrow, and pedestrian crossing.
(a) Straight arrow. (b) Left arrow. (c) Right arrow. (d) Special lane. (e) Straight left arrow. (f) Straight right arrow. (g) Pedestrian crossing.
Figure 6. Classes of Taiwan road images (seven classes).
4.3. Data Preprocessing
Before training the model and testing the images, the images and label files will be
preprocessed first. Four approaches of data augmentation will augment the total quantity
of images. In addition, the homography transformation based on Inverse Perspective
Mapping (IPM) is applied to the training data and testing data.
4.3.1. Data Augmentation
The experiment used the data augmentation method only on the training set, and the
testing set consists of the original images. Data augmentation was performed by changing
the contrast and brightness of the images in the experiment. The experiment was realized
with the Imgaug package. The Imgaug package is a Python library for image augmentation
that also provides keypoint and bounding box transformation. Three functions are adopted
in the experiment: the “AddToBrightness” function, the “LinearContrast” function, and the
“Fliplr” function. The LinearContrast function sets the alpha value to sample uniformly
within the specific interval [0.4, 1.6]. In the experiment, we set 1.6 as the contrast value to
adjust the intensity of images. Figure 7b illustrates the outcome after contrast adjustment.
The AddToBrightness function converts each image to a color space with a brightness-related
channel, extracts the channel, adds or subtracts a channel value between −30 and 20, and
converts it back to the original color space. Figure 7c illustrates the dark image with reduced
lighting, and Figure 7d illustrates the bright image. The flip function can flip the images
horizontally or vertically. Fliplr reverses the images from left to right, which flips the images
horizontally. Figure 7e shows an example of the horizontally flipped image. Data augmentation
is helpful to increase the amount of data in the limited dataset without extra labor to annotate
the target objects in the image. After data augmentation, the number of images increased from
2785 to 13,925.

Figure 7. Data augmentation. (a) Original image. (b) Contrast. (c) Dark. (d) Bright. (e) Flip.
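A minimal sketch of the Imgaug pipeline just described is given below. The augmenter parameters follow the text (LinearContrast with alpha 1.6, AddToBrightness in the range −30 to 20, Fliplr), but the stand-in image and the way the four copies are collected are our own assumptions rather than the authors’ code.

```python
import numpy as np
import imgaug.augmenters as iaa

# Stand-in for a training frame; in the real pipeline this would be an image
# loaded from the mixed Ceymo + SVA training set.
image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)

# The four copies described above: contrast, darker, brighter, and horizontal flip.
augmenters = {
    "contrast": iaa.LinearContrast(alpha=1.6),   # fixed contrast factor of 1.6
    "dark":     iaa.AddToBrightness(add=-30),    # darken the brightness-related channel
    "bright":   iaa.AddToBrightness(add=20),     # brighten the brightness-related channel
    "flip":     iaa.Fliplr(1.0),                 # always flip left to right
}

augmented_copies = {}
for name, aug in augmenters.items():
    # Imgaug can apply the same augmenter to polygon and bounding-box annotations
    # (e.g. via PolygonsOnImage), which keeps the labels consistent after flipping.
    augmented_copies[name] = aug.augment_image(image)

print({name: img.shape for name, img in augmented_copies.items()})
```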
4.3.2. Inverse Perspective Mapping
The homography transformation based on IPM is employed on the Ceymo dataset and SVA
dataset to transform the images into the bird’s eye view and augment the dataset. The two
datasets do not provide the camera parameters, so the perspective transformation adopts a
planar projective transformation, choosing four points on the input image and the
corresponding points on the output image to estimate the homography matrix. The input
image size affects the learning of the object detection model. Compared to small images,
large images not only require more training time and more memory to extract the internal
features of the image but also contain more background noise, which has a negative impact
on detection. Consequently, the images will be cropped first in order to remove the
irrelevant background so that the model can focus on the road surface, which reduces the
effect of the environmental conditions and unnecessary feature learning. Furthermore, the
reserved ROI makes it easy to choose the source points that map to the corresponding points
in the target image through the perspective transformation. OpenCV provides perspective
transformation functions to calculate the homography matrix for the images given the source
and destination points. The “getPerspectiveTransform” function computes the projection
matrix. Afterward, the top-view perspective transformation is performed using the
“warpPerspective” function. Figure 8 provides an example of source points (x_i, y_i) in the
cropped image and reference points (x_i', y_i') in the bird’s eye view image. The pixel
coordinates of the points are transferred from one plane to another through homography
matrix multiplication.

Figure 8. Source points (x_i, y_i) and reference points (x_i', y_i') within the different perspectives of the images.
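The OpenCV calls named above combine as in the following sketch; the four source points are placeholders chosen along a hypothetical road boundary rather than the coordinates actually used for the Ceymo, SVA, or Taiwan images, and a blank array stands in for a real cropped frame.

```python
import cv2
import numpy as np

# Stand-in for a cropped front-view frame; a real frame would be loaded with cv2.imread.
image = np.zeros((720, 1280, 3), dtype=np.uint8)

# Four source points (x_i, y_i) along the road boundary in the front view and the
# rectangle (x_i', y_i') they should map to in the bird's eye view (placeholder values).
src = np.float32([[420, 520], [860, 520], [1180, 700], [120, 700]])
dst = np.float32([[200, 0], [600, 0], [600, 600], [200, 600]])

H = cv2.getPerspectiveTransform(src, dst)              # homography from the 4 correspondences
bird_eye = cv2.warpPerspective(image, H, (800, 600))   # project onto the top-view plane

# The polygon labels can be reprojected with the same H, so the number of
# annotated samples stays unchanged after the transformation.
label_vertices = np.float32([[[500, 600]], [[700, 650]]])
warped_vertices = cv2.perspectiveTransform(label_vertices, H)
print(bird_eye.shape, warped_vertices.shape)
```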
As for the testing phase, the proposed method also transforms the data into the bird’s
eye view using IPM. Figure 9 schematizes the location of source points (x_i, y_i) and
destination points (x_i', y_i'). The four yellow points are decided along the roadside so that
the ROI will focus on the road area. To ensure the fairness of the experiment, the number of
label samples after the transformation is the same.

Figure 9. Source points (x_i, y_i) and reference points (x_i', y_i') in the Taiwan dataset.
With 2785 training images and 1158 validation images, the training time for 100 epochs is approximately
1 h 20 min. The inference time for the validation set is about 1–2 min, with each image
taking approximately 70–120 ms to process.
Table 3. The comparison of the same-field and cross-field detection in the front view model.
Table 4. The comparison of the same-field and cross-field detection in the bird’s eye view model.
Despite the fact that the model has the ability to detect the object on the Taiwan
road, the model still has room for improvement. Due to the different perspectives and the
small objects at a far distance, some objects have difficulty being predicted; thus, IPM is a
vital method to improve this situation. The experimental results will be presented in the
forthcoming sections.
approach can be successful. The mAP of the third model testing the front-view image is
60.04%, while the mAP of the bird’s eye view image is 78.66%. From the previous results, it
can be noticed that the same perspective model testing on the same perspective images can
obtain a better performance, while the different perspective model testing on the different
view achieves a much lower mAP. Therefore, the third model, which takes the front-view
images and bird’s eye view images as input, plays an essential role. The model can detect
the different perspectives of images, which makes the model more robust, general, and
stable. The experiment has demonstrated that the perspective transformation is effective
for object detection on the road surface.
Models       Training Data        Testing Data (Front View)    Testing Data (Bird's Eye View)
Mask R-CNN   Front view           57.90%                       28.60%
Mask R-CNN   BEV                  10.13%                       85.57%
Mask R-CNN   Front view and BEV   60.04%                       78.66%
SOLO v2      Front view           15.32%                       9.9%
SOLO v2      BEV                  5.4%                         42.60%
SOLO v2      Front view and BEV   23.60%                       39.70%
YOLACT++     Front view           30.28%                       9.64%
YOLACT++     BEV                  28.84%                       67.56%
YOLACT++     Front view and BEV   31.48%                       64.76%
Moreover, the experiment also compared the proposed method with other instance
segmentation models, SOLO version 2 and YOLACT++ [29]. SOLOv2 is a box-free instance
segmentation model using ResNet-101 as the backbone and FPN for multi-scale prediction.
As for YOLACT++, it uses the architecture of RetinaNet combined with ResNet-101 and
FPN. The result reflected in Table 5 indicates that the Mask R-CNN with the bird’s eye
image significantly outperforms the other methods. Basically, the performance of using
bird’s eye view images as input is better than the other two kinds of images as input. On
the other hand, it is believed that the proposed method can work well after perspective
transformation no matter which model is used.
Table 6. Comparative results of the model with data augmentation and without data augmentation.

Training Data        w/o Data Augmentation          w/ Data Augmentation
                     Front View      BEV            Front View      BEV
Front view           57.90%          28.60%         58.27%          33.96%
BEV                  10.13%          85.57%         10.85%          80.34%
Front view and BEV   60.04%          78.66%         57.71%          79.02%
Figure 10. Examples of the bird’s eye view model testing on different cases.
Figure 11 illustrates that some cases in the first row of the front-view images failed
to detect the road markings at a far distance, such as pedestrian crossing (a), straight arrow
(b,c), and special lane (d). However, after projecting to the bird’s eye view, the objects
were accurately recognized even on a low-light rainy day (c). As displayed in Figures 10
and 11, it is demonstrated that the proposed method can successfully detect small road
markings at a great distance by converting the images into a bird’s eye view. In particular,
the number of label samples after transforming will not change, so the comparison between
the two types of images is based on the same foundation.
Figure 11. Examples of the front view with bird’s eye view model testing on different cases.
5. Conclusions
In this paper, the cross-field road markings have been successfully detected based
on Inverse Perspective Mapping (IPM). After the perspective transformation, the distant
objects on the road surface were detected, which solves the small object detection problem.
First of all, the two open datasets derived from the virtual world and real world were mixed
for the training data, which can reduce the data preprocessing time and cost. The research,
which trained three kinds of models according to the different perspective training images,
presented different results. The testing phase, compared with the preliminary study, used
front-view images to test on the road environment. IPM was performed on the input
images to transform them into the bird’s eye view, which solves the “small objects at far
distance” problem and the “perspective distortion of objects” problem, so that the model
can clearly recognize objects on the road. Using three kinds of models to test the front-view
images and bird’s eye view images can demonstrate the apparent result. In the second
model test on the images after the IPM approach, it could reach an 85.57% mAP, which
obtained the immense improvement of 27.67% (57.90–85.57%). The third model testing on
the front-view images and bird’s eye view images also showed a remarkable improvement
of accuracy by 18.62% (60.04–78.66%). Moreover, for the sake of making the model more
robust and stable, the data augmentation method was employed to generate more data in
the limited dataset. In comparison with the model without data augmentation, the result
could increase by 1–2% mAP. We utilized Mask R-CNN as the implemented model and
compared this with other models, SOLO and YOLACT, to ensure the proposed method
could be realized successfully.
Much remains to be done for future work; it is also anticipated that the work can
increase some different classes of road markings detection, such as “stop”, “slow”, “speed
limit”, “bicycle sign”, “lane lines”, and so on, so it can produce a more reliable and stable
detection model. In addition, the perspective transformation of the images is fulfilled by
choosing four reliable points and then warping them into the 2D plane. The four points are
decided depending on the different datasets to find the suitable points after removing the
irrelevant background or to find the points along the roadside, so that the points follow the
properties of the different datasets, lacking uniformity. If the dataset contains the camera
parameters, the homography matrix is easily computed, but it is hard to ensure that all the
datasets come with the camera information. Considering these limitations, the perspective
transformation based on the deep-learning method can be examined and undertaken, which
automatically produces the bird’s eye view images without manual selection and camera
parameters. Moreover, other data augmentation techniques can be tried in the study to
prove that the method is beneficial to augment the data. In spite of some limitations of the
conclusion, the contributions of the study are seen to be compelling enough to encourage
future investigation into both this and other road marking-related topics. We will also
explore YOLO-based methods and state-of-the-art instance segmentation methods, such as
YOLOv8, TSD-YOLO, and CBNetV2, to compare their performance with the approaches
used in this paper. This will help evaluate the strengths of bounding box-level detection
versus mask-level segmentation in road marking tasks. Additionally, an ablation study on
data augmentation techniques, including brightness adjustment, contrast enhancement, and
flipping, will be conducted to assess their individual contributions to model performance.
Furthermore, we plan to design data augmentation strategies specifically tailored to road
marking detection in diverse and complex environments such as wear and tear of road
markings, and extreme lighting scenarios like glare or low-light conditions. These directions
aim to build upon the current findings and further enhance the robustness and applicability
of road marking detection methods.
Author Contributions: Conceptualization, E.H.-C.L. and Y.-C.H.; Methodology, E.H.-C.L. and Y.-C.H.;
Software, Y.-C.H.; Validation, Y.-C.H.; Formal analysis, E.H.-C.L.; Investigation, Y.-C.H.; Resources,
E.H.-C.L.; Data curation, Y.-C.H.; Writing—original draft, E.H.-C.L. and Y.-C.H.; Writing—review &
editing, E.H.-C.L.; Visualization, Y.-C.H.; Supervision, E.H.-C.L.; Project administration, E.H.-C.L.; Funding
acquisition, E.H.-C.L. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by National Science and Technology Council grant number NSTC
112-2628-M-006-008-MY2.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data are contained within the article.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
2. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
3. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv
2015, arXiv:1506.01497. [CrossRef] [PubMed]
4. He, K.; Gkioxari, G.; Dollar, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice,
Italy, 22–29 October 2017; pp. 2961–2969.
5. Girshick, R. Fast R-CNN. In Proceedings of the IEEE Conference on Computer Vision, Santiago, Chile, 7–13 December 2015;
pp. 1440–1448.
6. Redmon, J.; Farhadi, A. YOLO9000: Better, Faster, Stronger. In Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
7. Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
8. Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020,
arXiv:2004.10934.
9. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
10. Lin, T.Y.; Dollar, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
11. Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT: Real-Time Instance Segmentation. In Proceedings of the IEEE/CVF International
Conference on Computer Vision, Seoul, Republic of Korea, 27–28 October 2019; pp. 9157–9191.
12. Wang, X.; Kong, T.; Shen, C.; Jiang, Y.; Li, L. SOLO: Segmenting Objects by Locations. In European Conference on Computer Vision;
Springer: Cham, Switzerland, 2020; pp. 649–665.
13. Wang, X.; Zhang, R.; Kong, T.; Li, L.; Shen, C. SOLOv2: Dynamic and Fast Instance Segmentation. Adv. Neural Inf. Process. Syst.
2020, 33, 17721–17732.
14. Tang, Z.; Boukerche, A. An Improved Algorithm for Road Markings Detection with SVM and ROI Restriction: Comparison with
a Rule-Based Model. In Proceedings of the 2018 IEEE International Conference on Communications (ICC), Kansas City, MO, USA,
20–24 May 2018; pp. 1–6.
15. Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. In Proceedings of the IEEE Computer Society Conference
on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–22 June 2005; Volume 1, pp. 886–893.
16. Hearst, M.; Dumais, S.; Osuna, E.; Platt, J.; Scholkopf, B. Support Vector Machines. IEEE Intell. Syst. Their Appl. 1998, 13, 18–28.
[CrossRef]
17. Lee, S.; Kim, J.; Yoon, J.S.; Shin, S.; Bailo, O.; Kim, N.; Lee, T.H.; Hong, H.; Han, S.H.; Kweon, I.S. VPGNet: Vanishing Point
Guided Network for Lane and Road Marking Detection and Recognition. In Proceedings of the IEEE Conference on Computer
Vision, Venice, Italy, 22–29 October 2017; pp. 1947–1955.
18. Hoang, T.M.; Nam, S.H.; Park, K.R. Enhanced Detection and Recognition of Road Markings based on Adaptive Region of Interest
and Deep Learning. IEEE Access 2019, 7, 109817–109832. [CrossRef]
19. Zhang, W.; Mi, Z.; Zheng, Y.; Gao, Q.; Li, W. Road Marking Segmentation based on Siamese Attention Module and Maximum
Stable External Region. IEEE Access 2019, 7, 143710–143720. [CrossRef]
20. Ye, X.Y.; Hong, D.S.; Chen, H.H.; Hsiao, P.Y.; Fu, L.C. A Two-Stage Real-Time YOLOv2-based Road Marking Detector with
Lightweight Spatial Transformation-Invariant Classification. Image Vis. Comput. 2020, 102, 103978. [CrossRef]
21. Jaderberg, M.; Simonyan, K.; Zisserman, A.; Kavukcuoglu, K. Spatial Transformer Networks. Adv. Neural Inf. Process. Syst. 2015,
28, 2017–2025.
22. Li, H.; Feng, M.; Wang, X. Inverse Perspective Mapping based Urban Road Markings Detection. In Proceedings of the 2012 IEEE
2nd International Conference on Cloud Computing and Intelligence Systems, Hangzhou, China, 30 October–1 November 2012;
Volume 3, pp. 1178–1182.
23. Greenhalgh, J.; Mirmehdi, M. Detection and Recognition of Painted Road Surface Markings. In Proceedings of the International
Conference on Pattern Recognition Applications and Methods, Lisbon, Portugal, 10–12 January 2015; pp. 130–138.
24. Matas, J.; Chum, O.; Urban, M.; Pajdla, T. Robust Wide-Baseline Stereo from Maximally Stable Extremal Regions. Image Vis.
Comput. 2004, 22, 761–767.
25. Bailo, O.; Lee, S.; Rameau, F.; Yoon, J.S.; Kweon, I.S. Robust Road Marking Detection and Recognition Using Density-Based
Grouping and Machine Learning Techniques. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision
(WACV), Santa Rosa, CA, USA, 24–31 March 2017; pp. 760–768.
26. Kang, J.; Jo, Y.; Lee, D.; Han, S.J.; Min, K.; Choi, J. Real-Time Road Surface Marking Detection from a Bird’s-Eye View Image
Using Convolutional Neural Networks. In Proceedings of the Twelfth International Conference on Machine Vision (ICMV 2019),
Amsterdam, The Netherlands, 16–18 November 2020; Volume 11433, pp. 599–604.
27. Jayasinghe, O.; Hemachandra, S.; Anhettigama, D.; Kariyawasam, S.; Rodrigo, R.; Jayasekara, P. CeyMo: See More on Roads-A
Novel Benchmark Dataset for Road Marking Detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of
Computer Vision, Waikoloa, HI, USA, 3–8 January 2022; pp. 3104–3113.
28. Everingham, M. The PASCAL Visual Object Classes Challenge 2010 (VOC2010) Development Kit. In Evaluation (chap. 3.4). 2010.
Available online: http://host.robots.ox.ac.uk/pascal/VOC/voc2010/htmldoc/devkit_doc.html#SECTION00044000000000000
000 (accessed on 8 May 2010).
29. Bolya, D.; Zhou, C.; Xiao, F.; Lee, Y.J. YOLACT++: Better Real-Time Instance Segmentation. IEEE Trans. Pattern Anal. Mach. Intell.
2020, 44, 1108–1121. [CrossRef] [PubMed]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.