Abstract:
|
Ship positioning and maneuvering information is highly relevant to understanding
pollution levels in coastal cities and sea-life quality. It also contains latent
patterns of vessel behavior that are useful in earth sciences and environmental
research.
Automatic Identification System (AIS) data enables air quality models
to produce finer-grained estimates. However, the raw data carries uncertainty
and errors, so a methodology is needed to filter and clean it and
to extract patterns. Ship navigation traces can be understood as time series.
Here, we present a methodology for characterizing ships by their navigation
traces, using Conditional Restricted Boltzmann Machines (CRBMs) combined with
classic clustering techniques such as k-Means.
From the AIS messages received from ships, which contain ship positions,
speed, and characteristics, we produce a processed cruising trace that a CRBM
can encode while preserving temporal information and reducing the dimensionality
of the data.
This encoding can then be clustered or pattern-mined, and used not only for
ship classification but also to cross-reference the behavior patterns with
environmental information. In this paper we detail the methodology and validate
it using Spanish Ports Authority records of national and international
fishing vessels and passenger and cargo ships.
Alongside the pattern-mining methodology, we show how to use Apache Spark
for the data cleaning process, up to the point where the data reaches the
CRBM. Finally, we develop a visualization tool for data exploration
and pattern evaluation. |