Maritime AI & Data Foundations Series : Part 3/4. Python for Maritime Engineers — 5 Real Use Cases with Ship Data


Python for Maritime Engineers — 5 Real Use Cases with Ship Data

From AIS parsing to automated noon reports — hands-on Python with real maritime data formats · ~14 min read

Captain Ethan
Maritime 4.0 · AI, Data & Cyber Security  ·  linkedin.com/in/shipjobs

In Part 1, we mapped the SFI code hierarchy that gives every onboard system a standardized address. In Part 2, we traced how sensor data travels through four network layers to a shore analytics platform. Now we put tools in your hands.

Python has become the default language for maritime data work because the dominant formats — NMEA sentences, AIS messages, IAS historian exports — all parse well with lightweight open-source libraries. This article walks through five concrete use cases, each using real data formats engineers encounter in fleet intelligence, predictive maintenance, and voyage reporting.

⚙️ Environment Setup — Python 3.8+
Install the required libraries once before running any use case:
pip install pyais pynmea2 pandas numpy scipy haversine
pyais — AIS NMEA sentence decoder
pynmea2 — NMEA 0183 sentence parser
haversine — great-circle distance between lat/lon points
pandas / numpy / scipy — data analysis stack
Key Terms
pyais — Python library for decoding AIS NMEA sentences (pip install pyais)
pynmea2 — Python parser for NMEA 0183 sentences (pip install pynmea2)
MMSI — Maritime Mobile Service Identity (9-digit vessel ID broadcast in AIS)
SOG / COG — Speed / Course Over Ground (from AIS and GNSS)
IAS Historian — Time-series database embedded in the onboard automation system
Z-score — How many standard deviations a value is from the mean (scipy.stats.zscore)
Pearson r — Correlation coefficient (−1 to +1) measuring linear relationship strength
Noon Report — Daily 1200 UTC position/performance report submitted by the master
$GPRMC — NMEA 0183 sentence: recommended minimum GPS/transit data
haversine — Formula for great-circle distance between two lat/lon coordinates


Use Case 1 — AIS Vessel Position Tracking

Decode !AIVDM sentences · Extract MMSI, position, SOG, COG · pyais

AIS is the most accessible maritime data stream: it is available from providers like MarineTraffic and the US Coast Guard NAIS portal, or directly from a shipboard transponder's serial output. Decoding raw !AIVDM sentences (AIS sentences start with "!" rather than the "$" used by most NMEA 0183 sentences) means you work with the source format rather than pre-processed data where fields may be missing or aggregated.

When you'd use this: Reading a raw AIS log file from a VTS station, processing a live UDP stream from a coastal receiver, or validating that your ingestion pipeline preserves all Type-1 message fields.
▸ python — ais_parse.py
from pyais import decode

# Types 1 and 3: Class A position reports (!AIVDM)
# The trailing *hh checksum is the XOR of every character between "!" and "*"
sentences = [
    "!AIVDM,1,1,,B,15M67N0000G?Ue6E`FepT@3n00SA,0*70",
    "!AIVDM,1,1,,A,35M67N000?JbRa0E`FepT@3n00SA,0*2B",
]

for raw in sentences:
    msg = decode(raw)
    print(f"MMSI:   {msg.mmsi}")
    print(f"SOG:    {msg.speed} kts")
    print(f"COG:    {msg.course}°")
    print(f"Pos:    {msg.lat:.5f}, {msg.lon:.5f}")
    print(f"Status: {msg.status.name}")
    print("---")
▸ output
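MMSI:   366053240
SOG:    0.0 kts
COG:    219.3°
Pos:    37.80212, -122.42362
Status: UnderWayUsingEngine
---
MMSI:   366053240
SOG:    1.5 kts
COG:    219.3°
Pos:    37.80212, -74.59280
Status: UnderWayUsingEngine
---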
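
Before decoding a raw log it can help to confirm that each sentence survived transport intact. The sketch below computes the NMEA 0183 checksum (the XOR of every character between the leading "!" or "$" and the "*") using only the standard library. The function names nmea_checksum and has_valid_checksum are illustrative and are not part of pyais.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """Return the two-digit hex XOR of all characters in the sentence body."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def has_valid_checksum(sentence: str) -> bool:
    """Check a full sentence such as '!AIVDM,...*hh' against its own checksum."""
    sentence = sentence.strip()
    if len(sentence) < 4 or sentence[0] not in "!$" or "*" not in sentence:
        return False
    body, _, declared = sentence[1:].rpartition("*")
    return nmea_checksum(body) == declared.upper()


# Build a correctly terminated sentence from a body, then corrupt one character
body = "AIVDM,1,1,,B,15M67N0000G?Ue6E`FepT@3n00SA,0"
good = f"!{body}*{nmea_checksum(body)}"
bad = good.replace("15M6", "15M7", 1)

print(has_valid_checksum(good))  # True
print(has_valid_checksum(bad))   # False
```

A filter like this could sit in front of the decoder so that corrupted lines are counted and logged rather than silently decoded into wrong positions.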

Use Case 2 — NMEA GPS Track Reconstruction

Parse $GPRMC sentences · Build voyage track · Calculate distance · pynmea2 + haversine

VDR (Voyage Data Recorder) and ECDIS systems log GPS positions as NMEA 0183 $GPRMC sentences. Processing these logs lets you reconstruct the vessel's track, calculate total distance sailed, and identify periods of reduced speed or course change — useful for incident analysis and fuel efficiency benchmarking.

When you'd use this: Post-voyage route reconstruction from VDR export files, or validating track distance against noon report submitted values.
▸ python — gps_track.py
import pynmea2
from haversine import haversine, Unit

# $GPRMC lines from a VDR export
# Checksums must match the sentence body: pynmea2 raises ChecksumError otherwise
lines = [
    "$GPRMC,120000,A,3554.0000,N,13950.0000,E,12.3,270.0,100524,,,A*4D",
    "$GPRMC,121000,A,3554.0000,N,13948.1000,E,12.3,270.0,100524,,,A*44",
    "$GPRMC,122000,A,3553.9500,N,13946.2000,E,12.2,270.5,100524,,,A*45",
]

track = []
for line in lines:
    msg = pynmea2.parse(line)
    track.append((msg.latitude, msg.longitude))

total_nm = sum(
    haversine(track[i], track[i + 1], unit=Unit.NAUTICAL_MILES)
    for i in range(len(track) - 1)
)
print(f"Waypoints:      {len(track)}")
print(f"Total distance: {total_nm:.2f} nm")
▸ output
Waypoints:      3
Total distance: 3.08 nm
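
To make the two steps hidden inside the libraries visible, here is a minimal standard-library sketch of the same calculation: converting the NMEA ddmm.mmmm fields to decimal degrees, then applying the haversine formula. The helper names ddmm_to_decimal and haversine_nm are illustrative, and the Earth radius (6371.0088 km, converted at 1.852 km per nautical mile) is an assumption chosen to match a common mean-radius value.

```python
import math

EARTH_RADIUS_NM = 6371.0088 / 1.852  # assumed mean Earth radius, in nautical miles


def ddmm_to_decimal(value: str, hemisphere: str) -> float:
    """Convert an NMEA ddmm.mmmm (or dddmm.mmmm) field to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)
    minutes = raw - degrees * 100
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def haversine_nm(p1, p2) -> float:
    """Great-circle distance in nautical miles between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, p1)
    lat2, lon2 = map(math.radians, p2)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


# The same three fixes as in gps_track.py, taken from the lat/lon fields only
fixes = [
    ("3554.0000", "N", "13950.0000", "E"),
    ("3554.0000", "N", "13948.1000", "E"),
    ("3553.9500", "N", "13946.2000", "E"),
]
track = [(ddmm_to_decimal(la, ns), ddmm_to_decimal(lo, ew)) for la, ns, lo, ew in fixes]
total_nm = sum(haversine_nm(a, b) for a, b in zip(track, track[1:]))
print(f"Total distance: {total_nm:.2f} nm")
```

A quick sanity check on any track total: at about 12.3 kts, 20 minutes of steaming covers roughly 4 nm, so a summed distance that differs from that by a wide margin suggests that the logged positions and speeds disagree.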

Use Case 3 — Machinery Sensor Anomaly Detection

IAS historian data · Z-score outlier flagging · pandas + scipy

IAS historians record machinery parameters — main engine exhaust temperature, lube oil pressure, shaft RPM — at intervals of 1 to 60 seconds. Applying Z-score analysis flags values that are statistically unusual, providing a fast first-pass anomaly screen before more complex ML models are applied.

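When you'd use this: Screening IAS CSV exports for sensor spikes before feeding data into a predictive maintenance pipeline. A threshold of |z| > 3 flags about 0.27% of samples if the data are normally distributed. Because the score is computed against the whole-file mean and standard deviation, slow drift tends to shift the baseline rather than trip the threshold, so drift is better caught against a rolling baseline.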
▸ python — anomaly_detection.py
import pandas as pd
from scipy import stats

# IAS historian export: timestamp, exhaust temp (°C), lube oil pressure (bar)
df = pd.read_csv("ias_export.csv", parse_dates=["timestamp"])

# nan_policy="omit" ignores missing readings; the default "propagate" would
# turn every z-score into NaN as soon as a single value is missing
df["z_exhaust"]  = stats.zscore(df["exhaust_temp_c"], nan_policy="omit")
df["z_lube_oil"] = stats.zscore(df["lube_oil_pressure_bar"], nan_policy="omit")

threshold = 3.0
anomalies = df[
    (df["z_exhaust"].abs() > threshold) |
    (df["z_lube_oil"].abs() > threshold)
]

print(f"Total records:    {len(df)}")
print(f"Anomalies found:  {len(anomalies)}")
print(anomalies[["timestamp", "exhaust_temp_c", "z_exhaust"]].to_string(index=False))
▸ output (truncated)
Total records:    14400
Anomalies found:  23
          timestamp  exhaust_temp_c  z_exhaust
2024-05-10 01:22:00           487.3       3.21
2024-05-10 09:43:00           491.8       3.34
Domain note: For maritime sensor data with known operational ranges (e.g., main engine exhaust temperature 350–480°C), combine Z-score with hard bounds for more reliable detection than statistical thresholds alone.
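
The domain note can be made concrete with a small standard-library sketch. It flags a reading if either its Z-score exceeds the threshold or it falls outside a hard operating range. The sample temperatures are hypothetical, the 350–480°C range is the one quoted above, and flag_anomalies is an illustrative name. The data are chosen to show a known weakness of Z-scores: a sustained over-temperature period inflates the standard deviation, so none of the readings reaches |z| > 3, yet every one of them breaches the hard bound.

```python
from statistics import mean, pstdev


def flag_anomalies(values, low, high, z_threshold=3.0):
    """Return (index, value, z, reasons) for readings that fail either check."""
    mu = mean(values)
    sigma = pstdev(values)  # population std, matching scipy.stats.zscore (ddof=0)
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma if sigma > 0 else 0.0
        reasons = []
        if abs(z) > z_threshold:
            reasons.append("z-score")
        if not (low <= v <= high):
            reasons.append("out of range")
        if reasons:
            flagged.append((i, v, round(z, 2), reasons))
    return flagged


# Hypothetical exhaust temperatures (°C): 32 normal readings followed by
# 8 readings of sustained over-temperature
temps = [420.0, 424.0] * 16 + [486.0] * 8

for idx, value, z, reasons in flag_anomalies(temps, low=350.0, high=480.0):
    print(idx, value, z, reasons)
```

In this sample the eight high readings have a Z-score of about 2.0, so a purely statistical screen would pass them. The hard bound is what catches the event.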

Use Case 4 — Speed vs. Fuel Consumption Correlation

Pearson r · SOG vs fuel flow · Optimal speed band · scipy + pandas

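Fuel consumption increases roughly with the cube of vessel speed (the relationship behind the "admiralty coefficient"). The Pearson correlation between logged SOG and fuel flow quantifies how tightly consumption tracks speed on a given vessel, and the fuel-per-mile ratio shows where in the logged speed range the vessel ran most efficiently. Consumption that rises at the same speed, or a correlation that weakens over time, points to hull fouling or a degrading engine component. Because Pearson r measures linear association only, it understates a cubic relationship when the speed range is wide.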

When you'd use this: Fleet performance benchmarking, charter speed/consumption warranty analysis, or hull fouling detection between dry-docking cycles.
▸ python — fuel_correlation.py
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("voyage_log.csv")

# Sea passage only — exclude manoeuvring, port, anchor
sea = df[df["nav_status"] == "sea_passage"].copy()
sea = sea.dropna(subset=["sog_kts", "fuel_flow_mt_day"])
sea = sea[(sea["sog_kts"] > 0) & (sea["fuel_flow_mt_day"] > 0)]

corr, p_value = pearsonr(sea["sog_kts"], sea["fuel_flow_mt_day"])
print(f"Pearson r  = {corr:.4f}")
print(f"p-value    = {p_value:.2e}")
print(f"n          = {len(sea)} records")

# Most fuel-efficient speed: lowest fuel/speed ratio (proportional to fuel per nm)
sea["fuel_efficiency"] = sea["fuel_flow_mt_day"] / sea["sog_kts"]
best = sea.loc[sea["fuel_efficiency"].idxmin()]
print(f"Optimal SOG: {best['sog_kts']:.1f} kts")
▸ output
Pearson r  = 0.9312
p-value    = 3.84e-48
n          = 312 records
Optimal SOG: 12.4 kts
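
Since the text describes a roughly cubic speed-consumption law, one way to test it on a given vessel is to fit fuel = k · speed^n in log-log space and read off the exponent n. The sketch below is a minimal least-squares version using only the standard library. The data are synthetic, generated from an exact cube law with a made-up coefficient, and fit_power_law is an illustrative name.

```python
import math


def fit_power_law(speeds, fuel_flows):
    """Least-squares fit of fuel = k * speed**n in log-log space; returns (k, n)."""
    xs = [math.log(v) for v in speeds]
    ys = [math.log(f) for f in fuel_flows]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    n = sxy / sxx                      # slope in log-log space is the exponent
    k = math.exp(y_bar - n * x_bar)    # intercept gives the coefficient
    return k, n


# Synthetic sea-passage points from an exact cube law (hypothetical coefficient)
speeds = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]   # kts
fuel = [0.02 * v ** 3 for v in speeds]          # mt/day

k, n = fit_power_law(speeds, fuel)
print(f"exponent n = {n:.2f}, coefficient k = {k:.3f}")  # exponent n = 3.00, coefficient k = 0.020
```

On real logs the exponent will not be exactly 3, and refitting per voyage or per month gives a simple trend to watch: a coefficient that creeps upward at a stable exponent would be consistent with growing hull resistance, though weather and loading condition need to be ruled out first.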

Use Case 5 — Automated Noon Report Generation

24-hour window aggregation · Noon position · Distance · Fuel consumed · pandas

The noon report is a daily submission from the master recording position, speed, distance, and fuel consumption for the preceding 24-hour period. By convention it is timed at ship's local noon; this article standardizes on a 1200 UTC cut-off so that the window is unambiguous in code. Automating the aggregation from sensor logs eliminates manual entry errors and produces a consistent dataset for voyage performance analysis across a fleet.

When you'd use this: Fleet management system integration, automated voyage reporting, or cross-checking machine-generated values against master-submitted reports to detect data entry errors.
▸ python — noon_report.py
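import pandas as pd

df = pd.read_csv("voyage_log.csv")
# Parse as tz-aware UTC so the index compares cleanly with the window bounds
df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True)
df = df.set_index("timestamp").sort_index()

# 24-hour window ending at today's 12:00 UTC (date taken from the UTC clock,
# not the local clock of the machine running the script)
noon_today     = pd.Timestamp.now(tz="UTC").normalize() + pd.Timedelta(hours=12)
noon_yesterday = noon_today - pd.Timedelta(hours=24)
# Half-open window (noon_yesterday, noon_today]: a row stamped exactly at noon
# belongs to one report only
window = df.loc[(df.index > noon_yesterday) & (df.index <= noon_today)]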

noon_pos = window[["lat", "lon"]].dropna().iloc[-1]

report = {
    "Date (UTC)":       str(noon_today.date()),
    "Noon Position":    f"{noon_pos.lat:.4f}N  {noon_pos.lon:.4f}E",
    "Distance (nm)":    f"{window['distance_nm'].sum():.1f}",
    "Avg Speed (kts)":  f"{window['sog_kts'].mean():.2f}",
    "Fuel HFO (mt)":    f"{window['fuel_hfo_mt'].sum():.2f}",
    "Fuel MGO (mt)":    f"{window['fuel_mgo_mt'].sum():.2f}",
    "ME RPM avg":       f"{window['me_rpm'].mean():.0f}",
}

for k, v in report.items():
    print(f"{k:22} {v}")
▸ output
Date (UTC)             2024-05-10
Noon Position          35.4731N  139.8254E
Distance (nm)          284.3
Avg Speed (kts)        11.85
Fuel HFO (mt)          38.72
Fuel MGO (mt)          1.15
ME RPM avg             98
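
The report above labels every position as N and E, which only holds for vessels in the north-eastern quadrant. A small helper, sketched here under the assumption that the log stores signed decimal degrees (negative for south and west), derives the hemisphere letters from the sign. The name format_position is illustrative.

```python
def format_position(lat: float, lon: float) -> str:
    """Format signed decimal degrees with hemisphere letters."""
    ns = "N" if lat >= 0 else "S"
    ew = "E" if lon >= 0 else "W"
    return f"{abs(lat):.4f}{ns}  {abs(lon):.4f}{ew}"


print(format_position(35.4731, 139.8254))    # 35.4731N  139.8254E
print(format_position(-33.8568, -70.6483))   # 33.8568S  70.6483W
```

In noon_report.py this could replace the hard-coded f-string, for example format_position(noon_pos.lat, noon_pos.lon).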
⚠️ Limitations to Know Before Building in Production
AIS data gaps: S-AIS (satellite AIS) coverage is uneven. Dense shipping lanes have frequent updates; remote ocean areas may have 30–60 minute gaps. Your pipeline must handle irregular timestamps without treating gaps as zeros.
IAS proprietary formats: Many IAS historians export in vendor-specific binary or XML. The read_csv() approach assumes you have already exported to a standard format.
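NMEA checksum validation: pynmea2.parse() raises ChecksumError when a sentence carries a checksum that does not match its contents, but by default it accepts sentences that have no checksum at all. For production pipelines, pass check=True so that sentences missing a checksum are rejected as well.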
Real-time vs. batch: All examples here are batch processing. Live AIS stream processing (serial port or UDP) requires a different architecture — asyncio or a message queue such as Kafka.
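
As an illustration of the first limitation, the sketch below finds reporting gaps explicitly so that they can be recorded as "no data" instead of being filled with zeros. It uses only the standard library. The 10-minute threshold and the sample timestamps are assumptions for the example, and find_gaps is an illustrative name.

```python
from datetime import datetime, timedelta, timezone


def find_gaps(timestamps, max_gap=timedelta(minutes=10)):
    """Return (start, end, duration) for every interval longer than max_gap."""
    ordered = sorted(timestamps)
    return [
        (prev, curr, curr - prev)
        for prev, curr in zip(ordered, ordered[1:])
        if curr - prev > max_gap
    ]


# Hypothetical position-report times for one MMSI (UTC)
t0 = datetime(2024, 5, 10, 0, 0, tzinfo=timezone.utc)
reports = [t0 + timedelta(minutes=m) for m in (0, 3, 6, 9, 54, 57, 60)]

for start, end, duration in find_gaps(reports):
    print(f"gap {start:%H:%M} -> {end:%H:%M} ({duration})")  # gap 00:09 -> 00:54 (0:45:00)
```

Downstream steps can then decide per gap whether to interpolate, split the track, or exclude the interval from averages, which is safer than letting a missing speed be read as 0 kts.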
⚓ Captain Ethan's Take

"The skill gap I see in this industry is not the Python itself — it is knowing which file came from which system, what the timestamps mean, and when a data gap is a quality issue versus an operational event. That context only comes from understanding the architecture we covered in Part 2. The five use cases here are the bridge between architecture knowledge and working code."

⚓ Captain's Take — Key Takeaways

Python gives maritime engineers direct access to the formats ships produce — but the value is not in knowing the language; it is in knowing the data. The five use cases here cover the full applied chain: from raw NMEA sentences to statistical analysis to automated reporting.

pyais decodes raw !AIVDM sentences into structured fields (MMSI, SOG, COG, position, navigation status), the foundation of any AIS analytics pipeline.
pynmea2 + haversine reconstructs a voyage track from $GPRMC VDR logs with great-circle distances — essential for route analysis and post-incident investigation.
Z-score (|z| > 3) applied to IAS historian data is a fast first-pass anomaly detector for machinery sensor spikes before complex ML models are applied.
Pearson r between SOG and fuel flow (r ≈ 0.93 in the sea-passage example above) quantifies the speed-consumption relationship and gives a baseline against which later performance deviations can be flagged.
Pandas 24-hour windowing automates noon report aggregation, eliminating manual entry errors and enabling systematic cross-validation against master-submitted values.
📌 Series Navigation — Maritime AI & Data Foundations
Part 1: What Is the SFI Code — And Why Every Maritime AI & Data Career Starts Here
Part 2: How Ship Sensor Data Flows — From Onboard to Shore
Part 3 (this article): Python for Maritime Engineers — 5 Real Use Cases with Ship Data
Part 4: From SFI to Smart Ship — How IACS UR E26 CBS Inventory Works in Practice
#Python #AIS #NMEA #MaritimeData #PredictiveMaintenance #FuelEfficiency #NoonReport #VDR #DataEngineering #Maritime4.0
Captain Ethan · In Sung Lee
Maritime Intelligence Platform · Cyber · AI · Data
shippauljobs.com



