📝The Challenge
Inaccurate ETAs are a primary source of operational friction in the maritime industry, leading to increased costs, scheduling conflicts, and inefficient port operations. This module addresses this by moving beyond simple calculations to a data-driven prediction model that accounts for real-world, dynamic variables.
Data Flow & Processing
The ETA prediction model relies on a five-stage data pipeline that transforms raw inputs into a dashboard-ready forecast.
Core Algorithms & Calculations
The ETA prediction model is built on a comprehensive data pipeline that transforms raw, diverse inputs into a refined, reliable forecast. This process involves five key stages, from initial data collection to the final continuous calculation.
1. Data Ingestion: Weather & Route Polling
The process begins by polling external APIs for essential voyage context. The system is designed to periodically query Navtor for detailed routing information and Stormglass for real-time and forecasted weather conditions along the vessel’s planned route.
Algorithm Logic:
- Establish secure connections to Navtor and Stormglass APIs.
- Request route data for a specific vessel and voyage.
- Request weather parameters (wind speed, wave height, currents) for the specific geographical points along the route.
- Store the raw, unstructured JSON responses for processing.
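The polling steps above can be sketched as a single cycle. This is a minimal illustration, not the module's actual client code: `fetch_route` and `fetch_weather` are hypothetical callables standing in for the Navtor and Stormglass requests, and the `waypoints` response shape is an assumption.

```python
import json
import time
from typing import Callable, Dict


def poll_voyage_context(
    fetch_route: Callable[[str, str], dict],
    fetch_weather: Callable[[float, float], dict],
    vessel_id: str,
    voyage_id: str,
) -> Dict[str, str]:
    """Run one polling cycle and return the raw payloads as JSON strings."""
    # Hypothetical client call standing in for the routing API request.
    route = fetch_route(vessel_id, voyage_id)

    # Assumed response shape: {"waypoints": [{"lat": ..., "lon": ...}, ...]}
    waypoints = [(p["lat"], p["lon"]) for p in route.get("waypoints", [])]

    # One weather request per waypoint along the planned route.
    weather = [fetch_weather(lat, lon) for lat, lon in waypoints]

    # Keep the responses raw (serialized JSON) for the downstream stages.
    return {
        "polled_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "route_raw": json.dumps(route),
        "weather_raw": json.dumps(weather),
    }
```

Passing the fetchers in as arguments keeps credentials and HTTP details out of the polling logic and makes the cycle easy to exercise with stub functions.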
2. Data Extraction: Noon Report Parsing
A significant portion of operational data arrives in unstructured formats, such as noon report emails. The system parses this text to extract the fields that the later stages depend on.
Algorithm Logic:
- Monitor a designated inbox for incoming noon report emails.
- Use regular expressions (regex) and keyword matching to identify and isolate key data points (e.g., “SOG:”, “Remaining Dist:”, “ETA:”).
- Extract values for vessel speed, remaining distance, fuel consumption, and the reported ETA.
- Temporarily store this extracted, key-value data for the next stage.
import re

def parse_noon_report_email(email_body):
    # Use regex to find key-value pairs in the email text
    # Note: These are simplified patterns for illustration
    sog_pattern = re.compile(r"SOG:\s*([\d\.]+)\s*knots")
    eta_pattern = re.compile(r"ETA:\s*(\d{4}-\d{2}-\d{2}\s*\d{2}:\d{2})")
    sog_match = sog_pattern.search(email_body)
    eta_match = eta_pattern.search(email_body)
    extracted_data = {
        "SOG": sog_match.group(1) if sog_match else None,
        "ETA": eta_match.group(1) if eta_match else None,
        # ... other extracted fields
    }
    return extracted_data
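One way to cover the remaining fields (distance, fuel consumption) without repeating the search logic is a table of patterns. The sketch below is illustrative only: the labels "Fuel Cons:" and the units "nm" and "mt" are assumptions about the report layout, not confirmed by the source.

```python
import re

# Field name -> compiled pattern. Labels and units are assumptions about the
# report layout; adjust them to match the real noon report template.
FIELD_PATTERNS = {
    "SOG": re.compile(r"SOG:\s*([\d.]+)\s*knots", re.IGNORECASE),
    "REMAINING_DIST": re.compile(r"Remaining Dist:\s*([\d.]+)\s*nm", re.IGNORECASE),
    "FUEL_CONS": re.compile(r"Fuel Cons:\s*([\d.]+)\s*mt", re.IGNORECASE),
    "ETA": re.compile(r"ETA:\s*(\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2})"),
}


def extract_fields(email_body: str) -> dict:
    """Return every known field, with None for labels that are absent."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(email_body)
        result[name] = match.group(1) if match else None
    return result


sample = """Noon report
SOG: 12.5 knots
Remaining Dist: 840 nm
ETA: 2024-05-03 14:30
"""
fields = extract_fields(sample)
```

Here the sample has no fuel line, so `fields["FUEL_CONS"]` is `None` while the other three keys hold the captured strings.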
Alternative Method: GPT-Based Parsing
For more complex or less structured reports, a Large Language Model (LLM) can be used for more robust and flexible data extraction.
import json
import openai

def parse_with_gpt(email_body):
    # Prepare a prompt that instructs the model to extract key information
    prompt = f"""
    Extract the following entities from this noon report:
    - Speed Over Ground (SOG) in knots
    - Estimated Time of Arrival (ETA) in YYYY-MM-DD HH:MM format
    Report: "{email_body}"
    Return the result as a JSON object.
    """
    # Call the OpenAI API (or any other LLM provider).
    # Note: openai.Completion is the legacy interface of the pre-1.0 SDK;
    # newer SDK versions expose a different client API.
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=100
    )
    # The model's response should be a JSON string; strip whitespace before parsing
    extracted_data_json = response.choices[0].text.strip()
    return json.loads(extracted_data_json)
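Model output is not guaranteed to be clean JSON; it may be wrapped in prose or miss a key. A defensive parsing step, sketched below, can sit between the model call and `json.loads`. The function name and the `EXPECTED_KEYS` tuple are hypothetical, chosen to match the two fields requested in the prompt.

```python
import json
import re

EXPECTED_KEYS = ("SOG", "ETA")  # assumed key names requested in the prompt


def parse_model_json(raw_text: str) -> dict:
    """Extract the first JSON object from model output and normalise its keys."""
    # Models often wrap JSON in prose; grab the span between the outermost braces.
    match = re.search(r"\{.*\}", raw_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    data = json.loads(match.group(0))  # raises json.JSONDecodeError if malformed
    # Fill in missing keys with None so downstream code sees a stable shape.
    return {key: data.get(key) for key in EXPECTED_KEYS}
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, a caller can catch `ValueError` alone and fall back to the regex parser when the model response is unusable.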
3. Data Standardization & Enrichment
Once data is extracted, it must be converted into a standardized, structured format. This stage involves cleaning the data, converting units, and enriching it with information from other sources.
Code Implementation:
def standardize_report_data(extracted_data, vessel_id):
    # Parse the reported speed (knots) from a string into a float
    sog = extracted_data.get("SOG")
    standardized_speed = float(sog) if sog is not None else None
    # Standardize date and time formats to UTC
    reported_eta_str = extracted_data.get("ETA")
    standardized_eta = convert_to_utc(reported_eta_str) if reported_eta_str else None
    # Enrich with data from other sources (vessel_id identifies the reporting vessel)
    vessel_dwt = get_vessel_particulars(vessel_id)
    # Create a clean, structured data object
    structured_report = {
        "speed_knots": standardized_speed,
        "eta_utc": standardized_eta,
        "vessel_dwt": vessel_dwt,
        # ... other fields
    }
    return structured_report
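The `convert_to_utc` helper is referenced above but not shown. A minimal sketch follows, assuming the reported ETA uses the `YYYY-MM-DD HH:MM` format and that the vessel's local offset from UTC is known; the `utc_offset_hours` parameter is an assumption added for illustration.

```python
from datetime import datetime, timedelta, timezone


def convert_to_utc(timestamp_str, utc_offset_hours=0.0):
    """Parse 'YYYY-MM-DD HH:MM' given in a fixed-offset local time; return UTC."""
    if timestamp_str is None:
        return None
    # Normalise repeated whitespace between the date and the time.
    cleaned = " ".join(timestamp_str.split())
    naive = datetime.strptime(cleaned, "%Y-%m-%d %H:%M")
    # Attach the assumed local offset, then convert to UTC.
    local = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc)
```

For example, `convert_to_utc("2024-05-03 14:30", 2)` yields 12:30 UTC on the same day. Returning a timezone-aware `datetime` rather than a string avoids ambiguity when ETAs from different sources are compared later.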
4. Data Validation: Time-Based Correction
A key source of ETA error stems from the timing of noon report submissions. Reports filed after midday can be incorrectly timestamped to the following day. The system applies a specific logical check to correct this.
Algorithm Logic:
The algorithm checks the timestamp of each incoming noon report. If the report’s time is after 12:00 PM (noon), but the associated date has been advanced to the next day, the algorithm corrects the date back to the actual day of submission.
Code Implementation:
def correct_noon_report_date(report, actual_submission_date):
    # Check if the report time is post-meridian (after 12:00 PM).
    # Assumes report.time is a zero-padded 'HH:MM:SS' string, so that
    # string comparison matches chronological order.
    is_after_noon = report.time > '12:00:00'
    # Check if the date has been incorrectly advanced
    date_is_advanced = report.date > actual_submission_date
    if is_after_noon and date_is_advanced:
        # Revert the date to the correct day
        report.date = actual_submission_date
    return report
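A worked example makes the rule concrete. The sketch below is self-contained and uses `datetime` objects instead of strings; the `NoonReport` dataclass and the `correct_date` name are hypothetical stand-ins for the real report type and function.

```python
import datetime as dt
from dataclasses import dataclass


@dataclass
class NoonReport:  # hypothetical container for the two fields the check needs
    date: dt.date
    time: dt.time


def correct_date(report: NoonReport, submitted_on: dt.date) -> NoonReport:
    """Roll the date back when an afternoon report was stamped with a later day."""
    if report.time > dt.time(12, 0) and report.date > submitted_on:
        report.date = submitted_on
    return report


# Filed at 15:20 on 3 May but stamped 4 May: the date is corrected.
late = correct_date(NoonReport(dt.date(2024, 5, 4), dt.time(15, 20)), dt.date(2024, 5, 3))
# Filed at 11:45 with the right date: left untouched.
on_time = correct_date(NoonReport(dt.date(2024, 5, 3), dt.time(11, 45)), dt.date(2024, 5, 3))
```

Both `late.date` and `on_time.date` end up as 3 May 2024. A morning report with an advanced date is deliberately left alone, since the rule only targets reports filed after noon.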
5. Voyage Status & Geolocation Processing
After the primary ETA is calculated, the data is further enriched with voyage status and geolocation information to provide a complete operational picture on the dashboard.
Geolocation Mapping:
This function maps the vessel’s current location to a standardized geographical region for easier tracking and filtering.
Code Implementation:
def map_location(latitude, longitude):
    # This function would contain logic to map coordinates to defined regions
    # Example (rough, illustrative bounding boxes in decimal degrees):
    if 30.0 < latitude < 60.0 and -30.0 < longitude < 0.0:
        return "North Atlantic"
    elif 5.0 < latitude < 25.0 and 50.0 < longitude < 75.0:
        return "Arabian Sea"
    else:
        return "Unknown Region"
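As the number of regions grows, a chain of `if`/`elif` branches becomes hard to maintain. A table-driven variant is sketched below. The bounding boxes are rough approximations chosen for illustration (the Arabian Sea box here spans about 5 to 25°N and 50 to 75°E), not authoritative sea-area boundaries, and the `Region` type and function name are hypothetical.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    name: str
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float


# Rough, illustrative bounding boxes; not authoritative sea-area boundaries.
REGIONS = [
    Region("North Atlantic", 30.0, 60.0, -30.0, 0.0),
    Region("Arabian Sea", 5.0, 25.0, 50.0, 75.0),
]


def map_location_from_table(latitude: Optional[float], longitude: Optional[float]) -> str:
    """Return the first region whose box contains the point."""
    if latitude is None or longitude is None:
        return "Unknown Region"  # missing position data
    for region in REGIONS:
        in_lat = region.lat_min < latitude < region.lat_max
        in_lon = region.lon_min < longitude < region.lon_max
        if in_lat and in_lon:
            return region.name
    return "Unknown Region"
```

Adding a region then means appending one row to `REGIONS`. Rectangular boxes cannot represent irregular coastlines, so a production system would more likely use polygon lookups.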
Voyage Status Creation:
The system generates a dynamic, human-readable status for each voyage based on its current operational data.
Code Implementation:
def create_voyage_status(voyage_data):
    # This function would create a status string based on vessel activity
    # Example:
    # Treat a missing speed as 0 so the comparison never sees None
    speed = voyage_data.get('speed_knots') or 0
    if speed > 1:
        status = f"En route to {voyage_data.get('destination_port')}"
    else:
        status = f"Alongside at {voyage_data.get('current_port')}"
    return status
Final Data Processing:
This function orchestrates the final data processing steps, bringing together all the calculated and enriched data points into a final, dashboard-ready object.
Code Implementation:
def process_eta_data(voyage_id):
    # This function would be the main orchestrator for a single voyage
    # 1. Fetch the latest validated report data
    report_data = get_validated_report(voyage_id)
    # 2. Calculate the live ETA
    live_eta = calculate_live_eta(report_data)
    # 3. Map the location
    geo_location = map_location(report_data.get('lat'), report_data.get('lon'))
    # 4. Create a human-readable status
    voyage_status = create_voyage_status(report_data)
    # 5. Assemble the final data object for the dashboard
    dashboard_payload = {
        "voyage_id": voyage_id,
        "live_eta_utc": live_eta,
        "location_region": geo_location,
        "current_status": voyage_status,
        "last_updated": get_current_utc_time()
    }
    return dashboard_payload
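The orchestrator calls `calculate_live_eta`, which is not shown in this section. As a point of reference only, the simplest possible estimate divides remaining distance by current speed. The sketch below assumes constant speed and ignores the weather and routing inputs the actual model is described as using; the function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def estimate_eta(
    remaining_nm: float,
    speed_knots: float,
    as_of: datetime,
) -> Optional[datetime]:
    """ETA = as_of + remaining distance / speed (1 knot = 1 nautical mile per hour)."""
    if speed_knots <= 0:
        return None  # vessel stopped: no meaningful ETA from speed alone
    hours_remaining = remaining_nm / speed_knots
    return as_of + timedelta(hours=hours_remaining)


# 840 nm at 12 knots is 70 hours of steaming from the report time.
position_time = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
eta = estimate_eta(remaining_nm=840.0, speed_knots=12.0, as_of=position_time)
```

With these inputs `eta` is 2024-05-04 10:00 UTC. A baseline like this is useful mainly as a sanity check against the reported ETA and the model's prediction.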