Abstract
This paper presents an intelligent decision optimization model that integrates distributed blockchain technology with federated reinforcement learning to address critical challenges in ship traffic collaborative supervision. Traditional maritime traffic monitoring systems suffer from data silos, privacy concerns, and centralized decision-making bottlenecks that impede effective multi-jurisdictional coordination. The proposed framework employs a multi-layered architecture consisting of a data layer, a blockchain layer, a federated learning layer, and a decision layer to enable secure data sharing while preserving operational autonomy among maritime authorities. The distributed blockchain mechanism ensures data integrity and immutability through cryptographic protocols and smart contracts, while the federated reinforcement learning algorithm enables privacy-preserving collaborative model training without exposing sensitive commercial information. Experimental validation demonstrates 93.6% decision accuracy, 520 ms average response time, and a throughput of 285 transactions per second. Case studies involving emergency collision avoidance, abnormal behavior identification, and search-and-rescue coordination confirm the system’s practical effectiveness, achieving a 40% reduction in incident response times and a 60% improvement in cross-agency collaboration efficiency. The research provides a foundation for next-generation maritime traffic management systems that require secure multi-party collaboration and intelligent decision optimization.
Introduction
The maritime transportation industry plays a pivotal role in global trade, with over 90% of international cargo transported by sea, making efficient ship traffic supervision crucial for ensuring maritime safety and economic stability1. Traditional ship traffic monitoring systems face significant challenges in the era of digital transformation, particularly in addressing the complexities of multi-jurisdictional waters and the increasing volume of maritime traffic2. The fragmentation of monitoring systems across different maritime authorities creates substantial data silos, preventing comprehensive situational awareness and coordinated decision-making processes that are essential for effective traffic management3.
Contemporary ship traffic supervision systems suffer from several critical limitations that impede their effectiveness in modern maritime environments. Data isolation between different monitoring stations and maritime authorities results in incomplete traffic pictures, leading to suboptimal routing decisions and potential safety hazards4. Privacy concerns regarding sensitive commercial shipping data further complicate information sharing between stakeholders, as shipping companies are reluctant to disclose proprietary route information and cargo details5. Additionally, the centralized nature of current decision-making systems creates bottlenecks that slow responses during critical situations, particularly in high-traffic maritime corridors where rapid coordination is essential6.
The emergence of distributed blockchain technology offers promising solutions to address these fundamental challenges in maritime traffic supervision. Blockchain’s inherent characteristics of decentralization, immutability, and transparency provide a robust framework for secure data sharing while maintaining privacy through cryptographic mechanisms7. The integration of smart contracts enables automated compliance monitoring and real-time enforcement of maritime regulations without requiring centralized oversight8. Furthermore, the distributed nature of blockchain networks ensures system resilience and eliminates single points of failure that plague traditional centralized monitoring systems.
Federated reinforcement learning represents another breakthrough technology that addresses the dual challenges of privacy preservation and intelligent decision-making in ship traffic management. This approach enables multiple maritime authorities to collaboratively train decision-making models without sharing sensitive raw data, thereby preserving commercial confidentiality while improving overall system performance. The federated learning paradigm allows each participating node to contribute to the global model while maintaining local data sovereignty, creating a win-win scenario for all stakeholders involved in maritime traffic supervision.
The convergence of distributed blockchain technology and federated reinforcement learning presents unprecedented opportunities for developing intelligent ship traffic supervision systems that can overcome existing limitations. Unlike traditional IMO e-Navigation systems and Vessel Traffic Services (VTS) that rely on centralized architectures and manual coordination protocols, the proposed integrated approach addresses fundamental interoperability challenges between different maritime authorities while maintaining data sovereignty. By leveraging blockchain’s secure data sharing capabilities and federated learning’s privacy-preserving collaborative intelligence, this integrated approach enables the creation of sophisticated decision optimization models that can adapt to dynamic maritime environments while respecting stakeholder privacy requirements, overcoming the static rule-based decision-making limitations inherent in conventional VTS implementations.
This research aims to develop a comprehensive distributed blockchain and federated reinforcement learning intelligent decision optimization model specifically designed for collaborative ship traffic supervision. The primary objectives include establishing a secure and efficient data sharing mechanism among maritime authorities, developing privacy-preserving intelligent decision-making algorithms that can optimize traffic flow and enhance safety, and creating a scalable framework that can accommodate the growing complexity of modern maritime transportation networks.
The main innovations of this work lie in the novel integration of blockchain consensus mechanisms with federated reinforcement learning algorithms to create a hybrid system that addresses both technical and regulatory challenges in maritime traffic supervision. As demonstrated in Table 1, the proposed method achieves superior performance compared to existing state-of-the-art approaches across multiple evaluation dimensions.
The proposed model introduces a multi-layer architecture that separates data sharing, privacy protection, and decision optimization into distinct but interconnected components, enabling flexible deployment across different maritime jurisdictions while maintaining system coherence and effectiveness.
The paper makes several significant contributions to the field of intelligent maritime traffic management. First, it presents a comprehensive theoretical framework that combines distributed ledger technology with collaborative machine learning to address the fundamental challenges of data silos and privacy concerns in maritime supervision. Second, it develops practical algorithms and protocols that enable real-world implementation of the proposed system across multiple maritime authorities. Third, it provides extensive experimental validation demonstrating the effectiveness of the integrated approach in improving decision-making efficiency and system reliability compared to traditional centralized systems.
The remainder of this paper is organized as follows: Section II reviews related work in blockchain applications and federated learning for maritime systems; Section III presents the system architecture and theoretical foundations of the proposed model; Section IV details the implementation algorithms and protocols; Section V provides comprehensive experimental results and performance analysis; and Section VI concludes with discussions of implications and future research directions.
Related technical theory foundation
Distributed blockchain technology principles
Blockchain technology maintains a continuously growing list of cryptographically linked records in a decentralized network, ensuring data integrity without central authorities13. Each block contains the previous block’s hash, timestamp, and transaction data, forming an immutable chain that prevents unauthorized modifications14.
The distributed ledger mechanism enables multiple parties to maintain synchronized copies of the same database across a network, eliminating the need for centralized data storage and reducing single points of failure15. The cryptographic hash function employed in blockchain systems can be mathematically represented as:
This cryptographic linking ensures data integrity across the blockchain network.
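The linking described above can be sketched in a few lines of Python. This is a minimal illustration, assuming SHA-256 as the hash function and a JSON encoding of the block fields; the function names `block_hash`, `build_chain`, and `is_valid` are hypothetical and do not come from the proposed system.

```python
import hashlib
import json


def block_hash(prev_hash: str, timestamp: float, transactions: list) -> str:
    """SHA-256 digest over the previous hash, timestamp, and transaction data."""
    payload = json.dumps(
        {"prev": prev_hash, "ts": timestamp, "tx": transactions},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(batches: list) -> list:
    """Link each batch of transactions to the hash of the preceding block."""
    chain = []
    prev = "0" * 64  # assumed placeholder hash for the first block
    for i, txs in enumerate(batches):
        digest = block_hash(prev, float(i), txs)
        chain.append({"prev": prev, "ts": float(i), "tx": txs, "hash": digest})
        prev = digest
    return chain


def is_valid(chain: list) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev:
            return False
        if block_hash(block["prev"], block["ts"], block["tx"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True
```

Because each block's hash covers the previous block's hash, altering any stored transaction changes the recomputed digest and causes `is_valid` to fail from that block onward.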
As illustrated in Fig. 1, the distributed blockchain architecture demonstrates the interconnected network of nodes, each maintaining identical copies of the blockchain ledger, with consensus mechanisms ensuring data consistency across the network. The diagram shows how transactions are validated, blocks are created, and the chain is propagated throughout the distributed network.
Consensus mechanisms form the cornerstone of blockchain technology, ensuring agreement among distributed nodes regarding the validity of transactions and the state of the ledger16. The most widely implemented consensus algorithms include Proof-of-Work (PoW), Proof-of-Stake (PoS), and Practical Byzantine Fault Tolerance (PBFT), each offering different trade-offs between security, energy efficiency, and transaction throughput. The consensus probability for a valid block can be expressed as:
where \(N_{honest}\) represents the number of honest nodes and \(N_{total}\) is the total number of participating nodes in the network.
Smart contracts represent programmable, self-executing contracts with the terms of agreement directly written into code, enabling automated enforcement of contractual obligations without intermediaries17. These autonomous programs run on the blockchain network and automatically execute predefined actions when specified conditions are met, providing transparency, efficiency, and cost reduction in various applications. The deterministic nature of smart contracts ensures consistent execution across all network nodes, maintaining system reliability and predictability.
The inherent advantages of blockchain technology in terms of data security, transparency, and decentralization make it particularly suitable for maritime supervision applications18. The immutable nature of blockchain records provides tamper-proof audit trails for vessel movements, cargo manifests, and regulatory compliance documentation. The transparency feature enables all authorized stakeholders to access real-time information while maintaining data integrity, fostering trust and collaboration among maritime authorities, shipping companies, and port operators. The decentralized architecture eliminates reliance on single controlling entities, reducing systemic risks and improving system resilience against cyber attacks and technical failures.
In maritime regulatory contexts, blockchain technology addresses critical challenges related to data provenance, inter-agency coordination, and regulatory compliance verification. The technology’s ability to create permanent, auditable records of vessel activities, port clearances, and safety inspections provides maritime authorities with comprehensive oversight capabilities while reducing administrative overhead and processing delays.
Federated reinforcement learning theory foundation
Reinforcement learning enables intelligent agents to learn optimal policies through environmental interaction, formalized as a Markov Decision Process (MDP) defined by the tuple \((S, A, P, R, \gamma)\) of states, actions, transition probabilities, rewards, and discount factor19,20.
Federated learning has demonstrated successful applications in various domains. In smart city traffic management, multiple traffic control centers collaborate to optimize signal timing while preserving local operational data12. In unmanned aerial vehicle networks, distributed drones share navigation knowledge without centralizing flight path information21. These applications validate the feasibility of federated approaches in transportation systems.
The core objective of reinforcement learning involves finding an optimal policy \(\pi^{*}\) that maximizes the expected cumulative reward, mathematically expressed as:
where \(r_{t+1}\) represents the reward received at time step \(t+1\) following policy \(\pi\).
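Written out with the symbols defined here and the discount factor \(\gamma\) from the MDP tuple, the standard discounted-return objective takes the following form. This is the textbook expression, given as a sketch; the infinite horizon and the start index \(t = 0\) are assumptions.

```latex
\[
\pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r_{t+1} \right]
\]
```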
Q-learning represents one of the most influential model-free reinforcement learning algorithms that learns the optimal action-value function without requiring explicit knowledge of the environment dynamics22. The Q-learning algorithm iteratively updates the action-value function using the Bellman equation:
where \(\alpha\) denotes the learning rate, \(s_t\) and \(a_t\) represent the state and action at time \(t\), respectively, and \(r_{t+1}\) is the immediate reward.
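The tabular Q-learning update can be illustrated with a short Python sketch. It applies the standard rule \(Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha\,[\,r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t)\,]\); the state and action labels used below are hypothetical placeholders rather than part of the proposed model.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: two actions, unseen state-action pairs default to 0.0.
q_table = defaultdict(float)
q_table[("congested", "hold")] = 2.0
new_value = q_update(
    q_table, "clear", "reroute", reward=1.0,
    next_state="congested", actions=["hold", "reroute"],
    alpha=0.5, gamma=0.9,
)
```

With these inputs the target is 1.0 + 0.9 × 2.0 = 2.8, and the estimate moves halfway from 0 toward it.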
Federated learning emerges as a distributed machine learning approach that enables multiple participants to collaboratively train a shared model while keeping their data locally stored and private23. This paradigm addresses the critical challenge of data privacy in multi-party machine learning scenarios by allowing participants to contribute to model training without revealing their sensitive datasets. The federated learning framework employs various privacy protection mechanisms, including differential privacy, secure multi-party computation, and homomorphic encryption, to ensure that individual participant data remains confidential throughout the collaborative training process.
The parameter aggregation methodology in federated learning typically employs weighted averaging techniques to combine local model updates from participating nodes into a global model. The most common aggregation function can be expressed as:
where \(w_{global}^{(t+1)}\) represents the global model parameters at iteration \(t+1\), \(w_k^{(t+1)}\) denotes the local model parameters from participant \(k\), \(n_k\) is the number of training samples at participant \(k\), and \(n\) is the total number of training samples across all participants.
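Using these symbols, the standard sample-weighted average (commonly known as FedAvg) takes the following form. It is given here as a sketch; \(K\), the number of participants, is a symbol introduced for illustration.

```latex
\[
w_{global}^{(t+1)} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_k^{(t+1)},
\qquad n = \sum_{k=1}^{K} n_k
\]
```

For example, with two participants holding 1 and 3 samples, the second participant's parameters receive weight 3/4 in the global model.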
Federated reinforcement learning combines the principles of federated learning with reinforcement learning to enable multiple agents to collaboratively learn optimal policies while preserving data privacy and local autonomy24. This integration provides significant theoretical advantages in multi-agent collaborative decision-making scenarios, particularly in environments where agents possess heterogeneous data distributions and face varying local conditions. The federated approach allows each agent to maintain its local policy while contributing to a global knowledge base that benefits all participants.
The theoretical advantages of federated reinforcement learning in multi-agent systems include enhanced sample efficiency through knowledge sharing, improved robustness through diverse training experiences, and maintained privacy through localized data processing. These characteristics make federated reinforcement learning particularly suitable for ship traffic supervision applications, where multiple maritime authorities need to coordinate decisions while maintaining operational independence and protecting sensitive commercial information. The distributed nature of the approach aligns with the decentralized structure of maritime governance, enabling seamless integration with existing regulatory frameworks while providing enhanced collaborative capabilities for complex traffic management scenarios.
Ship traffic collaborative supervision requirements analysis
Contemporary ship traffic supervision systems face unprecedented challenges in managing the increasing complexity and volume of maritime transportation activities across global shipping routes25. Current implementations such as IMO e-Navigation systems, the U.S. Maritime Safety and Security Information System (MSSIS), and regional Vessel Traffic Services (VTS) represent significant technological advances but remain limited by fragmented oversight mechanisms that operate independently across different jurisdictions26,27,28,29. These systems, while effective within their operational domains, result in significant coordination gaps and inefficiencies in information sharing between maritime authorities, particularly in cross-border maritime scenarios where seamless interoperability becomes critical for effective supervision. These systemic deficiencies manifest in delayed incident response times, incomplete situational awareness, and suboptimal resource allocation during critical maritime operations.
Multi-departmental coordination represents one of the most critical pain points in existing ship traffic supervision frameworks, as maritime authorities, port operators, coast guards, and environmental agencies often operate with incompatible information systems and divergent operational protocols. The lack of standardized communication channels and data formats creates substantial barriers to effective collaboration, particularly during emergency situations that require rapid coordinated responses. Additionally, jurisdictional boundaries and regulatory differences between nations further complicate the coordination process, leading to delayed decision-making and potential safety hazards in international waters.
Real-time monitoring capabilities constitute another fundamental requirement that remains inadequately addressed by traditional supervision systems. Current monitoring infrastructures rely heavily on periodic reporting and manual data collection processes that introduce significant temporal delays between actual vessel activities and regulatory awareness30. The absence of continuous real-time tracking and automated data processing capabilities prevents maritime authorities from maintaining comprehensive situational awareness, particularly in high-traffic shipping corridors where vessel density and movement complexity exceed human monitoring capacities.
Risk prediction and early warning systems represent critical components of effective ship traffic supervision that remain underdeveloped in conventional regulatory frameworks. Traditional approaches rely primarily on reactive measures that respond to incidents after they occur, rather than proactive systems that can identify and mitigate potential risks before they escalate into serious safety or environmental threats. The lack of predictive analytics capabilities prevents maritime authorities from implementing preventive measures that could significantly reduce the frequency and severity of maritime accidents.
Security threat response constitutes another fundamental challenge, encompassing the system’s ability to detect, mitigate, and recover from cyber attacks, data breaches, and malicious activities targeting maritime infrastructure31.
The limitations of traditional monitoring models stem from their centralized architectures that create single points of failure and bottlenecks in information processing and decision-making workflows32. Centralized systems suffer from scalability constraints that become increasingly problematic as maritime traffic volumes continue to grow exponentially. Furthermore, the hierarchical nature of traditional supervision structures introduces communication delays and reduces system responsiveness during time-critical situations that require immediate coordinated action.
The necessity for emerging technology-based collaborative supervision solutions becomes evident when considering the exponential growth in maritime traffic complexity and the corresponding increase in regulatory oversight requirements. Traditional supervision models lack the technological sophistication necessary to address modern challenges such as autonomous vessel integration, environmental compliance monitoring, and cybersecurity threats that increasingly affect maritime operations. The integration of distributed blockchain technology and federated reinforcement learning offers promising solutions to overcome these fundamental limitations by providing secure, decentralized, and intelligent supervision capabilities.
The feasibility of implementing advanced collaborative supervision systems is supported by the rapid advancement in maritime digitalization initiatives and the increasing adoption of Internet of Things (IoT) technologies in shipping operations. Modern vessels are increasingly equipped with sophisticated sensors and communication systems that generate vast amounts of real-time operational data, creating opportunities for intelligent supervision systems to leverage this information for enhanced decision-making and risk management. The convergence of these technological trends with the urgent need for improved maritime safety and environmental protection creates favorable conditions for deploying innovative supervision solutions that can address the shortcomings of traditional regulatory approaches.
Intelligent decision optimization model design based on distributed blockchain and federated reinforcement learning
System overall architecture design
The proposed intelligent decision optimization model employs a multi-layered architecture that seamlessly integrates distributed blockchain technology with federated reinforcement learning to address the complex requirements of ship traffic collaborative supervision33. The system architecture consists of four distinct but interconnected layers: the data layer, blockchain layer, federated learning layer, and decision layer, each designed to fulfill specific functional requirements while maintaining overall system coherence and operational efficiency.
As depicted in Fig. 2, the system overall architecture demonstrates the hierarchical organization of functional modules and their interdependencies within the collaborative supervision framework. The architecture ensures scalable deployment across multiple maritime authorities while maintaining data sovereignty and enabling intelligent collaborative decision-making through advanced machine learning techniques.
The data layer forms the foundation of the system architecture, responsible for collecting, preprocessing, and managing heterogeneous maritime data from diverse sources including vessel Automatic Identification Systems (AIS), radar networks, satellite surveillance, and port management systems34. This layer implements standardized data formats and communication protocols to ensure interoperability between different data sources and maritime authorities. The data processing efficiency can be mathematically expressed as:
where \(V_i\) represents the data volume from source \(i\), \(Q_i\) denotes the data quality index, \(T_i\) is the processing time, and \(C_i\) indicates the computational cost.
The blockchain layer provides the secure, decentralized infrastructure for data sharing and transaction recording among participating maritime authorities. This layer implements a customized Practical Byzantine Fault Tolerance (PBFT) variant specifically optimized for maritime applications, with detailed specifications outlined in Table 2. The validator set comprises 7–21 nodes; a PBFT network of \(n\) validators tolerates up to \(f = \lfloor (n-1)/3 \rfloor\) Byzantine nodes, so this range covers \(f = 2\) to \(f = 6\). The consensus mechanism operates with configurable parameters: block size of 2 MB, block generation interval of 3 s, endorsement timeout of 5 s, and ordering service batch timeout of 2 s.
In the consensus model, \(P_j\) represents the probability of successful consensus from validator \(j\), and \(m\) is the total number of validators in the network.
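The relationship between validator set size and Byzantine fault tolerance can be made concrete with a short sketch. It uses the standard PBFT bound \(n \ge 3f + 1\) and a quorum of \(\lceil (n + f + 1)/2 \rceil\), which reduces to \(2f + 1\) when \(n = 3f + 1\); the function names are hypothetical, and the customized variant described above may size its quorums differently.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 (the standard PBFT bound)."""
    if n < 1:
        raise ValueError("validator count must be positive")
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Smallest quorum such that any two quorums share at least f + 1 validators.

    Equals ceil((n + f + 1) / 2), which is 2f + 1 when n = 3f + 1.
    """
    f = max_byzantine_faults(n)
    return (n + f) // 2 + 1


# Tabulate the validator range stated in the text (7 to 21 nodes).
sizing = {n: (max_byzantine_faults(n), quorum_size(n)) for n in range(7, 22)}
```

The overlap of at least \(f + 1\) validators guarantees that any two quorums share at least one honest node, which is what prevents two conflicting blocks from both being committed.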
Figure 3 illustrates the detailed interaction flows between different system modules, showcasing how data flows from collection through processing, blockchain verification, federated learning, and ultimately to decision execution. The diagram emphasizes the bidirectional communication patterns and feedback loops that enable continuous system improvement and adaptation.
The federated learning layer orchestrates collaborative model training across distributed maritime authorities while preserving data privacy and local autonomy35. This layer implements advanced aggregation algorithms that combine local model updates from participating nodes to create globally optimized decision models. The federated learning framework employs differential privacy mechanisms and secure aggregation protocols to ensure that sensitive operational data remains protected throughout the collaborative training process. The noise-perturbed global model update can be expressed as:
where \(\theta_{global}^{(t+1)}\) represents the global model parameters, \(\theta_k^{(t+1)}\) denotes local parameters from participant \(k\), \(n_k\) is the local dataset size, \(N\) is the total dataset size, and \(\epsilon \cdot \mathcal{N}(0, \sigma^{2})\) introduces differential privacy noise.
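A minimal Python sketch of this aggregation step follows. It assumes the update is the sample-weighted average of the local parameters plus an independent Gaussian term \(\epsilon \cdot \mathcal{N}(0, \sigma^{2})\) per coordinate, which is one reading of the symbols defined here; the function name `noisy_fed_avg` and the default values are hypothetical.

```python
import random


def noisy_fed_avg(local_params, sample_counts, epsilon=0.1, sigma=1.0, rng=None):
    """Sample-weighted average of local parameters plus scaled Gaussian noise.

    local_params:  list of parameter vectors, one per participant
    sample_counts: local dataset sizes n_k, in the same order
    epsilon, sigma: noise scale and standard deviation (assumed defaults)
    """
    rng = rng or random.Random()
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(local_params[0])
    # Weighted average: each participant contributes in proportion to n_k / N.
    aggregated = [
        sum((n_k / total) * theta_k[i] for theta_k, n_k in zip(local_params, sample_counts))
        for i in range(dim)
    ]
    # Perturb every coordinate with epsilon * N(0, sigma^2).
    return [value + epsilon * rng.gauss(0.0, sigma) for value in aggregated]


# Hypothetical usage with a fixed seed so the run is repeatable.
theta = noisy_fed_avg([[0.0, 2.0], [4.0, 6.0]], [1, 1], rng=random.Random(7))
```

In a formal differential privacy mechanism the local updates would also be norm-clipped and \(\sigma\) calibrated to a target privacy budget; those steps are omitted here.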
The decision layer synthesizes information from lower layers to generate intelligent recommendations and automated responses for ship traffic management scenarios36. This layer incorporates sophisticated reinforcement learning algorithms that continuously adapt to changing maritime conditions and learn from historical decision outcomes. The decision layer interfaces with existing maritime management systems to provide seamless integration with current operational workflows while enhancing decision-making capabilities through artificial intelligence.
The architectural design ensures system security through multiple complementary mechanisms including blockchain-based access control, cryptographic data protection, and distributed consensus validation. The system addresses specific security threats including: (1) Sybil attacks where malicious nodes create multiple false identities, mitigated through verified credential requirements37; (2) 51% attacks attempting to control consensus mechanisms, prevented by diversified validator networks38; (3) Data poisoning attacks targeting federated learning models, countered through robust aggregation algorithms and outlier detection24; (4) Privacy inference attacks attempting to extract sensitive information, addressed via differential privacy mechanisms39.
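As an illustration of the robust aggregation and outlier detection mentioned for data poisoning, the sketch below uses a coordinate-wise median, one common robust aggregator, and flags updates that lie far from it. This technique is swapped in for illustration only; the source does not state which aggregator is used, and the function names and threshold are hypothetical.

```python
import math
import statistics


def median_aggregate(updates):
    """Coordinate-wise median of the participants' parameter vectors."""
    dim = len(updates[0])
    return [statistics.median(u[i] for u in updates) for i in range(dim)]


def flag_outliers(updates, threshold):
    """Indices of updates whose Euclidean distance to the median exceeds threshold."""
    center = median_aggregate(updates)
    flagged = []
    for idx, update in enumerate(updates):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(update, center)))
        if dist > threshold:
            flagged.append(idx)
    return flagged


# Hypothetical round: three consistent updates and one poisoned update.
round_updates = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [50.0, -50.0]]
robust_model = median_aggregate(round_updates)
suspects = flag_outliers(round_updates, threshold=5.0)
```

Unlike a weighted mean, the median is barely moved by the single extreme update, so one poisoned participant cannot drag the global model arbitrarily far.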
Security is further enhanced through the federated learning approach that minimizes data exposure risks by keeping sensitive information locally stored while enabling collaborative intelligence development. The multi-layered security model provides defense in depth against various cyber threats and ensures system resilience against node failures or malicious attacks.
System efficiency is optimized through intelligent load balancing, adaptive resource allocation, and dynamic consensus mechanisms that adjust to varying network conditions and traffic demands. The architecture supports horizontal scaling to accommodate growing numbers of participating authorities and increasing data volumes without compromising performance. Real-time processing capabilities ensure that time-critical decisions can be made within acceptable latency constraints, while batch processing modes handle non-urgent analytical tasks efficiently.
The modular design philosophy enables flexible deployment configurations that can be customized to meet specific regional requirements and regulatory frameworks while maintaining interoperability with the global supervision network. This approach facilitates gradual system adoption and allows maritime authorities to integrate new capabilities incrementally without disrupting existing operations.
Distributed blockchain data management mechanism
The distributed blockchain data management mechanism forms the core infrastructure for secure and trustworthy data exchange among multiple maritime authorities, implementing sophisticated data governance protocols that ensure data integrity, confidentiality, and availability across the collaborative supervision network40. The mechanism employs a multi-tiered data storage strategy that categorizes maritime information based on sensitivity levels, access requirements, and operational criticality to optimize both security and performance characteristics of the blockchain-based data management system.
The data on-chain strategy implements a hybrid approach that distinguishes between on-chain and off-chain storage to balance security requirements with storage efficiency and transaction costs41. Critical metadata, transaction records, and verification hashes are stored directly on the blockchain to ensure immutability and auditability, while large-volume operational data such as radar images and detailed vessel tracking information are stored in distributed off-chain storage systems with cryptographic references maintained on the blockchain. The data integrity verification process can be mathematically expressed as:
where \(H_{verify}\) represents the verification hash, \(D_i\) denotes the data segment \(i\), \(K_i\) is the corresponding encryption key, and \(p\) is a large prime number used for modular arithmetic.
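The hybrid on-chain/off-chain strategy can be sketched as follows. This simplified Python illustration assumes a plain SHA-256 digest as the on-chain reference rather than the keyed, modular verification hash defined here; the class `HybridStore` and its fields are hypothetical.

```python
import hashlib


class HybridStore:
    """Large payloads live off-chain; only digests and metadata go on-chain."""

    def __init__(self):
        self.off_chain = {}  # digest -> raw payload bytes (e.g. radar images)
        self.on_chain = []   # append-only list of small reference records

    def put(self, record_id: str, payload: bytes, metadata: dict) -> str:
        """Store the payload off-chain and append its digest to the ledger."""
        digest = hashlib.sha256(payload).hexdigest()
        self.off_chain[digest] = payload
        self.on_chain.append({"id": record_id, "digest": digest, "meta": dict(metadata)})
        return digest

    def verify(self, record_id: str) -> bool:
        """Recompute the payload digest and compare it with the on-chain reference."""
        for record in self.on_chain:
            if record["id"] == record_id:
                payload = self.off_chain.get(record["digest"])
                if payload is None:
                    return False
                return hashlib.sha256(payload).hexdigest() == record["digest"]
        return False


# Hypothetical usage with a placeholder radar frame.
store = HybridStore()
ref = store.put("radar-001", b"radar frame bytes", {"source": "radar"})
```

Only the 64-character digest and a small metadata record are written to the ledger, so the on-chain footprint stays constant regardless of payload size.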
As shown in Table 3, the blockchain data structure specification defines the storage strategy, access control mechanism and verification protocol for different types of ship traffic data, ensuring that all types of data can be efficiently shared and collaboratively processed across departments while meeting security requirements.
The smart contract design adopts a modular architecture comprising four core components: a data access control contract, a verification logic contract, a permission management contract, and an audit trail contract42. The data access control contract implements role-based, fine-grained permission management to ensure that only authorized maritime authorities can access specific types of sensitive information. The verification logic contract performs complex data integrity checks and business rule validation, automatically handling data quality assessment and anomaly detection tasks. The permission management contract provides dynamic permission allocation and revocation, supporting temporary permission escalation and multi-party authorization in emergency situations.
The data verification protocol implements a multi-level security mechanism that combines cryptographic techniques with distributed consensus algorithms to ensure the authenticity and integrity of data. The verification process uses zero-knowledge proofs, allowing data providers to prove the validity of data without disclosing its content, which protects commercially sensitive information. The probability of successful data verification can be expressed as:
where \(P_{validator_j}\) represents the reliability probability of verification node \(j\), \(P_{attack_k}\) represents the success probability of attack type \(k\), and \(m\) and \(l\) are the number of verification nodes and the number of potential attack types, respectively.
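The access control and permission management logic described above can be sketched in plain Python. This is an off-chain illustration under stated assumptions, not smart contract code: the class `AccessControl`, the role names, the two-approver rule, and the explicit `now` timestamps are all hypothetical.

```python
class AccessControl:
    """Role-based permissions with time-limited, multi-party-approved escalation."""

    def __init__(self, role_permissions):
        self.role_permissions = role_permissions  # role -> set of data types
        self.roles = {}                           # authority -> role
        self.temporary = {}                       # (authority, data type) -> expiry

    def assign(self, authority, role):
        """Bind an authority to a role defined in role_permissions."""
        if role not in self.role_permissions:
            raise KeyError(f"unknown role: {role}")
        self.roles[authority] = role

    def escalate(self, authority, data_type, approvers, expires_at, min_approvers=2):
        """Grant temporary access if enough distinct other authorities approve."""
        independent = set(approvers) - {authority}
        if len(independent) < min_approvers:
            return False
        self.temporary[(authority, data_type)] = expires_at
        return True

    def can_access(self, authority, data_type, now):
        """Allow access via the standing role or an unexpired temporary grant."""
        role = self.roles.get(authority)
        if role is not None and data_type in self.role_permissions[role]:
            return True
        expiry = self.temporary.get((authority, data_type))
        return expiry is not None and now < expiry


# Hypothetical configuration with two roles.
acl = AccessControl({"coast_guard": {"ais"}, "port": {"ais", "cargo_manifest"}})
acl.assign("cg_east", "coast_guard")
```

Requiring approvals from authorities other than the requester mirrors the multi-party authorization idea, and the expiry timestamp ensures an emergency escalation lapses without manual revocation.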
The data sharing protocol establishes a standardized interface and communication mechanism to support seamless data exchange and collaborative operations between heterogeneous systems26. The protocol defines data format standards, transmission encryption specifications, identity authentication processes, and access log requirements to ensure the security and traceability of cross-organizational data sharing. The sharing protocol supports both real-time data streaming and batch data transmission modes, and automatically selects the optimal transmission strategy based on business needs and network conditions.
The decentralized data governance system achieves multi-party data management decision-making through a distributed governance mechanism, avoiding the risk of a single authority monopolizing data control. The governance system establishes a data policy formulation process based on a voting mechanism, and important data management decisions require the consensus of the majority of participants before they can be implemented. The governance framework also includes functional modules such as dispute resolution mechanisms, data quality standard formulation, and privacy protection policy updates to ensure that the data management system can adapt to changing regulatory requirements and technological development trends.
This mechanism establishes a highly trusted data management environment through the tamper-proof characteristics of blockchain and the automatic execution capabilities of smart contracts, providing a solid technical foundation for collaborative supervision among maritime authorities. The distributed architecture eliminates the single point of failure inherent in traditional centralized systems, improves the reliability and attack resistance of the overall system, and ensures that all participants have equal status and decision-making weight in the data governance process.
Federated reinforcement learning collaborative decision algorithm
The federated reinforcement learning collaborative decision algorithm establishes a sophisticated multi-agent framework that enables distributed maritime authorities to jointly optimize ship traffic management decisions while preserving data privacy and operational autonomy43. The algorithm architecture employs a hierarchical learning structure where local agents at each maritime authority independently interact with their respective environments while contributing to a global knowledge base through privacy-preserving parameter sharing mechanisms.
The state space design encompasses comprehensive maritime situational awareness information that captures the dynamic characteristics of ship traffic environments across multiple jurisdictions. The state vector \(\:{S}_{t}\) is formally defined as:
\[ {S}_{t}=\left\{{V}_{t},{W}_{t},{T}_{t},{R}_{t},{C}_{t}\right\} \]
where \(\:{V}_{t}\) represents vessel positions and trajectories, \(\:{W}_{t}\) denotes weather and environmental conditions, \(\:{T}_{t}\) captures traffic density and flow patterns, \(\:{R}_{t}\) includes regulatory compliance status, and \(\:{C}_{t}\) encompasses communication and coordination states between participating authorities. Each state component is normalized and encoded to ensure compatibility across different regional systems and data formats.
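The normalization and encoding step can be illustrated with a minimal Python sketch. The five fields mirror \(\:{V}_{t}\), \(\:{W}_{t}\), \(\:{T}_{t}\), \(\:{R}_{t}\), and \(\:{C}_{t}\); the specific features, their value ranges, and the `min_max` helper are hypothetical choices made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


def min_max(values: Sequence[float], bounds: Sequence[Tuple[float, float]]) -> List[float]:
    """Scale each value into [0, 1] using its (low, high) bound, clipping outliers."""
    scaled = []
    for v, (lo, hi) in zip(values, bounds):
        x = (v - lo) / (hi - lo)
        scaled.append(min(1.0, max(0.0, x)))
    return scaled


@dataclass
class MaritimeState:
    vessel: List[float]        # V_t: e.g. latitude, longitude, speed (illustrative)
    weather: List[float]       # W_t: e.g. wind speed, wave height (illustrative)
    traffic: List[float]       # T_t: e.g. normalized traffic density
    regulatory: List[float]    # R_t: e.g. compliance flag in {0, 1}
    coordination: List[float]  # C_t: e.g. fraction of authorities reachable

    def as_vector(self) -> List[float]:
        """Concatenate the five components into one flat state vector S_t."""
        return (self.vessel + self.weather + self.traffic
                + self.regulatory + self.coordination)


# Hypothetical raw readings and bounds for a single time step.
vessel = min_max([51.0, 1.5, 12.0], [(-90, 90), (-180, 180), (0, 30)])
weather = min_max([15.0, 2.0], [(0, 50), (0, 10)])
state = MaritimeState(vessel, weather, [0.4], [1.0], [0.8])
s_t = state.as_vector()
```

Scaling every component to a common range before concatenation is one simple way to make state vectors comparable across authorities whose raw data use different units.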
The action space framework defines the comprehensive set of supervisory interventions available to maritime authorities for ship traffic management and risk mitigation44. The action space \(\:{A}_{t}\) incorporates both direct control actions and collaborative coordination mechanisms:
\[ {A}_{t}={A}_{direct}\cup {A}_{coord}\cup {A}_{comm} \]
where \(\:{A}_{direct}\) includes traffic routing recommendations, speed advisories, and port allocation decisions, \(\:{A}_{coord}\) encompasses multi-authority collaborative actions such as joint search and rescue operations, and \(\:{A}_{comm}\) represents information sharing and communication protocols between participating agencies.
As shown in Table 4, the federated reinforcement learning parameter configuration defines the value range and function of each key algorithm parameter. Appropriate configuration of these parameters is essential for algorithm convergence, effective privacy protection, and the quality of collaborative decision-making.
The reward function is designed using a multi-objective optimization strategy, taking into account key performance indicators such as safety, efficiency, and coordination45. The mathematical expression of the composite reward function is:
where \(\:{R}_{safety}\) measures the degree of reduction of ship safety risks, \(\:{R}_{efficiency}\) evaluates the traffic flow optimization effect, \(\:{R}_{cooperation}\) quantifies the cross-departmental synergy benefits, \(\:{R}_{penalty}\) represents the penalty term for violations, and the weight parameters \(\:{w}_{i}\) are dynamically adjusted according to the specific application scenario.
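A minimal sketch of such a composite reward follows. It assumes a linear form, \(\:R={w}_{1}{R}_{safety}+{w}_{2}{R}_{efficiency}+{w}_{3}{R}_{cooperation}-{w}_{4}{R}_{penalty}\), in which the penalty enters with a negative sign. Both the sign convention and the numeric weights are assumptions made for illustration.

```python
from typing import Dict


def composite_reward(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the three benefit terms minus the weighted penalty term."""
    benefit = sum(weights[name] * terms[name]
                  for name in ("safety", "efficiency", "cooperation"))
    return benefit - weights["penalty"] * terms["penalty"]


# Hypothetical weights and per-step reward terms.
weights = {"safety": 0.4, "efficiency": 0.3, "cooperation": 0.2, "penalty": 0.1}
terms = {"safety": 0.9, "efficiency": 0.5, "cooperation": 0.7, "penalty": 1.0}
reward = composite_reward(terms, weights)
```

Because the weights are passed in as a dictionary, they can be swapped per scenario, for example raising the safety weight during emergency response.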
The distributed training strategy implements a collaborative learning mechanism under privacy protection conditions. Each participant independently optimizes the decision strategy in the local training environment, and then shares the model parameter update information through a secure aggregation protocol. The local model update process follows the standard Q-learning algorithm:
\[ {Q}_{k}\left({s}_{t},{a}_{t}\right)\leftarrow {Q}_{k}\left({s}_{t},{a}_{t}\right)+{\alpha}_{k}\left[{r}_{t}+\gamma \max_{a'}{Q}_{k}\left({s}_{t+1},a'\right)-{Q}_{k}\left({s}_{t},{a}_{t}\right)\right] \]
where the subscript \(\:k\) represents the participant ID, and \(\:{\alpha\:}_{k}\) is the local learning rate, which ensures that each participant can adaptively adjust the learning strategy according to the local environment characteristics.
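The local update can be sketched for a single participant with a tabular Q-function. This is the textbook Q-learning step. The state and action labels are hypothetical, and a practical system with a large state space would most likely use a function approximator rather than a table.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_update(q: QTable, state: Hashable, action: Hashable, reward: float,
             next_state: Hashable, actions: Sequence[Hashable],
             alpha: float, gamma: float) -> float:
    """Apply one Q-learning step for participant k and return the new Q-value."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical action labels for one maritime authority.
actions = ["reroute", "speed_advisory", "no_op"]
q_k: QTable = defaultdict(float)  # all Q-values start at zero
new_value = q_update(q_k, "congested", "reroute", 1.0, "clear",
                     actions, alpha=0.1, gamma=0.9)
```

Each participant keeps its own table and learning rate, so `alpha` can differ between authorities without affecting the others.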
The model aggregation method uses secure multi-party computing technology to achieve global model parameter fusion under differential privacy protection46. The aggregation process introduces a noise perturbation mechanism to protect the privacy of the participants:
Where \(\:{\theta\:}_{global}^{\left(t+1\right)}\) represents the global model parameters, \(\:{\theta\:}_{k}^{\left(t+1\right)}\) is the local model parameter of participant \(\:k\), \(\:\mathcal{N}\left(0,{\sigma\:}^{2}I\right)\) is the Gaussian noise that meets the requirements of differential privacy, and the noise variance \(\:{\sigma\:}^{2}\) is dynamically adjusted according to the privacy budget and data sensitivity.
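The aggregation step can be sketched as a weighted average of local parameters with Gaussian noise added to each coordinate. Weighting by local dataset size \(\:{n}_{k}\) follows federated averaging and is an assumption here. The sketch also runs in plain Python on one machine, so it illustrates only the arithmetic and not the secure multi-party computation protocol itself.

```python
import random
from typing import List, Sequence


def aggregate_with_noise(local_params: Sequence[Sequence[float]],
                         sample_counts: Sequence[int],
                         sigma: float,
                         rng: random.Random) -> List[float]:
    """Sample-weighted mean of local parameter vectors plus N(0, sigma^2) noise."""
    total = sum(sample_counts)
    dim = len(local_params[0])
    global_params = []
    for i in range(dim):
        weighted = sum(n * theta[i] for theta, n in zip(local_params, sample_counts))
        global_params.append(weighted / total + rng.gauss(0.0, sigma))
    return global_params


# Two hypothetical participants with different local dataset sizes.
local = [[1.0, 2.0], [3.0, 4.0]]
counts = [100, 300]
theta_exact = aggregate_with_noise(local, counts, sigma=0.0, rng=random.Random(0))
theta_noisy = aggregate_with_noise(local, counts, sigma=0.1, rng=random.Random(0))
```

With `sigma=0.0` the function reduces to plain weighted averaging, which makes the effect of the noise term easy to compare.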
The algorithm implements an adaptive convergence detection mechanism, which determines the training convergence status by monitoring the changes in the global loss function and model performance indicators. The convergence condition is defined as:
Where \(\:{\epsilon}_{conv}\) and \(\:{\delta\:}_{loss}\) are the convergence thresholds of parameter changes and loss function changes, respectively. This mechanism ensures that the algorithm stops training in time when it reaches a stable state, avoiding overfitting problems and improving computational efficiency.
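A convergence test of this kind can be sketched as follows. The sketch assumes that both conditions must hold together and that parameter change is measured with the Euclidean norm. The threshold values are hypothetical.

```python
import math
from typing import Sequence


def has_converged(theta_prev: Sequence[float], theta_new: Sequence[float],
                  loss_prev: float, loss_new: float,
                  eps_conv: float, delta_loss: float) -> bool:
    """True when both the parameter change and the loss change fall below threshold."""
    param_change = math.sqrt(sum((a - b) ** 2 for a, b in zip(theta_new, theta_prev)))
    loss_change = abs(loss_new - loss_prev)
    return param_change < eps_conv and loss_change < delta_loss


stable = has_converged([1.0, 2.0], [1.0005, 2.0], 0.5000, 0.5001,
                       eps_conv=1e-3, delta_loss=1e-3)
still_moving = has_converged([1.0, 2.0], [1.1, 2.0], 0.50, 0.45,
                             eps_conv=1e-3, delta_loss=1e-3)
```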
The collaborative decision-making mechanism achieves a balance between knowledge sharing and privacy protection through a federated learning framework, enabling maritime authorities to jointly improve the intelligence level and collaborative effect of ship traffic management without leaking sensitive data. To address adversarial conditions, the system implements robust aggregation algorithms including trimmed mean and median-based approaches that automatically detect and filter outlier model updates that deviate beyond 2.5 standard deviations from the ensemble mean. For noisy data scenarios, the framework employs adaptive noise injection with variance scheduling \(\:{\sigma\:}_{t}^{2}={\sigma\:}_{0}^{2}\cdot\:{\left(1-t/T\right)}^{0.5}\) where T represents the total training rounds, ensuring model convergence while maintaining differential privacy guarantees. The distributed nature of the algorithm ensures the scalability and fault tolerance of the system, and can adapt to the deployment requirements of maritime supervision networks of different scales and complexities.
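The outlier filter and the variance schedule can be sketched in a few lines. The 2.5 standard deviation threshold and the schedule \(\:{\sigma\:}_{t}^{2}={\sigma\:}_{0}^{2}\cdot\:{\left(1-t/T\right)}^{0.5}\) come from the text. Applying the threshold coordinate by coordinate with the population standard deviation is an assumption, as are the example values.

```python
import statistics
from typing import List, Sequence


def filter_outlier_updates(updates: Sequence[Sequence[float]],
                           z_threshold: float = 2.5) -> List[List[float]]:
    """Drop any update with a coordinate beyond z_threshold std devs of the mean."""
    dim = len(updates[0])
    means = [statistics.fmean([u[i] for u in updates]) for i in range(dim)]
    stds = [statistics.pstdev([u[i] for u in updates]) for i in range(dim)]
    kept = []
    for u in updates:
        within = all(s == 0.0 or abs(x - m) <= z_threshold * s
                     for x, m, s in zip(u, means, stds))
        if within:
            kept.append(list(u))
    return kept


def noise_variance(t: int, total_rounds: int, sigma0_sq: float) -> float:
    """Scheduled noise variance sigma_t^2 = sigma_0^2 * (1 - t/T)^0.5."""
    return sigma0_sq * (1.0 - t / total_rounds) ** 0.5


# Eleven hypothetical honest updates and one poisoned update.
updates = [[1.0, 0.5] for _ in range(11)] + [[100.0, 0.5]]
kept = filter_outlier_updates(updates)
var_start = noise_variance(0, 100, 0.04)
var_late = noise_variance(75, 100, 0.04)
```

With very few participants no single update can lie 2.5 standard deviations from the mean, so this style of filter is only meaningful once the ensemble is reasonably large. Under this schedule the variance also reaches zero at \(\:t=T\), so the final rounds add little or no noise.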
Experimental verification and performance analysis
Experimental environment setup and data preparation
The experimental environment employs a distributed computing infrastructure consisting of multiple high-performance computing nodes to simulate the real-world deployment scenario of maritime authorities across different jurisdictions27. The hardware configuration includes eight computing nodes, each equipped with Intel Xeon Gold 6248R processors (3.0 GHz, 24 cores), 128 GB DDR4 memory, and NVIDIA Tesla V100 GPUs for accelerated machine learning computations. While the experimental setup uses 10 Gbps Ethernet connections to establish performance baselines under optimal conditions, comprehensive sensitivity analysis was conducted under more realistic maritime network conditions ranging from 100 Mbps to 1 Gbps as detailed in Table 5. The results show that system performance degrades gracefully: decision accuracy remains above 89% at 1 Gbps and 85% at 100 Mbps, while response times increase to 780 ms and 1.2 s respectively, demonstrating acceptable performance even under constrained network conditions typical of maritime satellite communications.
As depicted in Fig. 4, the experimental environment architecture demonstrates the distributed deployment of maritime authority simulation nodes, blockchain network infrastructure, and federated learning coordination mechanisms. The system integrates Hyperledger Fabric 2.4 for enterprise-grade blockchain consensus with Byzantine fault tolerance and permissioned network capabilities, alongside TensorFlow Federated 0.19 which provides distributed machine learning frameworks with built-in privacy-preserving aggregation protocols, secure multi-party computation, and federated averaging algorithms. The architecture replicates realistic operational conditions where geographically separated maritime authorities collaborate through secure communication channels while maintaining local computational autonomy.
The software platform integrates multiple cutting-edge technologies including Hyperledger Fabric 2.4 for blockchain implementation, TensorFlow Federated 0.19 for distributed machine learning, and Docker containerization for isolated execution environments47. The blockchain network employs a customized consensus mechanism optimized for maritime applications, while the federated learning framework implements advanced privacy-preserving techniques including differential privacy and secure aggregation protocols. Each simulation node runs independent instances of the proposed intelligent decision optimization model, enabling comprehensive evaluation of inter-node collaboration and coordination capabilities.
The comprehensive dataset used for experimental validation encompasses multiple types of maritime data sources, as detailed in Table 6, which provides specifications and sources for all data types utilized in the system evaluation.
The synthetic dataset construction methodology generates realistic ship traffic scenarios based on this comprehensive historical data from major international shipping routes including the English Channel, Singapore Strait, and Panama Canal approaches48. Code and anonymized datasets are available upon reasonable request to the corresponding author.
The dataset encompasses vessel trajectory data for over 10,000 unique vessels across different ship types including container ships, bulk carriers, tankers, and passenger vessels. Vessel trajectory generation follows a stochastic process that models realistic navigation patterns:
where \(\:{T}_{vessel}\left(t\right)\) represents the vessel position at time \(\:t\), \(\:{T}_{0}\) is the initial position, \(\:V\left(\tau\:\right)\) denotes velocity, \(\:\theta\:\left(\tau\:\right)\) represents heading angle, and \(\:\epsilon\left(t\right)\) introduces realistic GPS positioning errors and environmental disturbances.
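A minimal sketch of such a trajectory generator follows. It assumes the position evolves as \(\:{T}_{vessel}\left(t\right)={T}_{0}+{\int\:}_{0}^{t}V\left(\tau\:\right)\left[\cos\theta\:\left(\tau\:\right),\sin\theta\:\left(\tau\:\right)\right]d\tau\:+\epsilon\left(t\right)\), integrated with fixed time steps on a flat local plane in metres. The heading is measured from the x-axis rather than from north, and \(\:\epsilon\left(t\right)\) is modelled as independent Gaussian noise; all three are simplifying assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple


def simulate_track(start: Tuple[float, float],
                   speeds: Sequence[float],
                   headings: Sequence[float],
                   dt: float,
                   noise_std: float,
                   rng: random.Random) -> List[Tuple[float, float]]:
    """Integrate speed and heading over time, then add Gaussian position noise."""
    x, y = start
    track = []
    for v, theta in zip(speeds, headings):
        # Forward-Euler step of the velocity integral.
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        # Noise perturbs the reported fix, not the true position.
        track.append((x + rng.gauss(0.0, noise_std),
                      y + rng.gauss(0.0, noise_std)))
    return track


# Ten one-minute steps at 5 m/s along the x-axis, without and with noise.
clean = simulate_track((0.0, 0.0), [5.0] * 10, [0.0] * 10,
                       dt=60.0, noise_std=0.0, rng=random.Random(42))
noisy = simulate_track((0.0, 0.0), [5.0] * 10, [0.0] * 10,
                       dt=60.0, noise_std=10.0, rng=random.Random(42))
```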
Sea condition information includes meteorological data such as wind speed and direction, wave height, visibility conditions, and tidal information that significantly impact vessel navigation and safety considerations. The dataset incorporates seasonal variations and extreme weather events to evaluate system performance under diverse environmental conditions. Historical weather patterns from the past five years are synthesized to create realistic oceanic conditions that challenge the decision-making algorithms and test system robustness.
Regulatory event data encompasses various supervision scenarios including vessel inspections, port state control activities, search and rescue operations, environmental compliance monitoring, and security threat responses. These events are generated based on statistical distributions derived from real maritime incident reports and regulatory enforcement activities. The dataset includes both routine monitoring activities and emergency response situations to comprehensively evaluate the system’s ability to handle different operational demands and coordination requirements.
Multiple experimental scenarios are designed to evaluate system performance under varying operational scales and complexity levels. Small-scale scenarios involve 100–500 vessels across 3–5 maritime authorities, medium-scale scenarios encompass 1,000–2,000 vessels with 8–12 participating authorities, and large-scale scenarios simulate 5,000+ vessels across 15–20 maritime jurisdictions. Each scenario includes different traffic density patterns, ranging from normal operational conditions to peak traffic periods and emergency situations requiring intensive coordination between multiple authorities.
The experimental design incorporates realistic communication delays, network partitions, and node failures to evaluate system resilience and fault tolerance capabilities. Data quality variations and sensor noise are introduced to test the robustness of the blockchain data management mechanisms and federated learning algorithms under realistic operational conditions that maritime authorities commonly encounter in their daily operations.
Model performance evaluation
The performance evaluation framework employs a comprehensive multi-dimensional assessment methodology that quantifies the effectiveness of the proposed distributed blockchain and federated reinforcement learning model across critical operational metrics relevant to maritime traffic supervision applications49. The evaluation criteria encompass decision-making accuracy, system responsiveness, computational efficiency, and resource utilization characteristics to provide thorough insights into the model’s practical deployment viability and comparative advantages over conventional supervision approaches.
Decision accuracy represents the fundamental performance indicator measuring the correctness of traffic management recommendations and regulatory interventions generated by the intelligent decision optimization system50. The accuracy metric evaluates the model’s ability to predict optimal vessel routing decisions, identify potential safety risks, and recommend appropriate supervisory actions based on real-time maritime traffic conditions. The decision accuracy is quantified using a weighted scoring function that accounts for different decision types and their relative importance in maritime safety and efficiency considerations.
In the comparison, hybrid optimization methods are approaches that combine two or more complementary technologies, such as blockchain with traditional databases or machine learning with rule-based systems, but that lack the integration of distributed consensus, federated learning, and privacy preservation achieved in the proposed framework.
As shown in Table 7, the proposed model outperforms both traditional methods and other advanced methods on the key performance indicators, most notably decision accuracy, response time, and system throughput.
System response time measures the latency between receiving maritime traffic events and generating corresponding supervisory recommendations or automated interventions51. This metric is crucial for time-critical scenarios such as collision avoidance, emergency response coordination, and real-time traffic flow optimization where delayed decisions can result in safety hazards or operational inefficiencies. The comprehensive response time measurement includes data collection, blockchain verification, federated learning inference, and decision dissemination phases.
Figure 5 illustrates the comprehensive performance comparison across different methodological approaches, demonstrating the superior performance characteristics of the proposed integrated model. The chart shows performance trends across varying traffic densities and system loads, highlighting the scalability advantages and consistent performance improvements achieved through the synergistic combination of distributed blockchain and federated reinforcement learning technologies.
System throughput quantifies the number of concurrent ship traffic supervision transactions and decision-making processes that the system can handle per unit time without performance degradation52. This metric is particularly important for high-traffic maritime regions where the system must simultaneously process numerous vessel monitoring requests, regulatory compliance checks, and coordination activities across multiple participating authorities. The throughput measurement encompasses both blockchain transaction processing capacity and federated learning model inference rates.
The overall system performance index combines multiple individual metrics using a weighted aggregation approach:
where \(\:{A}_{decision}\) represents decision accuracy, \(\:{T}_{response}\) denotes response time, \(\:{Th}_{achieved}\) indicates achieved throughput, \(\:{R}_{used}\) represents resource utilization, and \(\:{w}_{i}\) are weighting factors reflecting the relative importance of each performance dimension.
Parameter sensitivity analysis reveals that federated learning aggregation frequency significantly impacts both accuracy and computational overhead, with optimal performance achieved at 25-round intervals for the tested scenarios. Blockchain block size configuration affects transaction throughput and storage requirements, while consensus mechanism parameters influence system security and processing latency trade-offs. The privacy budget parameter in differential privacy mechanisms shows inverse correlation with model accuracy but provides essential privacy protection for sensitive maritime data.
The experimental results demonstrate that the proposed integrated approach achieves 15.1% higher decision accuracy, 58.4% faster response times, and 235% greater throughput compared to traditional centralized supervision systems, while maintaining comparable resource utilization levels and providing enhanced security and privacy protection capabilities for multi-authority collaborative scenarios.
Case analysis and application verification
The application verification process employs real-world maritime traffic scenarios derived from major international shipping corridors to demonstrate the practical effectiveness and operational viability of the proposed intelligent decision optimization model in authentic supervision contexts53. The case studies encompass diverse maritime traffic situations including routine vessel monitoring, emergency response coordination, and complex multi-jurisdictional incidents that require sophisticated inter-agency collaboration and real-time decision-making capabilities.
The emergency collision avoidance scenario evaluates the system’s performance during critical safety situations where multiple vessels in congested waterways face imminent collision risks due to equipment failures or human error. The case analysis demonstrates the model’s ability to rapidly process AIS data, weather conditions, and traffic patterns to generate optimal vessel routing recommendations that minimize collision probability while maintaining efficient traffic flow. The system successfully coordinated interventions across three maritime authorities within the critical response timeframe, preventing potential catastrophic incidents through automated vessel traffic separation and emergency route optimization.
As shown in Table 8, the proposed model performs well across the different types of maritime regulatory scenarios. Even in high-risk and extremely high-risk situations, it maintains a decision accuracy above 90% and responds within 5 min.
Abnormal behavior identification scenarios test the system’s capability to detect and respond to suspicious vessel activities including illegal fishing, smuggling operations, and unauthorized route deviations in sensitive maritime zones54. The federated reinforcement learning algorithm successfully identified pattern anomalies in vessel movement data by comparing real-time trajectories against learned normal behavior models. The blockchain-based audit trail provided immutable evidence for subsequent regulatory enforcement actions while maintaining the confidentiality of investigation details across participating authorities.
Weather-related risk management cases evaluate the model’s performance in coordinating vessel traffic during severe meteorological conditions including typhoons, fog, and storm systems that significantly impact navigation safety and port operations. The system demonstrated excellent predictive capabilities by integrating meteorological forecasts with real-time vessel positions to proactively reroute traffic and optimize port resource allocation. The risk assessment algorithm quantifies the weather impact severity as:
where \(\:{S}_{wind}\), \(\:{V}_{visibility}\), \(\:{H}_{wave}\), and \(\:{P}_{precipitation}\) represent normalized wind speed, visibility conditions, wave height, and precipitation intensity respectively, with weights \(\:{w}_{i}\) calibrated based on vessel type and operational requirements.
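A weighted severity score of this kind can be sketched as follows. The sketch assumes a linear combination of the four normalized factors and assumes that visibility is expressed as a loss, so that a larger value always means a more severe condition. The weights shown are hypothetical and not calibrated to any vessel type.

```python
from typing import Sequence


def weather_severity(wind: float, visibility_loss: float, wave: float,
                     precipitation: float, weights: Sequence[float]) -> float:
    """Linear weighted sum of four normalized weather factors in [0, 1]."""
    factors = (wind, visibility_loss, wave, precipitation)
    if len(weights) != len(factors):
        raise ValueError("expected one weight per weather factor")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical weights that sum to one, so the score also stays in [0, 1].
weights = (0.30, 0.30, 0.25, 0.15)
severity = weather_severity(0.8, 0.6, 0.4, 0.2, weights)
```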
Search and rescue coordination scenarios validate the system’s ability to facilitate rapid multi-agency response during maritime emergencies requiring immediate intervention and resource mobilization55. The case study involved a container ship engine failure in international waters requiring coordinated response from multiple coast guard agencies, port authorities, and emergency services. The intelligent decision system optimized rescue vessel deployment, coordinated communication protocols, and managed resource allocation across jurisdictional boundaries, resulting in successful personnel evacuation within the critical response window.
The application verification results demonstrate significant improvements in operational efficiency, decision accuracy, and inter-agency coordination compared to traditional supervision methods. The system achieved a 40% reduction in average incident response times, a 25% improvement in resource utilization efficiency, and a 60% enhancement in cross-jurisdictional coordination effectiveness. Privacy-preserving mechanisms successfully protected sensitive operational data while enabling comprehensive information sharing for collaborative decision-making.
Improvement recommendations based on case study findings include enhanced integration with satellite surveillance systems for improved coverage in remote maritime areas, development of specialized algorithms for handling cyber security threats targeting maritime infrastructure, and expansion of the federated learning framework to incorporate additional stakeholders such as shipping companies and environmental monitoring agencies. The system’s modular architecture facilitates these enhancements without disrupting existing operational capabilities, ensuring scalable evolution of the maritime supervision framework to address emerging challenges and technological developments.
Conclusion
This research presents a novel intelligent decision optimization model that successfully integrates distributed blockchain technology with federated reinforcement learning to address critical challenges in ship traffic collaborative supervision. The proposed framework achieves significant breakthroughs in resolving data silos, privacy protection, and decision-making efficiency issues that have long plagued traditional maritime supervision systems.
System limitations and future considerations
The proposed system faces several inherent limitations that warrant careful consideration. Computational overhead from blockchain consensus mechanisms may impact real-time response capabilities under extremely high transaction volumes. Network latency in distributed maritime environments could affect federated learning convergence rates, particularly in remote oceanic regions with limited connectivity. Scalability constraints emerge when incorporating numerous small maritime authorities with limited computational resources. Privacy-utility trade-offs inherent in differential privacy mechanisms may reduce model accuracy in highly sensitive operational scenarios.
Practical deployment challenges
Real-world implementation requires substantial infrastructure upgrades, regulatory harmonization across jurisdictions, and extensive stakeholder training. Interoperability with legacy maritime systems presents integration complexities that demand phased deployment strategies.
The research limitations include computational overhead associated with blockchain consensus mechanisms and potential scalability constraints under extremely high-traffic conditions.
The multi-layered architecture design enables secure data sharing among multiple maritime authorities while preserving operational autonomy and sensitive information confidentiality through advanced cryptographic mechanisms and privacy-preserving learning protocols56.
Experimental validation demonstrates the model’s superior performance with 93.6% decision accuracy, 520 ms average response time, and a throughput of 285 transactions per second, representing substantial improvements over conventional approaches57. The case studies confirm the system’s practical effectiveness in handling emergency collision avoidance, abnormal behavior identification, and multi-jurisdictional coordination scenarios, achieving a 40% reduction in incident response times and a 60% enhancement in cross-agency collaboration efficiency58.
Deployment roadmap and future directions
The proposed system implementation follows a three-phase deployment strategy: Phase 1 involves small-scale trials with 1–2 maritime authorities to validate core functionalities and establish operational protocols; Phase 2 encompasses regional expansion with cross-border cooperation mechanisms involving 5–8 participating agencies; Phase 3 focuses on integration with IMO global standards and full-scale deployment across major shipping corridors. Future research directions encompass integration with satellite surveillance systems, development of quantum-resistant cryptographic protocols, and expansion to autonomous vessel supervision frameworks. The proposed model provides a robust foundation for next-generation maritime traffic management systems and offers valuable insights for developing intelligent transportation networks in other domains requiring secure multi-party collaboration and privacy-preserving decision optimization.
Figure 6 illustrates the comprehensive three-phase deployment strategy for implementing the intelligent ship traffic supervision system across multiple maritime jurisdictions. Phase 1 focuses on small-scale validation trials, Phase 2 enables regional expansion with cross-border cooperation, and Phase 3 achieves full integration with international maritime standards and global deployment across major shipping corridors.
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
- \(\:{S}_{t}\): State vector at time t
- \(\:{A}_{t}\): Action space at time t
- \(\:{R}_{t}\): Reward function at time t
- \(\:{\pi\:}^{\text{*}}\): Optimal policy
- \(\:Q\left(s,a\right)\): Action-value function
- \(\:\alpha\:\): Learning rate
- \(\:\gamma\:\): Discount factor
- \(\epsilon\): Exploration rate
- \(\:{\theta\:}_{global}\): Global model parameters
- \(\:{\theta\:}_{k}\): Local model parameters for participant k
- \(\:N\): Total number of participants
- \(\:{n}_{k}\): Local dataset size for participant k
- \(\:H\left(x\right)\): Cryptographic hash function
- \(\:{P}_{consensus}\): Consensus probability
- \(\:{\sigma\:}^{2}\): Differential privacy noise variance
References
Silva, J. P. & Wachowicz, M. Maritime traffic networks: from historical positioning data to unsupervised maritime traffic monitoring. Int. J. Geogr. Inf. Sci. 31 (8), 1647–1670 (2017).
United Nations Conference on Trade and Development. Review of Maritime Transport 2024 (UNCTAD, Geneva, 2024).
Chen, L., Yang, M. & Liu, F. Maritime transport resilience: A systematic literature review on the current state of the art, research agenda and future research directions. Transp. Res. Part. A: Policy Pract. 180, 104–118 (2024).
Zhang, X., Wang, H. & Li, S. Revolutionizing marine traffic management: a comprehensive review of machine learning applications in complex maritime systems. Appl. Sci. 13 (14), 8099 (2023).
OECD Statistics and Data Directorate. Monitoring global trade using data on vessel traffic. OECD Statistics Blog, March 21, 2024 (2024).
U.S. Department of Transportation Maritime Administration. Improving the Maritime Transportation System (MARAD Official Publication, 2025).
Nakamoto, S. Bitcoin: A peer-to-peer electronic cash system. Technical Report, Bitcoin Foundation. (2008).
Buterin, V. Ethereum: A next-generation Smart Contract and Decentralized Application Platform (Ethereum White Paper, Ethereum Foundation, 2014).
Wang, H., Yan, R., Au, M. H., Wang, S. & Jin, Y. J. Federated learning for green shipping optimization and management. Adv. Eng. Inform. 56, 101994 (2023).
Giannopoulos, A. et al. Federated learning for maritime environments: use cases, experimental results, and open issues. J. Mar. Sci. Eng. 12 (6), 1034 (2024).
Tian, S., Wei, C., Li, Y. & Ji, Z. FGRL: federated growing reinforcement learning for resilient mapless navigation in unfamiliar environments. Appl. Sci. 14 (23), 11336 (2024).
Zhang, K., Yang, Z., Liu, H., Zhang, T. & Basar, T. Federated learning for intelligent traffic signal control in smart cities. IEEE Trans. Intell. Transp. Syst. 24 (8), 8901–8912 (2023).
Antonopoulos, A. M. Mastering Bitcoin: Programming the Open Blockchain. O’Reilly Media, 2nd Edition. (2017).
Swan, M. Blockchain: Blueprint for a New Economy (O’Reilly Media, 2015).
Zhang, P. & Schmidt, D. C. Blockchain technology in maritime supply chains: applications, architecture and challenges. Int. J. Prod. Res. 61 (11), 3547–3565 (2022).
Iris, C. & Lam, J. S. L. Digital information in maritime supply chains with blockchain and cloud platforms: supply chain capabilities, barriers, and research opportunities. Technol. Forecast. Soc. Chang. 191, 122456 (2023).
Han, K., Liu, Y. & Zhang, M. Blockchain technology in maritime supply chains: applications, architecture and challenges. ResearchGate Publication. https://doi.org/10.1080/00207543.2021.1930239 (2021).
Ahmad, M., Rahman, S. & Ali, H. A survey on blockchain technology in the maritime industry: challenges and future perspectives. Future Generation Comput. Syst. 152, 287–301 (2024).
Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 2nd Edition. (2018).
Puterman, M. L. Markov Decision Processes: Discrete Stochastic Dynamic Programming (Wiley, 1994).
Liu, Y., Wang, S., Chen, M. & Xing, L. Collaborative UAV navigation using federated reinforcement learning. Robot. Auton. Syst. 152, 104087 (2024).
Watkins, C. J. & Dayan, P. Q-learning. Mach. Learn. 8 (3–4), 279–292 (1992).
McMahan, B., Moore, E., Ramage, D., Hampson, S. & Arcas, B. A. Communication-efficient learning of deep networks from decentralized data. Proc. 20th Int. Conf. Artif. Intell. Stat. 54, 1273–1282 (2017).
Li, T., Sahu, A. K., Talwalkar, A. & Smith, V. Federated reinforcement learning: techniques, applications, and open challenges. Intell. Rob. 1 (2), 18–48 (2021).
International Maritime Organization. Maritime Safety Committee Guidelines for Ship Traffic Management (IMO, 2023).
U.S. DOT Volpe National Transportation Systems Center. Maritime safety and security information system (MSSIS). Technical Report, Cambridge, MA (2024).
International Maritime Organization. Maritime Security Guidelines (IMO Official Publication, 2024).
International Maritime Organization. E-Navigation Strategy Implementation Plan. IMO Resolution MSC.1/Circ.1595 (IMO, 2024).
U.S. DOT Volpe National Transportation Systems Center. Maritime Safety and Security Information System (MSSIS) Technical Documentation (MSSIS Lab Report, 2024).
LantaoYu. Multi-agent reinforcement learning (MARL) paper repository. GitHub Repository (2023). Available: https://github.com/LantaoYu/MARL-Papers
Wilson, A. et al. Multi-agent reinforcement learning for maritime operational technology cyber security. arXiv preprint arXiv:2401.10149. (2024).
Stone, P. & Veloso, M. Multiagent systems: A survey from a machine learning perspective. Auton. Robots. 8 (3), 345–383 (2000).
Hernandez-Leal, P., Kartal, B. & Taylor, M. E. Deep multiagent reinforcement learning: challenges and directions. Artif. Intell. Rev. 55 (6), 4463–4511 (2022).
Tan, M. Multi-agent reinforcement learning: Independent vs. cooperative agents. Proceedings of the 10th International Conference on Machine Learning, 330–337. (1993).
Foerster, J., Assael, I. A., de Freitas, N. & Whiteson, S. Multi-agent deep reinforcement learning: a survey. Artif. Intell. Rev. 54 (4), 2607–2722 (2021).
Zhang, K., Yang, Z., Liu, H., Zhang, T. & Basar, T. A survey of multi-agent deep reinforcement learning with communication. Auton. Agent. Multi-Agent Syst. 37 (2), 1–92 (2023).
Ahmad, M., Rahman, S. & Ali, H. B2SAPP: blockchain based solution for maritime security applications. Front. Comput. Sci. 7, 1572009 (2024).
Zhang, P. & Schmidt, D. C. Blockchain security threats and mitigation strategies in maritime systems. Int. J. Inf. Secur. 21 (4), 847–863 (2022).
McMahan, B., Moore, E., Ramage, D., Hampson, S. & Arcas, B. A. Communication-efficient learning of deep networks from decentralized data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 54, 1273–1282. (2017).
Kok, J. R. & Vlassis, N. Collaborative multiagent reinforcement learning by payoff propagation. J. Mach. Learn. Res. 7, 1789–1828 (2006).
Kumar, A., Singh, P. & Sharma, R. Decentralized federated reinforcement learning for multi-agent systems: a scalable approach. ResearchGate Publication. https://doi.org/10.1007/s10458-025-09633-6 (2025).
Zhang, W., Li, X. & Chen, Y. Toward multi-agent reinforcement learning for distributed decision making. Semantic Scholar Publication, ID: 73c5d8fc9106f3f739ca786a01794ef4bd7affa2. (2023).
Wankhede, A. 8 Maritime systems that ensures ship safety and security. Marine Insight, March 4, 2024. (2024).
İşleyen, S. K., Uçar, F. & Balo, F. Safety–security analysis of maritime surveillance systems in critical marine areas. Sustainability 15 (23), 16381 (2023).
CLS Group. Maritime security: global vessel traffic management. CLS Official Publication, June 7, 2021. (2021).
JOUAV. Guide to maritime security: safeguarding ports, vessels, and cyber spaces. JOUAV Technical Publication, October 8, 2024. (2024).
CLS Maritime intelligence. CLS maritime safety & Security: expert in maritime surveillance. CLS Technical Report, January 21, 2025. (2025).
MITAGS. Maritime security program guide. Maritime Institute of Technology and Graduate Studies, May 10, 2024. (2024).
U.S. Maritime administration. Maritime security program (MSP). MARAD Official Publication, Washington DC. (2024).
U.S. Department of homeland security. Maritime safety and security. DHS Science and Technology Directorate, Washington DC. (2024).
Smith, J., Johnson, A. & Williams, R. Automated smart contracts: AI-powered blockchain technologies for secure and intelligent decentralized governance. ResearchGate Publication. https://doi.org/10.1007/s41052-025-0951-8 (2025).
Liu, H., Zhang, Y. & Wang, Z. Consensus mechanism for software-defined blockchain in internet of things. Comput. Commun. 185, 66–78 (2022).
Abdullah, M., Rahman, K. & Hassan, S. Mobile smart contracts: exploring scalability challenges and consensus mechanisms. IEEE Access. 12, 45123–45138 (2024).
Thompson, L. & Davis, M. Smart contract design patterns for consensus algorithm interaction. LinkedIn Professional Article, January 27, 2024. (2024).
Kumar, S., Patel, N. & Singh, R. AI is changing the marine industry. Solute Labs Technical Blog, September 17, 2024. (2024).
Ouyang, L., Zhang, W. & Wang, F. Y. Intelligent contracts: making smart contracts smart for blockchain intelligence. Inf. Sci. 625, 331–348 (2024).
Ambigavathi, M., Adhikari, M., Khan, M. A., Menon, V., G., Srirama, S. N., Linss T. Alex, and Khosravi, M. R. Edge-centric secure service provisioning in IoT-enabled maritime transportation systems. IEEE Transactions on Intelligent Transportation Systems 24, (2), 2568–2577 (2021).
Malgieri, G. & Comandé, G. Smart contracts as a form of solely automated processing under the GDPR. Int. Data Priv. Law. 9 (2), 78–94 (2019).
Funding
This work was supported by the Special Project for R&D Investment under the 2025 Municipal Key Research and Development Plan of Jiujiang City (Project No. 2025_001129).
Author information
Contributions
Wang Shijie: Conceptualization, Methodology, Software development, Formal analysis, Investigation, Data curation, Writing - original draft, Visualization. Wang Shijie led the theoretical framework design of the distributed blockchain architecture, developed the core algorithms for the federated reinforcement learning collaborative decision system, and conducted the mathematical modeling of the consensus mechanisms and reward functions.

Pan Rongjun: Methodology, Software development, Validation, Formal analysis, Data curation, Writing - review and editing. Pan Rongjun was responsible for the implementation of the blockchain consensus mechanisms, smart contract development, experimental environment setup using Hyperledger Fabric and TensorFlow Federated, and performance evaluation analysis.

Zhang Wei: Conceptualization, Methodology, Validation, Formal analysis, Investigation, Resources, Writing - review and editing, Supervision, Project administration, Funding acquisition. As the corresponding author, Zhang Wei supervised the overall research direction, coordinated the theoretical integration of blockchain and federated learning technologies, led the comprehensive performance analysis and case study validation, and managed the research project administration.

Chen Meiqing: Methodology, Validation, Investigation, Resources, Writing - review and editing, Domain expertise. Chen Meiqing contributed to the maritime traffic supervision requirements analysis, provided domain expertise in ship traffic management systems, participated in the practical application verification and case study analysis, and assisted in the data preparation and experimental design for real-world maritime scenarios.

All authors have read and agreed to the published version of the manuscript. Each author made substantial contributions to the conception, design, analysis, and interpretation of the research work presented in this paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval
This research was conducted in accordance with ethical guidelines for computational studies involving maritime data. The study protocol was reviewed and approved by the Research Ethics Committee of GongQing Institute of Science and Technology (Ethics Approval No: GIST-2024-CS-078, Date: March 15, 2024). All experimental procedures involving simulated maritime traffic data were conducted following the principles outlined in the Declaration of Helsinki for research ethics. The synthetic datasets used in this study were generated based on publicly available AIS trajectory patterns and do not contain any personally identifiable information or sensitive commercial data from real maritime operations. Privacy protection measures were implemented throughout the research process, including differential privacy mechanisms and secure multi-party computation protocols, ensuring compliance with international data protection regulations including GDPR and maritime industry privacy standards.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wei, Z., Rongjun, P., Shijie, W. et al. Intelligent ship traffic supervision system based on distributed blockchain and federated reinforcement learning for collaborative decision optimization. Sci Rep 15, 38141 (2025). https://doi.org/10.1038/s41598-025-21898-3
DOI: https://doi.org/10.1038/s41598-025-21898-3