
CN114964268B - Unmanned aerial vehicle navigation method and device - Google Patents

Unmanned aerial vehicle navigation method and device

Info

Publication number
CN114964268B
CN114964268B
Authority
CN
China
Prior art keywords
navigation
simulation
model
unmanned aerial vehicle
Prior art date
Legal status
Active
Application number
CN202210902202.XA
Other languages
Chinese (zh)
Other versions
CN114964268A (en)
Inventor
李唯
张宁远
曹一丁
郭伟
杨雷
Current Assignee
Baiyang Times Beijing Technology Co ltd
Original Assignee
Baiyang Times Beijing Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Baiyang Times Beijing Technology Co ltd
Priority to CN202210902202.XA
Publication of CN114964268A
Application granted
Publication of CN114964268B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Multimedia (AREA)
  • Molecular Biology (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Navigation (AREA)

Abstract

The application discloses an unmanned aerial vehicle navigation method and device. The method comprises the following steps: constructing a simulation environment corresponding to a simulation target unmanned aerial vehicle based on the equipment parameters of the target unmanned aerial vehicle and the environment parameters of different known environments; constructing a deep reinforcement learning model for unmanned aerial vehicle navigation based on the simulation running information of the simulation target unmanned aerial vehicle in the simulation environment; and, when the target unmanned aerial vehicle operates, navigating with the deep reinforcement learning model according to the real operation information of the target unmanned aerial vehicle. When the target unmanned aerial vehicle actually operates, a real operation scheme can be obtained using the model and the real operation information of the target unmanned aerial vehicle, so that navigation of the target unmanned aerial vehicle is achieved without requiring prior familiarity with the environment, which improves the efficiency and accuracy of unmanned aerial vehicle navigation and allows the unmanned aerial vehicle to be used to its full potential.

Description

Unmanned aerial vehicle navigation method and device
Technical Field
The application relates to the technical field of unmanned aerial vehicles, in particular to an unmanned aerial vehicle navigation method and device.
Background
In recent years, with the continuous development of intelligent control, robotics and related technologies, unmanned aerial vehicle autonomous control technology has made great progress. As a flight platform capable of carrying various sensing and computing devices, the unmanned aerial vehicle has the advantages of small size, low manufacturing cost and high flexibility, and can be widely applied to tasks such as area reconnaissance and disaster search and rescue.
At present, in the existing unmanned aerial vehicle navigation method, an environment sensing device carried by the unmanned aerial vehicle is used to become familiar with a known environment, an environment model corresponding to the known environment is built in advance, and autonomous navigation is then realized based on the environment model. Therefore, in the existing unmanned aerial vehicle navigation method, in order to realize a navigation scheme with higher accuracy, higher precision is required of the environment model. In this case, if the known environment changes or the unmanned aerial vehicle enters an unknown environment, it is difficult to achieve accurate autonomous navigation based on the environment model generated in advance.
Disclosure of Invention
The embodiment of the application provides an unmanned aerial vehicle navigation method and device, so as to solve the problem that the traditional unmanned aerial vehicle navigation method is difficult to meet the requirement of accurate navigation.
In a first aspect, an embodiment of the present application provides a method for navigating an unmanned aerial vehicle, including:
constructing a simulation environment corresponding to the simulation target unmanned aerial vehicle based on the equipment parameters of the target unmanned aerial vehicle and the environment parameters of different known environments;
based on the simulation operation information of the simulation target unmanned aerial vehicle in the simulation environment, constructing a deep reinforcement learning model for unmanned aerial vehicle navigation;
and when the target unmanned aerial vehicle runs, navigating by utilizing the deep reinforcement learning model according to the real running information of the target unmanned aerial vehicle.
Optionally, the constructing a deep reinforcement learning model for unmanned aerial vehicle navigation based on the simulation running information of the simulation target unmanned aerial vehicle in the simulation environment includes:
based on the simulation running information, constructing a navigation strategy model for planning navigation information by using a deep learning algorithm;
constructing a navigation evaluation model for evaluating the navigation information based on the navigation strategy model by using a reinforcement learning algorithm;
and optimizing the navigation strategy model based on the navigation evaluation model until the navigation strategy model is converged, and taking the converged navigation strategy model as the deep reinforcement learning model.
Optionally, the constructing a navigation strategy model for planning navigation information based on the simulation running information and by using a deep learning algorithm includes:
taking the simulation visual information of the simulation target unmanned aerial vehicle in the simulation environment as the input of a navigation prediction model, and taking the simulation navigation information of the simulation target unmanned aerial vehicle in the simulation environment as the output of the navigation prediction model, and constructing the navigation prediction model;
the simulation task information of the simulation target unmanned aerial vehicle in the simulation environment and the output of the navigation prediction model are taken as the input of a navigation matching model together, and the matching degree between the output of the navigation prediction model and the simulation task information is taken as the output of the navigation matching model, so that the navigation matching model is constructed;
And constructing the navigation strategy model based on the navigation prediction model and the navigation matching model.
Optionally, the constructing a navigation evaluation model for evaluating the navigation information based on the navigation strategy model and using a reinforcement learning algorithm includes:
obtaining navigation information matched with the simulation task information from simulation navigation information output by the navigation prediction model as target navigation information;
and taking the output of the navigation strategy model and the target navigation information together as the input of the navigation evaluation model, and taking the reward evaluation value corresponding to the target navigation information as the output of the navigation evaluation model to construct the navigation evaluation model.
Optionally, when the target unmanned aerial vehicle operates, before navigating by using the deep reinforcement learning model according to the real operation information of the target unmanned aerial vehicle, the method further includes:
constructing a navigation test environment of the target unmanned aerial vehicle, and performing navigation test on the deep reinforcement learning model in the navigation test environment to obtain a test result;
determining test random information based on the simulation environment and the navigation test environment;
And updating the deep reinforcement learning model according to the test result and the test random information.
Optionally, when the target unmanned aerial vehicle operates, navigating according to the real operation information of the target unmanned aerial vehicle and by using the deep reinforcement learning model, including:
acquiring real visual information and real task information when the target unmanned aerial vehicle runs;
inputting the real visual information and the real task information into the deep reinforcement learning model;
obtaining predicted navigation information output by the deep reinforcement learning model; the predicted navigation information is navigation information matched with the real task information;
and controlling the target unmanned aerial vehicle to operate based on the predicted navigation information.
Optionally, the constructing a simulation environment corresponding to the simulation target unmanned aerial vehicle based on the device parameters of the target unmanned aerial vehicle and the environment parameters of different known environments includes:
according to the equipment parameters of the target unmanned aerial vehicle, constructing a digital twin model corresponding to the target unmanned aerial vehicle as the simulation target unmanned aerial vehicle;
building a simulation environment set corresponding to different known environments according to the environment parameters of the different known environments;
And constructing the simulation environment based on the digital twin model and the simulation environment set.
Optionally, the constructing a digital twin model corresponding to the target unmanned aerial vehicle as the simulation target unmanned aerial vehicle according to the device parameters of the target unmanned aerial vehicle includes:
based on the control and state estimation system test parameters in the equipment parameters, constructing a control and state estimation system simulation model of the simulation target unmanned aerial vehicle;
based on simulation control parameters output by the control and state estimation system simulation model and power system test parameters in the equipment parameters, constructing a power system simulation model of the simulation target unmanned aerial vehicle;
based on the simulation power system parameters output by the power system simulation model and the dynamic model test parameters in the equipment parameters, constructing a dynamic simulation model of the simulation target unmanned aerial vehicle;
based on simulation dynamics parameters output by the dynamics simulation model and rigid motion model test parameters in the equipment parameters, constructing a rigid motion simulation model of the simulation target unmanned aerial vehicle;
and constructing the digital twin model according to the control and state estimation system simulation model, the power system simulation model, the dynamics simulation model and the rigid body motion simulation model.
Optionally, the unmanned aerial vehicle navigation method further includes:
obtaining simulation motion parameters output by the rigid motion simulation model;
and updating the control and state estimation system simulation model and/or the dynamics simulation model according to the simulation motion parameters.
In a second aspect, an embodiment of the present application provides an unmanned aerial vehicle navigation device, including:
the simulation environment construction module is used for constructing a simulation environment corresponding to the simulation target unmanned aerial vehicle based on the equipment parameters of the target unmanned aerial vehicle and the environment parameters of different known environments;
the model construction module is used for constructing a deep reinforcement learning model for unmanned aerial vehicle navigation based on simulation operation information of the simulation target unmanned aerial vehicle in the simulation environment;
and the navigation module is used for navigating according to the real operation information of the target unmanned aerial vehicle and by utilizing the deep reinforcement learning model when the target unmanned aerial vehicle operates.
From the above technical solutions, the embodiments of the present application have the following advantages:
According to the method and the device, a simulation environment corresponding to the simulation target unmanned aerial vehicle can be built from the equipment parameters of the target unmanned aerial vehicle and the environment parameters of different known environments, and a deep reinforcement learning model for unmanned aerial vehicle navigation can then be built based on the simulation running information of the simulation target unmanned aerial vehicle in the simulation environment. Therefore, when the target unmanned aerial vehicle operates, navigation can be performed according to the real operation information of the target unmanned aerial vehicle and by using the deep reinforcement learning model. Because the deep reinforcement learning model is constructed based on the simulation operation information of the simulation target unmanned aerial vehicle in the simulation environment, the model can provide a simulation operation scheme of the simulation target unmanned aerial vehicle in the simulation environment. Therefore, when the target unmanned aerial vehicle actually operates, a real operation scheme can be obtained using the model and the real operation information of the target unmanned aerial vehicle, so that navigation of the target unmanned aerial vehicle is achieved without requiring prior familiarity with the environment, which improves the efficiency and accuracy of unmanned aerial vehicle navigation and allows the unmanned aerial vehicle to be used to its full potential.
Drawings
Fig. 1 is a flowchart of a method for navigating an unmanned aerial vehicle according to an embodiment of the present application;
FIG. 2 is a flowchart of an implementation of constructing a deep reinforcement learning model according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an unmanned aerial vehicle navigation device according to an embodiment of the present application.
Detailed Description
As described above, the inventors found in their study of unmanned aerial vehicle navigation methods that: the existing unmanned aerial vehicle navigation method generally uses an environment sensing device carried by the unmanned aerial vehicle to become familiar with a known environment, builds an environment model corresponding to the known environment in advance, and realizes autonomous navigation based on the environment model. Therefore, in the existing unmanned aerial vehicle navigation method, in order to realize a navigation scheme with higher accuracy, higher precision is required of the environment model. In this case, if the known environment changes or the unmanned aerial vehicle enters an unknown environment, it is difficult to achieve accurate autonomous navigation based on the environment model generated in advance.
In order to solve the above problems, an embodiment of the present application provides a method for navigating an unmanned aerial vehicle. The method may include: through the equipment parameters of the target unmanned aerial vehicle and the environment parameters of different known environments, a simulation environment corresponding to the simulation target unmanned aerial vehicle can be constructed, and then a deep reinforcement learning model for unmanned aerial vehicle navigation is constructed based on simulation running information of the simulation target unmanned aerial vehicle in the simulation environment. Therefore, when the target unmanned aerial vehicle operates, navigation can be performed according to the real operation information of the target unmanned aerial vehicle and by using the deep reinforcement learning model.
Because the deep reinforcement learning model is constructed based on the simulation operation information of the simulation target unmanned aerial vehicle in the simulation environment, the model can provide a simulation operation scheme of the simulation target unmanned aerial vehicle in the simulation environment. Therefore, when the target unmanned aerial vehicle actually operates, a real operation scheme can be formed using the model and the real operation information of the target unmanned aerial vehicle, so that navigation of the target unmanned aerial vehicle is achieved without requiring prior familiarity with the environment, which improves the efficiency and accuracy of unmanned aerial vehicle navigation and allows the unmanned aerial vehicle to be used to its full potential.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
Fig. 1 is a flowchart of a method for unmanned aerial vehicle navigation according to an embodiment of the present application. Referring to fig. 1, the unmanned aerial vehicle navigation method provided in the embodiment of the present application may include:
S101: and constructing a simulation environment corresponding to the simulation target unmanned aerial vehicle based on the equipment parameters of the target unmanned aerial vehicle and the environment parameters of different known environments.
Because different types of unmanned aerial vehicles, such as multi-rotor, single-rotor and fixed-wing unmanned aerial vehicles, are configured with different equipment, their influence on environmental parameters such as air flow and air pressure also differs. On this basis, the simulation environment of the simulation target unmanned aerial vehicle can be constructed by comprehensively considering the equipment parameters of the target unmanned aerial vehicle and the parameters of different known environments, so that a higher-precision simulation environment is obtained. The method for acquiring the device parameters of the target unmanned aerial vehicle may not be specifically limited. For example, the device parameters of the target unmanned aerial vehicle may be obtained from the manufacturer associated with the target unmanned aerial vehicle, or, if the control system of the target unmanned aerial vehicle is configured with an unmanned aerial vehicle database, the device parameters may be obtained directly from that database. In addition, the method for acquiring the environmental parameters of different known environments is not limited in particular. For example, the environmental parameters of the corresponding environment may be determined based on an existing environment model, or the environmental parameters of the operating environment may be obtained by using the environment sensing device carried by the target unmanned aerial vehicle each time the target unmanned aerial vehicle operates.
In addition, the embodiment of the present application is not limited to a specific manner of constructing the simulation environment corresponding to the simulation target unmanned aerial vehicle, and for convenience of understanding, the following description is made with reference to a possible implementation manner.
In one possible implementation manner, S101 may specifically include: according to the equipment parameters of the target unmanned aerial vehicle, constructing a digital twin model corresponding to the target unmanned aerial vehicle as a simulation target unmanned aerial vehicle; constructing simulation environment sets corresponding to different known environments according to environment parameters of the different known environments; and constructing an unmanned aerial vehicle simulation environment based on the digital twin model and the simulation environment set. Therefore, high-precision simulation of the target unmanned aerial vehicle is realized through a digital twin technology, and the simulation environment set is combined, so that a deep reinforcement learning model with higher accuracy can be constructed later, and accurate unmanned aerial vehicle navigation is realized.
Specifically, the construction process of the digital twin model corresponding to the target unmanned aerial vehicle may include: based on the control and state estimation system test parameters in the equipment parameters, constructing a control and state estimation system simulation model corresponding to the simulation target unmanned aerial vehicle; based on simulation control parameters output by the control and state estimation system simulation model and power system test parameters in equipment parameters, constructing a power system simulation model corresponding to the simulation target unmanned aerial vehicle; based on simulation power system parameters output by the power system simulation model and dynamic model test parameters in equipment parameters, constructing a dynamic simulation model corresponding to the simulation target unmanned aerial vehicle; based on simulation dynamics parameters output by the dynamics simulation model and rigid motion model test parameters in equipment parameters, constructing a rigid motion simulation model corresponding to the simulation target unmanned aerial vehicle; and constructing a digital twin model according to the control and state estimation system simulation model, the power system simulation model, the dynamics simulation model and the rigid body motion simulation model.
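To make the chaining of these sub-models concrete, the following Python sketch wires placeholder versions of the four simulation models together in the order described above; all class names, parameter names and the pass-through behaviour are illustrative assumptions rather than the patent's actual implementation.

```python
# Minimal, hypothetical sketch of the chained digital-twin sub-models described above.
class SubModel:
    """Placeholder for one sub-model; a real version would hold its test parameters."""
    def __init__(self, test_params):
        self.params = test_params

    def update(self, upstream_output, dt):
        # A real model would integrate physics here; we just tag the stage.
        return {"from": self.params["name"], "input": upstream_output, "dt": dt}

class DigitalTwin:
    def __init__(self, device_params):
        # Construction order mirrors the description: control/state estimation ->
        # power system -> dynamics -> rigid-body motion.
        self.ctrl = SubModel({"name": "control_state_estimation", **device_params})
        self.power = SubModel({"name": "power_system", **device_params})
        self.dyn = SubModel({"name": "dynamics", **device_params})
        self.rigid = SubModel({"name": "rigid_motion", **device_params})

    def step(self, setpoint, dt=0.01):
        control = self.ctrl.update(setpoint, dt)
        power = self.power.update(control, dt)
        dynamics = self.dyn.update(power, dt)
        motion = self.rigid.update(dynamics, dt)
        return motion  # simulated motion parameters, e.g. airflow around the UAV

twin = DigitalTwin({"mass_kg": 1.2})
print(twin.step(setpoint={"target_attitude": [0.0, 0.0, 0.0]}))
```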
In practical application, taking a quad-rotor unmanned aerial vehicle as an example of the target unmanned aerial vehicle, for the control and state estimation system model, a cascaded (series) PID (Proportional Integral Derivative) control algorithm may be adopted to determine the flight control simulation model, a sensor model with added sensor noise may be adopted as the state estimation simulation model, and the control and state estimation system simulation model is then constructed based on the flight control simulation model and the state estimation simulation model.
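As a concrete illustration of the cascaded PID idea mentioned above, the following sketch runs an outer position loop feeding an inner velocity loop on a single axis against a toy plant; the gains, loop structure and plant model are assumptions for illustration only.

```python
# Hypothetical cascaded (series) PID sketch: outer position loop -> inner velocity loop.
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, error, dt):
        self.integral += error * dt
        derivative = (error - self.prev_err) / dt
        self.prev_err = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

outer = PID(kp=1.2, ki=0.0, kd=0.1)   # position error -> velocity setpoint
inner = PID(kp=4.0, ki=0.5, kd=0.05)  # velocity error -> thrust/attitude command

dt, position, velocity, target = 0.01, 0.0, 0.0, 1.0
for _ in range(300):
    vel_setpoint = outer.step(target - position, dt)
    command = inner.step(vel_setpoint - velocity, dt)
    velocity += command * dt          # toy plant, stands in for the power/dynamics models
    position += velocity * dt
print(round(position, 3))
```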
In addition, in order to improve the accuracy of the simulation target unmanned aerial vehicle obtained based on the digital twin technology, in the embodiment of the application, the simulation target unmanned aerial vehicle can be optimized by using the simulation motion parameters output by the rigid motion simulation model. Specifically, the simulation motion parameters output by the rigid motion simulation model can be obtained; and updating the control and state estimation system simulation model and/or the dynamics simulation model according to the simulation motion parameters. Here, the simulation motion parameter may be a simulation air flow rate in a simulation environment in which the simulation target unmanned aerial vehicle is located.
S102: based on simulation running information of a simulation target unmanned aerial vehicle in a simulation environment, a deep reinforcement learning model for unmanned aerial vehicle navigation is constructed.
Here, the simulation running information may include simulation visual information of the simulation target unmanned aerial vehicle in the simulation environment, simulation navigation information of the simulation target unmanned aerial vehicle in the simulation environment, and simulation task information of the simulation target unmanned aerial vehicle in the simulation environment. In addition, for the construction process of the deep reinforcement learning model, reference is made to the description below for technical details.
In addition, in the embodiment of the application, in order to improve the accuracy of the deep reinforcement learning model, the deep reinforcement learning model can be optimized through a virtual-real migration technology. Specifically, a navigation test environment of the target unmanned aerial vehicle can be built, a navigation test of the deep reinforcement learning model is conducted in the navigation test environment to obtain a test result, test random information is then determined based on the simulation environment and the navigation test environment, and the deep reinforcement learning model can then be updated according to the test result and the test random information. The navigation test environment is a real environment, so the test random information can characterize the gap between the simulation environment and the real environment. In practical application, the test random information may be random environment information, such as random illumination information and random wind speed information, and may also be a dynamic blur error model affected by the angular speed of the unmanned aerial vehicle camera, a randomized unmanned aerial vehicle dynamic response model, and the like. Thus, by simulating the gap between the virtual and the real and optimizing the deep reinforcement learning model in this way, the gap can be reduced and the accuracy of the deep reinforcement learning model can be improved.
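The test random information described above resembles domain randomization. A hedged sketch of one way to draw such factors is shown below; the factor names and ranges are assumptions, not values from the patent.

```python
import random

def sample_test_random_info():
    """One draw of randomized environment/sensor/dynamics factors (illustrative only)."""
    return {
        "illumination_scale": random.uniform(0.6, 1.4),       # random lighting
        "wind_speed_mps": random.uniform(0.0, 8.0),            # random wind
        "motion_blur_px": random.uniform(0.0, 3.0),            # blur driven by camera angular rate
        "dynamics_response_scale": random.gauss(1.0, 0.05),    # randomized UAV dynamic response
    }

# During the model update step, each simulated training episode would be replayed under
# a fresh draw, so the policy is exposed to the variability that separates the simulation
# environment from the navigation test environment.
print(sample_test_random_info())
```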
S103: when the target unmanned aerial vehicle operates, navigation is performed by utilizing the deep reinforcement learning model according to the real operation information of the target unmanned aerial vehicle.
Here, the real operation information may include real visual information and real task information when the target unmanned aerial vehicle is operated.
The real visual information may reflect the environment information captured by the target unmanned aerial vehicle during operation. In particular, if the target unmanned aerial vehicle is configured with an RGB (Red Green Blue) camera and a depth camera, the real visual information can be obtained in the following manner: capturing images of the operating environment with the RGB camera and the depth camera respectively; performing feature processing and image recognition on the respectively captured images to obtain the respectively processed images and image recognition results; and taking the respectively processed images and the image recognition results as the real visual information. In addition, the real visual information may be acquired in real time or according to a preset acquisition frequency, for example, 60 frames per second, which is not particularly limited in this embodiment of the present application.
The real task information may represent the destination of the target unmanned aerial vehicle in its environment at runtime. Specifically, the real task information can be obtained through an instruction issued by the unmanned aerial vehicle operator: the operator can directly use an information input module configured on the target unmanned aerial vehicle to issue an instruction containing the real task information to the target unmanned aerial vehicle. For example, the information input module may be embodied as a keyboard, and the unmanned aerial vehicle operator manually inputs the instruction containing the real task information by operating the keyboard, so that the target unmanned aerial vehicle obtains the instruction. Alternatively, the information input module may be embodied as a voice acquisition module, the unmanned aerial vehicle operator inputs the instruction containing the real task information by voice, and the target unmanned aerial vehicle performs voice recognition to determine the real task information.
In addition, the implementation process of performing actual navigation of the target unmanned aerial vehicle by using the deep reinforcement learning model may not be specifically limited. For ease of understanding, the following description is provided in connection with one possible embodiment.
In one possible implementation, S103 may specifically include: acquiring the real visual information and the real task information when the target unmanned aerial vehicle runs; inputting the real visual information and the real task information into the deep reinforcement learning model; obtaining the predicted navigation information output by the deep reinforcement learning model; and controlling the operation of the target unmanned aerial vehicle based on the predicted navigation information. The predicted navigation information is navigation information matched with the real task information. Therefore, the deep reinforcement learning model constructed from the simulation operation information of the simulation target unmanned aerial vehicle in the simulation environment can, once provided with the real visual information and the real task information, produce a real operation scheme to realize the navigation of the target unmanned aerial vehicle without requiring prior familiarity with the environment, which improves the efficiency and accuracy of unmanned aerial vehicle navigation and allows the unmanned aerial vehicle to be used to its full potential.
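The runtime steps of S103 can be summarized in a short sketch like the following; the drone interface methods (capture, read instruction, execute) and the stub model are hypothetical stand-ins rather than the patent's API.

```python
# Hypothetical runtime loop for S103; the StubDrone interface is illustrative only.
class StubDrone:
    def __init__(self, steps=3):
        self.steps = steps

    def capture_visual_information(self):   # processed RGB/depth images + recognition result
        return {"rgb": "frame", "depth": "frame", "recognition": []}

    def read_task_instruction(self):        # keyboard or voice instruction
        return "find the boy with the yellow hat"

    def task_finished(self):
        self.steps -= 1
        return self.steps < 0

    def execute(self, nav):
        print("executing", nav)

def navigate(drone, model_predict):
    task = drone.read_task_instruction()
    while not drone.task_finished():
        visual = drone.capture_visual_information()
        predicted_nav = model_predict(visual, task)   # deep reinforcement learning model inference
        drone.execute(predicted_nav)

navigate(StubDrone(), model_predict=lambda visual, task: {"waypoint": (1.0, 0.0, 2.0)})
```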
It can be appreciated that in the unmanned aerial vehicle navigation process, related operations can be performed not only for the target unmanned aerial vehicle, such as building an unmanned aerial vehicle simulation environment, building a deep reinforcement learning model for unmanned aerial vehicle navigation, and the like, but also for different unmanned aerial vehicles, so as to realize autonomous navigation for various unmanned aerial vehicles. In order to facilitate understanding of a navigation method for a specific unmanned aerial vehicle, in the embodiment of the present application, a target unmanned aerial vehicle is taken as an example to make a detailed description.
Based on the above relevant content of S101-S103, in the embodiment of the present application, a simulation environment corresponding to the simulation target unmanned aerial vehicle can first be constructed from the device parameters of the target unmanned aerial vehicle and the environmental parameters of different known environments, and a deep reinforcement learning model for unmanned aerial vehicle navigation is then constructed based on the simulation operation information of the simulation target unmanned aerial vehicle in the simulation environment. Therefore, when the target unmanned aerial vehicle operates, navigation can be performed according to the real operation information of the target unmanned aerial vehicle and by using the deep reinforcement learning model. Because the deep reinforcement learning model is constructed based on the simulation operation information of the simulation target unmanned aerial vehicle in the simulation environment, the model can provide a simulation operation scheme of the simulation target unmanned aerial vehicle in the simulation environment. Therefore, when the target unmanned aerial vehicle actually operates, a real operation scheme can be obtained using the model and the real operation information of the target unmanned aerial vehicle, so that navigation of the target unmanned aerial vehicle is achieved without requiring prior familiarity with the environment, which improves the efficiency and accuracy of unmanned aerial vehicle navigation and allows the unmanned aerial vehicle to be used to its full potential.
In order to achieve accurate autonomous navigation of the unmanned aerial vehicle, the embodiment of the application can adopt deep reinforcement learning to navigate the target unmanned aerial vehicle. Based on this, embodiments of the present application may provide one possible implementation of constructing a deep reinforcement learning model. Which may specifically include S201-S203. S201 to S203 will be described below with reference to the embodiments and drawings, respectively.
Fig. 2 is a flowchart of an implementation manner of constructing a deep reinforcement learning model according to an embodiment of the present application. As shown in connection with fig. 2, S201 to S203 may specifically include:
s201: based on the simulation running information, a navigation strategy model for planning navigation information is constructed by utilizing a deep learning algorithm.
The embodiment of the present application may not be limited to a specific process for constructing the navigation policy model, and for convenience of understanding, a possible implementation will be described below.
In one possible implementation, S201 may specifically include: taking simulation visual information of the simulation target unmanned aerial vehicle in a simulation environment as input of a navigation prediction model, and taking simulation navigation information of the simulation target unmanned aerial vehicle in the simulation environment as output of the navigation prediction model to construct the navigation prediction model; the simulation task information of the simulation target unmanned aerial vehicle in the simulation environment and the output of the navigation prediction model are taken as the input of the navigation matching model together, and the matching degree between the output of the navigation prediction model and the simulation task information is taken as the output of the navigation matching model, so that the navigation matching model is constructed; and constructing a navigation strategy model based on the navigation prediction model and the navigation matching model. Here, the navigation prediction model can predict the navigation path based on the simulation visual information of the simulation target unmanned aerial vehicle in the simulation environment, and the navigation matching model can judge the matching degree between the navigation path predicted by the navigation prediction model and the simulation task information, so as to facilitate execution of the task.
The navigation prediction model may be composed of a multi-layered network structure. Specifically, taking as examples of the simulated visual information an RGB image captured by the RGB camera and subjected to the feature processing, a depth image captured by the depth camera and subjected to the feature processing, and an image recognition result of the depth image, the deep learning models corresponding to these three kinds of simulated visual information jointly form the navigation prediction model. The network structure of the deep learning model corresponding to the RGB image can be embodied as a ResNet50 network as the first layer and a fully-connected layer as the second layer; the network structures of the deep learning models corresponding to the depth image and to the image recognition result of the depth image can each be embodied as a CNN (Convolutional Neural Network) as the first layer and a fully-connected layer as the second layer. Further, processing the deep learning networks corresponding to the three kinds of simulated visual information may include performing joint embedding training on the three network structures, embedding them into the same vector space for information fusion, and storing them through a memory network.
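A hedged PyTorch sketch of a three-branch navigation prediction network of the kind described above (ResNet50 plus a fully-connected layer for the RGB image, small CNN branches for the depth image and its recognition result, joint embedding into one vector space, and a simple recurrent memory) is given below; the layer sizes, the GRU memory and the action-logit head are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class NavigationPredictionNet(nn.Module):
    def __init__(self, embed_dim=256, num_actions=8):
        super().__init__()
        rgb_backbone = resnet50(weights=None)
        rgb_backbone.fc = nn.Linear(rgb_backbone.fc.in_features, embed_dim)  # ResNet50 + FC branch
        self.rgb_branch = rgb_backbone
        self.depth_branch = nn.Sequential(                  # CNN + FC branch for the depth image
            nn.Conv2d(1, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, embed_dim))
        self.seg_branch = nn.Sequential(                    # CNN + FC branch for the recognition result
            nn.Conv2d(1, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, embed_dim))
        self.memory = nn.GRU(embed_dim, embed_dim, batch_first=True)  # simple memory over fused embeddings
        self.head = nn.Linear(embed_dim, num_actions)        # predicted navigation action logits

    def forward(self, rgb, depth, seg):
        fused = self.rgb_branch(rgb) + self.depth_branch(depth) + self.seg_branch(seg)  # shared vector space
        out, _ = self.memory(fused.unsqueeze(1))
        return self.head(out[:, -1])

net = NavigationPredictionNet()
logits = net(torch.randn(1, 3, 224, 224), torch.randn(1, 1, 224, 224), torch.randn(1, 1, 224, 224))
print(logits.shape)  # torch.Size([1, 8])
```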
In addition, the navigation matching model may be a classification model, such as a Transformer model. For example, if the simulated task information is embodied as searching for a boy with a yellow hat, then when the simulated navigation information is embodied as at least one boy with a yellow hat appearing in the simulated visual information of the target unmanned aerial vehicle, the matching degree output by the navigation matching model can be expressed as a match; when the simulated navigation information is embodied as no boy with a yellow hat appearing in the simulated visual information of the target unmanned aerial vehicle, the matching degree output by the navigation matching model can be expressed as a mismatch.
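A minimal sketch of the navigation matching model as a binary match/mismatch classifier is shown below; the tiny Transformer encoder configuration and the two-token input arrangement are assumptions for illustration.

```python
import torch
import torch.nn as nn

class NavigationMatchingModel(nn.Module):
    def __init__(self, embed_dim=256):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.classifier = nn.Linear(embed_dim, 2)   # 0 = mismatch, 1 = match

    def forward(self, nav_embedding, task_embedding):
        # Treat the predicted navigation output and the task description as a two-token
        # sequence and classify from the first token's encoding.
        tokens = torch.stack([nav_embedding, task_embedding], dim=1)
        encoded = self.encoder(tokens)
        return self.classifier(encoded[:, 0])

matcher = NavigationMatchingModel()
print(matcher(torch.randn(1, 256), torch.randn(1, 256)).shape)  # torch.Size([1, 2])
```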
S202: based on the navigation strategy model, constructing a navigation evaluation model for evaluating navigation information by using a reinforcement learning algorithm.
The process of constructing the navigation evaluation model may not be specifically limited, and for convenience of understanding, the following description will be made with reference to one possible implementation.
In one possible implementation, S202 may specifically include: obtaining navigation information matched with the simulation task information from the simulation navigation information output by the navigation prediction model as target navigation information; and taking the output of the navigation strategy model and the target navigation information together as the input of the navigation evaluation model, and taking the reward evaluation value corresponding to the target navigation information as the output of the navigation evaluation model, to construct the navigation evaluation model. Here, the reward evaluation value may be determined based on the criteria required in the actual application, for example, the completion time of the simulation task, or the like. In this way, the accuracy of the navigation prediction model is judged by evaluating the target navigation information, which facilitates optimizing the navigation prediction model based on the navigation evaluation model.
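The navigation evaluation model can be sketched as a small critic-style network that scores the policy output together with the target navigation information and emits a reward estimate; the layer sizes and input dimensions below are assumptions.

```python
import torch
import torch.nn as nn

class NavigationEvaluationModel(nn.Module):
    def __init__(self, embed_dim=256):
        super().__init__()
        self.value_head = nn.Sequential(
            nn.Linear(embed_dim * 2, 128), nn.ReLU(),
            nn.Linear(128, 1))   # scalar reward evaluation value

    def forward(self, policy_output, target_navigation):
        return self.value_head(torch.cat([policy_output, target_navigation], dim=-1))

critic = NavigationEvaluationModel()
print(critic(torch.randn(1, 256), torch.randn(1, 256)).shape)  # torch.Size([1, 1])
```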
S203: and optimizing the navigation strategy model based on the navigation evaluation model until the navigation strategy model is converged, and taking the converged navigation strategy model as a deep reinforcement learning model.
The navigation evaluation model is used to optimize the parameters of the navigation strategy model, so that the accuracy of the navigation prediction model is further improved, the efficiency and accuracy of unmanned aerial vehicle navigation are improved, and the unmanned aerial vehicle can be used to its full potential.
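The optimization in S203 can be illustrated with an actor-critic style loop in which the evaluation model's score drives updates of the strategy model; the toy networks, synthetic batch and loss forms below are assumptions, not the patent's training procedure.

```python
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))      # stands in for the navigation strategy model
critic = nn.Sequential(nn.Linear(16 + 4, 64), nn.ReLU(), nn.Linear(64, 1))  # stands in for the navigation evaluation model
policy_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

obs = torch.randn(32, 16)      # simulated visual/task features
reward = torch.randn(32, 1)    # reward evaluation values (e.g. based on task completion time)

for step in range(200):
    # Critic: regress the reward evaluation value for the current policy's navigation output.
    nav = policy(obs).detach()
    critic_loss = nn.functional.mse_loss(critic(torch.cat([obs, nav], dim=-1)), reward)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Policy: maximize the critic's score of its own navigation output until it converges.
    policy_loss = -critic(torch.cat([obs, policy(obs)], dim=-1)).mean()
    policy_opt.zero_grad(); policy_loss.backward(); policy_opt.step()

print(round(policy_loss.item(), 3))
```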
Based on the above-mentioned related content of S201-S203, in the embodiment of the present application, navigation information is planned by constructing a navigation strategy model, and a navigation evaluation model is constructed to evaluate the navigation information and to update and optimize the navigation strategy model. Therefore, when the target unmanned aerial vehicle actually operates, the finally obtained deep reinforcement learning model and the real operation information of the target unmanned aerial vehicle can be used to obtain a real operation scheme to realize the navigation of the target unmanned aerial vehicle, without requiring prior familiarity with the environment, thereby improving the efficiency and accuracy of unmanned aerial vehicle navigation and allowing the unmanned aerial vehicle to be used to its full potential.
Based on the unmanned aerial vehicle navigation method provided by the embodiment, the embodiment of the application also provides an unmanned aerial vehicle navigation device. The unmanned aerial vehicle navigation device is described below with reference to the embodiments and drawings, respectively.
Fig. 3 is a schematic structural diagram of an unmanned aerial vehicle navigation device according to an embodiment of the present application. Referring to fig. 3, the unmanned aerial vehicle navigation device 300 provided in the embodiment of the present application may include:
The simulation environment construction module 301 is configured to construct a simulation environment corresponding to the simulation target unmanned aerial vehicle based on the device parameter of the target unmanned aerial vehicle and the environment parameters of different known environments;
the model construction module 302 is configured to construct a deep reinforcement learning model for unmanned aerial vehicle navigation based on simulation operation information of the simulation target unmanned aerial vehicle in a simulation environment;
the navigation module 303 is configured to navigate by using the deep reinforcement learning model according to real operation information of the target unmanned aerial vehicle when the target unmanned aerial vehicle is operating.
In the embodiment of the application, through the cooperation of the simulation environment construction module 301, the model construction module 302 and the navigation module 303, when the target unmanned aerial vehicle actually operates, the model and the real operation information of the target unmanned aerial vehicle can be used to obtain a real operation scheme to realize the navigation of the target unmanned aerial vehicle, without requiring prior familiarity with the environment, so that the efficiency and accuracy of unmanned aerial vehicle navigation are improved and the unmanned aerial vehicle can be used to its full potential.
In one embodiment, to improve the efficiency and accuracy of unmanned aerial vehicle navigation, the model building module 302 may specifically include:
the navigation strategy model construction module is used for constructing a navigation strategy model for planning navigation information based on simulation operation information by utilizing a deep learning algorithm;
The navigation evaluation model construction module is used for constructing a navigation evaluation model for evaluating navigation information based on the navigation strategy model by utilizing a reinforcement learning algorithm;
and the model optimization module is used for optimizing the navigation strategy model based on the navigation evaluation model until the navigation strategy model is converged, and taking the converged navigation strategy model as a deep reinforcement learning model.
As an implementation manner, in order to improve the efficiency and accuracy of unmanned aerial vehicle navigation, the navigation policy model building module specifically may include:
the first construction module is used for taking simulation visual information of the simulation target unmanned aerial vehicle in a simulation environment as input of a navigation prediction model, and taking simulation navigation information of the simulation target unmanned aerial vehicle in the simulation environment as output of the navigation prediction model to construct the navigation prediction model;
the second construction module is used for taking the simulation task information of the simulation target unmanned aerial vehicle in the simulation environment and the output of the navigation prediction model together as the input of the navigation matching model, and taking the matching degree between the output of the navigation prediction model and the simulation task information as the output of the navigation matching model to construct the navigation matching model;
and the third construction module is used for constructing a navigation strategy model based on the navigation prediction model and the navigation matching model.
As an implementation manner, in order to improve the efficiency and accuracy of unmanned aerial vehicle navigation, the navigation evaluation model building module specifically may include:
the fourth construction module is used for acquiring navigation information matched with the simulation task information from the simulation navigation information output by the navigation prediction model as target navigation information;
and the fifth construction module is used for constructing a navigation evaluation model by taking the output of the navigation strategy model and the target navigation information together as the input of the navigation evaluation model and taking the reward evaluation value corresponding to the target navigation information as the output of the navigation evaluation model.
As an embodiment, in order to improve the efficiency and accuracy of unmanned aerial vehicle navigation, the unmanned aerial vehicle navigation device 300 further includes:
the navigation test module is used for constructing a navigation test environment of the target unmanned aerial vehicle, and performing navigation test on the deep reinforcement learning model in the navigation test environment to obtain a test result;
the test random information determining module is used for determining test random information based on the simulation environment and the navigation test environment;
and the first model updating module is used for updating the deep reinforcement learning model according to the test result and the test random information.
As an embodiment, in order to improve the efficiency and accuracy of unmanned aerial vehicle navigation, the navigation module 303 specifically includes:
the first navigation module is used for acquiring real visual information and real task information when the target unmanned aerial vehicle runs;
the second navigation module is used for inputting real visual information and real task information into the deep reinforcement learning model;
the third navigation module is used for acquiring predicted navigation information output by the deep reinforcement learning model; the predicted navigation information is the navigation information matched with the real task information;
and the fourth navigation module is used for controlling the operation of the target unmanned aerial vehicle based on the predicted navigation information.
As an embodiment, to improve the efficiency and accuracy of unmanned aerial vehicle navigation, the simulation environment construction module 301 may specifically include:
the digital twin model construction module is used for constructing a digital twin model corresponding to the target unmanned aerial vehicle as a simulation target unmanned aerial vehicle according to the equipment parameters of the target unmanned aerial vehicle;
the simulation environment set building module is used for building simulation environment sets corresponding to different known environments according to the environment parameters of the different known environments;
the simulation environment construction submodule is used for constructing a simulation environment based on the digital twin model and the simulation environment set.
As an implementation manner, in order to improve the efficiency and accuracy of unmanned aerial vehicle navigation, the digital twin model building module specifically may include:
the first simulation model construction module is used for constructing a control and state estimation system simulation model of the simulation target unmanned aerial vehicle based on control and state estimation system test parameters in the equipment parameters;
the second simulation model construction module is used for constructing a power system simulation model of the simulation target unmanned aerial vehicle based on simulation control parameters output by the control and state estimation system simulation model and power system test parameters in the equipment parameters;
the third simulation model construction module is used for constructing a dynamic simulation model of the simulation target unmanned aerial vehicle based on the simulation power system parameters output by the power system simulation model and the dynamic model test parameters in the equipment parameters;
the fourth simulation model construction module is used for constructing a rigid motion simulation model of the simulation target unmanned aerial vehicle based on simulation dynamic parameters output by the dynamic simulation model and rigid motion model test parameters in equipment parameters;
and the fifth simulation model construction module is used for constructing a digital twin model according to the control and state estimation system simulation model, the power system simulation model, the dynamics simulation model and the rigid body motion simulation model.
As an embodiment, in order to improve the efficiency and accuracy of unmanned aerial vehicle navigation, the unmanned aerial vehicle navigation device 300 further includes:
the simulation motion parameter acquisition module is used for acquiring simulation motion parameters output by the rigid motion simulation model;
and the second model updating module is used for updating the control and state estimation system simulation model and/or the dynamic simulation model according to the simulation motion parameters.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting thereof; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (7)

1. A method of unmanned aerial vehicle navigation, comprising:
constructing a simulation environment corresponding to the simulation target unmanned aerial vehicle based on the equipment parameters of the target unmanned aerial vehicle and the environment parameters of different known environments; the simulation target unmanned aerial vehicle is a digital twin model of the target unmanned aerial vehicle constructed based on the equipment parameters;
Based on the simulation operation information of the simulation target unmanned aerial vehicle in the simulation environment, constructing a deep reinforcement learning model for unmanned aerial vehicle navigation;
when the target unmanned aerial vehicle runs, navigation is carried out by utilizing the deep reinforcement learning model according to the real running information of the target unmanned aerial vehicle;
based on the simulation operation information of the simulation target unmanned aerial vehicle in the simulation environment, constructing a deep reinforcement learning model for unmanned aerial vehicle navigation, comprising:
based on the simulation running information, constructing a navigation strategy model for planning navigation information by using a deep learning algorithm;
constructing a navigation evaluation model for evaluating the navigation information based on the navigation strategy model by using a reinforcement learning algorithm;
optimizing the navigation strategy model based on the navigation evaluation model until the navigation strategy model converges, and taking the converged navigation strategy model as the deep reinforcement learning model;
based on the simulation operation information, and by using a deep learning algorithm, constructing a navigation strategy model for planning navigation information, including:
taking the simulation visual information of the simulation target unmanned aerial vehicle in the simulation environment as the input of a navigation prediction model, and taking the simulation navigation information of the simulation target unmanned aerial vehicle in the simulation environment as the output of the navigation prediction model, and constructing the navigation prediction model;
The simulation task information of the simulation target unmanned aerial vehicle in the simulation environment and the output of the navigation prediction model are taken as the input of a navigation matching model together, and the matching degree between the output of the navigation prediction model and the simulation task information is taken as the output of the navigation matching model, so that the navigation matching model is constructed;
constructing the navigation strategy model based on the navigation prediction model and the navigation matching model;
when the target unmanned aerial vehicle operates, according to the real operation information of the target unmanned aerial vehicle and before the navigation is performed by using the deep reinforcement learning model, the method further comprises the following steps:
constructing a navigation test environment of the target unmanned aerial vehicle, and performing navigation test on the deep reinforcement learning model in the navigation test environment to obtain a test result;
determining test random information based on the simulation environment and the navigation test environment;
and updating the deep reinforcement learning model according to the test result and the test random information.
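By way of illustration only (this sketch is not part of the claims), the navigation strategy model of claim 1 can be read as a navigation prediction sub-network plus a navigation matching sub-network. The claims do not specify any network architecture, so every module name, layer size and tensor dimension below is an assumption, written here in Python/PyTorch.

# Illustrative sketch only: all architectures, dimensions and names are assumptions.
import torch
import torch.nn as nn

class NavigationPredictionModel(nn.Module):
    """Maps simulated visual observations to candidate navigation information."""
    def __init__(self, nav_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(              # simple CNN image encoder
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(32, nav_dim)         # e.g. velocity + yaw command

    def forward(self, image):
        return self.head(self.encoder(image))

class NavigationMatchingModel(nn.Module):
    """Scores how well the predicted navigation information matches the task."""
    def __init__(self, nav_dim=4, task_dim=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(nav_dim + task_dim, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid())        # matching degree in [0, 1]

    def forward(self, nav_info, task_info):
        return self.mlp(torch.cat([nav_info, task_info], dim=-1))

class NavigationStrategyModel(nn.Module):
    """Policy model combining the prediction and matching sub-models."""
    def __init__(self):
        super().__init__()
        self.predict = NavigationPredictionModel()
        self.match = NavigationMatchingModel()

    def forward(self, image, task_info):
        nav_info = self.predict(image)
        return nav_info, self.match(nav_info, task_info)

Keeping the two heads separate mirrors the claim's decomposition: navigation information is first proposed from visual input and only then scored against the task information.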
2. The method of claim 1, wherein the constructing a navigation evaluation model for evaluating the navigation information based on the navigation strategy model and using a reinforcement learning algorithm comprises:
obtaining navigation information matched with the simulation task information from the simulation navigation information output by the navigation prediction model as target navigation information;
and taking the output of the navigation strategy model and the target navigation information together as the input of the navigation evaluation model, and taking the reward evaluation value corresponding to the target navigation information as the output of the navigation evaluation model, to construct the navigation evaluation model.
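As a hedged illustration of claim 2 (not the patent's actual implementation), the navigation evaluation model can be sketched as a critic that consumes the strategy model's output together with the target navigation information and emits a scalar reward evaluation value; the dimensions and the loss below are assumptions.

# Illustrative only: a minimal critic in the spirit of claim 2.
import torch
import torch.nn as nn

class NavigationEvaluationModel(nn.Module):
    """Estimates a reward evaluation value for the target navigation information."""
    def __init__(self, nav_dim=4):
        super().__init__()
        self.critic = nn.Sequential(
            nn.Linear(nav_dim * 2, 64), nn.ReLU(),
            nn.Linear(64, 1))                      # scalar reward estimate

    def forward(self, policy_output, target_nav_info):
        x = torch.cat([policy_output, target_nav_info], dim=-1)
        return self.critic(x)

# Actor-critic style sketch: the evaluation model's value guides the strategy
# model toward navigation outputs with higher estimated reward.
def policy_loss(evaluation_model, policy_output, target_nav_info):
    return -evaluation_model(policy_output, target_nav_info).mean()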
3. The method according to claim 1 or 2, wherein the navigating with the deep reinforcement learning model according to the real operation information of the target unmanned aerial vehicle when the target unmanned aerial vehicle is operating comprises:
acquiring real visual information and real task information when the target unmanned aerial vehicle runs;
inputting the real visual information and the real task information into the deep reinforcement learning model;
obtaining predicted navigation information output by the deep reinforcement learning model; the predicted navigation information is navigation information matched with the real task information;
and controlling the target unmanned aerial vehicle to operate based on the predicted navigation information.
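A minimal sketch of the run-time use described in claim 3, assuming the strategy model interface from the earlier sketch; the 0.5 matching threshold and the send_command callback are illustrative placeholders, not part of the claimed method.

# Illustrative inference loop for claim 3; flight-control integration is assumed.
import torch

def navigate_step(strategy_model, camera_image, task_info, send_command):
    """One control step: real visual and task information in, navigation command out."""
    strategy_model.eval()
    with torch.no_grad():
        nav_info, match_score = strategy_model(camera_image, task_info)
    # Only act on predictions that sufficiently match the current task.
    if match_score.item() > 0.5:          # threshold is an assumption
        send_command(nav_info.squeeze(0).tolist())
    return nav_info, match_score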
4. The method according to claim 1 or 2, wherein the constructing a simulation environment corresponding to the simulation target unmanned aerial vehicle based on the equipment parameters of the target unmanned aerial vehicle and the environment parameters of different known environments includes:
building a simulation environment set corresponding to the different known environments according to the environment parameters of the different known environments;
and constructing the simulation environment based on the digital twin model and the simulation environment set.
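Purely as an illustration of claim 4, a simulation environment set can be represented as one environment record per known environment parameter set, each paired with the digital twin; the parameter fields named below (wind speed, obstacle density, GPS noise) are assumptions, since the claims leave the environment parameters unspecified.

# Illustrative only: environment parameter fields are assumed for the example.
from dataclasses import dataclass
from typing import List

@dataclass
class EnvironmentParameters:
    name: str
    wind_speed_mps: float
    obstacle_density: float
    gps_noise_std_m: float

@dataclass
class SimulationEnvironment:
    params: EnvironmentParameters
    twin: object                      # the digital twin model of the UAV

def build_simulation_environment_set(param_list: List[EnvironmentParameters],
                                     digital_twin) -> List[SimulationEnvironment]:
    """One simulated environment per known environment parameter set."""
    return [SimulationEnvironment(params=p, twin=digital_twin) for p in param_list]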
5. The method according to claim 4, wherein the constructing a digital twin model corresponding to the target unmanned aerial vehicle as the simulation target unmanned aerial vehicle according to the equipment parameters of the target unmanned aerial vehicle includes:
based on the control and state estimation system test parameters in the equipment parameters, constructing a control and state estimation system simulation model of the simulation target unmanned aerial vehicle;
based on simulation control parameters output by the control and state estimation system simulation model and power system test parameters in the equipment parameters, constructing a power system simulation model of the simulation target unmanned aerial vehicle;
based on the simulation power system parameters output by the power system simulation model and the dynamic model test parameters in the equipment parameters, constructing a dynamic simulation model of the simulation target unmanned aerial vehicle;
based on simulation dynamics parameters output by the dynamics simulation model and rigid body motion model test parameters in the equipment parameters, constructing a rigid body motion simulation model of the simulation target unmanned aerial vehicle;
and constructing the digital twin model according to the control and state estimation system simulation model, the power system simulation model, the dynamics simulation model and the rigid body motion simulation model.
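As a hedged sketch of the chain described in claim 5 (not the patent's actual models), the digital twin can be assembled by feeding each sub-model's output into the next: control and state estimation, power system, dynamics, rigid body motion. All update equations and gains below are placeholders.

# Illustrative only: each sub-model is reduced to a trivial placeholder.
class ControlStateEstimationModel:
    def step(self, reference, measured_state):
        # PD-style correction toward the reference (assumed for illustration)
        return [0.5 * (r - m) for r, m in zip(reference, measured_state)]

class PowerSystemModel:
    def step(self, control):
        # map control commands to motor thrusts (placeholder gain)
        return [2.0 * c for c in control]

class DynamicsModel:
    def step(self, thrusts):
        # net force from motor thrusts (placeholder aggregation)
        return sum(thrusts)

class RigidBodyMotionModel:
    def __init__(self):
        self.velocity, self.position = 0.0, 0.0
    def step(self, net_force, mass=1.5, dt=0.01):
        self.velocity += (net_force / mass) * dt
        self.position += self.velocity * dt
        return self.position, self.velocity

class DigitalTwin:
    """Chains the four simulation models in the order given in claim 5."""
    def __init__(self):
        self.ctrl = ControlStateEstimationModel()
        self.power = PowerSystemModel()
        self.dyn = DynamicsModel()
        self.body = RigidBodyMotionModel()

    def step(self, reference, measured_state):
        control = self.ctrl.step(reference, measured_state)
        thrusts = self.power.step(control)
        force = self.dyn.step(thrusts)
        return self.body.step(force)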
6. The method of claim 5, wherein the method further comprises:
obtaining simulation motion parameters output by the rigid body motion simulation model;
and updating the control and state estimation system simulation model and/or the dynamics simulation model according to the simulation motion parameters.
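Illustrative only: claim 6 feeds the motion parameters produced by the rigid body motion simulation model back to update the control and state estimation and/or dynamics simulation models. The scalar gain correction below is an assumed stand-in for that update, not the patent's actual procedure.

# Assumed feedback update: nudge a hypothetical dynamics gain on the twin so
# that its simulated motion tracks the observed motion parameters.
def update_twin_from_motion(twin, observed_position, simulated_position, lr=0.05):
    error = observed_position - simulated_position
    twin.dynamics_gain = getattr(twin, "dynamics_gain", 1.0) + lr * error
    return twin.dynamics_gain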
7. An unmanned aerial vehicle navigation device, comprising:
the simulation environment construction module is used for constructing a simulation environment corresponding to the simulation target unmanned aerial vehicle based on the equipment parameters of the target unmanned aerial vehicle and the environment parameters of different known environments; the simulation target unmanned aerial vehicle is a digital twin model of the target unmanned aerial vehicle constructed based on the equipment parameters;
the model construction module is used for constructing a deep reinforcement learning model for unmanned aerial vehicle navigation based on simulation operation information of the simulation target unmanned aerial vehicle in the simulation environment;
the navigation module is used for navigating according to the real operation information of the target unmanned aerial vehicle and by utilizing the deep reinforcement learning model when the target unmanned aerial vehicle operates;
the model construction module specifically comprises:
the navigation strategy model construction module is used for constructing a navigation strategy model for planning navigation information based on the simulation operation information by utilizing a deep learning algorithm;
the navigation evaluation model construction module is used for constructing a navigation evaluation model for evaluating the navigation information based on the navigation strategy model by utilizing a reinforcement learning algorithm;
the model optimization module is used for optimizing the navigation strategy model based on the navigation evaluation model until the navigation strategy model is converged, and taking the converged navigation strategy model as the deep reinforcement learning model;
the navigation strategy model construction module specifically comprises:
the first construction module is used for taking the simulation visual information of the simulation target unmanned aerial vehicle in the simulation environment as the input of a navigation prediction model, and taking the simulation navigation information of the simulation target unmanned aerial vehicle in the simulation environment as the output of the navigation prediction model to construct the navigation prediction model;
the second construction module is used for taking the simulation task information of the simulation target unmanned aerial vehicle in the simulation environment and the output of the navigation prediction model together as the input of a navigation matching model, and taking the matching degree between the output of the navigation prediction model and the simulation task information as the output of the navigation matching model to construct the navigation matching model;
the third construction module is used for constructing the navigation strategy model based on the navigation prediction model and the navigation matching model;
the unmanned aerial vehicle navigation device further includes:
the navigation test module is used for constructing a navigation test environment of the target unmanned aerial vehicle, and performing navigation test on the deep reinforcement learning model in the navigation test environment to obtain a test result;
the test random information determining module is used for determining test random information based on the simulation environment and the navigation test environment;
and the first model updating module is used for updating the deep reinforcement learning model according to the test result and the test random information.
CN202210902202.XA 2022-07-29 2022-07-29 Unmanned aerial vehicle navigation method and device Active CN114964268B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210902202.XA CN114964268B (en) 2022-07-29 2022-07-29 Unmanned aerial vehicle navigation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210902202.XA CN114964268B (en) 2022-07-29 2022-07-29 Unmanned aerial vehicle navigation method and device

Publications (2)

Publication Number Publication Date
CN114964268A CN114964268A (en) 2022-08-30
CN114964268B true CN114964268B (en) 2023-05-02

Family

ID=82968688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210902202.XA Active CN114964268B (en) 2022-07-29 2022-07-29 Unmanned aerial vehicle navigation method and device

Country Status (1)

Country Link
CN (1) CN114964268B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116308006B (en) * 2023-05-19 2023-08-01 安徽省赛达科技有限责任公司 A digital rural comprehensive service cloud platform
CN119129298A (en) * 2024-11-14 2024-12-13 中国电子科技集团公司第二十八研究所 A virtual-real combined unmanned autonomous algorithm verification method and system
CN119131643B (en) * 2024-11-15 2025-03-04 深圳天鹰兄弟无人机创新有限公司 UAV visual navigation method and device based on deep learning
CN119148554A (en) * 2024-11-20 2024-12-17 西湖大学 Digital twin-based variable pitch unmanned aerial vehicle simulation control method and unmanned aerial vehicle

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160188675A1 (en) * 2014-12-29 2016-06-30 Ge Aviation Systems Llc Network for digital emulation and repository
CN107450593B (en) * 2017-08-30 2020-06-12 清华大学 Unmanned aerial vehicle autonomous navigation method and system
US10767997B1 (en) * 2019-02-25 2020-09-08 Qualcomm Incorporated Systems and methods for providing immersive extended reality experiences on moving platforms
CN110333739B (en) * 2019-08-21 2020-07-31 哈尔滨工程大学 AUV (autonomous Underwater vehicle) behavior planning and action control method based on reinforcement learning
WO2021086532A1 (en) * 2019-10-29 2021-05-06 Loon Llc Navigating aerial vehicles using deep reinforcement learning
CN111856965A (en) * 2020-06-22 2020-10-30 拓攻(南京)机器人有限公司 Unmanned aerial vehicle visual semi-physical simulation system and simulation method thereof
CN111694365B (en) * 2020-07-01 2021-04-20 武汉理工大学 A Deep Reinforcement Learning Based Path Tracking Method for Unmanned Vessel Formation
CN112179367B (en) * 2020-09-25 2023-07-04 广东海洋大学 A method for autonomous navigation of agents based on deep reinforcement learning
CN112130472A (en) * 2020-10-14 2020-12-25 广州小鹏自动驾驶科技有限公司 A simulation test system and method for autonomous driving
CN112965396A (en) * 2021-02-08 2021-06-15 大连大学 Hardware-in-the-loop visualization simulation method for quad-rotor unmanned aerial vehicle
CN113406957B (en) * 2021-05-19 2022-07-08 成都理工大学 Mobile robot autonomous navigation method based on immune deep reinforcement learning
CN113495578B (en) * 2021-09-07 2021-12-10 南京航空航天大学 A Reinforcement Learning Method for Cluster Track Planning Based on Digital Twin Training
CN114329766A (en) * 2021-09-22 2022-04-12 中国人民解放军空军工程大学 Credibility evaluation method of flight dynamics model for deep reinforcement learning
CN113886953B (en) * 2021-09-27 2022-07-19 中国人民解放军军事科学院国防科技创新研究院 Unmanned aerial vehicle intelligent simulation training method and device based on distributed reinforcement learning
CN113935373B (en) * 2021-10-11 2024-11-19 南京邮电大学 Human action recognition method based on phase information and signal strength
CN114488848B (en) * 2021-12-30 2024-08-09 北京理工大学 Unmanned aerial vehicle autonomous flight system and simulation experiment platform for indoor building space
CN114662656A (en) * 2022-03-04 2022-06-24 深圳大学 Deep neural network model training method, autonomous navigation method and system

Also Published As

Publication number Publication date
CN114964268A (en) 2022-08-30

Similar Documents

Publication Publication Date Title
CN114964268B (en) Unmanned aerial vehicle navigation method and device
Zhang et al. 2D Lidar‐based SLAM and path planning for indoor rescue using mobile robots
CN114859910B (en) Unmanned ship path following system and method based on deep reinforcement learning
WO2021103834A1 (en) Method for generating lane changing decision model, lane changing decision method for driverless vehicle, and device
CN109948642A (en) A Multi-Agent Cross-Modality Deep Deterministic Policy Gradient Training Method Based on Image Input
CN112231489A (en) Knowledge learning and transferring method and system for epidemic prevention robot
CN109782600A (en) A method for establishing autonomous mobile robot navigation system through virtual environment
CN112034887A (en) Optimal path training method for UAV to avoid columnar obstacles and reach the target point
CN107450593A Unmanned aerial vehicle autonomous navigation method and system
Li et al. Oil: Observational imitation learning
CN112200319A (en) A rule reasoning method and system for realizing unmanned vehicle navigation and obstacle avoidance
CN116300909A (en) Robot obstacle avoidance navigation method based on information preprocessing and reinforcement learning
KR20190041831A (en) Controlling mobile robot based on reinforcement learning using game environment abstraction
CN117490000A (en) Gas pipeline leakage detection method and device
KR101974448B1 (en) Controlling mobile robot based on asynchronous target classification
CN116817909A (en) Unmanned aerial vehicle relay type navigation method based on deep reinforcement learning
Lin et al. Airvista: Empowering uavs with 3d spatial reasoning abilities through a multimodal large language model agent
CN119692470A (en) Environmental spatial relationship reasoning method, medium and device based on multi-agent debate
CN118379563B (en) Navigation model training method and device, electronic equipment and storage medium
CN119669952A (en) A Sim2Real model construction method and device based on reinforcement learning
Zhou et al. Deep reinforcement learning with long-time memory capability for robot mapless navigation
CN116972853A (en) Planning method for local obstacle avoidance path of mobile robot in unknown environment
CN119698607A (en) Controlling Agents Using Reporter Neural Networks
Mohajerin Modeling Dynamic Systems for Multi-Step Prediction with Recurrent Neural Networks.
Temsamani et al. A multimodal AI approach for intuitively instructable autonomous systems: A case study of an autonomous off-highway vehicle

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant