CN117911597A - A GPU method for efficient colorization of multi-field dense point clouds - Google Patents
- Publication number
- CN117911597A (application number CN202410082733.8A)
- Authority
- CN
- China
- Prior art keywords
- field
- binary
- fields
- coloring
- vertex
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/005—General purpose rendering architectures
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T1/00—General purpose image data processing
- G06T1/20—Processor architectures; Processor configuration, e.g. pipelining
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Graphics (AREA)
- Image Generation (AREA)
Abstract
The present invention discloses a GPU method for efficiently coloring multi-field dense point clouds, comprising a CPU-side preprocessing module and a GPU-side parallel coloring module. The CPU-side preprocessing module: classifies field types into binary and non-binary fields; computes the value range of every non-binary field other than the color field; groups the fields into tuples; numbers and orders the fields; encodes the predefined color bar sequence into a one-dimensional image sequence; upsamples the one-dimensional image sequence by piecewise linear interpolation; creates the one-dimensional texture sequence corresponding to the one-dimensional image sequence; and passes the above results into the graphics rendering pipeline. The GPU-side parallel coloring module: transforms vertex coordinates from the global coordinate system to the clip coordinate system; colors the vertices according to the type of the active field; applies post-processing to the colored vertices; and colors the rasterized fragments. The present invention can efficiently render the coloring of dense point clouds under different fields and different color bars.
Description
Technical Field
The present invention relates to the visualization and expression of digital twins based on dense point clouds in the fields of computer vision and computer graphics, and in particular to a method for efficiently coloring multi-field dense point clouds using the parallel processing architecture of a GPU (Graphics Processing Unit).
Background Art
In recent years, LiDAR (Light Detection and Ranging) technology has developed rapidly and has been widely applied in emerging fields such as digital twins, smart cities, and autonomous driving. Compared with structured light and multi-view dense matching, point clouds acquired with LiDAR offer higher accuracy, higher resolution, and stronger robustness to environmental factors such as illumination and surface material. With the rapid upgrade and iteration of hardware and software such as LiDAR emitters, receivers, inertial navigation, positioning, data storage devices, and real-time communication modules, the point clouds collected by the various LiDAR platforms (handheld, vehicle-mounted, airborne, etc.) cover wider areas, at higher density, with more field attributes.
Rendering the visualization of a point cloud under different fields in real time is an important supporting technology for deep analysis applications such as ground-object inversion, structural analysis, and change monitoring. However, the mainstream existing technique uses the CPU (Central Processing Unit) to update, point by point, the rendering of large-scene dense point clouds under different fields; its real-time performance is low, and it can hardly meet the timeliness requirements of deep application analysis.
Therefore, in response to the above technical bottleneck, the present invention proposes a GPU method for efficiently coloring multi-field dense point clouds. Its most significant advantage is that each point of the dense point cloud is treated as a vertex in three-dimensional visualization, and the parallel processing architecture of the GPU graphics rendering pipeline is used to quickly render the multi-field dense point cloud under different fields and different color bars. The key techniques of the present invention provide an efficient and fast visualization scheme for deep analyses based on dense point clouds, such as ground-object classification, semantic extraction, and target detection.
Summary of the Invention
To overcome the shortcomings of the prior art, the present invention proposes a GPU method for efficiently coloring multi-field dense point clouds, which uses the parallel processing capability of the GPU to quickly render the visualization of multi-field dense point clouds under different fields and different color bars, and provides visualization support for subsequent application analysis. The invention comprises two modules: a CPU-side preprocessing module and a GPU-side parallel coloring module.
The CPU-side preprocessing module comprises:
Step 1: classify the field types into binary fields and non-binary fields;
Step 2: compute the value range of each non-binary field other than the color field;
Step 3: group the fields into tuples;
Step 4: number and order the fields;
Step 5: encode the predefined color bar sequence into a one-dimensional image sequence;
Step 6: upsample the one-dimensional image sequence by piecewise linear interpolation;
Step 7: create the one-dimensional texture sequence corresponding to the one-dimensional image sequence;
Step 8: pass the above results to the vertex shader of the graphics rendering pipeline.
The GPU-side parallel coloring module comprises:
Step 9: transform the vertex coordinates from the global coordinate system to the clip coordinate system;
Step 10: color the vertices according to the type of the active field;
Step 11: apply post-processing to the colored vertices;
Step 12: color the rasterized fragments.
Step 3 groups the fields into tuples in order to reduce the number of vertex attribute binding points used in the graphics rendering pipeline, and to improve both the efficiency of transferring field values from the CPU to the GPU and the extensibility of the application. The grouping method comprises:
(1) group the coordinate fields (x, y, z) and the color fields (r, g, b) into two triplets;
(2) iteratively group the remaining non-binary fields into 4-tuples; when the number m of remaining non-binary fields is less than 4, group those m fields into an m-tuple;
(3) iteratively group the binary fields into triplets; when the number n of remaining binary fields is less than 3, group those n fields into an n-tuple;
(4) transfer the resulting set of tuple sequences using vertex array objects (VAO) and vertex buffer objects (VBO).
Step 6 upsamples the one-dimensional image sequence by piecewise linear interpolation, in order to improve the byte alignment and access efficiency of the texture data in video memory and to enhance the compatibility of the method across hardware platforms. Step 6 comprises:
(1) quickly compute the target pixel width after upsampling, using a bitwise AND operation and the logarithm change-of-base formula;
(2) determine the number of pixels to be interpolated between each pair of adjacent pixels;
(3) perform piecewise linear interpolation weighted by pixel-coordinate distance;
(4) create the active texture of the one-dimensional image sequence and bind it to the vertex shader as a uniform one-dimensional sampler.
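As an illustration, the width computation and interpolation of sub-steps (1)-(3) can be sketched in Python (a minimal sketch, not the patent's implementation; the names `target_width` and `upsample_colorbar` are invented for this example, and a target width greater than 1 is assumed):

```python
import math

def target_width(m: int) -> int:
    """Next power-of-two width, via the bitwise AND test and the
    logarithm change-of-base formula described in sub-step (1)."""
    # m & (m - 1) == 0 exactly when m is already a power of two.
    if m & (m - 1) == 0:
        return m
    # Change of base: log2(m) = log(m) / log(2).
    return 1 << math.ceil(math.log(m) / math.log(2))

def upsample_colorbar(pixels, width):
    """Piecewise linear interpolation of an RGB color bar to `width`
    pixels, weighting the two neighbors by pixel-coordinate distance."""
    m = len(pixels)
    out = []
    for j in range(width):
        # Map the target index j back to a fractional source coordinate.
        t = j * (m - 1) / (width - 1)
        i0 = int(t)
        i1 = min(i0 + 1, m - 1)
        w = t - i0  # distance weight between the two neighbors
        out.append(tuple(round((1 - w) * a + w * b)
                         for a, b in zip(pixels[i0], pixels[i1])))
    return out
```

For example, a 5-pixel color bar would be widened to 8 pixels, while an 8-pixel bar is left unchanged.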
Step 10 colors the vertices according to the type of the active field, i.e., different coloring mechanisms are used for binary and non-binary fields:
(1) if the field corresponding to the active field number is the color field, the vertex color is taken directly from the vertex color attribute array as the coloring result;
(2) if the field corresponding to the active field number is a non-binary field other than the color field, a linear texture coordinate is computed from the value range of that field and the vertex's value in that field, and a smoothly blended coloring result is fetched from the active texture at that coordinate;
(3) if the field corresponding to the active field number is a binary field, the pixels at the first and last positions of the active texture are used as the coloring results for field values 0 and 1, respectively.
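The three coloring branches above can be mimicked on the CPU with a small Python sketch (illustrative only; `shade_vertex` and its arguments are hypothetical names, and the texture lookup is simplified to a nearest-texel fetch rather than the shader's filtered sampling):

```python
def shade_vertex(field_type, value, vertex_color, tex, fmin=None, fmax=None):
    """Per-vertex branching analogous to the vertex-shader logic:
    `tex` is a 1-D list of RGB texels standing in for the active texture."""
    if field_type == "color":
        return vertex_color                       # (1) use stored RGB directly
    if field_type == "binary":
        return tex[0] if value == 0 else tex[-1]  # (3) first/last texel for 0/1
    # (2) non-binary: linear texture coordinate from the field's value range.
    t = (value - fmin) / (fmax - fmin)
    return tex[round(t * (len(tex) - 1))]         # nearest-texel lookup
```

With a 3-texel texture, a field value halfway through its range lands on the middle texel, which is the smooth-blending behavior the linear texture coordinate is meant to produce.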
In step 5, let the number of predefined color bars be n; each color bar contains m pixels (the pixel count may differ from bar to bar), and each pixel has three color channels (r, g, b). The one-dimensional image sequence is then generated as follows:
(1) take the next color bar Ci from the set of predefined color bars;
(2) allocate a memory block Mi of 3m bytes for Ci;
(3) write each pixel of Ci into Mi in order;
(4) construct a one-dimensional image Ii from the pixel count m of Ci and the corresponding memory Mi;
(5) record Ii in the one-dimensional image sequence L = {Ii}.
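Sub-steps (1)-(5) amount to packing each color bar into a contiguous 3m-byte buffer; a minimal Python sketch (the names `encode_colorbar` and `encode_sequence` are invented for illustration):

```python
def encode_colorbar(colorbar):
    """Encode one color bar (a list of (r, g, b) tuples) into a
    contiguous 3m-byte buffer Mi, yielding the 1-D image Ii."""
    m = len(colorbar)
    buf = bytearray(3 * m)                  # (2) memory Mi of 3m bytes
    for i, (r, g, b) in enumerate(colorbar):
        buf[3 * i:3 * i + 3] = bytes((r, g, b))   # (3) write pixel i
    return {"width": m, "data": bytes(buf)}       # (4) image Ii

def encode_sequence(bars):
    """(5) Build the image sequence L = {Ii} from the set of bars."""
    return [encode_colorbar(c) for c in bars]
```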
In step 8, according to the data characteristics of the preprocessing results, different transfer mechanisms are used to pass the value ranges of the non-binary fields, the number of the active field currently selected by the user, the set of tuple sequences, and the currently selected active texture to the vertex shader. Step 8 comprises:
(1) pass the value ranges of the non-binary fields and the number of the currently selected active field via uniform variables;
(2) transfer the set of tuple sequences via VAO and VBO;
(3) bind the currently selected active texture to the vertex shader as a uniform sampler1D one-dimensional sampler.
The field the user switches to is either a binary field or a non-binary field, and the two types are colored by different mechanisms:
(1) if the field corresponding to the active field number is the color field, the vertex color is taken directly from the vertex color attribute array as the coloring result;
(2) if the field corresponding to the active field number is a binary field, the pixels at the first and last positions of the active texture are used as the coloring results for field values 0 and 1, respectively;
(3) if the field corresponding to the active field number is a non-binary field other than the color field, a linear texture coordinate is computed from the value range of that field and the vertex's value in that field, and a smoothly blended coloring result is fetched from the active texture at that coordinate.
In step 11, vertices shaded by the vertex shader first pass through a post-processing stage before being rasterized. The vertex post-processing stage comprises five sub-stages: transform feedback, primitive assembly, view frustum clipping, perspective division, and viewport transformation.
The primitives processed in steps 1 to 11 are all vector points, which can only appear on screen after rasterization. Rasterization converts each vector point into fragments according to its window coordinates; a fragment is a pixel-sized unit. The fragments are then colored in the fragment shader.
Anti-aliasing is enabled in the rasterization stage. With anti-aliasing enabled, the graphics rendering pipeline constructs a circular region of radius ps centered on the vector point to be rasterized; the fragments whose center coordinates lie inside this circular region make up the rasterized footprint of the point. During rasterization, the size of the pixel region occupied by a rasterized point is called the point size ps.
The discrete fragments produced by rasterization are sent to the fragment shader for final coloring; all fragments arising from the same vector point carry the same attribute values. The fragment shader colors the discretized fragments in parallel using the outputs of the vertex shader. Finally, after the scissor, stencil, and depth tests of the per-fragment operations stage, the visible fragments are written to the frame buffer and drawn to the screen, completing the coloring of one frame.
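The circular coverage rule of the anti-aliased point rasterization described above can be illustrated with a small Python sketch (an approximation for illustration only; real pipelines implement this in fixed-function hardware, `covered_fragments` is an invented name, and non-negative window coordinates are assumed):

```python
def covered_fragments(cx, cy, ps):
    """Fragments (pixel cells) whose centers lie inside the radius-`ps`
    circle around the rasterized point (cx, cy)."""
    frags = []
    r2 = ps * ps
    for y in range(int(cy - ps), int(cy + ps) + 1):
        for x in range(int(cx - ps), int(cx + ps) + 1):
            fx, fy = x + 0.5, y + 0.5          # fragment center coordinates
            if (fx - cx) ** 2 + (fy - cy) ** 2 <= r2:
                frags.append((x, y))
    return frags
```

A point at (2, 2) with ps = 1 covers the four fragments whose centers fall inside the unit circle around it; shrinking ps shrinks the footprint to a single fragment.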
The advantages of the present invention are as follows: the parallel processing capability of the GPU is used to quickly render multi-field dense point clouds under different fields and different color bars, providing visualization support for subsequent application analysis; an efficient and fast visualization scheme is provided for deep analyses based on dense point clouds, such as ground-object classification, semantic extraction, and target detection; and each point of the dense point cloud is treated as a vertex in three-dimensional visualization, so the parallel architecture of the GPU graphics rendering pipeline can quickly render the cloud under different fields and color bars.
Brief Description of the Drawings
The contents of the drawings of the present specification and the reference marks therein are briefly described as follows:
FIG. 1 shows the overall workflow of efficient multi-field dense point cloud coloring according to an embodiment of the present invention;
FIG. 2 lists the field information of a standard point cloud file according to an embodiment;
FIG. 3 is a schematic diagram of piecewise linear interpolation upsampling according to an embodiment;
FIG. 4 is a schematic diagram of the organizational relationship between VAO and VBO according to an embodiment;
FIG. 5 shows the flow of the graphics rendering pipeline according to an embodiment;
FIG. 6 is a schematic diagram of the texture coordinate distribution according to an embodiment;
FIG. 7 shows the flow of the vertex post-processing stage according to an embodiment;
FIG. 8 shows the frustum of the perspective projection according to an embodiment;
FIG. 9 shows adaptive adjustment of the near and far clipping planes according to the viewpoint pose and the minimal bounding box of the dense point cloud, according to an embodiment;
FIG. 10 shows adaptive adjustment of the top, bottom, left, and right clipping planes according to the viewport aspect ratio, according to an embodiment;
FIG. 11 shows the normalized device coordinate system according to an embodiment;
FIG. 12 shows the rasterization scheme when anti-aliasing is enabled, according to an embodiment.
Detailed Description of the Embodiments
The specific implementation of the present invention is described in further detail below through a description of preferred embodiments with reference to the accompanying drawings.
This embodiment provides a GPU method for efficiently coloring multi-field dense point clouds. Visualizing such point clouds is a key technique for presenting deep analysis results such as ground-object classification, semantic extraction, and target detection. Mainstream existing techniques color every point of the cloud one by one on the CPU (Central Processing Unit) according to a given field; they are therefore inefficient and can hardly meet the real-time requirements of deep application analysis. The present invention proposes a GPU (Graphics Processing Unit) method for efficiently coloring multi-field dense point clouds, comprising two processing modules: a CPU-side preprocessing module and a GPU-side parallel coloring module.
The CPU-side preprocessing module comprises the following steps: classify the field types into binary and non-binary fields according to the data type and attributes of each field; compute the value range of each non-binary field other than the color field; group the fields into tuples; number and order the fields; encode the predefined color bar sequence into a one-dimensional image sequence; upsample the one-dimensional image sequence by piecewise linear interpolation; create the corresponding one-dimensional texture sequence; and pass the above results into the graphics rendering pipeline. The GPU-side parallel coloring module comprises the following steps: transform vertex coordinates from the global coordinate system to the clip coordinate system; color the vertices according to the type of the active field; post-process the colored vertices; and color the rasterized fragments. The technique not only covers all fields of the standard point cloud file but can also extend the set of fields according to the application scenario, and it uses the parallel processing capability of the GPU graphics rendering pipeline to efficiently render the coloring of dense point clouds under different fields and different color bars; it can serve point-cloud-based industries such as road testing, autonomous driving, and industrial parts inspection.
The GPU (Graphics Processing Unit) method for efficiently coloring multi-field dense point clouds of this embodiment mainly comprises a CPU-side preprocessing module and a GPU-side parallel coloring module; each module mainly comprises the following processing steps:
1. CPU-side preprocessing module
Step 1: classify the field types into binary fields and non-binary fields;
Step 2: compute the value range of each non-binary field other than the color field;
Step 3: group the fields into tuples;
Step 4: number and order the fields (the number the user switches to during interaction designates the active field);
Step 5: encode the predefined color bar sequence into a one-dimensional image sequence;
Step 6: upsample the one-dimensional image sequence by piecewise linear interpolation;
Step 7: create the one-dimensional texture sequence corresponding to the one-dimensional image sequence (the color bar the user selects during interaction corresponds to one active texture);
Step 8: pass the above results to the vertex shader of the graphics rendering pipeline.
2. GPU-side parallel coloring module
Step 9: transform the vertex coordinates from the global coordinate system to the clip coordinate system;
Step 10: color the vertices according to the type of the active field;
Step 11: apply post-processing to the colored vertices;
Step 12: color the rasterized fragments.
The technical details of the CPU-side preprocessing module and the GPU-side parallel coloring module are described below:
1. CPU-side preprocessing
Step 1: classify the field types into binary fields and non-binary fields
A three-dimensional point cloud of the target scene can be acquired by multi-view dense matching, a structured light camera, or a LiDAR device. The cloud consists of dense points, each carrying a number of fields that record its attributes. According to its data type and characteristics, a field can be classified as binary or non-binary. FIG. 2 lists all the fields of a standard point cloud file; each field has its own data type and size. Among them, GPS time is the longest non-binary field, at 8 bytes, while the scan-direction flag and the edge-of-flight-line flag are the shortest binary fields, at 1 bit each. In the present invention a field is binary if its value is either 0 or 1 (a binary field is only 1 bit long; taking the scan-direction flag as an example, 1 means scanning from left to right and 0 means scanning from right to left); otherwise the field is non-binary.
The classification information is a composite field 1 byte (8 bits) long: its first five bits (bits 0-4) encode 32 classification results and form a non-binary field; bits 5, 6, and 7 are binary fields representing the synthetic flag, the key-point flag, and the withheld (reserved) flag, respectively. Extended fields are user-defined and may be either binary or non-binary. For a more detailed description of each field, see the document "LAS Specification Version 1.2".
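For illustration, the composite classification byte described above can be decoded as follows (a sketch following the LAS 1.2 bit layout; the function name and dictionary keys are invented for this example):

```python
def decode_classification(byte_val: int) -> dict:
    """Split the 1-byte composite classification field into its parts:
    bits 0-4 hold the class code (32 values), bits 5-7 are binary flags."""
    return {
        "class":     byte_val & 0b0001_1111,   # bits 0-4: non-binary field
        "synthetic": (byte_val >> 5) & 1,      # bit 5: binary field
        "key_point": (byte_val >> 6) & 1,      # bit 6: binary field
        "withheld":  (byte_val >> 7) & 1,      # bit 7: binary field
    }
```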
According to the field information of FIG. 2, when reading the header of a las/laz file (las files are loaded with the libLAS open-source library; laz files with the PDAL open-source library), the present invention marks each field read as binary or non-binary in turn.
Step 2: compute the value range of each non-binary field other than the color field
The present invention applies range-based linear smooth coloring to non-binary fields to achieve a naturally seamless coloring effect, so the value range of each non-binary field other than the color field must be computed (the value of the color field can itself serve as the coloring result, so no statistics are needed for it). When a point cloud file is first loaded, the range of each field is accumulated point by point. Let the number of non-binary fields Fi in the dense point cloud be N; then N pairs (Fi_min, Fi_max) represent the value ranges of the N non-binary fields, where Fi_min is the minimum of Fi and Fi_max is the maximum of Fi. For convenience, the ranges are initialized as Fi_min = FLT_MAX and Fi_max = -FLT_MAX (FLT_MAX is the largest positive value a single-precision float can take). The value range of each non-binary field is then computed as follows:
(1) check whether a record file storing the non-binary field ranges already exists on disk: if it does, load it directly without recomputation; otherwise, go to (2);
(2) read each point P of the point cloud file in turn;
(3) traverse each field Fi of P;
(4) check the type of Fi: if Fi is a binary field, skip it; otherwise, go to (5);
(5) check whether Fi is the color field: if so, skip it; otherwise, go to (6);
(6) update (Fi_min, Fi_max) with the value Gi of Fi, i.e.,
Fi_min = min(Fi_min, Gi), Fi_max = max(Fi_max, Gi),
where min and max take the smaller and the larger of their two arguments, respectively;
(7) save the value range of every non-binary field to disk as the record file.
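The scan of sub-steps (2)-(6) can be sketched in Python (illustrative only; points are modeled as dictionaries of field values, and `scan_ranges` is an invented name):

```python
FLT_MAX = 3.402823e38  # largest positive single-precision float

def scan_ranges(points, binary_fields, color_fields=("r", "g", "b")):
    """One pass over the point records, updating (Fi_min, Fi_max) for
    every non-binary, non-color field."""
    ranges = {}
    for p in points:                        # (2) read each point P
        for name, g in p.items():           # (3) traverse each field Fi
            if name in binary_fields:       # (4) binary field: skip
                continue
            if name in color_fields:        # (5) color field: skip
                continue
            lo, hi = ranges.get(name, (FLT_MAX, -FLT_MAX))
            ranges[name] = (min(lo, g), max(hi, g))   # (6) update the pair
    return ranges
```

In a full implementation the resulting dictionary would then be serialized to disk as the record file of sub-step (7).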
Step 3: group the fields into tuples
The purpose of grouping fields into tuples is to reduce the number of vertex attribute binding points used in the graphics rendering pipeline and to improve both the efficiency of transferring field values from the CPU to the GPU and the extensibility of the application. In the present invention, all field values must be passed from the CPU to the GPU through the vertex attribute binding points of the vertex shader, yet in the graphics rendering pipeline of a common rendering API such as OpenGL (Open Graphics Library) the number of vertex attribute binding points is limited to 16. If every field were bound to its own attribute binding point, transfer efficiency would drop and GPU load would rise; moreover, shader extensibility would suffer (in multi-field dense point cloud coloring based on the graphics rendering pipeline, besides the basic fields, extra fields such as deletion flags and selection flags are introduced as required by the application scenario, and these extra fields also need to be bound to attribute binding points). In view of these factors, the present invention groups fields by type: several fields are combined into one tuple, and the tuple as a whole is bound to a vertex attribute binding point and passed to the vertex shader. The specific grouping scheme is as follows:
(1) The coordinate field (x, y, z) and the color field (r, g, b) are the two inherent non-binary fields; they are grouped as two fixed triples.
(2) The remaining non-binary fields (such as return intensity, return number, number of returns, and scan angle) are grouped iteratively into 4-tuples; when the number m of remaining non-binary fields is less than 4, those m fields form one m-tuple.
(3) The binary fields (such as the synthetic-point flag ss, the key-point flag kk, and the retained flag rr) are grouped iteratively into 3-tuples such as (ss, kk, rr); when the number n of remaining binary fields is less than 3, those n fields form one n-tuple.
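The grouping scheme above can be sketched as follows (an illustrative Python sketch; the field names passed in are hypothetical examples, and the actual invention performs this grouping on the CPU side before upload):

```python
def group_fields(non_binary, binary):
    """Sketch of the tuple-grouping scheme of step 3.

    `non_binary` and `binary` are lists of field names.  The coordinate
    triple (x, y, z) and the color triple (r, g, b) are handled
    separately as two fixed 3-tuples, per rule (1).
    """
    groups = [("x", "y", "z"), ("r", "g", "b")]  # the two inherent triples
    # Rule (2): remaining non-binary fields -> 4-tuples; a short tail
    # forms an m-tuple with m < 4.
    for i in range(0, len(non_binary), 4):
        groups.append(tuple(non_binary[i:i + 4]))
    # Rule (3): binary flag fields -> 3-tuples; a short tail forms an
    # n-tuple with n < 3.
    for i in range(0, len(binary), 3):
        groups.append(tuple(binary[i:i + 3]))
    return groups
```

With five extra non-binary fields and four binary flags, this yields the two fixed triples, one 4-tuple, one 1-tuple, one 3-tuple, and one 1-tuple: six binding points instead of fifteen.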
After grouping, every field belongs to exactly one tuple. Given x points, each tuple corresponds to x records; these x records form a tuple sequence, and the tuple sequences together form a tuple sequence set. In subsequent processing, each tuple sequence is bound in turn to one vertex attribute binding point of the vertex shader and passed into the GPU graphics rendering pipeline. With this grouping, the number of vertex attribute binding points consumed by the vertex shader is reduced by 60%, which not only raises transfer efficiency but also improves scalability: the binding points saved can carry the fields required by other application scenarios.
Step 4: Assign ordered numbers to the fields
When rendering a multi-field dense point cloud, the visualization for a user-specified field must be presented quickly: selecting a field returns that field's number, which is sent to the shaders in the graphics rendering pipeline. A one-to-one correspondence between field name and field number is therefore required. The present invention numbers each field, as shown in Figure 2, following the field order in the header of the standard point cloud file. Each number is stored as an unsigned integer. When the user switches to a field, that field becomes the active field; its number is sent to the vertex shader, which then performs parallel shading on the corresponding field according to that number.
Step 5: Encode the predefined color bars into a sequence of one-dimensional images
In GPU-pipeline-based multi-field dense point cloud shading, both the active field and the color bar selected by the user must be considered; the shading result is presented according to the selected color bar. As shown in Figure 3, a color bar is a sequence of pixels of different colors that can be customized to the user's preference; selecting a different color bar presents the point cloud shaded under that bar. Before the color bar information can be accessed by the graphics rendering pipeline, it must be processed in two stages: it is first encoded into a one-dimensional image, and a one-dimensional texture is then created from that image. Let the number of predefined color bars be d, let each color bar contain c pixels (the pixel count generally differs from bar to bar), and let each pixel carry three color channels (r, g, b). The one-dimensional image sequence is generated as follows:
(1) Take the next color bar Ci from the predefined color bar set.
(2) Allocate a block of memory Mi of 3*c bytes for Ci.
(3) Write each pixel of Ci into Mi in order.
(4) From the pixel count c of Ci and the memory Mi, construct a one-dimensional image Ii whose pixel width equals c, whose pixel height equals 1, and whose pixel data is Mi.
(5) Record Ii in the one-dimensional image sequence L = {Ii}.
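Steps (1) through (5) amount to flattening each color bar into a tightly packed byte buffer. A minimal sketch (the (width, height, bytes) tuple used to represent a one-dimensional image here is an illustrative layout, not the invention's actual in-memory format):

```python
def encode_color_bars(color_bars):
    """Encode each color bar, given as a list of (r, g, b) byte triples,
    into a one-dimensional image represented as (width, height=1, data).
    Allocates 3*c bytes per bar and writes the pixels in order."""
    images = []
    for bar in color_bars:
        c = len(bar)                # pixel width of this color bar
        mem = bytearray(3 * c)      # 3 bytes (r, g, b) per pixel
        for i, (r, g, b) in enumerate(bar):
            mem[3 * i:3 * i + 3] = bytes((r, g, b))
        images.append((c, 1, bytes(mem)))
    return images
```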
Step 6: Upsample the one-dimensional image sequence by piecewise linear interpolation to obtain one-dimensional images of pixel width ni
A one-dimensional image sequence stored in main memory cannot be accessed by the graphics rendering pipeline directly; it must be converted into textures before shaders can read it. A texture is a special data structure of the rendering engine used to exchange color information between main memory and video memory. When the texture size is a power of two, alignment and access efficiency in video memory improve, and so does cross-platform compatibility. Therefore, before the color information is transferred to the graphics rendering pipeline, the present invention upsamples the one-dimensional image sequence generated in step 5 by piecewise linear interpolation, so that the pixel width of each one-dimensional image is adaptively adjusted to the positive integer ni closest to wi (where wi is the pixel width of image Ii, ni is a power of two, and ni ≥ wi). Taking a one-dimensional image Ii from the sequence L = {Ii}, the upsampling proceeds as follows:
(1) Quickly determine whether wi is a power of two with a bitwise AND:

bool isPowerOfTwo = (wi & (wi - 1)) == 0;

If isPowerOfTwo is true, wi is already a power of two and Ii needs no interpolation upsampling; otherwise, go to (2).
(2) Compute the positive integer ni closest to wi (ni a power of two, ni > wi):

a = log(wi) / log(2)
b = floor(a + 0.5)
c = 2^b
ni = 2c if c < wi, otherwise ni = c

First the base-2 logarithm a of wi is obtained through the change-of-base formula; adding an offset of 0.5 to a and rounding down gives the integer b nearest to a; then c = 2^b is computed; finally c is compared with wi: if c < wi, then ni = 2c, otherwise ni = c. The floor function denotes rounding down.
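Combining the power-of-two test of (1) with the rounding rule of (2), the target width can be sketched as (function name illustrative):

```python
import math

def next_pow2_width(w):
    """Target pixel width n_i of step 6: if w is already a power of two
    it is kept; otherwise log2(w) is rounded to the nearest integer b,
    c = 2**b is formed, and c is doubled once more if it still falls
    short of w, so that the result is always a power of two >= w."""
    if w & (w - 1) == 0:                      # bitwise power-of-two test
        return w
    a = math.log(w, 2)                        # change-of-base logarithm
    b = math.floor(a + 0.5)                   # nearest integer to a
    c = 2 ** b
    return 2 * c if c < w else c
```

For example, a 9-pixel color bar rounds to b = 3, c = 8 < 9, so ni = 16.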
(3) Perform piecewise linear interpolation upsampling to the target pixel width
To ensure that the colors of the shaded point cloud transition smoothly and seamlessly, the present invention adopts the piecewise linear interpolation upsampling strategy shown in Figure 3. The general idea is to linearly interpolate a specified number of pixels between every two adjacent pixels of Ii. The details are:
① Determine how many pixels to interpolate between two adjacent pixels of Ii

Since the pixel width of Ii is wi before interpolation and ni after, ni - wi pixels must be inserted. The insertions are distributed evenly over the wi - 1 gaps between adjacent pixels so as to minimize the difference between the color bar before and after interpolation; the average number of pixels inserted per gap is therefore

t = floor((ni - wi) / (wi - 1))

If ni - wi is not exactly divisible by wi - 1, the remainder is added to the final gap, so the number of pixels inserted between the last two adjacent pixels is

t + mod(ni - wi, wi - 1)

where mod(a, b) denotes the remainder of a divided by b.
② Perform piecewise linear interpolation weighted by pixel-coordinate distance

Once the number t of pixels to interpolate between two adjacent pixels is fixed, piecewise linear interpolation is performed gap by gap according to

pj = p * p0 + q * p1, with q = j / (t + 1) and p = 1 - q

Here j (0 ≤ j ≤ t + 1) is the position of the current interpolated pixel, t + 2 is the total number of pixels in the segment after interpolation, p and q are the linear weights, p0 and p1 are the values of the two adjacent pixels of Ii, and pj is the interpolation result. Thus pj = p0 when j = 0 and pj = p1 when j = t + 1.
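Steps ① and ② together can be sketched for a whole color bar as follows (an illustrative sketch assuming n > w ≥ 2; pixels are (r, g, b) tuples):

```python
def upsample_bar(pixels, n):
    """Piecewise linear upsampling of a 1-D pixel list from width
    w = len(pixels) to width n.  Each of the w - 1 gaps receives
    floor((n - w)/(w - 1)) interpolated pixels; the division remainder
    goes into the last gap."""
    w = len(pixels)
    extra = n - w
    t_avg = extra // (w - 1)          # average insertions per gap
    rem = extra % (w - 1)             # remainder goes into the last gap
    out = []
    for g in range(w - 1):
        p0, p1 = pixels[g], pixels[g + 1]
        t = t_avg + (rem if g == w - 2 else 0)
        # j = 0 .. t emits p0 and the t interpolated pixels; p1 itself
        # is emitted by the next gap (or appended after the loop)
        for j in range(t + 1):
            q = j / (t + 1)
            out.append(tuple(round((1 - q) * a + q * b)
                             for a, b in zip(p0, p1)))
    out.append(pixels[-1])
    return out
```

Each gap contributes t + 1 pixels and the final pixel is appended once, so the output width is exactly n.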
Step 7: Create the one-dimensional texture sequence corresponding to the one-dimensional image sequence (i.e. the sequence produced by the piecewise linear interpolation upsampling)
The elements of the one-dimensional image sequence must be converted into one-dimensional textures before the graphics rendering pipeline can access them. In the classic general-purpose 3-D rendering engines OpenGL/OpenGL ES (OpenGL ES being the embedded/mobile edition of OpenGL), the basic procedure for creating a texture from a given image is:
(1) Call glGenTextures to request a texture handle from the rendering engine.
(2) Call glBindTexture to make that texture handle the currently active resource; all subsequent operations apply to the active texture.
(3) Call glTexParameteri to set the minification/magnification filters (GL_TEXTURE_MIN_FILTER / GL_TEXTURE_MAG_FILTER); to obtain a naturally smooth shading effect on dense point clouds, the present invention passes the linear filter GL_LINEAR to glTexParameteri.
(4) Call glTexImage1D to load the one-dimensional image data into the texture.
Step 8: Pass the preprocessing results (the results produced by the preceding preprocessing steps) to the vertex shader of the graphics rendering pipeline
To reduce the number of CPU-to-GPU transfers and improve the access and drawing efficiency of the graphics rendering pipeline, the present invention uses a different transfer mechanism for each kind of preprocessing result according to its data characteristics: the value ranges of the non-binary fields, the number of the currently selected active field, the tuple sequence set, and the currently selected active texture are passed to the vertex shader as follows:
(1) Pass the value ranges of the non-binary fields and the number of the currently active field as uniform variables

A uniform variable is a special shader variable that cannot be modified during pipeline execution; it can only be set and updated from the CPU side. Uniforms can carry many data types: basic types such as bool, int, uint, float, and double; vector tuple types such as vec2, vec3, and vec4; and matrix types such as mat2, mat3, and mat4. In the present invention, the value range of a non-binary field is passed as a vec2, i.e. its minimum and maximum are both represented as floats; the number of the currently active field is passed as a uint, i.e. an unsigned integer.
(2) Transfer the tuple sequence set through a VAO and VBOs

To reduce state switches during 3-D rendering and raise CPU-GPU transfer efficiency, the present invention combines a vertex array object (VAO) with vertex buffer objects (VBOs) to further accelerate the shaded rendering of multi-field dense point clouds. Their relationship is shown in Figure 4: the VAO holds 16 vertex attribute binding points, each binding point refers to a VBO, and each VBO can hold a set of vertex attribute arrays. The tuple sequences produced by the grouping of step 3 can therefore be converted into vertex attribute arrays inside VBOs. Because the attribute arrays reachable through the VAO are configured in advance, no per-attribute setup calls are needed in each frame's rendering; activating the VAO is enough to render the dense point cloud efficiently.
In practice a VBO can be used in two ways: "one-to-one" and "many-to-one". In the one-to-one scheme each vertex attribute array is stored in its own VBO; in the many-to-one scheme several vertex attribute arrays share a single VBO. Given the rendering characteristics of the VAO, both schemes improve rendering efficiency, but many-to-one creates fewer VBOs and consumes fewer hardware resources, so the present invention adopts the many-to-one scheme. In summary, the key steps for improving rendering performance with a VAO and VBOs are:
① Call glGenVertexArrays to create the VAO handle;
② Call glBindVertexArray to activate the VAO;
③ Call glGenBuffers to create the VBO handle;
④ Call glBindBuffer to activate the VBO;
⑤ Map the tuple sequence set into the VBO's vertex attribute arrays.
Suppose the tuple sequence set contains kx tuple sequences, each storing the records of nx points over mx fields. The set is converted into vertex attribute arrays on the VBO as follows:

a. Compute the total number of bytes required by the tuple sequence set

To preserve byte alignment and speed up reads, every field is converted to the float type during this conversion (a float is 4 bytes long, and 4 bytes is the default alignment length on common hardware platforms). The total number of bytes required by the tuple sequence set is therefore the sum, over the kx tuple sequences, of 4 * nx * mx.
b. Call glBufferData to allocate the VBO's memory according to the total byte count
c. Starting from the first tuple sequence, map each tuple sequence D of the set L to a vertex attribute array on the VBO. For each D, glBufferSubData(target, offset, size, data) is called, where target is GL_ARRAY_BUFFER, offset is the byte offset of D within the VBO, size is the byte size of D, and data is the starting memory address of D; glBufferSubData thus copies the data of D into the memory pre-allocated in the VBO. Then glVertexAttribPointer is called, where location is the attribute index to which D is mapped, GL_FLOAT declares the component data type, GL_FALSE indicates a non-normalized vertex attribute, and stride is the spacing between adjacent elements of the attribute array, 0 by default for tightly packed data; glVertexAttribPointer thus specifies the mapping position and data layout of D in the VBO.
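The byte-offset bookkeeping behind the glBufferData/glBufferSubData calls of steps a through c can be sketched as follows (function and parameter names are illustrative; the actual uploads are the GL calls named above):

```python
def vbo_layout(tuple_widths, n_points):
    """Byte layout for packing k tuple sequences into ONE VBO
    ("many-to-one" scheme).  `tuple_widths` lists the field count m of
    each tuple sequence; every component is a 4-byte float.  Returns
    (total_bytes, [(offset, size), ...]) as they would be passed to
    glBufferData and the per-sequence glBufferSubData calls."""
    layout, offset = [], 0
    for m in tuple_widths:
        size = 4 * n_points * m       # sizeof(float) * points * fields
        layout.append((offset, size))
        offset += size
    return offset, layout
```

For 100 points stored as two triples and one 4-tuple, glBufferData would reserve 4000 bytes and the three sub-uploads would start at offsets 0, 1200, and 2400.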
During operation of the graphics rendering pipeline, the vertex attribute arrays mapped into the VBO are bound to the vertex attribute binding points of the vertex shader, which accesses the attributes at each binding point through the location index recorded with the VBO. For a more exhaustive treatment of VAOs and VBOs, see "The OpenGL Graphics System: A Specification".
(3) Bind the currently selected active texture to the vertex shader as a uniform sampler1D

Samplers (sampler1D, sampler2D, and sampler3D) are opaque types predefined by GLSL (the OpenGL Shading Language): they cannot be read or written directly through variables and can only be read through built-in GLSL functions (such as texture1D, texture2D, and texture3D). A general-purpose graphics rendering pipeline typically provides 8 sampler binding points, so a shader can access 8 textures simultaneously for multi-texturing. Once a texture is bound to such a point, it becomes a sampler that the shader can access; through the sampler, the shader reads the pixel at a specific position from its texture coordinates. For a more detailed introduction to samplers, see "The OpenGL Shading Language".
II. GPU-side parallel shading
Parallel rendering is an inherent property of the GPU graphics rendering pipeline. When rendering a complex 3-D model, each graphics processing unit on the GPU can execute an independent graphics rendering pipeline (one pipeline per processing unit, each pipeline handling one primitive; in the present invention, the vertex shader of each pipeline handles one point of the dense point cloud and can access all field information of that point), thereby achieving parallel acceleration. Figure 5 shows the rendering flow of the OpenGL graphics rendering pipeline, which contains two kinds of modules: programmable modules (dashed boxes in Figure 5) and fixed-function modules (solid boxes in Figure 5). For detailed technical descriptions of each module, see "The OpenGL Shading Language".
The present invention mainly involves the pipeline's two programmable modules (the vertex shader and the fragment shader) and two fixed-function modules (vertex post-processing and rasterization).
Step 9: Transform vertex coordinates from the global coordinate system to the clip coordinate system
The spatial perspective transformation is an operation every vertex shader must perform: it converts the input vertex position from the global coordinate system to the clip coordinate system, providing the basis for subsequent processing. The transformation is expressed with matrix operations:

(xe, ye, ze, we)^T = M (x, y, z, 1)^T
(xc, yc, zc, wc)^T = P (xe, ye, ze, we)^T

where M is the model-view matrix, (x, y, z, 1)^T are the vertex coordinates in the global coordinate system, (xe, ye, ze, we)^T are the vertex coordinates in the camera coordinate system, P is the projection matrix, and (xc, yc, zc, wc)^T are the vertex coordinates in the clip coordinate system.
Step 10: Shade the vertices according to the type of the active field
Shading the vertices is the core task of the vertex shader. Because the field the user switches to may be binary or non-binary, the two field types require different shading mechanisms. The details are as follows:
(1) If the field corresponding to the active field number is the color field, the vertex's color is taken directly from the vertex color attribute array as the shading result.
(2) If the field corresponding to the active field number is a binary field, the pixels at the first and last positions of the active texture serve as the shading results for field values 0 and 1, respectively.
(3) If the field corresponding to the active field number is a non-binary field other than the color field, a linear texture coordinate is computed from the field's value range and the vertex's value on that field, and a smoothly blended shading result is fetched from the active texture at that coordinate.
In 3-D visualization, texture coordinates address the pixel at a specific position of a texture. As shown in Figure 6, for a two-dimensional texture the origin (0, 0) of the texture coordinates (tx, ty) lies at the bottom-left corner; tx increases from 0 to 1 left to right, and ty increases from 0 to 1 bottom to top, so 0 ≤ tx, ty ≤ 1. Let the texture be w pixels wide and h pixels high; the pixel in column i and row j (0 ≤ i < w, 0 ≤ j < h) then has texture coordinates

tx = (i + 0.5) / w, ty = (j + 0.5) / h

where the 0.5 offset addresses the texel center.
Let cb denote the sampler bound to the active texture passed into the vertex shader. For a two-dimensional sampler s, the pixel value at a texture coordinate is fetched with the GLSL built-in function texture2D:

vec4 pixelValue = texture2D(s, vec2(tx, ty))

In the present invention the texture behind cb is one-dimensional, so the lookup simplifies to a single float coordinate:

vec4 pixelValue = texture1D(cb, tx)

Here vec4 is a built-in GLSL type holding the pixel's values on the four channels (r, g, b, w), each in [0, 1], and vec2 is a two-dimensional vector carrying texture coordinates.
Therefore, if the active field selected by the user is a binary field and the current vertex's value on that field is 0, the shading result is

vec4 pixelValue = texture1D(cb, 0.0)

otherwise, the shading result is

vec4 pixelValue = texture1D(cb, 1.0)
If the active field selected by the user is non-binary, the linear texture coordinate (lx, 0) is computed, and the pixel fetched from the active texture at (lx, 0), blending naturally with its neighbors, is used as the shading result. Taking the elevation field as the active field, lx is computed as follows. Let the value range of the elevation field be [hmin, hmax], let the elevation of the current vertex be hv (the z component of the vertex coordinates), and let the pixel width of the active texture be wt. Then

rn = hmax - hmin
su = rn / wt
pi = floor((hv - hmin) / su)
lx = (pi + 0.5) / wt

where rn is the elevation span, su is the elevation range covered by a single pixel of the active texture, pi is the pixel position to which the current point maps in the active texture, and lx is the resulting linear texture coordinate.
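The elevation-to-texture-coordinate mapping can be sketched as follows (an illustrative sketch: the clamping of the last texel and the +0.5 texel-center offset are assumptions of this sketch, chosen so that hv = hmax still addresses a valid texel):

```python
def elevation_tex_coord(h_v, h_min, h_max, w_t):
    """Linear texture coordinate lx for the elevation field: rn is the
    elevation span, su the elevation range per texel, pi the texel index
    of the current point (clamped to the last texel), and lx addresses
    that texel's center."""
    rn = h_max - h_min                 # elevation span
    su = rn / w_t                      # elevation range per texel
    pi = min(int((h_v - h_min) / su), w_t - 1)
    return (pi + 0.5) / w_t
```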
Step 11: Post-process the shaded vertices
Vertices shaded by the vertex shader must pass through the post-processing stage before rasterization (in the present invention the shaded objects are the individual points of the dense point cloud, so no tessellation or geometry shaders are involved). As shown in Figure 7, vertex post-processing comprises five stages: transform feedback, primitive assembly, viewing-volume clipping, perspective division, and the viewport transform. The last three are the ones most relevant to the present invention.
(1) Viewing-frustum clipping
The purpose of frustum clipping is to cull outliers and avoid needless drawing cost (as shown in Figure 8, an outlier is defined as a point lying outside the viewing frustum). Under perspective projection, the truncated frustum is bounded by six planes and is established by the perspective projection matrix P, which is constructed by specifying the positions of the six planes lp, rp, tp, bp, np, fp in the camera coordinate system e-xyz:

P =
| 2np/(rp-lp)   0             (rp+lp)/(rp-lp)     0               |
| 0             2np/(tp-bp)   (tp+bp)/(tp-bp)     0               |
| 0             0             -(fp+np)/(fp-np)    -2fp*np/(fp-np) |
| 0             0             -1                  0               |

where lp is the left plane, rp the right plane, tp the top plane, bp the bottom plane, np the near plane, and fp the far plane. P transforms camera coordinates into clip coordinates. In the clip coordinate system c-xyz, the components of the clip coordinates (xc, yc, zc, wc) satisfy

-wc ≤ xc ≤ wc, -wc ≤ yc ≤ wc, -wc ≤ zc ≤ wc, with wc = -ze

where ze is the z component of the camera coordinates. A point that fails these inequalities after the perspective projection transform is an outlier and is culled.
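The clip-space culling test above amounts to a few comparisons per point (a minimal sketch; on the GPU this test is performed by the fixed-function clipping stage, not by user code):

```python
def outside_frustum(xc, yc, zc, wc):
    """A point in clip coordinates is an outlier, and is culled, when
    any of xc, yc, zc falls outside the interval [-wc, wc]."""
    return not (-wc <= xc <= wc and -wc <= yc <= wc and -wc <= zc <= wc)
```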
The plane positions of the perspective projection matrix P strongly affect perspective distortion and depth conflicts (z-fighting). The present invention therefore adjusts the positions of the six planes adaptively from the following two aspects, establishing the best projection parameters and safeguarding the visualization quality of the dense point cloud.
① Adapt the near and far planes to the viewpoint pose and the minimal bounding box of the dense point cloud
Because vertex depth is distributed nonlinearly in the depth buffer (depth precision is high near the near plane and low near the far plane), placing the near plane very close to the viewpoint and the far plane very far from it causes depth conflicts when rendering dense point clouds of large scenes (visible as flickering during interaction), degrading the visualization. The present invention therefore designs a mechanism that adaptively recomputes the near- and far-plane positions in every frame from the viewpoint pose and the minimal bounding box of the scene, optimizing the precision distribution of the depth buffer.
As shown in Figure 9, the spatial extent of the minimum bounding box BB of the dense point cloud is fixed by its eight corner points in the global coordinate system o-xyz, and the viewpoint pose is fixed by the camera coordinate system e-xyz. Along the viewing direction, the pair of corner points p and q nearest to and farthest from the viewpoint position e can be found; the near plane n_p and far plane f_p then equal the projected lengths of the vectors ep and eq on the unit viewing direction v, i.e.

n_p = ep · v,  f_p = eq · v
This mechanism guarantees that the near and far planes lock exactly onto the overall depth range of the dense point cloud along the z axis of the camera coordinate system, making full use of the precision distribution of the depth buffer and preventing z-fighting.
② Adaptively adjusting the top, bottom, left, and right planes from the viewport proportions
The positions of the top, bottom, left, and right planes should be coordinated with the size of the rendering viewport to prevent perspective distortion. As shown in Figure 10, the present invention fixes the four side planes from the viewport aspect ratio λ together with the near plane n_p and far plane f_p, i.e.

t_p = r·n_p,  b_p = −r·n_p,  r_p = λ·r·n_p,  l_p = −λ·r·n_p
In the formula above, a denotes the field-of-view angle (FOV); r denotes tan(a/2), which equals the ratio of the viewport height h to 2n_p; λ = w/h denotes the viewport aspect ratio, where w is the viewport width and h its height. The bold lines in Figure 10 mark the four adjusted plane positions.
In practice, the top, bottom, left, and right planes must be readjusted when the viewport size changes, and the near and far planes must be readjusted when the viewpoint pose changes or a dense point cloud is reloaded.
(2) Perspective division
Perspective division converts clip coordinates into normalized device coordinates (NDC), which drive the subsequent viewport transform. As shown in Figure 11, normalized device coordinates live in the left-handed coordinate system n-xyz and are obtained by the perspective division

x_n = x_c / w_c,  y_n = y_c / w_c,  z_n = z_c / w_c
Consequently, x_n, y_n, and z_n all lie in the range [−1, 1].
(3) Viewport transform
The viewport transform converts normalized device coordinates into window coordinates, providing the basis for subsequent rasterization. The origin w(x_o, y_o, z_o) of the window coordinate system w-xyz sits at the lower-left corner of the viewport under the default configuration (it can also be set to the upper-left corner or user-defined), with the x axis pointing right, the y axis pointing up, and the z axis pointing into the screen. With viewport width w and height h, normalized device coordinates (x_n, y_n, z_n) convert to window coordinates (x_w, y_w, z_w) as

x_w = (w/2)·x_n + x_o + w/2
y_w = (h/2)·y_n + y_o + h/2
z_w = ((d_f − d_n)/2)·z_n + (d_f + d_n)/2
where d_f is the depth value at the far plane and d_n the depth value at the near plane; under the default configuration, d_f = 1 and d_n = 0. Before the viewport transform, the OpenGL call glViewport specifies the viewport origin and extent, and glDepthRange specifies the depth range.
Step 12: shade the rasterized fragments
In the processing so far, every primitive is a vector point, and vector points must be rasterized before they can appear on screen. Rasterization grids each vector point into fragments (a fragment can be thought of as a pixel unit) according to its window coordinates, after which the fragment shader colors the fragments. During rasterization, the size of the pixel area a rasterized vector point occupies is called the point size p_s, computed as

p_s = clamp( u_s · sqrt( 1 / (a + b·d_w + c·d_w²) ), min_s, max_s )
where u_s is the size specified via glPointSize; a, b, and c are distance-attenuation coefficients; d_w is the distance from the vector vertex to the viewpoint; max_s and min_s are the upper and lower bounds of the rasterized size; and clamp restricts p_s to the range [min_s, max_s]. The values of a, b, c, max_s, and min_s depend on the specific hardware implementation.
To improve the visualization quality of dense point clouds, the present invention enables anti-aliasing in the rasterization stage. As shown in Figure 12, with anti-aliasing enabled, the graphics rendering pipeline constructs a circular region of radius p_s centered on the vector point being rasterized; the fragments whose center coordinates fall inside this circle make up the rasterized footprint of the point.
The discrete fragments produced by rasterization are sent to the fragment shader for final shading. All fragments rasterized from the same vector point carry identical attribute values (namely the shading results output by the vertex shader); the fragment shader uses these vertex-shader outputs to color the discretized fragments in parallel. Finally, after the scissor test, stencil test, and depth test of the per-fragment operations stage, the visible fragments are written to the framebuffer and drawn to the screen, completing the shading of one frame.
The technical advantages and effects of this solution include:
(1) Compared with the traditional CPU route of serial point-by-point coloring, the present invention exploits the parallel processing power of the GPU graphics rendering pipeline to perform fast, customizable coloring of multi-field dense point clouds, meeting the timeliness requirements of subsequent application analysis;
(2) The technique not only covers all the field information of standard point cloud files but can also extend the set of field types according to the application scenario, efficiently presenting the coloring of dense point clouds under different fields and different color bars;
(3) Being developed on the graphics rendering pipeline of general-purpose graphics cards and in C++ code, the invention is compatible with different hardware devices and offers reliable cross-platform portability.
Evidently, the concrete implementation of the present invention is not limited to the approaches above; any non-substantive improvement made using the method concept and technical solution of the present invention falls within its protection scope.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410082733.8A CN117911597A (en) | 2024-01-19 | 2024-01-19 | A GPU method for efficient colorization of multi-field dense point clouds |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410082733.8A CN117911597A (en) | 2024-01-19 | 2024-01-19 | A GPU method for efficient colorization of multi-field dense point clouds |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN117911597A true CN117911597A (en) | 2024-04-19 |
Family
ID=90690416
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410082733.8A Pending CN117911597A (en) | 2024-01-19 | 2024-01-19 | A GPU method for efficient colorization of multi-field dense point clouds |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN117911597A (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119313799A (en) * | 2024-12-17 | 2025-01-14 | 山东智和创信息技术有限公司 | A graphics rendering method and system based on GLSL shader |
| CN119313799B (en) * | 2024-12-17 | 2025-04-08 | 山东智和创信息技术有限公司 | Graph rendering method and system based on GLSL (Global navigation satellite System) shader |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN113178014B (en) | Scene model rendering method and device, electronic equipment and storage medium | |
| US20230108967A1 (en) | Micro-meshes, a structured geometry for computer graphics | |
| US8570322B2 (en) | Method, system, and computer program product for efficient ray tracing of micropolygon geometry | |
| CN102081804B (en) | Subdividing geometry images in graphics hardware | |
| KR102811127B1 (en) | Octree-based convolutional neural network | |
| EP3026633B1 (en) | Texture processing method and unit | |
| CA2373707A1 (en) | Method and system for processing, compressing, streaming and interactive rendering of 3d color image data | |
| WO2016139488A2 (en) | Method of and apparatus for processing graphics | |
| CN112927339B (en) | Graphics rendering method and device, storage medium and electronic device | |
| CN113593028A (en) | Three-dimensional digital earth construction method for avionic display control | |
| CN115168682B (en) | Large-scale space-time point data LOD drawing method and device | |
| CN116883575B (en) | Building group rendering method, device, computer equipment and storage medium | |
| CN117911597A (en) | A GPU method for efficient colorization of multi-field dense point clouds | |
| US11715253B2 (en) | Pixelation optimized delta color compression | |
| CN114663566A (en) | Loading rendering processing method and system for three-dimensional model | |
| WO2023239799A1 (en) | Systems and methods for efficient rendering and processing of point clouds using textures | |
| CN116758206A (en) | Vector data fusion rendering method and device, computer equipment and storage medium | |
| CN113379814B (en) | Three-dimensional space relation judging method and device | |
| CN115239895B (en) | Mass data loading and optimal rendering method for GIS water environment 3D map | |
| WO2023184139A1 (en) | Methods and systems for rendering three-dimensional scenes | |
| CN120563769B (en) | Voxel data extraction method based on high-fidelity scene | |
| CN111862331A (en) | CPU operation-based model voxelization efficiency optimization method and system | |
| CN105957132B (en) | High-performance rendering optimization method for 3D scenes containing highly complex rendering elements | |
| CN119206028B (en) | Generation method of WebGPU real-time rendering pipeline | |
| CN116958388A (en) | Voxel sampling method, voxel sampling device, electronic device and computer program product |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |