CN111857081A - Performance control method of chip packaging and testing production line based on Q-learning reinforcement learning - Google Patents
- Publication number
- CN111857081A CN202010797879.2A
- Authority
- CN
- China
- Prior art keywords
- production line
- performance
- station
- production
- rate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/418—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM]
- G05B19/41885—Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control [DNC], flexible manufacturing systems [FMS], integrated manufacturing systems [IMS] or computer integrated manufacturing [CIM] characterised by modeling, simulation of the manufacturing system
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/30—Nc systems
- G05B2219/32—Operator till task planning
- G05B2219/32339—Object oriented modeling, design, analysis, implementation, simulation language
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/02—Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]
Landscapes
- Engineering & Computer Science (AREA)
- Manufacturing & Machinery (AREA)
- General Engineering & Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Automation & Control Theory (AREA)
- General Factory Administration (AREA)
Abstract
The invention relates to the field of performance control and optimization of semiconductor chip packaging and testing production lines, and in particular to a performance control method for a chip packaging and testing production line based on Q-learning reinforcement learning. The invention establishes a more accurate performance prediction model of the series-parallel production line for semiconductor packaging and testing, and combines the Morris screening method with Arena simulation to carry out a quantitative global sensitivity analysis. This yields the factors that most strongly affect production line performance and the way they act, and sidesteps the situation in which the equipment Markov state space is so large that traditional mathematical model analysis is not applicable. On the basis of performance prediction and sensitivity analysis, the invention controls the variability factors of the production line and improves the way the parameter ε is set, so that the algorithm converges faster and avoids local optima, while the performance control method gains flexibility and real-time responsiveness.
Description
Technical Field
The invention relates to the field of performance control and optimization of semiconductor chip packaging and testing production lines. Specifically, it is directed at semiconductor chip packaging and testing production lines and concerns a performance control method that combines sensitivity analysis with the Q-learning reinforcement learning algorithm.
Background
The semiconductor manufacturing industry has great strategic value for the development of the national economy. To keep China's semiconductor manufacturing industry developing well, it is necessary not only to expand production scale but also to attend to the production efficiency of the manufacturing system and to strengthen production management and control technology. Semiconductor manufacturing systems are characterized by highly re-entrant process routes, highly complex production processes, long manufacturing cycles, large system scale and high uncertainty, so controlling the performance of the production line is difficult. Variability factors such as buffer capacity, sudden equipment failure, preventive equipment maintenance and product rework also strongly affect the production performance of the manufacturing system: they lower production efficiency, lengthen the production cycle and disrupt the normal execution of production plans.
There is currently little research on intelligent, comprehensive and dynamic control of production line performance. Most of it is limited to a single aspect of production line variability and does not examine the multiple variability factors on the line as a whole. The performance prediction models of semiconductor series-parallel production lines established in current research deviate from actual production and lack accuracy. Traditional performance control and optimization methods struggle to respond in real time to changes in the variability factors of the production line and are not flexible enough.
Summary of the Invention
To address the shortcomings of existing performance control models and strategies for semiconductor chip packaging and testing production lines, the invention proposes a performance control method for chip packaging and testing production lines based on Q-learning reinforcement learning. The method targets existing problems such as slow response to variability factors, incomplete consideration of variability factors and conflicting control strategies, and combines sensitivity analysis with the Q-learning reinforcement learning algorithm to intelligently control the manufacturing performance of the semiconductor chip packaging and testing production line.
A performance control method for a chip packaging and testing production line based on Q-learning reinforcement learning comprises the following steps:
Step 1: Build an abstract model of the series-parallel production line for semiconductor chip packaging and testing;
Step 2: Based on the abstract production line model built in Step 1, establish a prediction model of the performance of the series-parallel production line for semiconductor chip packaging and testing;
Step 3: Based on the abstract production line model built in Step 1, use qualitative analysis by the Morris screening method and quantitative analysis by Arena simulation to obtain the mechanism by which the key variability factors affect production line performance;
Step 4: Based on the performance prediction model established in Step 2 and the analysis of key variability factors obtained in Step 3, establish a performance control model based on the Q-learning reinforcement learning algorithm, and solve it iteratively with the optimal production line benefit index as the performance control objective to obtain the globally optimal performance control strategy.
Step 1 is specifically:
Model abstraction of the semiconductor chip packaging and testing production line: the back-end process of the semiconductor manufacturing line, that is, the chip packaging and testing production line, is taken as the research object. Assuming that finite buffers exist between stations and that the queueing discipline is first come, first served, the line is abstracted as a multi-station series-parallel queueing production line model that includes re-entry (rework).
Step 2 is specifically:
Step 2.1: Variability calculation: calculate the arrival variability c_a and the processing time variability c_e.
Step 2.2: Determine the basic indices for performance prediction.
The average time a workpiece resides at a station, CT (the production cycle), is obtained from the average queueing time CT_q and the effective processing time t_e. The average work-in-process level WIP at the station is then calculated from it. The workpiece production rate TH, the production cycle CT and the work-in-process level WIP are taken as the basic indices for production line performance prediction.
CT = CT_q + t_e
WIP = CT × TH
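The two relations above can be checked numerically with a minimal sketch. The function names and the sample values (minutes, jobs per minute) are illustrative assumptions and do not come from the patent.

```python
def station_cycle_time(ct_q: float, t_e: float) -> float:
    """Cycle time at a station: CT = CT_q + t_e."""
    return ct_q + t_e


def wip_level(ct: float, th: float) -> float:
    """Little's law: WIP = CT * TH."""
    return ct * th


# Illustrative values only: 12 min in queue, 3 min effective processing,
# throughput of 0.5 jobs per minute.
ct = station_cycle_time(ct_q=12.0, t_e=3.0)
wip = wip_level(ct, th=0.5)
print(ct, wip)
```

With these inputs the station holds on average 7.5 jobs, since each job spends 15 minutes there and one job leaves every 2 minutes.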
Step 2.3: Establish the production line performance prediction model.
Step 2.3.1: Calculate the queueing time of product j at station i:
where c_a^{ij} and c_e^{ij} are the arrival variability and processing time variability of product j at station i, u_ij is the utilization of station i, m_ij is the number of parallel machines at station i, and t_e^{ij} is the effective processing time of product j at station i.
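The queueing-time equation itself is not reproduced in this text. The quantities listed (arrival and processing variability, utilization, number of parallel machines, effective processing time) are exactly the inputs of the standard G/G/m queue-time approximation from Factory Physics, so the sketch below uses that approximation as an assumption; it may differ from the patent's own formula. It also assumes that c_a and c_e are coefficients of variation, so they are squared in the variability term.

```python
import math


def queue_time_ggm(c_a: float, c_e: float, u: float, m: int, t_e: float) -> float:
    """Standard G/G/m approximation (assumed here, not confirmed by the source):

        CT_q = ((c_a^2 + c_e^2) / 2) * (u^(sqrt(2(m+1)) - 1) / (m (1 - u))) * t_e
    """
    if not 0.0 <= u < 1.0:
        raise ValueError("utilization u must satisfy 0 <= u < 1")
    if m < 1:
        raise ValueError("number of parallel machines m must be at least 1")
    variability = (c_a ** 2 + c_e ** 2) / 2.0
    utilization = u ** (math.sqrt(2.0 * (m + 1)) - 1.0) / (m * (1.0 - u))
    return variability * utilization * t_e


# Sanity check against M/M/1, where CT_q = u / (1 - u) * t_e.
print(queue_time_ggm(c_a=1.0, c_e=1.0, u=0.5, m=1, t_e=2.0))
```

For a single machine with exponential arrivals and processing times the approximation reduces to the exact M/M/1 result, which is a convenient check before applying it to multi-machine stations.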
Step 2.3.2: Calculate the workpiece production rate TH.
Assume that station i has m_ij parallel machines (b > m > 1), where b is the capacity of the buffer in front of station i and k is the number of workpieces being processed at station i. If 0 ≤ k ≤ b, the probability p_0 that no workpiece j (0 < j < r, where r is the total number of products processed on the production line) is waiting in front of station i during processing is:
The blocking probability of workpiece j with buffer capacity b is:
Let q_hj be the defect rate of workpiece j at station h and Q_ij the defect rate detected at station i, with 0 < h < i ≤ s, where s is the number of stations in the series-parallel production line. The probability Q_ij that a defective workpiece j is detected and removed at station i is:
The index set in this expression denotes the numbers of all stations on the production line that carry out defect inspection.
The production rate TH_ij of workpiece j at station i is then:
When the utilization of a station is the maximum, station I is recorded as the bottleneck station of product J, and its production rate is recorded as r_b^{IJ} = max(u_ij).
Step 2.3.3: Calculate the production cycle (logical production cycle) CT_j and the work-in-process level WIP_j of the production line.
Calculate the average wait-to-batch time WTBT of workpieces:
where r_a is the rate at which workpieces arrive at the station and k_ij is the processing batch size of product j at station i. With batching taken into account, the formula for CT_q^{ij} is rewritten as:
Calculate the production cycle CT_j and the work-in-process level WIP_j of product j at station i:
This gives the production cycle (logical production cycle) CT_j and the work-in-process level WIP_j of product j over the entire series-parallel production line:
Step 2.4: Evaluate the performance of the production line performance prediction model.
Step 2.4.1: Calculate the production line performance index F.
As shown in Fig. 3, the WIP-CT and WIP-TH curves of the production line in the best case, the worst case and the practical worst case serve as benchmarks that delimit the "good region" and the "bad region" of the performance quadrant and together form the performance evaluation chart of the production line.
The ratio of the distance of the actual performance point to the distance between the best-case and practical-worst-case benchmarks is taken as the performance evaluation index, denoted F:
where w is the given actual work-in-process level, t is the actual production cycle, and T_0 is the theoretical processing time of the production line, here T_0 = CT; r_b is the bottleneck rate of the production line, here r_b = TH_ij if and only if u_ij = u_max.
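The three benchmark curves referred to above are the standard Factory Physics laws, which depend only on w, T_0 and r_b, and can be sketched as follows. The expression for F is not reproduced in this text, so `relative_position` below is a hypothetical stand-in: it measures where the actual cycle time t sits between the best case (0) and the practical worst case (1), and should not be read as the patent's definition of F.

```python
def best_case(w: float, t0: float, rb: float) -> tuple[float, float]:
    """Best-case cycle time and throughput at WIP level w."""
    w0 = rb * t0  # critical WIP
    if w <= w0:
        return t0, w / t0
    return w / rb, rb


def practical_worst_case(w: float, t0: float, rb: float) -> tuple[float, float]:
    """Practical-worst-case cycle time and throughput at WIP level w."""
    w0 = rb * t0
    return t0 + (w - 1) / rb, w / (w0 + w - 1) * rb


def worst_case(w: float, t0: float, rb: float) -> tuple[float, float]:
    """Worst-case cycle time and throughput at WIP level w."""
    return w * t0, 1.0 / t0


def relative_position(t: float, w: float, t0: float, rb: float) -> float:
    """Hypothetical index: 0 on the best-case curve, 1 on the practical worst case."""
    ct_best, _ = best_case(w, t0, rb)
    ct_pwc, _ = practical_worst_case(w, t0, rb)
    if ct_pwc == ct_best:
        raise ValueError("benchmarks coincide at this WIP level (e.g. w = 1)")
    return (t - ct_best) / (ct_pwc - ct_best)


# Illustrative line: T_0 = 8 h, r_b = 0.5 jobs/h, so the critical WIP is 4 jobs.
print(best_case(4, 8.0, 0.5), practical_worst_case(4, 8.0, 0.5), worst_case(4, 8.0, 0.5))
print(relative_position(t=11.0, w=4, t0=8.0, rb=0.5))
```

Each benchmark pair satisfies Little's law (WIP = CT × TH), which is a quick way to confirm that the curves are consistent with the basic indices of Step 2.2.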
Step 2.4.2: Calculate the production line benefit index Bf.
Taking production cost into account, the production line performance index F is rewritten as the benefit index Bf:
Bf = C * F
where C is the cost factor, c_1 is the unit equipment cost, c_2 is the unit buffer capacity cost, c_3 is the remaining fixed cost, m_1 and b_1 are the current number of parallel machines and buffer capacity, and m_0 and b_0 are the initial number of parallel machines and buffer capacity.
Step 3 is specifically:
Step 3.1: Qualitative sensitivity analysis by the Morris screening method.
Select a random parameter x in the production line performance prediction model, preset a fixed step size C and a maximum variation range M, perturb the parameter x in steps of C, and take the average rate of change of the performance evaluation index F as the sensitivity coefficient S:
where Y_0 is the performance evaluation index F corresponding to the initial value of parameter x; Y_g and Y_{g+1} are the performance evaluation index F after the g-th and (g+1)-th perturbations of the parameter; P_g and P_{g+1} are the rates of change of the parameter value relative to its initial value after the g-th and (g+1)-th perturbations; and n is the number of runs.
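The expression for S is not reproduced in this text. The sketch below follows the commonly used modified Morris screening form, in which each step contributes the relative change in the output divided by the change in the parameter, and the contributions are averaged. Two points are assumptions: P_g is treated as a percentage (hence the division by 100), and the average is taken over the number of consecutive perturbation pairs.

```python
def morris_sensitivity(y0: float, ys: list[float], ps: list[float]) -> float:
    """Average relative change of the output per unit relative change of the parameter.

    y0 : output F at the initial parameter value
    ys : outputs F after each perturbation, in order
    ps : parameter change relative to its initial value, in percent, in the same order
    """
    if len(ys) != len(ps) or len(ys) < 2:
        raise ValueError("need at least two perturbations with matching lengths")
    terms = [
        ((ys[g + 1] - ys[g]) / y0) / ((ps[g + 1] - ps[g]) / 100.0)
        for g in range(len(ys) - 1)
    ]
    return sum(terms) / len(terms)


# Illustrative data: the output moves by 10% of Y_0 for every 10% step in the parameter.
print(morris_sensitivity(y0=10.0, ys=[9.0, 10.0, 11.0], ps=[-10.0, 0.0, 10.0]))
```

A coefficient near 1 means the output changes in proportion to the parameter; values near 0 indicate an insensitive parameter, which is what the grading standard in the table is used to classify.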
According to the sensitivity grading standard in Table 1, the parameters graded as sensitive or highly sensitive are identified as the factors with a larger impact on the performance of the semiconductor packaging and testing production line.
Table 1 Sensitivity grading standard
Step 3.2: Quantitative sensitivity analysis by Arena simulation.
A model of the series-parallel production line for semiconductor chip packaging and testing is built in the Arena software. Each machine has its own independent random processing time, failure time and repair time.
The workpiece arrival rate, station equipment processing rate, mean time to failure m_f and mean time to repair m_p on the production line follow negative exponential and normal distributions, respectively. The processing batch size k, buffer capacity b and number of parallel machines m are fixed positive integers with b > m > 1. The warm-up time, total run time and number of replications of the simulation experiment are then set.
The experiment yields the curves of overall production line performance, production cycle CT, production rate TH and work-in-process level WIP as functions of the key factors affecting production line performance.
Step 4 is specifically:
Step 4.1: With the production line performance prediction model as the external environment for reinforcement learning and changes in production line variability as the trigger condition, and using a dynamic control method that combines an event-triggered strategy with a periodically triggered strategy, establish the reinforcement-learning-based performance control model of the semiconductor chip packaging and testing production line shown in Fig. 5.
Step 4.2: Initialize Q(s, a), a ∈ A(s), where the Q value reflects the long-term return, S is the system state set, and A(s) is the action strategy set for the key factors obtained in Step 3. Given the learning rate factor α and the discount factor γ, determine the reward function r.
Step 4.3: Given a starting state s, select action a in state s according to the ε-greedy policy. The way ε is set is improved by defining it as a function of p and M, where p is the number of steps the algorithm has executed so far and M is the total number of iteration steps, so that its value gradually decreases from the initial value 0.2 as the number of executed steps grows.
Step 4.4: Select action a in state s according to the ε-greedy policy, where b is the selection index of a; obtain the reward r and the next state s_next, with a_next denoting the next action; then update the Q value:
s = s_next, a = a_next
Step 4.5: Return to Step 4.4 until the system tends to a steady state, that is, the converged state.
Step 4.6: Repeat Steps 4.2 to 4.5 until the learning period (the preset number of times Steps 4.2 to 4.5 are repeated) ends, then stop iterating.
Step 4.7: Output the final policy and obtain the resulting optimization of the production line performance indices.
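Steps 4.2 to 4.5 follow the usual shape of tabular Q-learning with an ε-greedy policy, which can be sketched as below. Several parts are assumptions, because the corresponding equations are not reproduced in this text: the update rule is the textbook Q-learning update, the `epsilon` schedule is a hypothetical linear decay from 0.2 to 0, and `chain_step` is a toy environment standing in for the production line performance prediction model. The sketch runs a single learning period; the outer repetition of Step 4.6 is omitted.

```python
import random
from collections import defaultdict


def epsilon(p: int, total_steps: int, eps0: float = 0.2) -> float:
    """Hypothetical schedule: decays linearly from eps0 to 0 (the patent's form is not shown)."""
    return eps0 * (1.0 - p / total_steps)


def choose_action(q, s, actions, eps, rng):
    """Epsilon-greedy selection: explore with probability eps, otherwise act greedily."""
    if rng.random() < eps:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(s, a)])


def q_learning(step, actions, start_state, total_steps, alpha=0.1, gamma=0.9, seed=0):
    """Textbook tabular Q-learning; `step(s, a)` returns (reward, next_state)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q(s, a) initialised to 0
    s = start_state
    for p in range(total_steps):
        a = choose_action(q, s, actions, epsilon(p, total_steps), rng)
        r, s_next = step(s, a)
        best_next = max(q[(s_next, a2)] for a2 in actions)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
        s = s_next
    return q


def chain_step(s: int, a: int):
    """Toy environment (not from the patent): states 0..4, reward 1 on reaching state 4."""
    s_next = min(max(s + a, 0), 4)
    return (1.0 if s_next == 4 else 0.0), s_next


q = q_learning(chain_step, actions=[1, -1], start_state=0, total_steps=2000)
print(q[(3, 1)], q[(3, -1)])
```

In the patent's setting the learning rate α and discount factor γ are given parameters, and the reward is derived from the benefit index Bf; the toy reward here only demonstrates that the learned Q values favour the action leading to the rewarded state.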
The invention establishes a more accurate performance prediction model of the series-parallel production line for semiconductor packaging and testing, and combines the Morris screening method with Arena simulation to carry out a quantitative global sensitivity analysis. This yields the factors that most strongly affect production line performance and the way they act, and sidesteps the situation in which the equipment Markov state space is so large that traditional mathematical model analysis is not applicable. The invention proposes a production line performance control model based on the Q-learning algorithm that controls the variability factors of the production line on the basis of performance prediction and sensitivity analysis and improves the way the parameter ε is set, so that the algorithm converges faster and avoids local optima, while the performance control method gains flexibility and real-time responsiveness.
Description of the Drawings
Fig. 1 is the flow chart of the invention;
Fig. 2 is the abstract model of the semiconductor chip packaging and testing production line;
Fig. 3 is a diagram of the existing Factory Physics three-benchmark performance evaluation method;
Fig. 4 is a schematic diagram of the logical structure of the production line simulation model;
Fig. 5 is the reinforcement-learning-based production line performance control model of the embodiment;
Fig. 6 shows how production line performance varies with the variabilities c_a and c_e;
Fig. 7 shows the changes in production line performance indices before and after performance control at different levels of variability CV1;
Fig. 8 shows the changes in production line performance indices before and after performance control at different levels of variability CV2.
Detailed Description
The invention is described in further detail below with reference to the drawings and an embodiment. The embodiment is implemented on the premise of the technical solution of the invention, and a detailed implementation and specific operating procedure are given (Fig. 1), but the scope of protection of the invention is not limited to the following embodiment.
The embodiment can be divided into the following main steps:
Step 1: Model abstraction of the semiconductor chip packaging and testing production line: the chip packaging and testing production line is taken as the research object. Assuming that buffers of finite size exist between stations and that the queueing discipline is first come, first served, the line is abstracted as a multi-station series-parallel queueing production line model that includes re-entry (rework) (Fig. 2).
Step 2:
Step 2.1: Variability calculation.
Calculate the arrival variability c_a and the processing time variability c_e.
Step 2.2: Determine the basic indices for performance prediction.
The average time a workpiece resides at a station, CT (the production cycle), is obtained from the average queueing time CT_q and the effective processing time t_e. The average work-in-process level WIP at the station is then calculated from it. The workpiece production rate TH, the production cycle CT and the work-in-process level WIP are taken as the basic indices for production line performance prediction.
CT = CT_q + t_e
WIP = CT × TH
Step 2.3: Establish the production line performance prediction model.
Step 2.3.1: Calculate the queueing time of product j at station i:
where c_a^{ij} and c_e^{ij} are the arrival variability and processing time variability of product j at station i, u_ij is the utilization of station i, m_ij is the number of parallel machines at station i, and t_e^{ij} is the effective processing time of product j at station i.
Step 2.3.2: Calculate the workpiece production rate TH.
Assume that station i has m_ij parallel machines (b > m > 1), where b is the capacity of the buffer in front of station i and k is the number of workpieces being processed at station i. If 0 ≤ k ≤ b, the probability p_0 that no workpiece j (0 < j < r, where r is the total number of products processed on the production line) is waiting in front of station i during processing is:
The loss rate of workpiece j at station i is:
Let q_hj be the defect rate of workpiece j at station h and Q_ij the defect rate detected at station i, with 0 < h < i ≤ s, where s is the number of stations in the series-parallel production line. The probability Q_ij that a defective workpiece j is detected and removed at station i is:
The index set in this expression denotes the numbers of all stations on the production line that carry out defect inspection.
The production rate TH_ij of workpiece j at station i is then:
Denote the production rate of the bottleneck station I of product J as r_b^{IJ} = max(u_ij).
Step 2.3.3: Calculate the production cycle (logical production cycle) CT_j and the work-in-process level WIP_j of the production line.
Calculate the average wait-to-batch time WTBT of workpieces:
where r_a is the rate at which workpieces arrive at the station and k_ij is the processing batch size of product j at station i. With batching taken into account, the formula for CT_q^{ij} is rewritten as:
Calculate the production cycle CT_j and the work-in-process level WIP_j of product j at station i:
This gives the production cycle (logical production cycle) CT_j and the work-in-process level WIP_j of product j over the entire series-parallel production line:
Step 2.4: Evaluate the performance of the production line performance prediction model.
Step 2.4.1: Calculate the production line performance index F.
As shown in Fig. 3, the WIP-CT and WIP-TH curves of the production line in the best case, the worst case and the practical worst case serve as benchmarks that delimit the "good region" and the "bad region" of the performance quadrant and together form the performance evaluation chart of the production line.
The ratio of the distance of the actual performance point to the distance between the best-case and practical-worst-case benchmarks is taken as the performance evaluation index, denoted F:
where w is the given actual work-in-process level, t is the actual production cycle, and T_0 is the theoretical processing time of the production line, here T_0 = CT; r_b is the bottleneck rate of the production line, here r_b = TH_ij if and only if u_ij = u_max.
Step 2.4.2: Calculate the production line benefit index Bf.
Taking production cost into account, the production line performance index F is rewritten as the benefit index Bf:
Bf = C * F
where C is the cost factor, c_1 is the unit equipment cost, c_2 is the unit buffer capacity cost, c_3 is the remaining fixed cost, m_1 and b_1 are the current number of parallel machines and buffer capacity, and m_0 and b_0 are the initial number of parallel machines and buffer capacity.
Step 3:
Step 3.1: Qualitative sensitivity analysis by the Morris screening method.
Select one parameter x in the production line performance prediction model, preset a fixed step size C and a maximum variation range M, perturb the parameter x in steps of C, and take the average rate of change of the performance evaluation index F as the sensitivity coefficient S:
where Y_0 is the performance evaluation index F corresponding to the initial value of parameter x; Y_g and Y_{g+1} are the performance evaluation index F after the g-th and (g+1)-th perturbations of parameter x; P_g and P_{g+1} are the rates of change of the parameter value relative to its initial value after the g-th and (g+1)-th perturbations; and n is the number of runs.
Table 1 lists the sensitivity coefficients of the performance evaluation index F with respect to the different parameters, as obtained by the Morris screening method.
Table 1 Sensitivity coefficient S of index F
According to the sensitivity grading standard in Table 2 and the relationships between the parameters, the number of parallel machines m, the processing batch size k, the workpiece arrival time variability c_a, the processing variability c_e and the buffer capacity b are identified as the factors with a larger impact on the performance of the semiconductor packaging and testing production line.
Table 2 Sensitivity grading standard
Step 3.2: Quantitative sensitivity analysis by Arena simulation.
A model of the series-parallel production line for semiconductor chip packaging and testing is built in the Arena software, as shown in Fig. 4. Each machine has its own independent random processing time, failure time and repair time.
The workpiece arrival rate, station equipment processing rate, mean time to failure m_f and mean time to repair m_p on the production line follow negative exponential and normal distributions, respectively. The processing batch size k, buffer capacity b and number of parallel machines m are fixed positive integers with b > m > 1. The warm-up time of the simulation experiment is set to 600 minutes, the total run time to 1200 minutes, and the experiment is replicated 3 times.
The experiment yields the curves of overall production line performance, production cycle CT, production rate TH and work-in-process level WIP as functions of the key factors affecting production line performance. Fig. 6 shows how production line performance varies with the arrival time variability c_a and the processing variability c_e.
步骤4:Step 4:
步骤4.1:以生产线性能预测模型作为强化学习外界环境,以生产线变动性的变化为触发条件,基于事件触发策略与周期触发策略相结合的动态控制方法,建立如图5所示的基于强化学习的半导体芯片封装测试生产线性能控制模型。Step 4.1: Take the production line performance prediction model as the external environment for reinforcement learning, take the change of the variability of the production line as the triggering condition, and establish a dynamic control method based on the combination of the event trigger strategy and the cycle trigger strategy, as shown in Figure 5. Semiconductor chip packaging and testing production line performance control model.
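One way to read the combined trigger in Step 4.1 is that the controller re-optimizes either when variability shifts noticeably (event trigger) or when a fixed interval has elapsed (periodic trigger). The sketch below assumes that reading. The class name `HybridTrigger`, the threshold of 0.1, and the 60-minute period are hypothetical values, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class HybridTrigger:
    """Fire on a large variability change (event) or after a fixed period."""
    cv_threshold: float = 0.1     # hypothetical: minimum change in variability
    period_min: float = 60.0      # hypothetical: periodic interval, minutes
    last_cv: float = 0.0          # variability seen at the last firing
    last_fire_time: float = 0.0   # time of the last firing, minutes

    def should_fire(self, now_min: float, cv: float) -> bool:
        event = abs(cv - self.last_cv) >= self.cv_threshold
        periodic = (now_min - self.last_fire_time) >= self.period_min
        if event or periodic:
            # Remember the reference point for the next comparison.
            self.last_cv = cv
            self.last_fire_time = now_min
            return True
        return False
```

Each firing would start one round of the learning procedure in Steps 4.2 to 4.7.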
Step 4.2: Initialize Q(s, a) for a ∈ A(s), where the Q value reflects the long-term reward and S is the set of system states, partitioned as shown in Table 3:
Table 3 Partition of the system state set S
A(s) is the action set, A(s): {a1: number of parallel machines at station i +1, a2: number of parallel machines at station i -1, a3: buffer capacity at station i +1, a4: buffer capacity at station i -1, a5: processing batch size of product j +1, a6: processing batch size of product j -1}. The learning rate α is set to 0.1 and the discount factor γ to 0.9. The reward function r is defined as follows, where Bf_pre denotes the benefit index of the production line after the previous optimization:
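The six actions a1 to a6 each change one of m, b, or k by one unit, which suggests a small immutable configuration record plus a transition function. The sketch below is one possible encoding. The names `LineConfig` and `apply_action` are hypothetical, and the feasibility rule (keep b > m > 1 as in the simulation setup, and k >= 1) is an assumption, since the text does not say how infeasible moves are handled.

```python
from dataclasses import dataclass, replace

ACTIONS = ("a1", "a2", "a3", "a4", "a5", "a6")


@dataclass(frozen=True)
class LineConfig:
    m: int  # number of parallel machines at station i
    b: int  # buffer capacity at station i
    k: int  # processing batch size of product j


def apply_action(cfg: LineConfig, action: str) -> LineConfig:
    """Return the configuration after one action; infeasible moves are ignored."""
    changes = {
        "a1": {"m": cfg.m + 1}, "a2": {"m": cfg.m - 1},
        "a3": {"b": cfg.b + 1}, "a4": {"b": cfg.b - 1},
        "a5": {"k": cfg.k + 1}, "a6": {"k": cfg.k - 1},
    }
    new = replace(cfg, **changes[action])
    # Assumed feasibility rule: b > m > 1 and k >= 1.
    feasible = new.b > new.m > 1 and new.k >= 1
    return new if feasible else cfg
```

For example, `apply_action(LineConfig(m=2, b=4, k=3), "a1")` adds one parallel machine, while `"a2"` on the same configuration is ignored because it would leave a single machine.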
Step 4.3: Given a starting state s, select action a in state s according to the ε-greedy policy.
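As a reminder of what ε-greedy selection does, the sketch below explores with probability ε and otherwise picks the action with the highest Q value, breaking ties at random. It uses a fixed ε as a placeholder: the patent states that it improves how ε is chosen, but that rule is not given in this section, so it is not reproduced here. The function name `epsilon_greedy` is an assumption.

```python
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """Return an action index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))          # explore: uniform random action
    best = max(q_row)
    # Exploit: choose among the best-valued actions, breaking ties at random.
    return rng.choice([i for i, q in enumerate(q_row) if q == best])
```

Here `q_row` is the list of Q values for the current state, one entry per action in A(s).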
Step 4.4: Select action a in state s according to the ε-greedy policy, where b is the selection index of a; observe the reward r and the next state s_next, let a_next denote the next action, and update the Q value:
s = s_next, a = a_next
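The update formula that Step 4.4 refers to is not present in this text. As a hedged illustration only, the textbook Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max over a' of Q(s_next, a') - Q(s, a)], can be sketched as below with α = 0.1 and γ = 0.9 from Step 4.2. The patent's own formula may differ; its mention of a_next suggests it may be written in terms of the next action. The names `make_q_table` and `q_update` are assumptions.

```python
from collections import defaultdict

ALPHA = 0.1      # learning rate from Step 4.2
GAMMA = 0.9      # discount factor from Step 4.2
N_ACTIONS = 6    # actions a1..a6


def make_q_table():
    """Q table mapping each state to a list of action values, initialized to 0."""
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def q_update(q, s, a, r, s_next, alpha=ALPHA, gamma=GAMMA):
    """Apply one textbook Q-learning update and return the new Q(s, a)."""
    td_target = r + gamma * max(q[s_next])       # bootstrapped one-step target
    q[s][a] += alpha * (td_target - q[s][a])     # move Q(s, a) toward the target
    return q[s][a]
```

After the update, the assignment s = s_next advances the agent to the next state before the next selection.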
Step 4.5: Return to Step 4.4 until the system settles into a steady state, that is, until it converges.
Step 4.6: Repeat Steps 4.2 to 4.5 until the learning cycle ends (the learning cycle is the preset number of times Steps 4.2 to 4.5 are repeated), then stop iterating.
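The nesting of Steps 4.2 to 4.6 (an inner loop that runs until convergence, inside an outer loop with a preset number of learning cycles) can be sketched as below. Several choices here are assumptions of the sketch rather than details from the patent: the Q table is initialized once instead of at every cycle, "steady state" is approximated by a run of `patience` consecutive updates smaller than `tol`, the update is the textbook Q-learning form, and `toy_env` is a hypothetical three-state chain standing in for the line performance prediction model.

```python
import random


def train(step_env, states, n_actions, cycles=200, max_steps=200,
          alpha=0.1, gamma=0.9, epsilon=0.1, tol=1e-4, patience=20, seed=0):
    """Tabular Q-learning with an inner convergence loop and an outer cycle count."""
    rng = random.Random(seed)
    q = {s: [0.0] * n_actions for s in states}          # Step 4.2 (done once here)
    for _ in range(cycles):                              # Step 4.6: learning cycles
        s = rng.choice(states)                           # Step 4.3: starting state
        quiet = 0
        for _ in range(max_steps):                       # Steps 4.4 and 4.5
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: q[s][i])
            r, s_next = step_env(s, a)
            delta = alpha * (r + gamma * max(q[s_next]) - q[s][a])
            q[s][a] += delta
            s = s_next
            quiet = quiet + 1 if abs(delta) < tol else 0
            if quiet >= patience:                        # crude steady-state test
                break
    # Greedy policy read off the learned table.
    return {s: max(range(n_actions), key=lambda i: q[s][i]) for s in states}


def toy_env(s, a):
    """Hypothetical 3-state chain: action 1 moves up, action 0 moves down."""
    s_next = min(2, s + 1) if a == 1 else max(0, s - 1)
    return (1.0 if s_next == 2 else 0.0), s_next


policy = train(toy_env, [0, 1, 2], n_actions=2, epsilon=1.0)
```

In the toy chain the reward is earned only by arriving at the top state, so the learned greedy policy is to move up from every state.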
Step 4.7: Output the final policy and obtain the optimized performance indicators of the production line. Figures 7 and 8 show the production line performance indicators before and after performance control at variability levels CV1 and CV2, respectively.
In summary, the present invention establishes a more accurate performance prediction model for series-parallel semiconductor packaging and testing production lines. It combines the Morris screening method with Arena simulation to carry out a quantitative global sensitivity analysis, identifying the factors that most strongly affect line performance and how they act; this sidesteps the case where the equipment Markov state space is too large for traditional analytical models to apply. It also improves how the parameter ε is chosen, so that the algorithm converges faster and avoids local optima while offering better flexibility and real-time responsiveness.
Claims (5)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010797879.2A (CN111857081B) | 2020-08-10 | 2020-08-10 | Performance control method of chip packaging and testing production line based on Q-learning reinforcement learning |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111857081A (en) | 2020-10-30 |
| CN111857081B (en) | 2023-05-05 |
Family
ID=72971238
Family Applications (1)
| Application Number | Status | Priority Date | Filing Date | Title |
|---|---|---|---|---|
| CN202010797879.2A (CN111857081B) | Expired - Fee Related | 2020-08-10 | 2020-08-10 | Performance control method of chip packaging and testing production line based on Q-learning reinforcement learning |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111857081B (en) |
Legal Events
| Code | Title | Description |
|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20230505 |