WO2024224504A1 - Verification device, verification method, and program - Google Patents
- Publication number
- WO2024224504A1 (application PCT/JP2023/016420)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- output data
- verification
- group
- machine learning
- ais
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Definitions
- the present invention relates to a technology for verifying the identity of multiple AIs (hereinafter also referred to as machine learning models).
- Digital signatures (see Non-Patent Document 1) are known as a conventional technology for verifying identity.
- Some AIs continuously learn and change in order to adapt to the external environment and improve their accuracy.
- With the technology of Non-Patent Document 1, it is possible to verify the identity of completely identical AIs, but for AIs that continuously learn and change, it is not possible to determine that the AI before and after the change is the same AI.
- For AIs that continuously learn and change, there is a demand to determine that the AI before and after the change is the same AI if the AI changes within the range expected by the AI creator or user, and to determine that the AI is not the same AI if the AI changes beyond the expected range due to erroneous learning, disguise, or tampering.
- The problem with AIs that continuously learn and change is that, because they keep changing, it is difficult to notice whether a change beyond the expected range has occurred.
- the present invention aims to provide a verification device, verification method, and program that can verify the identity of AI that continuously learns and changes.
- a verification device includes an acquisition unit that acquires a first output data group obtained when input data is provided to a group of machine learning models considered to be identical, and a second output data obtained when the input data is provided to the machine learning model to be verified, and a verification unit that uses the first output data group and the second output data to verify whether the machine learning model to be verified is identical to the group of machine learning models through statistical hypothesis testing.
- the present invention has the effect of being able to verify the identity of an AI that is continually learning and changing.
- FIG. 1 is a functional block diagram of a verification device according to the first embodiment.
- FIG. 2 is a diagram showing an example of a processing flow of the verification device according to the first embodiment.
- FIG. 3 is a diagram showing an example of the configuration of a computer to which the present technique is applied.
- FIG. 1 is a functional block diagram of a verification device 100 according to the first embodiment, and FIG. 2 shows the processing flow thereof.
- the verification device 100 includes an acquisition unit 110, a storage unit 120, and a verification unit 130.
- the verification device 100 is a special device configured by loading a special program into a publicly known or dedicated computer having, for example, a central processing unit (CPU), a main memory (RAM), etc.
- the verification device 100 executes each process under the control of, for example, the central processing unit. Data input to the verification device 100 and data obtained in each process are stored, for example, in the main memory, and the data stored in the main memory is read out to the central processing unit as necessary and used for other processes.
- At least a part of each processing unit of the verification device 100 may be configured by hardware such as an integrated circuit.
- Each memory unit provided in the verification device 100 can be configured by, for example, a main memory such as a RAM (Random Access Memory), or middleware such as a relational database or a key-value store.
- each storage unit does not necessarily need to be provided inside the verification device 100, but may be configured as an auxiliary storage device made up of a hard disk, optical disk, or semiconductor memory element such as flash memory, and may be provided outside the verification device 100.
- the group of AIs considered to be identical includes N AIs. N is any integer equal to or greater than 1, and is a sufficient amount for verification.
- the acquisition unit 110 acquires the output data Ym (S110) and outputs it to the verification unit 130.
- a group of AIs considered to be identical refers to a group of AIs within the range of the assumptions of the creator or user of the AI (the range of acceptable changes or differences), and the creator or user of the AI may set the range of assumptions and conditions appropriately according to the purpose.
- a group of AIs considered to be identical may be obtained, for example, by acquiring an AI that continuously learns and changes N times at each predetermined period of time, or by distributing an AI that continuously learns and changes at a certain point in time to N clients and acquiring the changed AI from the N clients after a predetermined period of time has passed.
- a single original AI may be one of the N AIs included in the group of AIs considered to be identical, or it may not be included in the AI group.
- an AI that has changed beyond the range expected by the creator or user of the AI is not included in the group of AIs considered to be identical, even if it has changed from a single original AI.
- the determination of whether or not the change has exceeded the expected range may be performed automatically or manually based on the magnitude relationship between the variable parameters used inside the AI and the threshold value, or the magnitude relationship between the output data of the AI and the threshold value, by setting a threshold value or the like.
- when random parameters are used as the initial values of the parameters inside the AI before learning, multiple AIs that have the same structure but different random parameters can be prepared, and these AIs may be treated as a group of AIs considered to be identical, taking "having the same structure" as the condition for regarding AIs as the same.
- a group of AIs may be obtained by combining these methods. For example, N AIs that have the same structure but different random parameters may be prepared and distributed to N clients, and the changed AIs may be obtained from the N clients after a predetermined period of time has passed.
- the verification result may be information indicating whether the AI to be verified can be considered to be identical to the group of AIs considered to be identical, and may be, for example, a value (e.g., 1) indicating that "the AI to be verified can be considered to be identical to the group of AIs considered to be identical" or a value (e.g., 0) indicating that "the AI to be verified cannot be considered to be identical to the group of AIs considered to be identical".
- the verification method used by the verification unit 130 is explained below.
- the output data Ym and Y are regarded as samples statistically extracted from a certain population, and a statistical hypothesis test is used to examine whether the probability distributions of the populations are different, and the results of the test are used to determine whether the AIs are different as well. In other words, if the probability distributions of the populations are different, the AIs are considered to be different, and if the probability distributions of the populations are the same, the AIs are considered to be the same.
- the statistical hypothesis test used here can verify whether the probability distributions of the populations are different, and examples of such tests include the Kolmogorov-Smirnov test for two samples and the Mann-Whitney U test.
- as another method for verifying the identity of AIs, one could regard the AI as a function, approximate a similarity from the distances between inputs and outputs, and determine, from the magnitude relationship between the similarity and a threshold value, whether the AI to be verified can be considered the same as the group of AIs considered to be identical. In this case, however, it is necessary to set an appropriate threshold value for the similarity.
- $p'$ ranges over $m, 1, 2, \dots, N$ excluding $g$ ($p' \neq g$).
- the output data of that source AI may be selected. If the distance is calculated so that $d_{p',k}$ is a scalar value, the same processing as in the scalar case can be applied (using the distance $d_{p',k}$ in place of the element $y_{p,k}$) to examine, through statistical hypothesis testing, whether the probability distributions of the populations differ, and the test result can be used to determine whether the AIs differ as well.
- the distance can be, for example, the Euclidean distance.
- $y_{p',k,q}$ and $y_{g,k,q}$ are the $q$-th dimension values of the vector-valued elements $y_{p',k}$ and $y_{g,k}$, respectively
- $Q_k$ is the number of dimensions of the vector-valued elements $y_{p',k}$ and $y_{g,k}$.
- the method for calculating the distance is just one example, and other methods may be used as long as they can convert vector values into scalar values.
- the acquisition unit 110 outputs the output data Ym to the verification unit 130, but when verification is not performed immediately after or simultaneously with receiving the output data Ym , the output data Ym may be stored in the storage unit 120 until verification is performed. In this case, the verification unit 130 may take out the output data Ym from the storage unit 120 at the time of verification.
- the present invention is not limited to the above-mentioned embodiment and modified examples.
- the above-mentioned various processes may be executed not only in chronological order as described, but also in parallel or individually depending on the processing capacity of the device executing the processes or as necessary.
- appropriate modifications are possible within the scope of the present invention.
- The various processes described above can be implemented by loading a program that executes each step of the above method into the recording unit 2020 of the computer 2000 shown in FIG. 3, and operating the control unit 2010, input unit 2030, output unit 2040, display unit 2050, etc.
- the program describing this processing can be recorded on a computer-readable recording medium.
- Examples of computer-readable recording media include magnetic recording devices, optical disks, magneto-optical recording media, and semiconductor memories.
- the program may be distributed, for example, by selling, transferring, or lending portable recording media such as DVDs or CD-ROMs on which the program is recorded. Furthermore, the program may be distributed by storing the program in a storage device of a server computer and transferring the program from the server computer to other computers via a network.
- a computer that executes such a program for example, first stores in its own storage device the program recorded on a portable recording medium or the program transferred from a server computer. Then, when executing a process, the computer reads the program stored on its own recording medium and executes the process according to the read program. As another execution form of the program, the computer may read the program directly from the portable recording medium and execute the process according to the program, or may execute the process according to the received program each time a program is transferred from the server computer to the computer.
- the above-mentioned process may also be executed by a so-called ASP (Application Service Provider) type service that does not transfer the program from the server computer to the computer, but realizes the processing function only by issuing an execution instruction and obtaining the results.
- the program in this form includes information used for processing by an electronic computer that is equivalent to a program (such as data that is not a direct command to the computer but has properties that specify the processing of the computer).
- the device is configured by executing a specific program on a computer, but at least a portion of the processing may be realized by hardware.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Debugging And Monitoring (AREA)
Abstract
Description
The present invention relates to a technology for verifying the identity of multiple AIs (hereinafter also referred to as machine learning models).
When using an AI provided by a third party, there is a risk that the AI has been disguised or tampered with, so there is a demand to confirm that the AI actually in use is the same as the intended AI.
Digital signatures (see Non-Patent Document 1) are known as a conventional technology for verifying identity.
Some AIs continuously learn and change in order to adapt to the external environment and improve their accuracy. With the technology of Non-Patent Document 1, it is possible to verify the identity of completely identical AIs, but for AIs that continuously learn and change, the AI before a change and the AI after it cannot be determined to be the same AI. For such AIs, there is a demand to determine that the AI before and after the change is the same AI if it has changed within the range expected by its creator or user, and to determine that it is not the same AI if it has changed beyond the expected range due to erroneous learning, disguise, or tampering. The problem is that, precisely because such an AI keeps changing, it is difficult to notice whether a change beyond the expected range has occurred.
As before, there is also a demand to detect possible disguise of or tampering with an AI. Because an AI that continuously learns keeps changing, disguise or tampering by a third party is hard to notice.
For this reason, for an AI that continuously learns and changes, we want to verify whether it has changed within the expected range, and whether it is the AI that its users and creators expect (that is, whether it has not been disguised or tampered with).
The present invention aims to provide a verification device, verification method, and program that can also verify the identity of an AI that continuously learns and changes.
In order to solve the above problem, according to one aspect of the present invention, a verification device includes an acquisition unit that acquires a first output data group obtained when input data is provided to a group of machine learning models considered to be identical, and second output data obtained when the input data is provided to the machine learning model to be verified, and a verification unit that uses the first output data group and the second output data to verify, through statistical hypothesis testing, whether the machine learning model to be verified is identical to the group of machine learning models.
The present invention has the effect of being able to verify the identity even of an AI that continuously learns and changes.
An embodiment of the present invention is described below. In the drawings used in the following description, components having the same functions and steps performing the same processing are given the same reference numerals, and duplicate explanations are omitted. In the following description, processing performed on an element-by-element basis of a vector or matrix applies to all elements of that vector or matrix, unless otherwise specified.
<Key Points of the First Embodiment>
The distribution of the characteristics of the AIs considered to be identical and the distribution of the characteristics of the AI to be verified are obtained, and identity is verified by using a statistical hypothesis test to examine whether the two distributions differ. Here, the output of an AI for a given input is called a "characteristic of the AI."
Even for an AI that changes, it is possible to confirm whether the AI being verified is the AI the user expects, or whether it has changed in ways not anticipated by the user or creator, by verifying its identity against information about the expected AI obtained in advance.
In this embodiment, not only completely identical AIs but also AIs that fall within the expected range (the range of allowable change) are regarded as the same AI. This configuration makes it easier to keep up with an AI that continuously learns and changes.
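<First Embodiment>
FIG. 1 is a functional block diagram of the verification device 100 according to the first embodiment, and FIG. 2 shows its processing flow.
The verification device 100 includes an acquisition unit 110, a storage unit 120, and a verification unit 130.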
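The verification device 100 takes as input the output data $Y=\{(y_{1,1},\dots,y_{1,K}),\dots,(y_{N,1},\dots,y_{N,K})\}$ obtained when input data $X=(x_1,\dots,x_K)$ is given to a group of AIs considered to be identical, and the output data $Y_m=(y_{m,1},\dots,y_{m,K})$ obtained when the same input data $X$ is given to the AI to be verified. It verifies whether the AI to be verified is identical to the group of AIs considered to be identical, and outputs the verification result.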
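The verification device 100 is a special device configured by loading a special program into a publicly known or dedicated computer having, for example, a central processing unit (CPU) and a main memory (RAM: Random Access Memory). The verification device 100 executes each process under the control of, for example, the central processing unit. Data input to the verification device 100 and data obtained in each process are stored, for example, in the main memory, and are read out to the central processing unit as necessary and used for other processes. At least a part of each processing unit of the verification device 100 may be configured by hardware such as an integrated circuit. Each storage unit provided in the verification device 100 can be configured by, for example, a main memory such as a RAM, or middleware such as a relational database or a key-value store. However, each storage unit does not necessarily need to be provided inside the verification device 100; it may be configured as an auxiliary storage device made up of a hard disk, an optical disk, or a semiconductor memory element such as flash memory, and provided outside the verification device 100.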
Each unit is explained below.
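<Acquisition unit 110 and storage unit 120>
The acquisition unit 110 retrieves the input data $X=(x_1,\dots,x_K)$ stored in advance in a storage unit (not shown) and outputs it to each AI included in the group of AIs considered to be identical. The group includes N AIs, where N is an integer equal to or greater than 1 and is chosen large enough for verification. The n-th AI takes the input data $X$ as input, obtains the output data $(y_{n,1},\dots,y_{n,K})$, and outputs it to the acquisition unit 110. The same processing is performed for $n=1,2,\dots,N$, and the acquisition unit 110 acquires the output data $Y=\{(y_{1,1},\dots,y_{1,K}),\dots,(y_{N,1},\dots,y_{N,K})\}$ (S110) and stores it in the storage unit 120. This processing is performed before the output data $Y_m=(y_{m,1},\dots,y_{m,K})$, obtained by giving the input data $X$ to the AI to be verified, is input.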
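Once the AI to be verified is decided, the acquisition unit 110 outputs the input data $X=(x_1,\dots,x_K)$ to the AI to be verified. The AI to be verified takes the input data $X$ as input, obtains the output data $Y_m=(y_{m,1},\dots,y_{m,K})$, and outputs it to the acquisition unit 110. The acquisition unit 110 acquires the output data $Y_m$ (S110) and outputs it to the verification unit 130.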
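The acquisition flow above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes each AI can be called as a plain Python function from one scalar input to one scalar output, and the names `Model`, `acquire_group_outputs`, `acquire_target_outputs`, and the `storage` dictionary (standing in for the storage unit 120) are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for an AI: maps one input x_k to one scalar output y_k.
Model = Callable[[float], float]


def acquire_group_outputs(models: Sequence[Model], X: Sequence[float]) -> List[List[float]]:
    """Feed the same inputs X to each of the N group models.

    Row n of the result holds (y_{n,1}, ..., y_{n,K}).
    """
    if len(models) < 1:
        raise ValueError("the group must contain at least one model (N >= 1)")
    return [[model(x) for x in X] for model in models]


def acquire_target_outputs(model: Model, X: Sequence[float]) -> List[float]:
    """Feed the same inputs X to the model under verification, giving (y_{m,1}, ..., y_{m,K})."""
    return [model(x) for x in X]


# Toy data: three slightly different linear models play the role of the group.
X = [0.0, 1.0, 2.0]
group = [lambda x, b=b: 2.0 * x + b for b in (0.0, 0.1, -0.1)]

storage: Dict[str, List[List[float]]] = {}        # stands in for storage unit 120
storage["Y"] = acquire_group_outputs(group, X)    # collected ahead of time
Ym = acquire_target_outputs(lambda x: 2.0 * x + 0.05, X)  # collected once the target is known
```

Keeping `Y` in storage and passing `Ym` straight on mirrors the order of operations in the text: the group outputs are gathered before any target model is presented.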
Here, a group of AIs considered to be identical refers to a group of AIs within the range assumed by the creator or user of the AI (the range of acceptable changes or differences), and the creator or user may set this range and its conditions as appropriate for the purpose. Such a group may be obtained, for example, by acquiring an AI that continuously learns and changes N times, once per predetermined period, or by distributing an AI that continuously learns and changes, as it stands at a certain point in time, to N clients and acquiring the changed AIs from the N clients after a predetermined period has passed. The single original AI may be one of the N AIs included in the group, or it may be left out of the group. An AI that has changed beyond the range expected by the creator or user is not included in the group, even if it has changed from the single original AI. Whether a change has exceeded the expected range may be determined automatically, by setting a threshold or the like and comparing it with the changing parameters used inside the AI or with the output data of the AI, or it may be determined manually. In addition, when random parameters are used as the initial values of the parameters inside the AI before learning, multiple AIs that have the same structure but different random parameters can be prepared; these AIs may be treated as a group of AIs considered to be identical, taking "having the same structure" as the condition for regarding AIs as the same. Furthermore, a group of AIs may be obtained by combining these methods. For example, N AIs that have the same structure but different random parameters may be prepared and distributed to N clients, and the changed AIs may be obtained from the N clients after a predetermined period has passed.
<Verification unit 130>
The verification unit 130 takes the output data $Y_m=(y_{m,1},\dots,y_{m,K})$ as input, retrieves the output data $Y=\{(y_{1,1},\dots,y_{1,K}),\dots,(y_{N,1},\dots,y_{N,K})\}$ from the storage unit 120, verifies by statistical hypothesis testing whether the AI to be verified is identical to the group of AIs considered to be identical (S130), and outputs the verification result. The verification result may be any information indicating whether the AI to be verified can be regarded as identical to the group; for example, it is a value (e.g., 1) indicating that "the AI to be verified can be regarded as identical to the group of AIs considered to be identical" or a value (e.g., 0) indicating that "the AI to be verified cannot be regarded as identical to the group of AIs considered to be identical".
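One way to picture the interface of the verification unit is a function that pools the stored group outputs, runs a two-sample test, and maps the outcome to 1 or 0. The sketch below rests on several assumptions that the source does not state: the test is passed in as a callable returning a p-value, all N×K group outputs are pooled into one sample, and a significance level `alpha` of 0.05 is used. The names `TwoSampleTest` and `verify` are hypothetical.

```python
from typing import Callable, List, Sequence

# Assumed shape of a test: two scalar samples in, a p-value in [0, 1] out.
TwoSampleTest = Callable[[Sequence[float], Sequence[float]], float]


def verify(Y: Sequence[Sequence[float]], Ym: Sequence[float],
           test: TwoSampleTest, alpha: float = 0.05) -> int:
    """Return 1 if the target can be regarded as identical to the group, else 0.

    alpha is an assumed significance level; the source does not specify one.
    """
    # Pool every output of every group model into a single sample (an assumption).
    group_sample: List[float] = [y for row in Y for y in row]
    p_value = test(group_sample, list(Ym))
    # Failing to reject "same distribution" is treated as "regarded as identical".
    return 1 if p_value >= alpha else 0
```

Passing the test in as a parameter keeps the choice of hypothesis test separate from the 1/0 encoding of the result.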
The verification method used by the verification unit 130 is explained below.
(1) When the elements $y_{p,k}$ of the output data $Y_m$ or $Y$ ($p$ is one of $m, 1, 2, \dots, N$, and $k=1,2,\dots,K$) are scalar values, the output data $Y_m$ and $Y$ are each regarded as a sample drawn from some population, a statistical hypothesis test is used to examine whether the probability distributions of the two populations differ, and the result of the test is used to judge whether the AIs differ as well. In other words, if the probability distributions of the populations differ, the AIs are regarded as different, and if they are the same, the AIs are regarded as identical. Any statistical hypothesis test that can examine whether the probability distributions of two populations differ may be used here; examples include the two-sample Kolmogorov-Smirnov test and the Mann-Whitney U test. As another method for verifying the identity of AIs, one could regard the AI as a function, approximate a similarity from the distances between inputs and outputs, and determine, from the magnitude relationship between the similarity and a threshold value, whether the AI to be verified can be regarded as identical to the group of AIs considered to be identical; in that case, however, an appropriate threshold for the similarity must be set. In contrast, statistical hypothesis testing does not require a separate, appropriately chosen threshold, and whether the population distributions differ can be examined with conventional methods.
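As an illustration of the scalar case, the following is a minimal pure-Python sketch of a two-sample Kolmogorov-Smirnov test. It is not taken from the source: the function name `ks_two_sample` is hypothetical, the p-value uses the standard large-sample (asymptotic) approximation, and pooling the N×K group outputs into one sample is an assumption. In practice a library routine such as SciPy's `scipy.stats.ks_2samp` or `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math
from bisect import bisect_right
from typing import Sequence, Tuple


def ks_two_sample(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Two-sample Kolmogorov-Smirnov test; returns (D, approximate p-value)."""
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both samples must be non-empty")
    sa, sb = sorted(a), sorted(b)
    n, m = len(sa), len(sb)
    # D: largest gap between the two empirical CDFs, checked at every observed value.
    d = max(abs(bisect_right(sa, v) / n - bisect_right(sb, v) / m) for v in sa + sb)
    # Large-sample approximation of the p-value via the Kolmogorov distribution.
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:  # the series below converges slowly here, and the p-value is ~1
        return d, 1.0
    p = 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2) for j in range(1, 101))
    return d, min(1.0, max(0.0, p))


# Pooled group outputs (N*K scalars) versus the K outputs of the model under test.
group_sample = [0.98, 1.01, 1.00, 0.99, 1.02, 1.00, 0.97, 1.03]
target_sample = [1.00, 0.99, 1.01, 1.02]
d_stat, p_value = ks_two_sample(group_sample, target_sample)
```

A small p-value suggests that the two samples come from different distributions, which under the scheme above would be read as "not the same AI". With samples this small, the asymptotic p-value is only a rough guide.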
(2) When the elements $y_{p,k}$ of the output data $Y_m$ or $Y$ are vectors (multidimensional), the output data $(y_{g,1},\dots,y_{g,K})$ of an arbitrary AI ($g$ is one of $1,2,\dots,N$) is selected from the output data $Y=\{(y_{1,1},\dots,y_{1,K}),\dots,(y_{N,1},\dots,y_{N,K})\}$, and the distance $d_{p',k}$ between the element $y_{p',k}$ and the reference output $y_{g,k}$ is calculated. Here, $p'$ ranges over $m, 1, 2, \dots, N$ excluding $g$ ($p' \neq g$). For example, if there is one AI that is the source of the other N-1 AIs in the group of AIs considered to be identical, the output data of that source AI may be selected. If the distance is calculated so that $d_{p',k}$ is a scalar value, the same processing as in the scalar case can be applied (using the distance $d_{p',k}$ in place of the element $y_{p,k}$) to examine, through statistical hypothesis testing, whether the probability distributions of the populations differ, and the test result can be used to judge whether the AIs differ as well. One example of such a distance is the Euclidean distance.
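The distance formula itself does not appear in this text. Assuming the usual definition of the Euclidean distance and the symbols defined around it, it would presumably read:

```latex
d_{p',k} = \sqrt{\sum_{q=1}^{Q_k} \left( y_{p',k,q} - y_{g,k,q} \right)^{2}}
```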
Here, $y_{p',k,q}$ and $y_{g,k,q}$ are the $q$-th dimension values of the vector-valued elements $y_{p',k}$ and $y_{g,k}$, respectively, and $Q_k$ is the number of dimensions of the vector-valued elements $y_{p',k}$ and $y_{g,k}$. Note that this way of calculating the distance is just one example, and other methods may be used as long as they convert vector values into scalar values.
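The reduction from vector outputs to scalar distances can be sketched as below. This is an illustrative reading of the text, not the patent's code: the names `euclidean` and `distance_samples` are hypothetical, the reference index `g` is 0-based, and the sketch requires at least two group AIs so that the group sample is not empty once the reference is excluded.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(u: Vector, v: Vector) -> float:
    """Euclidean distance between two vectors of equal dimension Q_k."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_samples(Y: Sequence[Sequence[Vector]], Ym: Sequence[Vector],
                     g: int) -> Tuple[List[float], List[float]]:
    """Convert vector outputs into scalar distances from reference AI g.

    Returns (group distances for every p' != g, target distances for m).
    """
    if len(Y) < 2:
        raise ValueError("need at least two group AIs when one serves as the reference")
    ref = Y[g]
    K = len(ref)
    group = [euclidean(row[k], ref[k])
             for p, row in enumerate(Y) if p != g
             for k in range(K)]
    target = [euclidean(Ym[k], ref[k]) for k in range(K)]
    return group, target


# Two group AIs, K = 2 inputs, 2-dimensional outputs; AI 0 is the reference.
Y = [[(0.0, 0.0), (1.0, 1.0)],
     [(3.0, 4.0), (1.0, 1.0)]]
Ym = [(0.0, 0.0), (4.0, 5.0)]
group_d, target_d = distance_samples(Y, Ym, g=0)
```

The two returned lists are ordinary scalar samples, so they can be handed to any two-sample test in the same way as scalar outputs.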
<Effects>
With the above configuration, the identity even of an AI that continuously learns and changes can be verified from the output data alone, without referring to the internal structure or internal parameters of the AI.
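<Modification>
In this embodiment, the acquisition unit 110 stores the output data $Y=\{(y_{1,1},\dots,y_{1,K}),\dots,(y_{N,1},\dots,y_{N,K})\}$ in the storage unit 120. However, when verification is performed immediately after, or at the same time as, the output data $Y$ is received, the output data $Y$ may be output directly to the verification unit 130 without going through the storage unit 120. Likewise, the acquisition unit 110 outputs the output data $Y_m$ to the verification unit 130, but when verification is not performed immediately after, or at the same time as, the output data $Y_m$ is received, $Y_m$ may be stored in the storage unit 120 until verification is performed. In this case, the verification unit 130 retrieves the output data $Y_m$ from the storage unit 120 at the time of verification.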
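In this embodiment, the acquisition unit 110 outputs the input data $X=(x_1,\dots,x_K)$ to each AI included in the group of AIs considered to be identical and to the AI to be verified. This processing may instead be performed by an external device, in which case the acquisition unit 110 only acquires the output data $Y$ and $Y_m$ and stores or outputs them.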
<Other Modifications>
The present invention is not limited to the above embodiment and modifications. For example, the various processes described above may be executed not only in chronological order as described, but also in parallel or individually, depending on the processing capacity of the device executing them or as necessary. Other modifications may be made as appropriate without departing from the spirit of the present invention.
<Program and recording medium>
The various processes described above can be implemented by loading a program that executes each step of the above method into the recording unit 2020 of the computer 2000 shown in FIG. 3, and operating the control unit 2010, the input unit 2030, the output unit 2040, the display unit 2050, and so on.
The program describing this processing can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example a magnetic recording device, an optical disk, a magneto-optical recording medium, or a semiconductor memory.
The program is distributed, for example, by selling, transferring, or lending portable recording media such as DVDs or CD-ROMs on which the program is recorded. The program may also be distributed by storing it in a storage device of a server computer and transferring it from the server computer to other computers via a network.
A computer that executes such a program, for example, first stores in its own storage device the program recorded on a portable recording medium or the program transferred from a server computer. Then, when executing a process, the computer reads the program stored in its own storage device and executes the process according to the read program. As another execution form, the computer may read the program directly from the portable recording medium and execute the process according to it, or it may execute the process according to the received program each time a program is transferred to it from the server computer. The above processing may also be executed by a so-called ASP (Application Service Provider) type service that does not transfer the program from the server computer to the computer, but realizes the processing function only by issuing execution instructions and obtaining the results. Note that the program in this embodiment includes information that is used for processing by an electronic computer and is equivalent to a program (such as data that is not a direct command to the computer but has properties that define the processing of the computer).
In this embodiment, the device is configured by executing a predetermined program on a computer, but at least a portion of the processing may be realized in hardware.
Claims (3)
A verification device comprising:
an acquisition unit that acquires a first output data group obtained when input data is provided to a group of machine learning models that are considered to be identical, and second output data obtained when the input data is provided to a machine learning model to be verified; and
a verification unit that verifies, by statistical hypothesis testing using the first output data group and the second output data, whether the machine learning model to be verified is identical to the group of machine learning models.
A verification method comprising:
an acquisition step in which an acquisition unit acquires a first output data group obtained when input data is provided to a group of machine learning models that are considered to be identical, and second output data obtained when the input data is provided to a machine learning model to be verified; and
a verification step in which a verification unit verifies, by statistical hypothesis testing using the first output data group and the second output data, whether the machine learning model to be verified is identical to the group of machine learning models.
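As an illustration only, the acquisition and verification steps recited in the claims can be sketched as follows. The claims do not fix a particular statistical hypothesis test, so this sketch assumes a rank-based (permutation-style) test: each output is scored by its Euclidean distance to the mean of all other outputs, and the p-value is the fraction of pooled scores at least as large as the score of the model to be verified. The names `verify_identity`, `_distance_to_rest`, and `alpha`, the distance measure, and the sample data are all hypothetical and do not come from the specification.

```python
import math
from statistics import fmean


def _distance_to_rest(index, outputs):
    """Euclidean distance from outputs[index] to the mean of all other outputs."""
    rest = [o for i, o in enumerate(outputs) if i != index]
    center = [fmean(column) for column in zip(*rest)]
    return math.dist(outputs[index], center)


def verify_identity(first_output_group, second_output, alpha=0.05):
    """Rank-based test of whether the verified model belongs to the group.

    first_output_group: output vectors of the models considered identical.
    second_output: output vector of the model to be verified (same input data).
    Returns (p_value, is_identical). The null hypothesis is that the verified
    model's output is exchangeable with the group's outputs.
    """
    # Pool all outputs so every score is computed in exactly the same way.
    pooled = list(first_output_group) + [second_output]
    scores = [_distance_to_rest(i, pooled) for i in range(len(pooled))]
    target_score = scores[-1]
    # Fraction of pooled scores at least as extreme as the target's score.
    p_value = sum(s >= target_score for s in scores) / len(scores)
    return p_value, p_value >= alpha


# Hypothetical data: 30 reference models whose outputs drift slightly.
reference = [[0.70 + 0.001 * k, 0.30 - 0.001 * k] for k in range(-15, 15)]

# A model whose output lies inside the expected drift is accepted.
p_same, ok_same = verify_identity(reference, [0.701, 0.299])

# A model whose output lies far outside the drift is rejected.
p_diff, ok_diff = verify_identity(reference, [0.20, 0.80])
```

With n reference models the smallest attainable p-value in this sketch is 1/(n + 1), so a significance level of 0.05 needs at least 19 reference models before any model can be rejected.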
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2023/016420 WO2024224504A1 (en) | 2023-04-26 | 2023-04-26 | Verification device, verification method, and program |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024224504A1 (en) | 2024-10-31 |
Family
ID=93256148
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2023/016420 Pending WO2024224504A1 (en) | 2023-04-26 | 2023-04-26 | Verification device, verification method, and program |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2024224504A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2018182442A1 (en) * | 2017-03-27 | 2018-10-04 | Huawei Technologies Co., Ltd. | Machine learning system and method for generating a decision stream and automonously operating device using the decision stream |
| WO2020194509A1 (en) * | 2019-03-26 | 2020-10-01 | 三菱電機株式会社 | Reliability assessment device and reliability assessment method |
| US20230034136A1 (en) * | 2021-07-30 | 2023-02-02 | Kabushiki Kaisha Toshiba | System and method for scheduling communication within a distributed learning and deployment framework |
Non-Patent Citations (1)
| Title |
|---|
| SUZUKI RYOHEI, ASHIZAWA NAMI, KIRIBUCHI NAOTO: "A Study on the Identification of AI", 2022 JAPANESE SOCIETY FOR ARTIFICIAL INTELLIGENCE NATIONAL CONFERENCE (36TH), JAPANESE SOCIETY FOR ARTIFICIAL INTELLIGENCE, JP, vol. 35, 14 June 2022 (2022-06-14) - 17 June 2022 (2022-06-17), JP, pages 1 - 3, XP093228404 * |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US12056583B2 (en) | Target variable distribution-based acceptance of machine learning test data sets | |
| EP3924858B1 (en) | Efficient access of chainable records | |
| CN109657499A (en) | Metadata validation method, system server and computer readable storage medium | |
| WO2018179765A1 (en) | Information processing device, information processing method, and computer-readable storage medium | |
| CN111222176B (en) | Blockchain-based cloud storage possession proof method, system and medium | |
| CN108875061A (en) | A kind of conformance test method and relevant apparatus of distributed file system | |
| US8347052B2 (en) | Initializing of a memory area | |
| CN112286457B (en) | Object deduplication method and device, electronic equipment and machine-readable storage medium | |
| CN117955730A (en) | Identity authentication method, product, equipment and medium | |
| WO2023051308A1 (en) | Data verification method and apparatus, device and storage medium | |
| CN112099870B (en) | Document processing method, device, electronic equipment and computer readable storage medium | |
| US7685211B2 (en) | Deterministic file content generation of seed-based files | |
| CN105354506B (en) | The method and apparatus of hidden file | |
| WO2024224504A1 (en) | Verification device, verification method, and program | |
| CN117331956A (en) | Task processing method, device, computer equipment and storage medium | |
| CN117708605A (en) | A method for confirming the rights of data resources | |
| KR101850650B1 (en) | Portable storage device perfoming a ransomeware detection and method for the same | |
| CN117033172A (en) | Test data processing method, apparatus, device, storage medium and program product | |
| CN119156610A (en) | Boot code transparency system | |
| CN115270766A (en) | A data quality verification method for long text extraction results | |
| US20190294820A1 (en) | Converting plaintext values to pseudonyms using a hash function | |
| US12284296B2 (en) | Method of managing data history and device performing the same | |
| WO2025177389A1 (en) | Verification device, verfication method and program | |
| WO2025177388A1 (en) | Verification device, verfication method and program | |
| CN120124106B (en) | Incremental desensitization method and system for heterogeneous data sources |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23935284; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2025516370; Country of ref document: JP; Kind code of ref document: A |
| | WWE | WIPO information: entry into national phase | Ref document number: 2025516370; Country of ref document: JP |