1 Introduction

In recent years, privacy-preserving computation (PPC) methods have played a significant role in enabling the computation of a function by two or more mutually distrusting parties over their respective private datasets. PPCs can be used to (effectively) enrich, diversify, or enlarge the available data and thereby obtain improved, more representative, or otherwise unattainable results from a wide range of computations, e.g., statistics, machine learning (ML), or risk modeling.

Due to their potential to allow for such collaborations without losing confidentiality, PPC methods have been adopted in many domains, from healthcare [15] to auctions [6] and finance [21]. Owing to recent advances in the field, we even observe usage of PPC schemes for data-intensive applications such as neural network inference [145].

At the same time, (homomorphic) message authentication codes (MACs), trusted execution environments (TEEs), and ZKPs allow honest parties to verify computations executed by other untrusted parties. These techniques can be applied when parties are mutually distrusting, or when auditability by external parties or the general public is required. Most notably, the introduction of efficient ZKPs has led to a wide range of verifiable computing (VC) applications, ranging from verifiable outsourcing [84, 129] to anonymous payment systems [29, 38], and publicly verifiable voting [94, 123].

Whilst PPC methods almost always provide clear guarantees regarding privacy of data and/or computation within a certain security model, they often do not, by themselves, guarantee data authenticity or — more importantly — offer verifiability of computation. Recent advances, in both verifiability and PPC, have led to schemes that combine a PPC method with a certain verifiability approach to enable verifiable, privacy-preserving computations (VPPCs) between distrusting parties. These solutions are particularly relevant when not only input privacy, but also the correctness of a computation’s results, is at stake, e.g., in computation outsourcing [129], decentralized applications [39], or e-voting [116].

The combination of privacy-preserving and verifiable computations is needed in a wide range of settings, especially when private data is distributed over mutually distrusting parties. When a group of parties wants to collaboratively compute a function over their private data, but does not have sufficient computational resources to do so, outsourcing the computation (to the cloud) has become common practice. However, in addition to privacy concerns, the parties have no guarantee that the computational results are actually correct. By using VPPC solutions, all parties can be sure that their data remains private and that their results are correct, thereby increasing trust in outsourced computations.

An even more prevalent setting these days is one where an external party or the general public needs to be able to verify the results of a privacy-preserving computation. In electronic voting, auctions, and many distributed ledger technology (DLT) applications, the public, i.e., a group of people that need not be known a priori, should be able to verify correctness to ascertain honest election results, auction outcomes, and so on.

Due to the diversity in both PPC and verifiability techniques, we observe a wide range of VPPC schemes, making analyses and comparisons a non-trivial task. In this work, we aim to answer the following questions to better understand the differences between different VPPC schemes and how they can be applied in practice:

  1. Which classes of VPPC schemes are there? In which use cases or settings are they most applicable?

  2. Under which conditions do privacy and verifiability hold? Which schemes provide public verifiability?

  3. How efficient/suitable are the schemes for practical usage? Are there any other limiting factors?

  4. Which open problems are still to be solved? How can current solutions be improved upon?

Contributions

In this work, we first determine the most interesting VPPC schemes and classify them into four main classes. This is followed by a high-level comparison and discussion of underexposed topics. Next, we analyze and compare the different methods, by dividing the solutions per construction method. Finally, we describe open challenges for the different solutions and discuss the most promising constructions and future research directions. To summarize, our contributions are as follows:

  • We identify and classify 41 existing solutions from academic literature into four main classes. We further divide each class based on the schemes’ distinguishing primitives.

  • We compare these works based on the settings in which privacy and verifiability are provided. Moreover, we study the efficiency of the different schemes, and discuss practical aspects such as public verifiability and use case.

  • We provide formal, overarching definitions for generic VPPC schemes. These definitions combine and generalize individual definitions for instantiations based on specific building block types.

  • We compare the different construction methods for VPPC schemes and discuss which ones seem most promising for practical usage. Next to this, we denote open challenges, improvement areas, and future research directions for the different solutions, regarding security, efficiency, or practicality.

Organization

The remainder of this work is organized as follows. Section 2 discusses preliminaries regarding verifiability approaches and Section 3 provides relevant background information regarding (non-verifiable) PPCs. We classify a large number of existing VPPC schemes in Section 4 and present formal, generalizing security definitions, as well as a high-level comparison, in Section 5. Each solution (paradigm) and its challenges are discussed in detail, per class, in Sections 8 to 9. In Section 10 we use the aforementioned discussion to define promising directions for future research. Finally, we conclude in Section 11.

2 Preliminaries

This section provides relevant background on the three methods for (public) verifiability that are used to construct the significant majority of VPPC schemes: ZKPs, homomorphic MACs and TEEs.

2.1 Zero-knowledge proofs

A zero-knowledge proof (ZKP) allows a prover to prove the truth value of a certain relation to a (group of) verifier(s), whilst hiding all private information [75]. This is useful in cases where the truth value of a claim is not evident, but the prover does hold private information that is sufficient to construct a proof thereof. Concretely, given some public values (or statement) x and some secret values (or witness) w, a prover can use a ZKP to show that (x, w) satisfies some relation R. For example, given the public output y of a hash function H, the prover can show that there exists a secret pre-image x that satisfies \(y=H(x)\).
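To make the statement/witness terminology concrete, the hash pre-image relation from the example can be written as a simple predicate. The sketch below only encodes the relation R itself, not a proof system; an actual ZKP would let the prover convince a verifier that a satisfying witness exists without revealing it.

```python
import hashlib

def relation(statement: bytes, witness: bytes) -> bool:
    """R(x, w): the witness w is a SHA-256 pre-image of the public statement x."""
    return hashlib.sha256(witness).digest() == statement

# The prover holds a secret pre-image w; only the statement y = H(w) is public.
w = b"my secret value"
y = hashlib.sha256(w).digest()

assert relation(y, w)             # an honest prover satisfies R
assert not relation(y, b"guess")  # a wrong witness does not
```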

A ZKP scheme should at least satisfy the following (informal) properties [35].

  • Completeness: An honest prover should be able to convince an honest verifier.

  • Soundness: A dishonest prover should not be able to convince an honest verifier.

  • Zero-Knowledge: Given the statement x, the proof should reveal nothing other than the truth value of x for the relation R. In particular, nothing should be revealed concerning the witness w.

In this work, we mostly consider schemes satisfying a stronger notion of soundness, namely knowledge soundness, which not only guarantees that there exists a witness w such that (x, w) satisfies R, but also that the prover knows w. Given our hash function example, this guarantees that the prover actually knows the pre-image of a given hash y, rather than only guaranteeing its existence. We call such proofs ZKPs of knowledge, but for brevity use ZKP in the remainder of this work.

Initially, most ZKP schemes were solutions for specific problems, mostly in the form of \(\varSigma \)-protocols, e.g., Schnorr’s protocol [126]. While these schemes could be used to create proofs for arbitrary NP-relations, they often have a proof size and verification time that scale linearly (or worse) in the computation size. On top of that, \(\varSigma \)-protocols are interactive, which is often undesired. Fortunately, \(\varSigma \)-protocols can be made non-interactive using the Fiat-Shamir (FS) heuristic [66], which additionally enables public verifiability, and can be proved secure in the random oracle model.
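As an illustration of both concepts, the sketch below implements Schnorr’s \(\varSigma \)-protocol for proving knowledge of a discrete logarithm, made non-interactive with the Fiat-Shamir heuristic: the verifier’s challenge is replaced by a hash of the transcript. The group parameters are toy values chosen for readability, not security.

```python
import hashlib
import secrets

# Toy parameters: g = 4 generates the subgroup of order q = 11 in Z_23^*.
# Real deployments use groups of roughly 256-bit order.
p, q, g = 23, 11, 4

def fs_challenge(*elems) -> int:
    """Fiat-Shamir: derive the challenge by hashing the transcript so far."""
    data = b"|".join(str(e).encode() for e in elems)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def prove(w: int):
    """Non-interactive proof of knowledge of w such that h = g^w mod p."""
    h = pow(g, w, p)
    r = secrets.randbelow(q)   # prover's random nonce
    a = pow(g, r, p)           # commitment (first Sigma-protocol message)
    c = fs_challenge(g, h, a)  # hash replaces the verifier's challenge
    z = (r + c * w) % q        # response
    return h, (a, z)

def verify(h: int, proof) -> bool:
    a, z = proof
    c = fs_challenge(g, h, a)
    return pow(g, z, p) == (a * pow(h, c, p)) % p  # g^z == a * h^c

h, proof = prove(w=7)
assert verify(h, proof)
```

Because the challenge is recomputable by anyone from public values, the resulting proof is publicly verifiable.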

The introduction of Pinocchio [117] as the first succinct ZKP or zero-knowledge succinct non-interactive argument of knowledge (zk-SNARK) brought us the first scheme with efficient verification and small proof size. Succinctness means that both the proof size — or communication costs in the interactive case — and verification time are \(\textsf {poly}(\kappa , |x|, \log |w|)\), where \(\kappa \) is the security parameter [33, 143]. We stress that |w| depends on the complexity of the relation to be checked, i.e., for verifiable computation with respect to an arithmetic circuit \(\mathcal {C}\), |w| scales as \(|\mathcal {C}|\). Thus, for most schemes the practicality of their performance largely depends on how they scale in |w|.

Moreover, Pinocchio could be used to prove any computation that can be described as an arithmetic circuit, i.e., it supports VC. It was superseded by many schemes with improved efficiency and/or different properties, such as Groth16 [77] and Marlin [50]. The large majority of these zk-SNARKs have constant proof size, and a verification time that is linear only in |x|, i.e., independent of |w|. This makes these schemes very desirable in cases with little storage availability or a large number of proofs to be verified, such as in DLTs. Of course, this does come at a cost: zk-SNARKs often rely on non-standard cryptographic assumptions and a trusted setup, which are not desirable in many use cases. In the context of zk-SNARKs, non-standard assumptions generally refer to so-called knowledge(-of-exponent) assumptions that underlie these efficient ZKPs. These assumptions are non-falsifiable, and therefore harder to have confidence in.

Some more recent zk-SNARKs, like Fractal [51], at least remove the trusted setup, at the cost of proof sizes and verification times that depend on \(\log |w|\), though they remain succinct. An additional advantage of schemes like Fractal is that they are plausibly post-quantum secure. In addition, we note that the trusted setup as used in many zk-SNARKs is often costly, and therefore only pays off when it is re-used multiple times. This is for example the case in situations where the same relation is checked many times, e.g., anonymous cryptocurrencies [29]. Moreover, many recent zk-SNARKs offer a computation-independent, universal setup, thereby greatly increasing the set of applicable use cases.

Another direction for efficient non-interactive zero-knowledge proofs (NIZKs) started with the introduction of Bulletproofs [41]. More recently, the efficient inner-product argument of Bulletproofs has been applied to create \(\varSigma \)-bullets [40], or compressed \(\varSigma \)-protocol theory [13]. Protocols based on the latter can be used to construct ZKPs for arithmetic circuits from standard security assumptions and without trusted setups; that is, they rely on discrete-log(-like) assumptions, and their non-interactive versions can be proved secure in the random-oracle model. However, while these schemes do have succinct (logarithmic) proof sizes, they are not succinct with respect to their often linearly scaling verification time. They remain useful for many use cases where verification time is less important, e.g., when it is dominated by other cryptographic computations inside the same construction.

For a more complete and formal treatment of ZKP schemes, and trust assumptions, we refer the reader to, e.g., [135, 147].

2.2 Homomorphic MACs

A message authentication code (MAC) scheme enables a user to generate a short, unforgeable tag for a message such that, later, any recipient of both tag and message can verify the integrity and authenticity of that message. The tag is computed using a secret key and is verified by checking the MAC against this same secret authentication key.
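For reference, a standard (non-homomorphic) MAC can be computed with HMAC, as sketched below: the same secret key is used to generate and verify the tag, and any modification of the message invalidates it.

```python
import hashlib
import hmac

key = b"shared secret key"
msg = b"transfer 100 coins to Alice"

# The sender tags the message with the secret key.
tag = hmac.new(key, msg, hashlib.sha256).digest()

def verify(key: bytes, msg: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

assert verify(key, msg, tag)
assert not verify(key, b"transfer 100 coins to Bob", tag)  # altered message fails
```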

Regular MACs are non-malleable on purpose, i.e., it is undesirable for parties to be able to alter both message and tag in such a way that verification still succeeds. However, homomorphic MACs deliberately allow for a restricted and pre-defined set of operations to be applied to both message and tag such that verification is still successful.
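A minimal illustration of this idea is a SPDZ-style linear tag \(t = \alpha \cdot m \bmod p\), which is additively homomorphic: the sum of two valid tags is a valid tag for the sum of the messages. The sketch below is a toy version; real schemes additionally randomize or secret-share the tags, since a single known (message, tag) pair would otherwise reveal the key \(\alpha \).

```python
import secrets

P = 2**61 - 1                      # prime modulus
alpha = 1 + secrets.randbelow(P - 1)  # secret MAC key, held by the verifier

def mac(m: int) -> int:
    """Linear tag: t = alpha * m mod P."""
    return (alpha * m) % P

def check(m: int, t: int) -> bool:
    return t == (alpha * m) % P

# The homomorphism: anyone can add messages and tags without knowing alpha.
m1, m2 = 42, 1000
t1, t2 = mac(m1), mac(m2)
m_sum, t_sum = (m1 + m2) % P, (t1 + t2) % P
assert check(m_sum, t_sum)          # valid tag for the derived message
assert not check(m_sum + 1, t_sum)  # a forged message fails verification
```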

2.3 Trusted Execution Environments

TEEs, or secure enclaves/trusted hardware, are dedicated hardware (and software) components that run in an isolated part of the main CPU and are capable of running code while maintaining input privacy and integrity. These days, TEEs are offered by most major hardware vendors as well as a number of open source projects, e.g., Intel SGX [81], AMD SEV [8], or Enarx [65]. Code running on a TEE is isolated in such a way that it cannot be tampered with or accessed by any other process.

A user wishing to use a remote TEE securely can verify that it is running the right code and has been created using safe setup procedures, by using a process known as (remote) attestation, as supported by, e.g., Intel SGX [81]. Whilst this attestation could, in principle, be made publicly verifiable without requiring any trusted hardware on the verifier’s side, it should be stated that it is no trivial task to verify that the code as run by the TEE in fact does what it is intended to do [111]. Moreover, publicly verifiable remote attestation is not provided by all platforms, or might require communication with the TEE while it is still online. For many parties, it would be much too complex to perform an audit of this code, or to guarantee that attestation can still be checked in the future, making this approach not as suitable for public verifiability as other paradigms.

Another possibly undesirable property is that users have to trust that the hardware has not been tampered with or broken in an undetectable manner. There are cases in academic literature where TEE security has been broken; however, due to the novelty of the technology, it is still unclear to what extent such attacks are possible on the most recent TEE hardware [107].

In this work, we focus on the use of TEEs for verifiable privacy-preserving generic computations on distributed data. Admittedly, TEEs can also be used for outsourced computations, distributed computations on data from a single source, or to realize protocols for specific computations. We consider such schemes out of scope for this survey, and refer to [96] for a discussion of these types of schemes.

3 Privacy-preserving computations on distributed data

There are different technologies that can be used to construct schemes for PPC over distributed data. Some are specifically suitable as building blocks for VPPCs: HE, MPC, DLT, and differential privacy (DP). We will see in Section 5.2 that different building blocks are best suited to different use cases.

3.1 Homomorphic encryption

Homomorphic encryption (HE) denotes a collection of (asymmetric) encryption schemes that allow users to perform operations on encrypted data without decrypting it first. In general, an HE scheme is described by four probabilistic polynomial time (p.p.t.) algorithms \((\textsf {KeyGen}, \textsf {Enc}, \textsf {Dec}, \textsf {Eval})\), for key generation, encryption, decryption, and function evaluation.

While there are different ways to represent the function that is to be evaluated on HE ciphertexts, most constructions translate it to an arithmetic circuit. We distinguish, following [10], several types of homomorphic schemes in increasing order of functionality.

Partially homomorphic encryption (PHE) is a type of homomorphic encryption that is either additively or multiplicatively homomorphic, implying that it can only evaluate arithmetic circuits consisting solely of addition gates or multiplication gates, respectively.
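The classic example of an additively homomorphic PHE scheme is Paillier encryption, sketched below with deliberately tiny primes (real deployments use moduli of 2048 bits or more). Multiplying two ciphertexts modulo \(n^2\) yields an encryption of the sum of the plaintexts.

```python
import math
import secrets

# Toy Paillier key generation (insecure parameter sizes, for illustration only).
p_, q_ = 1009, 1013
n = p_ * q_
n2 = n * n
g = n + 1                          # standard choice of generator
lam = math.lcm(p_ - 1, q_ - 1)     # Carmichael function of n
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)

def enc(m: int) -> int:
    """Enc(m) = g^m * r^n mod n^2, with random r coprime to n."""
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c: int) -> int:
    """Dec(c) = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) / n."""
    return (((pow(c, lam, n2) - 1) // n) * mu) % n

# Additive homomorphism: multiplying ciphertexts adds the plaintexts.
c = (enc(123) * enc(456)) % n2
assert dec(c) == 579
```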

Somewhat homomorphic encryption (SWHE) and leveled homomorphic encryption (LHE) constructions can evaluate arithmetic circuits consisting of both addition and multiplication gates. However, the circuits that can be evaluated may be limited in the number of operations (of either type). These schemes can also have practical limitations, such as ciphertexts or key sizes that increase exponentially in the number of multiplications.

Fully homomorphic encryption (FHE) schemes can evaluate arbitrary arithmetic circuits, without exponential blowup of ciphertext or key size. This generality often comes at the cost of computationally expensive bootstrapping after a number of multiplications [73]. We observe that bootstrapping is becoming more practical in recent years, as exemplified by, e.g., TFHE [52].

In the remainder of this work, when we talk about HE we refer to SWHE, LHE and (predominantly) FHE schemes, unless explicitly stated otherwise.

3.2 Secure multiparty computation

MPC is a collection of techniques that allows for performing joint computations over distributed data, while keeping all input data private and revealing nothing but the computation output. There are several ways to construct an MPC protocol or framework, some of which are more frequently used than others.

Generally, n parties participate in an MPC computation, each contributing their respective private inputs. The n parties jointly execute an interactive protocol to compute the function output from their private inputs. Some of these parties might be corrupted by an adversary. Therefore, any secure MPC protocol should at least satisfy: input privacy, correctness, and independence of inputs [97]. Moreover, in the active security setting (see Appendix B), a scheme should also be secure against a certain number of malicious parties, meaning that correctness should hold when these parties deviate from the protocol.

The active security setting is very similar to the setting we consider in this work, i.e., it also focuses on generic computations on distributed data, and offers security against malicious parties. Quite a few actively secure schemes even guarantee correctness when all but one party are malicious. However, in this work, we put an emphasis on verifiability, meaning that we only compare works that specifically focus on providing verifiable correctness guarantees, beyond traditional active security. For the sake of completeness, we do mention a few important schemes that have led up to those discussed in this survey, e.g., SPDZ [61], MASCOT [90], and SPDZ-2k [56]. Many of the works included in this survey have built upon these schemes, making them important developments towards reaching (publicly) verifiable MPC.

In Appendix A we summarize the most frequently used constructions for MPC: secret sharing, garbled circuits (GCs), and PHE-based. Additionally, we note that MPC can also be realized by means of TEEs, as is used by some of the constructions in this survey.
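To illustrate the secret-sharing approach, the sketch below uses plain additive sharing over a prime field: each party splits its input into random shares that sum to the input, the parties locally add the shares they hold, and only the final sum is reconstructed. This toy example omits the communication, correctness checks, and MACs that actual (actively secure) MPC protocols require.

```python
import secrets

P = 2**61 - 1   # all arithmetic is over the field Z_P

def share(x: int, n: int):
    """Split x into n additive shares that sum to x mod P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

# Three parties sum their private inputs without revealing them:
inputs = [500, 1200, 73]
all_shares = [share(x, 3) for x in inputs]
# Party j locally adds the j-th share of every input...
local_sums = [sum(col) % P for col in zip(*all_shares)]
# ...and only the combined result is opened.
assert reconstruct(local_sums) == sum(inputs)
```

Each individual share is uniformly random, so no strict subset of fewer than n shares reveals anything about the underlying input.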

3.3 Distributed ledger technologies

DLTs cover a collection of technologies where a (large) number of parties can collectively create, maintain, and agree on a shared state or ledger. The shared state can be known completely to all parties, but can also be distributed over the respective parties. A well-known example of a DLT is the blockchain, where all parties agree — through a consensus mechanism — on a public ledger describing the full system state. Blockchains are best known for realizing decentralized online currencies such as Bitcoin [108], or smart contract systems like Ethereum [142]. However, DLTs also have applications in many other domains, from identity management [4] (e.g., self-sovereign identity) to healthcare [3]. In particular, we observe the proliferation of decentralized apps, or dApps, each of which can be seen as a blockchain-based autonomous application whose behavior is defined by a collection of scripts, or smart contracts.

In lockstep with an increasing number of dApps, we observe an increase in using DLT for decentralized computations. Such decentralized computations can be used in any situation where multiple parties wish to perform a group computation. A group computation can be: (1) a single computation on distributed data; (2) a decentralized computation on centrally stored data; or (3) a combination of the two. Especially of interest for this work are decentralized computations with input privacy, e.g., [5, 30].

3.4 Differential privacy

In DP methods, sensitive input data, or results obtained from sensitive data, are perturbed such that the underlying input data is hidden. The technique was first formalized in [64] and, while it gives a weaker privacy guarantee than is provided by, e.g., MPC or HE, it also comes with benefits. DP computations are often significantly more efficient than other methods, and the fact that results are not exact, due to the added perturbation, can be used to prevent other types of attacks such as de-anonymization using external sources, or inference attacks on ML models [2, 47].

DP is often divided into two settings: local differential privacy (LDP) and a model with a trusted curator. In LDP, the data owners perturb the data themselves (locally), and subsequently send their DP-data to an (untrusted) curator who computes the final result. In the model with a trusted curator, the parties send their private data directly to the trusted curator, who computes the final result from the data. The final result is only released to the public after adding differentially private noise in order to protect the sensitive input data.
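The simplest LDP mechanism is randomized response: each data owner flips their private bit with a probability governed by the privacy parameter \(\varepsilon \), and the curator de-biases the aggregate. The sketch below is a minimal illustration; note that the estimate is close to, but not exactly, the true mean.

```python
import math
import random

def randomized_response(bit: int, eps: float) -> int:
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it."""
    p_true = math.exp(eps) / (math.exp(eps) + 1)
    return bit if random.random() < p_true else 1 - bit

def estimate_mean(reports, eps: float) -> float:
    """Unbiased estimate of the true fraction of 1-bits from the noisy reports."""
    p = math.exp(eps) / (math.exp(eps) + 1)
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)

random.seed(0)
true_bits = [1] * 3000 + [0] * 7000     # true fraction of ones: 0.30
reports = [randomized_response(b, eps=1.0) for b in true_bits]
est = estimate_mean(reports, eps=1.0)
assert abs(est - 0.30) < 0.05           # close to the true mean, not exact
```

Each report individually satisfies \(\varepsilon \)-LDP, since either bit value is reported with probability at most \(e^{\varepsilon }\) times the probability of the other.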

There have been many variations on these models, including the more recent shuffle model [34, 49], combining DP with MPC [121], or verifiable DP. The latter has a basis in cryptographic randomized response techniques as used in [7], and can be used to realize, e.g., verifiable differentially private polling [106]. In the remainder of this work we focus on verifiable DP for generic applications.

4 Classification

In this section, we first summarize the approach for our literature search and then classify the resulting works in four classes. This is followed by a description of the relevant properties that we have analyzed for each of the works included in our study. These properties are formally defined in Section 5, followed by a comparison of the classes and a discussion of generic, high-level observations.

4.1 Literature search

To obtain a comprehensive list of relevant works we apply the snowballing approach [141]. First, we determined a small set of recent, relevant articles on VPPC schemes. Specifically, we determined at least one very recent ‘seed’ article for different underlying (privacy-preserving) computation techniques, by querying the Google Scholar and Scopus databases. The most recent, at the time of writing, matching verifiable MPC paper [124] was found with the query: (public verifiability OR public auditability) AND (MPC OR SMPC OR multi-party computation OR secure multi-party computation). Two relevant, recent verifiable HE papers [99, 139] were found using: (verifiable OR public verifiability OR public auditability) AND (homomorphic encryption OR fully homomorphic encryption). The most recent, at the time of writing, matching verifiable DP paper [105] was found with the query: (verifiable OR public verifiability OR public auditability) AND (differential privacy). The final seed paper [39] on DLT-based privacy-preserving computations was, up front, known to be recent and relevant, hence no specific query was used.

Subsequently, we checked all papers in the reference list of these ‘seed’ articles to the second degree. Next to this, all articles that referenced our ‘seed’ articles were also checked. Additionally, we included papers that we were pointed to by field experts and checked their reference lists as well. After filtering the articles on their relevance, based on title, abstract, and a quick scan of the content, we found 41 relevant works that present a novel or improved solution for VPPCs.

Table 1 Classification of VPPC \(\hbox {schemes}^\dagger \)

4.2 VPPC classes

Following the literature review, we divide the works into four main classes of VPPC schemes, based on the underlying (privacy-preserving) computation technique.

An overview of our classification is given in Table 1, including the names we use to refer to the specific schemes in the remainder of this work. This table also includes a subdivision of each class based on the distinguishing technique used to achieve verifiability, along with other properties. We consider the following classes:

  • MPC-based VPPC   This class gathers all solutions that rely on — pre-existing, adapted, or novel — MPC schemes for computing functions on distributed private input data.

  • HE-based VPPC   This class gathers all solutions relying on existing HE for computing functions on private data. The computation can be evaluated by multiple parties, but this need not be the case, e.g., when considering verifiable outsourcing.

  • DLT-based VPPC   This class gathers all schemes that rely on DLT for realizing computation and communication between the parties. For most solutions in this class, data is only kept private from outside observers that do not partake in the computation.

  • DP-based VPPC   This class gathers all solutions relying on DP for evaluating functions on private input data.

  • TTP-based VPPC   Finally, we consider CP16 separately, since it solely relies on a trusted third party (TTP) to guarantee input privacy. Whilst it does use ZKPs to achieve verifiability, we feel that the use of a single TTP for input privacy makes this scheme unsuitable for a real-world implementation of a VPPC scheme. For that reason, we do not further discuss this class.

Each of the remaining classes is discussed in detail in Sections 8 to 9, where we also describe the open challenges. Additionally, we split each larger class in different subclasses based on the verifiability paradigm. We distinguish the subclasses below, most of which appear within multiple classes. For ZKPs we distinguish specifically between succinct and non-succinct schemes, as this has a clear correlation with scalability and trust assumptions. However, in reality, a more fine-grained distinction (see also Section 2.1) is possible. Where relevant, these differences will be discussed in subsequent sections.

  • Succinct ZKP   This subclass contains constructions that predominantly rely on ZKPs for verifiability, where the used ZKP is succinct with respect to both proof size and verification time (see Section 2.1).

  • Non-succinct ZKP   This subclass contains the constructions that predominantly rely on ZKPs that are not succinct with respect to either verification time or proof size, e.g., Bulletproofs have a logarithmic proof size, but linear verification time.

  • Homomorphic MAC   This subclass is only found within the HE class, and contains constructions that predominantly rely on homomorphic MACs. Similar to combining classical MACs with classical encryption schemes, we observe three ways to combine HE with homomorphic MACs: Encrypt-then-MAC, Encrypt-and-MAC, and MAC-then-Encrypt. Each of these combinations is elaborated on in Section 7.1. Some MPC-based schemes also rely on (partially) homomorphic MACs, but these play a smaller role in the constructions we investigate.

  • TEE   This category consists of schemes that rely mainly on TEEs, trusted hardware, or secure enclaves, to guarantee correctness in the presence of malicious adversaries (see also Section 2.3).

  • Other   Finally, in some main classes, we also find construction paradigms that are only observed once. These do not get their own subclass, but are instead discussed in the Other category.

A summary of the different classes and open challenges per class is provided towards the end in Tables 3 and 4. Additionally, we provide a brief overview of interesting techniques for works that are adjacent to VPPC schemes, but cannot be classified as such, in Section 5.4.

4.3 Properties

Below, we describe the properties we consider in our analysis of the considered VPPC schemes.

4.3.1 Security, privacy, and public verifiability

The first category of properties we consider relates to security and verifiability; these properties are formally described in Section 5.1. We will not explicitly state whether schemes are correct (and complete), since all included schemes are by definition. The schemes can, however, differ in whether they provide public verifiability, and under which conditions input privacy and security hold. An evaluation of all solutions with respect to these properties is provided in Table 1 and discussed in more detail in the subsequent sections.

4.3.2 Assumptions

Security and input privacy can be based on different kinds of assumptions. Most schemes base their security on (computational) cryptographic assumptions. We classify the used assumptions as either standard or non-standard. Standard assumptions are those that have been widely accepted by the cryptographic community, e.g., discrete log, CDH, or DDH; such assumptions have been accepted for a longer period of time and are well understood. Next to cryptographic assumptions, some schemes, especially those based on zk-SNARKs, may require a trusted setup. If this trusted setup is broken, malicious parties are able to create false proofs and break security guarantees.

Alternatively, some solutions rely on trusted hardware or TTPs for guaranteeing input privacy and/or security.

Table 2 Asymptotic complexities of each VPPC \(\hbox {scheme.}^\dagger \)

4.3.3 Efficiency and practicality

Practical performance and efficiency also play a big role in which solution is best suited to a certain setting. However, since all solutions use different techniques, and different ways of evaluating the practical performance (if any), it is not feasible to compare the practical performance fairly. Instead, we focus on the asymptotic computation and communication complexities of all schemes and summarize these. We provide an overview of the asymptotic complexities of each scheme in Table 2. In case the authors did not mention the complexity of their solution in the article, and we could not find it in related works, we marked it as ‘N/A’.

Specifically, we describe the computation and communication complexity of the verifiable computation. Whenever possible, we split the costs between those used to create the function result y and those used for creating the verification attestation \(\tau _y\). If a solution makes use of a generic building block, rather than specifying the exact scheme used, we describe the type of building block used, e.g., linear secret sharing scheme (LSSS), HE, or zk-SNARK. Since verification is executed apart from the computation in the case of public verifiability, we list its complexity in a separate column.

We also list the asymptotic number of commitments used by each solution, if they are used and if their complexity is not yet included in the other columns.

Lastly, we describe in Table 1 whether the original paper includes some form of experimental performance evaluation, whether an implementation is publicly available, or whether there are any other practical aspects that influence the usability of the solution.

5 Verifiable privacy-preserving computation

Verifiable protocols for specific applications have been around for several decades. To the best of our knowledge, the first use of public verifiability was in the context of e-voting [46, 54, 127]. Public verifiability has also been researched for other applications, such as online auctions [109], and secret sharing [127, 133].

5.1 Definitions

Research into schemes for (public) verifiability of computations has gained significant traction over the past decade. Solutions come from different subfields and under different names, but all have a common ground in that they aim to provide (public) verifiability for generic computations, often over private data. Below, we expand on these developments, by starting from VC and then providing a general definition for VPPC schemes. This is followed by a specialized definition for MPC-based protocols. A more extensive discussion of prior definitions from literature is presented in Appendix C.

Verifiable Computing. VC describes the setting in which a (computationally constrained) client outsources the evaluation of a function f to an external worker. A VC scheme should satisfy at least correctness, security, and efficiency (Definitions 11 to 13 in Appendix C). The basic definition of verifiable computation does not provide input privacy with respect to the workers; this can be added as a separate property (Definition 14). The notion of a VC scheme (Definition 10) was, to the best of our knowledge, first discussed in [70] in the designated-verifier setting. An extension towards public verifiability was proposed in [71] and further generalized in [117] to Definition 15.

5.1.1 Generic VPPC

As observed in [139], there is a wide variety of definitions for verifiable HE (a more elaborate discussion can be found in Appendix C.3). We observe a similar variety for definitions of DLT- and DP-based VPPC. To better discuss and compare the different VPPC schemes and their properties, we provide a generalizing definition for VPPC schemes, based on existing definitions from VC [70, 71, 117] and verifiable HE [139]:

Definition 1

(VPPC scheme) A VPPC scheme is a 5-tuple of p.p.t. algorithms \(\mathcal {VPPC}\), such that:

  • \(\textsf {KeyGen} (f, 1^\kappa ) \rightarrow (sk, pk, ek)\): given a function f and security parameter \(\kappa \), outputs a secret key \(sk\), public key \(pk\), and evaluation key \(ek\);

  • \(\textsf {Encode} (pk/sk, x) \rightarrow (\sigma _x, \tau _x)\): given either \(pk\) or \(sk\), and an input x, returns an encoded input \(\sigma _x\) and attestation \(\tau _x\);

  • \(\textsf {Compute} (ek, \sigma _x) \rightarrow (\sigma _y, \tau _y)\): given \(ek\) and \(\sigma _x\), returns an encoded output \(\sigma _y\) and attestation \(\tau _y\);

  • \(\textsf {Verify} (\sigma _y, \tau _x, \tau _y) \rightarrow \{0,1\}\): given \(\sigma _y\), \(\tau _x\) and \(\tau _y\), returns 1 if \(\sigma _y\) is a valid encoding for f(x), and 0 otherwise;

  • \(\textsf {Decode} (sk, \sigma _y) \rightarrow y\): given \(sk\) and \(\sigma _y\), returns the decoded output y.

A scheme \(\mathcal {VPPC}\) should satisfy at least correctness, completeness, security, and input privacy. We explicitly split the VC definition of correctness into correctness and completeness to also accommodate schemes with alternative correctness notions, e.g., schemes offering approximate computations: for approximate HE or DP methods one can define approximate correctness as in [139]. Moreover, we note that DP-based schemes do not satisfy strict input privacy but instead offer a different form of privacy, known as \(\epsilon \)-differential privacy, as defined in, e.g., [64, 88].
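To make the interface of Definition 1 concrete, the following is a minimal, deliberately insecure toy instantiation in Python for a single fixed linear function \(f(x) = ax + b\) over \(\mathbb {Z}_p\): the input is hidden under a one-time additive mask, and verification is done by trivial recomputation over encoded values. All names and parameter choices are ours, purely for illustration; for simplicity, \(\textsf {Verify}\) and \(\textsf {Decode}\) here also receive \(ek\).

```python
import secrets

P = 2**61 - 1  # Mersenne prime; all arithmetic in Z_P

def keygen(a, b):
    """KeyGen for the fixed linear function f(x) = a*x + b mod P.
    sk is a random one-time additive mask; ek publicly describes f."""
    sk = secrets.randbelow(P)
    pk = None          # unused in this toy scheme
    ek = (a, b)
    return sk, pk, ek

def encode(sk, x):
    """Hide x under the mask; the attestation tau_x is the encoded
    input itself (it reveals nothing beyond sigma_x)."""
    sigma_x = (x + sk) % P
    return sigma_x, sigma_x

def compute(ek, sigma_x):
    """Evaluate f on the masked input; tau_y records the encoded input
    used, so a verifier can recompute the step."""
    a, b = ek
    return (a * sigma_x + b) % P, sigma_x

def verify(ek, sigma_y, tau_x, tau_y):
    """Trivial verification by recomputing over encoded values only."""
    a, b = ek
    return tau_y == tau_x and sigma_y == (a * tau_y + b) % P

def decode(sk, ek, sigma_y):
    """Remove the linearly transformed mask: sigma_y = a*(x + sk) + b."""
    a, _ = ek
    return (sigma_y - a * sk) % P
```

Since \(\sigma _y = a(x + sk) + b\), decoding yields \(f(x)\), so correctness, completeness, and one-time input privacy are immediate; the schemes surveyed below replace this recomputation-based verification with MACs or ZKPs.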

Definition 2

(Correctness) A scheme \(\mathcal {VPPC}\) is correct if for all functions f with \(\textsf {KeyGen} (f, 1^\kappa ) \rightarrow (sk, pk, ek)\) and all \(x \in Domain(f)\): if \(\textsf {Encode} (pk/sk, x) \rightarrow (\sigma _x, \tau _x)\) and \(\textsf {Compute} (ek, \sigma _x) \rightarrow (\sigma _y, \tau _y)\)

then \(\Pr [\textsf {Decode} (sk, \sigma _y) = f(x)] = 1\).

Definition 3

(Completeness) A scheme \(\mathcal {VPPC}\) is complete if for all functions f with \(\textsf {KeyGen} (f, 1^\kappa ) \rightarrow (sk, pk, ek)\) and all \(x \in Domain(f)\):

if \(\textsf {Encode} (pk/sk, x) \rightarrow (\sigma _x, \tau _x)\) and \(\textsf {Compute} (ek, \sigma _x) \rightarrow (\sigma _y, \tau _y)\), then \(\Pr [\textsf {Verify} (\sigma _y, \tau _x, \tau _y) = 1] = 1\).

Definition 4

(Security) A scheme \(\mathcal {VPPC}\) is secure if, for any p.p.t. adversary \(\mathcal {A} = (\mathcal {A}_1, \mathcal {A}_2)\) and all functions f, if \(\textsf {KeyGen} (f, 1^\kappa ) \rightarrow (sk, pk, ek)\),

\(\mathcal {A}_1^{\mathcal {O}_{\textsf {Encode}},\mathcal {O}_{\textsf {Decode}}}(pk, ek) \rightarrow x\), \(\textsf {Encode} (pk/sk, x) \rightarrow (\sigma _x, \tau _x)\) and \(\mathcal {A}_2^{\mathcal {O}_{\textsf {Encode}},\mathcal {O}_{\textsf {Decode}}}(\sigma _x, \tau _x) \rightarrow (\sigma _y, \tau _y)\) then

\(\Pr [\textsf {Verify} (\sigma _y, \tau _x, \tau _y) = 1 \wedge \textsf {Decode} (sk, \sigma _y) \ne f(x)] \le \textsf {negl}(\kappa )\).

Definition 5

(Input privacy) A scheme \(\mathcal {VPPC}\) has input privacy if, for any p.p.t. adversary \(\mathcal {A} = (\mathcal {A}_1, \mathcal {A}_2)\) and all functions f,

if \(\textsf {KeyGen} (f, 1^\kappa ) \rightarrow (sk, pk, ek)\),

\(\mathcal {A}_1^{\mathcal {O}_{\textsf {Encode}},\mathcal {O}_{\textsf {Decode}}}(pk, f, 1^\kappa ) \rightarrow (x_0, x_1)\), \(b \xleftarrow {R} \{0,1\}\) and

\(\textsf {Encode} (pk/sk, x_b) \rightarrow (\sigma _b, \tau _b)\) then

\(2| \Pr [\mathcal {A}_2^{\mathcal {O}_{\textsf {Encode}}}(\sigma _b,\tau _b) = b] - \frac{1}{2}| \le \textsf {negl}(\kappa )\).

In Definitions 4 and 5 the adversary has access to (one of) two oracles. \(\mathcal {O}_{\textsf {Decode}}(\sigma _y, \tau _x, \tau _y)\) is the decoding oracle, which returns \(\bot \) if \(\textsf {Verify} (\sigma _y, \tau _x, \tau _y) = 0\) and \(\textsf {Decode} (sk, \sigma _y)\) otherwise. Analogously, the encoding oracle \(\mathcal {O}_{\textsf {Encode}}(pk/sk, x)\) returns \(\textsf {Encode} (pk/sk, x)\) for any valid input x.

We say that a scheme has public verifiability if \(\tau _x\) is public rather than private, and all security definitions still hold, noting that the adversary now also has access to \(\tau _b\) in the definition of input privacy.

5.1.2 MPC-based VPPC

For MPC-based VPPC schemes we adopt an alternative definition that better fits the construction of such schemes: Definition 6. This definition is inspired by existing definitions for publicly auditable MPC [23, 115] and is analogous to our generic definition above.

Definition 6

(MPC-based VPPC) An MPC-based VPPC scheme \(\mathcal {VPPC}\) for N parties is a 3-tuple of p.p.t. algorithms:

  • \(\textsf {Setup} (f, 1^\kappa ) \rightarrow \textsf {pp} \): given a function \(f: X^N \rightarrow Y\) and security parameter \(\kappa \), returns public parameters pp;

  • \(\varPi (\textsf {pp}, \vec {x}) \rightarrow (y, \tau _y, vk)\): given pp and inputs \(\vec {x}\), all parties jointly execute a protocol to compute the output value y, a corresponding attestation \(\tau _y\), and a (possibly private) verification key \(vk\);

  • \(\textsf {Verify} (\textsf {pp}, y, \tau _y, vk) \rightarrow \{0,1\}\): Given pp, y, \(\tau _y\), and \(vk\), returns 1 if \(f(\vec {x}) = y\), and 0 otherwise.

An MPC-based scheme \(\mathcal {VPPC}\) should satisfy at least correctness, security (against \(t \le N\) malicious parties), and input privacy.
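As a concrete toy example of the protocol \(\varPi \) in Definition 6, the following Python sketch instantiates it for \(f(\vec {x}) = \sum _i x_i\) using additive secret sharing over \(\mathbb {Z}_p\). All function names are ours; no attestation \(\tau _y\) is produced, so this illustrates only the privacy-preserving computation, not the verifiability the schemes below add on top.

```python
import secrets

P = 2**61 - 1  # all arithmetic in Z_P

def share(x, n):
    """Split x into n additive shares that sum to x modulo P."""
    parts = [secrets.randbelow(P) for _ in range(n - 1)]
    parts.append((x - sum(parts)) % P)
    return parts

def protocol_sum(inputs):
    """A minimal Pi for f(x_1, ..., x_N) = sum_i x_i: every party shares
    its input, each party locally adds the one share of every input that
    it holds, and the N partial sums reconstruct the public output y."""
    n = len(inputs)
    all_shares = [share(x, n) for x in inputs]  # party i shares x_i
    partial = [sum(all_shares[i][j] for i in range(n)) % P for j in range(n)]
    return sum(partial) % P  # y; no single party saw any x_i in the clear
```

Any single party sees only uniformly random shares, so input privacy holds against \(t \le N-1\) semi-honest parties for this particular f.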

Definition 7

(Correctness — MPC-based) An MPC-based scheme \(\mathcal {VPPC}\) is correct if \(\varPi \) (restricted to its output y) is a correct MPC protocol for evaluating f; and if for all functions f, with \(\textsf {Setup} (f, 1^\kappa ) \rightarrow \textsf {pp} \),

such that for all \(\vec {x} \in Domain(f)\), if \(\varPi (\textsf {pp}, \vec {x}) \rightarrow (y, \tau _y, vk)\)

then \(\Pr [\textsf {Verify} (\textsf {pp}, y, \tau _y, vk) = 1] = 1\).

Definition 8

(Security — MPC-based) A scheme \(\mathcal {VPPC}\) for N parties is secure (against \(t\le N\) malicious parties) if, for any p.p.t. adversary (controlling up to t parties) and all functions f, if \(\textsf {Setup} (f, 1^\kappa ) \rightarrow \textsf {pp} \) and \(\varPi (\textsf {pp}, \vec {x}) \rightarrow (y, \tau _y, vk)\)

then \(\Pr [\textsf {Verify} (\textsf {pp}, y, \tau _y, vk) = 1 \wedge y \ne f(\vec {x})] \le \textsf {negl}(\kappa )\).

Definition 9

(Input Privacy — MPC-based) A scheme \(\mathcal {VPPC}\) for N parties has input privacy (against \(t\le N\) malicious parties), if \(\varPi \) is a secure MPC protocol for evaluating f in the presence of t malicious parties.

We say that a scheme has public verifiability if \(vk \) is public rather than private, and all of the above definitions still hold.

5.2 High-level overview

Below, we discuss the typical application settings of the different classes, and additionally compare efficiency at a high level. DLT- and DP-based schemes are applicable to more specialized use cases, whereas MPC- and HE-based solutions are more general and can often be used in the same situations.

DLT. First, we note that DLT-based solutions are best suited for situations in which participants do not have direct communication channels with one another, or for computations with varying groups of participants. A downside of DLT with respect to the other solutions is the limitation on message sizes and verification times imposed by the use of a shared ledger. Moreover, the lack of private communication channels often leads to more complicated solutions than are possible when such channels are present. In cases where direct communication channels are available, HE-, DP- and MPC-based solutions will generally be more practical and efficient.

DP. As described in Section 3.4, DP-based solutions generally make use of a central party to compute the result, irrespective of where the noise is applied. Generally, these solutions are significantly more efficient than MPC- or HE-based approaches, but they only provide approximate correctness of the result and weaker privacy guarantees for the original inputs. In return, DP-based methods can protect against inference and membership attacks, unlike any of the other methods. Therefore, in use cases with large amounts of data or clients, complex computations, or risks of inference and membership attacks, DP-based methods can outperform alternative constructions.
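As a concrete illustration of the central-party setting, the following Python sketch releases a sum under \(\epsilon \)-differential privacy via the standard Laplace mechanism (the clipping bound and function names are our own choices, for illustration only):

```python
import random, math

def laplace_noise(scale):
    """Sample Laplace(0, scale) via inverse-CDF from a uniform draw."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_sum(values, epsilon, clip):
    """Release a sum with epsilon-DP: clip each contribution to [0, clip]
    (so the L1 sensitivity is `clip`) and add Laplace(clip/epsilon) noise."""
    clipped = sum(min(max(v, 0.0), clip) for v in values)
    return clipped + laplace_noise(clip / epsilon)
```

Note the trade-off discussed above: the released value is only approximately correct, with error governed by the noise scale \(clip/\epsilon \).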

MPC vs. HE. MPC-based solutions have a strong focus on computations over distributed data, but they can also be applied to verifiable outsourcing. In either setting, the computations are performed by a group of workers, of which a fraction is required to be honest to guarantee input privacy. The minimum size of this fraction depends on the underlying MPC protocol (see Table 1).

In verifiable outsourcing settings, the use of an HE-based VPPC scheme is often more practical, since computations on HE ciphertexts can be executed by a single central server that need not be trusted to guarantee input privacy. However, HE-based solutions can also be used in settings with distributed data. In that case, all data owners use a distributed, or threshold, key generation protocol to obtain a public-private key pair where the private key is shared over all data owners, e.g., threshold FHE [11, 83, 104]. Then, all data owners encrypt their data under the same public key and let the central server perform the actual computation on the received ciphertexts.
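The threshold-decryption idea can be illustrated with (additively homomorphic) lifted ElGamal rather than full FHE. In this Python toy, the secret key is additively shared over the data owners and decryption combines their partial decryptions; parameters are toy-sized and small messages are recovered by brute-force discrete log.

```python
import secrets

P = 2**127 - 1  # toy Mersenne-prime modulus; the group is Z_P^*
G = 3
Q = P - 1       # exponents reduced via Fermat's little theorem

def keygen(n):
    """Each of the n data owners holds an additive share s_i of the
    secret key; only the joint public key h = G^(sum s_i) is published."""
    shares = [secrets.randbelow(Q) for _ in range(n)]
    h = pow(G, sum(shares) % Q, P)
    return shares, h

def encrypt(h, m):
    """Lifted ElGamal: the message sits in the exponent (small m only)."""
    r = secrets.randbelow(Q)
    return pow(G, r, P), (pow(h, r, P) * pow(G, m, P)) % P

def partial_decrypt(c0, s_i):
    """Each key-share holder contributes c0^{s_i} without revealing s_i."""
    return pow(c0, s_i, P)

def combine(c1, partials, max_m=1000):
    """Divide out the product of the partials and brute-force the dlog."""
    d = 1
    for part in partials:
        d = (d * part) % P
    g_m = (c1 * pow(d, -1, P)) % P
    for m in range(max_m + 1):
        if pow(G, m, P) == g_m:
            return m
    raise ValueError("message out of range")
```

No single party can decrypt alone, mirroring the threshold-FHE setting described above where the private key is shared over all data owners.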

The main difference between HE- and MPC-based schemes lies in their efficiency profiles. Generally speaking, MPC-based schemes require significant amounts of communication, either in multiple online rounds or in one large offline pre-processing round. The computations themselves are often rather simple and can be executed locally by each party. In contrast, HE-based schemes can communicate all required ciphertexts in one round and let the computations be executed by a single party. Downsides of HE, compared to MPC, are the high computational cost of HE operations and the large size of HE ciphertexts. For complex computations, MPC will often be more time-efficient, but the right choice differs per use case.

Additionally, we observe that MPC schemes have been widely researched for multiple decades and have long reached practical efficiency. HE schemes are newer and have only recently started to reach practical efficiency. As HE is an active area of research, many further optimizations are to be expected.

5.3 High-level observations

We make a number of high-level observations regarding VPPC schemes in general. These predominantly concern topics that are currently underexposed but highly relevant for the adoption of such schemes.

Input data authentication. Verifiability of a PPC guarantees that the (public) output is computed correctly from the secret input. In other words, no corrupted party has meddled with the results. However, corrupted parties could still produce incorrect or malicious outputs by providing false inputs.

In most real-world use cases, computation correctness alone will not be sufficient, and input authentication will be required. Combining the two protects the entire data pipeline, from authenticated input to result. Moreover, such solutions can be used to guarantee reproducibility, i.e., that computations are verifiably executed on the same data. In our analysis we found only one recent solution, DGPS22, that focuses on both verifiability and input authentication.
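At its simplest, input authentication means that each data owner tags its contribution before the computation, e.g., with an HMAC, so downstream parties can check provenance. A minimal Python sketch (key handling and naming are ours, for illustration):

```python
import hmac, hashlib

def tag_input(key: bytes, party_id: str, payload: bytes) -> bytes:
    """Bind the payload to the contributing party with an HMAC tag."""
    msg = party_id.encode() + b"|" + payload
    return hmac.new(key, msg, hashlib.sha256).digest()

def check_input(key: bytes, party_id: str, payload: bytes, tag: bytes) -> bool:
    """Constant-time check that the tag matches party and payload."""
    return hmac.compare_digest(tag_input(key, party_id, payload), tag)
```

Real VPPC constructions need homomorphic or commitment-based variants of this idea so that tags can be checked on encoded rather than plaintext inputs.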

Reusability. In practice, we often run different algorithms on the same data, or the same algorithm on different data. Naturally, one can wonder whether reusing parts of the (intermediate) data can improve efficiency, reduce communication, or provide guarantees for reproducibility. The solutions we found in our analysis paid little to no attention to such reusability.

However, we do observe a number of non-verifiable solutions for reusable MPC [20] or reusable GC [78] appearing in recent years. With increased efficiency and decreased communication costs, reusability is especially beneficial in decentralized settings with large amounts of data or many participants. The benefits become even clearer when considering that VPPCs use costly primitives like HE and ZKPs.

Post-quantum security. In a world where the threat of quantum computers to classical cryptography is growing rapidly, post-quantum (PQ) secure solutions are becoming increasingly important. This is underscored by, e.g., the NIST standardization of PQ primitives [55], the NCSC migration recommendations [112], and the implementation efforts bringing PQ primitives to OpenSSL [101]. Even though sufficiently large quantum computers may seem out of reach at the moment, current predictions expect quantum computers to break non-PQ primitives within 30 years [86]. Especially in cases where ciphertexts or commitments are made publicly available today (as for publicly verifiable VPPC), attackers may adopt harvest-now-decrypt-later attacks, storing ciphertexts now and breaking them in a few decades. It is thus essential to design PQ secure VPPC schemes now, so that they can be implemented in time to keep offering adequate security levels, even in the presence of quantum computers.

Most recent FHE schemes are believed to be PQ secure, and information-theoretically secure MPC protocols have existed for a long time. However, many other primitives used to construct VPPC schemes are not PQ secure. More importantly, security against quantum adversaries was not discussed in the majority of the works we surveyed, even though its relevance increases by the day.

Comparing practical performances. In our analysis, we observed that the surveyed works use very different methods to evaluate their asymptotic and/or practical performance (see also Table 2). A surprisingly large subset of papers does not discuss performance aspects at all. We acknowledge that it is no small feat to compare different VPPC schemes, especially those of different classes. However, to make well-informed choices for future research and real-world adoption, it is of the utmost importance to be able to fairly compare different solution approaches, at least at a high level.

Making implementations of schemes publicly available would also greatly improve the ability to compare the practical performance of the different solutions, and would stimulate real-world adoption. As an example, we specifically mention MP-SPDZ [89], a well-maintained framework implementing many recent MPC solutions, which makes performance comparisons and real-world application accessible to a wide audience.

5.4 Solutions adjacent to VPPC

In this section we briefly discuss a number of interesting solution types that are adjacent or complementary to VPPC schemes, or show promise of becoming a major building block for VPPC schemes in the future. They are not VPPC schemes themselves, either because they are not (yet) suitable for generic computations, or because they can only be used in a specific context. This list is by no means exhaustive, but is meant to provide a high-level overview of relevant, related solution approaches that could be combined with the schemes in Table 1, provide an alternative in a specific use case, or become an important building block for novel schemes.

Federated Learning (FL) Federated Learning (FL) is a paradigm in which ML models are trained collaboratively by a group of data owners [91], without sharing their respective private datasets. Generally speaking, local models are trained at each data owner, so private data need not be shared; only model updates are exchanged and aggregated into one global model that approximates a model trained on the combined data. Since private data are not distributed or shared with other parties, a certain degree of input privacy is provided. We observe, however, that this input privacy is not absolute: shared model updates, as well as the global model, might leak significant information about the original input data [98].

Recently, more attention has been given to the verification of FL schemes, considering verification of either the individual clients or the central orchestrator. Verification generally focuses on the aggregation of the local model updates, not on the local computations themselves. For example, in PFLM [85], the authors propose a privacy-preserving FL scheme with secret-sharing-based aggregation, in which the aggregated results can be verified directly from ElGamal encryptions of the local gradients, while authenticity is guaranteed using aggregatable signatures. Similarly, [37] designs an efficient, privacy-preserving protocol for aggregating local gradients based on secret sharing and authenticated encryption; the scheme does not offer public verifiability, but does offer security in the presence of some malicious parties. An overview of verifiable (privacy-preserving) federated learning approaches is provided in, e.g., [146].
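The aggregation step that these schemes protect can be sketched with pairwise masking, the core trick of well-known secure-aggregation protocols: every pair of clients agrees on a shared mask that one adds and the other subtracts, so all masks cancel in the aggregate. In this Python sketch, the `seed_for_pair` callback and all parameters are hypothetical.

```python
import random

P = 2**31 - 1  # aggregation in Z_P

def masked_updates(updates, seed_for_pair):
    """Client i adds the pairwise mask m_ij for j > i and subtracts it
    for j < i; each mask appears once with each sign, so the masks
    cancel in the sum of all masked updates."""
    n = len(updates)
    masked = []
    for i in range(n):
        m = updates[i]
        for j in range(n):
            if j == i:
                continue
            mask = seed_for_pair(min(i, j), max(i, j))
            m = (m + mask) % P if i < j else (m - mask) % P
        masked.append(m)
    return masked

# Usage: masks derived from (hypothetical) pairwise shared seeds
random.seed(1)
pair_masks = {(i, j): random.randrange(P)
              for i in range(3) for j in range(i + 1, 3)}
masked = masked_updates([3, 5, 7], lambda i, j: pair_masks[(i, j)])
aggregate = sum(masked) % P  # equals 3 + 5 + 7 = 15
```

The orchestrator only ever sees masked updates, yet recovers the exact aggregate; verifiable FL schemes add proofs that this aggregation was performed correctly.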

Private Set Intersection (PSI) Private Set Intersection (PSI) is another example of a PPC for a specific problem, namely that of determining the intersection over a number of private sets [103, 140]. Each set is controlled by a different party, and the other parties should learn nothing more than the eventual intersection, which may also be secret-shared in some cases. Oftentimes PSI approaches are only secure in the semi-honest model, but some solutions also provide security in the malicious model (see Appendix B). Additionally, we observe a number of protocols that focus on verifiable, delegated PSI, i.e., when the intersection is executed by an external party and should be verifiable by other parties, e.g., [1, 134]. For a more extensive overview, we refer to [103].
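A classic semi-honest PSI construction is based on commutative (Diffie-Hellman-style) exponentiation: both parties exponentiate hashed set elements with their secret keys, and only the doubly exponentiated values are compared. A Python toy sketch with illustrative parameters (not one of the cited schemes):

```python
import hashlib, secrets

P = 2**127 - 1  # toy Mersenne-prime modulus; the group is Z_P^*
G = 3
Q = P - 1

def h2g(item):
    """Hash an item into the group as G^{H(item)}."""
    e = int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % Q
    return pow(G, e, P)

def dh_psi(set_a, set_b):
    """Semi-honest DH-style PSI: A sends H(x)^a, B returns (H(x)^a)^b
    together with its own H(y)^b; A raises the latter to a and compares
    the doubly exponentiated values, learning only the intersection."""
    a = secrets.randbelow(Q - 1) + 1   # A's secret exponent
    b = secrets.randbelow(Q - 1) + 1   # B's secret exponent
    a_round1 = {x: pow(h2g(x), a, P) for x in set_a}          # A -> B
    a_keys = {pow(v, b, P): x for x, v in a_round1.items()}   # B -> A
    b_keys = {pow(pow(h2g(y), b, P), a, P) for y in set_b}    # A computes
    return {x for k, x in a_keys.items() if k in b_keys}
```

Values \(H(x)^{ab}\) match exactly when the underlying elements match, while single-exponentiated values reveal nothing under a DDH-style assumption; verifiable delegated PSI additionally proves that the delegated intersection was computed correctly.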

Functional Encryption (FE) Functional Encryption (FE) is a cryptographic primitive that allows the decryptor of a ciphertext to decrypt it to a specific function evaluation on the underlying plaintext. FE is a generic class that also contains, e.g., attribute- and identity-based encryption.

In FE, there are generally three party types: authority, encryptor, and decryptor. The authority performs the system setup and generates a public encryption key and master secret key. The owner of the master secret key, i.e., the authority, can, for any function f from a given family, generate a function-dependent secret key \(sk _f\). Now, any party can encrypt a message x using the public encryption key, and given a ciphertext and \(sk _f\), a decryptor can open that ciphertext to f(x), without learning anything about x.

Already in 2012, [118] describes how one could construct a limited VC scheme from attribute-based encryption (ABE) with outsourcing, i.e., ABE in which most of the costly decryption work can be outsourced to a third party. However, this scheme is not very efficient and leaves open many questions regarding public verifiability and how to deal with multiple clients. [74] therefore introduces multi-input FE, an FE scheme in which n ciphertexts can be evaluated inside an n-ary function. This work improves upon [118] by using multi-input FE to construct multi-client FE, i.e., the ciphertext input to each function slot is encrypted under a different, client-specific encryption key. Doing so guarantees input privacy with respect to each client’s underlying plaintext.

Whilst many more improvements followed in later years, [17] is one of the first to consider the case of a malicious authority and/or encryptor. To do so, the authors define and discuss the notion of verifiable multi-input FE. Their construction prevents dishonest, potentially colluding, encryptors and authorities from generating ciphertexts or decryption keys that lead the decryptor to ‘inconsistent’ results. Whilst their approach is generic, it is not very practical, being highly inefficient for anything other than linear and quadratic functions. Moreover, it does not fully satisfy our definition, as it is only a multi-input, not a multi-client, scheme, meaning that some information about ciphertexts may still be leaked. In addition, its security hinges on a number of non-standard cryptographic assumptions.

[113] and [93] do present practical verifiable FE schemes that improve upon [17], but they only support inner-product functionalities. Interestingly, [113] considers a setting with decentralized setup, thereby removing the reliance on a single trusted authority. Finally, we note that [93] considers the notion of public verifiability, by adopting \(\varSigma \)-protocols on top of an existing multi-input FE scheme for evaluating inner-product functionalities. As far as we are aware, there is not yet any verifiable multi-client FE scheme that is efficient for generic computations, let alone practical. However, this approach does seem promising for future research in the direction of generic VPPCs on distributed data.

For an elaborate overview of different types of FE schemes, we refer to the extensive survey in [100], which also covers the topics of FE for arbitrary circuits, multi-input FE, multi-client FE, and verifiable FE.

Extensions to VPPC Next to the above solutions, we also note a number of interesting primitives that could be used to extend VPPC schemes with data authenticity properties or stronger privacy guarantees. Specifically, we mention the use of private information retrieval (PIR) protocols [114] and oblivious RAM (ORAM) primitives [44]. Both methods allow users or algorithms to access data, in a database or in memory, without revealing which data they accessed. We note that there already exist schemes that implement ORAM using a TEE [79]. Related to this, we note the existence of proofs of retrievability (PoRs), which allow clients to outsource their data to an external server while ensuring the integrity, soundness, and retrievability of said data, e.g., [9, 120]. The auditing capabilities of PoRs could provide strong input-authenticity guarantees when combined with VPPC schemes.
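The PIR idea can be sketched in a few lines with the classic two-server XOR protocol: the client sends each of two non-colluding servers a uniformly random selection vector, the two vectors differing only at the queried index, so neither server learns which record was retrieved (Python sketch; names are ours).

```python
import secrets

def pir_query(db_size, index):
    """Client: random selection vector q1, and q2 = q1 with the bit at
    `index` flipped; each vector alone is uniformly random."""
    q1 = [secrets.randbelow(2) for _ in range(db_size)]
    q2 = q1.copy()
    q2[index] ^= 1
    return q1, q2

def pir_answer(db, q):
    """Server: XOR of the records selected by the query vector."""
    ans = 0
    for rec, bit in zip(db, q):
        if bit:
            ans ^= rec
    return ans

def pir_reconstruct(a1, a2):
    """Client: all commonly selected records cancel, leaving db[index]."""
    return a1 ^ a2
```

Since every record selected by both vectors cancels in the XOR, only the record at the flipped index survives, while each server sees only a random vector.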

Overall, these solutions can be used to make VPPC schemes more suitable for practical use cases. Combining the techniques used above with any of the VPPC schemes would offer an all-in-one solution for treating the entire pipeline of verifiable, privacy-preserving computations rather than treating verifiable computation as a standalone problem.

6 MPC-based VPPC

Solutions that use MPC as the privacy-preserving computation mechanism for constructing a VPPC scheme can be divided into four groups, each using a different mechanism for verifiability: non-succinct ZKPs, succinct ZKPs, TEEs, or other mechanisms. The final group consists of schemes whose mechanisms differ from all the other constructions in this class.

6.1 Non-succinct ZKP

The majority of verifiable MPC solutions uses commitments to (their shares of) input, intermediate, and output values in combination with NIZKs to guarantee security. First, we discuss solutions using non-succinct ZKPs.

6.1.1 Description

The first set of solutions (RRRK22 [124], BDO14 [23], CFY16 [58], SV15 [128]) uses standard \(\varSigma \)-protocols and the FS heuristic to obtain NIZKs from standard assumptions. BDO14 proposes the first such scheme, built around a SPDZ-like protocol. SV15, on the other hand, is based on the CDN [57] framework, making it less efficient than BDO14. However, unlike that work, SV15 does guarantee public verifiability for a secret computation result.

Like BDO14, CFY16 is also based around a SPDZ-like protocol, but greatly reduces the number of broadcast messages by introducing a commitment-enhanced secret-sharing scheme. Additionally, it improves upon BDO14 by offering public cheater identification.

RRRK22 is also based around a SPDZ-like protocol and improves upon CFY16 by adding robustness, i.e., guaranteed output delivery in the presence of malicious parties. We also note that RRRK22 makes use of PQ secure lattice-based commitments.

A downside of the regular \(\varSigma \)-protocols used in these schemes is the large proof size and the significant verifier effort required for larger computations. Specifically, these proofs scale at least linearly in the size of the computation, i.e., the number of multiplications and participants. Whilst this complexity matches that of the MPC portion of the computation, it makes public verifiability (and verification in general) very costly. Hence, more recent works (DGPS22 [63], PRI21 [122]) apply the more efficient compressed \(\varSigma \)-protocol theory [13], or \(\varSigma \)-bullets [40], while still relying only on standard, discrete-log-type assumptions.

Verification of all protocols in this subclass is done by verifying the entire transcript of the MPC computation. A verifier needs to check the ZKPs at each step to guarantee that each commitment to a new share is computed correctly. In the case of compressed \(\varSigma \)-protocols, the proof size and verification time scale roughly logarithmically in the computation size, which is much better than for the other schemes in this subclass.
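The basic pattern these schemes build on, a \(\varSigma \)-protocol made non-interactive with the FS heuristic, can be illustrated with the classic Schnorr proof of knowledge of a discrete logarithm (Python sketch with toy parameters; production code would use a vetted prime-order group):

```python
import hashlib, secrets

P = 2**127 - 1  # toy Mersenne-prime modulus; the group is Z_P^*
G = 3
Q = P - 1       # exponents reduced via Fermat's little theorem

def fs_challenge(*vals):
    """Fiat-Shamir: derive the challenge by hashing the transcript."""
    data = b"|".join(str(v).encode() for v in vals)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(w):
    """NIZK of knowledge of w such that h = G^w (Schnorr + FS)."""
    h = pow(G, w, P)
    k = secrets.randbelow(Q)
    t = pow(G, k, P)              # commitment
    c = fs_challenge(G, h, t)     # hash replaces the verifier's challenge
    s = (k + c * w) % Q           # response
    return h, (t, s)

def verify(h, proof):
    """Check G^s == t * h^c for the recomputed challenge c."""
    t, s = proof
    c = fs_challenge(G, h, t)
    return pow(G, s, P) == (t * pow(h, c, P)) % P
```

The verifiable MPC schemes above apply this pattern to every commitment in the transcript, which is exactly why proof size and verification time grow linearly with the computation, and why compressed \(\varSigma \)-protocols are attractive.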

6.1.2 Evaluation

Protocols in this subclass provide security and public verifiability even when all parties are corrupted. Moreover, all schemes are based on standard cryptographic assumptions and do not have a trusted setup. The number of honest parties required to ensure input privacy depends on the underlying MPC scheme.

The protocols relying on standard \(\varSigma \)-protocols in combination with the FS heuristic (RRRK22, BDO14, CFY16, SV15) are generally speaking more costly in verification time and proof size than the schemes relying on compressed \(\varSigma \)-protocols (DGPS22, PRI21).

The efficiency of the MPC computations, and of the communication thereof, depends mostly on the underlying protocol. The choice of MPC scheme also depends on how many dishonest parties one is willing to tolerate before input privacy breaks down. For a comparison of these schemes with respect to their privacy guarantees and asymptotic complexities we refer to Tables 1 and 2. We note that some of the schemes did not report efficiency metrics, or only provided incomparable experimental evaluations. An open question is whether there is a uniform way to compare the efficiency of these schemes.

Finally, there is one work, RRRK22, that uses a large number of PQ secure primitives. Although this does not make the scheme fully PQ secure, the authors speculate that this is possible. Unfortunately, we found no other works in this class that discuss the risk quantum computing poses to the security of their solutions. Further research is needed to determine which solutions could be made PQ secure.

6.2 Succinct ZKP

Next to schemes relying on non-succinct ZKPs, we observe solutions making use of succinct ZKPs. These schemes work similarly, but are often more efficient with respect to verification and transcript size.

6.2.1 Description

All solutions we observed within this class (Vee17 [137], KZGM21 [87], OB22 [115], SVdV16 [129]) use distributed and adaptive zk-SNARKs to ensure security. Most of these solutions allow for the use of vector commitments, meaning that each party publishes a single commitment to all of its shares. The computation can then be verified by checking the zk-SNARK proof against the parties’ commitments and the computed output.

SVdV16 adopts succinct proofs to offer a significant performance improvement over works using non-succinct ZKPs, e.g., BDO14, SV15. Vee17 improves upon SVdV16 by making it adaptive, i.e., it enables proofs over input commitments that are independent of the computation. The construction in KZGM21 offers a further improvement upon Vee17, by supporting a universal setup, i.e., a single trusted setup ceremony that is computation-independent and can be used ‘for the lifetime of a system’.

OB22 is a concurrent work to KZGM21, but takes a slightly different approach, in that the authors discuss how to combine a variety of MPC schemes with a variety of zk-SNARK schemes.

6.2.2 Evaluation

Constructions based on succinct ZKPs are in many ways similar to those based on non-succinct ZKPs. Succinct proofs, too, guarantee security and public verifiability even when all parties are corrupted, and the number of honest parties needed to provide input privacy is again determined by the underlying MPC scheme.

The difference between succinct and non-succinct ZKPs lies in the trade-off between security assumptions and efficiency. Succinct ZKP-based solutions have very small proof sizes and verification times: the constructions discussed here have constant proof size, and a verification time that depends only on the number of parties and public values. The only exception is OB22, which can be instantiated with any zk-SNARK, i.e., also those with polylogarithmic proof size and verification time.

Moreover, these solutions work very efficiently with vector commitments and do not require a verifier to check the entire transcript of the computation, reducing communication and verification costs even further. The downside of using zk-SNARKs is their reliance on non-falsifiable knowledge-type assumptions and trusted setups. It remains an open question who should be involved in such a trusted setup to guarantee trust in the correctness proofs.

The succinct schemes we found in our literature review compare as follows. The construction of SVdV16 only guarantees input privacy given an honest majority of computation parties. The other works can be used with any LSSS, offering users a choice in the number of parties required to be honest, which makes them the more flexible option.

The main difference between the remaining schemes is the type of zk-SNARK used. Vee17 uses a non-universal zk-SNARK, i.e., a new trusted setup is needed for every new computation. KZGM21 does use a universal zk-SNARK, making it possible to perform one universal trusted setup rather than one per computation. Finally, OB22 allows the use of any zk-SNARK that can be computed in a distributed fashion, which enables the adoption of novel schemes that may be more efficient or rest on better security assumptions. An open question is whether zk-SNARK schemes with a trustless setup or PQ security, e.g., Fractal [51], can be used in this way.

6.3 TEEs

As an alternative to the previous constructions, there are also schemes that realize MPC by means of TEEs. The attestation mechanism of TEEs can easily be used to guarantee verifiability in the presence of malicious parties. Interestingly, and unlike other paradigms in this class, most constructions that realize verifiable MPC by means of TEEs also offer some form of fairness.

6.3.1 Description

To the best of our knowledge, PST17 [119] is the first to take an extensive formal approach towards MPC by means of TEEs. The authors of PST17 arrive at several relevant theoretical results. They come to the conclusion that universally composable (UC)-secure 2-party MPC protocols are impossible when at least one party is not equipped with a TEE. Next, they show that UC-secure 2-party MPC can be constructed when both parties have TEEs, by relying on secure key exchange, e.g., Diffie-Hellman. Finally, they show how to achieve a protocol with \(\varDelta \)-fairness, when both parties in addition have access to trusted clocks (inside the TEE). Contrary to regular fairness, where either all parties or none obtain the result of a computation at the same time, in \(\varDelta \)-fairness some parties may be able to obtain the result in round r, whereas the rest are only guaranteed to obtain it by round \(2\varDelta \).

Bah+17 [18] designs a more practical scheme. Their construction forsakes universal composability for the sake of efficiency. Specifically, they show how to construct an efficient MPC protocol using a single, potentially untrusted, compute party that holds a TEE. The protocol consists of two phases. In the preparation phase, all parties establish their own secure connection with the TEE and send their inputs along this secure channel, i.e., the inputs are sent in encrypted form. During the online phase, the input values are decrypted inside the TEE, the computation is evaluated, and the encrypted results are sent back to the parties, who can decrypt them on their own. The specific contribution of this construction is a method that lets each user verify that the key exchange setting up their own secure channel was performed correctly, without learning anything about the other parties’ key exchanges. Moreover, Bah+17 enables a party to guarantee that the other parties involved in the computation are the parties they expected to do a computation with.

Concurrently, Cho17+ [53] takes an alternative approach. The authors propose a protocol that transforms any MPC scheme that supports efficient, symmetric encryption into a fair one, using TEEs. Rather than using the traditional MPC scheme to evaluate \(f(x_1,\ldots ,x_n)\) directly on its inputs, the MPC scheme is used to evaluate \(f'(x_1,\ldots ,x_n) = \textsf {Enc}(f(x_1,\ldots ,x_n))\) instead, i.e., the MPC produces the encrypted output rather than the output itself. The secret decryption key is secret-shared between all parties and is loaded inside each party’s TEE. After receiving the encrypted result, each party uploads a secret ‘release token’, which is related to the key shares, onto a public bulletin board. Only when all release tokens have been uploaded onto the bulletin board can the parties use their TEEs to decrypt the result, thereby guaranteeing complete fairness.
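The release-token mechanism can be illustrated with a toy simulation (our own minimal sketch: a one-time pad stands in for the symmetric encryption, n-out-of-n XOR sharing for the secret sharing of the key, and a plain list for the public bulletin board; the MPC and TEE components are only simulated):

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def mpc_evaluate_encrypted(f, inputs, key: bytes) -> bytes:
    # In the real protocol the encryption happens *inside* the secure
    # computation; here we simulate the outcome directly.
    result = f(inputs).to_bytes(8, "big")
    return xor_bytes(result, key)  # one-time-pad "symmetric encryption"

def share_key(key: bytes, n: int):
    """n-out-of-n XOR secret sharing of the decryption key."""
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    last = key
    for s in shares:
        last = xor_bytes(last, s)
    return shares + [last]

n = 3
key = secrets.token_bytes(8)
shares = share_key(key, n)                 # one share loaded per party's TEE
ct = mpc_evaluate_encrypted(lambda xs: sum(xs), [10, 20, 12], key)

bulletin_board = []                        # public bulletin board
for s in shares:                           # each party posts its release token
    bulletin_board.append(s)

# Decryption succeeds only once *all* release tokens are public.
assert len(bulletin_board) == n
recovered = bulletin_board[0]
for s in bulletin_board[1:]:
    recovered = xor_bytes(recovered, s)
result = int.from_bytes(xor_bytes(ct, recovered), "big")
print(result)  # 42
```

If any single token is withheld, the XOR reconstruction yields garbage, so no party learns the result before all others can, which is the fairness guarantee the bulletin board enforces.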

SGK19 [131] improves upon Cho17+ by requiring only a certain number of participants to own a TEE. This number of participants is equal to the corruption threshold t, i.e., the system can withstand up to t corruptions. It relies on a DLT to guarantee fairness of the computation and to enforce so-called history-based policies, but does not store any inputs, outputs, or intermediate computation states on this ledger, unlike DLT-based approaches such as Che+19 [48]. These history-based policies are any type of predicate that needs to hold for the inputs, outputs, or computations themselves, e.g., that inputs are only provided by users enrolled in the computation and that all valid inputs are used to compute the result. The computation itself happens independently from this ledger and is handled by all users submitting their data directly to the compute enclave(s), which determine the result. A different enclave, possibly at the same party, is used to enforce the history-based policies. Correctness is guaranteed through remote attestation. Fairness is guaranteed by sharing the decryption key for the result using a t-out-of-n secret sharing scheme. The parties then use a TEE to obtain the full key and decrypted result in a fair way.
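The t-out-of-n sharing of the result-decryption key can be sketched with textbook Shamir secret sharing (a self-contained toy with illustrative parameters; SGK19’s actual key handling runs inside the enclaves with production-grade primitives):

```python
import secrets

P = 2**61 - 1  # toy prime field modulus

def shamir_share(secret: int, t: int, n: int):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(t - 1)]
    def poly(x):
        acc = 0
        for c in reversed(coeffs):        # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc
    return [(i, poly(i)) for i in range(1, n + 1)]

def shamir_reconstruct(shares):
    """Lagrange interpolation at x = 0 over the field mod P."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

key = secrets.randbelow(P)               # result-decryption key
shares = shamir_share(key, t=3, n=5)     # one share per participant
# Any 3 of the 5 shares suffice to recover the key (inside a TEE):
assert shamir_reconstruct([shares[0], shares[2], shares[4]]) == key
```

Fewer than t shares reveal nothing about the key, which is what allows the system to tolerate up to t corrupted participants while still enabling fair, joint decryption.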

Finally, we observe that there are more works that rely on TEEs to realize multi-party computations, some of which are included in the other classes, as their main computation primitive was either HE or DLT. Some other works were considered out of scope, due to having no focus on verifiability, not offering support for generic computations, or not considering a setting with distributed data.

6.3.2 Evaluation

Compared to more traditional MPC paradigms, constructions based on TEEs take a different approach. They allow for computations to take place at one central untrusted party, by assuming trust in the secure TEE hardware that that party holds. However, as described above, there are different ways to apply TEEs to realize MPC, and all have different properties.

The contributions of PST17 are mostly theoretical, but give a good intuition into the possibilities of realizing VC using MPC from TEEs. In particular, UC-security is highly relevant for proving the security of larger protocols. Moreover, the authors consider fair MPC, which is not often discussed in other VPPC approaches, but can definitely be desirable in some use cases. A clear downside of this approach is that each client is required to own trusted hardware, which is not always realistic in practice.

Bah+17 shows that their construction is considerably more efficient than more traditional actively secure MPC paradigms, such as SPDZ [61] or ABY [62]. This does come at the cost of assuming trust in TEE hardware, which not every party will have in practice.

Clearly, Bah+17 is more efficient than PST17, at the cost of forgoing universal composability. However, as also mentioned in PST17, composability might still be achievable by introducing an independent setup for each independent protocol that is executed. In many practical settings, protocols are run more than once, making this independent setup negligible in such contexts. In practice, it is clear that Bah+17 offers increased efficiency, and the privacy and security guarantees are sufficiently strong for most purposes.

Cho17+ is much less efficient than Bah+17, in the sense that it requires the use of a traditional MPC scheme to perform the actual computation, which is much slower than doing the computation inside a TEE. Moreover, it is also slower than doing a computation using traditional MPC schemes, since this construction evaluates a function that returns the symmetrically encrypted version of the result, rather than the result itself. Given that symmetric encryption schemes can require quite a large number of multiplications, this difference is noticeable. One major advantage of this scheme compared to the others, however, is its modularity, i.e., it could also be combined with any other MPC scheme that relies on ZKPs to guarantee (public) verifiability. The added value compared to that scheme would then be the fact that fairness is provided. One question that should be raised is whether, when each party has access to a TEE, it would not make more sense to reduce the computational load by moving the actual computation (partially) inside this same TEE.

Finally, SGK19 offers a similar approach, but requires only a certain number of participants to own a TEE, thereby making it more practical than Cho17+. Next to this, SGK19 is more flexible, in that it considers different ways of evaluating the function on the inputs, i.e., by means of traditional MPC schemes or inside a TEE. Moreover, unlike the DLT-based schemes, it does not require storage of in- or outputs on the shared ledger, but does offer the option to specify and verify long-term policies on executed computations.

6.4 Other

Below, we describe the remaining solutions that did not fit any other category.

6.4.1 Description

JNO14 [84] uses MACs to design a scheme where n clients verifiably outsource a computation to m workers. However, only clients can verify the computation, and at least one worker needs to be honest to guarantee security and input privacy. The authors argue that this restriction can be lifted using the techniques of BDO14.

BOSS20 [24] presents an efficient, constant-round, publicly verifiable MPC protocol that only makes black-box use of cryptographic primitives. The scheme uses BMR-type garbled circuits [26] and requires black-box access to a statistically secure oblivious transfer (OT) scheme and a cryptographic hash function. Notably, security and public verifiability only hold when at least one party is honest. BOSS20 focuses on adding cheater detection (as in CFY16); unlike CFY16, however, it achieves this in a constant number of rounds, at a higher overall cost.

Alternatively, SBDB18 [125] relies on pairing-based function-dependent commitments, leading to efficient verification of correctness. Contrary to prior works that use succinct ZKPs, this construction relies solely on standard assumptions. Furthermore, this work aims to reduce the clients’ cost of SV15 by allowing all computations on secret shares to be done by one central server, rather than by the individual clients. However, this all comes at a significant cost: the construction only supports linear functions.

RCPT19 [123] presents an MPC protocol making use of an SWHE scheme that supports a single multiplication. Verifiable key switching, with NIZKs, is used to support computations with an arbitrary number of multiplications. The advantage of using this kind of scheme is that distributed key generation is much cheaper than what was required in prior works (BDO14, SV15); however, those works perform it in the offline phase, making them more practical.

Finally, BKZZ20 [19] offers a different perspective from the other works by relying on crowd-verifiable ZKPs. Here, a prover tries to convince a predefined group of verifiers (the ‘crowd’), where each verifier only provides a bounded amount of randomness, thus requiring a mostly honest crowd.

6.4.2 Evaluation

At first glance, the approach of JNO14 seems more (asymptotically) efficient than the other MPC schemes. However, we observe that this scheme only provides input privacy and security in case of at least one honest client. Hence, it provides similar guarantees to most regular MPC schemes that are secure against a dishonest majority. This is not the type of security guarantee we aim for in verifiable, privacy-preserving computations.

BOSS20 provides similar security guarantees, but also adds public verifiability, which is a feature not found in regular MPC schemes. It is therefore an interesting approach, but we deem it only useful for the particular use case where one assumes that at least one of the computing parties is honest. This assumption seems undesirable in most use cases where public verifiability is required.

Since SBDB18 only supports the computation of linear functions, we deem it too limited to be useful in practice. It is doubtful that this approach can be adapted to support non-linear functions without fundamental changes.

RCPT19 is an interesting approach, in that it provides input privacy given one honest party and public verifiability even when all parties are malicious, i.e., stronger guarantees cannot be given. The use of verifiable key switching in combination with an SWHE scheme is more or less similar to that of the SPDZ-like protocols discussed above. However, the SPDZ-like protocols move the computationally costly steps to an offline preprocessing phase, whereas here they happen in the online phase. The authors of RCPT19 do not describe the (asymptotic) complexity of their work. Nonetheless, we observe that the above-mentioned SPDZ-like protocols have a significantly more efficient online phase, making them more practical.

The use of crowd-verifiable ZKPs in BKZZ20 is different from any of the other MPC constructions. It is suitable only for a very particular use case, where there is a select, predefined group of verifiers that wants to check the correctness of a computation. Even though designated-verifier ZKPs could be more efficient than publicly verifiable ZKPs, we do not expect these differences to be significant. Moreover, the protocol of BKZZ20 requires interaction with the verifiers, which is a big downside in most use cases, especially when compared to other ZKP solutions that are non-interactive. We therefore expect the other ZKP solutions to be more useful in practice. Moreover, developments in that direction are applicable to a much wider variety of VPPC use cases than is the case for BKZZ20.

6.5 Comparison

We see that the most actively researched direction for MPC-based VPPC schemes consists of schemes relying on ZKPs to guarantee security and/or public verifiability. Specifically, we see two flavors of these constructions: schemes using zk-SNARKs and schemes using ZKPs from standard cryptographic assumptions. Both types make a different trade-off between efficiency and security. Among the non-succinct schemes, however, those based on non-compressed \(\varSigma \)-protocols have clearly worse practical performance than their compressed counterparts, especially regarding communication size and verification time. Therefore, the direction of using compressed \(\varSigma \)-protocols seems more promising. The latter category is more novel, leading to a smaller number of works in the category, even though the efficiency is much better and the security guarantees are similar. We expect constructions based on compressed \(\varSigma \)-protocols to become more popular, but further research is needed.

The remaining choice, between succinct and non-succinct proofs, mainly depends on whether constant proof size and verification time really outweigh the use of non-falsifiable assumptions. Given that MPC already requires quite some communication by itself, we suspect that using succinct ZKPs only pays off when public verifiability plays a very big role, i.e., when a large number of proofs must be verified, or when verification is done by many users.

Additionally, we note that within constructions relying on zk-SNARKs, there is little focus on dealing with the problematic trusted setup required by the specific zk-SNARKs that are used. However, to obtain true public verifiability, one would require external parties to also be involved in this trusted setup, which is a big inconvenience in practice.

Alternatively, we find a number of solutions that rely on TEEs to guarantee privacy and security. The biggest advantage of these methods is their efficiency, especially when the computation itself can be fully executed inside of the TEE, which offers a considerable performance improvement. Moreover, TEEs are an efficient way to offer fairness, and are interesting to consider in combination with other VPPC schemes purely for adding fairness, as proposed in Cho17+. A potential disadvantage of using TEEs is the requirement of trusted hardware for a majority of the clients, or the fact that clients need to trust an untrusted worker not to have meddled with said hardware. Whilst this may not be convenient in every use case, we do observe an increase in real-world usage of TEEs for realizing secure computations. A further downside of using TEEs is that public verifiability can be very difficult to achieve, or cannot be guaranteed in the long term, making them unsuitable in situations where public verifiability is a strong requirement.

Open challenges

  • VPPC from only PQ secure primitives, to guarantee security even in the presence of a quantum adversary.

  • Combining vector commitments and ZKPs in an efficient way, for example through commit-and-prove-SNARKs [42].

  • More efficient verification and proof sizes in constructions based on standard assumptions.

  • Succinct ZKP-based constructions without a trusted setup.

  • Modular solutions; due to the developments in both MPC constructions and especially ZKP schemes, modular solutions could more quickly take advantage of novel and improved schemes.

  • Almost all solutions assume that the inputs provided by the parties are honest; in practice, however, this need not be the case. Further research into dealing with this problem, e.g., by means of authenticated inputs, is needed.

  • Realizing schemes with fairness, without relying on TEEs.

7 HE-based VPPC

HE-based schemes can be divided into four groups, based on the mechanism used for verifiability: homomorphic MACs, succinct ZKPs, TEEs, or other. The final group consists of solutions that rely on mechanisms that differ from all other works in this class.

7.1 Homomorphic MACs

We observe three different approaches in which homomorphic ciphertexts and homomorphic MACs are combined, thereby agreeing with [139]: (1) Encrypt-then-MAC; (2) Encrypt-and-MAC; (3) MAC-then-Encrypt.

7.1.1 Description

In the Encrypt-then-MAC method, one first homomorphically encrypts each input and only then computes a homomorphic MAC on each ciphertext. Therefore, the MAC does not have to be hiding. The function is then computed by evaluating it on both the input ciphertexts and corresponding MACs. A correct computation can be verified before decryption, by checking that the resulting MAC matches the resulting ciphertext.

Encrypt-and-MAC approaches have a homomorphic MAC and ciphertext that are independent of one another. Both are computed directly from the plaintext. The MAC, thus, needs to be both hiding and support the ciphertext operations required for the HE scheme. Also here, the requested function is computed on both the input MACs and the input ciphertexts. A correct computation can be checked before decryption, by verifying that the resulting MAC matches the resulting ciphertext.

We found MAC-then-Encrypt to be the most common approach of the three. Here, the homomorphic MAC is computed on the plaintext and concatenated to the plaintext before encryption. This removes the need for the MAC to be hiding or to support complex ciphertext maintenance operations; the MAC now only needs to support operations in the plaintext domain. In this case, the function is computed by executing it only on the input ciphertexts. A correct computation, however, can only be verified after decryption, by verifying that the decrypted MAC matches the decrypted result.
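To make the plaintext-domain requirement concrete, the following toy sketch pairs each plaintext with a linearly homomorphic tag tag(m) = α·m mod q (an illustrative construction of ours that supports additions only; real MAC-then-Encrypt schemes such as CF13 also support multiplications, and the encryption layer is omitted here entirely):

```python
import secrets

Q = 2**31 - 1  # toy plaintext modulus

# Secret MAC key; tag(m) = alpha * m (mod Q) is additively homomorphic:
# tag(m1) + tag(m2) = alpha * (m1 + m2).
alpha = secrets.randbelow(Q - 1) + 1

def mac(m):       return (alpha * m) % Q
def verify(m, t): return t == (alpha * m) % Q

# MAC-then-Encrypt: the pair (plaintext, tag) is what would be encrypted;
# the HE layer is modeled as the identity since we only show the MAC algebra.
def encode(m):    return (m % Q, mac(m))

# Homomorphic addition acts component-wise on (plaintext, tag) pairs,
# mirroring the additions the evaluator performs on ciphertexts.
def add(c1, c2):  return ((c1[0] + c2[0]) % Q, (c1[1] + c2[1]) % Q)

x, y = encode(17), encode(25)
m, t = add(x, y)                 # the evaluator computes on encoded values
assert verify(m, t)              # checked only after decryption
print(m)  # 42
```

Note that the check happens on decrypted values, which is exactly why this approach needs neither a hiding MAC nor MAC support for ciphertext maintenance, but also why verification cannot precede decryption.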

7.1.2 Evaluation

We only found one occurrence of the Encrypt-then-MAC approach dating from 2014: FGP14 [67]. The presented solution is rather limited, in that it only supports the computation of quadratic circuits. Due to the lack of other, more recent, works using this approach, it seems doubtful whether this approach can be improved to support general circuits. One would need to design a MAC scheme that supports all operations of the underlying HE scheme, including ciphertext maintenance operations. It is unclear whether this is possible at all.

LWZ18 [95] uses the Encrypt-and-MAC approach, to have more flexibility than in FGP14’s setting. However, this approach suffers from similar issues. While the homomorphic MAC of LWZ18 does support any amount of additions and multiplications, it does not support ciphertext maintenance operations, thus making bootstrapping impossible. This severely limits the multiplicative depth of the function to be computed, if one wants the HE computations to remain practical. To overcome this problem one would need to design a hiding, homomorphic MAC supporting these more complex HE operations.

The most promising, and most common, approach seems to be MAC-then-Encrypt, since in this case the homomorphic MAC only needs to support plaintext addition and multiplication. The first solutions of this type were still limited by technical constraints.

The first known approach, GW13 [72], verifies evaluations of boolean circuits. However, this approach was no more efficient than running the computation itself. An improvement upon this scheme, with more efficient verification of HE computations on arithmetic circuits, is given by CF13 [43]. CF13 is, however, limited to arithmetic circuits of bounded depth. CKPH22 [45] builds upon the techniques of CF13 by using a different encoding of the plaintext data. Because of this, CKPH22 does not have any of the prior limitations and can be used to verify HE evaluations of any arithmetic circuit.

Whilst the MAC-then-Encrypt approach is the most practical of the three, it should be noted that, contrary to the other methods, the MAC can now only be checked after decryption, and not before. This implies that some information could be leaked if the computation is incorrect. Finally, we note that none of these schemes are publicly verifiable, due to requiring the full secret key to verify the MAC.

7.2 ZKPs

Only in recent years have the first solutions that combine ZKPs with HE ciphertexts started to appear.

7.2.1 Description

All solutions that we observed make use of succinct ZKPs, specifically zk-SNARKs, to achieve verifiability. The (distributed) computation is executed directly on the homomorphic ciphertexts. The resulting (output) ciphertext is accompanied by a zk-SNARK proof, verifying that it is indeed the result of evaluating the function on the input ciphertexts.

The first solutions (FNP20 [68], BCFK20 [36]) aimed to improve upon FGP14 by adding public verifiability and supporting the evaluation of more than quadratic functions. Both require homomorphic hashing to make the ciphertexts fit inside the groups of the specific zk-SNARKs used. To do so, FNP20 requires very specific setup parameters for the HE scheme, leading to inefficient HE operations. BCFK20 improves on FNP20 by not restricting this parameter space.

However, both solutions are constrained to ‘simpler’ FHE schemes, i.e., without complex ciphertext maintenance operations, since these are not supported by the homomorphic hashing scheme used. GNS21 [69] improves upon these approaches, by proposing a method that does not require homomorphic hashes, but rather uses a zk-SNARK that natively operates on ring elements, i.e., HE ciphertexts.

An alternative approach to eliminate homomorphic hashing is to encode HE operations directly into an arithmetic circuit suitable for any circuit-zk-SNARK. Experimental results for this approach are discussed in VKH23 [139].

7.2.2 Evaluation

Two of the early solutions (FNP20, BCFK20) suffer from drawbacks that make them highly impractical to use. The restriction on the HE parameters in FNP20 makes the HE computations too slow to be practical, and the lack of support for ciphertext maintenance operations in both schemes puts an implicit bound on the multiplicative depth of the functions that can be computed.

The alternative solution of translating the HE operations, including ciphertext maintenance operations, directly into an arithmetic circuit (VKH23) makes it possible to support modern, efficient HE schemes. However, the complexity of HE operations, and the fact that HE ciphertexts do not naturally fit zk-SNARK groups, makes proof generation costly and impractical for realistic use cases.

The most promising solution using ZKPs is GNS21, a zk-SNARK that natively operates on HE ciphertexts (ring elements), and thereby significantly reduces the prover time for complex computations [139]. However, an open question for GNS21 is how to do ciphertext maintenance operations. This generally necessitates operations not supported by rings or switching between different rings, which is not supported by GNS21.

An advantage of zk-SNARK-based approaches is the fact that the proofs are publicly verifiable, unlike homomorphic MACs or TEE attestation. Moreover, proof sizes are succinct, verification times are small, and correctness of the resulting ciphertext can be verified before decryption. In fact, for all schemes discussed here, proof sizes are constant and verification times only depend on the public inputs. However, given the size of homomorphic ciphertexts this seems like the smaller issue, and a reduction in prover time will be more important. A downside of the zk-SNARKs used is their reliance on non-falsifiable knowledge assumptions, and the fact that most schemes require a trusted setup.

7.3 TEEs

HE computations can also be verified by executing them inside a TEE and using remote attestation to guarantee security.

7.3.1 Description

NLDD21 [111] presents such a construction for FHE-inside-TEE. Clients send their encrypted data to a TEE in the cloud and only need to verify the attestation that the TEE performs exactly the desired computation. This verification can take place before decrypting the results, allowing all parties to check correctness beforehand.

7.3.2 Evaluation

The biggest advantage of using a TEE is the low computation time, since TEEs natively support most operations needed for HE computations. Even though such operations will be slower than on a regular CPU, they are faster than computing, e.g., a zero-knowledge proof. Optimizing FHE computations for TEEs can lead to notable performance improvements [139], making the need for research into this topic clear.

The use of a TEE guarantees input privacy and correctness of the computation, given that one trusts the specialized hardware being used. It is however unclear how to achieve public verifiability, i.e., how attestation can be (efficiently) performed by external parties, or long after the computation has been executed.

7.4 Other

Below, we describe the remaining solutions that did not fit any of the above categories.

7.4.1 Description

Lou+23 [99] uses an approach for verifiable FHE called ‘blind hashing’, where the hash of the plaintext is appended to the plaintext before encryption, i.e., it is similar to MAC-then-Encrypt. The authors aim to improve upon all other HE-based paradigms, by offering a universal, scalable model with low overhead and without requiring specific hardware. The hash is verified by computing the hash of the decrypted result and checking that it equals the hash that was included in the ciphertext.
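The encode-and-check mechanics of blind hashing can be sketched as follows (a minimal illustration assuming SHA-256 as the hash; the FHE layer, and the question of how the embedded hash must track homomorphic operations on the plaintext, are elided):

```python
import hashlib

def blind_hash_encode(plaintext: bytes) -> bytes:
    """Append H(m) to m before encryption; the hash travels inside the ciphertext."""
    return plaintext + hashlib.sha256(plaintext).digest()

def verify_after_decrypt(decoded: bytes) -> bool:
    """After decryption, recompute the hash and compare it to the embedded one."""
    m, tag = decoded[:-32], decoded[-32:]
    return hashlib.sha256(m).digest() == tag

payload = blind_hash_encode(b"result of FHE computation")
# ... payload would be FHE-encrypted, computed on, and decrypted here ...
assert verify_after_decrypt(payload)

# A tampered result fails the check:
tampered = b"X" + payload[1:]
assert not verify_after_decrypt(tampered)
```

As the sketch shows, the check itself is cheap; the open question raised in the evaluation below is how the embedded hash stays consistent once homomorphic operations change the plaintext underneath it.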

GGP10 [70] can be seen as the first approach to combine verifiability with HE. It uses Yao’s GC to realize one-time verifiable computations between a client and server. Then, GGP10 adds FHE on top, to transform this one-time approach into a reusable one, allowing multiple evaluations of the same function on different input data. It should be noted that a different computation needs a new setup, and computing the setup is very inefficient.

7.4.2 Evaluation

Lou+23 is a preliminary publication that lacks a security proof/reasoning and only explains how to perform a single matrix multiplication. It is unclear how this approach extends to other operations, such as those needed for ciphertext maintenance and more complex circuits. We deem an actual MAC-then-Encrypt approach more viable than this approach based purely on hashing.

Published in 2010, GGP10 is the oldest verifiable HE approach we found. Whilst this scheme is efficient (Definition 13), the fact that each different computation requires a rather inefficient setup makes it impractical for real-world usage. Our literature search showed no later work using a similar approach. Altogether, this makes us conclude that this approach is not as viable as other HE-based solutions.

7.5 Comparison

Among the various solution approaches for HE-based VPPC schemes, we note three predominant categories, relying on homomorphic MACs, zk-SNARKs, or TEEs.

Of the homomorphic MAC approaches, MAC-then-Encrypt seems to be the most promising, since it places the fewest requirements on the homomorphic MAC used. A downside compared to the other methods is that one does need to decrypt the ciphertext before verifying the MAC; it is an open question how this issue can be resolved. Another problem with MAC-based approaches is the fact that one needs to know the secret key to verify the MAC, making public verifiability not directly possible. We expect it to be possible to solve this issue using other cryptographic techniques, such as ZKPs or MPC, but further research into this topic is needed.

zk-SNARK-based approaches do offer public verifiability. However, current solutions suffer from efficiency issues, making them impractical for larger computations, especially when ciphertext maintenance operations are required. The overhead caused by proof generation is much larger than that of homomorphic MACs or TEE attestation. However, when public verifiability is a requirement, zk-SNARKs are currently the only solution. We do expect that further improvements in zk-SNARKs will allow for more efficient proof generation, letting this method catch up regarding efficiency.

Another downside of zk-SNARKs with respect to MACs is that zk-SNARKs often require a trusted setup and are based on non-standard assumptions. Moreover, we speculate that current (succinct and non-succinct) ZKP schemes based on standard assumptions require too large a computational effort from the prover to be feasible. Further research into efficient versions of those schemes with respect to HE operations could make such solutions possible. Given that HE ciphertexts are already large by themselves, we expect that sacrificing proof size and verification time for reduced prover time is an interesting direction to investigate. Specifically, looking into non-succinct, prover-optimal ZKP schemes that are compatible with HE schemes seems like a promising direction, given the problems of current solutions in this direction.

TEE-based solutions seem to be the most practical of the three, especially with optimizations in place. A downside of TEE-based solutions, with respect to the other solutions, is the requirement to trust specific hardware. This trust may not always be present in practice. Another downside of TEE-based approaches with respect to zk-SNARKs is the lack of public verifiability.

All in all, if public verifiability is required, currently zk-SNARKs seem to be the best solution. However, further research into the other directions is expected to lead to more efficient alternatives. When public verifiability is not a requirement, the main trade-off is between trust in specific hardware and efficiency. This trade-off will be use case dependent and cannot be made in general.

Open challenges

  • Solutions relying on ZKPs based on standard cryptographic assumptions and without a trusted setup.

  • More efficient ZKPs, especially on the prover side, that natively support ring operations, with a specific focus on ciphertext maintenance operations and ring switching operations.

  • Public verifiability for homomorphic MACs or TEEs.

  • Realization of MAC-then-Encrypt constructions for which the MAC can be verified before decrypting the results.

  • Optimization of HE operations inside TEEs.

8 DLT-based VPPC

VPPC schemes that use DLT as the basis for their distributed computations can be divided into three groups, based on the mechanism used for verifiability.

8.1 Succinct ZKPs

While most DLT applications using succinct ZKPs focus purely on financial transactions, e.g., Zerocash [29], we also identified works that focus on privacy and verifiability for smart contracts.

8.1.1 Description

Kos+16 [92] is a smart contract system that can be initialized over any decentralized cryptocurrency. It has a basic protocol for transferring coins to another party anonymously in the same fashion as Zerocash. However, it also allows for combining a coin transfer with programmable logic, in which the function to be computed is provided as a smart contract. To achieve this, the participating users do not directly spend a private coin, but commit both a coin and their private function input to a smart contract, along with a zk-SNARK proof of correct construction. Each user also includes an opening for the function input, encrypted under the public key of a trusted manager. The actual computation of the function takes place off-chain by this trusted manager, who first opens the commitments to all inputs. The manager then computes the function and publishes the output, along with a zk-SNARK proof and a spending distribution of the committed coins, thereby completing both the transaction and the computation, whilst keeping the inputs private, i.e., known only to the trusted manager.
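The commit-and-reveal flow around the trusted manager can be simulated as follows (a toy sketch: plain hash commitments stand in for the zk-SNARK-backed commitments, and both the encryption of openings to the manager and the proofs themselves are elided):

```python
import hashlib
import secrets

def commit(value: int, blinding: bytes) -> bytes:
    """Hash commitment standing in for the zk-SNARK-backed commitment."""
    return hashlib.sha256(value.to_bytes(8, "big") + blinding).digest()

# Each user commits a private function input to the smart contract; the
# opening (value, blinding) would be sent encrypted to the trusted manager.
inputs = [11, 13, 18]                     # private inputs
openings = [(v, secrets.token_bytes(16)) for v in inputs]
contract_state = [commit(v, r) for v, r in openings]

# Off-chain, the manager opens all commitments and checks them ...
for c, (v, r) in zip(contract_state, openings):
    assert commit(v, r) == c
# ... evaluates the agreed function, and publishes the output
# (in Kos+16 together with a zk-SNARK proof of correct evaluation).
output = sum(v for v, _ in openings)
print(output)  # 42
```

The binding property of the commitments is what lets the published zk-SNARK proof convince on-chain observers that the output was computed over exactly the committed inputs, even though only the manager ever sees the openings.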

Bow+20 [39] uses a different construction, called decentralized private computation (DPC), that improves upon Kos+16 in multiple ways. It does not rely on a trusted manager for input privacy, and keeps the function itself private too. In a DPC, each blockchain entry is a record containing input data, a birth predicate, and a death predicate. The birth predicate defines under which conditions this record was constructed; the death predicate defines under which conditions the record may be consumed. Any user of the blockchain system can consume any records whose death predicates can be satisfied and use these to create new records, with new payloads and predicates, i.e., perform a state transition. Since death predicates can put conditions on all details of the newly created records and on the users that may perform a state transition, this system can be used to model a smart contract. Bow+20 guarantees anonymity of the payload, or input data, by only adding commitments to the records on-chain. Any valid transaction that consumes and creates records has to add a zk-SNARK proof attesting to the correct creation of the new records, and to the fact that the user knows a valid proof for the predicates of each consumed record.
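The record model can be illustrated with a minimal sketch. The `Record` type, the conservation predicate, and the in-the-clear predicate checks below are our own illustrative stand-ins; in Bow+20 the payload exists on-chain only as a commitment and the predicate checks are proven in zero knowledge:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Record:
    payload: int                        # private data (on-chain: only a commitment)
    birth: Callable                     # conditions under which this record was made
    death: Callable                     # conditions under which it may be consumed

def transition(consumed: list, created: list) -> bool:
    """A state transition is valid iff every consumed record's death predicate
    and every created record's birth predicate hold. In Bow+20 this check is
    proven with a zk-SNARK; here it simply runs in the clear."""
    return (all(r.death(r, created) for r in consumed)
            and all(r.birth(r) for r in created))

# Hypothetical contract: a record may only be consumed if the new records
# preserve the total payload (a conservation rule, like a coin balance).
conserve = lambda r, new: sum(n.payload for n in new) == r.payload
always = lambda r: True

old = Record(10, always, conserve)
new = [Record(4, always, conserve), Record(6, always, conserve)]
print(transition([old], new))   # True: 4 + 6 == 10
```

A death predicate can encode arbitrary conditions on the successor records and their owners, which is exactly what makes this model expressive enough to emulate smart contracts.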

8.1.2 Evaluation

By using zk-SNARKs, Kos+16 offers public verifiability of the correctness of the executed computations, given that the used zk-SNARK is secure, irrespective of the behavior of the trusted manager. Additionally, since zk-SNARKs have very small proof sizes and verification times, they are well suited to a DLT environment. Privacy of function inputs is guaranteed, but depends on the trusted manager not revealing any of these inputs, which is an unrealistic assumption in practice.

Bow+20 guarantees input privacy without such a trusted manager. Moreover, it also adds function privacy towards any other observer on the blockchain: only the party consuming a record needs to know the functions, or predicates, used to create this record. A downside is that, where in Kos+16 a complete computation is performed at once, in Bow+20 a computation is performed per party, given the (intermediate) state. This leads to longer computation times. Moreover, by default, Bow+20 does not keep record data private from the party consuming the record; one would still have to rely on HE- or MPC-style computations to achieve this.

Public verification of the correctness of the executed computations is very efficient. Moreover, the actual function can be computed locally. However, most computation time will be consumed by proof generation. In the case of Bow+20, waiting for verification of previous blocks containing input records before the next step can be performed may add considerable latency on top of this.

Lastly, we feel that the reliance on non-standard cryptographic assumptions that comes with the use of zk-SNARKs is sensible for DLTs. Constant proof sizes and computation-independent verification times are essential to keep DLTs scalable and practical; without them, generic computations seem impossible in this paradigm. However, the use of a trusted setup within a DLT feels contradictory. Oftentimes, a DLT is used to distribute trust as much as possible; a trustless zk-SNARK seems more compatible with this philosophy.

8.2 Non-succinct ZKPs

We also found one solution using non-succinct ZKPs, making a different trade-off between security and efficiency.

8.2.1 Description

BAZB20 [40] presents a privacy-preserving payment system for smart contract platforms using \(\varSigma \)-bullets. It improves upon the approach of Kos+16 by offering a completely trustless mechanism, but this comes at the cost of supporting less generic computations. \(\varSigma \)-bullets combine the optimizations of bulletproofs [41] with classic \(\varSigma \)-protocol theory to create more efficient \(\varSigma \)-style proofs. Where non-private cryptocurrencies require access to the transaction details to verify each transaction, BAZB20 only adds commitments to the transaction details on the public ledger. These commitments are accompanied by \(\varSigma \)-bullets, which attest to exactly the predicates that are normally checked during verification. Rather than checking these predicates directly, any verifier can now check the correctness of the provided proof to obtain the same guarantees.
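The core mechanism, checking transaction validity over commitments instead of plaintext amounts, can be illustrated with Pedersen commitments, whose additive homomorphism lets a verifier confirm that inputs and outputs balance. The group parameters below are insecure toy values chosen purely for illustration, and the accompanying range proofs (the actual \(\varSigma \)-bullets) are omitted:

```python
# Toy Pedersen commitments over a small prime group (insecure parameters).
p = 2**61 - 1          # group modulus (toy choice)
g, h = 5, 7            # generators, assumed independent (toy choice)

def pedersen(v: int, r: int) -> int:
    """Commit to value v with blinding randomness r: C = g^v * h^r mod p."""
    return (pow(g, v, p) * pow(h, r, p)) % p

# A verifier can check that a transaction's inputs equal its outputs using
# only the homomorphism: C(v1, r1) * C(v2, r2) = C(v1 + v2, r1 + r2).
c_in  = pedersen(40, 11)
c_out = (pedersen(25, 5) * pedersen(15, 6)) % p   # 25 + 15 = 40, 5 + 6 = 11
print(c_in == c_out)   # True: balance preserved, amounts never revealed
```

In the real system, range proofs over these commitments additionally rule out negative amounts, which the balance check alone cannot detect.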

8.2.2 Evaluation

The advantage of \(\varSigma \)-bullets over zk-SNARKs is that they only rely on discrete-log-type cryptographic assumptions and do not require a trusted setup, leading to stronger privacy and security guarantees whilst still providing public verifiability. This comes at the cost of computation-dependent verification times and logarithmic proof sizes. BAZB20 only offers private coin transfer, and does not support generic computations.

While \(\varSigma \)-bullets can be used for generic computations [13], this would lead to verification times that are likely too large for a DLT setting. The logarithmically scaling proof sizes need not be a problem on smaller DLT systems, but might cause issues at larger scales. It is doubtful whether non-succinct ZKPs are suitable for generic computations in a DLT setting.

8.3 TEEs

Che+19 [48] is a blockchain solution where smart contracts over private data are executed off-chain. This work aims to improve upon the prior ZKP-based approaches, e.g., Kos+16 and BAZB20, by having a much lower overhead.

8.3.1 Description

In Che+19, specialized compute nodes with TEEs execute a smart contract and compute the results. This is done by letting the clients put their inputs on-chain and assigning them to a smart contract. Next, the specialized compute nodes execute the contract and obtain the results. The consensus nodes use remote attestation to verify the results provided by the compute nodes and update the smart contract results on-chain accordingly.

8.3.2 Evaluation

Rather than relying on expensive cryptographic machinery or trusted parties, Che+19 uses trusted hardware to guarantee correctness whilst maintaining privacy. Whilst TEEs are slower than computations on regular CPUs, they are multiple orders of magnitude faster than generating zero-knowledge proofs [139]. However, this comes at the cost of relying upon the privacy and security guarantees of the trusted hardware. If the hardware is compromised or faulty, privacy and security could be broken as a whole. To what extent one can and should trust TEE hardware is an open question; the technology may need to be around for longer to gain more trust.

Another open question is that of public verifiability. Different TEEs have different remote attestation mechanisms, making this a complicated issue. It is unclear whether one can achieve public verifiability, or even verify correctness long after the computation has taken place. Moreover, since not all parties may have the means or access to perform remote attestation, Che+19 puts trust in a select group of nodes to perform verification. This may not be desirable in all cases.

8.4 Comparison

DLT applications often require very small verification times and message sizes to make block verification practical and keep the size of the ledger manageable. In virtually all use cases, this clearly outweighs the downside of using non-standard cryptographic assumptions. Due to the larger verification times and proof sizes of non-succinct ZKP approaches, we do not expect them to be suitable for generic computations in a DLT setting. When considering ZKPs, zk-SNARKs therefore seem the logical choice, specifically those with constant proof sizes and near-constant verification times.

The DPC approach described by Bow+20 looks very promising. Next to offering input privacy and public verifiability, Bow+20 also guarantees function privacy, which we have not observed in any other VPPC scheme. However, open questions remain on how to improve the efficiency of composable zk-SNARKs and how to remove trusted setups. Moreover, DPC-based approaches do not hide the input data (or function to be computed) from the other parties involved in the computation. This could be solved using, e.g., an MPC-based computation. However, direct application of MPC, without DLT, is likely more efficient in that case; i.e., there should be clear, additional benefits of using DLTs before choosing them over other approaches.

An alternative to a ZKP-based approach is to perform computations using a TEE. This is more efficient than using ZKPs, but does require the user to have trust in and access to specific trusted hardware. It is doubtful whether this is practical in a DLT setting, mostly because TEE hardware is often only available centrally, thereby defeating the purpose of DLT.

Each promising approach requires trust in something other than standard cryptographic assumptions. We do not expect this to be circumventable in the near future. The current requirements on message sizes and verification times in DLT settings leave no other option than to accept such otherwise less desirable trust assumptions.

Open challenges

  • More efficient, composable zk-SNARK constructions.

  • Schemes using ZKPs without trusted setup.

  • Efficient solutions to keep data and/or functions private from other computation parties.

9 DP-based VPPC

All VPPC schemes based on DP make use of ZKPs in some form. Most of the schemes below aim to provide security and privacy in a different setting. Therefore, the works can be seen as alternatives for different settings, rather than improvements upon one another, unless otherwise specified.

9.1 Description

The first verifiable DP approach we found in our search was NFPH15 [110]. In NFPH15, a trusted curator holds all data of the individual parties and publishes a public commitment to the dataset. A vetted analyst can study the data and determine a query it wants to evaluate. The curator confirms that the query is indeed differentially private and subsequently generates the corresponding keys for the ZKP generation. Next, the analyst executes the query and publishes the DP result along with a publicly verifiable ZKP, which can be verified by any external party.

TM23 [136] uses a similar setting, but improves upon NFPH15 by ensuring that the input data is kept private from the analyst. However, the authors do not provide sufficient details to determine which concrete building blocks could be used to realize this scheme.

BC23 [32] presents another scheme for verifiable DP in the trusted curator model, but offers an alternative solution in which the clients secret-share their data over multiple servers that together emulate the trusted curator using MPC, thereby increasing the privacy guarantees. Rather than relying on NIZK proofs as in the previous two constructions, BC23 takes a completely different approach and relies on interactive \(\varSigma \)-protocols to convince a public verifier of the validity of the results. If this public verifier is a trusted party, BC23 additionally obtains public verifiability by releasing the transcripts of an interaction.
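The idea of emulating the curator by secret-sharing client data over several servers can be sketched with additive secret sharing. This toy sketch computes a simple sum query and omits both the DP noise addition and the \(\varSigma \)-protocol verification layer of BC23:

```python
import secrets

P = 2**61 - 1   # arithmetic modulo a prime (toy parameter choice)

def share(x: int, n: int) -> list:
    """Split x into n additive shares: any n-1 shares reveal nothing about x."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

# Each client shares its value over three servers; the servers jointly
# emulate the curator by aggregating locally, then recombine only the result.
clients = [12, 7, 30]
per_server = [share(x, 3) for x in clients]               # clients -> servers
server_sums = [sum(col) % P for col in zip(*per_server)]  # local aggregation
print(sum(server_sums) % P)   # 49: the query result; no single server saw an input
```

In the full scheme the servers would additionally add calibrated noise inside the MPC before recombining, so that the released result is differentially private.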

As an alternative to the trusted curator setting, MMT23 [105] presents a framework for verifiable LDP in a DLT setting. They show how a client holding certain private attributes, which are committed to on a public blockchain, can reveal a verifiably DP version of an attribute to an analyst. Client and analyst first perform a protocol to jointly generate true randomness. At a later point in time, the client uses this randomness to verifiably randomize a selected attribute. Public verifiability is guaranteed using zk-SNARKs. While verification of the subsequent computations by the analyst is not discussed, we expect that this could be done using similar techniques.
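The joint randomness generation can be illustrated with a classic commit-reveal coin flip, which prevents either party from biasing the outcome. This is our own minimal sketch of the underlying idea, not MMT23's actual protocol, and the zk-SNARK layer is omitted:

```python
import hashlib
import secrets

def commit(value: bytes) -> tuple[bytes, bytes]:
    """Hash-based commitment with a random nonce as the opening."""
    nonce = secrets.token_bytes(32)
    return hashlib.sha256(nonce + value).digest(), nonce

# Step 1: the client commits to its random share and publishes the commitment.
client_rand = secrets.token_bytes(16)
com, nonce = commit(client_rand)

# Step 2: the analyst replies with its own share (it cannot adapt to the
# client's hidden share, and the client is already bound by the commitment).
analyst_rand = secrets.token_bytes(16)

# Step 3: the client opens; the analyst verifies; both derive the joint coin.
assert hashlib.sha256(nonce + client_rand).digest() == com
joint = bytes(a ^ b for a, b in zip(client_rand, analyst_rand))
print(len(joint))   # 16 bytes of unbiased joint randomness
```

In MMT23 the client later uses such jointly generated randomness inside the randomizer, and the zk-SNARK proves the randomization was performed with exactly that randomness.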

In the same LDP setting, KCY21 [88] makes use of \(\varSigma \)-protocols and techniques from cryptographic randomized response [7] to ensure that clients provide honestly randomized values to the analyst. They describe how to verifiably execute three state-of-the-art LDP protocols on discrete, private inputs.
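For binary inputs, the underlying (unverified) randomizer is classic randomized response. The sketch below shows the \(\varepsilon \)-LDP flip probability and the standard de-biasing step an analyst applies; the \(\varSigma \)-protocol layer that KCY21 adds to enforce honest randomness is omitted:

```python
import math
import random

def randomized_response(bit: int, eps: float) -> int:
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it:
    the classic eps-LDP randomizer for binary inputs."""
    p_truth = math.exp(eps) / (math.exp(eps) + 1)
    return bit if random.random() < p_truth else 1 - bit

# Unbiased estimate of the population mean from n noisy reports:
# E[report] = (2p - 1) * mean + (1 - p), so invert that affine map.
eps, n = 1.0, 100_000
true_bits = [1] * 60_000 + [0] * 40_000          # ground truth: mean 0.6
reports = [randomized_response(b, eps) for b in true_bits]
p = math.exp(eps) / (math.exp(eps) + 1)
estimate = (sum(reports) / n - (1 - p)) / (2 * p - 1)
print(round(estimate, 2))   # close to 0.6
```

The \(\varSigma \)-protocols in KCY21 ensure that a malicious client cannot skip the flip step and report a value of its choosing, which would otherwise silently bias the analyst's estimate.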

Table 3 High-level comparison of solution paradigms
Table 4 Summary of open challenges per solution paradigm

9.2 Evaluation

The field of DP-based VPPC schemes is still rather novel and there seems to be much potential for improvements and alternative constructions, to provide stronger privacy and security guarantees or to be applicable in more diverse settings. We observe that the LDP methods (MMT23, KCY21) provide stronger privacy guarantees for the clients’ data. However, both works only present ways to protect the privacy of binary values or values in a small discrete range, using a specific type of randomized response.

Extensions to other settings, or alternative perturbation methods, would allow for applicability in a much wider range of settings. Moreover, these methods focus purely on verifiable randomization of the input data, but do not provide similar guarantees for the function that is evaluated on that data. Regarding verifiability, the two works use very different approaches: MMT23 uses non-interactive zk-SNARKs with constant proof sizes and near-constant verification times, whereas KCY21 uses short, but interactive, proofs.

Given that LDP systems generally have one server and a large number of clients, it seems essential to keep proofs and verification times as small and non-interactive as possible. The statement sizes for these works are rather small, so the distinction is less clear, but we suspect that, for more complex randomization mechanisms, succinct ZKPs will have a clear benefit over non-succinct ones. Specifically, we expect the advantages of ZKPs with a constant proof size to outweigh the drawbacks of their non-falsifiable assumptions in practical systems with tens of thousands of clients.

For solutions in the trusted curator model (TM23, NFPH15, BC23), we do see verification of the function evaluation. However, this comes at the cost of a trusted curator, who gets access to the private input data. Fortunately, we see possibilities for providing even stronger privacy guarantees in this setting by using MPC or HE to protect the private data. This would be an improvement upon BC23, which already distributes the curator role.

Next, we observe that TM23 and NFPH15 both rely on zk-SNARKs to support generic circuit computations. However, given that the function evaluation is often only performed for one specific query of a specific client, we question whether the advantage of having constant proof sizes and near-constant verification times is worth the large setup costs. We suspect that a zk-SNARK with a universal setup would be a better choice than the currently adopted computation-specific ones. Moreover, one could remove the need for a trusted setup by using a trustless zk-SNARK, since a trusted setup seems rather incompatible with most DP use cases in the trusted curator setting.

The approach adopted by BC23, using interactive \(\varSigma \)-protocols, seems more sensible for this setting. DP queries are often relatively simple, and interaction between client and curator is already needed when the client submits their query, or for generating randomness. Moreover, \(\varSigma \)-protocols do not require any non-standard cryptographic assumptions, and are much easier to analyze cryptographically.

In general, we observe that verifiable LDP methods require communication between client and server to generate randomness. This can become a burden when the number of clients increases. In order to make these methods practical for real-world applications, a reduction in the amount of communication is needed. We expect that using novel, specialized approaches for generating honest randomness will be an important first step in this direction.

Open challenges

  • Verifiable LDP for real-valued input data.

  • Stronger privacy guarantees in the trusted curator model.

  • Reducing communication between clients and curator, especially with respect to generating randomness.

10 Future directions

In the discussions above, we have classified and described the state-of-the-art in VPPC schemes on distributed data. Furthermore, we have compared different paradigms and identified open challenges and limitations of existing technologies (see Tables 3 and 4).

Based on the identified challenges and comparison of paradigms, we now take a critical look at which future research directions seem most promising or are required to further advance this field.

Post-quantum Security As observed in Section 5.3, the PPC approach itself, e.g., MPC or HE, is in general post-quantum secure. However, this is not the case for the other primitives used in VPPC constructions. In particular, this mostly concerns commitment schemes and ZKPs, for which post-quantum secure solutions do exist, but were only rarely observed in state-of-the-art VPPC schemes.

For commitments we see promise in lattice-based constructions [22] in combination with lattice-based HE or MPC as in RRRK22 [124]. Lattice-based commitments also form a major building block in lattice-based \(\varSigma \)-protocols (non-succinct ZKPs), such as [14], which could potentially be applied in novel VPPC schemes. For succinct ZKPs, we see a lot of promise in recent improvements on Fractal [51], such as [76]. More research into these directions is needed to improve practical efficiency and be able to compete with non-post-quantum secure alternatives.

Modularity VPPC schemes generally consist of several building blocks. Therefore, it would be an important practical improvement to design modular schemes that allow primitives to be easily exchanged for novel, more efficient ones. Designing ‘Lego’-like constructions for each of the paradigms, e.g., MPC + ZKP or HE + MAC, would be a major advancement in this field, following trends in related fields, e.g., for stand-alone ZKPs [42].

Dishonest inputs Dealing with dishonest inputs is not the most difficult technical challenge; however, its practical importance is often overlooked in current solutions. We speculate that existing building blocks, such as digital signatures and commitments, should be sufficient to guarantee honest usage of inputs. Two important directions for future research are: (1) how to combine these techniques efficiently with verifiability approaches; and (2) how to design the threat model. Designing alternatives to, e.g., [16], that can be used in alternate settings and threat models is expected to greatly improve practical adoption of VPPC schemes.

Zero-Knowledge Proofs Improving the efficiency and decreasing the proof size of ZKPs, as well as dealing with trusted setups, is a very active field of research delivering improvements at an impressive rate. However, the majority of this research focuses on improvements for generic proof systems, which works for many VPPC schemes, but not all.

Specifically, the emerging field of HE-tailored ZKPs has shown potential to considerably outperform prior approaches: GNS21 [69] shows very promising results for ring-based HE. Therefore, we signal the development and improvement of ZKPs for ring-, lattice-, and code-based HE as an important research direction. Moreover, such ZKPs can lead to plausibly post-quantum secure constructions, due to the underlying structures used. We also strongly recommend looking into prover-optimal (often non-succinct) ZKPs on HE ciphertexts, since current solutions mostly suffer from prover-side computation costs. Due to the large size of HE ciphertexts, the increased proof sizes and verification times will be less noticeable.

Homomorphic MACs Homomorphic MACs seem to be, by a large margin, the most efficient way to attain verifiable HE. However, to become fully practical, two limiting problems have to be overcome.

Firstly, the design of homomorphic MACs that can deal with the ciphertext maintenance operations of HE schemes is essential in realizing MAC-then-Encrypt or Encrypt-and-MAC constructions. CKPH22 [45] shows how to overcome this problem for some HE schemes, but further research on extending and improving this method is needed.

Secondly, in MAC-then-Encrypt schemes, public verifiability is hindered by the fact that verification can only be done after decryption of the result. However, this result is not always intended for public release, making this paradigm unsuitable in such situations. We suspect that this problem can be overcome using techniques from MPC, e.g., masking, secret sharing, or partial decryptions, to evaluate the MAC without revealing the result.

Public verifiability in TEE We have observed that public verifiability is oftentimes practically complicated, or not discussed, in TEE-based approaches, making them less suitable for a variety of use cases. Especially hindering is the fact that it is hard for a member of the general public to verify that the software running inside a TEE indeed performs the intended computation, without any leakage. Moreover, public verifiability most often still relies on having additional trust in the specific hardware vendor, which is not always desirable. We do not directly see an approach for solving the problem of public verifiability in TEEs, but encourage further research in this direction.

DP-based VPPC The class of DP-based VPPC is less developed than the other classes, and runs into different challenges. For future research, we recommend focusing on providing solutions for different settings, rather than optimizing existing approaches. Different settings can be centered around, e.g., (not) having a trusted curator, verifiable DP in the shuffle model, or different trust models. We specifically note that we have not observed any VPPC solutions dealing with real-valued data, or complex (possibly stateful) randomizers, whereas these are already being used in practice. Therefore, offering verifiable solutions for these approaches seems a vital first step.

Functional Encryption As discussed in Section 5.4, FE shows potential as a basic building block for VPPC schemes. Specifically, the more recent works on verifiable multi-client FE, distributed authorities, and public verifiability seem to warrant further investigation. Practical, verifiable FE for generic computations has not been achieved at this moment. Admittedly, there are some promising works regarding verifiable FE for, e.g., inner product computations. Further research is needed to see whether these paradigms can be efficiently extended towards general computations.

11 Conclusion

In this study, we presented a systematic overview of VPPC schemes applicable to settings with distributed data, and identified four main classes: MPC-, HE-, DLT-, and DP-based. Each class is best suited to different scenarios, depending on the available resources. Next, we analyzed the solutions in each class by subdividing the classes based on the verifiability paradigm used, and identified the most pressing open challenges. A high-level summary is depicted in Tables 3 and 4.

Based on our analysis of the literature, our high-level conclusions can be summarized as follows. DP-based VPPC schemes are generally the most efficient, but often offer a weaker form of privacy. Moreover, these schemes are rather novel and require more research to be suitable for practical use cases. For DLT-based approaches, the use of succinct ZKPs for verifiability seems the most promising approach, given the constraints on verification time and proof size. MPC-based solutions predominantly use ZKPs for verifiability. Those using succinct ZKPs are significantly more efficient than those based on non-succinct ZKPs; however, this comes at the cost of non-standard security assumptions and trusted setups, and is not worthwhile in all use cases. Finally, for HE-based approaches, constructions using TEEs or homomorphic MAC-then-Encrypt seem most promising, but still suffer from practical limitations.

Following our discussion on the limitations of current solutions, we have defined several promising areas for future research. Of these directions, in our eyes, practical adoption would benefit most from developing modular solutions, as well as from preventing malicious parties from providing dishonest inputs. Finally, we emphasize the need for designing future-proof, post-quantum secure solutions.