Memory leak when parsing a protobuf message with duplicate fields

Hello,

While fuzzing a project that relies on Nanopb to parse (untrusted) user input, I found a memory leak that is triggered by a message in which a field is duplicated.

**Steps to reproduce the issue**

To test this memory leak on several versions of Nanopb (and several Linux distributions), I wrote the following script:
```sh
#!/bin/sh
# Reproduce a memory leak issue in nanopb parser
#
# Dependencies on Debian: sudo apt install clang git protobuf-compiler python3 python3-protobuf
set -e -x

# Clone nanopb
if ! [ -d nanopb ] ; then
    git clone https://github.com/nanopb/nanopb
fi

# Create a protobuf file for some message with a header
cat > mypackage.proto << EOF
syntax = "proto3";
package mypackage;

import "nanopb.proto";

message HeaderField {
  bytes mydata = 1 [(nanopb).type = FT_POINTER];
}

message Header {
  option (nanopb_msgopt).anonymous_oneof = true;
  oneof one {
    HeaderField field = 1;
  }
}

message MessageWithHeader {
  Header head = 1;
}
EOF

# Create a fuzzer on this message
cat > fuzz_decode_message.c << EOF
#include <stdint.h>
#include <stdio.h>

#include <pb_decode.h>
#include "mypackage.pb.h"

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    mypackage_MessageWithHeader req = {};

    pb_istream_t is = pb_istream_from_buffer(data, size);
    if (!pb_decode(&is, mypackage_MessageWithHeader_fields, &req)) {
        printf("Failed to decode input: %s\n", PB_GET_ERROR(&is));
        return 0;
    }
    printf("Parsing ok, req.head.which_one = %u\n", req.head.which_one);
    pb_release(mypackage_MessageWithHeader_fields, &req);
    return 0;
}
EOF

# Compile the .proto and the fuzzer
protoc \
    -Inanopb/generator \
    -Inanopb/generator/proto \
    -I. \
    --plugin=protoc-gen-nanopb=nanopb/generator/protoc-gen-nanopb \
    --nanopb_opt= \
    --nanopb_out=. \
    mypackage.proto

clang -g -ggdb -O1 -fsanitize=fuzzer,address,undefined \
    -Wall -Wextra -Inanopb -DPB_ENABLE_MALLOC -DPB_FIELD_32BIT \
    -o fuzz_decode_message.out \
    fuzz_decode_message.c mypackage.pb.c nanopb/pb_decode.c nanopb/pb_common.c

# Run on a test case that leaks some bytes
python3 -c 'import sys;sys.stdout.buffer.write(bytes.fromhex("0a06 0a020a00 0a00"))' > memleak_message
./fuzz_decode_message.out memleak_message
```
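
To make the duplication visible, here is a small Python sketch that walks the test case bytes by hand. It is not part of the reproduction script; `parse_fields` is a hypothetical helper written for this report, and it assumes every tag and length fits in a single byte and every field is length-delimited, which holds for this input only.

```python
def parse_fields(buf: bytes):
    """Yield (field_number, payload) for each length-delimited field in buf.

    Assumption: tags and lengths are single-byte varints (true for this test case).
    """
    pos = 0
    while pos < len(buf):
        tag, length = buf[pos], buf[pos + 1]
        # Wire type 2 means "length-delimited" (bytes, strings, submessages)
        assert tag & 7 == 2 and tag < 0x80 and length < 0x80
        yield tag >> 3, buf[pos + 2 : pos + 2 + length]
        pos += 2 + length


message = bytes.fromhex("0a06 0a020a00 0a00")

# MessageWithHeader contains a single field: head = 1
((head_number, header_body),) = parse_fields(message)

# Header contains field number 1 twice: this is the duplicated field
occurrences = list(parse_fields(header_body))
for number, payload in occurrences:
    print(f"Header field {number}: {payload.hex() or '(empty)'}")
```

The first occurrence of `Header.field` carries a `HeaderField` whose `mydata` is present (with length zero), and the second occurrence is an empty `HeaderField`.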

**What happens?**

On an up-to-date Debian 10 machine, this leads to the following output:
```text
./fuzz_decode_message.out: Running 1 inputs 1 time(s) each.
Running: memleak_message
Parsing ok, req.head.which_one = 1
Parsing ok, req.head.which_one = 1

=================================================================
==3937==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4 byte(s) in 1 object(s) allocated from:
    #0 0x4f25a2 in realloc (/fuzz_decode_message.out+0x4f25a2)
    #1 0x536a80 in allocate_field /nanopb/pb_decode.c:581:11
    #2 0x533f3a in pb_dec_bytes /nanopb/pb_decode.c:1479:14
    #3 0x52ed88 in decode_pointer_field /nanopb/pb_decode.c
    #4 0x525632 in pb_decode_inner /nanopb/pb_decode.c:1083:14
    #5 0x5359cd in pb_dec_submessage /nanopb/pb_decode.c:1589:18
    #6 0x52d008 in decode_static_field /nanopb/pb_decode.c:532:20
    #7 0x525632 in pb_decode_inner /nanopb/pb_decode.c:1083:14
    #8 0x5359cd in pb_dec_submessage /nanopb/pb_decode.c:1589:18
    #9 0x52cea9 in decode_static_field /nanopb/pb_decode.c
    #10 0x525632 in pb_decode_inner /nanopb/pb_decode.c:1083:14
    #11 0x526c24 in pb_decode /nanopb/pb_decode.c:1159:14
    #12 0x52143d in LLVMFuzzerTestOneInput /fuzz_decode_message.c:11:10
    #13 0x42edfa in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/fuzz_decode_message.out+0x42edfa)
    #14 0x422003 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/fuzz_decode_message.out+0x422003)
    #15 0x426b31 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/fuzz_decode_message.out+0x426b31)
    #16 0x44a3f2 in main (/fuzz_decode_message.out+0x44a3f2)
    #17 0x7fd6f93e609a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).

INFO: a leak has been found in the initial corpus.
```

With my program, `0a06 0a020a00 0a00` leaks 4 bytes, `0a0a 0a020a00 0a020a00 0a00` leaks 8 bytes, etc.
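
The two inputs above follow the same pattern, which can be sketched in Python as follows. The helper names `length_delimited` and `build_test_case` are hypothetical and only serve this illustration; the sketch assumes tags and lengths fit in one byte, so it only works for small `n`. I have only measured the leak for the two inputs quoted above.

```python
def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited (wire type 2) protobuf field.

    Assumption: tag and length each fit in a single-byte varint.
    """
    tag = (field_number << 3) | 2
    assert tag < 0x80 and len(payload) < 0x80
    return bytes([tag, len(payload)]) + payload


def build_test_case(n: int) -> bytes:
    """Build a MessageWithHeader where Header.field occurs n + 1 times."""
    # HeaderField with mydata present but zero-length: 0a00
    header_field = length_delimited(1, b"")
    # Header.field carrying that HeaderField: 0a020a00
    populated = length_delimited(1, header_field)
    # Header.field carrying an empty HeaderField: 0a00
    empty = length_delimited(1, b"")
    # MessageWithHeader.head wrapping all occurrences
    return length_delimited(1, populated * n + empty)


print(build_test_case(1).hex())  # 0a060a020a000a00
print(build_test_case(2).hex())  # 0a0a0a020a000a020a000a00
```

The output can be written to a file and passed to `./fuzz_decode_message.out` in the same way as `memleak_message`.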

**What should happen?**

I believe that parsing untrusted input should not leak allocated memory. If you disagree, it would be nice to state in https://github.com/nanopb/nanopb/security/policy that Nanopb may leak memory when parsing maliciously crafted untrusted data.

Memory leak when parsing a protobuf message with duplicate fields #615