Table 1 Summary of data used for training and testing the LungCNN-Histo and the TMB classification models.

From: Comparative analysis of machine learning approaches to classify tumor mutation burden in lung adenocarcinoma using histopathology images

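| Model | LungCNN-Histo | LungCNN-Histo | WS-S1 | WS-S1 | LungCNN-TMB and WS-TMB | LungCNN-TMB and WS-TMB | LungCNN-TMB and WS-TMB |
|---|---|---|---|---|---|---|---|
| Dataset name | Histo-Train | Histo-Test | WS-Train | WS-Train | TMB-Train | TMB-Test | TMB-Test |
| Data source | TCGA LUAD | TCGA LUAD | DS2 | TCGA LUAD | TCGA LUAD | TCGA LUAD (unseen sites) | TCGA LUAD (seen sites) |
| Number of cases (number of slides) | 64 (68) | 38 (40) | 50 (50) | 317 (261) | 242 (295) | 84 (84) | 88 (93) |
| Number of tissue source sites | 22 | 20 | N/A | 23 | 21 | 10 | 17 |
| Age range (median) | 38–84 (67.5) | 48–83 (68.5) | 40–70 (56) | 33–87 (67) | 33–87 (67) | 42–84 (64) | 41–88 (69) |
| Pathologic stage: I | 37 | 20 | 35 | 148 | 138 | 40 | 56 |
| Pathologic stage: II | 15 | 11 | 9 | 60 | 54 | 27 | 17 |
| Pathologic stage: III | 8 | 6 | 6 | 37 | 35 | 10 | 12 |
| Pathologic stage: IV | 4 | 1 | 0 | 15 | 15 | 7 | 3 |
| Pathologic stage: N/A | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Sex: Female | 40 | 17 | 17 | 144 | 134 | 44 | 49 |
| Sex: Male | 24 | 21 | 33 | 117 | 108 | 40 | 39 |
| Smoking status: Non-smoker | 26 | 17 | 0 | 96 | 89 | 38 | 36 |
| Smoking status: Smoker | 30 | 21 | 0 | 157 | 153 | 46 | 52 |
| Smoking status: N/A | 8 | 0 | 50 | 8 | 0 | 0 | 0 |
| TMB status: Low | 47 | 28 | N/A | 182 | 167 | 61 | 59 |
| TMB status: High | 17 | 10 | N/A | 79 | 75 | 23 | 29 |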