{getrect}
is a solution to the problem of partitioning a matrix into
rectangular regions - each region only enclosing a single unique value.
partition_matrix()
partition a matrix into rectanglesplot_rects()
Plot the result ofpartition_matrix()
max_rect_under_histogram()
calculates the maximum area rectangle under a sequence of values representing bar heights of a histogramplot_hist_max_rect()
plot the bar heights and best rectangle
- There are faster algorithms for ‘max area under a histogram’ than what I’ve included here.
- When only rectangles with area = 1 remain, could short-circuit the iteration and add them all in bulk.
You can install from GitHub with:
# install.package('remotes')
remotes::install_github('coolbutuseless/getrect')
Partition a matrix into rectangular regions where:
- Each region contains only a single unique value
- An effort is made to minimise the total number of rectangles
- For each unique value:
- Greedily find the biggest single-valued rectangle at each step
Given a sequence of bar heights (representing a histogram), what is the maximum sized rectangle which can be inscribed?
library(getrect)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Define bar heights
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hts <- c(4, 8, 3, 1, 3, 3, 0, 1)
plot_hist_max_rect(hts)
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Calculate maximum sized rectangle
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
rect <- max_rect_under_histogram(hts)
plot_hist_max_rect(hts, rect)
mat <- matrix(
c(1,1,1,2,3,3,3,
1,1,2,2,2,3,3,
1,2,2,2,2,2,3,
2,2,2,2,2,2,2),
nrow = 4, ncol = 7, byrow = TRUE
)
mat
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 1 1 1 2 3 3 3
#> [2,] 1 1 2 2 2 3 3
#> [3,] 1 2 2 2 2 2 3
#> [4,] 2 2 2 2 2 2 2
Consider just the value 2
in the matrix.
Create a working matrix s
with a 1
where the target value exists,
and 0
elsewhere.
target <- 2
s <- mat
s[] <- mat == 2
s
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 0 0 0 1 0 0 0
#> [2,] 0 0 1 1 1 0 0
#> [3,] 0 1 1 1 1 1 0
#> [4,] 1 1 1 1 1 1 1
Calculate the vertical runs of 1
at every row based upon the rows
above
s <- apply(s, 2, \(x) ave(x, cumsum(x == 0), FUN = cumsum))
s
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 0 0 0 1 0 0 0
#> [2,] 0 0 1 2 1 0 0
#> [3,] 0 1 2 3 2 1 0
#> [4,] 1 2 3 4 3 2 1
row_max_rects <- apply(s, 1, \(x) as.data.frame(max_rect_under_histogram(x)))
row_max_rects <- do.call(rbind, row_max_rects)
row_max_rects
#> area area_alt ar ar_alt xmin xmax h w
#> 1 1 1 1.0 1.0 4 4 1 1
#> 2 3 3 3.0 3.0 3 5 1 3
#> 3 6 6 1.5 1.5 3 5 2 3
#> 4 10 10 2.5 2.5 2 6 2 5
idx <- which.max(row_max_rects$area)
best_rect <- row_max_rects[idx, ]
# Calculate the y extents of the rectangle
best_rect$ymax <- idx
best_rect$ymin <- idx - best_rect$h + 1
best_rect$target <- target
plot_rects(best_rect, mat)
set.seed(2)
w <- 15
h <- 15
mat <- matrix(sample(c(1, 2), w*h, replace = TRUE, prob = c(4, 1)), h, w)
mat
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
#> [1,] 1 2 1 1 1 1 2 1 1 1 1 1 2
#> [2,] 1 2 1 2 2 1 1 2 1 1 1 1 1
#> [3,] 1 1 2 1 1 2 1 1 2 2 2 2 2
#> [4,] 1 1 2 1 1 1 1 1 2 1 1 1 1
#> [5,] 2 1 1 2 2 1 1 1 1 1 1 1 1
#> [6,] 2 1 1 1 1 1 1 1 1 1 1 2 2
#> [7,] 1 1 2 1 1 1 1 1 1 1 2 1 1
#> [8,] 2 2 1 1 1 1 1 1 1 1 1 1 1
#> [9,] 1 1 1 2 1 1 1 1 1 1 1 2 1
#> [10,] 1 1 1 1 1 1 1 2 1 1 1 1 1
#> [11,] 1 1 2 2 1 2 1 2 1 1 2 1 1
#> [12,] 1 1 1 1 1 2 1 2 2 1 2 1 1
#> [13,] 1 1 1 2 1 1 2 1 1 1 1 1 2
#> [14,] 1 2 1 1 1 1 2 2 2 1 1 1 1
#> [15,] 1 1 2 1 1 1 1 1 1 1 1 1 1
#> [,14] [,15]
#> [1,] 1 2
#> [2,] 1 1
#> [3,] 1 2
#> [4,] 1 1
#> [5,] 1 2
#> [6,] 2 1
#> [7,] 1 1
#> [8,] 1 1
#> [9,] 2 1
#> [10,] 1 1
#> [11,] 2 1
#> [12,] 1 2
#> [13,] 1 1
#> [14,] 1 1
#> [15,] 1 1
# Partition
rects <- partition_matrix(mat)
plot_rects(rects)
Same matrix as above, but penalise the selection of rectangles with high
aspect ratio by setting ar_penalty
.
rects <- partition_matrix(mat, ar_penalty = 0.9)
plot_rects(rects)