
Rethinking VAE: From Continuous to Discrete Representations Without Probabilistic Assumptions

Songxuan Shi
Department of Applied Physics
Beijing University of Technology
Beijing 100124, China
shisongxuan@emails.bjut.edu.cn
Abstract

This paper explores the generative capabilities of Autoencoders (AEs) and establishes connections between Variational Autoencoders (VAEs) and Vector Quantized-Variational Autoencoders (VQ-VAEs) through a reformulated training framework. We demonstrate that AEs exhibit generative potential via latent space interpolation and perturbation, albeit limited by undefined regions in the encoding space. To address this, we propose a new VAE-like training method that introduces clustering centers to enhance data compactness and ensure well-defined latent spaces without relying on traditional KL divergence or reparameterization techniques. Experimental results on MNIST, CelebA, and FashionMNIST datasets show smooth interpolative transitions, though blurriness persists. Extending this approach to multiple learnable vectors, we observe a natural progression toward a VQ-VAE-like model in continuous space. However, when the encoder outputs multiple vectors, the model degenerates into a discrete Autoencoder (VQ-AE), which combines image fragments without learning semantic representations. Our findings highlight the critical role of encoding space compactness and dispersion in generative modeling and provide insights into the intrinsic connections between VAEs and VQ-VAEs, offering a new perspective on their design and limitations.

Further qualitative analysis and theoretical derivations will be supplemented in subsequent versions.

1 Introduction

In recent years, models based on VQVAE and VAE have achieved remarkable success in image generation (van den Oord et al., 2017; Kingma and Welling, 2022). However, little research has explicitly explored the connections between them.

In a VAE, generation is framed as fitting the data distribution, from which the evidence lower bound (ELBO) is derived. Maximizing the ELBO amounts to maximizing a reconstruction term while minimizing the KL divergence between the approximate posterior and the prior. The encoder of a VAE outputs two parameters, a mean and a variance, which are constrained to follow a Gaussian distribution with zero mean and unit variance. Finally, latent variables are sampled via the reparameterization trick and fed into the decoder.
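For reference, this standard objective can be sketched in a few lines of PyTorch (a generic illustration of the classic VAE, not the method proposed later in this paper):

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    # z = mu + sigma * eps with eps ~ N(0, I); keeps sampling differentiable
    std = torch.exp(0.5 * logvar)
    return mu + torch.randn_like(std) * std

def vae_loss(x, x_recon, mu, logvar):
    # Negative ELBO: reconstruction term plus closed-form KL(q(z|x) || N(0, I))
    recon = F.mse_loss(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```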

VAE-generated results tend to be blurry, as studied in beta-VAE (Higgins et al., 2017). This blurriness stems from two factors: (1) the pixel-wise (or MSE-based) reconstruction loss, and (2) the over-regularization caused by the KL divergence. Consequently, follow-up works have proposed alternative regularization schemes based on the Wasserstein distance and the Jensen-Shannon divergence to mitigate these issues (Tolstikhin et al., 2019; Deasy et al., 2021).

For a long time, VAEs have been interpreted in the language of probability theory, but the transition to their seemingly "discrete counterpart" (VQVAE) has not felt entirely natural. In this article, we reconstruct the VAE framework from a new perspective, one that extends naturally to VQVAE. Our aim in this study is solely to provide a fresh perspective on the generative capabilities of VAEs, not to introduce a novel model.

This reformulation also leads to an insight: The generative capability of VAEs stems from the KL divergence constraint, which essentially compresses the data manifold and entangles it in a way that enables smooth interpolation. The role of reparameterization is to make the sampling space more continuous rather than leaving it undefined.

2 Autoencoders possess generative capabilities

As the predecessor of Variational Autoencoders (VAEs), the Autoencoder (AE) has been regarded in some studies as a nonlinear extension of Principal Component Analysis (PCA). Similar to classification tasks (Rifai et al., 2011), an AE projects data onto a new manifold (Lee, 2023), reshaping the structure of the original distribution. While the AE is not conventionally treated as a generative model, prior work has suggested that it can generate new samples through latent space interpolation (Berthelot et al., 2018; Sainburg et al., 2019), similar to the latent-space operations in StyleGAN (Karras et al., 2019, 2018). This section further explores and validates the generative potential of AEs from this perspective.

2.1 Perturbation and Interpolation of Latent Codes in Autoencoders

Figure 2.1: Perturbations added to the first 8 dimensions of the latent space, from +0.1 to +15. Both the encoder and decoder are 3-layer, 64-dimensional MLPs, with a 128-dimensional latent (encoding) space.
Figure 2.2: Linear interpolation between the digits 7 and 8, and between 3 and 8.

Here, we first explore perturbations in the latent space of an autoencoder. As shown in Figure 2.1, small perturbations do not significantly alter the decoder's output. However, under large perturbations, the generated images degrade and may even collapse into nearly pure black. Next, we perform linear interpolation between the latent codes of two images (Figure 2.2). The interpolation is defined as:

$$\mathbf{z}=(1-\alpha)\,\mathbf{z}_{1}+\alpha\,\mathbf{z}_{2},\quad\alpha\in[0,1]$$

where $\mathbf{z}_{1}$ and $\mathbf{z}_{2}$ are the latent codes of two input images, and $\alpha$ controls the interpolation weight. The decoder produces smoothly transitioning outputs, demonstrating continuous variation between the original images.
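A minimal sketch of this interpolation, assuming a trained encoder and decoder (names are illustrative):

```python
import torch

@torch.no_grad()
def interpolate(decoder, z1, z2, steps=10):
    # Decode evenly spaced points along the segment between two latent codes
    alphas = torch.linspace(0.0, 1.0, steps)
    return torch.stack([decoder((1 - a) * z1 + a * z2) for a in alphas])

# e.g. frames = interpolate(decoder, encoder(x1), encoder(x2))
```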

From this simple experiment, we can conclude that AEs possess a certain degree of generative capability. However, sampling needs to occur within a defined data space; randomly adding perturbations will only yield meaningless results.

This leads us to wonder: what if we introduce "constraints" in another way?

A common example is classification. Image classification makes data linearly separable, essentially learning a low-dimensional manifold of the data; otherwise, divergent encodings would be unusable. If we train an image classification network before training the AE, it might make the encoding space more compact (Rifai et al., 2011; Connor et al., 2021; Leeb et al., 2023).

As illustrated in Figure 2.3 and Figure 2.4, the classification network's middle section is a 3-layer, 64-dimensional MLP. After training the classifier, we freeze its parameters and feed the intermediate representation to a decoder for reconstruction. At this point, applying perturbations no longer results in nearly pure black outputs. Instead, we observe a tendency for the images to transform into other digits, with subtle, though not very pronounced, changes in style. The largest perturbation here is +50, much larger than the +15 in the previous experiment. Linear interpolation within this classification network yields the same results as with the AE.
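A minimal sketch of this setup for MNIST-sized inputs (layer sizes other than the 3-layer, 64-dimensional trunk are illustrative):

```python
import torch.nn as nn

# 3-layer, 64-dimensional MLP trunk of the classifier (the "middle section")
classifier_trunk = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 64),
)
# ... after classification training, freeze the trunk:
for p in classifier_trunk.parameters():
    p.requires_grad = False

# Train only this decoder to minimize MSE(decoder(classifier_trunk(x)), x)
decoder = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 784), nn.Sigmoid(),
)
```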

Figure 2.3: Perturbations added to the first dimension of the latent space, from +0.1 to +50, for the digits 4, 5, 7, and 1.
Figure 2.4: Linear interpolation between the digits 5 and 3, 3 and 7, 7 and 3, 3 and 2, and 2 and 9.

3 A New VAE Training Method

In the previous section, we discussed how the compactness and definedness of data within the encoding space determine whether a model possesses generative capability (continuous transitions between two samples). In this section, we explore a new VAE-like training method and its connection to VQVAE.

3.1 Methods

Figure 3.1: Model architecture.

To introduce "oscillation" during AE training, allowing the encoding space to spread across the entire space and avoid undefined points, while to some extent ensuring that the manifold boundaries are not separated as in a classification task, we introduce the concept of "clustering centers." The model structure is shown in Figure 3.1.

First, we select N images from the dataset and pass them through the encoder to produce vectors. This batch is trained for reconstruction, maintaining the overall reconstruction capability of the network.

Simultaneously, we maintain a learnable vector. We take M additional images from the dataset and find the image whose encoding has the minimum mean squared error (MSE) with this learnable vector (a dot product or cosine similarity could also be used). We then constrain the encoding of this chosen image to minimize its MSE with the learnable vector, while also requiring that the vector, when passed to the decoder, reconstructs that specific image. (This last term is optional and was not included in subsequent experiments, as we observed it decreases spontaneously when the first two terms are constrained.)

Therefore, the total loss for the entire model is:

$$\begin{aligned}
\mathcal{L}_{\text{total}} &= \mathcal{L}_{\text{reconstruction}} + \lambda_{1}\,\mathcal{L}_{\text{encoder\_pull}}\;\bigl(+\,\lambda_{2}\,\mathcal{L}_{\text{decoder\_reconstruct\_center}}\bigr)\\
\mathcal{L}_{\text{reconstruction}} &= \frac{1}{N}\sum_{i=1}^{N}\|x_{i}-D(E(x_{i}))\|^{2}\\
\mathcal{L}_{\text{encoder\_pull}} &= \|E(x_{j}^{*})-v\|^{2}\\
\mathcal{L}_{\text{decoder\_reconstruct\_center}} &= \|x_{j}^{*}-D(v)\|^{2}\quad\text{(optional)}
\end{aligned}$$

where $x_{i}$ is an input image, $E(\cdot)$ is the encoder, $D(\cdot)$ is the decoder, $x_{j}^{*}$ is the chosen closest image, and $v$ is the learnable vector.

We split the batch extraction into M and N here to prevent a skewed batch distribution from affecting convergence speed and to make experimental debugging more convenient. In practical applications, this division is optional.
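A minimal sketch of one training step under this objective (the optional decoder term is omitted, as in our experiments; function and variable names are ours):

```python
import torch

def training_step(encoder, decoder, v, x_recon_batch, x_match_batch, lam1=1.0):
    # v is the learnable clustering-center vector (an nn.Parameter);
    # lam1 plays the role of lambda_1 and its value here is illustrative.
    # (1) Plain reconstruction on the N-image batch keeps the AE working.
    z = encoder(x_recon_batch)
    loss_recon = ((decoder(z) - x_recon_batch) ** 2).mean()

    # (2) Among the M-image batch, pick the image whose encoding is closest
    #     to v under MSE, then pull that encoding toward v.
    with torch.no_grad():
        dists = ((encoder(x_match_batch) - v) ** 2).mean(dim=1)
        j = dists.argmin()
    z_star = encoder(x_match_batch[j : j + 1]).squeeze(0)
    loss_pull = ((z_star - v) ** 2).mean()

    return loss_recon + lam1 * loss_pull
```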

3.2 Experiments

Here, the learnable encoding acts as a clustering center. Initially, this vector is meaningless, and the parameters of both the encoder and decoder are randomly initialized; consequently, the vector oscillates violently. Inspired by the mean-regression behavior of noise2noise (Lehtinen et al., 2018), we expect this vector to eventually converge to the clustering center of the entire dataset amidst this violent oscillation. In this way, we achieve both constraint and a thorough definition of points across the entire data space.

Figure 3.2: Different perturbations (vertical axis) applied across different dimensions (horizontal axis), with perturbations ranging from +3 to -3.
Figure 3.3: Horizontal axis: training epoch. Vertical axis: Euclidean distance between successive states of the learnable encoding vector.

This means the model must, on one hand, maintain its AE functionality and, on the other hand, find a central point in the data and gravitate towards it. Experiments, shown in Figure 3.2, were conducted on the MNIST, CelebA, and FashionMNIST datasets and demonstrate smooth transitions. We achieved this without using any reparameterization or KL divergence constraint. The encoder and decoder here are both convolutional networks, trained for 20 epochs.

However, the blurriness issue persists. We also tracked the Euclidean distance between successive states of the learnable encoding (Figure 3.3), which reveals a decreasing distance from high to low, thus implicitly achieving annealing.

Next, we expanded the number of learnable vectors, transforming the design into a codebook, similar to VQVAE.
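One plausible sketch of the multi-center version, where each encoding is matched to its nearest learnable vector (the matching rule is our reading of the procedure above):

```python
import torch

def pull_loss_multi(encoder, codebook, x_batch):
    # codebook: (K, d) nn.Parameter holding the learnable vectors
    z = encoder(x_batch)                  # (M, d) encodings
    d2 = torch.cdist(z, codebook) ** 2    # (M, K) pairwise squared distances
    nearest = d2.argmin(dim=1)            # index of each encoding's closest vector
    return ((z - codebook[nearest]) ** 2).mean()
```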

On the MNIST and FashionMNIST datasets, we observed that after 20 epochs of training, these vectors landed in different cluster centers. When this constraint was relaxed, degeneration still occurred, similar to VQVAE: some of the vectors eventually converged to the same location, or simply collapsed into an identical solution, as shown in Figure 3.4.

Figure 3.4: The left panel shows a collapse during training on FashionMNIST; the right panel shows normal behavior without collapse on MNIST. These two datasets were chosen to best illustrate the result, because FashionMNIST is prone to collapse whereas MNIST rarely collapses. The difference in linear separability of the 10 classes in the two panels stems from the datasets themselves, not from the model.

In Figure 3.5, we directly fed these 20 trained vectors into the decoder to see what they had learned. Evidently, some learned similar features, some are scattered and largely meaningless, while others are clear digit images. Token 17 is notably blurry, very similar to the decoder's output during the early stages of VAE training.

Figure 3.5: The 20 vectors trained on MNIST.

Simply constraining the MSE to a specific vector cannot achieve generative capability. We then abandoned the previous network design and used a basic autoencoder, adding an MSE constraint that pushes the intermediate latent vector towards the all-zero vector. The results, shown in Figure 3.6, indicate severe collapse. This highlights the importance of reparameterization and of the dynamic matching mechanism in our new method.
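For concreteness, this collapsing baseline amounts to the following loss (a sketch; `lam` is an illustrative weight):

```python
def collapsing_baseline_loss(encoder, decoder, x, lam=1.0):
    # Plain AE loss plus a static pull of every latent toward the zero
    # vector, with no dynamic matching; this is the variant of Figure 3.6.
    z = encoder(x)
    return ((decoder(z) - x) ** 2).mean() + lam * z.pow(2).mean()
```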

Figure 3.6: 20 epochs, replacing only the KL divergence constraint with an MSE constraint toward the zero vector; no reparameterization.

4 Training with Multiple Vectors

4.1 Training with multiple vectors makes the data space more flexible.

Here, we focus on using a codebook rather than a single vector for training. We previously observed that FashionMNIST is highly prone to collapse, ultimately becoming indistinguishable from the single-vector case.

In VQ-VAE, one of the strategies used is the Exponential Moving Average (EMA) (van den Oord et al., 2017). EMA reduces the oscillation of individual vectors through momentum updates, helping to prevent them from collapsing into the same solution. The core formula for updating a codebook vector $e_{k}$ with EMA is:

$$\mathbf{e}_{k}\leftarrow\mathbf{e}_{k}+\alpha\,(\mathbf{z}_{q}-\mathbf{e}_{k})$$

where $\mathbf{e}_{k}$ is the $k$-th vector in the codebook, $\mathbf{z}_{q}$ is the latent vector assigned to it, and $\alpha$ is the momentum (learning) rate.
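In code, this simplified per-vector form of the update is one line (a sketch; the original VQ-VAE EMA additionally tracks cluster counts and sums, and `alpha` here is illustrative):

```python
import torch

@torch.no_grad()
def ema_update(codebook, z_q, k, alpha=0.01):
    # e_k <- e_k + alpha * (z_q - e_k): momentum update of codebook entry k
    codebook[k] += alpha * (z_q - codebook[k])
```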

The final t-SNE is shown in Figure 4.1(b). The data space also becomes more flexible, as depicted in Figure 4.1(a): here, we applied a +300 additive perturbation, which caused a T-shirt to become longer and its sleeve length to change; some distortion also occurred.

Figure 4.1: Experimental results on FashionMNIST. (a) shows the reconstructions from 50 trained vectors, while (b) displays the t-SNE visualization of 20 trained vectors and the encoder output on the test dataset.

4.2 Impact of Encoder Capacity on Learnable Vector Quantity

In Figure 4.1(b), we observed something interesting: with 50 learnable vectors, only a small number were properly utilized and attracted data.

To address this, we expanded both the encoder and decoder, making them deeper and incorporating residual connections. The results, shown in Figure 4.2, are promising: this time, with 500 vectors, visibly more of them were correctly "attracted" to data clusters. We then randomly selected 200 of these vectors and fed them into the decoder for visualization (Figure 4.3).
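A minimal sketch of the kind of residual block used to deepen the networks (channel counts and layer sizes are illustrative):

```python
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.body(x))  # residual connection
```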

Figure 4.2: The t-SNE visualization of the 500 trained vectors and the encoder output on the test dataset. Notably, a large number of these trained vectors still have not been correctly attracted to data clusters.
Figure 4.3: Experimental results on FashionMNIST with the expanded model. (a) shows reconstructions from 100 randomly selected, correctly learned vectors; (b) displays reconstructions from many incorrectly learned vectors, among which numerous meaningless white outputs appear.

Conclusion. We have now constructed a preliminary version of a VQVAE in continuous space. It is worth noting that if we require reconstruction to be performed from the codebook rather than the encoder, and if the encoder outputs multiple vectors instead of just one, then this model becomes entirely equivalent to a standard VQVAE. The capacity of the encoder directly affects whether enough learnable vectors can be allocated. This "continuous VQVAE" still possesses a certain degree of generative capability, although it requires larger perturbations.

4.3 VQ-AutoEncoder

Given that VQVAE changes the decoder's input to come from a codebook rather than directly from the encoder's output, let us try to generalize this. We allow the decoder to accept the encoder's output directly (following our previous experimental setup), and the convolutional encoder now outputs multiple vectors instead of a single one.

What we observe is that this generative model completely degenerates into an autoencoder whose output is constrained by the codebook: any image processed through the decoder necessarily becomes a combination of vectors from that codebook.
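A sketch of the quantization step in this setting: each of the encoder's spatial vectors is snapped to its nearest codebook entry before decoding (shapes follow the 7x7 grid described in Appendix A.1; names are ours):

```python
import torch

def quantize_feature_map(z_e, codebook):
    # z_e: (B, C, H, W) encoder output; codebook: (K, C) learnable vectors.
    # Returns the quantized map and the (B, H, W) grid of codebook indices.
    B, C, H, W = z_e.shape
    flat = z_e.permute(0, 2, 3, 1).reshape(-1, C)   # (B*H*W, C)
    idx = (torch.cdist(flat, codebook) ** 2).argmin(dim=1)
    z_q = codebook[idx].reshape(B, H, W, C).permute(0, 3, 1, 2)
    return z_q, idx.reshape(B, H, W)
```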

Figure 4.4: From top to bottom: original image, direct encoder-decoder output, quantized output (codebook to decoder).
Figure 4.5: Reconstructions generated by randomly combining vectors sampled from the codebook and feeding them to the decoder.
Figure 4.6: Additive perturbations of +1, +5, +10, +15, +30, +50, and +100 applied to the first dimension of the first vector in the encoder-to-decoder output.
Figure 4.7: Reconstruction when codebook vectors are selected and passed to the decoder via broadcasting (a) and via zero-padding (b).
Figure 4.8: Randomly replacing one of the encoder's encodings with a vector from the codebook.

From Figure 4.6, we observe that no matter how much perturbation is added, the model does not gain transition capability; it only increases distortion. In Figure 4.5, completely random combinations yield only meaningless results. In Figure 4.8, arbitrarily replacing one of the vectors with another from the codebook also introduces distortion. In Figure 4.7, the reconstructions of the codebook vectors themselves are indistinguishable. In Figure 4.4, it is visible that although the model's codebook vectors do not learn meaningful semantics as in VQVAE, they form the original image through correct combinations, and these are indeed varied combinations, not a degeneration into an AE. (The codebook indices used for this set of images are given in Appendix A.1.)

Therefore, we conclude that under these circumstances the model tends to become a discrete encoder, which we refer to as VQ-AE. Although each vector in the codebook lacks semantic meaning, relatively clear images can be generated by combining them.

5 Conclusion

Up to this point, following our new construction method, VQVAE naturally establishes a connection with VAE, enabling image generation even without the original training approach. The key to image generation lies in the compactness of the encoding space and its sufficient dispersion to ensure that all points are well defined. When the learnable encodings, acting as cluster constraint centers, become dispersed, larger perturbations are required for smooth image transitions. However, when we extend the encoder's output from a single encoding to multiple (7x7), the model degenerates into a discrete Autoencoder. In this case, it does not learn semantics like VQVAE but rather image fragments; the encoder and decoder merely learn how to combine these fragments. The underlying reasons for this behavior remain unclear to us.

6 Related Work

Recent developments in generative modeling have explored the interplay between discrete and continuous latent spaces, latent compactness, and their implications for unsupervised learning. Discrete latent models, such as DVAE++ Vahdat et al. (2018), address the challenge of smoothing the optimization landscape in non-continuous spaces, which directly relates to our exploration of bridging VAE and VQ-VAE from a non-probabilistic perspective. Similarly, Vector Quantized Wasserstein Autoencoder (VQ-WAE) Vuong et al. (2021) integrates optimal transport objectives into quantized models, offering additional insights into the stability and structure of learned codebooks.

Several works also incorporate clustering and latent representation learning into the generative framework. For instance, Deep Generative Clustering with VAEs Adipoetra and Martin (2021) and Variational Clustering Prasad et al. (2021) show how VAEs can naturally extend to unsupervised clustering tasks by shaping the latent manifold. These methods highlight the dual role of VAEs as both generative models and latent space organizers—supporting our hypothesis that generative quality is a function of manifold compression.

Additionally, works like Joint Optimization of Autoencoders for Clustering and Embedding Fard et al. (2020) propose combined objectives for both embedding and clustering, providing architectural cues for learning compact, semantically meaningful representations. Our work builds upon these ideas by exploring how constraining the encoder space, even through deterministic mechanisms like classifiers, can result in more structured and semantically meaningful generation.

References

  • Adipoetra and Martin [2021] Michael Adipoetra and Ségolène Martin. Deep generative clustering with vaes and expectation-maximization. arXiv preprint arXiv:2103.10365, 2021.
  • Berthelot et al. [2018] David Berthelot, Colin Raffel, Aurko Roy, and Ian J. Goodfellow. Understanding and improving interpolation in autoencoders via an adversarial regularizer. CoRR, abs/1807.07543, 2018. URL http://arxiv.org/abs/1807.07543.
  • Connor et al. [2021] Marissa C. Connor, Gregory H. Canal, and Christopher J. Rozell. Variational autoencoder with learned latent structure, 2021. URL https://arxiv.org/abs/2006.10597.
  • Deasy et al. [2021] Jacob Deasy, Nikola Simidjievski, and Pietro Liò. Constraining variational inference with geometric jensen-shannon divergence, 2021. URL https://arxiv.org/abs/2006.10599.
  • Fard et al. [2020] Arash Fard, Thibaut Thonet, and Emilie Morvant. Joint optimization of an autoencoder for clustering and embedding. In Asian Conference on Machine Learning, pages 124–139, 2020.
  • Higgins et al. [2017] Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. beta-VAE: Learning basic visual concepts with a constrained variational framework. In International Conference on Learning Representations, 2017. URL https://openreview.net/forum?id=Sy2fzU9gl.
  • Karras et al. [2018] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of gans for improved quality, stability, and variation, 2018. URL https://arxiv.org/abs/1710.10196.
  • Karras et al. [2019] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks, 2019. URL https://arxiv.org/abs/1812.04948.
  • Kingma and Welling [2022] Diederik P Kingma and Max Welling. Auto-encoding variational bayes, 2022. URL https://arxiv.org/abs/1312.6114.
  • Lee [2023] Yonghyeon Lee. A geometric perspective on autoencoders, 2023. URL https://arxiv.org/abs/2309.08247.
  • Leeb et al. [2023] Felix Leeb, Stefan Bauer, Michel Besserve, and Bernhard Schölkopf. Exploring the latent space of autoencoders with interventional assays, 2023. URL https://arxiv.org/abs/2106.16091.
  • Lehtinen et al. [2018] Jaakko Lehtinen, Jacob Munkberg, Jon Hasselgren, Samuli Laine, Tero Karras, Miika Aittala, and Timo Aila. Noise2noise: Learning image restoration without clean data, 2018. URL https://arxiv.org/abs/1803.04189.
  • Prasad et al. [2021] Vignesh Prasad, Dipanjan Das, and Brojeshwar Bhowmick. Variational clustering: Leveraging variational autoencoders for image clustering. In Proceedings of the International Conference on Pattern Recognition (ICPR), 2021.
  • Rifai et al. [2011] Salah Rifai, Yann Dauphin, Pascal Vincent, Yoshua Bengio, and Xavier Muller. The manifold tangent classifier. Advances in Neural Information Processing Systems, 2011.
  • Sainburg et al. [2019] Tim Sainburg, Marvin Thielk, Brad Theilman, Benjamin Migliori, and Timothy Gentner. Generative adversarial interpolative autoencoding: adversarial training on latent space interpolations encourage convex latent distributions, 2019. URL https://arxiv.org/abs/1807.06650.
  • Tolstikhin et al. [2019] Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, and Bernhard Schoelkopf. Wasserstein auto-encoders, 2019. URL https://arxiv.org/abs/1711.01558.
  • Vahdat et al. [2018] Arash Vahdat, William G. Macready, and Zhengbing Bian. Dvae++: Discrete variational autoencoders with overlapping transformations. arXiv preprint arXiv:1802.04920, 2018.
  • van den Oord et al. [2017] Aäron van den Oord, Oriol Vinyals, and Koray Kavukcuoglu. Neural discrete representation learning. CoRR, abs/1711.00937, 2017. URL http://arxiv.org/abs/1711.00937.
  • Vuong et al. [2021] Tung-Long Vuong, Trung Le, and He Zhao. Vector quantized wasserstein auto-encoder. arXiv preprint arXiv:2104.06872, 2021.

Appendix A Appendices

A.1 Codebook Indices Used in the Experiment

In our experiment, the encoder's output convolution map has dimensions of 7x7 with 512 channels. Below are the codebook indices matched to each convolutional-map location (with channels unfolded as vectors, consistent with VQVAE operations).
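For reproducibility, an index grid like the ones below can be turned back into a decoder input as follows (a sketch; `decoder` and `codebook` stand for the trained modules):

```python
import torch

@torch.no_grad()
def decode_from_indices(decoder, codebook, idx):
    # idx: (7, 7) LongTensor of codebook indices, as in the matrices below;
    # codebook: (K, C) trained vectors. Rebuild the latent map and decode.
    z_q = codebook[idx]                        # (7, 7, C)
    z_q = z_q.permute(2, 0, 1).unsqueeze(0)    # (1, C, 7, 7)
    return decoder(z_q)
```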

$$\begin{bmatrix}9&10&10&10&10&10&10\\47&44&44&44&72&64&37\\47&44&44&26&30&96&32\\47&26&72&25&89&11&94\\18&71&69&3&3&1&8\\19&14&20&16&16&16&87\\19&7&7&37&50&37&37\end{bmatrix}\quad\begin{bmatrix}9&17&21&31&36&79&10\\47&58&24&60&3&95&58\\47&44&54&89&23&37&58\\47&44&59&60&57&38&58\\47&44&59&60&74&38&58\\47&44&59&60&57&38&58\\19&7&54&94&57&38&7\end{bmatrix}\quad\begin{bmatrix}9&17&21&80&31&79&17\\19&25&0&65&65&0&6\\91&40&65&0&0&89&53\\91&40&65&0&0&89&93\\28&69&69&0&0&33&81\\28&1&69&65&0&42&2\\91&32&33&1&0&33&81\end{bmatrix}$$

$$\begin{bmatrix}9&17&85&36&36&79&10\\47&26&27&0&22&64&58\\47&26&27&0&0&64&58\\47&26&27&41&11&64&58\\47&26&30&66&11&6&58\\47&26&30&81&0&6&58\\19&26&30&93&80&6&7\end{bmatrix}\quad\begin{bmatrix}9&17&17&10&17&79&10\\19&29&23&82&59&32&6\\91&43&3&3&65&3&66\\73&60&68&65&65&65&74\\73&60&68&68&65&65&94\\73&60&68&60&68&65&94\\76&39&39&39&39&39&75\end{bmatrix}\quad\begin{bmatrix}9&45&77&36&36&77&17\\19&35&35&35&95&35&50\\47&70&62&35&35&35&50\\47&26&35&87&87&64&58\\47&26&35&87&87&64&58\\47&7&35&87&87&35&58\\19&72&35&87&87&87&37\end{bmatrix}$$

$$\begin{bmatrix}9&10&45&36&5&10&10\\47&44&72&80&66&58&58\\47&44&72&0&66&58&58\\47&44&70&0&49&37&58\\47&44&29&65&32&38&58\\47&26&30&65&3&6&58\\19&7&40&1&1&34&7\end{bmatrix}\quad\begin{bmatrix}9&10&17&77&77&17&10\\47&44&72&66&4&6&58\\47&26&29&11&84&49&37\\19&59&3&65&65&3&66\\91&67&3&65&89&3&81\\28&55&3&65&65&65&2\\19&25&94&94&96&96&93\end{bmatrix}\quad\begin{bmatrix}9&10&10&10&10&10&10\\47&44&44&26&7&44&26\\47&26&7&71&67&86&71\\18&71&84&0&0&0&63\\51&80&0&56&63&63&22\\19&50&12&12&14&14&14\\19&7&7&7&7&7&7\end{bmatrix}$$

$$\begin{bmatrix}9&10&21&80&49&17&10\\47&26&4&22&80&6&58\\47&44&4&32&80&38&58\\47&44&70&32&80&38&58\\47&44&72&32&22&38&58\\47&44&72&32&27&38&58\\19&7&72&49&32&38&7\end{bmatrix}\quad\begin{bmatrix}9&10&10&10&10&17&45\\47&44&26&86&71&84&75\\47&26&4&1&60&55&78\\19&4&63&75&3&22&80\\67&56&8&56&22&32&80\\73&80&80&80&62&27&22\\91&12&14&14&50&14&46\end{bmatrix}\quad\begin{bmatrix}9&45&36&22&27&31&17\\19&29&11&0&0&1&34\\19&27&11&63&22&0&81\\91&69&11&22&22&0&2\\28&0&65&69&80&0&49\\28&0&65&0&0&0&32\\76&74&39&39&24&63&32\end{bmatrix}$$

$$\begin{bmatrix}9&17&85&13&83&79&10\\47&59&3&68&68&23&38\\47&29&74&65&65&74&6\\19&30&68&65&65&60&34\\19&43&60&68&68&89&53\\91&40&69&60&60&42&93\\91&43&46&46&46&30&53\end{bmatrix}\quad\begin{bmatrix}9&17&17&17&17&17&17\\19&4&87&67&67&78&78\\91&27&22&22&0&0&0\\76&22&22&0&0&0&0\\73&27&80&0&0&0&80\\73&22&22&63&22&22&24\\91&37&50&50&37&37&37\end{bmatrix}\quad\begin{bmatrix}9&10&10&10&10&10&10\\47&44&44&7&7&26&7\\47&26&7&25&80&78&66\\19&70&80&30&0&0&16\\18&43&63&32&63&1&80\\51&46&20&20&46&16&46\\19&7&7&7&7&7&7\end{bmatrix}$$

$$\begin{bmatrix}9&45&85&13&83&5&17\\19&43&3&65&8&3&34\\91&56&3&68&65&8&53\\47&50&69&68&68&93&50\\47&58&69&68&68&93&58\\47&72&69&68&68&93&58\\19&72&1&94&94&93&7\end{bmatrix}\quad\begin{bmatrix}9&10&10&10&10&10&10\\47&44&44&26&86&37&86\\47&44&26&72&48&69&32\\19&26&86&87&22&89&0\\18&87&48&80&80&1&80\\51&14&20&20&20&16&20\\19&7&7&7&7&7&7\end{bmatrix}\quad\begin{bmatrix}9&17&17&17&17&17&17\\18&71&71&71&71&71&78\\80&0&0&0&0&0&1\\40&69&8&69&69&8&96\\69&0&0&0&0&0&1\\69&1&0&1&1&1&1\\51&14&14&14&12&12&12\end{bmatrix}$$

$$\begin{bmatrix}9&10&17&56&2&17&10\\47&44&72&89&57&37&58\\47&44&72&89&57&38&58\\47&44&70&65&1&6&58\\47&44&59&65&0&6&58\\47&44&54&65&74&38&58\\19&7&54&1&57&38&7\end{bmatrix}\quad\begin{bmatrix}9&10&45&13&36&5&10\\47&44&72&89&60&53&58\\47&44&54&89&68&34&58\\47&44&54&60&60&34&58\\47&44&59&60&68&53&58\\47&44&29&60&60&53&58\\19&7&25&16&16&12&58\end{bmatrix}$$