Hello everyone,
I would like to resuscitate a very old issue. Actually, it is so old that even GitHub's autocompletion doesn't offer it after typing "#": #216. This request has been raised several times, but it still hasn't been resolved.
In short, TensorFlow's broadcasting interface is not "good enough" :)
Let's first check how broadcasting works in NumPy:
```
In [1]: import numpy as np

In [2]: a = np.random.rand(2, 3, 4)

In [3]: b = np.random.rand(4, 5)

In [4]: a @ b
Out[4]:
array([[[1.42709275, 1.40630067, 0.46525725, 0.68734581, 0.65227036],
        [2.01336504, 1.59980866, 0.93739699, 0.63190484, 0.92472892],
        [1.82979902, 1.46193243, 0.85498406, 0.5994646 , 0.77767957]],

       [[1.83010035, 1.49088728, 0.76694665, 0.65568003, 0.89110954],
        [2.12214864, 1.41728107, 1.04566743, 0.60652825, 0.97115822],
        [2.32478779, 2.06297214, 1.02016205, 0.81821249, 1.02604722]]])
```
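For reference, the rule NumPy applies here: `np.matmul` treats the last two axes as matrix dimensions and broadcasts all leading axes, so the single `(4, 5)` matrix is implicitly reused across `a`'s leading `(2, 3)` batch shape:

```python
import numpy as np

a = np.random.rand(2, 3, 4)
b = np.random.rand(4, 5)

# b has no leading axes, so it is broadcast across a's (2, 3) batch shape;
# the product contracts a's last axis (4) with b's first axis (4).
assert (a @ b).shape == (2, 3, 5)
```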
Now, let's check what TF has to offer:
```
In [25]: a = tf.random.normal((2, 3, 4))

In [26]: b = tf.random.normal((4, 5))

In [27]: a @ b
...
InvalidArgumentError: In[0] is not a matrix. Instead it has shape [2,3,4] [Op:MatMul] name: matmul/
```
Ouch! The "correct" way of doing it in TF (of course there are others; one is sketched further below) is:
```
In [26]: a = tf.random.normal((2, 3, 4))

In [27]: b = tf.random.normal((4, 5))

In [28]: a @ tf.broadcast_to(b, tf.concat([a.shape[:-2], b.shape], axis=0))
Out[28]:
<tf.Tensor: id=87, shape=(2, 3, 5), dtype=float32, numpy=
array([[[ 1.1977772 , -1.363074  ,  1.8021748 ,  0.1448586 , -0.6269997 ],
        [ 1.2322128 , -2.1586194 ,  0.09486479,  0.02937585,  0.9694344 ],
        [ 0.5580032 ,  6.11664   , -0.24535722,  0.16691092, -2.2263217 ]],

       [[-0.7386743 ,  1.2142425 ,  1.1371945 , -1.2736351 , -2.971829  ],
        [-1.9222848 , -0.7198772 , -0.9807504 ,  0.02805561,  1.0210879 ],
        [ 1.8334148 ,  0.80895233,  1.2308785 , -0.23910654, -1.5128168 ]]], dtype=float32)>
```
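One of those other ways, for the record, is `tf.einsum` with explicit batch indices, which sidesteps the manual broadcast entirely. A minimal sketch (the index labels are arbitrary):

```python
import tensorflow as tf

a = tf.random.normal((2, 3, 4))
b = tf.random.normal((4, 5))

# 'abc,cd->abd': contract a's last axis with b's first axis; the leading
# batch axes of the left operand pass through untouched.
c = tf.einsum('abc,cd->abd', a, b)
assert c.shape == (2, 3, 5)
```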
You can see how much effort it takes to make an operation broadcast over two distinct tensors: extract the leading (batch) shape from the left tensor, take the full shape of the right tensor, concatenate the two along the correct axis, and finally call tf.broadcast_to. A minimal wrapper hiding this boilerplate is sketched below.
...
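Here is a sketch of such a wrapper; `broadcast_matmul` is a hypothetical name, not an existing TF API:

```python
import tensorflow as tf

def broadcast_matmul(a, b):
    """Hypothetical helper: matmul an [..., m, k] tensor with a [k, n] matrix
    by materializing b across a's leading (batch) dimensions."""
    # Leading shape of the left operand + full shape of the right operand.
    target_shape = tf.concat([tf.shape(a)[:-2], tf.shape(b)], axis=0)
    return a @ tf.broadcast_to(b, target_shape)

a = tf.random.normal((2, 3, 4))
b = tf.random.normal((4, 5))
assert broadcast_matmul(a, b).shape == (2, 3, 5)
```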
The same applies to Cholesky, triangular solve, and other linear-algebra operations (see the triangular-solve sketch below). It is very upsetting that such a crucial feature isn't available out of the box.
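For example, sharing a single right-hand side across a batch of systems with tf.linalg.triangular_solve needs the same dance, since the batch dimensions are not broadcast for you. A sketch with made-up shapes:

```python
import tensorflow as tf

# A [2, 3]-batch of 4x4 lower-triangular systems and one shared rhs.
chol = tf.linalg.cholesky(tf.eye(4, batch_shape=[2, 3]))  # [2, 3, 4, 4]
rhs = tf.random.normal((4, 1))                            # [4, 1]

# Calling tf.linalg.triangular_solve(chol, rhs) directly fails with the same
# shape complaint, so rhs must be materialized across the batch dims first.
rhs_b = tf.broadcast_to(
    rhs, tf.concat([tf.shape(chol)[:-2], tf.shape(rhs)], axis=0))
x = tf.linalg.triangular_solve(chol, rhs_b)               # [2, 3, 4, 1]
```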
Another concern is the performance of these "solutions", e.g. the memory consumption of the tile and broadcast_to operations: they physically copy the tensor to match the leading dimensions. (For the matmul case above, a copy-free reshape trick is sketched below.) Of course, a native TensorFlow broadcasting implementation would be preferable here.
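For the matmul case above, where only the left operand carries batch dimensions, a reshape avoids materializing any broadcast copy of b. A sketch of that workaround:

```python
import tensorflow as tf

a = tf.random.normal((2, 3, 4))
b = tf.random.normal((4, 5))

# Flatten the batch dimensions into the row dimension, do one plain matmul,
# and restore the batch shape afterwards: no broadcast copy of b is created.
flat = tf.reshape(a, [-1, tf.shape(a)[-1]]) @ b           # [6, 5]
c = tf.reshape(flat, tf.concat([tf.shape(a)[:-1], tf.shape(b)[-1:]], axis=0))
assert c.shape == (2, 3, 5)
```

This only works because b has no batch dimensions of its own; the general case still needs real broadcasting support in the kernels.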
Kind regards,
Artem Artemev