Training on own data, and RMSE is NaN #25
Description
Hey @okuchaiev,
I have been trying to train on my own data.
The dataset consists of 539278 user_ids and 1551731 items, and the data is extremely sparse.
During training the reported RMSE becomes nan after the first logged batch. Should I take the absolute value of the MSE loss?
I am using PyTorch 0.4 and CUDA 9.0, training on a GTX 1080 Ti.
Using GPUs: [0]
Doing epoch 0 of 12
run.py:198: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  t_loss += loss.data[0]
[0, 0] RMSE: 8.0848995
run.py:212: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  total_epoch_loss += loss.data[0]
[0, 1000] RMSE: nan
[0, 2000] RMSE: nan
[0, 3000] RMSE: nan
[0, 4000] RMSE: nan
[0, 5000] RMSE: nan
[0, 6000] RMSE: nan
[0, 7000] RMSE: nan
[0, 8000] RMSE: nan
Total epoch 0 finished in 1966.838391304016 seconds with TRAINING RMSE loss: nan
run.py:74: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  total_epoch_loss += loss.data[0]
run.py:75: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  denom += num_ratings.data[0]
Epoch 0 EVALUATION LOSS: nan
Saving model to model_save/model.epoch_0
Doing epoch 1 of 12
[1, 0] RMSE: nan
[1, 1000] RMSE: nan
[1, 2000] RMSE: nan
[1, 3000] RMSE: nan
[1, 4000] RMSE: nan
[1, 5000] RMSE: nan
[1, 6000] RMSE: nan
[1, 7000] RMSE: nan
[1, 8000] RMSE: nan
Could you please help me out?
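On the absolute-value question: a mean of squared errors cannot be negative, and taking `abs()` of a NaN still gives NaN, so that alone would probably not change the output. A more useful first step may be to find the first batch whose loss stops being finite. The snippet below is only a minimal sketch of that check. It assumes the per-batch loss has already been converted to a Python float (for example with `loss.item()`, as the warning suggests). `first_non_finite` is a hypothetical helper, not something from run.py, and the values in `batch_losses` are illustrative only.

```python
import math


def first_non_finite(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None


# Illustrative values only: one finite loss followed by NaNs,
# mirroring the pattern in the log (finite at first, NaN afterwards).
batch_losses = [8.0848995, float("nan"), float("nan")]

bad_step = first_non_finite(batch_losses)
print("first non-finite batch index:", bad_step)

# abs() does not repair a NaN: the result is still NaN.
print("abs(nan) is still nan:", math.isnan(abs(float("nan"))))
```

In a real training loop the same check could be applied to each batch as it is computed, so that the inputs and targets of the offending batch can be inspected before the NaN spreads into the weights.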