[manos branch] DGL Node Dataloader is not Compatible with Stratified Sampling #1

Open
manoskary opened this issue Aug 20, 2021 · 0 comments
Labels: bug, help wanted

Comments

@manoskary
Owner

NodeDataLoader raises an error when used with a stratified sampler written for the PyTorch DataLoader.
This appears to be an internal conflict in the DGL dataloader.

To reproduce the error on the [manos] branch, go to src/models/rcgn-homo and run:

python entity_classify_mp.py --dataset cora --num-of-epochs 30 --gpu -1

The sampler class that triggers the issue:

import torch
from sklearn.model_selection import StratifiedKFold

class StratifiedSampler:
    """Stratified batch sampling.

    Each batch preserves the class proportions of the label array y.
    """
    def __init__(self, y, batch_size, shuffle=True):
        if torch.is_tensor(y):
            y = y.numpy()
        assert len(y.shape) == 1, 'label array must be 1D'
        # One fold per batch; StratifiedKFold requires at least 2 splits.
        self.n_batches = len(y) // batch_size
        self.skf = StratifiedKFold(n_splits=self.n_batches, shuffle=shuffle)
        # Dummy features: split() only uses X for its length.
        self.X = torch.randn(len(y), 1).numpy()
        self.y = y
        self.shuffle = shuffle

    def __iter__(self):
        if self.shuffle:
            # Reseed so that each epoch gets a different partition.
            self.skf.random_state = torch.randint(0, int(1e8), size=()).item()
        # The test fold of each split serves as one batch (a NumPy index array).
        for train_idx, test_idx in self.skf.split(self.X, self.y):
            yield test_idx

    def __len__(self):
        # Number of batches per epoch, not the number of samples.
        return self.n_batches
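
One side effect of deriving the fold count from the batch size is worth noting: the batches come out larger than the requested size whenever the sample count is not a multiple of it. The sketch below works through that arithmetic for the run reported in this issue (140 training samples, batch size 40). The helper name `approx_fold_sizes` is hypothetical, and it assumes folds are split as evenly as possible, as plain k-fold does; stratified folds may differ slightly.

```python
def approx_fold_sizes(n_samples, batch_size):
    """Approximate batch sizes when one fold is used per batch.

    Assumption: folds are as even as possible (plain k-fold behaviour).
    """
    n_batches = n_samples // batch_size
    base, extra = divmod(n_samples, n_batches)
    # The first `extra` folds receive one additional sample.
    return [base + 1 if i < extra else base for i in range(n_batches)]


print(approx_fold_sizes(140, 40))  # [47, 47, 46]
```

So with these settings the sampler produces 3 batches of about 47 indices each rather than batches of 40.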

The most recent error output:

python entity_classify_mp.py --dataset cora --num-epochs 100 --gpu -1 --inductive --batch-size 40

  NumNodes: 2708
  NumEdges: 10556
  NumFeats: 1433
  NumClasses: 7
  NumTrainingSamples: 140
  NumValidationSamples: 500
  NumTestSamples: 1000
Done loading data from cached files.
torch.Size([140])
(140,)
Traceback (most recent call last):
  File "entity_classify_mp.py", line 249, in <module>
    run(args, device, data)
  File "entity_classify_mp.py", line 123, in run
    for step, (input_nodes, seeds, blocks) in enumerate(dataloader):
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\dgl\dataloading\pytorch\dataloader.py", line 322, in __next__
    result_ = next(self.iter_)
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
    data = self._next_data()
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\torch\utils\data\dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data
    data.reraise()
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\torch\_utils.py", line 429, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\torch\utils\data\_utils\worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\torch\utils\data\_utils\fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\dgl\dataloading\pytorch\dataloader.py", line 280, in collate
    result = super().collate(items)
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\dgl\dataloading\dataloader.py", line 453, in collate
    items = _prepare_tensor(self.g, items, 'items', self._is_distributed)
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\dgl\dataloading\dataloader.py", line 369, in _prepare_tensor
    return F.tensor(data) if is_distributed else utils.prepare_tensor(g, data, name)
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\dgl\utils\checks.py", line 38, in prepare_tensor
    data = F.tensor(data)
  File "C:\Users\melki\Desktop\JKU\codes\musym-GDL\env\lib\site-packages\dgl\backend\pytorch\tensor.py", line 46, in tensor
    return th.as_tensor(data, dtype=dtype)
TypeError: only integer tensors of a single element can be converted to an index
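
The traceback ends in `th.as_tensor(data)` inside `collate`, which suggests that the items reaching the collate step are not a flat sequence of integer node IDs. One thing to try, offered as an untested assumption rather than a confirmed fix, is a batch sampler that yields plain lists of Python ints. Below is a minimal dependency-free sketch; the class name `PlainStratifiedBatchSampler` and its `seed` parameter are hypothetical, and it deals indices round-robin per class instead of using `StratifiedKFold`. Whether NodeDataLoader accepts such a sampler has not been verified here.

```python
import random
from collections import defaultdict


class PlainStratifiedBatchSampler:
    """Hypothetical stand-in that yields each batch as a list of Python ints."""

    def __init__(self, labels, batch_size, shuffle=True, seed=None):
        self.labels = [int(label) for label in labels]
        self.n_batches = max(1, len(self.labels) // batch_size)
        self.shuffle = shuffle
        self.rng = random.Random(seed)

    def __iter__(self):
        # Group sample indices by class label.
        by_class = defaultdict(list)
        for idx, label in enumerate(self.labels):
            by_class[label].append(idx)

        # Deal each class round-robin across the batches so that every
        # batch receives a near-equal share of every class.
        batches = [[] for _ in range(self.n_batches)]
        cursor = 0
        for label in sorted(by_class):
            members = by_class[label]
            if self.shuffle:
                self.rng.shuffle(members)
            for idx in members:
                batches[cursor % self.n_batches].append(idx)
                cursor += 1

        if self.shuffle:
            self.rng.shuffle(batches)
        yield from batches

    def __len__(self):
        # Number of batches per epoch.
        return self.n_batches


# Example with 140 samples spread evenly over 7 classes.
labels = [i % 7 for i in range(140)]
sampler = PlainStratifiedBatchSampler(labels, batch_size=40, seed=0)
for batch in sampler:
    print(len(batch), type(batch[0]).__name__)
```

Each batch here is an ordinary list, so a collate function that converts its input to an integer tensor receives a single flat sequence per batch.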