provided example does not use GPU #95

ekg · 2023-02-24T17:24:51Z

I'm following your example https://github.com/jerryji1993/DNABERT#22-model-training. I did not use apex as I am unable to compile it under the python3.6 environment. Otherwise, I've exactly followed the provided code.

I do not appear to have any GPU utilization. My system has two V100s and I can confirm that they are functioning based on other tests.

The process has been running nearly a day now, and says "Epoch: 1%". I did not expect the example training to be so slow and not GPU driven...

Is this normal? If not, what am I doing wrong?

CandideThunder · 2023-03-01T07:48:18Z

I had a similar issue. Please try:
echo "import torch; print(torch.cuda.is_available())" | python
That should print 'True'.
In my case, it did not. I updated cudatoolkit to the newest version, then everything worked (on GPU):

conda remove pytorch torchvision cudatoolkit
conda install pytorch torchvision cudatoolkit -c pytorch
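
If the one-liner prints 'False', a slightly longer check can help tell a CPU-only PyTorch build apart from a driver or toolkit mismatch. This is only a rough sketch: the function name `cuda_report` and the dictionary keys are made up for illustration, and it uses nothing beyond `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.device_count()`.

```python
import importlib.util


def cuda_report():
    """Return a small dict describing whether PyTorch can see a GPU.

    The function name and the keys are hypothetical, chosen for this example.
    """
    report = {
        "torch_installed": False,
        "built_with_cuda": None,
        "cuda_available": False,
        "device_count": 0,
    }
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not importable in this environment

    import torch

    report["torch_installed"] = True
    # torch.version.cuda is None when the installed build is CPU-only.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` comes back as `None`, the installed PyTorch is likely a CPU-only build, and reinstalling as above is a reasonable next step. If it shows a CUDA version but `cuda_available` is still `False`, the driver or cudatoolkit version is the more likely suspect.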
