wandb logging PermissionError and OSError

Description: When running experiments using Weights and Biases (wandb), I occasionally get a PermissionError for Python's logging library and OSError for accessing the TLS CA cert. I had...

wandb - access logged values during runtime

How can I retrieve a logged value from wandb before the run has finished? ``` import os import wandb wandb.init(project='someproject') def loss_a(): # do_stuff and log: ...

wandb - how to get it really silent (weights and biases)

Working with Anaconda-Spyder (Python 3.7), I installed the latest release of wandb (0.10.7) and am trying to use it with tensorflow (2.1.0) and keras (2.3.1). Since then, my console is polluted with...

Permission denied (Errno 13) - wandb init - In docker container

I am running a Docker container on a cluster where I am trying to run open-source code: https://github.com/xinntao/BasicSR. However, I encounter the following error: As I can see it's the package...

How to use wandb with the gpt-2-simple package?

I am using the gpt-2-simple package on Google Colab to fine-tune gpt-2 on my own data set. I would like to monitor the training loss and the validation loss using wandb; how can I do that? I am...

Weights & Biases Detectron2 Google Colab - wandb: ERROR Unable to log event [Errno 95] Operation not supported

I am training a Faster-RCNN model with Detectron2 in Google Colab. I would like to track my experiments with Weights and Biases (WandB). My dataset is uploaded to Google Drive and mounted to the...

WandB - Why are "step" ticks in diagrams calculated differently from given? (Python)

I evaluate a deep learning model (StyleGan2-Ada-Pytorch). The procedure is to calculate the metrics afterwards based on a log file (this takes a long time, therefore not done in training). The...

W&B Keras callback TypeError: ufunc 'isfinite' not supported for the input types

I'm trying a TF model whose input is a string tensor; my model contains a TextVectorization layer for text processing, which is available in TF 2.2. The training fails in the W&B callback...

wandb: get a list of all artifact collections and all aliases of those artifacts

The wandb documentation doesn't seem to explain how to do this - but it should be a fairly common use case I'd imagine? I achieved mostly (but not completely) what I wanted like this, but it seems...

wandb - RuntimeError: CUDA out of memory

I'm trying to run the model given in the hyperparameter optimization example from the simple transformers documentation, but while searching for hyperparameters after a certain number of...

PyTorch Lightning wants to create a folder on import due to its use of wandb, which raises an error on AWS Lambda

So I want to build a Docker image with PyTorch Lightning that can be used with AWS Lambda. However, when the function is invoked it raises an OS Error that claims it uses a read-only file system...

How to get multiple lines exported to wandb

I am using the library weights and biases. My model outputs a curve (a time series). I'd like to see how this curve changes throughout training. So, I'd need some kind of slider where I can...

Weights and Biases watch log causing CUDA out of memory

I am trying to use WandB gradient visualization to debug the gradient flow in my neural net on Google Colab. Without WandB logging, the training runs without error, taking up 11GB/16GB on the P100...

Weights and Biases: Login and network errors

I recently installed Weights and Biases (wandb) for recording the metrics of my machine learning projects. Everything worked fine when connected to wandb cloud instance or when I used a local...

Control the logging frequency and contents when using wandb with HuggingFace

I am using wandb with my HuggingFace code. I would like to log the loss and other metrics. Now I have two questions - How does wandb decide when to log the loss? Is this decided by...

I am getting an AttributeError while running train.py in YOLOv5. Can anyone help me with this?

When I run python train.py --img 640 --batch 4 --epochs 5 --data training/dataset.yaml --cfg training/yolov5l.yaml --weights yolov5l.pt for YOLOv5 on my system, I get the following and why is...

What do the charts in the System Panels signify in Wandb (PyTorch)

I recently started using the wandb module with my PyTorch script, to ensure that the GPUs are operating efficiently. However, I am unsure as to what exactly the charts indicate. I have been...

Hyperparam search on huggingface with optuna fails with wandb error

I'm using this simple script, using the example blog post. However, it fails because of wandb. It was of no use to make wandb OFFLINE as well. ``` from datasets import load_dataset,...

YoloV5 killed at first epoch

I'm using a virtual machine on Windows 10 with this config: Memory 7.8 GiB, Processor Intel® Core™ i5-6600K CPU @ 3.50GHz × 3, Graphics llvmpipe (LLVM 11.0.0, 256 bits), Disk Capacity 80.5...

When is one supposed to run wandb.watch so that weights and biases tracks params and gradients properly?

I was trying out the wandb library and I ran wandb.watch, but that doesn't seem to work on my code. It's not supposed to be anything too complicated, so I am puzzled why it's not...

weights and biases: binarized PR curve for multi-class problem

I'm using w&b to visualize results from experiments I'm running from simple transformers wrapper. It's a multi-class label. W&B displays PR curves by class but I need a binarized PR curve for...

using Weights and Biases for logging from multiple Ray detached actors

I'm using Ray detached actors for distributing Apex-DQN workers. A replay server, learner, and multiple workers are launched as detached actors in the main script, and are tracked by ray as...

wandb.wandb_agent - ERROR - Detected 5 failed runs in a row, shutting down

While trying to set up wandb, I am facing the following error: ``` wandb: WARNING Calling wandb.login() after wandb.init() has no effect. ...

AttributeError: 'NoneType' object has no attribute '_global_run_stack'

Description: I am using the PTAN library with an A3C model and I am trying to work with wandb sweep, but I've encountered some weird problems, and I am not sure if it's a bug regarding sweep (because if...

System Memory keeps increasing in deep learning using PyTorch & PyTorch Lightning (Kaggle kernel)

My Kaggle kernel's system memory just keeps growing during GPU training and I can't find where the problem is. Log information on Wandb shows that System Memory keeps going up and finally reaches...

Weights&Biases Sweep - Why might runs be overwriting each other?

I am new to ML and W&B, and I am trying to use W&B to do a hyperparameter sweep. I created a few sweeps and when I run them I get a bunch of new runs in my project (as I would expect): Image: New...

trainer.train() in Kaggle: StdinNotImplementedError: getpass was called, but this frontend does not support input requests

When saving a version in Kaggle, I get StdinNotImplementedError: getpass was called, but this frontend does not support input requests whenever I use the Transformers.Trainer class. The general...

Deep Learning, NLP, Valuenet: TypeError: string indices must be integers

I'm getting this error while trying to implement the project Valuenet <https://github.com/brunnurs/valuenet> on my laptop and I don't understand what exactly it means. I'm getting this error after...

How to plot confidence intervals for different training samples

I am running training with different divisions of a training set. The plots that I get (using wandb) are fine, but in my opinion not very informative and high in variance. Is there a...

Weights & Biases sweep cannot import modules with pytorch lightning

I am training a variational autoencoder, using pytorch-lightning. My pytorch-lightning code works with a Weights and Biases logger. I am trying to do a parameter sweep using a W&B parameter...