Why does a Java app crash in gdb but run normally in real life?

Attempting to run a Java app from gdb results in a segfault, yet running the app on its own does not. The app is a .JAR which uses JOGL and a bit of memory-mapping to talk to the GPU. Stacktrace below hints...

Continuous Integration Service for GPU package?

Continuous integration services are wonderful for continually testing updates to packages for various languages. These include services like Travis-CI, Jenkins, and Shippable among many others. ...

Why are OpenGL and CUDA contexts memory greedy?

I develop software which usually uses both OpenGL and the Nvidia CUDA SDK. Recently, I also started to look for ways to reduce the run-time memory footprint. I noticed the following (Debug and Release...

Android Studio 3.x: Emulator very slow (especially rendering)

I've got a problem on my Dell Inspiron 7000 Gaming (i7-7700HQ, 16GB RAM, Nvidia GeForce GTX 1050 Ti, Windows 10) and self-built desktop PC (i5-3350P, 16GB RAM, AMD Radeon HD 7870, Windows 10)...

Best practice for upgrading CUDA and cuDNN for tensorflow

I'm currently in charge of getting tensorflow-gpu 1.8 to work on my machine. I've been using tf-gpu 1.2 until now, but due to some required features, I have to upgrade my installation. Before...

CUDA issue - how to clean install CUDA in Win 10 to resolve cudaGetDevice() failed

I have previously had CUDA 9.x running on this Win 10 64-bit Home system (targeting 1080Ti card), but need to update to CUDA 10.0 for TensorFlow 2. I initially thought TF2 was OK with CUDA 10.1...

GOP setting is not honored by Intel H264 hardware MFT

Problem statement: Intel hardware MFT is not honoring the GOP setting, resulting in more bandwidth consumption in realtime applications. The same code works fine on Nvidia hardware...

Vulkan API: max MSAA samples supported is VK_SAMPLE_COUNT_8_BIT

I am writing a Vulkan API based renderer. Currently I am trying to add MSAA for the color attachment. I was pretty sure I could use VK_SAMPLE_COUNT_16_BIT, but limits.framebufferColorSampleCounts...
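
One common way to handle a device that reports fewer samples than requested is to clamp the request to the reported bitmask. The following is a minimal sketch in Python rather than the C API, assuming the value read from limits.framebufferColorSampleCounts is passed in as an integer. The helper name pick_sample_count is hypothetical; the flag values follow the VkSampleCountFlagBits enumeration.

```python
# VkSampleCountFlagBits: a sample count of N is encoded as the bit with value N.
VK_SAMPLE_COUNT_1_BIT = 0x01
VK_SAMPLE_COUNT_2_BIT = 0x02
VK_SAMPLE_COUNT_4_BIT = 0x04
VK_SAMPLE_COUNT_8_BIT = 0x08
VK_SAMPLE_COUNT_16_BIT = 0x10


def pick_sample_count(supported_mask, requested):
    """Return the largest supported flag that does not exceed `requested`.

    `pick_sample_count` is a hypothetical helper, not part of the Vulkan API.
    Falls back to single sampling when no higher count is supported.
    """
    flag = requested
    while flag > VK_SAMPLE_COUNT_1_BIT:
        if supported_mask & flag:
            return flag
        flag >>= 1  # try the next lower sample count
    return VK_SAMPLE_COUNT_1_BIT
```

With a mask that tops out at VK_SAMPLE_COUNT_8_BIT, a request for VK_SAMPLE_COUNT_16_BIT is reduced to VK_SAMPLE_COUNT_8_BIT instead of failing.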

Training multiple neural networks asynchronously in parallel

The problem: I am currently working on a project that I sadly can't share with you. The project is about hyper-parameter optimization for neural networks, and it requires that I train multiple...

Training object detection with model_main.py fails with Windows fatal exception: access violation

I am trying to train an object detection model with the model_main.py file. I can train it in an Ubuntu environment without any problem, but I have now moved to Windows 10 (because that PC has a GeForce...

Python code to automatically update CCleaner & Adobe Flash Player

For the last few months my computer has been constantly telling me to update the Flash player, CCleaner is always out of date, and there is usually an update to be done to Bleachbit or...

Loaded runtime CuDNN library: 7.1.2 but source was compiled with: 7.6.0; Ubuntu 18.04

I am trying to address the issue in the title: Loaded runtime CuDNN library: 7.1.2 but source was compiled with: 7.6.0. CuDNN library major and minor version needs to match or have higher minor...
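
The compatibility rule quoted in that message can be sketched as a small check. This is an illustration only, assuming the rule means that the major versions must be equal and the loaded minor version must be at least the compiled one. The helper names are hypothetical and are not part of TensorFlow or cuDNN.

```python
def parse_version(text):
    """Split a dotted version string such as '7.6.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def cudnn_runtime_ok(loaded, compiled):
    """Hypothetical check: same major version, loaded minor >= compiled minor.

    This encodes an assumed reading of the error message, not an official rule.
    """
    loaded_major, loaded_minor = parse_version(loaded)[:2]
    compiled_major, compiled_minor = parse_version(compiled)[:2]
    return loaded_major == compiled_major and loaded_minor >= compiled_minor
```

Under that reading, the pair from the title (loaded 7.1.2, compiled against 7.6.0) fails the check because the loaded minor version is lower.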

Google Kubernetes Engine Stackdriver logging/monitoring is gone at GKE version 1.15

I've been using GKE for more than a year and I never had any problems with Stackdriver logging/monitoring. But when I created a new cluster with version 1.15.9-gke.26, I don't see any logs in Stackdriver...

GPU Passthrough using IOMMU

I'm attempting to set up a VM on my Debian (buster) Linux distro and am currently trying to isolate my Nvidia GTX 760. So far I have followed these steps: I ran "lspci -nnk" and got my GPUs...

TensorFlow hangs when running inference with GPU enabled

I am new to AI and TensorFlow and I am trying to use the TensorFlow object detection API on Windows. My current goal is to do real-time human detection in a video stream. For this I modified a...

Gathering half-float values using AVX

Using AVX/AVX2 intrinsics, I can gather sets of 8 values, either 1-, 2- or 4-byte integers, or 4-byte floats, using _mm256_i32gather_epi32() and _mm256_i32gather_ps(). But currently, I have a case where I...

Vulkan: Dynamic buffer size for building/updating acceleration structures (for VK_KHR_ray_tracing)

I'm working on a Vulkan application which should benchmark a few algorithms. For this purpose I want to implement a relatively low performance algorithm as well - but still as optimal as possible...

psplash-write does not work when psplash is started from the init script (PID=1)

I have integrated psplash into a custom Yocto layer for NVIDIA Jetson Nano. I want to run psplash from the very first init script (PID=1). The reason is to cover the time spent by systemd to load...

How to execute parallel compute shaders across multiple compute queues in Vulkan?

Update: This has been solved, you can find further details here: https://stackoverflow.com/a/64405505/1889253 A similar question was asked previously, but that question was initially focused...

Using Blender's GUI application from WSL2

Recently a fellow intern finished his internship. For this internship, he wrote some automated fluid simulations in Blender on Linux. I am a Windows user, and my only option for this is running...

Illegal instruction(core dumped) error on Jetson Nano

Sorry if my description is long and boring, but I want to give you the most important details to solve my problem. Recently I bought a Jetson Nano Developer Kit with 4 GB of RAM, finally!, and in order...

Getting error "CUDA backend requires cuDNN" when configuring OpenCV cmake build with cuda backend turned on

My goal: to configure a build of OpenCV 4.5.1-dev with support for CUDA, Tesseract and Qt without any cmake error. The problem I am having: I am getting the following error when I...

How to automatically select idle GPU for model training in TensorFlow?

I am using the Nvidia prebuilt Docker container NVIDIA Release 20.12-tf2 to run my experiment. I am using TensorFlow version 2.3.1. Currently, I am running my model on one of the GPUs; I still have 3 more...
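
One possible approach is to query per-GPU memory use with nvidia-smi and expose only the least busy device before TensorFlow is imported. The following is a minimal sketch, assuming nvidia-smi is on the PATH inside the container and that "idle" can be approximated by the lowest memory in use. All helper names are hypothetical.

```python
import os
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, memory.used' CSV rows into (index, used_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used = (field.strip() for field in line.split(","))
        rows.append((int(index), int(used)))
    return rows


def least_used_gpu(rows):
    """Return the index of the GPU with the lowest memory in use."""
    return min(rows, key=lambda row: row[1])[0]


def select_idle_gpu():
    """Hypothetical helper: restrict this process to the least used GPU.

    Must run before TensorFlow is imported, because CUDA_VISIBLE_DEVICES
    is read when the CUDA runtime initializes.
    """
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    gpu = least_used_gpu(parse_gpu_memory(output))
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu)
    return gpu
```

Memory in use is only a rough proxy for idleness; a GPU can hold little memory and still be busy, so utilization could be queried as well if that matters.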

How to choose GPU when running phoronix-test-suite benchmark?

I am new to Phoronix Test Suite and ran my first test with phoronix-test-suite benchmark testname. This ran the test for one of my GPUs but not the other. How can I choose which GPU to use for the...

Visual Studio CPU Profiler / No Data

Today my CPU Usage Performance profiler in Visual Studio stopped working. I start it, it seems like it's working, you can see the graph of the CPU going up and down but when it finishes I get no...

tensorflow - how to use 16 bit precision float

Question: float16 can be used in NumPy but not in TensorFlow 2.4.1, where it causes the error. Is float16 available only when running on an instance with a GPU with 16-bit support? Mixed precision: Today,...

How to use MAGMA with NVIDIA GPU card instead of CPU LAPACKE to invert a large matrix

I need to invert large matrices and I would like to modify my current LAPACKE routine in order to exploit the power of an NVIDIA GPU card. Indeed, my LAPACKE routines work well for...

GPU memory is empty, but CUDA out of memory error occurs

While training this code with Ray Tune (1 GPU per trial), after a few hours of training (about 20 trials) a CUDA out of memory error occurred on GPU:0,1. And even after terminating the training...

Python quits silently upon importing tensorflow, no error/logs

I currently have a problem with TensorFlow that I've been stuck on for almost 2 weeks now. No matter what Python script I run, every time TensorFlow is imported (import tensorflow), Python just...

What is the fastest way to read files line by line?

I have written Python code to read a file line by line and perform some averaging and summation operations. I need suggestions for speeding it up. The number of lines in the pressurefile is...
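
As a baseline for comparison, here is a minimal sketch that streams the input once and keeps running totals instead of loading everything into memory. It assumes one numeric value per line, since the actual file layout is not shown in the question. The function name sum_and_mean is hypothetical.

```python
def sum_and_mean(lines):
    """Return (total, mean) over an iterable of text lines.

    Assumes each non-blank line holds a single number; blank lines are skipped.
    """
    total = 0.0
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        total += float(line)
        count += 1
    mean = total / count if count else 0.0
    return total, mean
```

An open file object can be passed directly, because iterating over it yields one line at a time.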