Segmentation fault (core dumped): attempts at a fix
Published: 2019-06-27


1. While debugging with

gdb python
r XX.py
where

the following output appeared:

1) Every run starts about 38 threads. Could this be a thread overload?

[New Thread 0x7ffff2fd2700 (LWP 7415)]
[New Thread 0x7ffff27d1700 (LWP 7416)]
[New Thread 0x7fffeffd0700 (LWP 7417)]
[New Thread 0x7fffeb7cf700 (LWP 7418)]
[New Thread 0x7fffe8fce700 (LWP 7419)]
[New Thread 0x7fffe67cd700 (LWP 7420)]
[New Thread 0x7fffe3fcc700 (LWP 7421)]
[New Thread 0x7fffe17cb700 (LWP 7422)]
[New Thread 0x7fffdefca700 (LWP 7423)]
[New Thread 0x7fffdc7c9700 (LWP 7424)]
[New Thread 0x7fffd9fc8700 (LWP 7425)]
[New Thread 0x7fffd77c7700 (LWP 7426)]
[New Thread 0x7fffd4fc6700 (LWP 7427)]
[New Thread 0x7fffd27c5700 (LWP 7428)]
[New Thread 0x7fffcffc4700 (LWP 7429)]
[New Thread 0x7fffcd7c3700 (LWP 7430)]
[New Thread 0x7fffcafc2700 (LWP 7431)]
[New Thread 0x7fffc87c1700 (LWP 7432)]
[New Thread 0x7fffc5fc0700 (LWP 7433)]
[New Thread 0x7fffc37bf700 (LWP 7434)]
[New Thread 0x7fffc0fbe700 (LWP 7435)]
[New Thread 0x7fffbe7bd700 (LWP 7436)]
[New Thread 0x7fffbbfbc700 (LWP 7437)]
[New Thread 0x7fffb97bb700 (LWP 7438)]
[New Thread 0x7fffb6fba700 (LWP 7439)]
[New Thread 0x7fffb47b9700 (LWP 7440)]
[New Thread 0x7fffb1fb8700 (LWP 7441)]
[New Thread 0x7fffaf7b7700 (LWP 7442)]
[New Thread 0x7fffacfb6700 (LWP 7443)]
[New Thread 0x7fffaa7b5700 (LWP 7444)]
[New Thread 0x7fffa7fb4700 (LWP 7445)]
[New Thread 0x7fffa57b3700 (LWP 7446)]
[New Thread 0x7fffa2fb2700 (LWP 7447)]
[New Thread 0x7fffa07b1700 (LWP 7448)]
[New Thread 0x7fff9dfb0700 (LWP 7449)]
[New Thread 0x7fff9b7af700 (LWP 7450)]
[New Thread 0x7fff98fae700 (LWP 7451)]
[New Thread 0x7fff967ad700 (LWP 7452)]
[New Thread 0x7fff93fac700 (LWP 7453)]

2) Now the following error is reported:

ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
...
  File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: cuDeviceGet: CUDA_ERROR_INVALID_DEVICE: invalid device ordinal

I set this aside for the moment and ran a quick test first.

It turned out that a plain import keras raises exactly the same error!

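conda install mkl
conda install mkl-service
# Both commands above report:
# All requested packages already installed.
conda install blas

The keras package still could not be imported.

3) I deleted the existing conda environment, created a new one, and installed mkl with conda. Trying import keras again still failed: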

Using Theano backend.
~/lib/python2.7/site-packages/theano/gpuarray/dnn.py:184: UserWarning: Your cuDNN version is more recent than Theano. If you encounter problems, try updating Theano or downgrading cuDNN to a version >= v5 and <= v7.
  warnings.warn("Your cuDNN version is more recent than "
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "~/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in
    use(config.device)
  File "~/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "~/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
    **args)
  File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: cuDeviceGet: CUDA_ERROR_INVALID_DEVICE: invalid device ordinal

My .theanorc configuration file contained:

[global]
floatX = float32
device = cuda1

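What about dropping the device number after cuda? To my surprise, it worked! (cuda1 asks for the GPU with index 1, and "invalid device ordinal" means CUDA could not see a device with that index.)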
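The change to .theanorc can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is written for Python 3 (the environment in this post is Python 2.7), the helper name drop_device_ordinal is made up, and it only handles device values of the form cudaN.

```python
import configparser
import io


def drop_device_ordinal(theanorc_text):
    """Return .theanorc text whose [global] device has no trailing GPU index."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the case of keys such as floatX
    parser.read_string(theanorc_text)
    device = parser.get("global", "device")
    # "cuda1" -> "cuda"; a value without trailing digits is left unchanged
    parser.set("global", "device", device.rstrip("0123456789"))
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# The configuration from this post, as a string (hypothetical helper input)
original = "[global]\nfloatX = float32\ndevice =cuda1\n"
print(drop_device_ordinal(original))
```

With device = cuda, no specific index is requested, which matches the "Mapped name None to device cuda" line in the successful output.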

Using Theano backend.
~/.conda/envs/xhs/lib/python2.7/site-packages/theano/gpuarray/dnn.py:184: UserWarning: Your cuDNN version is more recent than Theano. If you encounter problems, try updating Theano or downgrading cuDNN to a version >= v5 and <= v7.
  warnings.warn("Your cuDNN version is more recent than "
Using cuDNN version 7201 on context None
Mapped name None to device cuda: GeForce GTX 1080 Ti (0000:03:00.0)

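Next, I tried to get rid of the UserWarning shown above.

Theano was already at 1.0.4, the latest version, so it could not be updated any further. The only option left was to downgrade cuDNN.

However, listing all installed packages with conda list showed: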

cudnn                     6.0.21                cuda8.0_0    https://mirrors.tuna.tsinghua.edu.cn/a

# Try this command to check whether pygpu is usable
DEVICE="cuda" python -c "import pygpu; pygpu.test()"

The test run reported problems.

The help page says that a test_collectives error can be ignored if you are not using multiple GPUs.

# Then I tried the following:
python test_gpu.py
~/.conda/envs/xhs/lib/python2.7/site-packages/theano/gpuarray/dnn.py:184: UserWarning: Your cuDNN version is more recent than Theano. If you encounter problems, try updating Theano or downgrading cuDNN to a version >= v5 and <= v7.
  warnings.warn("Your cuDNN version is more recent than "
Using cuDNN version 7201 on context None
Mapped name None to device cuda: GeForce GTX 1080 Ti (0000:03:00.0)
[GpuElemwise{exp,no_inplace}((float32, vector)>), HostFromGpu(gpuarray)(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 0.192847 seconds
Result is [1.2317803 1.6187935 1.5227807 ... 2.2077181 2.2996776 1.623233 ]
Used the gpu

The cuDNN version actually in use is 7.2. conda clearly lists 6.0, so why is 7.2 being loaded?
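The number in "Using cuDNN version 7201" is cuDNN's packed integer version. For cuDNN releases before 9 it is major*1000 + minor*100 + patch, so 7201 is 7.2.1, while the conda package 6.0.21 would show up as 6021. The sketch below decodes that integer and, as an optional probe, asks whichever libcudnn the system loader finds first for its version. The function names are made up for illustration, and the loader lookup is an assumption that may not match the search path Theano itself uses.

```python
import ctypes
import ctypes.util


def decode_cudnn_version(packed):
    """Split a pre-9 cuDNN packed version (major*1000 + minor*100 + patch)."""
    major, rest = divmod(packed, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def probe_system_cudnn():
    """Return the packed version of the libcudnn the loader finds, or None."""
    name = ctypes.util.find_library("cudnn")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None  # found but not loadable (e.g. missing dependencies)
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return int(lib.cudnnGetVersion())


print(decode_cudnn_version(7201))  # the version Theano printed in the log above
print(decode_cudnn_version(6021))  # what the conda package 6.0.21 would report
print(probe_system_cudnn())        # None when no system-wide libcudnn is found
```

If the probe returns a 7xxx value while conda lists 6.0.21, a second cuDNN installed outside the conda environment is the likely source of the mismatch.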

Checking the CUDA version showed:

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176

Installing CUDA looked like a real hassle, so I first tried simply running the program.

Starting epoch 0...
Segmentation fault (core dumped)

# Check the configured stack size
ulimit -a
# Output:
stack size              (kbytes, -s) 8192

That is only 8 MB, which seemed far too small.

I raised it with ulimit -s 102400, but the segmentation fault persisted.

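Trying to raise it further, or to set it to unlimited, failed with:

-bash: ulimit: stack size: cannot modify limit: Operation not permitted

# Running it through sudo gives the following, because ulimit is a shell builtin rather than an executable:
sudo: ulimit: command not found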
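Each resource limit has a soft value and a hard value. An unprivileged process may raise its soft limit only as far as the hard limit, and "Operation not permitted" is what the shell reports when asked to go beyond it, which is presumably what happened here. A minimal Python sketch that inspects both values and raises the soft stack limit without exceeding the hard one (Unix only; the helper names are made up, and the change applies only to the current process and its children):

```python
import resource


def describe(limit):
    """Format an rlimit value the way ulimit reports it (KiB or unlimited)."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return "%d KiB" % (limit // 1024)


def raise_stack_soft_limit(target_bytes):
    """Raise the soft stack limit toward target_bytes, never past the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    if hard != resource.RLIM_INFINITY:
        target_bytes = min(target_bytes, hard)  # cap at the hard limit
    if target_bytes <= soft:
        return soft  # never lower the current limit
    resource.setrlimit(resource.RLIMIT_STACK, (target_bytes, hard))
    return target_bytes


soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("soft:", describe(soft), "hard:", describe(hard))

# 100 MiB, the same value as ulimit -s 102400
new_soft = raise_stack_soft_limit(100 * 1024 * 1024)
print("soft after raise:", describe(new_soft))
```

Raising the hard limit itself requires privileges, which is why a system-wide configuration change is needed for anything beyond it.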

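I then added this line to limits.conf:

*               soft    stack           unlimited

After that, ulimit -s unlimited worked, but the program still crashed with a segmentation fault, so I kept adjusting limits: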

max locked memory       (kbytes, -l) 64
# I tried raising max locked memory the same way, but it had no effect.

——————

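Finally solved! I found the answer in an issue posted on the keras project on GitHub.

The CUDA version on this machine was 9, so I followed a tutorial to install CUDA 8 together with cuDNN 6.0, and after that everything worked!

The root cause was that CUDA 9 does not work well with Theano 1.0, so the version had to be downgraded. After the downgrade the warning above disappeared, and keras code with the Theano backend ran successfully.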
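A check like this could be run before starting a long training job. The sketch below parses the release number out of nvcc -V output. The compatibility rule (CUDA 8 works, CUDA 9 does not) is taken only from the experience described in this post, not from any official support matrix, and the function names are made up.

```python
import re
import shutil
import subprocess

# The nvcc -V output shown earlier in this post
SAMPLE_NVCC_OUTPUT = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
"""


def parse_cuda_release(nvcc_text):
    """Extract (major, minor) from nvcc -V output, or None if not present."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_release():
    """Run nvcc -V if nvcc is on PATH and parse its release number."""
    if shutil.which("nvcc") is None:
        return None
    output = subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout
    return parse_cuda_release(output)


def looks_compatible(release, max_major=8):
    """Assumed rule from this post: CUDA 8 worked with Theano 1.0, CUDA 9 did not."""
    return release is not None and release[0] <= max_major


release = parse_cuda_release(SAMPLE_NVCC_OUTPUT)
print(release, looks_compatible(release))
```

Calling installed_cuda_release() instead of using the sample string checks the machine the script runs on; it returns None when nvcc is not on PATH.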

Reposted from: https://www.cnblogs.com/BlueBlueSea/p/10780182.html
