tensorflow termination; what(): std::bad_alloc #28

Open
dadatawajue opened this issue Jun 11, 2016 · 6 comments

Comments

@dadatawajue

Hi, I don't have an NVIDIA GPU, so I've been trying the TensorFlow backend with --mrf-w=0 to speed things up (Theano works but is really slow). With TensorFlow I get the error below. I tested many different images, all of which worked with the Theano backend. Any ideas how to fix it?

xxx:~/Code/python/neural-image-analogies$ make_image_analogy.py images/1.jpg images/1.jpg images/2.jpg out/arch --mrf-w=0
Using TensorFlow backend.
Using PatchMatch model
Scale factor 0.25 "A" shape (1, 3, 603, 653) "B" shape (1, 3, 300, 225)
Building loss...
Precomputing static features...
Building and combining losses...
Start of iteration 0 x 0
Current loss value: 62929842176.0
Image saved as out/arch_at_iteration_0_0.png
Iteration completed in 1359.27 seconds
Start of iteration 0 x 1
Current loss value: 59368124416.0
Image saved as out/arch_at_iteration_0_1.png
Iteration completed in 1354.37 seconds
Start of iteration 0 x 2
Current loss value: 58041049088.0
Image saved as out/arch_at_iteration_0_2.png
Iteration completed in 1315.46 seconds
Start of iteration 0 x 3
Current loss value: 57320632320.0
Image saved as out/arch_at_iteration_0_3.png
Iteration completed in 1324.93 seconds
Start of iteration 0 x 4
Current loss value: 56854339584.0
Image saved as out/arch_at_iteration_0_4.png
Iteration completed in 990.21 seconds
/home/xxx/Code/python/neural-image-analogies/venv/local/lib/python2.7/site-packages/scipy/ndimage/interpolation.py:573: UserWarning: From scipy 0.13.0, the output shape of zoom() is calculated with round() instead of int() - for these inputs the size of the returned array has changed.
  "the returned array has changed.", UserWarning)
Scale factor 0.625 "A" shape (1, 3, 1508, 1633) "B" shape (1, 3, 751, 563)
Building loss...
Precomputing static features...
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)
@sdierauf

sdierauf commented Aug 3, 2016

That looks like it's running out of memory when trying to initialize the larger image. Try scaling down your source images by 50% and see whether it runs. How much memory does your system have?
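To make the scaling suggestion concrete, here is a rough sketch of how much memory a single dense float32 tensor needs at the "A" shape from the log above, and at half that resolution. The helper names `tensor_bytes` and `scaled_shape` are made up for this illustration, and the figure covers only the input tensor. Intermediate feature maps in a convolutional network usually have far more than 3 channels, so the real footprint is likely much larger.

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed for one dense tensor of this shape (float32 by default)."""
    return reduce(mul, shape, 1) * bytes_per_element


def scaled_shape(shape, factor):
    """Scale the height and width of an (N, C, H, W) shape by `factor`."""
    n, c, h, w = shape
    return (n, c, int(round(h * factor)), int(round(w * factor)))


# "A" shape reported at scale factor 0.625 in the log above.
full = (1, 3, 1508, 1633)
half = scaled_shape(full, 0.5)

full_mib = tensor_bytes(full) / 2.0 ** 20
half_mib = tensor_bytes(half) / 2.0 ** 20
print("full: %.1f MiB, half: %.1f MiB" % (full_mib, half_mib))
```

Halving both height and width cuts the element count to roughly a quarter, which is why a 50% downscale can be enough to get past an allocation failure.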

@chenyuqing

I have the same problem, and my machine has 5.55 GB of RAM.

timchan@ubuntu:~/workspaces/dl/tf$ python3 full_code.py
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)

@fjcamillo

Having the same problem with input of shape (100, 512, 512, 3). I will try scaling the images down, but are there any other workarounds?
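One common workaround, besides downscaling, is to feed the data in smaller batches instead of all 100 images at once. As float32, a (100, 512, 512, 3) array alone is 300 MiB before any layer activations are allocated. The sketch below is a generic batching helper, not part of TensorFlow. The name `iter_batches` and the batch size of 16 are assumptions chosen for illustration, and the list of integers stands in for the real image array.

```python
def iter_batches(data, batch_size):
    """Yield consecutive slices of `data`, each with at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


# Stand-in for 100 images; slicing works the same way on a NumPy array.
images = list(range(100))

sizes = [len(batch) for batch in iter_batches(images, 16)]
print(sizes)
```

Each batch would then be passed to the model separately, so peak memory scales with the batch size rather than with the full dataset.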

@AyushKaul

Try fewer hidden layers if you are using a fully connected NN.
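To see why the number and width of hidden layers matters, the sketch below counts weights and biases for two fully connected networks. The layer widths are invented for illustration (784 inputs and 10 outputs mirror an MNIST-style setup), and `dense_param_count` is a hypothetical helper, not a library function.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a fully connected net with these layer widths."""
    pairs = zip(layer_sizes, layer_sizes[1:])
    return sum((n_in + 1) * n_out for n_in, n_out in pairs)


# Hypothetical architectures: three wide hidden layers vs. one narrow one.
wide = [784, 4096, 4096, 4096, 10]
narrow = [784, 512, 10]

for name, sizes in (("wide", wide), ("narrow", narrow)):
    params = dense_param_count(sizes)
    mib = params * 4 / 2.0 ** 20  # float32 weights only
    print("%s: %d parameters, about %.1f MiB" % (name, params, mib))
```

Training typically needs several times the weight memory, because gradients and optimizer state are stored as well, so trimming layers reduces the footprint by more than the raw weight size suggests.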

@tamizharasank

Same issue:

terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Aborted (core dumped)

@a7month

a7month commented Oct 10, 2018

Same issue when using tensorflow-jni in a Java application.

The TensorFlow version is 1.2. The Java crash log shows that the last stack frame is in Java_org_tensorflow_TensorFlow_registeredOpList:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f84bdcac7c9, pid=4315, tid=140204879898368
#
# JRE version: Java(TM) SE Runtime Environment (8.0_74-b02) (build 1.8.0_74-b02)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.74-b02 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libtensorflow_jni4459560773440445764.so+0x1d017c9]

siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000000000000

Registers:
RAX=0x0000000000000000, RBX=0x00007f8277200000, RCX=0x00007f8277200001, RDX=0x000000000000008f
RSP=0x00007f83fe0f8ca0, RBP=0x00007f83fe0f8ca0, RSI=0x00007f8277200000, RDI=0x0000000000000000
R8 =0x00000000000003b5, R9 =0x00007f83f9800000, R10=0x00007f83d3a52160, R11=0x00007f83f980b7c8
R12=0x000000000281ee60, R13=0x00007f83fe0f9090, R14=0x00007f83fe0f8f50, R15=0x00007f852c93d640
RIP=0x00007f84bdcac7c9, EFLAGS=0x0000000000010206, CSGSFS=0x000000000000e033, ERR=0x0000000000000006
  TRAPNO=0x000000000000000e

Instructions: (pc=0x00007f84bdcac7c9)
0x00007f84bdcac7a9:   89 45 f0 eb 1d 48 8b 45 f8 48 8d 50 01 48 89 55
0x00007f84bdcac7b9:   f8 48 8b 55 f0 48 8d 4a 01 48 89 4d f0 0f b6 12
0x00007f84bdcac7c9:   88 10 48 8b 45 d8 48 8d 50 ff 48 89 55 d8 48 85
0x00007f84bdcac7d9:   c0 75 d2 48 8b 45 e8 5d c3 66 2e 0f 1f 84 00 00

Register to memory mapping:

RAX=0x0000000000000000 is an unknown value
RBX=0x00007f8277200000 is an unknown value
RCX=0x00007f8277200001 is an unknown value
RDX=0x000000000000008f is an unknown value
RSP=0x00007f83fe0f8ca0 is pointing into the stack for thread: 0x00007f84ca75a000
RBP=0x00007f83fe0f8ca0 is pointing into the stack for thread: 0x00007f84ca75a000
RSI=0x00007f8277200000 is an unknown value
RDI=0x0000000000000000 is an unknown value
R8 =0x00000000000003b5 is an unknown value
R9 =0x00007f83f9800000 is an unknown value
R10=0x00007f83d3a52160 is an unknown value
R11=0x00007f83f980b7c8 is an unknown value
R12=0x000000000281ee60 is an unknown value
R13=0x00007f83fe0f9090 is pointing into the stack for thread: 0x00007f84ca75a000
R14=0x00007f83fe0f8f50 is pointing into the stack for thread: 0x00007f84ca75a000
R15=0x00007f852c93d640: pthread_key_create+0 in /lib64/libpthread.so.0 at 0x00007f852c931000

Stack: [0x00007f83fdffe000,0x00007f83fe0ff000],  sp=0x00007f83fe0f8ca0,  free space=1003k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libtensorflow_jni4459560773440445764.so+0x1d017c9]
C  [libtensorflow_jni4459560773440445764.so+0x1a0c46e]
C  [libtensorflow_jni4459560773440445764.so+0x18bc6da]
C  [libtensorflow_jni4459560773440445764.so+0x18910bb]
C  [libtensorflow_jni4459560773440445764.so+0x18b9ae9]
C  [libtensorflow_jni4459560773440445764.so+0x18842ef]
C  [libtensorflow_jni4459560773440445764.so+0x1885143]
C  [libtensorflow_jni4459560773440445764.so+0x20bd47]
C  [libtensorflow_jni4459560773440445764.so+0x20bf52]
C  [libtensorflow_jni4459560773440445764.so+0x201783]  Java_org_tensorflow_Session_run+0x3f3
j  org.tensorflow.Session.run(J[B[J[J[I[J[I[JZ[J)[B+0
j  org.tensorflow.Session.access$100(J[B[J[J[I[J[I[JZ[J)[B+17
j  org.tensorflow.Session$Runner.runHelper(Z)Lorg/tensorflow/Session$Run;+336
j  org.tensorflow.Session$Runner.run()Ljava/util/List;+2
objdump -C -d --start-address=0x1a0c46e libtensorflow_jni4459560773440445764.so | egrep '>:$' -m 1
0000000001a0c46e <Java_org_tensorflow_TensorFlow_registeredOpList+0x18094ee>:
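A note on reading the objdump line above: the offset after the symbol name is very large, which suggests the name is only the nearest preceding exported symbol and not necessarily the function that crashed. The sketch below parses that one line and reports how far the frame address lies from the start of the symbol. `symbol_distance` and `SYMBOL_RE` are hypothetical names for this illustration, and the regular expression is written only for the line format shown in this log.

```python
import re

# Matches a resolved line of the form "<address> <symbol+0xoffset>:".
SYMBOL_RE = re.compile(
    r"^(?P<addr>[0-9a-f]+) <(?P<sym>[^+>]+)\+0x(?P<off>[0-9a-f]+)>:$"
)


def symbol_distance(line):
    """Return (symbol, symbol start address, offset of the frame from it)."""
    match = SYMBOL_RE.match(line.strip())
    if match is None:
        raise ValueError("not a resolved symbol line: %r" % line)
    addr = int(match.group("addr"), 16)
    off = int(match.group("off"), 16)
    return match.group("sym"), addr - off, off


line = ("0000000001a0c46e "
        "<Java_org_tensorflow_TensorFlow_registeredOpList+0x18094ee>:")
sym, start, offset = symbol_distance(line)
print("%s starts at 0x%x; the frame is %.1f MiB past it"
      % (sym, start, offset / 2.0 ** 20))
```

An offset of tens of megabytes is far larger than a single JNI entry point is likely to be, so the frame probably belongs to internal library code without an exported symbol. A build with debug symbols would be needed to name the actual function.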
