Memory Leak when calling ConcreteFunction #477
Comments
I think the example looks ok. Can you take a heap dump with VisualVM and see what Java objects it's allocating?
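Before taking a full heap dump, a quick first check is to log the used JVM heap once per iteration. The sketch below uses only the standard `java.lang.management` API; the class name `HeapProbe` and the 8 MiB stand-in allocation are hypothetical and are not part of the reproduction code. These numbers cover only the Java heap, so memory allocated by native code will not show up here and has to be compared against the process size reported by the operating system.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapProbe {

    /** Returns the currently used JVM heap in MiB. */
    static long usedHeapMiB() {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        return heap.getUsed() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        for (int loop = 0; loop < 5; loop++) {
            // Hypothetical stand-in for the per-iteration work of the real loop.
            byte[] work = new byte[8 * 1024 * 1024];
            work[0] = 1;

            // Only a hint to the JVM; it makes the readings a bit easier to compare.
            System.gc();
            System.out.println("loop " + loop + ": heap used = " + usedHeapMiB() + " MiB");
        }
    }
}
```

If the logged heap stays roughly flat while the process keeps growing, the growth is most likely outside the Java heap.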
Just out of curiosity @lucaro, what happens if you don't copy the tensor values to a byte array but simply access the string in the tensor like this?
Just want to narrow down the possible source of leakage.
Also, if that may unblock you until we find the problem, you can build a |
Getting the Thanks for pointing me to the
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Random;
import org.tensorflow.ConcreteFunction;
import org.tensorflow.Signature;
import org.tensorflow.ndarray.Shape;
import org.tensorflow.ndarray.buffer.DataBuffer;
import org.tensorflow.ndarray.buffer.DataBuffers;
import org.tensorflow.ndarray.buffer.FloatDataBuffer;
import org.tensorflow.op.Ops;
import org.tensorflow.op.core.Placeholder;
import org.tensorflow.op.io.SerializeTensor;
import org.tensorflow.proto.framework.TensorProto;
import org.tensorflow.proto.framework.TensorShapeProto;
import org.tensorflow.proto.framework.TensorShapeProto.Dim;
import org.tensorflow.types.TFloat32;
import org.tensorflow.types.TString;

public class MemoryLeakTest {

    public static Signature serializeTensor(Ops tf) {
        Placeholder<TFloat32> input = tf.placeholder(TFloat32.class);
        SerializeTensor output = tf.io.serializeTensor(input);
        return Signature.builder().input("tensor", input).output("out", output).build();
    }

    public static void main(String[] args) {
        ConcreteFunction function = ConcreteFunction.create(MemoryLeakTest::serializeTensor);
        Random random = new Random(0);

        // run same operation for many tensors
        for (int loop = 0; loop < 10000; loop++) {

            // generate some tensor
            float[] arr = new float[512 * 512 * 512];
            ArrayList<Float> farr = new ArrayList<>(512 * 512 * 512);
            for (int i = 0; i < arr.length; i++) {
                arr[i] = random.nextFloat();
                farr.add(arr[i]);
            }
            FloatDataBuffer buf = DataBuffers.of(arr);
            TFloat32 inputTensor = TFloat32.tensorOf(Shape.of(512, 512, 512), buf);

            // serialize tensor via native tensorflow
            TString outputTensor = (TString) function.call(inputTensor);
            DataBuffer<byte[]> outputBuffer = DataBuffers.ofObjects(byte[].class, 1);
            outputTensor.asBytes().read(outputBuffer);
            byte[] serialized = outputBuffer.getObject(0);

            // serialize tensor via TensorProto
            TensorProto proto = TensorProto.newBuilder()
                .setTensorShape(TensorShapeProto.newBuilder()
                    .addDim(Dim.newBuilder().setSize(512).build())
                    .addDim(Dim.newBuilder().setSize(512).build())
                    .addDim(Dim.newBuilder().setSize(512).build())
                    .build())
                .addAllFloatVal(farr)
                .build();

            System.out.println("Generated serialized tensor 1 with " + serialized.length + " bytes");
            byte[] serialized2 = proto.toByteArray();
            System.out.println("Generated serialized tensor 2 with " + serialized2.length + " bytes");
            if (Arrays.equals(serialized, serialized2)) {
                System.out.println("Arrays are the same");
            } else {
                System.out.println("Arrays are not the same");
            }

            // close tensors
            inputTensor.close();
            outputTensor.close();
        }

        // close function in the end
        function.close();
    }
}
So the two missing bytes came from me forgetting to set the data type, after adding |
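For context on why an unset data type would plausibly cost exactly two bytes: in the protobuf wire format, a varint field is written as a tag byte followed by the value. In `TensorProto`, `dtype` is field number 1 and `DT_FLOAT` has enum value 1, so the field encodes as `0x08 0x01`. The sketch below reproduces that arithmetic with plain Java; the class `DtypeFieldBytes` and its helpers are hypothetical and do not use the protobuf library.

```java
import java.io.ByteArrayOutputStream;

public class DtypeFieldBytes {

    /** Encodes a non-negative int as a protobuf base-128 varint. */
    static byte[] varint(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80); // low 7 bits, continuation bit set
            value >>>= 7;
        }
        out.write(value); // last group, continuation bit clear
        return out.toByteArray();
    }

    /** Encodes a varint-typed field: tag = (fieldNumber << 3) | wire type 0, then the value. */
    static byte[] varintField(int fieldNumber, int value) {
        byte[] tag = varint(fieldNumber << 3);
        byte[] val = varint(value);
        byte[] result = new byte[tag.length + val.length];
        System.arraycopy(tag, 0, result, 0, tag.length);
        System.arraycopy(val, 0, result, tag.length, val.length);
        return result;
    }

    public static void main(String[] args) {
        // dtype is field 1 of TensorProto; DT_FLOAT has enum value 1.
        byte[] encoded = varintField(1, 1);
        System.out.println("dtype field takes " + encoded.length + " bytes");
    }
}
```

Because proto3 omits fields left at their default value, a `TensorProto` built without a data type simply lacks these bytes in its serialized form.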
System information
I have an application where I need to write a large amount of data as TFRecords to be used externally. To encode these records properly, I need to invoke tf.io.serializeTensor inside a ConcreteFunction. When running this repeatedly, memory usage increases until the program eventually runs out of memory and crashes. When inspecting the process with VisualVM, I can see that the memory does not build up in the JVM, so it must be some sort of memory leak when calling native code. I added a minimal example below. Did I improperly close something or is this indeed a bug?
Code to reproduce the issue