Serializer.Unpack on a Continuous Stream of Data => High Memory Usage #356

Open
rootfixxxer opened this issue Apr 14, 2023 · 3 comments

Comments

@rootfixxxer

I'm running some tests with this package, and if I have a connection/stream that is always sending data, memory usage keeps rising until the system crashes.

Investigating the issue with a memory profiler, I can see that it keeps allocating these objects (among others) and never releasing them:

  • MsgPack.MessagePackString
  • System.Byte[]
  • MsgPack.MessagePackObject[]
  • System.Collections.Generic.List<MsgPack.MessagePackObject>
  • System.String
  • MsgPack.MessagePackObjectDictionary

Is there something I can do to be able to use this library in such cases?
A workaround?

Current usage:

var serializer = MessagePackSerializer.Get<MyType>();
while (!cancellationToken.IsCancellationRequested)
{
   var result = serializer.Unpack(myOpenStream);
   // Do nothing with the results; the memory increases anyway
   result = null; // Just in case; makes no difference
}
@yfakariya
Member

Thank you for reporting. It looks like a bug, as you suspect.
I need to confirm a few things to reproduce and investigate your situation:

  • Definition of MyType. I guess you use a loosely typed collection (dictionary) in your type.
  • Content of myOpenStream.

Could you give me the above info?

@yfakariya yfakariya added the need-more-info Need more information (error message, stack trace, repro code etc) label Apr 17, 2023
@rootfixxxer
Author

Hello

After some more tests, I found out the problem is with the volume of data present in the stream (a NetworkStream). I don't know if the library can optimize how it handles the data, but this is what I have as MyType:

public class MsgPackArray
{
    [MessagePackMember(0)]
    public IList<List<MessagePackObject>> ListContents { get; set; }
    [MessagePackMember(1)]
    public ulong Location { get; set; }
}

By default, each inner list always has 3 items: the first is a 16-byte byte array, the second a UInt64, and the third a MessagePackObjectDictionary that has 4 keys and 4 values, all strings...

When I receive objects with more than 100k items in ListContents, that's when memory usage starts to rise, and it never drops...

Thanks

@yfakariya yfakariya removed the need-more-info Need more information (error message, stack trace, repro code etc) label Apr 23, 2023
@yfakariya
Member

yfakariya commented Apr 23, 2023

Sorry, I have been too busy to investigate this problem, but I would like to know 1) the average size of each string key and value in the dictionary and 2) the bitness of your process (32-bit or 64-bit).
A MessagePackObject must hold both a byte array (the undecoded string) and the decoded string, so strings require double the size, and a 100K-item dictionary with strings of a few kilobytes each needs over 2GB of memory.
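
As a rough illustration of that arithmetic, here is a minimal back-of-the-envelope sketch. It is not part of the library; `MemoryEstimate` and `EstimateStringBytes` are hypothetical names. It assumes ASCII-only content (1 byte per character in the raw UTF-8 bytes, 2 bytes per character in the decoded .NET string) and ignores object headers and collection overhead, so real usage would be higher.

```csharp
using System;

public static class MemoryEstimate
{
    // Estimates the bytes needed when every string is held twice:
    // once as its raw (undecoded) bytes and once as a decoded .NET string.
    // Assumption: ASCII-only content, so 1 byte/char raw and 2 bytes/char as UTF-16.
    // Object headers and collection overhead are deliberately ignored.
    public static long EstimateStringBytes(long items, int stringsPerItem, int charsPerString)
    {
        const int RawBytesPerChar = 1;    // undecoded byte array
        const int Utf16BytesPerChar = 2;  // decoded System.String
        return items * stringsPerItem * charsPerString * (RawBytesPerChar + Utf16BytesPerChar);
    }
}
```

With 100,000 items, 8 strings per item (4 keys and 4 values, as described earlier in the thread), and an assumed 1,024 characters per string, `MemoryEstimate.EstimateStringBytes(100_000, 8, 1024)` gives 2,457,600,000 bytes, which is already beyond what a 32-bit process can address comfortably.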

You can avoid this oversized behavior by using a POCO for the inner list, because it always has three items, like the following:

public class MsgPackArray
{
    [MessagePackMember(0)]
    public IList<MsgPackInnerArray> ListContents { get; set; }
    [MessagePackMember(1)]
    public ulong Location { get; set; }
}

public class MsgPackInnerArray
{
    [MessagePackMember(0)]
    public byte[] First { get; set; }
    [MessagePackMember(1)]
    public ulong Second { get; set; }
    [MessagePackMember(2)]
    public Dictionary<string, string> Third { get; set; }
}

This can deserialize the following structure (represented as JSON for explanation):

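[
  [ // begin ListContents
    [ // one MsgPackInnerArray
      "0123456789ABCDEF0123456789ABCDEF", // First (16 bytes, shown as hex)
      123, // Second
      {"A": "1", "B": "2", ... } // Third
    ],
    ... // more MsgPackInnerArray items
  ], // end ListContents
  456 // Location
]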

Note that you still might face an OutOfMemoryException when you run in a 32-bit process.

Or, if you want to reduce memory usage anyway, you can use streaming processing with Unpacker directly:

using (var unpacker = Unpacker.Create(myStream))
{
	// Root: a 2-element array, [ListContents, Location].
	if (!unpacker.Read() || !unpacker.IsArrayHeader || unpacker.LastReadData != 2) throw new Exception("Invalid input");

	// Dispose each subtree unpacker before reading from its parent again.
	using (var msgPackArrayUnpacker = unpacker.ReadSubtree())
	{
		// ListContents: an array of 3-element inner arrays.
		if (!msgPackArrayUnpacker.Read() || !msgPackArrayUnpacker.IsArrayHeader) throw new Exception("Invalid input");

		var listSize = msgPackArrayUnpacker.LastReadData.AsInt32();
		using (var listUnpacker = msgPackArrayUnpacker.ReadSubtree())
		{
			for (var n = 0; n < listSize; n++)
			{
				if (!listUnpacker.Read() || !listUnpacker.IsArrayHeader || listUnpacker.LastReadData != 3) throw new Exception("Invalid input");

				var first = listUnpacker.ReadItemData().AsBinary();
				var second = listUnpacker.ReadItemData().AsUInt64();

				if (!listUnpacker.Read() || !listUnpacker.IsMapHeader) throw new Exception("Invalid input");

				var mapSize = listUnpacker.LastReadData.AsInt32();
				using (var mapUnpacker = listUnpacker.ReadSubtree())
				{
					for (var i = 0; i < mapSize; i++)
					{
						var key = mapUnpacker.ReadItemData().AsString();
						var value = mapUnpacker.ReadItemData().AsString();
						// Process key and value here...
					}
				}
			}
		}

		var location = msgPackArrayUnpacker.ReadItemData().AsUInt64();
	}
}
