Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NegativeArraySizeException on @Tag(Integer.MAX_VALUE) #81

Closed
alexshvid opened this issue Jan 23, 2015 · 9 comments
Closed

NegativeArraySizeException on @Tag(Integer.MAX_VALUE) #81

alexshvid opened this issue Jan 23, 2015 · 9 comments
Assignees
Labels
Milestone

Comments

@alexshvid
Copy link
Contributor

Does not work big tag number.

@Tag(Integer.MAX_VALUE)
public boolean unknown = false;

Caused by: java.lang.NegativeArraySizeException
at com.dyuproject.protostuff.runtime.MappedSchema.(MappedSchema.java:84)
at com.dyuproject.protostuff.runtime.RuntimeSchema.(RuntimeSchema.java:331)
at com.dyuproject.protostuff.runtime.RuntimeSchema.createFrom(RuntimeSchema.java:249)
at com.dyuproject.protostuff.runtime.RuntimeSchema.createFrom(RuntimeSchema.java:156)
at com.dyuproject.protostuff.runtime.IdStrategy.newSchema(IdStrategy.java:129)
at com.dyuproject.protostuff.runtime.DefaultIdStrategy$Lazy.getSchema(DefaultIdStrategy.java:630)
... 26 more

@dyu
Copy link
Member

dyu commented Jan 23, 2015

Avoid using large numbers as tags.
Internally, the lookup for the tags would create an array (used as an index on deser) where its length is max number + 1
The exception is thrown since the size of the array is now Integer.MAX_VALUE + 1, which overflows.

@alexshvid
Copy link
Contributor Author

I see implementation, it uses arrays by default, that is the wrong approach of utilizing memory. Otherwise, protostuff needs to check the largest field number with some border and use the Map algorithm.

@dyu
Copy link
Member

dyu commented Jan 23, 2015

Well you can add a new annotation which marks the entity as having "sparse" field numbers.
Then you would configure protostuff-runtime to create an int map (instead of array) for lookup in deser.
Note that your deser speed would be slower especially when the map buckets have more than one entry.

@alexshvid
Copy link
Contributor Author

There is a standard approach in building software where any dumb value have to be processed correctly. If Integer.MAX_VALUE is not correct, it have to be IllegalArgumentException, but not the NegativeArraySizeException. Second thing is the software efficiency. We know that array is faster, but for big tag numbers it consumes more memory, and users are not expecting this. For example, Java sort has two implementations depending on sorting size. Also, David, can you look on this repo https://github.com/inwebby/protostuff-runtime-proto, it is the converter from RuntimeSchema to .proto files.

@kshchepanovskyi
Copy link
Member

Protobuf documentation says:

The smallest tag number you can specify is 1, and the largest is 2^29 - 1, or 536,870,911. You also cannot use the numbers 19000 though 19999 (FieldDescriptor::kFirstReservedNumber through FieldDescriptor::kLastReservedNumber), as they are reserved for the Protocol Buffers implementation - the protocol buffer compiler will complain if you use one of these reserved numbers in your .proto.

We can optionally check this range and throw IllegalArgumentException when tag value is out of range.

@alexshvid
Copy link
Contributor Author

That will be a right approach to validate tag numbers.

@dyu
Copy link
Member

dyu commented Jan 24, 2015

The @Tag annotation was later added by an external contributor, Brice Jaglin (see CONTRIBUTORS.txt). The original implementation used arrays because the field numbers were basically sequential (you couldn't configure them). Their use-case for tags is definitely different than yours.
So if you want it to fit your usecase, submit a PR and we'll be happy to merge it.

@kshchepanovskyi
Copy link
Member

Here is another test case, when we get OutOfMemory:

Add to pom.xml

    <dependency>
      <groupId>org.openjdk.jol</groupId>
      <artifactId>jol-core</artifactId>
      <version>0.3</version>
      <scope>test</scope>
    </dependency>
package io.protostuff.runtime;

import org.junit.Test;
import org.openjdk.jol.info.GraphLayout;

import io.protostuff.Tag;

import static io.protostuff.runtime.RuntimeSchema.createFrom;
import static org.openjdk.jol.info.GraphLayout.parseInstance;

/**
 * @author Konstantin Shchepanovskyi
 */
public class RuntimeSchemaBigTagValueTest
{

    @Test
    public void testOutOfMemory() throws Exception
    {
        long sizeWhenTagIs1 = parseInstance(createFrom(A1.class)).totalSize();
        System.out.println(sizeWhenTagIs1);
        long sizeWhenTagIs1000 = parseInstance(createFrom(A1000.class)).totalSize();
        System.out.println(sizeWhenTagIs1000);
        long sizeWhenTagIs1000_000 = parseInstance(createFrom(A1000_000.class)).totalSize();
        System.out.println(sizeWhenTagIs1000_000);
        long sizeWhenTagIs1000_000_000 = parseInstance(createFrom(A1000_000_000.class)).totalSize();
        System.out.println(sizeWhenTagIs1000_000_000);
    }

    static class A1
    {
        @Tag(1)
        private int x;
    }

    static class A1000
    {
        @Tag(1000)
        private int x;
    }

    static class A1000_000
    {
        @Tag(1000_000)
        private int x;
    }

    static class A1000_000_000
    {
        @Tag(1000_000_000)
        private int x;
    }
}

Result:

1704
5704
4001704

java.lang.OutOfMemoryError: Java heap space
    at io.protostuff.runtime.MappedSchema.<init>(MappedSchema.java:82)
    at io.protostuff.runtime.RuntimeSchema.<init>(RuntimeSchema.java:339)
    at io.protostuff.runtime.RuntimeSchema.createFrom(RuntimeSchema.java:258)
    at io.protostuff.runtime.RuntimeSchema.createFrom(RuntimeSchema.java:150)
    at io.protostuff.runtime.RuntimeSchemaBigTagValueTest.testOutOfMemory(RuntimeSchemaBigTagValueTest.java:26)
    ...

@kshchepanovskyi
Copy link
Member

@dyu

Solution1:
If max tag value < 100 (some meaningful constant), then use array.
If max tag value >= 100, then use hashmap (but not JDK-s HashMap - it will kill performance). I will try to make this fix.

Solution2:
Use hashmap, always.

In any case we need fast int-to-int hashtable.

@kshchepanovskyi kshchepanovskyi added this to the v1.3.2 milestone Feb 21, 2015
@kshchepanovskyi kshchepanovskyi self-assigned this Feb 21, 2015
This was referenced Feb 21, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants