ValueError: For an array_value, subvalues must either all be indexed or all excluded from indexes. #3152

Closed
thisbejim opened this issue Mar 16, 2017 · 21 comments · Fixed by #4915
Labels
api: datastore Issues related to the Datastore API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@thisbejim

thisbejim commented Mar 16, 2017

Querying entities that contain empty arrays raises ValueError: For an array_value, subvalues must either all be indexed or all excluded from indexes.

Edit: I tried getting around this with projection queries where possible but I need to modify the entities after fetching them, which you can't do with the results of a projection query. So this is a pretty big issue as there is no way around it.

@dhermes dhermes added the api: datastore Issues related to the Datastore API. label Mar 16, 2017
@lukesneeringer lukesneeringer added priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. status: acknowledged labels Mar 16, 2017
@lukesneeringer
Contributor

lukesneeringer commented Mar 16, 2017

Hi @thisbejim,
Thanks for reporting this issue.

First off, a disclaimer: I am not an expert in Datastore.

That said, I am having a little trouble with a fix for your issue, and it would help if you could provide the code that you think should work. It makes sense to me that an empty array should not raise an exception complaining about mismatched index exclusion of its subelements. However, I am having trouble getting an empty array to work at all.

Here is our code that checks for consistency on index exclusion:

        if is_list:
            exclude_values = set(value_pb.exclude_from_indexes
                                 for value_pb in value_pb.array_value.values)
            if len(exclude_values) != 1:
                raise ValueError('For an array_value, subvalues must either '
                                 'all be indexed or all excluded from '
                                 'indexes.')
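
To see why an empty array trips this check, the rule can be reproduced without protobuf. The sketch below is only an illustration: FakeValue, check_strict and check_guarded are hypothetical names, not part of google-cloud-datastore. With no subvalues the set of flags is empty, so its length is 0 rather than 1 and the strict check raises. The guarded variant is one assumed way to tolerate that case, and is not necessarily what the library ended up doing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeValue:
    """Hypothetical stand-in for a Datastore Value protobuf (not the real class)."""
    exclude_from_indexes: bool = False
    array_values: List["FakeValue"] = field(default_factory=list)


_MISMATCH = ('For an array_value, subvalues must either all be indexed '
             'or all excluded from indexes.')


def check_strict(value: FakeValue) -> None:
    """Mirror the check above: exactly one distinct flag must be present."""
    exclude_values = {sub.exclude_from_indexes for sub in value.array_values}
    if len(exclude_values) != 1:  # an empty array gives len == 0, so it raises
        raise ValueError(_MISMATCH)


def check_guarded(value: FakeValue) -> None:
    """Assumed variant: reject only arrays whose subvalues actually disagree."""
    exclude_values = {sub.exclude_from_indexes for sub in value.array_values}
    if len(exclude_values) > 1:
        raise ValueError(_MISMATCH)


# Three sample inputs: no subvalues, disagreeing flags, agreeing flags.
empty = FakeValue()
mixed = FakeValue(array_values=[FakeValue(True), FakeValue(False)])
uniform = FakeValue(array_values=[FakeValue(True), FakeValue(True)])
```

Here check_strict(empty) raises, while check_guarded(empty) returns quietly. Both functions still reject mixed and accept uniform.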

At first I thought this was a simple fix: add "and len(value) > 0" to the check. However (again, remember, I am unfamiliar with Datastore), when I went to write tests for it, I ran into trouble:

    def test_index_mismatch_ignores_empty_list(self):
        from google.cloud.proto.datastore.v1 import entity_pb2
        from google.cloud.datastore.helpers import _new_value_pb

        _PROJECT = 'PROJECT'
        _KIND = 'KIND'
        _ID = 1234
        entity_pb = entity_pb2.Entity()
        entity_pb.key.partition_id.project_id = _PROJECT
        entity_pb.key.path.add(kind=_KIND, id=_ID)

        array_val_pb = _new_value_pb(entity_pb, 'baz')
        array_pb = array_val_pb.array_value.values

        entity = self._call_fut(entity_pb)
        entity_dict = dict(entity)
        self.assertIsInstance(entity_dict['baz'], list)

This raises an exception when I try to run the function under test (which is entity_from_protobuf):


self = <unit_tests.test_helpers.Test_entity_from_protobuf testMethod=test_index_mismatch_ignores_empty_list>

    def test_index_mismatch_ignores_empty_list(self):
        from google.cloud.proto.datastore.v1 import entity_pb2
        from google.cloud.datastore.helpers import _new_value_pb
    
        _PROJECT = 'PROJECT'
        _KIND = 'KIND'
        _ID = 1234
        entity_pb = entity_pb2.Entity()
        entity_pb.key.partition_id.project_id = _PROJECT
        entity_pb.key.path.add(kind=_KIND, id=_ID)
    
        array_val_pb = _new_value_pb(entity_pb, 'baz')
        array_pb = array_val_pb.array_value.values
    
        # unindexed_value_pb1 = array_pb.add()
        # unindexed_value_pb1.integer_value = 10
    
>       entity = self._call_fut(entity_pb)

unit_tests/test_helpers.py:155: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
unit_tests/test_helpers.py:66: in _call_fut
    return entity_from_protobuf(val)
.tox/py35/lib/python3.5/site-packages/google/cloud/datastore/helpers.py:124: in entity_from_protobuf
    value = _get_value_from_value_pb(value_pb)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

value_pb = 

    def _get_value_from_value_pb(value_pb):
        """Given a protobuf for a Value, get the correct value.
    
        The Cloud Datastore Protobuf API returns a Property Protobuf which
        has one value set and the rest blank.  This function retrieves the
        the one value provided.
    
        Some work is done to coerce the return value into a more useful type
        (particularly in the case of a timestamp value, or a key value).
    
        :type value_pb: :class:`.entity_pb2.Value`
        :param value_pb: The Value Protobuf.
    
        :rtype: object
        :returns: The value provided by the Protobuf.
        :raises: :class:`ValueError <exceptions.ValueError>` if no value type
                 has been set.
        """
        value_type = value_pb.WhichOneof('value_type')
    
        if value_type == 'timestamp_value':
            result = _pb_timestamp_to_datetime(value_pb.timestamp_value)
    
        elif value_type == 'key_value':
            result = key_from_protobuf(value_pb.key_value)
    
        elif value_type == 'boolean_value':
            result = value_pb.boolean_value
    
        elif value_type == 'double_value':
            result = value_pb.double_value
    
        elif value_type == 'integer_value':
            result = value_pb.integer_value
    
        elif value_type == 'string_value':
            result = value_pb.string_value
    
        elif value_type == 'blob_value':
            result = value_pb.blob_value
    
        elif value_type == 'entity_value':
            result = entity_from_protobuf(value_pb.entity_value)
    
        elif value_type == 'array_value':
            result = [_get_value_from_value_pb(value)
                      for value in value_pb.array_value.values]
    
        elif value_type == 'geo_point_value':
            result = GeoPoint(value_pb.geo_point_value.latitude,
                              value_pb.geo_point_value.longitude)
    
        elif value_type == 'null_value':
            result = None
    
        else:
>           raise ValueError('Value protobuf did not have any value set')
E           ValueError: Value protobuf did not have any value set

This makes me wonder if an empty array is even a valid option. (If I add a value to the array, it works fine.)

Long story short: Can you provide code with what you are trying to do, and your expected result?

lukesneeringer pushed a commit to lukesneeringer/google-cloud-python that referenced this issue Mar 16, 2017
@thisbejim
Author

thisbejim commented Mar 16, 2017

Hi @lukesneeringer, thanks for looking into this.

So to provide some context - I've built out an application using the node datastore client, which does allow for writing and retrieving empty arrays, and I'm setting up a separate python service just to handle my cron jobs / tasks.

The application is a micro-blogging clone, so I have a Kind of User with an array property called following which holds ids of pages a user is following. An empty following array is saved on the node server when a user creates an account. What happens when using the python client is that just retrieving a user (with a simple get or a query) raises the error above:

key = db.key('User', 'username')
user = db.get(key) # error

or

query = db.query(kind='User')
users = list(query.fetch()) # error

If I use a query with a projection to avoid retrieving the empty following array there is no error:

query = db.query(kind='User')
query.add_filter('username', '=', 'username')
query.projection = ['created']
users = list(query.fetch()) # no error

@lukesneeringer
Contributor

Thanks. I will look into this further.

@lukesneeringer
Contributor

@thisbejim Could you help me out by pulling this branch and telling me if it fixes your issue, or if you just get a different error? (I am asking under the assumption this is a quick check for you; if not, feel free to just say no.)

@alercunha

I'm facing the same issue described here. I haven't tried the branch yet, but is there any workaround available?

@dhermes
Contributor

dhermes commented May 2, 2017

@alercunha The branch may be a path to a workaround, but it seems @lukesneeringer was looking for someone to vet the fix.

@alercunha

alercunha commented May 3, 2017

I just tested the suggested fix and in my case it does work. Any help needed promoting this fix to be merged back to master?

@arliber

arliber commented Jul 13, 2017

It happens to me only when there is an EMPTY indexed array. As a workaround, I added a placeholder value instead of keeping it empty.

Is there any progress with fixing that?
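
The placeholder workaround described above could be sketched roughly as follows. This is only an illustration under stated assumptions: EMPTY_MARKER, to_stored and from_stored are hypothetical names, the array is assumed to hold strings, and the marker must be a value that can never occur as real data. It also only helps if you control the code that writes the entities.

```python
from typing import Iterable, List

# Hypothetical sentinel; choose something that cannot collide with a real ID.
EMPTY_MARKER = "__empty__"


def to_stored(values: Iterable[str]) -> List[str]:
    """Return the list to write, substituting the marker for an empty list."""
    stored = list(values)
    return stored if stored else [EMPTY_MARKER]


def from_stored(stored: Iterable[str]) -> List[str]:
    """Strip the marker again after reading, recovering the logical list."""
    return [value for value in stored if value != EMPTY_MARKER]
```

A writer would save to_stored(following) in the array property and a reader would pass the fetched list through from_stored, so the rest of the application still sees an empty list.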

@ishcherbakov

Hi guys!

I'm facing the same issue and I use datastore in read-only mode so I can't do anything with original data. Any news about the fix @lukesneeringer @dhermes ?

@metekemertas

Facing the same issue in a critical moment. Any updates @lukesneeringer @dhermes ?

@lukesneeringer
Contributor

@MeteKem @ishcherbakov Does the branch I had in place a few months ago solve your issue? If it is actually a valid fix, then I am happy to merge it.

@alercunha

@lukesneeringer it did fix the issue in my case.

@metekemertas

@lukesneeringer It fixed the issue in my case too.

lukesneeringer pushed a commit to lukesneeringer/google-cloud-python that referenced this issue Aug 8, 2017
@lukesneeringer
Contributor

Okay, it is now a pull request. :-)

lukesneeringer pushed a commit to lukesneeringer/google-cloud-python that referenced this issue Aug 8, 2017
@lukesneeringer lukesneeringer added priority: p2 Moderately-important priority. Fix may not be included in next release. and removed priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. labels Aug 11, 2017
@rmanak

rmanak commented Oct 19, 2017

Same issue here. Is there a fix for this that has been merged into a released version of the datastore client library?

@alercunha

alercunha commented Oct 19, 2017

I've forked the project just to merge upstream/master into the PR branch that contains this fix. You can clone my branch and install it manually if you like.

So you would have to do:

$ git clone https://github.com/alercunha/google-cloud-python/
$ cd google-cloud-python
$ git checkout datastore-issue-3152
$ cd datastore && python setup.py install

@rmanak

rmanak commented Oct 19, 2017

Thanks @alercunha, but I can't figure out the dependencies.
In a virtualenv, after python setup.py install, I get:
ImportError: No module named 'google.api_core'
After running pip install . at the parent repo level, I also need to install google-auth, and pip install google-auth then fails with:
AttributeError: '_NamespacePath' object has no attribute 'sort'

The script I am trying to run is as simple as this:

import google.auth
from google.cloud import datastore
credentials, project = google.auth.default()
client = datastore.Client(project=project, credentials=credentials)
query = client.query(kind='Event')
res = query.fetch()
for item in res:
    print(list(item.items()))

Any help getting this working with your patch would be greatly appreciated.

@alercunha

Looks like some recent changes ended up breaking certain dependencies.

@alercunha

alercunha commented Oct 19, 2017

Luckily I tagged a version that was working fine a few days before.
So you can do this instead of checking out the branch:

$ git checkout tags/1.3.0-datastore-issue-3152
$ cd datastore && python setup.py install

@rmanak

rmanak commented Oct 19, 2017

@alercunha thank you! that worked!

@yahyamortassim

yahyamortassim commented Mar 5, 2019

Still having this error with google-cloud-datastore==1.7.3 when trying to query a key that has the format: [{"key1":"","key2":"value2","key3":[]}] via

query = datastore_client.query(kind='KIND')
query.add_filter('name', '=', 'EXAMPLE')
res = list(query.fetch())
