Add NDB Python library #2

Closed
jturmel opened this issue Jun 3, 2013 · 32 comments
Labels
api: datastore Issues related to the Datastore API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@jturmel

jturmel commented Jun 3, 2013

b/11319745

We already use Google App Engine, it would be nice if the NDB library was available. Do you guys have a timeline on that?

@proppy
Contributor

proppy commented Jun 3, 2013

As one of the Google Cloud Datastore engineers commented here, it should be possible to adapt NDB to use googledatastore instead of datastore_rpc.

@Alfus

Alfus commented Sep 15, 2013

We are actively working on this, but I can't give you a timeline.

@christopherhesse

Is there any news on this? I'd rather not implement this myself if it can be avoided.

@proppy
Contributor

proppy commented Jan 30, 2014

@christopherhesse the engineering team is still working on it, unfortunately we don't have any timeline to share ATM.

@procedurallygenerated

Is it any closer to a release? Thanks

@lucemia

lucemia commented Jul 17, 2014

How about the db library? Will it also be supported in the future?

@obeleh

obeleh commented Jul 17, 2014

I've been wanting to open source a library we use to make the python googledatastore library easier to use. It's not NDB, but it should save some code. After reading a few requests for NDB, I took the time to document and open source it. For me it made using GCD a lot easier. I'm interested in feedback. @proppy can you also have a look at it? Perhaps some functionality could be added to the original library?

The wrapper: https://github.com/transceptor-technology/dbWrapper

@jgeewax

jgeewax commented Jul 17, 2014

I'd be happy to take a look too. Feel free to share.

@obeleh

obeleh commented Jul 17, 2014

Cool to have some extra eyes on it :)

@jgeewax

jgeewax commented Jul 17, 2014

(Sorry - looks like the link didn't come through over e-mail... seeing it only in the UI... looking now.)

@jgeewax

jgeewax commented Jul 17, 2014

@obeleh : What are your thoughts on what we have in gcloud-python ? It seems very similar to what you're doing but follows PEP8, etc (https://github.com/GoogleCloudPlatform/gcloud-python/blob/master/gcloud/datastore/demo/demo.py).

dbWrapper

for entity in getEntities(kind='Person', height_eq=min_height):
    first_name = entity.getProperty('first_name')
    last_name = entity.getProperty('last_name')
    height = entity.getProperty('height')
    print '%s %s, %d inches tall' % (first_name, last_name, height)

gcloud.datastore

for entity in dataset.query().kind('Person').filter('height >=', min_height):
    first_name = entity['first_name']
    last_name = entity['last_name']
    height = entity['height']
    print '%s %s, %d inches tall' % (first_name, last_name, height)

@obeleh

obeleh commented Jul 17, 2014

I haven't seen this gcloud.datastore yet. How new/old is it?

@jgeewax

jgeewax commented Jul 17, 2014

Last commit was in May. Started this year. Adding more support as quickly as possible.

Docs are here: http://googlecloudplatform.github.io/gcloud-python/

@obeleh

obeleh commented Jul 17, 2014

Man I wish that library existed before I started building my own.

My current drawback is that it's all synchronous code. I'm using Twisted in our backend (I hope to use tulip in the future). I recently ran into memory problems when using the Python BigQuery client. Since that client is also synchronous, I've started implementing the JSON API in Twisted to go as async as possible. The BQ JSON API implementation cuts memory consumption by about 10x when running queries. Any chance you guys are going to build async as well?

@jgeewax

jgeewax commented Jul 17, 2014

It's on the list: googleapis/google-cloud-python#40 (along with the request for this bug).

Pull requests are always welcome too!

@aliafshar

It's not a great solution, but I swear by deferToThread for easy integration. http://unpythonic.blogspot.com/2012/07/calling-google-drive-api-and-other.html

@obeleh

obeleh commented Jul 18, 2014

I've had some bad experiences with it in combination with googleclouddatastore because it uses httplib2. deferToThread should be a last resort. httplib2 is not thread safe. That's why I do this: https://github.com/transceptor-technology/dbWrapper/blob/master/dbWrapper.py#L37

EDIT: deferToThread is great if you only have to use it a few times. But spinning up hundreds of threads brings a lot of overhead: programmatically, because it uses extra memory, and above all mentally, because you have to be careful with what you do in threads. Debugging with a large pool of running threads is no fun either. When each request you handle requires a few GCD, GCS and/or BQ actions, the number of threads grows wildly. And since httplib2 doesn't cope well with requests for large quantities of data, I've had multiple occasions where Linux killed my process. Okay, I can always use a larger gcs instance, but that is not the point. It hinders scaling unnecessarily. The best way to work with it is to batch as many operations as possible. But I have use cases where batching isn't possible.
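One common way to live with a client that is not thread safe is to give every worker thread its own client and to cap the number of threads. The sketch below only illustrates that idea with the standard library. `FakeHttp`, `get_client`, `fetch` and `fetch_all` are hypothetical names, `FakeHttp` is a stand-in rather than httplib2, and this is not necessarily what dbWrapper does at the linked line.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeHttp:
    """Stand-in for a client object that must not be shared across threads."""

    def __init__(self):
        self.owner = threading.get_ident()
        self.calls = 0

    def request(self, url):
        # A real client would perform a blocking network call here.
        assert threading.get_ident() == self.owner, "client used from another thread"
        self.calls += 1
        return "response for %s" % url


# Each thread sees its own attributes on this object.
_local = threading.local()


def get_client(factory=FakeHttp):
    """Return this thread's private client, creating it on first use."""
    client = getattr(_local, "client", None)
    if client is None:
        client = factory()
        _local.client = client
    return client


def fetch(url):
    return get_client().request(url)


def fetch_all(urls, max_workers=4):
    """Run blocking fetches on a bounded pool instead of one thread per call."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

With a bounded pool the number of client objects never exceeds `max_workers`, which keeps memory use predictable, but it does not remove the cost of blocking calls themselves.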

@lucemia

lucemia commented Jul 19, 2014

Thanks for the great work @obeleh @jgeewax,
it is much easier and more fun to use Cloud Datastore now. :)

As for db/ndb on Cloud Datastore,
I did some experiments based on the previous comment.

Here are some results. It is not a complete implementation yet,
just a proof of concept, but it does work.

There are two ways to implement the ORM:

  1. Monkey patch datastore_rpc.py (async_get ... or the _make_rpc_call method). However, this approach needs to mock the rpc object, and I don't have a clear idea of how to do that yet.
  2. Create a new datastore stub and register it with api_proxy, just as datastore_sqlite_stub.py or datastore_file_stub.py do in the development server. This approach looks easier.

The project is still under construction.
Any feedback or comments are welcome.
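
To make the second approach above more concrete, here is a toy sketch of the general pattern: callers go through a registry keyed by service name, and swapping the registered stub changes the backend without touching the callers. Every name here (`StubMap`, `InMemoryDatastoreStub`, `save`, `load`, and the request dictionaries) is hypothetical and is not the App Engine SDK's actual interface; the service name string is only illustrative.

```python
class StubMap:
    """Toy registry mapping a service name to the object that handles its calls."""

    def __init__(self):
        self._stubs = {}

    def register_stub(self, service, stub):
        self._stubs[service] = stub

    def make_sync_call(self, service, call, request):
        if service not in self._stubs:
            raise KeyError("no stub registered for %r" % service)
        return self._stubs[service].make_sync_call(call, request)


class InMemoryDatastoreStub:
    """Hypothetical backend; a real one would forward to the Cloud Datastore API."""

    def __init__(self):
        self._entities = {}

    def make_sync_call(self, call, request):
        if call == "Put":
            self._entities[request["key"]] = request["properties"]
            return {"key": request["key"]}
        if call == "Get":
            return {"entity": self._entities.get(request["key"])}
        raise NotImplementedError(call)


def save(stub_map, key, properties):
    """Caller code: it only knows the service name, not the backend."""
    request = {"key": key, "properties": properties}
    return stub_map.make_sync_call("datastore_v3", "Put", request)


def load(stub_map, key):
    return stub_map.make_sync_call("datastore_v3", "Get", {"key": key})["entity"]
```

Registering a different stub object under the same service name is all it takes to redirect `save` and `load`, which is why this route avoids patching the calling code.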

@jgeewax

jgeewax commented Jul 22, 2014

Wow @lucemia : this looks pretty awesome!

There's certainly talk happening about making it easy for ndb to be used inside gcloud.datastore.

I wonder... could you take a crack at a pull request for https://github.com/GoogleCloudPlatform/google-python that puts ndb (or db) in gcloud.datastore.ndb/gcloud.datastore.db? (Ideally we wouldn't do any from google.ext import ndb type calls, but just from gcloud.datastore import ndb.)

If you could do that, we could all work on the pull request together to try to get things working in a branch and merge to the mainline...

Let me know if I can do anything to help...

@Alfus

Alfus commented Jul 22, 2014

FYI, datastore_rpc.py and datastore_query.py were designed to support swapping out the low level transport layer. Given an adapter and a connection that works on top of the Cloud Datastore API, ndb should work (as long as you disable caching). NDB will officially support the Cloud Datastore API in this way in the future.

@lucemia

lucemia commented Jul 25, 2014

@jgeewax I'd be happy to create a pull request for it! I'll work on it next week :)
Could you give me an idea of the preferred way to handle the original ndb library? Copy the files into gcloud-python, or use a submodule?

@lucemia

lucemia commented Jul 28, 2014

@Alfus
At first I wanted to find a way to connect to Cloud Datastore without modifying code in the original GAE SDK. The datastore_rpc.py approach looks like it needs some modification of datastore.py to allow switching between the original datastore and Cloud Datastore. I would like to go over this approach again after finishing the first stable version of my Cloud Datastore ORM. Would you mind giving me some hints about it?

@littleq0903

Would it be more suitable to place google-cloud-datastore-orm in the Google App Engine SDK rather than gcloud-python?

As I see it, gcloud-python is for using Google Cloud services from a generic Python environment, while the Google App Engine SDK is for GAE only, and ndb-datastore-orm currently only works under the GAE runtime.

Just my 2 cents :)

P.S. I'm another Cloud GDE who co-works with David.

@obeleh

obeleh commented Oct 29, 2014

I've been building a client for the JSON API. I could use some feedback :)

https://github.com/transceptor-technology/txGoogle

@anupamme

Any timeline on when this is coming?

@lucemia

lucemia commented Jan 16, 2015

This is what I created and am currently using in production. https://github.com/lucemia/gcloud-python-orm

@anupamme

@lucemia It's not clear to me what changes I need to make so that the result of this code is reflected in my datastore. E.g. I ran the testInsert code expecting it to insert data into my datastore, but it does not (I am an ndb noob, so this assumption may be stupid).

@lucemia

lucemia commented Jan 16, 2015

@anupamme
The usage is quite simple

Setup cloud datastore

# gcloud-datastore
from gcloud import datastore
from gcloudorm import model
dataset = datastore.get_dataset(
    [dataset_id],
    [service_account],
    [p12-key]
)

model.Model.dataset = dataset

Replace ndb with gcloudorm.model

from gcloudorm import model as ndb

class Layout(ndb.Model):
    publisher_id = ndb.IntegerProperty(indexed=False)
    slot_id = ndb.IntegerProperty(indexed=False)

@aatreya

aatreya commented May 9, 2016

@lucemia Are you still working on gcloudorm?

Is there going to be official Google Cloud Platform support for an NDB-like interface to datastore? Should folks just use the App Engine Remote API, or is that going to be deprecated eventually?

@alexpirine

Hi there… any way to use Cloud Datastore with async, or should I just use another database? Not being able to run parallel lookups makes a small page take 6 seconds to fetch…

@obeleh

obeleh commented Dec 19, 2018

You could use this: https://github.com/transceptor-technology/aiogcd
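
For a rough picture of what running lookups in parallel looks like, here is a minimal standard-library sketch that does not use aiogcd's API. `blocking_lookup` is a hypothetical stand-in for a synchronous datastore call, and `lookup_many` and `run` are illustrative names; the sleep only simulates latency.

```python
import asyncio
import time


def blocking_lookup(key):
    """Hypothetical stand-in for a synchronous datastore lookup."""
    time.sleep(0.05)  # simulate network latency
    return {"key": key}


async def lookup_many(keys):
    loop = asyncio.get_running_loop()
    # Each blocking call runs on the event loop's default thread pool,
    # so the lookups overlap instead of running one after another.
    tasks = [loop.run_in_executor(None, blocking_lookup, key) for key in keys]
    # gather() returns the results in the same order as the tasks.
    return await asyncio.gather(*tasks)


def run(keys):
    return asyncio.run(lookup_many(keys))
```

This still uses threads under the hood, so any client shared between the calls has to be safe to use from several threads; a natively asynchronous client avoids that constraint.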

elharo added a commit that referenced this issue Oct 30, 2019
@kolea2 kolea2 added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed escalated labels Apr 7, 2021
@meredithslota meredithslota added the api: datastore Issues related to the Datastore API. label Nov 24, 2021
@meredithslota

The NDB repo is here: https://github.com/googleapis/python-ndb (it's been out for a while but this issue didn't get updated, apologies).
