Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DISCUSSION: Switch to apitools as a dependency #755

Closed
dhermes opened this issue Mar 24, 2015 · 20 comments · Fixed by #811
Closed

DISCUSSION: Switch to apitools as a dependency #755

dhermes opened this issue Mar 24, 2015 · 20 comments · Fixed by #811
Assignees
Labels
api: core api: storage Issues related to the Cloud Storage API. packaging

Comments

@dhermes
Copy link
Contributor

dhermes commented Mar 24, 2015

This was an issue before apitools was on PyPI and before it's dependencies were Py3 compatible. Now neither of these are blockers.

See #754 for context.

@dhermes
Copy link
Contributor Author

dhermes commented Mar 24, 2015

I've tentatively started to test how python3-ready the apitools codebase is and haven't had much luck.

I couldn't even get tox to run: google/apitools#4

In addition, it seems there are some Python 2 specific imports, for example the StringIO module is used.

@dhermes
Copy link
Contributor Author

dhermes commented Mar 31, 2015

@craigcitro Can we chat about this?

@dhermes
Copy link
Contributor Author

dhermes commented Mar 31, 2015

@craigcitro Can you provide a snippet or two for

  1. How to create and use a generated client
  2. Compatibility / dependency issues to worry about (e.g. Py3)

@craigcitro
Copy link
Contributor

  1. using apitools is two steps: first, create a client, and second, load that code, then instantiate and call.

from scratch:

(sample)[~/t/storage_example] $ pip install google-apitools[cli]
... <installing> ...
(sample)[~/t/storage_example] $ gen_client --discovery_url=storage.v1 --nogenerate_cli --outdir=storage client
(sample)[~/t/storage_example] $ l
total 0
0 storage/
(sample)[~/t/storage_example] $ cd storage/
(sample)[~/t/storage_example/storage] $ l
total 120
  4 __init__.py              44 storage_v1_client.py     72 storage_v1_messages.py
(sample)[~/t/storage_example/storage] $ python
Python 2.7.9 (default, Jan  8 2015, 10:14:19)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.54)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import apitools.base.py as apitools_base
>>> import storage_v1_client as storage_client
>>> import storage_v1_messages as storage_messages
>>> client = storage_client.StorageV1()
>>> client.buckets.List(storage_messages.StorageBucketsListRequest(project='google.com:apitools'))
<Buckets
 items: [<Bucket
    ... copious output ...
 kind: u'storage#buckets'>
>>> buckets = _
>>> [bucket.name for bucket in buckets.items]
['apitools', ... <a zillion other buckets> ..., 'apitools-public']
>>> import StringIO
>>> upload = apitools_base.Upload.FromStream(StringIO.StringIO('hello world\n'), mime_type='text/plain')
>>> request = storage_messages.StorageObjectsInsertRequest()
>>> request.bucket = 'apitools'
>>> request.name = 'test_object'
>>> client.objects.Insert(request, upload=upload)
<new object metadata comes back>
>>> client.objects.Get(storage_messages.StorageObjectsGetRequest(bucket='apitools', object='test_object'), download=apitools_base.Download.FromFile('important.txt'))
Received bytes 0-11/12
Download complete
>>>
(sample)[~/t/storage_example/storage] $ cat important.txt
hello world
(sample)[~/t/storage_example/storage] $

there's plenty more in there -- i'm happy to do a little demo, if there's interest.

  1. for compatibility, there are really three questions: generated clients, generated CLIs, and the gen_client tool itself. you should only care about the first, and eventually the third.

the first should already work in py3, or if not, i want to fix it. generated CLIs are furthest out, because it involves changing the codegen to not use gflags/apputils, but i don't think you care right now. for gen_client, it's actually a quick fix (where i plan to clean up 1-2 other small things), but i'm short on time (cf the long response on this! 😏)

@dhermes
Copy link
Contributor Author

dhermes commented Apr 6, 2015

@craigcitro I'll take you up on that demo.

It seems the thorniest parts will be

  • auth
  • media/multipart/resumable uploads
  • batching

@craigcitro
Copy link
Contributor

sure -- thursday would be best for me, but we can coordinate by email. some offhand answers:

  • for auth, you can either let apitools do it, or just pass a valid oauth2client.Credentials object as the credentials argument to the constructor, and it'll use those credentials.
  • multipart and resumable are both handled for you; we choose a default based on size, but you can change it on the Upload/Download instance as desired.
  • batching works fine (though, as discussed, not with media).

@dhermes
Copy link
Contributor Author

dhermes commented Apr 7, 2015

@craigcitro RE: "(though, as discussed, not with media)." When did we discuss? Link? Media here means any files (i.e. non-metadata) at all or "media uploads"? Is it just a missing feature of the library or not supported by the backend?

@craigcitro
Copy link
Contributor

@dhermes #654 (comment)

media means anything involving upload/download of files. apitools doesn't support it (and throws an appropriate error, i believe), but there's no support on the backend.

@dhermes
Copy link
Contributor Author

dhermes commented Apr 7, 2015

Thanks for the reference! It totally slipped my mind.

@dhermes
Copy link
Contributor Author

dhermes commented Apr 8, 2015

Two quick notes:

  • The import import apitools.base.py as apitools_base is not instantaneous. Is there a way to disable whatever magic occurs?
  • client = storage_client.StorageV1() tries to get implicit credentials. Is
client = storage_client.StorageV1(credentials=FOO, get_credentials=False)

sufficient to dodge this overhead?


Overarching theme here: what implicit magic do I need to worry about to turn off?


Secondary to all this, is I noticed a big fat

Generating new OAuth credentials ...
WARNING:root:This function, oauth2client.tools.run(), and the use of the gflags library are deprecated and will be removed in a future version of the library.
Go to the following link in your browser:

Is there a corresponding bug in the repo?

@dhermes
Copy link
Contributor Author

dhermes commented Apr 8, 2015

I filed google/apitools#14 to complain about start-up / import times in apitools.

@craigcitro
Copy link
Contributor

replying to your comment above:

  • it's even simpler for the init -- if you just provide your own credentials, i won't try and fetch any. so

    client = storage.StorageV1(credentials=FOO)
    

    should do the trick. get_credentials is there for the case that you really just don't want credentials -- for instance, you can use this with storage to fetch public objects.

  • it sounds like you tracked the import slowness down to oauth2client, and are fixing it there. 👍

  • i don't think there's any real magic happening at import -- i generally lean towards good defaults over doing anything at import time. i do the usual business of friendly package imports, but i think that's standard practice; i also have a sketch of dynamically generated apitools clients, which involves sys.path munging, but i'm still kicking that one around. i do more in the generated CLIs, but you're not using those.

@dhermes
Copy link
Contributor Author

dhermes commented Apr 8, 2015

I think we may need to use the generated CLI for storage to replace _UploadConfig and _UrlBuilder.

@craigcitro
Copy link
Contributor

do you mean the generated client (eg storage_v1_client.py) or the CLI (storage_v1.py)? the latter is just a little appcommands wrapper around the client, and shouldn't have any logic you need.

@dhermes
Copy link
Contributor Author

dhermes commented Apr 8, 2015

I mean storage_v1_client.py and storage_v1_messages.py

@craigcitro
Copy link
Contributor

cool -- those don't involve any of the generated CLI code i linked above.

@dhermes
Copy link
Contributor Author

dhermes commented Apr 8, 2015

Cool, my bad on wrong word choice. Any thoughts on replacing _UploadConfig and _UrlBuilder?

@craigcitro
Copy link
Contributor

"cli" not being a subset of "client" would make my life much easier. honestly, though, i think the fact that i still "clinet" instead of "client" around 15% of the time means that something is deeply wrong with my brain. 😉

i'm looking through the code in blob -- if you're going to use a generated client, i suspect that you can drop most of it. going to make a comment on the PR shortly.

@dhermes
Copy link
Contributor Author

dhermes commented Apr 8, 2015

Saw the comment. Is the transfer.RESUMABLE_UPLOAD stuff necessary or can we rely on apitools to handle it?

@craigcitro
Copy link
Contributor

if you use a generated client, that all goes away. you'll just pass an upload with the request, and (if you set auto_transfer=False) call StreamInChunks afterwards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: core api: storage Issues related to the Cloud Storage API. packaging
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants