Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal]: get, get_values filling empty values response from Google Sheet (with expected cells dimension) #947

Closed
ccppoo opened this issue Oct 30, 2021 · 20 comments

Comments

@ccppoo
Copy link
Contributor

ccppoo commented Oct 30, 2021

This issue is made to share clear status for solving problem

Point of Contention (1 / 2)

I wrote replies at other issue get_values does not return all data in some scenarios...

But I just want to share some opinions before making a PR.

get() method

def get(self, range_name=None, **kwargs):

and get_values() method

def get_values(self, range_name=None, **kwargs):

has no difference but returning value after wrapping with util function fill_gaps()

def fill_gaps(L, rows=None, cols=None):

so I thought "is it really necessary to separate those two methods?"

Also in Documentation,

Worksheet.get

Workshhet.get_values

has no difference in explanation, because the only difference is, as I mentioned above,

get_values() returns value got from get() after wrapping with fill_gaps()

return fill_gaps(self.get(range_name, **kwargs))

I think it's better to integrate by separating the slight difference by kwargs and it could be more clear


Point of Contention (2 / 2)

Just like issue #910, when people use method like

values = MyWorksheet.get('A1:C2')
# or
values = MyWorksheet.get_values('A1:C2')

people those who are first to use this method would have expected
2-D list A1 to C2, even when every cell is empty

just like this:

# values
[
   [ None, None, None ],
   [ None, None, None ]
]

for more clear examples, see my explanation from issue comment : #910 (comment)

I know google send just empty value, literally no values,
when cells in range is empty like this

response = self.spreadsheet.values_get(range_name, params=params)

print(response)
# {'range': 'test!A2:C2', 'majorDimension': 'ROWS'}

This could be handled like this from my comment

def getMember(self, name: str, replace_empty=None) -> Member:
        result: Cell = self.memberSheet.find(name, in_column=USER_NAME_COL)

        if not result or not result.value == name:
            return MEMBER_UNKNOWN

        # y, x
        start = rowcol_to_a1(result.row, result.col - 1)
        end = rowcol_to_a1(result.row, result.col - 1 + MEMBER_INFO_width - 1)

        # edit : in this case, it's 1-D list, see that I used [0] indexing below
        # expected column length
        data = [replace_empty  for _ in range(MEMBER_INFO_width + 1)]

        reply = self.memberSheet.get_values(
            f'{start}:{end}',
            value_render_option='UNFORMATTED_VALUE'
        )[0]

        emptyFilled = [val if val else dafault for val, dafault in zip_longest(
            reply, data, fillvalue=replace_empty)]
        return emptyFilled

Conclusion

My proposals are...

  1. integrate get() and get_value() to get()

add kwargs like maintain_shape to get expected dimension from given range

result = worksheet.get('A1:D2')

# result
[
 ['', 123 ] ,
 [ '',   ''  , 42]
]

# result(after fix), default = None
result = worksheet.get('A1:D2', maintain_shape = True)

# result
[
 [ '', 123, '',''],
 [ '', '', 42, '']
]
  1. add another kwargs option(default value) to fill empty values
result = worksheet.get('A1:D2', maintain_shape = True, default = None)
# result(after fix), default = None
[
 [ None, 123, None ,None],
 [ None, None , 42, None]
]
@ccppoo ccppoo changed the title get. get_values filling empty values response from Google Sheet (with expected cells dimension) get, get_values filling empty values response from Google Sheet (with expected cells dimension) [Proposal] Oct 30, 2021
@lavigne958
Copy link
Collaborator

Thank you @ccppoo for this very clear explanation.

I agree we only need get method, but the default behavior should be to fill the gaps with empty string. I don't see anyone using the get method and having list of different lengths to work with 😑

Remember that the method get_values includes the first row as header all the time, very useful to fill pandas.DataFrame. And a lot of the users rely on this.

@ccppoo
Copy link
Contributor Author

ccppoo commented Oct 30, 2021

@lavigne958
Just as you mentioned, I was concerned about the changes could brought by removing one of the methods
but duplication, with almost same name or ambiguous functionality , is the worst thing...

So should it be preserved? or integrated?
should this issue need other maintainer's opinion?

in case of removing one of it, for later major version, making one of method being deprecated is will be necessary


in case of pandas.DataFrame, current default value from get_values() will be safe from changes
if get_values() kwargs :: default = ''

example brought from pandas.DataFrame

values = worksheet.get_values('A1:C3')
df2 = pd.DataFrame(values, columns=['a', 'b', 'c'], dtype=pd.Int8Dtype)
# or
df2 = pd.DataFrame([[1, '', 3], [4, 5, 6], [7, 8, 9]], columns=['a', 'b', 'c'], dtype=pd.Int8Dtype)

print(df2)
#    a  b  c
# 0  1     3
# 1  4  5  6
# 2  7  8  9

would it be critical for changes?

@lavigne958
Copy link
Collaborator

I agree by merging the 2 methods into 1.

we could:

  • update get_values so it always return a matrix of the expected size (with empty values that can be configured using default:string argument value)
  • update the method get so it prints a warning that this function is deprecated and users should use get_values instead.
    this can be done using the package warning, one can use the following example

I just double checked and I was wrong sorry, the method get_values does not return the sheet header, it is the method get_all_records which has a different purpose.

Please don't hesitate to open a new PR with your changes, please read the contributing documentation first 🙂

@lavigne958 lavigne958 added this to the 5.0.0 milestone Nov 2, 2021
@lavigne958 lavigne958 removed this from the 5.0.0 milestone Nov 21, 2021
@lavigne958
Copy link
Collaborator

@ccppoo when you are ready for your dev, let me know here and I will plan which release will integrate your changes 😉

@ccppoo
Copy link
Contributor Author

ccppoo commented Dec 3, 2021

@lavigne958 I'm not sure I could finish by Dec. 20, for version 5.1.0 🤔

@lavigne958
Copy link
Collaborator

@lavigne958 I'm not sure I could finish by Dec. 20, for version 5.1.0 🤔

No worries, take you time.

@lavigne958 lavigne958 added this to the 5.2.0 milestone Jan 6, 2022
@lavigne958
Copy link
Collaborator

I planned it for the 5.2.0 release, for 1st february, but don't worry if you don't have time we can push it back to later release.
no pressure 😛

@ccppoo
Copy link
Contributor Author

ccppoo commented Jan 7, 2022

@lavigne958 I'll work on it, I will feel guilty if I delay over 2 months 🤔

@lavigne958
Copy link
Collaborator

can't plan this feature for 5.3.0 it will most probably bring breaking change (at least by the fact that the method get will disappear).

must wait for next major release

@lavigne958
Copy link
Collaborator

Hi @ccppoo are you still working on that feature?

@botcs
Copy link

botcs commented Jul 16, 2023

@lavigne958 @ccppoo thanks for the work on this!

I went down the rabbithole to check on this - it would save me quite a headache if it worked.
Can I help in any ways?

thanks for the awesome proposal!
Csabi

@alifeee
Copy link
Collaborator

alifeee commented Jul 16, 2023

Hi @botcs, thanks for the comment! By my evaluation, the work that needs to be done to add this (great) feature would be:

  • add a boolean kwarg to Worksheet.get_values, something like maintain_size
  • if maintain_size is true
    • get the size (number of rows/cols) asked for by the user from range_name
    • use utils.fill_gaps to fill the empty values and return this
  • add test, i.e., WorksheetTest.test_get_values_and_maintain_size

This would work in a similar way to the fairly new kwarg [combine_merged_cells] (added in #1215). You could also look at the abandoned attempt by @ccppoo in #985.

It would be great if you feel like contributing. I would look at #1215 for a good idea of what you'd need to do, and the Contributing Guide, and we can offer you any help you need with problems you might come across :)

@alifeee alifeee modified the milestones: 6.0.0, 5.11.0 Jul 16, 2023
@alifeee
Copy link
Collaborator

alifeee commented Aug 14, 2023

@botcs We would still love your help on this, if you are interested.

If you no longer wish to, then I will make a PR adding this feature in the coming weeks.

@alifeee alifeee self-assigned this Aug 16, 2023
@botcs
Copy link

botcs commented Aug 17, 2023

Hi,

Sorry for the long delay, it's summer schools time.

I have implemented this for my project, it's just a few lines, I can try to push a PR this week.

@alifeee
Copy link
Collaborator

alifeee commented Aug 17, 2023

No worries, there's no grand rush. That would be great, if you find any time! :)

@lavigne958
Copy link
Collaborator

Yes nobrieh if you can make it for 1st of September it goes in release v5.11 if not it goes in the next one, no worries 😉

@alifeee
Copy link
Collaborator

alifeee commented Aug 22, 2023

I started work on this in the https://github.com/burnash/gspread/tree/feature/maintain-size branch.

However, get_values uses a named range. I do not know how to turn this into rows/columns. Is there a function in gspread already to do this?

i.e., sometimes it does not include cell/column indices

image

@botcs
Copy link

botcs commented Aug 24, 2023

@alifeee hi,
I am ready with an initial PR #1281 , but couldn't pass the tox tests due to the following error:

see error ```bash $ tox -e py -vvv ROOT: 39 D setup logging to NOTSET on pid 274196 [tox/report.py:219] py: 61 I find interpreter for spec PythonSpec(path=/homes/53/csbotos/anaconda3/envs/row/bin/python) [virtualenv/discovery/builtin.py:58] py: 61 I proposed PythonInfo(spec=CPython3.11.4.final.0-64, exe=/homes/53/csbotos/anaconda3/envs/row/bin/python, platform=linux, version='3.11.4 (main, Jul 5 2023, 13:45:01) [GCC 11.2.0]', encoding_fs_io=utf-8-utf-8) [virtualenv/discovery/builtin.py:65] py: 61 D accepted PythonInfo(spec=CPython3.11.4.final.0-64, exe=/homes/53/csbotos/anaconda3/envs/row/bin/python, platform=linux, version='3.11.4 (main, Jul 5 2023, 13:45:01) [GCC 11.2.0]', encoding_fs_io=utf-8-utf-8) [virtualenv/discovery/builtin.py:67] py: 62 D filesystem is case-sensitive [virtualenv/info.py:26] .pkg: 72 I find interpreter for spec PythonSpec(path=/homes/53/csbotos/anaconda3/envs/row/bin/python) [virtualenv/discovery/builtin.py:58] .pkg: 72 I proposed PythonInfo(spec=CPython3.11.4.final.0-64, exe=/homes/53/csbotos/anaconda3/envs/row/bin/python, platform=linux, version='3.11.4 (main, Jul 5 2023, 13:45:01) [GCC 11.2.0]', encoding_fs_io=utf-8-utf-8) [virtualenv/discovery/builtin.py:65] .pkg: 72 D accepted PythonInfo(spec=CPython3.11.4.final.0-64, exe=/homes/53/csbotos/anaconda3/envs/row/bin/python, platform=linux, version='3.11.4 (main, Jul 5 2023, 13:45:01) [GCC 11.2.0]', encoding_fs_io=utf-8-utf-8) [virtualenv/discovery/builtin.py:67] .pkg: 73 W _optional_hooks> python /homes/53/csbotos/anaconda3/envs/row/lib/python3.11/site-packages/pyproject_api/_backend.py True flit_core.buildapi [tox/tox_env/api.py:427] Backend: run command _optional_hooks with args {} Backend: Wrote response {'return': {'get_requires_for_build_sdist': True, 'prepare_metadata_for_build_wheel': True, 'get_requires_for_build_wheel': True, 'build_editable': True, 'get_requires_for_build_editable': True, 'prepare_metadata_for_build_editable': True}} to /tmp/pep517__optional_hooks-_ewkmn0p.json .pkg: 106 I exit None (0.03 seconds) /homes/53/csbotos/github/gspread> python /homes/53/csbotos/anaconda3/envs/row/lib/python3.11/site-packages/pyproject_api/_backend.py True flit_core.buildapi pid=274202 [tox/execute/api.py:279] .pkg: 107 W get_requires_for_build_sdist> python /homes/53/csbotos/anaconda3/envs/row/lib/python3.11/site-packages/pyproject_api/_backend.py True flit_core.buildapi [tox/tox_env/api.py:427] Backend: run command get_requires_for_build_sdist with args {'config_settings': None} Backend: Wrote response {'return': []} to /tmp/pep517_get_requires_for_build_sdist-vcqwn1jf.json .pkg: 110 I exit None (0.00 seconds) /homes/53/csbotos/github/gspread> python /homes/53/csbotos/anaconda3/envs/row/lib/python3.11/site-packages/pyproject_api/_backend.py True flit_core.buildapi pid=274202 [tox/execute/api.py:279] .pkg: 110 W build_sdist> python /homes/53/csbotos/anaconda3/envs/row/lib/python3.11/site-packages/pyproject_api/_backend.py True flit_core.buildapi [tox/tox_env/api.py:427] Backend: run command build_sdist with args {'sdist_directory': '/homes/53/csbotos/github/gspread/.tox/.pkg/dist', 'config_settings': None} Backend: Wrote response {'return': 'gspread-5.10.0.tar.gz'} to /tmp/pep517_build_sdist-cmy59vyg.json .pkg: 128 I exit None (0.02 seconds) /homes/53/csbotos/github/gspread> python /homes/53/csbotos/anaconda3/envs/row/lib/python3.11/site-packages/pyproject_api/_backend.py True flit_core.buildapi pid=274202 [tox/execute/api.py:279] .pkg: 128 D package .tmp/package/8/gspread-5.10.0.tar.gz links to .pkg/dist/gspread-5.10.0.tar.gz (/homes/53/csbotos/github/gspread/.tox) [tox/util/file_view.py:39] py: 128 W install_package> python -I -m pip install --force-reinstall --no-deps /homes/53/csbotos/github/gspread/.tox/.tmp/package/8/gspread-5.10.0.tar.gz [tox/tox_env/api.py:427] Processing ./.tox/.tmp/package/8/gspread-5.10.0.tar.gz Installing build dependencies ... done Getting requirements to build wheel ... done Preparing metadata (pyproject.toml) ... done Building wheels for collected packages: gspread Building wheel for gspread (pyproject.toml) ... done Created wheel for gspread: filename=gspread-5.10.0-py3-none-any.whl size=47030 sha256=5238dd789073a1c5ecc90307b3a38f55944c058e306ff10311e4b5ce670cbd2d Stored in directory: /homes/53/csbotos/.cache/pip/wheels/3f/76/66/0170fc979387e7a07adb0663fb2c0227a2c2f85dd56f979200 Successfully built gspread Installing collected packages: gspread Attempting uninstall: gspread Found existing installation: gspread 5.10.0 Uninstalling gspread-5.10.0: Successfully uninstalled gspread-5.10.0 Successfully installed gspread-5.10.0 py: 712 I exit 0 (0.58 seconds) /homes/53/csbotos/github/gspread> python -I -m pip install --force-reinstall --no-deps /homes/53/csbotos/github/gspread/.tox/.tmp/package/8/gspread-5.10.0.tar.gz pid=274216 [tox/execute/api.py:279] py: 712 W commands[0]> pytest tests/ [tox/tox_env/api.py:427] ================================================================= test session starts ================================================================== platform linux -- Python 3.11.4, pytest-7.4.0, pluggy-1.2.0 cachedir: .tox/py/.pytest_cache rootdir: /homes/53/csbotos/github/gspread plugins: vcr-1.0.2 collected 107 items

tests/cell_test.py ....... [ 6%]
tests/client_test.py .s........ [ 15%]
tests/spreadsheet_test.py ............... [ 29%]
tests/utils_test.py ................. [ 45%]
tests/worksheet_test.py ....................................E..................... [100%]

======================================================================== ERRORS ========================================================================
__________________________________________________ ERROR at setup of WorksheetTest.test_maintain_size __________________________________________________

self = <tests.worksheet_test.WorksheetTest testMethod=test_maintain_size>, client = <gspread.client.BackoffClient object at 0x7f0d838f5410>
request = <SubRequest 'init' for >

@pytest.fixture(scope="function", autouse=True)
def init(self, client, request):
    name = self.get_temporary_spreadsheet_title(request.node.name)
  WorksheetTest.spreadsheet = client.create(name)

tests/worksheet_test.py:21:


gspread/client.py:264: in create
r = self.request("post", DRIVE_FILES_API_V3_URL, json=payload, params=params)
gspread/client.py:616: in request
return super().request(*args, **kwargs)
gspread/client.py:81: in request
response = getattr(self.session, method)(
.tox/py/lib/python3.11/site-packages/requests/sessions.py:637: in post
return self.request("POST", url, data=data, json=json, **kwargs)
.tox/py/lib/python3.11/site-packages/google/auth/transport/requests.py:549: in request
response = super(AuthorizedSession, self).request(
.tox/py/lib/python3.11/site-packages/requests/sessions.py:589: in request
resp = self.send(prep, **send_kwargs)
.tox/py/lib/python3.11/site-packages/requests/sessions.py:703: in send
r = adapter.send(request, **kwargs)
.tox/py/lib/python3.11/site-packages/requests/adapters.py:486: in send
resp = conn.urlopen(
.tox/py/lib/python3.11/site-packages/urllib3/connectionpool.py:703: in urlopen
httplib_response = self._make_request(
.tox/py/lib/python3.11/site-packages/urllib3/connectionpool.py:440: in _make_request
httplib_response = conn.getresponse(buffering=True)


self = <vcr.patch.VCRRequestsHTTPSConnection/homes/53/csbotos/github/gspread/tests/cassettes/WorksheetTest.test_maintain_size.json object at 0x7f0d83912e50>
_ = False, kwargs = {'buffering': True}

def getresponse(self, _=False, **kwargs):
    """Retrieve the response"""
    # Check to see if the cassette has a response for this request. If so,
    # then return it
    if self.cassette.can_play_response_for(self._vcr_request):
        log.info(f"Playing response for {self._vcr_request} from cassette")
        response = self.cassette.play_response(self._vcr_request)
        return VCRHTTPResponse(response)
    else:
        if self.cassette.write_protected and self.cassette.filter_request(self._vcr_request):
          raise CannotOverwriteExistingCassetteException(
                cassette=self.cassette,
                failed_request=self._vcr_request,
            )

E vcr.errors.CannotOverwriteExistingCassetteException: Can't overwrite existing cassette ('/homes/53/csbotos/github/gspread/tests/cassettes/WorksheetTest.test_maintain_size.json') in your current record mode ('none').
E No match for the request (<Request (POST) https://www.googleapis.com/drive/v3/files?supportsAllDrives=True>) was found.
E No similar requests, that have not been played, found.

.tox/py/lib/python3.11/site-packages/vcr/stubs/init.py:263: CannotOverwriteExistingCassetteException
=================================================================== warnings summary ===================================================================
tests/cell_test.py::CellTest::test_a1_value
tests/client_test.py::ClientTest::test_access_non_existing_spreadsheet
tests/spreadsheet_test.py::SpreadsheetTest::test_add_del_worksheet
tests/worksheet_test.py::WorksheetTest::test_acell
/homes/53/csbotos/github/gspread/gspread/client.py:605: DeprecationWarning: [Deprecated][in version 6.0.0]: this class will be deprecated and moved to gspread.http_client package
warnings.warn(

tests/cell_test.py::CellTest::test_merge_cells
tests/worksheet_test.py::WorksheetTest::test_batch_clear
tests/worksheet_test.py::WorksheetTest::test_batch_get
tests/worksheet_test.py::WorksheetTest::test_copy_cut_range
tests/worksheet_test.py::WorksheetTest::test_get_values_and_combine_merged_cells
tests/worksheet_test.py::WorksheetTest::test_range_get_all_values
tests/worksheet_test.py::WorksheetTest::test_update_and_get
/homes/53/csbotos/github/gspread/gspread/worksheet.py:1114: UserWarning: [Deprecated][in version 6.0.0]: method signature will change to: 'Worksheet.update(value = [[]], range_name=)' arguments 'range_name' and 'values' will swap, values will be mandatory of type: 'list(list(...))'
warnings.warn(

tests/worksheet_test.py::WorksheetTest::test_clear_tab_color
tests/worksheet_test.py::WorksheetTest::test_update_tab_color
/homes/53/csbotos/github/gspread/gspread/worksheet.py:211: UserWarning: [Deprecated][in version 6.0.0]: color format will change to hex format "#RRGGBB".
To suppress warning, use "get_tab_color()" and convert back to dict format, use gspread.utils.convert_hex_to_colors_dict.
However, we recommend changing your code to use hex format.
warnings.warn(

tests/worksheet_test.py::WorksheetTest::test_clear_tab_color
tests/worksheet_test.py::WorksheetTest::test_update_tab_color
/homes/53/csbotos/github/gspread/gspread/worksheet.py:1541: UserWarning: [Deprecated][in version 6.0.0]: color format will change to hex format "#RRGGBB".
To suppress this warning, first convert color to hex with "gspread.utils.convert_colors_to_hex_value(color)
warnings.warn(

tests/worksheet_test.py::WorksheetTest::test_sort
tests/worksheet_test.py::WorksheetTest::test_sort
tests/worksheet_test.py::WorksheetTest::test_sort
tests/worksheet_test.py::WorksheetTest::test_sort
tests/worksheet_test.py::WorksheetTest::test_sort
/homes/53/csbotos/github/gspread/gspread/worksheet.py:1454: DeprecationWarning: [Deprecated][in version 6.0.0]: This function signature will change, arguments will swap places: sort(range, specs)
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================================================== short test summary info ================================================================
ERROR tests/worksheet_test.py::WorksheetTest::test_maintain_size - vcr.errors.CannotOverwriteExistingCassetteException: Can't overwrite existing cassette ('/homes/53/csbotos/github/gspread/tests/cassettes/Worksheet...
================================================= 105 passed, 1 skipped, 20 warnings, 1 error in 0.89s =================================================
py: 1754 C exit 1 (1.04 seconds) /homes/53/csbotos/github/gspread> pytest tests/ pid=274247 [tox/execute/api.py:279]
.pkg: 1754 W _exit> python /homes/53/csbotos/anaconda3/envs/row/lib/python3.11/site-packages/pyproject_api/_backend.py True flit_core.buildapi [tox/tox_env/api.py:427]
Backend: run command _exit with args {}
Backend: Wrote response {'return': 0} to /tmp/pep517__exit-1ea9qe3l.json
.pkg: 1756 I exit None (0.00 seconds) /homes/53/csbotos/github/gspread> python /homes/53/csbotos/anaconda3/envs/row/lib/python3.11/site-packages/pyproject_api/_backend.py True flit_core.buildapi pid=274202 [tox/execute/api.py:279]
.pkg: 1760 D delete package /homes/53/csbotos/github/gspread/.tox/.tmp/package/8/gspread-5.10.0.tar.gz [tox/tox_env/python/virtual_env/package/pyproject.py:182]
py: FAIL code 1 (1.70=setup[0.66]+cmd[1.04] seconds)
evaluation failed :( (1.72 seconds)

</details>

If someone could help out with this error, that would be greatly appreciated!

@alifeee
Copy link
Collaborator

alifeee commented Aug 24, 2023

Thanks for your work here! :)

I am ready with an initial PR #1281 , but couldn't pass the tox tests due to the following error:

You need to run the test online. See the contributing guide. This comment may be helpful for the exact commands I recommend you run.

You will also want to run tox - e format

@lavigne958 lavigne958 removed this from the 5.11.0 milestone Sep 1, 2023
Repository owner deleted a comment from lavigne958 Sep 3, 2023
Repository owner deleted a comment from lavigne958 Sep 3, 2023
@alifeee alifeee changed the title get, get_values filling empty values response from Google Sheet (with expected cells dimension) [Proposal] [Proposal]: get, get_values filling empty values response from Google Sheet (with expected cells dimension) Sep 3, 2023
@alifeee
Copy link
Collaborator

alifeee commented Sep 4, 2023

superceded by #1289

@alifeee alifeee closed this as completed Sep 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants