Merge pull request #544 from Terseus/contributing-guidelines
Add Contributing guidelines
emlazzarin authored Nov 5, 2022
2 parents de31fb0 + 23d5bd7 commit d30adbd
Showing 12 changed files with 189 additions and 16 deletions.
158 changes: 158 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,158 @@
# Contributing

## Set up your environment

First, create a virtualenv using `python -m venv` (or whichever tool you use to manage them), activate it, and install the requirements listed in the requirements files:

```bash
$ python -m venv ~/virtualenvs/pytrends
$ source ~/virtualenvs/pytrends/bin/activate
$ pip install -r requirements.txt # library requirements
$ pip install -r requirements-dev.txt # development requirements
```

## Running the tests

To run the tests, simply run `pytest` inside the project root:

```bash
$ pytest
```

## About the test suite

There are two main libraries used in the test suite:

* [VCR.py](https://github.com/kevin1024/vcrpy): Records requests and responses and replays them on every execution; we use it through [pytest-recording](https://github.com/kiwicom/pytest-recording).

* [responses](https://github.com/getsentry/responses): Mocks the `requests` library; it can reproduce edge cases and check the requests made.

If you don't know them, we highly encourage you to take a look at their READMEs to understand what they are and how they differ.

## VCR.py tests

VCR.py records the HTTP requests made by a test and the responses returned by the server, and saves them in a YAML file called a "cassette".

When a cassette exists, VCR.py intercepts each request made by the test instead of passing it to the server, looks it up in the cassette file, and replays the recorded response for that exact request.
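
The record-then-replay cycle can be sketched with the standard library alone. The `ToyCassette` class, the `fake_server` function, and the JSON file format below are assumptions made for illustration; VCR.py stores cassettes as YAML and matches requests on more than the method and URL, so treat this as a rough model rather than its implementation.

```python
import json
import tempfile
from pathlib import Path


class ToyCassette:
    """Toy record/replay store (illustrative only, not VCR.py's implementation)."""

    def __init__(self, path, fetch):
        self.path = Path(path)
        self.fetch = fetch  # callable that performs the "real" request
        # If the cassette file already exists, load it and only replay.
        self.replaying = self.path.exists()
        self.recorded = json.loads(self.path.read_text()) if self.replaying else {}

    def request(self, method, url):
        key = f"{method} {url}"
        if self.replaying:
            # Replay only: an unrecorded request is an error, never a network call.
            if key not in self.recorded:
                raise LookupError(f"No match for the request {key}")
            return self.recorded[key]
        # Recording: call the "server" and store its response in the cassette.
        body = self.fetch(method, url)
        self.recorded[key] = body
        self.path.write_text(json.dumps(self.recorded))
        return body


calls = []


def fake_server(method, url):
    """Hypothetical stand-in for the real backend; counts how often it is hit."""
    calls.append((method, url))
    return "response body"


with tempfile.TemporaryDirectory() as tmp:
    cassette_file = Path(tmp) / "test_example.json"
    url = "https://trends.google.com/?geo=US"
    # First run: no cassette yet, so the request reaches the fake server.
    first = ToyCassette(cassette_file, fake_server).request("GET", url)
    # Second run: the cassette exists, so the response is replayed from disk.
    second = ToyCassette(cassette_file, fake_server).request("GET", url)
```

After both runs, `calls` holds a single entry: the second response came from the cassette file, not from the server.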

Use VCR.py to check the behavior of the Google Trends API: inspect the response returned, confirm that a specific request is valid, and so on.

To use VCR.py in a test, decorate it with `pytest.mark.vcr`:

```python
import pytest


@pytest.mark.vcr
def test_example():
    # HTTP requests made here are recorded to (or replayed from) a cassette.
    pass
```

### Running a VCR.py test without cassette

The first time you execute a VCR.py test without a cassette file (e.g. a new test) you will get an error:

```
E vcr.errors.CannotOverwriteExistingCassetteException: Can't overwrite existing cassette ('/home/user/pytrends/tests/cassettes/test_request/test_name.yaml') in your current record mode ('none').
E No match for the request (<Request (GET) https://trends.google.com/?geo=US>) was found.
E No similar requests, that have not been played, found.
.venv/python-3.7.10/lib/python3.7/site-packages/vcr/stubs/__init__.py:232: CannotOverwriteExistingCassetteException
```

By default, `pytest-recording` does **not** let requests through, to prevent unintentional network access.

To create a new cassette, use the pytest parameter `--record-mode=once`. It writes a new cassette for tests that don't have one yet and replays the existing cassette for tests that do.

You can read more about this behavior in the [pytest-recording README](https://github.com/kiwicom/pytest-recording#default-recording-mode).
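
As a rough mental model, the record modes mentioned in this guide behave as follows for a test that makes at least one HTTP request. The function name `cassette_action` and its return labels are hypothetical, and pytest-recording supports further modes that this sketch leaves out.

```python
def cassette_action(record_mode, cassette_exists):
    """Return what happens to a test's requests under a simplified model.

    Hypothetical helper for illustration; it covers only the modes named in
    this guide and assumes the test makes at least one HTTP request.
    """
    if record_mode == "none":
        # Default: replay if possible, otherwise fail instead of hitting the network.
        return "replay" if cassette_exists else "error"
    if record_mode == "once":
        # Record only when there is no cassette yet.
        return "replay" if cassette_exists else "record"
    if record_mode == "rewrite":
        # Always discard the old cassette and record again.
        return "record"
    raise ValueError(f"mode not covered by this sketch: {record_mode}")


for mode in ("none", "once", "rewrite"):
    for exists in (False, True):
        print(f"{mode:8} cassette_exists={exists!s:5} -> {cassette_action(mode, exists)}")
```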

### Rewriting an existing cassette

Sometimes you will change how requests are made, or want to check whether the library still handles those requests correctly.

You have two options here:

* Delete the cassette file and execute the tests with `--record-mode=once`:

```bash
# The path format is `tests/cassettes/<test file name>/<test function name>.yaml`
$ rm tests/cassettes/test_request/test_build_payload.yaml
$ pytest --record-mode=once
```

* Execute the single test you want using `-k` and `--record-mode=rewrite`:

```bash
# the format is `pytest -k <pattern>`
$ pytest -k test_build_payload --record-mode=rewrite
```

Beware: the latter executes every test whose name matches the pattern and rewrites the cassette of each one.
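
To estimate which cassettes a `-k` run would touch, the matching can be approximated as below. This is a simplified sketch: `cassette_path` and `cassettes_matching` are hypothetical helpers, the path layout is the one given in the comment above, and plain case-insensitive substring matching only approximates `-k` for a single-word pattern (pytest also matches other keywords and supports boolean expressions).

```python
from pathlib import PurePosixPath


def cassette_path(test_file, test_name, root="tests/cassettes"):
    """Build `<root>/<test file name>/<test function name>.yaml` (assumed layout)."""
    return PurePosixPath(root) / PurePosixPath(test_file).stem / f"{test_name}.yaml"


def cassettes_matching(pattern, tests):
    """List cassette paths for tests whose name contains `pattern`.

    `tests` is an iterable of (test_file, test_name) pairs.
    """
    return [
        cassette_path(test_file, test_name)
        for test_file, test_name in tests
        if pattern.lower() in test_name.lower()
    ]


tests = [
    ("tests/test_request.py", "test_build_payload"),
    ("tests/test_request.py", "test_interest_over_time_ok"),
    ("tests/test_request.py", "test_interest_over_time_partial"),
]

# A pattern that looks specific can still match several tests.
matched = cassettes_matching("test_interest_over_time", tests)
for path in matched:
    print(path)
```

Running `pytest -k <pattern> --collect-only` first is a safer way to see the real selection before rewriting anything.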

Please keep in mind that the Google Trends API **can change its returned data over time, even for data that is a year old**. This means that when you regenerate the cassette of an existing test, you may also need to update the expected values in the test to match the data the backend now returns. The fastest way to get the new values is to use the pytest `--pdb` flag, which starts a pdb session when the test fails while comparing against the expected `pd.DataFrame`:

```bash
$ pytest -k test_interest_over_time --pdb

E AssertionError: DataFrame.iloc[:, 0] (column name="pizza") are different
E
E DataFrame.iloc[:, 0] (column name="pizza") values are different (80.0 %)
E [index]: [2021-01-01T00:00:00.000000000, 2021-01-02T00:00:00.000000000, 2021-01-03T00:00:00.000000000, 2021-01-04T00:00:00.000000000, 2021-01-05T00:00:00.000000000]
E [left]: [100, 80, 77, 50, 51]
E [right]: [100, 87, 78, 51, 52]

pandas/_libs/testing.pyx:168: AssertionError

# By default the error of an `assert_frame_equal` is raised inside the Pandas code.
# Inspect the backtrace to find the point where we made the assert and move there.
(Pdb) bt

...
-> assert_frame_equal(df_result, df_expected)
/home/user/pytrends/.venv/python-3.7.10/lib/python3.7/site-packages/pandas/_testing/asserters.py(1321)assert_frame_equal()
-> check_index=False,
/home/user/pytrends/.venv/python-3.7.10/lib/python3.7/site-packages/pandas/_testing/asserters.py(1084)assert_series_equal()
-> index_values=np.asarray(left.index),
/home/user/pytrends/pandas/_libs/testing.pyx(53)pandas._libs.testing.assert_almost_equal()
> /home/user/pytrends/pandas/_libs/testing.pyx(168)pandas._libs.testing.assert_almost_equal()
/home/user/pytrends/.venv/python-3.7.10/lib/python3.7/site-packages/pandas/_testing/asserters.py(665)raise_assert_detail()
-> raise AssertionError(msg)

(Pdb) up
> /home/user/pytrends/pandas/_libs/testing.pyx(53)pandas._libs.testing.assert_almost_equal()
(Pdb) up
> /home/user/pytrends/.venv/python-3.7.10/lib/python3.7/site-packages/pandas/_testing/asserters.py(1084)assert_series_equal()
-> index_values=np.asarray(left.index),
(Pdb) up
> /home/user/pytrends/.venv/python-3.7.10/lib/python3.7/site-packages/pandas/_testing/asserters.py(1321)assert_frame_equal()
-> check_index=False,
(Pdb) up
> /home/user/pytrends/tests/test_request.py(179)test_interest_over_time_ok()
-> assert_frame_equal(df_result, df_expected)

# Check the returned response and see if it contains valid data.
# We can use the following values to update our test and make it pass.
(Pdb) df_expected.to_dict(orient='list')
{'pizza': [100, 87, 78, 51, 52], 'bagel': [2, 2, 2, 1, 1], 'isPartial': [False, False, False, False, False]}
```
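
The `values are different (80.0 %)` line in the output above can be read as the share of positions at which the two columns disagree. The sketch below reproduces that figure from the `[left]` and `[right]` lists using only the standard library; `mismatch_report` is a hypothetical helper and not how pandas computes the message internally.

```python
def mismatch_report(left, right):
    """Return (percent_different, [(position, left_value, right_value), ...])."""
    if len(left) != len(right):
        raise ValueError("columns must have the same length")
    if not left:
        return 0.0, []
    diffs = [
        (position, l_value, r_value)
        for position, (l_value, r_value) in enumerate(zip(left, right))
        if l_value != r_value
    ]
    return 100.0 * len(diffs) / len(left), diffs


# Values copied from the assertion output above.
left = [100, 80, 77, 50, 51]
right = [100, 87, 78, 51, 52]

percent, diffs = mismatch_report(left, right)
print(f"{percent} % of the values differ")
for position, l_value, r_value in diffs:
    print(f"row {position}: left={l_value} right={r_value}")
```

Only the first row agrees, so four of the five values differ.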

## responses tests

responses is used to monkey patch the `requests` library, intercepting requests and simulating responses from the backend without letting them pass through.

Use responses to simulate hard-to-reproduce behavior from the backend, to assert on how a specific request is made, or to prevent unintended requests from being made.

To use responses in a test, have it request the `mocked_responses` fixture and configure the mock by adding the requests you expect the test to make and the responses the backend should return:

```python
def test_example(mocked_responses):
    mocked_responses.add(
        url="https://trends.google.com/?geo=US",
        method="GET",
        body=ConnectionError("Fake connection error")
    )
    # The next request made will throw a `ConnectionError` exception
```

The `mocked_responses` fixture is configured to always assert that all registered requests are made; otherwise the test fails:

```
E AssertionError: Not all requests have been executed [('GET', 'https://trends.google.com/trends/fake_call')]
```
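
The idea behind that check can be modelled in a few lines. `ToyRequestsMock` below is a hypothetical stand-in written with the standard library only; it is not the `responses` API, and it only imitates the "every registered request must be fired" rule enforced when the context manager exits.

```python
class ToyRequestsMock:
    """Toy model of a mock that insists every registered request is used."""

    def __init__(self):
        self._expected = []  # (method, url) pairs registered by the test
        self._fired = set()  # pairs that were actually requested

    def add(self, method, url):
        self._expected.append((method, url))

    def fire(self, method, url):
        # Unregistered requests are refused instead of reaching the network.
        if (method, url) not in self._expected:
            raise ConnectionError(f"Unexpected request: {method} {url}")
        self._fired.add((method, url))

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, traceback):
        # Only check for leftovers when the test body itself did not fail.
        if exc_type is None:
            pending = [req for req in self._expected if req not in self._fired]
            if pending:
                raise AssertionError(f"Not all requests have been executed {pending}")
        return False


message = None
try:
    with ToyRequestsMock() as mock:
        mock.add("GET", "https://trends.google.com/trends/fake_call")
        # The test body never performs the registered request.
except AssertionError as error:
    message = str(error)

print(message)
```

A registered request that is never made usually means the code under test took a different path than the test assumed, which is why the fixture treats it as a failure.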
4 changes: 4 additions & 0 deletions README.md
@@ -338,6 +338,10 @@ Returns dictionary
* It has been tested, and 60 seconds of sleep between requests (successful or not) appears to be the correct amount once you reach the limit.
* For certain configurations the dependency lib certifi requires the environment variable REQUESTS_CA_BUNDLE to be explicitly set and exported. This variable must contain the path where the ca-certificates are saved or a SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] error is given at runtime.

# Contributing

See the [CONTRIBUTING](CONTRIBUTING.md) file.

# Credits

* Major JSON revision ideas taken from pat310's JavaScript library
10 changes: 4 additions & 6 deletions pyproject.toml
@@ -24,15 +24,10 @@ classifiers = [
     "Programming Language :: Python :: 3.11",
     "License :: OSI Approved :: Apache Software License"
 ]
-dependencies = [
-    "requests>=2.0",
-    "pandas>=0.25",
-    "lxml"
-]
 keywords = [
     "google trends api search"
 ]
-dynamic = ["readme"]
+dynamic = ["readme", "dependencies"]

 [tool.setuptools]
 packages = ["pytrends"]
@@ -42,6 +37,9 @@ packages = ["pytrends"]
 file = ["README.md"]
 content-type = "text/markdown"

+[tool.setuptools.dynamic.dependencies]
+file = ["requirements.txt"]
+
 [tool.coverage.run]
 branch = true

3 changes: 3 additions & 0 deletions requirements.txt
@@ -0,0 +1,3 @@
requests>=2.0
pandas>=0.25
lxml
11 changes: 11 additions & 0 deletions tests/conftest.py
@@ -0,0 +1,11 @@
import pytest
from responses import RequestsMock


@pytest.fixture
def mocked_responses():
    requests_mock = RequestsMock(
        assert_all_requests_are_fired=True
    )
    with requests_mock as mocked_responses:
        yield mocked_responses
19 changes: 9 additions & 10 deletions tests/test_request.py
@@ -168,7 +168,7 @@ def test_tokens():


 @pytest.mark.vcr
-def test_interest_over_time():
+def test_interest_over_time_ok():
     pytrend = TrendReq()
     pytrend.build_payload(kw_list=['pizza', 'bagel'], timeframe='2021-01-01 2021-01-05')
     df_result = pytrend.interest_over_time()
@@ -252,7 +252,7 @@ def test_interest_over_time_bad_gprop():


 @pytest.mark.vcr
-def test_interest_by_region():
+def test_interest_by_region_ok():
     pytrend = TrendReq()
     pytrend.build_payload(kw_list=['pizza', 'bagel'], timeframe='2021-01-01 2021-12-31')
     df_result = pytrend.interest_by_region()
@@ -444,7 +444,7 @@ def test_related_queries_result_rising():


 @pytest.mark.vcr
-def test_trending_searches():
+def test_trending_searches_ok():
     pytrend = TrendReq()
     # trending_searches doesn't need to call build_payload.
     df_result = pytrend.trending_searches()
@@ -462,7 +462,7 @@ def test_trending_searches():


 @pytest.mark.vcr
-def test_realtime_trending_searches():
+def test_realtime_trending_searches_ok():
     pytrend = TrendReq()
     # realtime_trending_searches doesn't need to call build_payload.
     df_result = pytrend.realtime_trending_searches()
@@ -517,7 +517,7 @@ def test_realtime_trending_searches():


 @pytest.mark.vcr
-def test_top_charts():
+def test_top_charts_ok():
     pytrend = TrendReq()
     # top_charts doesn't need to call build_payload.
     df_result = pytrend.top_charts(date=2021)
@@ -540,7 +540,7 @@ def test_top_charts():


 @pytest.mark.vcr
-def test_suggestions():
+def test_suggestions_ok():
     pytrend = TrendReq()
     # suggestions doesn't need to call build_payload.
     result = pytrend.suggestions(keyword='pizza')
@@ -564,14 +564,13 @@ def test_interest_over_time_partial():
     assert s_last_row.isPartial is np.bool_(True)


-@responses.activate
-def test_request_args_passing():
-    responses.add(
+def test_request_args_passing(mocked_responses):
+    mocked_responses.add(
         url='https://trends.google.com/?geo=US',
         method='GET',
         match=[responses.matchers.header_matcher({'User-Agent': 'pytrends'})]
     )
-    responses.add(
+    mocked_responses.add(
         url='https://trends.google.com/trends/hottrends/visualize/internal/data',
         method='GET',
         match=[responses.matchers.header_matcher({'User-Agent': 'pytrends'})],
