Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC GH3508 (bis) added basic documentation of google analytics in remot... #8835

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
69 changes: 62 additions & 7 deletions doc/source/remote_data.rst
Original file line number Diff line number Diff line change
Expand Up @@ -27,14 +27,14 @@ Remote Data Access

.. _remote_data.data_reader:

Functions from :mod:`pandas.io.data` extract data from various Internet
sources into a DataFrame. Currently the following sources are supported:
Functions from :mod:`pandas.io.data` and :mod:`pandas.io.ga` extract data from various Internet sources into a DataFrame. Currently the following sources are supported:

- Yahoo! Finance
- Google Finance
- St. Louis FED (FRED)
- Kenneth French's data library
- World Bank
- :ref:`Yahoo! Finance<remote_data.yahoo>`
- :ref:`Google Finance<remote_data.google>`
- :ref:`St.Louis FED (FRED)<remote_data.fred>`
- :ref:`Kenneth French's data library<remote_data.ff>`
- :ref:`World Bank<remote_data.wb>`
- :ref:`Google Analytics<remote_data.ga>`

It should be noted, that various sources support different kinds of data, so not all sources implement the same methods and the data elements returned might also differ.

Expand Down Expand Up @@ -330,7 +330,62 @@ indicators, or a single "bad" (#4 above) country code).

See docstrings for more info.

.. _remote_data.ga:

Google Analytics
----------------

The :mod:`~pandas.io.ga` module provides a wrapper for
`Google Analytics API <https://developers.google.com/analytics/devguides>`__
to simplify retrieving traffic data.
Result sets are parsed into a pandas DataFrame with a shape and data types
derived from the source table.

Configuring Access to Google Analytics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The first thing you need to do is to setup accesses to Google Analytics API. Follow the steps below:

#. In the `Google Developers Console <https://console.developers.google.com>`__
#. enable the Analytics API
#. create a new project
#. create a new Client ID for an "Installed Application" (in the "APIs & auth / Credentials section" of the newly created project)
#. download it (JSON file)
#. On your machine
#. rename it to ``client_secrets.json``
#. move it to the ``pandas/io`` module directory

The first time you use the :func:`read_ga` funtion, a browser window will open to ask you to authentify to the Google API. Do proceed.

Using the Google Analytics API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following will fetch users and pageviews (metrics) data per day of the week, for the first semester of 2014, from a particular property.

.. code-block:: python

import pandas.io.ga as ga
ga.read_ga(
account_id = "2360420",
profile_id = "19462946",
property_id = "UA-2360420-5",
metrics = ['users', 'pageviews'],
dimensions = ['dayOfWeek'],
start_date = "2014-01-01",
end_date = "2014-08-01",
index_col = 0,
filters = "pagePath=~aboutus;ga:country==France",
)

The only mandatory arguments are ``metrics,`` ``dimensions`` and ``start_date``. We can only strongly recommend you to always specify the ``account_id``, ``profile_id`` and ``property_id`` to avoid accessing the wrong data bucket in Google Analytics.

The ``index_col`` argument indicates which dimension(s) has to be taken as index.

The ``filters`` argument indicates the filtering to apply to the query. In the above example, the page has URL has to contain ``aboutus`` AND the visitors country has to be France.

Detailed informations in the followings:

* `pandas & google analytics, by yhat <http://blog.yhathq.com/posts/pandas-google-analytics.html>`__
* `Google Analytics integration in pandas, by Chang She <http://quantabee.wordpress.com/2012/12/17/google-analytics-pandas/>`__
* `Google Analytics Dimensions and Metrics Reference <https://developers.google.com/analytics/devguides/reporting/core/dimsmets>`_

1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.15.2.txt
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,7 @@ Enhancements

- Added ability to export Categorical data to Stata (:issue:`8633`).
- Added ability to export Categorical data to to/from HDF5 (:issue:`7621`). Queries work the same as if it was an object array. However, the ``category`` dtyped data is stored in a more efficient manner. See :ref:`here <io.hdf5-categorical>` for an example and caveats w.r.t. prior versions of pandas.
- Added Google Analytics (`pandas.io.ga`) basic documentation (:issue:`8835`). See :ref:`here<remote_data.ga>`.

.. _whatsnew_0152.performance:

Expand Down