Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH3508 added basic documentation of google analytics #8807

Closed
wants to merge 2 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 60 additions & 0 deletions doc/source/io.rst
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ object.
* :ref:`read_json<io.json_reader>`
* :ref:`read_msgpack<io.msgpack>` (experimental)
* :ref:`read_html<io.read_html>`
* :ref:`read_ga<io.analytics>`
* :ref:`read_gbq<io.bigquery>` (experimental)
* :ref:`read_stata<io.stata_reader>`
* :ref:`read_clipboard<io.clipboard>`
Expand Down Expand Up @@ -3496,6 +3497,65 @@ And then issue the following queries:
pd.read_sql_query("SELECT * FROM data", con)


.. _io.analytics:

Google Analytics
----------------

The :mod:`~pandas.io.ga` module provides a wrapper for
`Google Analytics API <https://developers.google.com/analytics/devguides>`__
to simplify retrieving traffic data.
Result sets are parsed into a pandas DataFrame with a shape and data types
derived from the source table.

Configuring Access to Google Analytics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The first thing you need to do is to setup accesses to Google Analytics API. Follow the steps below:

#. In the `Google Developers Console <https://console.developers.google.com>`__
#. enable the Analytics API
#. create a new project
#. create a new Client ID for an "Installed Application" (in the "APIs & auth / Credentials section" of the newly created project)
#. download it (JSON file)
#. On your machine
#. rename it to ``client_secrets.json``
#. move it to the ``pandas/io`` module directory

The first time you use the :func:`read_ga` funtion, a browser window will open to ask you to authentify to the Google API. Do proceed.

Using the Google Analytics API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following will fetch users and pageviews (metrics) data per day of the week, for the first semester of 2014, from a particular property.

.. code-block:: python

import pandas.io.ga as ga
ga.read_ga(
account_id = "2360420",
profile_id = "19462946",
property_id = "UA-2360420-5",
metrics = ['users', 'pageviews'],
dimensions = ['dayOfWeek'],
start_date = "2014-01-01",
end_date = "2014-08-01",
index_col = 0,
filters = "pagePath=~aboutus;ga:country==France",
)

The only mandatory arguments are ``metrics,`` ``dimensions`` and ``start_date``. We can only strongly recommend you to always specify the ``account_id``, ``profile_id`` and ``property_id`` to avoid accessing the wrong data bucket in Google Analytics.

The ``index_col`` argument indicates which dimension(s) has to be taken as index.

The ``filters`` argument indicates the filtering to apply to the query. In the above example, the page has URL has to contain ``aboutus`` AND the visitors country has to be France.

Detailed informations in the followings:

* `pandas & google analytics, by yhat <http://blog.yhathq.com/posts/pandas-google-analytics.html>`__
* `Google Analytics integration in pandas, by Chang She <http://quantabee.wordpress.com/2012/12/17/google-analytics-pandas/>`__
* `Google Analytics Dimensions and Metrics Reference <https://developers.google.com/analytics/devguides/reporting/core/dimsmets>`_

.. _io.bigquery:

Google BigQuery (Experimental)
Expand Down