Google Analytics Real Time Data to Google Sheets via. Python

Posted by in python, on June 4, 2018

There really hasn’t been a good enough reason for me to hit the Google Analytics real-time API to get data.  If I needed real-time data, I used the GA UI to look at what I needed, which was mostly high-level sanity checks for events firing or pageviews triggerring on certain pages/sites, etc.

Today though, I had the need to track an event in real time to see how many users it was affecting.  But I didint just need the raw number, I need to store it somewhere so I can see a trend.  Then I had a light bulb moment; I remembered, way back when I had applied to get access to GA’s real-time API.  I could maybe use Python and the GA API to fetch the data, store it in Google Sheets and trend the data to see what was going on.  I must admit, I was a bit skeptical as my knowledge of Python is still very limited and this seemed daunting, but wth! I was going to give it a try.  I’m sure Python experts out there can make this code run much faster and combine the two scripts.  (Script one, gets data.  Script two stores data)

As always with most things in digital analytics its helpful to have the event you want to measure already tracked, lucky for me, it was tracked as a GA event in the category/action/label format.  Now all I had to do was grab the data in real time and store it somewhere.

Now with the GA real-time API you don’t have access to all the dimensions and metrics, but it had just the right amount of data I needed (how many users were affected by X)  List of avail. metrics and dimensions for GA real-time api.

The steps to achieve this went something like this:

Getting Access to the API

  1. Request from Google access to the Real Time API (here) – I already had this part completed

Grabbing event names from GA UI

  1. Create a custom report in the GA interface for the data I want to pull via. the API (save the report for validation)
    1. At this point, you know the name of the event/action you need

Allow your code to access Google API for authentication

  1. Follow instructions here to get credentials from the Google APIs Console so that you may interact with GA programmatically

Lets Code

  1. Fire up your favorite Python IDE (PyCharm in my case) and follow instructions here to get you started on creating the skeleton for your script (copy & paste) to hit the GA API and retrieve the data you need:
    1. In the code pay special attention to:
      1. the scope of the API
      2. filters you want to use (i.e. which event do you want), real-time api dimensions and filters all start with rt: as opposed to ga: if you’ve used other GA api’s or ga explorer
        1. This is where your event name comes handy as you will need to filter by this name
      3. dimensions, multiple dimensions can be queried by using commas to separate them (I simply used category and action and that was sufficient)
dimensions='rt:eventCategory, rt:eventAction').execute()
  1. At this point, you’ve probably frustrated yourself with the complicated Google Console API authentication process to allow your python script to get the data you need (hope that went well)
  2. If all went well above you should be able to print out the metric you are looking for in your Python IDE

Let’s Store it somewhere (Google Sheets)

  1. There is a great write up on on how to get Python and Google Sheets to talk to each other, so I won’t go into much detail here.  But it goes like this:
    1. Enable the Google Sheets API (similar to the GA API above) so that you may interact with Google Sheets programmatically via python
    2. Update an existing Google Sheet with the GA data as a new row, so that we have new data in each row
    3. Pay attention to the part of the code which calls the GA sheets script from within the GA Real Time Data script; in short:
      1. First get the real-time data (run first script)
      2. Store it in a variable
      3. Call the GA sheets script (use script 1 to call the ga sheets function)
      4. Add a row to the sheet with the data from the variable you just got
client = gspread.authorize(creds)
sheet ="my super cool sheet").sheet1
sheet.append_row([metric_name,metric_total,currentDT.hour,currentDT.minute, currentDT.second,,currentDT.month,currentDT.year])

Next steps

  1. Automate this process so that the data is updated automatically and does not require me to run this code manually on my local machine (reading up on how to do this with little effort)
  2. Throw this into a Datastudio dashboard for real-time monitoring (use this sheet as a datasource in datastudio)

Github repo. (get the code from my github repo)

Was too excited to post this after the initial success, so let me know in the comments if something is unclear, I will do my very best to clarify.