How To Extract Data From Google Analytics Using Python

Google Analytics provides detailed information about how visitors interact with your website. Pulling that information out by hand is slow, but Python, a versatile scripting language widely used in data science, can automate the task. This article walks through using Python to extract data from Google Analytics.

Step 1: Install the Necessary Libraries

We will access the Google Analytics Reporting API v4 through the Google API Client Library for Python, with the google-auth packages handling authentication. Install them with the following pip command:

pip install google-api-python-client google-auth-httplib2 google-auth-oauthlib google-auth

Step 2: Set up Google Analytics Reporting API

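Before you can extract data, you must set up the Google Analytics Reporting API v4. This API reads Universal Analytics views; Google Analytics 4 properties are served by the separate Google Analytics Data API.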

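  • Go to the Google API Console and create a new project.
  • Enable the Google Analytics Reporting API for the project.
  • Create service account credentials that your Python script will use to access the API.
  • Download the service account key as a JSON file.
  • Add the service account’s email address as a user on the Analytics view you want to query, with read access.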
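
Before moving on, it can help to confirm that the downloaded file really is a service account key and to look up the account's email address, since that address needs read access to the Analytics view. The following is a minimal sketch, assuming the usual layout of a service account key file (with `type`, `client_email`, and `private_key` fields). The helper name `read_service_account_email` is hypothetical and not part of any Google library.

```python
import json

# Fields a service account key file is expected to contain (assumption
# based on the usual key file layout).
REQUIRED_FIELDS = ("type", "client_email", "private_key")


def read_service_account_email(key_file_path):
    """Return the client_email stored in a service account key file.

    Hypothetical helper for illustration. Raises ValueError if the file
    does not look like a service account key.
    """
    with open(key_file_path, encoding="utf-8") as handle:
        key_data = json.load(handle)

    missing = [field for field in REQUIRED_FIELDS if field not in key_data]
    if missing:
        raise ValueError("Key file is missing fields: " + ", ".join(missing))
    if key_data["type"] != "service_account":
        raise ValueError(
            "Expected a service account key, got type %r" % key_data["type"])

    return key_data["client_email"]
```

Calling `read_service_account_email('<path_to_json_key_file>')` should return the address to add as a user in Google Analytics.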

Step 3: Extract Data with Python

Next, we’ll write the Python script that extracts data from Google Analytics. Here’s a basic example:

from google.oauth2 import service_account
from googleapiclient.discovery import build

# Read-only access is enough for pulling reports.
SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
KEY_FILE_LOCATION = '<path_to_json_key_file>'
VIEW_ID = '<view_id>'

def initialize_analyticsreporting():
    """Build a Reporting API v4 client from the service account key file."""
    credentials = service_account.Credentials.from_service_account_file(
            KEY_FILE_LOCATION, scopes=SCOPES)
    analytics = build('analyticsreporting', 'v4', credentials=credentials)
    return analytics

def get_report(analytics):
    """Request the number of sessions per country for the last 7 days."""
    return analytics.reports().batchGet(
        body={
            'reportRequests': [
                {
                    'viewId': VIEW_ID,
                    'dateRanges': [{'startDate': '7daysAgo', 'endDate': 'today'}],
                    'metrics': [{'expression': 'ga:sessions'}],
                    'dimensions': [{'name': 'ga:country'}]
                }]
        }
    ).execute()

def main():
    analytics = initialize_analyticsreporting()
    response = get_report(analytics)
    # Print the raw API response, which is a nested dictionary.
    print(response)

if __name__ == '__main__':
    main()

In this script, we define three functions:

  • initialize_analyticsreporting(): Initializes the API client using the provided service account credentials.
  • get_report(analytics): Requests a report for the specified view covering the last 7 days, with the number of sessions broken down by country, and returns the API response.
  • main(): Calls the above functions and prints the response.
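
The response returned by the API is a nested dictionary, which is awkward to read when printed as-is. The sketch below flattens it into one dictionary per row. The function name `flatten_reports` is hypothetical, and `sample_response` is made-up data shaped like a Reporting API v4 response, included only so the example runs without credentials.

```python
def flatten_reports(response):
    """Turn a batchGet response into a list of {column name: value} dicts."""
    records = []
    for report in response.get('reports', []):
        header = report.get('columnHeader', {})
        dimension_names = header.get('dimensions', [])
        metric_entries = header.get('metricHeader', {}).get(
            'metricHeaderEntries', [])
        metric_names = [entry['name'] for entry in metric_entries]

        for row in report.get('data', {}).get('rows', []):
            record = dict(zip(dimension_names, row.get('dimensions', [])))
            # Each row holds one metrics entry per requested date range;
            # this sketch reads only the first date range.
            metrics = row.get('metrics') or [{}]
            record.update(zip(metric_names, metrics[0].get('values', [])))
            records.append(record)
    return records


# Made-up sample data for illustration; values arrive as strings.
sample_response = {
    'reports': [{
        'columnHeader': {
            'dimensions': ['ga:country'],
            'metricHeader': {
                'metricHeaderEntries': [
                    {'name': 'ga:sessions', 'type': 'INTEGER'}]},
        },
        'data': {
            'rows': [
                {'dimensions': ['Canada'], 'metrics': [{'values': ['42']}]},
                {'dimensions': ['France'], 'metrics': [{'values': ['17']}]},
            ],
        },
    }]
}

for record in flatten_reports(sample_response):
    print(record)
```

Because metric values come back as strings, convert them with int() or float() before doing arithmetic on them.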

This is a basic example, and the full capabilities of the Google Analytics Reporting API are beyond the scope of this tutorial. For more complex queries, you can review the official API samples.
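
As one illustration of a slightly richer query, the request body can be assembled by a small function so that metrics, dimensions, sorting, and paging are easy to vary. This is a sketch under assumptions, not the official approach: `build_report_request` is a hypothetical helper, while the field names it emits (`pageSize`, `pageToken`, `orderBys`) are those of the Reporting API v4 request format.

```python
def build_report_request(view_id, metrics, dimensions,
                         start_date='7daysAgo', end_date='today',
                         order_by=None, page_size=1000, page_token=None):
    """Build a batchGet request body (hypothetical helper).

    metrics and dimensions are lists of names such as 'ga:sessions' and
    'ga:country'. order_by, if given, is a metric or dimension name to
    sort by in descending order.
    """
    request = {
        'viewId': view_id,
        'dateRanges': [{'startDate': start_date, 'endDate': end_date}],
        'metrics': [{'expression': name} for name in metrics],
        'dimensions': [{'name': name} for name in dimensions],
        'pageSize': page_size,
    }
    if order_by is not None:
        request['orderBys'] = [
            {'fieldName': order_by, 'sortOrder': 'DESCENDING'}]
    if page_token is not None:
        # Pass the nextPageToken from a previous report to fetch more rows.
        request['pageToken'] = page_token
    return {'reportRequests': [request]}


# Sessions and users per country and device category, busiest rows first.
body = build_report_request(
    '<view_id>',
    metrics=['ga:sessions', 'ga:users'],
    dimensions=['ga:country', 'ga:deviceCategory'],
    order_by='ga:sessions',
    page_size=500,
)
```

The resulting dictionary can be passed as the body argument of batchGet. When a report contains a nextPageToken field, calling the helper again with that value as page_token requests the next page of rows.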

Conclusion

By automating data extraction from Google Analytics with Python, we can save considerable time and get to insights faster. From here, you can analyze the data further with Python’s data analysis libraries, such as Pandas and Matplotlib, or feed it into a machine learning model.