Skip to main content

Integration of Google BigQuery with SAP Data Warehouse Cloud (DWC)

Published on 29 July 2021
SAP Data Warehouse Cloud
Sanjana Nair
Sanjana Nair
AI Consultant

Sanjana Nair is an AI Consultant at Applexus. She has 4+ years of experience in the IT industry, centred around the development of AI/ML solutions for Customer Retail. She also has experience as a developer in the eCommerce sector and is a certified SAP Hybris developer.

Analyzing business data often requires the consumption of additional business-critical data from third-party sources. It is counterproductive to have to copy or move all this data into DWC for such analysis. Instead, DWC has the provision to integrate data from multiple sources, one such example being Google BigQuery.

Data federation between Google BigQuery and DWC has been explored in the following blogs:

Here, delving deeper into this usecase, we create a Google BigQuery connection based on a specific GCP Service account in DWC and access data from a different third-party account in BigQuery. Read on to find out how this is done.

SAP Data Warehouse Cloud(DWC) is a cloud-based data warehousing solution that combines both efficient data management and advanced analytics. Built on the SAP HANA Cloud database, DWC combines data from several sources (SAP and third-party data, across clouds and on-premise repositories), enabling highly efficient real-time analysis.

Architecture of SAP Data Warehouse Cloud
Figure 1. Architecture of SAP Data Warehouse Cloud (Refer link)

Google BigQuery is part of the Google Cloud Platform (GCP) – It is a serverless, scalable, multicloud data warehousing solution that enables data analysis at a petabyte scale.

Once integrated with DWC, data from Google BigQuery can be queried directly by means of virtual tables in DWC. Live connectivity ensures that queries are federated through virtual tables with fresh data and there is never any need for caching or to replicate the data from the source.

SAP DWC and GCP Integration
Figure 2. SAP DWC and GCP Integration

DWC can connect to GBQ using two mechanisms:

  • Through SDI with the use of the Google BigQuery adapter
  • Directly from DWC through the connection type 'Google BigQuery'

As of May 2021, the first connection type (through SDI) is not available. It requires a 3rd party driver, but this has been blocked by SAP for security reasons.

This blog will focus on the second method, i.e., establishing a connection directly from DWC using the ‘Google BigQuery’ connection type.

Following are the prerequisites to integrating Google BigQuery data with SAP DWC

  • Google Cloud Platform Account
  • User with relevant authorization to use Google BigQuery
  • Google BigQuery Service Instance with data
  • SAP DWC instance

Messer Webinar

Steps:

  1. The following steps need to be carried out on the Google BigQuery (GBQ) side:

    • i) Login to the GCP console and create a *Project

      *NOTE: Google BigQuery is managed through Projects, which consists of ‘Datasets’. Datasets in turn are made up of views and tables.

    • ii) In the GCP console, create a ‘Service account’
      • Go to IAM> Service Account > Create Service Account
      • This account is necessary in order to connect with DWC
      Creation of a Service account
      Figure 3. Creation of a Service account (Refer link)
    • iii) Give the user a name and grant the following roles:
      • BigQuery User
      Adding user roles to the Service account
      Figure 4. Adding user roles to the Service account (Refer link)
    • iv) Generate a Key file for the Service account – this can be done from the ‘Actions’ menu, once the Service account has been created.
      • The key file must be generated and downloaded in the json format
      • This key establishes a secure, trusted connection
      • In GCP IAM (Identity Access Management), the generated key ID field will be populated automatically
    • v) Download Google cloud certificate:
      • In your browser go to google.com. Open the certificate and drag it to the desktop.
        Downloading the Google cloud certificate
        Figure 5. Downloading the Google cloud certificate
      • Import the certificate to Keychain Access (on macOS) and export it as a .pem file
  2. Once GBQ has been configured, the 3rd-party vendor will have to associate the Service account (created in previous step) directly with its own GBQ project and configure it to have access to data there

    • i) - The vendor would require the service account (Gmail address) and Key file in order to do this
  3. The following steps will then have to be carried out on the DWC side:

    • i) Go to Administration section of SAP Data Warehouse Cloud and import the certificate (Downloaded in Step I)
      Adding Google cloud certificate in DWC
      Figure 6. Adding Google cloud certificate in DWC
    • ii) Go to space management / space / connections and add Connection and choose BigQuery from the list
      Adding a GBQ connection in DWC
      Figure 7. Adding a GBQ connection in DWC
    • iii) Enter the GCP Project name and upload the key exported in Step I
      Configuring GBQ connection details in DWC
      Figure 8. Configuring GBQ connection details in DWC

      NOTE:

      By default, when a GBQ service account is associated with multiple projects, the DWC connection will show data from the “home” project of that service account, i.e., the Key file of that service account (Created and downloaded in Step I).

      However, once this Service account has been associated by a 3rd party SaaS vendor to their standalone GBQ project, it is possible to get this data in DWC.

      This can be achieved by creating a new DWC connection using the same Key file (generated in Step I), however, the project name that is specified has to be the project we need data from (i.e., the 3rd party project name in this case).
    • iv) After this is done, data from the project can be accessed via DWC. Create a new graphical view and navigate to ‘Sources’ and expand ‘Connections’ – the newly created BigQuery connection should now be available for use

Further Reading:

Add new comment

Plain text

  • No HTML tags allowed.
  • Lines and paragraphs break automatically.
  • Web page addresses and email addresses turn into links automatically.