Google Analytics 4 to BigQuery: Native Export vs. Top ETL Tools

Table of ContentsToggle Table of Content

Are you tired of dealing with Google Analytics 4’s 14-month data retention limit or struggling to unnest complex JSON arrays in BigQuery? You’re not alone. Once you start looking for ways to escape the limitations of standard reporting, the data engineering landscape can get messy fast. 

Moving your data from GA4 to BigQuery has become the gold standard for marketing analytics. Establishing a robust data warehouse integration is the only reliable way to bypass the standard UI’s data sampling, implement custom marketing attribution models, and truly take ownership of your historical data. 

Full transparency: We are the engineering team at Skyvia. We build a no-code ETL tool that moves data, so we are naturally biased. However, in this guide, we won’t pretend we are the only solution. We will honestly evaluate the native Google Analytics 4 integration alongside market leaders like Fivetran and Airbyte based on real technical tests, so you can choose the right path for your architecture. 

Why Should You Move GA4 Data into BigQuery? 

The default Google Analytics 4 reporting interface is designed for high-level monitoring, not deep-dive analytical execution. For teams attempting to build comprehensive reporting models, three core limitations in the standard GA4 interface make external data warehousing necessary. 

  • Overcoming the 14-Month Data Expiration: The standard analytics interface strictly caps user-level and event-level data retention at a maximum of 14 months. If you are trying to build year-over-year performance reports or analyze long-term customer buying cycles, your historical data simply vanishes. Exporting to a data warehouse lets you own your records indefinitely. 
  • Bypassing UI Data Sampling and Thresholding: When you run custom reports, apply secondary dimensions, or look at large datasets, the standard UI frequently applies data sampling and thresholding (hiding specific rows to protect user privacy or speed up loading times). This results in inaccurate reports where the numbers do not add up. BigQuery grants you access to the raw data. 
  • Breaking Down Data Silos: Inside the standard interface, your web analytics are completely isolated from your backend business operations. By exporting to a warehouse, you can finally combine web analytics with offline CRM data, for example, matching a Google Analytics 4 client_id with a Salesforce Lead ID. This can help you track exactly which ad campaign or website click ultimately led to a closed-won deal. 

“In our experience helping hundreds of e-commerce clients, the moment they hit the Google Analytics 4 data thresholding limit is usually when they realize they need a data warehouse.”

How Does the Native GA4 to BigQuery Export Works? 

Native GA4 to BigQuery export sends raw event-level data from a GA4 property to a BigQuery dataset. Before linking these two, we will need to create a new project in Google Cloud Platform and grant the necessary permissions. 

Creating New GCP Project 

Visit Google Cloud Platform and sign in using the same Google account that has Editor or Administrator permissions on your GA4 property. 

Google Cloud platform

Open Project picker and then click New project.

Select a resource in Google Cloud

Give your project a clear name and select the organization, if needed. Click Create

Under Quick Access, select BigQuery

Google Cloud platform

BigQuery should be enabled for this project now. 

Linking GA4 and BigQuery 

The cloud environment is ready, so now we need to connect GA4 to BigQuery. Head back to the Google Analytics and open the Admin panel. Under Property settings, select Product links > BigQuery links. Click the blue Link button.

Google Analytics BigQuery links

In the opened window, click Choose a BigQuery project, select the GCP project you just created, and click Confirm

Link to a BigQuery project

One thing to watch with the region: pick the one closest to your main user base, like EU (Frankfurt) or US (Iowa). You can’t change it later without deleting the link, and deleting it wipes out your intraday tables. If GDPR applies to you, choose an EU region. Confirm your selection and click Next.

Create a Link to a BigQuery project

Under Data Streams, click Configure data streams and events. Select streams you’d like to include in your export. 

Configure data streams and events

Under Frequency, check either option or both. Daily sends one full batch of raw events once a day. Streaming sends events continuously, usually within minutes of the user action, and it needs the active GCP billing account you turned on earlier.

Configure data streams and events

Review and submit the link. After these steps, you should see confirmation that the link is created. 

confirmation that the link is created

After linking, we need to verify that the service account used by Google Analytics was automatically created in GCP. To do that, in Google Console, navigate to IAM & Admin > IAM using the left sidebar.  

Google Console

If you see a firebase-measurement@system.gserviceaccount.com, that means everything was set up correctly. Your data will start appearing in your BigQuery project soon. The native connection is a good start, but it won’t backfill anything. You only get data from the day you create the link onward. So if you need your past analytics in BigQuery now, you’ll have to pull that history in with an external ETL tool. 

How Did We Test and Evaluate the Integration Methods? 

To compare the native export against external ETL tools, our data engineering team spent 30 hours building pipelines. We tested migrating a sample of GA4 events into BigQuery, and judged each tool on three things: how it handles nested JSON flattening, how fast it loads data, and how easily it merges with a secondary source like HubSpot. 

Which ETL Tools Are Best for Exporting GA4 to BigQuery? 

Depending on your engineering resources, data volume, and budget, different tools will suit different infrastructure needs. Let’s look at the four main methods for moving your data. 

Which Solution is Best for Free, Raw Data Backups? 

Native GA4 Export 

The native connection between GA4 and Google Cloud remains the best option for teams that want an unaltered backup of their event records without incurring extra tool subscriptions.

native connection between GA4 and Google Cloud
Pros 
  • It is 100% free (within BigQuery Sandbox limits), natively supported by Google, and requires no third-party software. 
  • Supports real-time streaming to capture visitor activity with sub-minute latency. 
  • Delivers raw, unmanipulated event logs directly to your warehouse. 
Cons 
  • The strict 1-million daily event limit, inability to export historical data, and the deeply nested JSON schema that requires advanced SQL to read. 
  • Streaming export requires an active GCP billing account and is not available under the free BigQuery Sandbox tier. 

Which Solution is Best for Enterprise & High-Volume Streaming? 

Fivetran 

Fivetran is an enterprise-grade SaaS ELT platform designed for large-scale operations with dedicated data engineering budgets. It handles massive throughput and provides automated management features. 

Fivetran
Pros 
  • Incredible speed, massive enterprise compliance, automated schema drift handling. 
  • Highly reliable for syncing hundreds of tables across various data sources. 
Cons 
  • Expensive volume-based pricing (Monthly Active Rows). For high-traffic applications with millions of daily event parameters, costs can escalate rapidly. 

Which Solution is Best for Developer-Heavy Teams? 

Airbyte 

Airbyte is a popular open-source data integration platform. It is appealing to teams that prefer hosting their own pipelines to avoid vendor lock-in. 

Airbyte
Pros 
  • Open-source, highly customizable, great for dbt integration. 
  • No platform licensing fees for the self-hosted Community Edition. 
Cons 
  • Requires significant infrastructure maintenance and DevOps hours if self-hosted. Your team must manage Docker containers, sync states, and local server scaling on platforms like AWS or GCP. 

Which Solution is Best for No-Code Integration? 

Skyvia 

Skyvia is a cloud-first, completely no-code data integration platform. It is built to make it easy for business users and analytics teams to import, duplicate, and map data without managing infrastructure. 

Skyvia
Pros 
  • Visual pipeline builder, predictable pricing (not punishing for high-event volume like MAR), native flattening of complex GA4 data schemas. 
  • Enables non-technical users to build and run pipelines in a matter of minutes. 
Cons 
  • Skyvia is a cloud-first platform. If you represent a highly regulated banking institution that requires a completely air-gapped, on-premise data pipeline without any internet access, we are not the right fit for you. 

How Do These Methods Compare in Real-World Metrics? 

To help you choose, here is a detailed breakdown comparing each method across key operational criteria: 

Integration Method Schema Handling Pricing Model Setup Complexity Best For 
Native GA4 Export Nested JSON (requires advanced SQL to unnest) Free / Pay per stream Medium Teams with strong SQL skills 
Skyvia Automated Flattening, Visual Mapping Subscription Low Marketing teams and data analysts
Fivetran Standardized Schema  Monthly Active Rows (MAR) Low Enterprise budgets 
Airbyte Customizable Free (if self-hosted) High Engineering teams 

How Do I Set Up the GA4 to BigQuery Connection Using Skyvia? 

The native export gets your raw events into BigQuery, but it leaves you with nested JSON and no simple way to blend in CRM data. Skyvia works differently. It pulls GA4 report data through the Data API and lands it in BigQuery as flat tables you can query with a plain SELECT, no UNNEST gymnastics. The trade-off is real and worth stating up front: you work within Google’s API limits, choosing up to 9 dimensions per table, so wide reporting gets split across a few replication tasks rather than dumped into one. Here is the setup we used, start to finish, including the run that failed on purpose so you can see where the limit bites. 

Before You Start: Prepare BigQuery 

Skyvia loads data into a dataset, but it can’t create one for you. A fresh project has zero datasets, and, unless you tried native GA4 to BigQuery export before, you will have zero of them in your project. So do this first: 

  1. In BigQuery, open your project and click Create dataset
  2. Give it a dedicated name (we went with ga4_skyvia) and select a region that works for you. 
  3. In menu, select Cloud Storage > Buckets. Then, click Create.  
  4. Create a bucket in that same region. Skyvia will stage CSV files there on each replication run.  

Step 1: Connect to Your Google Analytics 4 Account 

In Skyvia, click + Create new, then Connectors, and pick Google Analytics 4. Click Sign In with Google, choose the account that has access to required GA4 property, and select that property in the dropdown. 

Google Analytics 4 connector by Skyvia

Save the connection. 

Step 2: Connect Your BigQuery Warehouse 

Create a second connection and pick Google BigQuery. Authenticate via user or service account, fill in Project IDDataset ID and Cloud Storage Bucket name. 

  • Project ID is your project name, visible in the top header of Google Cloud platform. 
  • Dataset ID is the name of the dataset you created above. 
  • Cloud Storage Bucket is the name of the staging bucket you created above. 
BigQuery connection by Skyvia

Step 3: Build the Replication 

Click + Create New, then Replication. Replication is the right fit here because it builds the BigQuery tables for you and then keeps them current with incremental loads. The wizard has six steps: 

  1. Select your Google Analytics 4 connection. 
Select your Google Analytics 4 connection
  1. Select your BigQuery connection. 
Select your BigQuery connection. 
  1. Pick Incremental Updates so that after the first full sync, only new and changed rows load. Pick Create new so Skyvia builds the schema in BigQuery for you. And select Apply changes for schema drift, which means that when GA4’s fields change, Skyvia updates the BigQuery tables to match instead of breaking the pipeline. 
Replication behaviour settings
  1. Set your table naming rules (case, prefix) and, if you want a load timestamp on every row, add the _skyvia_sync column. 
Schema settings
  1. The GA4 connector exposes two objects: CompleteAnalytics (standard reporting dimensions and metrics) and RealtimeAnalytics. Select CompleteAnalytics and click Edit
Object selection

Step 4: Choose Your Dimensions and Metrics 

After clicking Edit, the Task Editor window should show up. This is where Skyvia trades SQL for checkboxes: tick the GA4 dimensions and metrics you want, name the target table, and save. 

Choose Your Dimensions and Metrics

We picked 6 dimensions and 6 metrics and named the table TrafficAcquisition

  • Dimensions: Date, SessionDefaultChannelGroup, SessionSource, SessionMedium, SessionCampaignName, DeviceCategory 
  • MetricsSessions, EngagedSessions, TotalUsers, NewUsers, KeyEvents, TotalRevenue 

Due to Google’s GA4 API restrictions, you can only pick up to 9 dimensions. If you need more breakdowns, add a second replication task. Metrics are easier: ask for more than ten and Skyvia will make the extra API calls for you. 

replication task

Step 5: Run It and Check BigQuery 

Click Save, then Run. You can see the progress using Monitor or Log tabs. 

Monitor or Log tabs

Run has succeeded, so it’s time to check BigQuery: 

Google Cloud

You can now query your data. 

What Are the Most Common Errors When Loading GA4 to BigQuery? 

Most failures fall into three buckets. Here is what each one looks like and how to clear it. 

Error 1: Exceeded rate limits (“too many table update operations for this table”) 

BigQuery limits each table to 1,500 load jobs per day, and that ceiling cannot be raised. If your pipeline writes row by row or fires a fresh load every few minutes, you burn through those 1,500 fast and the job throws “Exceeded rate limits: too many table update operations for this table.” 

The fix is to batch. Instead of streaming individual rows or running constant small loads, collect changes and load them in fewer, larger jobs. This is why a tool like Skyvia stages data as CSV files in Cloud Storage and runs one bulk load per task instead of hammering the table. If you are writing your own pipeline, group your inserts and slow the cadence to stay under the limit. 

Error 2: Difficulties unnesting arrays 

The native export stores event parameters as a repeated (nested) field called event_params, so a plain SELECT event_params gives you an array instead of a value. To read a single parameter you have to UNNEST it inside a subquery: 

SELECT 

  event_date, 

  event_name, 

  (SELECT value.string_value FROM UNNEST(event_params) WHERE key = ’page_location’) AS page_location, 

  (SELECT value.int_value    FROM UNNEST(event_params) WHERE key = ’ga_session_id’) AS session_id 

FROM `your-project.analytics_XXXXXX.events_*` 

WHERE event_name = ’page_view’ 

Inline UNNEST like this keeps one row per event. The alternative, FROM … , UNNEST(event_params), explodes each event into one row per parameter, which is fine for exploring but rarely what you want in a report. 

Error 3: “Requests are limited to 9 dimensions” 

If you pull GA4 through the Data API (the route ETL tools use) you will eventually hit “Requests are limited to 9 dimensions within a nested request.” Google caps a single report request at 9 dimensions. Tools surface every available field, so it is easy to over-select and only discover the limit when the job runs. Keep each table to 9 compatible dimensions and split wider reporting across separate tasks. Metrics are more forgiving; a good tool will batch extra metric requests for you. 

Is the Native Integration Enough, or Do You Need an ETL? 

It comes down to what you do with the data after it lands. 

Stick with the native export if you want a free, raw, event-level archive and your team is comfortable writing UNNEST SQL. It is the best option for fine-grained, event-by-event analysis or feeding a machine-learning pipeline that needs every hit. 

Reach for an ETL like Skyvia if you want flat, query-ready tables, or you need to blend GA4 with offline data such as Salesforce leads or Facebook Ads spend without writing nested SQL. A no-code, scheduled pipeline that hands marketing teams clean tables is usually worth more than raw events nobody on the team can query. 

Plenty of teams run both: the native export for the raw archive, Skyvia for the curated, joinable reporting tables. 

You can try this on Skyvia’s free tier. Connect GA4 and BigQuery, build the replication from the steps above, and you will have a flat channel-performance table in BigQuery in about ten minutes. 

F.A.Q. for Google Analytics 4 to BigQuery

Loader image

The native export is forward-only on standard properties: you only get data from the day the link is switched on, with no backfill of older events. An ETL tool can pull historical report data through the Data API, but only within GA4’s data-retention window (up to 14 months). 

The native daily export is free. Streaming export costs about $0.05 per GB and needs billing enabled. BigQuery itself is free up to 10 GB storage and 1 TB of queries per month, then pay-as-you-go. At single-property volume most teams pay little or nothing. 

The native export stores event parameters in a repeated field, event_params, rather than plain columns. That nested array is why a simple query returns structures instead of values. You either UNNEST it in SQL or use an ETL tool that flattens the data for you. 

Daily sends one free batch per day and covers most reporting needs. Streaming pushes events within minutes but costs $0.05 per GB and requires a billing account. Pick Daily unless you genuinely need intraday data, such as live dashboards or fraud checks.

Switch when you are tired of writing UNNEST, need flat tables your whole team can query, or want to join GA4 with CRM and ad-platform data. If you only need a raw event archive and live in SQL, the native export is still fine. 

Share

Olena Romanchuk

Olena is a skilled writer with a unique blend of technical and FMCG industry expertise. She began her career at Skyvia as a technical support engineer, where she honed her technical problem-solving skills. Prior to Skyvia, Olena held HR and IT roles in global FMCG giants such as AB InBev, Nestlé, and Philip Morris International, where she developed analytical skills, service-oriented thinking, and excellent communication to create engaging and accessible content. From a diverse and inclusive professional background, Olena excels in breaking down complex concepts and delivering clear, impactful writing tailored to varied audiences.