Oracle to BigQuery: 2 Easy Methods
In a time where data is being termed the new oil, businesses need to have a data management system that suits their needs perfectly and positions them to be able to take full advantage of the benefits of being data-driven.
Data is being generated at rapid rates and businesses need database systems that can scale up and scale down effortlessly without any extra computational cost.
Enterprises exhaust a huge chunk of their data budgets just maintaining their existing on-premise database systems instead of directing that budget towards gaining tangible insights from their data.
This scenario is far from ideal and is the reason why moving your Oracle data to a cloud-based Data Warehouse like Google BigQuery is no longer a want but a need.
This post provides a step-by-step walkthrough on how to migrate data from Oracle to BigQuery.
Introduction to Oracle
Oracle database is a relational database system that helps businesses store and retrieve data.
Oracle DB (as it is fondly called) provides a combination of high-level technology and integrated business solutions, a non-negotiable requisite for businesses that store and access huge amounts of data. This makes it one of the world’s most trusted database management systems.
Google_BigQuery">Introduction to Google BigQuery
Google BigQuery is a cloud-based serverless Data Warehouse for processing a large amount of data at a rapid rate. It is called serverless as it automatically scales when running, depending on the data volume and query complexity.
Hence, there is no need to spend a huge part of your database budget on on-site infrastructure and database administrators.
BigQuery is a standout performer when it comes to analysis and data warehousing.
It provides its customers with the freedom and flexibility to create a plan of action that epitomizes their entire business structure.
There are two main ways of migrating data from Oracle to BigQuery:
Method 1: Using Custom ETL Scripts to Connect Oracle to BigQuery
This method involves a 5-step process of utilizing Custom ETL Scripts to establish a connection from Oracle to BigQuery in a seamless fashion. There are considerable upsides to this method and a few limitations as well.
Method 2: Using LIKE.TG to Connect Oracle to BigQuery
LIKE.TG streamlines the process of connecting Oracle to BigQuery, enabling seamless data transfer and transformation between the two platforms. This ensures efficient data migration, accurate analytics, and comprehensive insights by leveraging BigQuery’s advanced analytics capabilities.
Get Started with LIKE.TG for Free
In this post, we will cover the first method (Custom ETL Scripts) in detail. Toward the end of the post, you can also find a quick comparison of both data migration methods so that you can evaluate your requirements and choose wisely.
Methods to Connect Oracle to BigQuery
Here are the methods you can use to set up Oracle to BigQuery migration in a seamless fashion:
Method 1: Using Custom ETL Scripts to Connect Oracle to BigQuery
The steps involved in migrating data from Oracle DB to BigQuery are as follows:
- Step 1: Export Data from Oracle DB to CSV Format
- Step 2: Extract Data from Oracle DB
- Step 3: Upload to Google Cloud Storage
- Step 4: Upload to BigQuery from GCS
- Step 5: Update the Target Table in BigQuery
Let’s take a step-by-step look at each of the steps mentioned above.
Step 1: Export Data from Oracle DB to CSV Format
- BigQuery does not support the binary format produced by Oracle DB, so we will have to export our data to a CSV (comma-separated values) file.
- Oracle SQL Developer is the preferred tool for this task. It is a free integrated development environment that makes it simple to develop and manage Oracle databases both on-premises and in the cloud, and it also serves as a migration tool for moving databases to and from Oracle. Oracle SQL Developer can be downloaded for free from here. If you prefer a scripted approach, a SQL*Plus alternative is sketched after the steps below.
- Open the Oracle SQL Developer tool, and right-click the table name in the object tree view.
- Click on Export.
- Select CSV, and the export data window will pop up.
- Select the format tab and select the format as CSV.
- Enter the preferred file name and location.
- Select the columns tab and verify the columns you wish to export.
- Select the Where tab and add any criteria you wish to use to filter the data.
- Click on Apply.
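If you prefer a scripted export over the SQL Developer GUI, the same CSV can be produced from SQL*Plus using its spool facility (SET MARKUP CSV requires SQL*Plus 12.2 or later). The snippet below is a minimal sketch; the table name client, its columns id and value, and the output path are placeholders you would replace with your own.
SET MARKUP CSV ON
SET FEEDBACK OFF
SPOOL /usr/admin/source/client.csv
SELECT id, value FROM client;
SPOOL OFF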
Step 2: Extract Data from Oracle DB
- The COPY_FILE procedure in the DBMS_FILE_TRANSFER package is used to copy a file within the local file system. The following example copies a CSV file named client.csv from the /usr/admin/source directory to the /usr/admin/destination directory as client_copy.csv.
- The SQL command CREATE DIRECTORY is used to create a directory object for the directory that contains the CSV file. For instance, to create a directory object called source for the /usr/admin/source directory on your system, execute the following code block:
CREATE DIRECTORY source AS '/usr/admin/source';
- Use the SQL command CREATE DIRECTORY to create a directory object for the directory into which you want to copy the CSV file. An illustration is given below
CREATE DIRECTORY dest_dir AS '/usr/admin/destination';
where dest_dir is the destination directory object.
- Grant required access to the user who is going to run the COPY_FILE procedure. An illustration is given below:
GRANT EXECUTE ON DBMS_FILE_TRANSFER TO admin;
GRANT READ ON DIRECTORY source TO admin;
GRANT WRITE ON DIRECTORY dest_dir TO admin;
- Connect as the admin user and provide the password when prompted:
CONNECT admin
- Execute the COPY_FILE procedure to copy the file:
BEGIN
  DBMS_FILE_TRANSFER.COPY_FILE(
    source_directory_object      => 'source',
    source_file_name             => 'client.csv',
    destination_directory_object => 'dest_dir',
    destination_file_name        => 'client_copy.csv');
END;
/
Step 3: Upload to Google Cloud Storage
Once the data has been extracted from Oracle DB, the next step is to upload it to GCS. There are multiple ways to achieve this, explained below.
- Using gsutil
GCP provides the gsutil command-line tool for managing objects and buckets in GCS. It offers a simple way to load a file from your local machine to GCS.
To copy a file to GCS:
gsutil cp client_copy.csv gs://my-bucket/path/to/folder/
To copy an entire folder to GCS:
gsutil cp -r dir gs://my-bucket/path/to/parent/
- Using Web console
An alternative way to upload data from your local machine to GCS is the web console. To use it, follow the steps laid out below.
- Log in to GCP using the link. You need a working Google account to use GCP. Click on the hamburger menu, which produces a drop-down menu. Click on Storage and navigate to Browser in the left tab.
- Create a new bucket to which you will migrate your data. Make sure the name you choose is globally unique.
- Click on the bucket you created and select Upload files. This action takes you to your local directory where you choose the file you want to upload.
- The data upload process starts immediately and a progress bar is shown. Wait for it to finish; once complete, the file will be visible in the bucket. (If you prefer the command line, a gsutil sketch for creating the bucket is given below.)
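As an alternative to creating the bucket through the console in the steps above, gsutil can create it as well. This is a minimal sketch; the bucket name is a placeholder and must be globally unique, and the -l flag sets the bucket location.
gsutil mb -l US gs://your-unique-bucket-name/
gsutil ls
The gsutil ls command simply lists your buckets so you can confirm the new one exists.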
Step 4: Upload to BigQuery from GCS
To upload data to BigQuery, you can use either the web console UI or the command line. Let us look briefly at both methods.
First, let's look at uploading the data using the web console UI.
- The first step is to go to the BigQuery console under the hamburger menu.
- Create a dataset and fill out the drop-down form.
- Click on the dataset you created. A 'Create table' icon will appear below the query editor. Select it.
- Fill in the drop-down list and create the table. To finish loading the table, the schema has to be specified; this will be done using the command-line tool, since interacting with GCS is easier and more straightforward from the command line.
To access the command line, click on the Activate Cloud Shell icon at the top of the GCP console.
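Once Cloud Shell is open, the dataset and table from the console steps above can also be created directly with bq mk. This is a minimal sketch; the project, dataset, and table names are placeholders, and your_schema.json is the schema file described further below.
bq mk --dataset your_project:your_dataset
bq mk --table your_project:your_dataset.your_table ./your_schema.json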
The syntax of the bq command line is shown below:
bq --location=[LOCATION] load --source_format=[FORMAT] [DATASET].[TABLE] [PATH_TO_SOURCE] [SCHEMA]
[LOCATION] is an optional parameter that represents your Location.
[FORMAT] is to be set to CSV.
[DATASET] represents an existing dataset.
[TABLE] is the table name into which you're loading data.
[PATH_TO_SOURCE] is a fully-qualified Cloud Storage URI.
[SCHEMA] is a valid schema. The schema must be a local JSON file or inline.
Note: Instead of supplying a schema definition, you can use the --autodetect flag.
You can specify your schema using the bq command line. An illustration using a JSON schema file is shown below:
bq --location=US load --source_format=CSV your_dataset.your_table gs://your_bucket/your_data.csv ./your_schema.json
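For reference, a schema file such as your_schema.json is simply a JSON array of field definitions. The example below is a hypothetical two-column schema (id and value); your actual column names, types, and modes will differ.
[
  {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
  {"name": "value", "type": "STRING", "mode": "NULLABLE"}
]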
The schema can also be auto-detected. An example is shown below:
bq --location=US load --autodetect --source_format=CSV your_dataset.your_table gs://mybucket/data.csv
- The BigQuery command-line interface offers three options for writing to an existing table. We will use these to copy data into the table we created above.
The options are:
a) Overwrite the table
bq --location=US load --autodetect --replace --source_format=CSV your_dataset_name.your_table_name gs://bucket_name/path/to/file/file_name.csv
b) Append the table
bq --location=US load --autodetect --noreplace --source_format=CSV your_dataset_name.your_table_name gs://bucket_name/path/to/file/file_name.csv ./schema_file.json
c) Add a new field to the target table. In this case, the supplied schema file contains the additional field.
bq --location=asia-northeast1 load --noreplace --schema_update_option=ALLOW_FIELD_ADDITION --source_format=CSV your_dataset.your_table gs://mybucket/your_data.csv ./your_schema.json
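After any of these loads completes, it is worth confirming the row count in the target table. A minimal check using bq query (the dataset and table names are placeholders):
bq query --use_legacy_sql=false 'SELECT COUNT(*) AS row_count FROM your_dataset.your_table'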
Step 5: Update the Target Table in BigQuery
The data loaded in the steps above is not yet in the final target table.
A common pattern is to first load into an intermediate (staging) table in BigQuery and then update the final table from it, just as GCS acts as a staging area for the files themselves.
There are two ways of updating the final table, as explained below.
- Update the rows in the final table and insert new rows from the intermediate table.
UPDATE final_table t
SET t.value = s.value
FROM intermediate_data_table s
WHERE t.id = s.id;

INSERT final_table (id, value)
SELECT id, value
FROM intermediate_data_table
WHERE id NOT IN (SELECT id FROM final_table);
- Delete the rows in the final table that also exist in the intermediate table, then insert all rows from the intermediate table.
DELETE data_set_name.final_table f
WHERE f.id IN (SELECT id FROM data_set_name.intermediate_data_table);

INSERT data_set_name.final_table (id, value)
SELECT id, value
FROM data_set_name.intermediate_data_table;
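As an alternative to the two approaches above, BigQuery also supports MERGE, which applies the update and insert in a single atomic statement. A minimal sketch, assuming the same id and value columns used above:
MERGE data_set_name.final_table t
USING data_set_name.intermediate_data_table s
ON t.id = s.id
WHEN MATCHED THEN
  UPDATE SET value = s.value
WHEN NOT MATCHED THEN
  INSERT (id, value) VALUES (s.id, s.value);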
Limitations of Using Custom ETL Scripts to Connect Oracle to BigQuery
- Writing custom code would add value only if you are looking to move data once from Oracle to BigQuery.
- When a use case that needs data to be synced on an ongoing basis or in real-time from Oracle into BigQuery arises, you would have to move it in an incremental format. This process is called Change Data Capture. The custom code method mentioned above fails here. You would have to write additional lines of code to achieve this.
- When you build custom SQL scripts to extract a subset of the data set in Oracle DB, there is a chance that the script breaks as the source schema keeps changing or evolving.
- Often, there arises a need to transform the data (Eg: hide Personally Identifiable Information) before loading it into BigQuery. Achieving this would need you to add additional time and resources to the process.
In a nutshell, ETL scripts are fragile with a high propensity to break. This makes the entire process error-prone and becomes a huge hindrance in the path of making accurate, reliable data available in BigQuery.
Method 2: Using LIKE.TG to Connect Oracle to BigQuery
Using a fully managed No-Code Data Pipeline platform like LIKE.TG can help you replicate data from Oracle to BigQuery in minutes. LIKE.TG completely automates the process of not only loading data from Oracle but also enriching the data and transforming it into an analysis-ready form without having to write a single line of code.
Its fault-tolerant architecture ensures that the data is handled in a secure, consistent manner with zero data loss. Here are the steps to replicate data from Oracle to BigQuery using LIKE.TG :
- Step 1: Connect to your Oracle database by providing the Pipeline Name, Database Host, Database Port, Database User, Database Password, and Service Name.
- Step 2: Configure Oracle to BigQuery Warehouse migration by providing the Destination Name, Project ID, GCS Bucket, Dataset ID, Enabling Stream Inserts, and Sanitize Table/Column Names.
Here are more reasons to love LIKE.TG :
- Secure: LIKE.TG has a fault-tolerant architecture that ensures that the data is handled in a secure, consistent manner with zero data loss.
- Auto Schema Mapping: LIKE.TG takes away the tedious task of schema management & automatically detects the schema of incoming data and maps it to the destination schema.
- Minimal Learning: LIKE.TG, with its simple and interactive UI, is extremely easy for new customers to learn and operate.
- LIKE.TG is Built to Scale: As the number of sources and the volume of your data grows, LIKE.TG scales horizontally, handling millions of records per minute with very little latency.
- Incremental Data Load: LIKE.TG allows the transfer of data that has been modified in real-time. This ensures efficient utilization of bandwidth on both ends.
- Live Support: The LIKE.TG team is available round the clock to extend exceptional support to its customers through chat, email, and support calls.
- Live Monitoring: LIKE.TG allows you to monitor the data flow and check where your data is at a particular point in time.
Conclusion
This blog talks about the two methods you can use to connect Oracle to BigQuery in a seamless fashion. If you rarely need to transfer your data from Oracle to BigQuery, the first, manual method will work fine. However, if you require real-time data replication and are looking for an automated data pipeline solution, then LIKE.TG is the right choice for you!
Connect Oracle to BigQuery without writing any code
With LIKE.TG, you can achieve simple and efficient data migration from Oracle to BigQuery in minutes. LIKE.TG can help you replicate data from Oracle and 150+ data sources (including 50+ free sources) to BigQuery or a destination of your choice and visualize it in a BI tool. This makes LIKE.TG the right partner to be by your side as your business scales.
Want to take LIKE.TG for a spin? Sign up for a 14-day free trial and experience the feature-rich LIKE.TG suite firsthand.