1. Ensure the IBM Data Virtualization service is enabled and provisioned

Log in to IBM Cloud Pak for Data with valid credentials

To perform this lab, you need IBM Cloud Pak for Data credentials with the admin role. The credentials consist of a username and a password.

Verify that the IBM Data Virtualization service is installed and provisioned

Follow the steps below to make sure the IBM Data Virtualization service is installed and provisioned in the provided IBM Cloud Pak for Data instance.

Steps:

Open the hamburger navigation menu, click Services, and then click Services catalog to see all services available for this instance.

In the search box, type Data Virtualization to find the Data Virtualization (DV) service, and click it.
On the DV service page, you should see that the service is enabled, as shown in the picture above.
Click Instances to verify that the Data Virtualization service is running.

2. Create New Connection

In this step, you will create connections between external data sources and the Data Virtualization service using Cloud Pak for Data (CP4D) connectors. In this lab, you will connect to Amazon S3 and Amazon RDS data sources. Follow the steps below to create the connections:

Click the Navigation Menu, expand 'Data', and then click 'Data Virtualization'.

On the Data Virtualization page, click Add connection, then New connection.

Select the 'Amazon RDS for PostgreSQL' connection type, fill in the details to create the Amazon Aurora PostgreSQL connection in Data Virtualization, and then click Create. You can give the connection any name; note it down for future reference.

Click Skip.

Similarly, add the Amazon S3 data source by selecting the connection type 'Amazon S3' and filling in the connection details. As with the previous connection, you can give it any name; note it down for future reference.

Once both connections are created successfully, you should see the Amazon S3 and Amazon RDS PostgreSQL connections listed on the Data sources page, as shown below.
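Because later steps refer back to the connection names, it can help to keep a short note of both connections in one place. The following is a minimal sketch in Python and not part of the lab itself: the field names (host, port, database, bucket, and so on) and all values are hypothetical placeholders, and the fields in the connection form of your Cloud Pak for Data instance may differ. The helper only reports which noted fields are still empty.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionNote:
    """A note about one connection; all field names are illustrative."""
    name: str             # the name you typed in the connection form
    connection_type: str  # e.g. "Amazon RDS for PostgreSQL" or "Amazon S3"
    details: dict = field(default_factory=dict)

    def missing(self, required):
        """Return the required detail keys that are absent or empty."""
        return [key for key in required if not self.details.get(key)]


# Hypothetical field lists; check the actual form in your instance.
RDS_FIELDS = ["host", "port", "database", "username"]
S3_FIELDS = ["bucket", "endpoint_url"]

rds = ConnectionNote(
    name="lab-rds-postgres",
    connection_type="Amazon RDS for PostgreSQL",
    details={"host": "example-host", "port": 5432, "database": "labdb"},
)
s3 = ConnectionNote(
    name="lab-s3",
    connection_type="Amazon S3",
    details={"bucket": "example-bucket", "endpoint_url": "https://example.invalid"},
)

print(rds.name, "missing:", rds.missing(RDS_FIELDS))
print(s3.name, "missing:", s3.missing(S3_FIELDS))
```

Secrets such as passwords are deliberately left out of the note; only the names and non-sensitive details are recorded.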

3. Create virtual tables

Congratulations! In the previous step, you created connections between external data sources and the Data Virtualization service. Now you can select tables and files from those connections and create virtualized tables or objects. Once the tables are virtualized, you can create a VIEW from those virtual tables.

Open the Data Virtualization menu, expand Virtualization, and then click Virtualize, as shown below.

The Tables tab might list several tables coming from various connections. Select the tables ts_wallonia_region_table and ts_flanders_region_table to create virtual tables. You can also search for or filter the tables. Once they are selected, click Add to cart, then View cart.

As shown on this screen, select the Virtualized data option and give each table a unique name (in the image below the names are ts_wallonia_region_table and ts_flanders_region_table). Note the names down for future reference.

Click Modify columns for the first table (here ts_wallonia_region_table), verify that the name, type, and length of each column match the image below, and click Apply.

Repeat the same step for the other table (here ts_flanders_region_table).

After making the changes, click Apply, and then click Virtualize on the Review cart and virtualize tables page.

Click Continue to create the virtual tables.

In the previous steps, you virtualized the tables coming from the Amazon RDS connection. Now use a CSV file coming from the Amazon S3 connection and virtualize it. Click the Files tab on the Virtualize page, as shown below, and then click Endpoint.

Navigate to the regional folder inside the parent bucket, select ts-Brussels-grouped-21-04.csv as shown in the image below, and then click Add to cart, then View cart.

On the Review cart and virtualize page, give the table a unique name (the later steps in this lab use ts_brussels_region_table) and note it down for future reference, and then click the Modify columns button to review the column metadata.

Edit the column names, types, and lengths as shown in the image below, and then click Apply.

On the Review cart and virtualize page, select the Virtualized data option, and then click the Virtualize button.

Click Continue to create the virtual table.

Click View virtualized data.
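The Modify columns steps in this section ask you to confirm each column's name, type, and length. One way to sanity-check a length before virtualizing a CSV file is to measure the longest value in each column of a local copy. The sketch below is only an illustration: the sample rows and the column names (date, region, cases) are invented and are not the actual contents of ts-Brussels-grouped-21-04.csv, the declared lengths are made up, and max_lengths is a hypothetical helper name.

```python
import csv
import io


def max_lengths(csv_text):
    """Return {column name: longest value length} for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lengths = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            # A short row yields None for its missing fields; count it as empty.
            lengths[name] = max(lengths[name], len(value or ""))
    return lengths


# Invented sample standing in for a local copy of the file.
SAMPLE = """date,region,cases
2021-04-01,Brussels,120
2021-04-02,Brussels,98
"""

# Hypothetical declared column lengths to compare against.
DECLARED = {"date": 10, "region": 20, "cases": 5}

observed = max_lengths(SAMPLE)
too_short = {col: n for col, n in observed.items() if n > DECLARED[col]}
print(observed)
print("columns needing a longer length:", too_short)
```

Any column reported in too_short would need a larger length in the Modify columns dialog; an empty result suggests the declared lengths fit the sample.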

4. Create VIEW by joining two virtual tables/objects

So far, you have created connections to external data sources and virtualized tables and files from them. Now you will join those virtualized tables/objects to create a VIEW. The VIEW lets you query multiple data sources without creating data replicas.
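To make the idea concrete, the sketch below reproduces the two joins of this step with Python's built-in sqlite3 module instead of Data Virtualization. It is an analogy only: the three in-memory tables stand in for the virtual tables, their columns (date, cases) and values are invented, using date as the key of the first join is an assumption, and in the lab the joins are built through the user interface rather than typed as SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Stand-ins for the three virtual tables; columns and rows are invented.
for table in ("ts_wallonia_region_table",
              "ts_flanders_region_table",
              "ts_brussels_region_table"):
    cur.execute(f"CREATE TABLE {table} (date TEXT, cases INTEGER)")

cur.executemany("INSERT INTO ts_wallonia_region_table VALUES (?, ?)",
                [("2021-04-01", 10), ("2021-04-02", 12)])
cur.executemany("INSERT INTO ts_flanders_region_table VALUES (?, ?)",
                [("2021-04-01", 20), ("2021-04-02", 22)])
cur.executemany("INSERT INTO ts_brussels_region_table VALUES (?, ?)",
                [("2021-04-01", 5), ("2021-04-02", 7)])

# First view: join two tables on the shared key, renaming columns
# so that they stay distinct.
cur.execute("""
    CREATE VIEW flanders_wallonia_joined_view AS
    SELECT w.date AS date,
           w.cases AS wallonia_cases,
           f.cases AS flanders_cases
    FROM ts_wallonia_region_table AS w
    JOIN ts_flanders_region_table AS f ON w.date = f.date
""")

# Second view: join the first view with the third table on date.
cur.execute("""
    CREATE VIEW brussels_wallonia_flanders_joined_view AS
    SELECT v.date AS date,
           v.wallonia_cases,
           v.flanders_cases,
           b.cases AS brussels_cases
    FROM flanders_wallonia_joined_view AS v
    JOIN ts_brussels_region_table AS b ON v.date = b.date
""")

rows = cur.execute(
    "SELECT * FROM brussels_wallonia_flanders_joined_view ORDER BY date"
).fetchall()
for row in rows:
    print(row)
```

The views store only the query, not a copy of the rows, which mirrors the point above: the joined result is computed from the underlying tables each time it is read.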

Follow the steps below to create the VIEW:

Click the Data Virtualization menu, expand Virtualization, and then click Virtualized data.

Select the virtual tables that you created in the previous step (here ts_wallonia_region_table and ts_flanders_region_table) and click Join.

Join the two tables by specifying the join key, and then click Preview to preview the joined table.

Close the preview page and click Next on the Join virtual objects page.

Edit the column names as shown in the image below.

After checking all the details, provide a unique view name (e.g. flanders_wallonia_joined_view), select the Virtualized data checkbox, and click Create view.

Click 'Virtualized data', then select the joined view you created in the previous step (e.g. flanders_wallonia_joined_view) and the ts_brussels_region_table table to create a second view.

Specify date as the join key and click Preview.

Edit the column names as shown below.

Specify the view name (e.g. brussels_wallonia_flanders_joined_view) and click Create view.

By following these steps, you have created a single joined view over different data sources. Now go to the catalog to view the data.

Click the navigation menu, expand Catalogs, click All catalogs, and select the default catalog to see the view created in the last step.

On the Default Catalog page, you will see the joined view (brussels_wallonia_flanders_joined_view) that you created in the last step. Notice that the view name starts with ADMIN., which is the username. In your case, the name might appear as <USER_NAME>.brussels_wallonia_flanders_joined_view.

Click the view for more details.

Click the Assets tab and provide your Cloud Pak for Data admin credentials.

After the credentials are validated, you will be able to see the integrated data.

Click the Profile tab to get statistics on the data inside the view.
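If you later refer to the view in SQL, the username prefix seen in the catalog (ADMIN. in this lab) becomes part of the name. The sketch below only builds a schema-qualified name and a query string; it does not connect to anything. The helper names are hypothetical, and the double-quoting follows the standard SQL rule for delimited identifiers, which preserves the case of each part exactly as written. Whether your SQL client needs the quotes is an assumption to verify in your environment.

```python
def quote_identifier(name):
    """Wrap a name in double quotes, doubling any embedded quote (standard SQL)."""
    return '"' + name.replace('"', '""') + '"'


def qualified_view(schema, view):
    """Return the schema-qualified, case-preserving name of a view."""
    return f"{quote_identifier(schema)}.{quote_identifier(view)}"


# "ADMIN" is the username shown in the lab; replace it with your own.
name = qualified_view("ADMIN", "brussels_wallonia_flanders_joined_view")
query = f"SELECT * FROM {name}"
print(query)
```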
