Skip to content

A sample of different approaches to collection joins

License

Notifications You must be signed in to change notification settings

jgebal/collection_joins

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Sample project to provide means of one to many left joining collections in Oracle PL/SQL

For the purpose of this sample project following types are being used.

  • An object representing the contents of the collection to be joined
create or replace type a_demo_object as object (
   id number,
   name varchar2(100)
);
/
create or replace type a_demo_collection as table of a_demo_object;
/
create or replace type a_demo_join_object as object (
   id number,
   children a_demo_collection
);
/
create or replace type a_demo_join_collection as table of a_demo_join_object;
/

Description

There are currently three different join methods implemented:

  • Nested loops outer join with full scan of the inner collection. The method has complexity n*m, causing computation intensity to grow exponentially.
  • Nested loops outer join with index-like scan of the inner collection. First an index is build on inner collection, then access is done via index-path. Complexity x*m + y*(m+n) where x is number of steps needed to build index and y is the number of steps needed to link the two.
  • Sort-merge outer join. First both collections are sorted, then each element is accessed exactly once while joining. Complexity a*(m+n) + b*(m+n) where a is number of steps needed to sort the collection and b is the number of steps needed to link the two.

Both sort-merge and nested loops with index should provide quite linear performance with the growth of amount of data processed.

Usage

Example of joining two collections:

select *
  from table( 
     collection_outer_join.nl_index_scan_join(
        a_demo_collection(a_demo_object(1,'a')),
        a_demo_collection(
            a_demo_object(1,'a'),
            a_demo_object(1,'a')
        )
     )
  );

--join 100 sample rows with 1000 sample rows
select * from table( collection_outer_join.nl_index_scan_join(100, 1000) );
select * from table( collection_outer_join.sort_merge_join(100, 1000) );
select * from table( collection_outer_join.nl_full_scan_join(100, 1000) );

You may use the examples provided here as you find them useful.

Additional package collection_gen was added only for demo and testing purposes.

Performance

RUN_COMMENT TOTAL_TIME_MILI_SEC
nl_full_scan_join 1,000 parents 20,000 children 178,320
nl_index_scan_join 1,000 parents 20,000 children 900
sort_merge_join 1,000 parents 20,000 children 1,130
nl_index_scan_join 40,000 parents 400,000 children 20,720
sort_merge_join 40,000 parents 400,000 children 27,330
nl_index_scan_join 10,000 parents 1,000,000 children 46,810
sort_merge_join 10,000 parents 1,000,000 children 70,040

The code was profiled using DBMSL_PROFILER.

  • The nl_full_scan_join is slowest
  • The nl_index_scan_join is fastest
  • The sort_merge_join would be as fast as nested loops, but requires sorting that takes extra hit on performance.

Here you may see the full outcomes of the profiler runs.

About

A sample of different approaches to collection joins

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published