27.13.5 Joining PGX Frames
You can join two frames whose rows are correlated through one of their columns by using the join
functionality. A join combines the frames by matching rows that have equal values in a specified column.
Assume there are two PgxFrames as shown:
//exampleFrame
+-------------------------------------------------------------------------------+
| name     | age | salary      | married | tax_rate | random    | date_of_birth |
+-------------------------------------------------------------------------------+
| John     | 27  | 4133300.0   | true    | 11.0     | 123456782 | 1985-10-18    |
| Albert   | 23  | 5813000.5   | false   | 12.0     | 124343142 | 2000-01-14    |
| Heather  | 28  | 1.0130302E7 | true    | 10.5     | 827520917 | 1985-10-18    |
| Emily    | 24  | 9380080.5   | false   | 13.0     | 128973221 | 1910-07-30    |
| "D'Juan" | 27  | 1582093.0   | true    | 11.0     | 92384     | 1955-12-01    |
+-------------------------------------------------------------------------------+
//moreInfoFrame
+---------------------------------------+
| name   | title                        |
+---------------------------------------+
| John   | Software Engineering Manager |
| Albert | Sales Manager                |
| Emily  | Operations Manager           |
+---------------------------------------+
The following example calls the join method to join exampleFrame and moreInfoFrame. The API uses the
name column as joinKeyColumn to join the two frames. The column prefixes specified in the API are
leftFrame and rightFrame; in the result, they are prepended to the column names of the left and right
frame respectively.
opg4j> exampleFrame.join(moreInfoFrame, "name", "leftFrame", "rightFrame").print()
exampleFrame.join(moreInfoFrame, "name", "leftFrame", "rightFrame").print();
>>> example_frame.join(
... more_info_frame,
... join_key_column="name",
... left_prefix="leftFrame",
... right_prefix="rightFrame").print()
Alternatively, you can join the frames by providing leftJoinKeyColumn and rightJoinKeyColumn. In this
case, joinKeyColumn is omitted.
opg4j> exampleFrame.join(moreInfoFrame, "name", "name", "leftFrame", "rightFrame").print()
exampleFrame.join(moreInfoFrame, "name", "name", "leftFrame", "rightFrame").print();
>>> example_frame.join(
... more_info_frame,
... left_join_key_column="name",
... right_join_key_column="name",
... left_prefix="leftFrame",
... right_prefix="rightFrame").print()
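The resulting joined frame contains the columns of both frames for the rows that have matching values
in the name column. Rows of exampleFrame with no matching name in moreInfoFrame (Heather and "D'Juan")
do not appear in the result, as shown: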
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| leftFramename | leftFrameage | leftFramesalary | leftFramemarried | leftFrametax_rate | leftFramerandom | leftFramedate_of_birth | rightFramename | rightFrametitle              |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| John          | 27           | 4133300.0       | true             | 11.0              | 123456782       | 1985-10-18             | John           | Software Engineering Manager |
| Albert        | 23           | 5813000.5       | false            | 12.0              | 124343142       | 2000-01-14             | Albert         | Sales Manager                |
| Emily         | 24           | 9380080.5       | false            | 13.0              | 128973221       | 1910-07-30             | Emily          | Operations Manager           |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
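To make the join semantics concrete, the following plain-Python sketch reproduces the same kind of result without a PGX session. It is not the PGX implementation. The helper join_rows and the use of lists of dictionaries as stand-ins for frames are assumptions made for illustration only, and only a subset of the example columns (name, age, title) is included to keep it short.

```python
def join_rows(left, right, left_key, right_key, left_prefix, right_prefix):
    """Equality join over lists of dicts (hypothetical helper, not a PGX API)."""
    joined = []
    for l_row in left:
        for r_row in right:
            # Keep a combined row only when the key values are equal.
            if l_row[left_key] == r_row[right_key]:
                # Prepend the prefixes so columns from the two sides stay distinct.
                row = {left_prefix + col: val for col, val in l_row.items()}
                row.update({right_prefix + col: val for col, val in r_row.items()})
                joined.append(row)
    return joined


# Stand-ins for exampleFrame and moreInfoFrame (subset of columns).
example_rows = [
    {"name": "John", "age": 27},
    {"name": "Albert", "age": 23},
    {"name": "Heather", "age": 28},
    {"name": "Emily", "age": 24},
    {"name": "D'Juan", "age": 27},
]
more_info_rows = [
    {"name": "John", "title": "Software Engineering Manager"},
    {"name": "Albert", "title": "Sales Manager"},
    {"name": "Emily", "title": "Operations Manager"},
]

joined = join_rows(example_rows, more_info_rows,
                   "name", "name", "leftFrame", "rightFrame")
for row in joined:
    print(row)
```

In this sketch, the rows for Heather and D'Juan are dropped because no row on the right side has an equal name, and every output column carries the prefix of the side it came from, which mirrors the column names in the joined frame above.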