27.13.5 Joining PGX Frames

You can join two frames whose rows are correlated through one of their columns by using the join functionality. The join combines the rows of the two frames whose values in the specified column are equal.

Assume there are two PgxFrames as shown:

//exampleFrame
+-------------------------------------------------------------------------------+
| name     | age | salary      | married | tax_rate | random    | date_of_birth |
+-------------------------------------------------------------------------------+
| John     | 27  | 4133300.0   | true    | 11.0     | 123456782 | 1985-10-18    |
| Albert   | 23  | 5813000.5   | false   | 12.0     | 124343142 | 2000-01-14    |
| Heather  | 28  | 1.0130302E7 | true    | 10.5     | 827520917 | 1985-10-18    |
| Emily    | 24  | 9380080.5   | false   | 13.0     | 128973221 | 1910-07-30    |
| "D'Juan" | 27  | 1582093.0   | true    | 11.0     | 92384     | 1955-12-01    |
+-------------------------------------------------------------------------------+

//moreInfoFrame
+---------------------------------------+
| name   | title                        |
+---------------------------------------+
| John   | Software Engineering Manager |
| Albert | Sales Manager                |
| Emily  | Operations Manager           |
+---------------------------------------+

The following example calls the join method to join exampleFrame and moreInfoFrame. The name column is passed as joinKeyColumn, so rows are matched on their name values. The leftFrame and rightFrame arguments are the prefixes added to the column names of each frame in the result.

JShell

opg4j> exampleFrame.join(moreInfoFrame, "name", "leftFrame", "rightFrame").print()

Java

exampleFrame.join(moreInfoFrame, "name", "leftFrame", "rightFrame").print();

Python

>>> example_frame.join(
...     more_info_frame,
...     join_key_column="name",
...     left_prefix="leftFrame",
...     right_prefix="rightFrame").print()

Alternatively, you can join the frames by providing leftJoinKeyColumn and rightJoinKeyColumn separately. In this case joinKeyColumn is omitted, and the key column can have a different name in each frame.

JShell

opg4j> exampleFrame.join(moreInfoFrame, "name", "name", "leftFrame", "rightFrame").print()

Java

exampleFrame.join(moreInfoFrame, "name", "name", "leftFrame", "rightFrame").print();

Python

>>> example_frame.join(
...     more_info_frame,
...     left_join_key_column="name",
...     right_join_key_column="name",
...     left_prefix="leftFrame",
...     right_prefix="rightFrame").print()

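The resulting joined frame contains the columns of both frames, each prefixed with leftFrame or rightFrame, for the rows whose name values match. Rows of exampleFrame that have no matching name in moreInfoFrame (Heather and "D'Juan") do not appear in the result, as shown: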

+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| leftFramename | leftFrameage | leftFramesalary | leftFramemarried | leftFrametax_rate | leftFramerandom | leftFramedate_of_birth | rightFramename | rightFrametitle              |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| John          | 27           | 4133300.0       | true             | 11.0              | 123456782       | 1985-10-18             | John           | Software Engineering Manager |
| Albert        | 23           | 5813000.5       | false            | 12.0              | 124343142       | 2000-01-14             | Albert         | Sales Manager                |
| Emily         | 24           | 9380080.5       | false            | 13.0              | 128973221       | 1910-07-30             | Emily          | Operations Manager           |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
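
The matching and prefixing behavior seen in this output can be approximated in plain Python. The following is only a sketch of the semantics inferred from the example above, not the PGX implementation: join_frames and the list-of-dictionaries row representation are hypothetical names introduced for illustration, and only the name, age, and title columns are reproduced to keep it short.

```python
def join_frames(left, right, left_key, right_key, left_prefix, right_prefix):
    """Join two lists of row dicts, keeping only rows whose key values are equal.

    Hypothetical helper for illustration; it is not part of PGX or pypgx.
    """
    # Index the right-hand rows by key value so each left row is matched by lookup.
    right_index = {}
    for row in right:
        right_index.setdefault(row[right_key], []).append(row)

    joined = []
    for left_row in left:
        # Left rows without a match are skipped, as in the output shown above.
        for right_row in right_index.get(left_row[left_key], []):
            # Prefixes are concatenated directly with the column names,
            # for example "leftFrame" + "name" gives "leftFramename".
            out = {left_prefix + col: val for col, val in left_row.items()}
            out.update({right_prefix + col: val for col, val in right_row.items()})
            joined.append(out)
    return joined


# A subset of the columns of exampleFrame and moreInfoFrame.
example_rows = [
    {"name": "John", "age": 27},
    {"name": "Albert", "age": 23},
    {"name": "Heather", "age": 28},
    {"name": "Emily", "age": 24},
    {"name": "\"D'Juan\"", "age": 27},
]
more_info_rows = [
    {"name": "John", "title": "Software Engineering Manager"},
    {"name": "Albert", "title": "Sales Manager"},
    {"name": "Emily", "title": "Operations Manager"},
]

result = join_frames(
    example_rows, more_info_rows, "name", "name", "leftFrame", "rightFrame"
)
for row in result:
    print(row)
```

Passing the same column name for both keys corresponds to the joinKeyColumn form; passing two different names corresponds to the leftJoinKeyColumn and rightJoinKeyColumn form.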