4.3.1.3 Examples of Selecting Distinct Columns
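This example uses the distinct and arrange functions of the OREdplyr package. It pushes a 100-row data.frame to the database as an ore.frame, counts the distinct rows, and then lists the distinct values of each column and of a computed column in ascending order.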
Example 4-72 Selecting Distinct Columns
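# Assumes an active Oracle R Enterprise connection with the OREdplyr package loaded.
df <- data.frame(
  x = sample(10, 100, replace = TRUE),
  y = sample(10, 100, replace = TRUE)
)
# Push the local data.frame to the database as an ore.frame
DF <- ore.push(df)
# Total number of rows, then the number of distinct (x, y) rows
nrow(DF)
nrow(distinct(DF))
# Distinct values of each column, in ascending order
arrange(distinct(DF, x), x)
arrange(distinct(DF, y), y)
# Use distinct on computed variables
arrange(distinct(DF, diff = abs(x - y)), diff)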
Listing for This Example
R> df <- data.frame(
+    x = sample(10, 100, replace = TRUE),
+    y = sample(10, 100, replace = TRUE)
+  )
R> DF <- ore.push(df)
R> nrow(DF)
[1] 100
R> nrow(distinct(DF))
[1] 66
R> arrange(distinct(DF, x), x)
    x
1   1
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10 10
R> arrange(distinct(DF, y), y)
  y
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
R>
R> # Use distinct on computed variables
R> arrange(distinct(DF, diff = abs(x - y)), diff)
   diff
1     0
2     1
3     2
4     3
5     4
6     5
7     6
8     7
9     8
10    9
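To experiment with the same logic without a database connection, the following is a minimal base R sketch of what the calls in this example compute. It is not part of the OREdplyr package and is only a local analogue: unique and sort stand in for distinct and arrange, and the names distinct_rows, distinct_x, and distinct_diff, as well as the fixed seed, are assumptions made for illustration. Because the data is random, the counts will generally differ from the listing above.

```r
# Local, in-memory analogue of the example; no database connection is needed.
set.seed(1)  # assumption: a fixed seed so that repeated runs give the same data
df <- data.frame(
  x = sample(10, 100, replace = TRUE),
  y = sample(10, 100, replace = TRUE)
)

# Analogue of distinct(DF): keep one copy of each (x, y) combination
distinct_rows <- unique(df)

# Analogue of arrange(distinct(DF, x), x): unique values of x, ascending
distinct_x <- data.frame(x = sort(unique(df$x)))

# Analogue of arrange(distinct(DF, diff = abs(x - y)), diff):
# unique values of a computed column, ascending
distinct_diff <- data.frame(diff = sort(unique(abs(df$x - df$y))))

nrow(df)             # total number of rows
nrow(distinct_rows)  # number of distinct (x, y) rows
distinct_x
distinct_diff
```

Because x and y are drawn from 1 through 10, the computed diff column can only take values from 0 through 9, which is why the listing shows at most ten distinct rows for it.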