5.3.6 Rank Rows
OREdplyr provides functions for ranking rows.
The ranking functions rank the elements in an ordered ore.vector by its values. An ore.character is coerced to an ore.factor, and an ore.factor is ranked by its factor levels. To reverse the direction of the ranking, use the desc function.
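The effect of ranking by factor levels can be sketched locally in base R. This is only an analogue that runs without a database connection: the vector sizes and the variable names are illustrative, and negating the integer level codes stands in for desc here as an assumption about its effect, not a statement of how OREdplyr implements it.

```r
# Base R analogue only; no OREdplyr objects are involved.
sizes <- c("medium", "small", "large", "small")

# Default coercion sorts the levels alphabetically: large, medium, small.
f_default <- factor(sizes)
codes_default <- as.integer(f_default)

# Explicit levels make the ranking follow the intended order instead.
f_ordered <- factor(sizes, levels = c("small", "medium", "large"))
codes <- as.integer(f_ordered)

# Ranking the level codes ranks the factor by its levels.
rank_asc <- rank(codes, ties.method = "min")

# Negating the codes reverses the direction (stand-in for desc).
rank_desc <- rank(-codes, ties.method = "min")

print(codes_default)  # 2 3 1 3
print(rank_asc)       # 3 1 4 1
print(rank_desc)      # 2 3 1 3
```

With the default alphabetical levels, "large" ranks first, which is rarely what is wanted for ordered categories, so setting the levels explicitly before ranking is usually worth the extra line.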
Table 5-7 Ranking Rows
Function | Description |
---|---|
cume_dist | A cumulative distribution function: returns the proportion of all values that are less than or equal to the current rank. |
dense_rank | Like min_rank, but with no gaps between ranks. |
first | Gets the first value from an ordered ore.vector. |
last | Gets the last value from an ordered ore.vector. |
min_rank | Equivalent to rank(ties.method = "min"). |
nth | Obtains the value at the specified position in the order. |
ntile | A rough ranking that breaks the input vector into n buckets. |
 | Gets the nth value from an ordered ore.vector. |
percent_rank | Returns a number between 0 and 1 that is computed by rescaling min_rank to [0, 1]. |
row_number | Equivalent to rank(ties.method = "first"). |
top_n | Selects the top or bottom number of rows. |
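The ranking definitions can be made concrete with a small base R sketch that runs locally, without OREdplyr or a database connection. In dplyr terms, row_number corresponds to rank(ties.method = "first") and min_rank to rank(ties.method = "min"); the other helpers are built on those. All *_local names are hypothetical helpers written for this illustration, the input has no missing values because NA handling may differ in OREdplyr, and the ntile rule shown is an assumed bucketing scheme, since the table does not say how buckets are assigned.

```r
# Base R sketch of the ranking definitions; *_local names are hypothetical.
x <- c(5, 1, 3, 2, 2)

# Ties broken by position: every element gets a distinct rank.
row_number_local <- function(v) rank(v, ties.method = "first")

# Ties share the smallest rank, leaving gaps after them.
min_rank_local <- function(v) rank(v, ties.method = "min")

# Ties share a rank and no gaps are left between ranks.
dense_rank_local <- function(v) match(v, sort(unique(v)))

# min_rank rescaled to the interval [0, 1].
percent_rank_local <- function(v) (min_rank_local(v) - 1) / (length(v) - 1)

# Proportion of values less than or equal to the current value.
cume_dist_local <- function(v) rank(v, ties.method = "max") / length(v)

# Assumed rule: spread the row numbers evenly over n buckets.
ntile_local <- function(v, n) {
  as.integer(floor(n * (row_number_local(v) - 1) / length(v) + 1))
}

print(row_number_local(x))    # 5 1 4 2 3
print(min_rank_local(x))      # 5 1 4 2 2
print(dense_rank_local(x))    # 4 1 3 2 2
print(percent_rank_local(x))  # 1.00 0.00 0.75 0.25 0.25
print(cume_dist_local(x))     # 1.0 0.2 0.8 0.6 0.6
print(ntile_local(x, 2))      # 2 1 2 1 1
```

The two tied values of 2 show the difference between the methods: row_number separates them, min_rank gives both rank 2 and skips rank 3, and dense_rank gives both rank 2 and continues with rank 3.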
Example 5-81 Ranking Rows
These examples use the ranking functions row_number, min_rank, dense_rank, percent_rank, cume_dist, and ntile.
X <- ore.push(c(5, 1, 3, 2, 2, NA))
row_number(X)
row_number(desc(X))
min_rank(X)
dense_rank(X)
percent_rank(X)
cume_dist(X)
ntile(X, 2)
ntile(ore.push(runif(100)), 10)
MTCARS <- ore.push(mtcars)
by_cyl <- group_by(MTCARS, cyl)
# Using ranking functions with an ore.frame
head(mutate(MTCARS, rank = row_number(hp)))
head(mutate(MTCARS, rank = min_rank(hp)))
head(mutate(MTCARS, rank = dense_rank(hp)))
# Using ranking functions with a grouped ore.frame
head(mutate(by_cyl, rank = row_number(hp)))
head(mutate(by_cyl, rank = min_rank(hp)))
head(mutate(by_cyl, rank = dense_rank(hp)))
Listing for This Example
R> X <- ore.push(c(5, 1, 3, 2, 2, NA))
R>
R> row_number(X)
[1] 5 1 4 2 3 6
R> row_number(desc(X))
[1] 1 5 2 3 4 6
R>
R> min_rank(X)
[1] 5 1 4 2 2 6
R>
R> dense_rank(X)
[1] 4 1 3 2 2 6
R>
R> percent_rank(X)
[1] 0.8 0.0 0.6 0.2 0.2 1.0
R>
R> cume_dist(X)
[1] 0.8333333 0.1666667 0.6666667 0.5000000 0.5000000 1.0000000
R>
R> ntile(X, 2)
[1] 2 1 2 1 1 2
R> ntile(ore.push(runif(100)), 10)
[1] 6 10 5 2 1 1 8 3 8 8 7 3 10 3 7 9 9 4 4 10 10 7 2 3 7 4 5 5 3 9 4 6 8 4 10 6 1 5 5 4 6 9
[43] 5 8 2 7 7 1 2 9 1 2 8 5 6 5 3 4 7 1 3 1 10 1 5 5 10 9 2 3 9 6 6 8 8 6 3 7 2 2 8 4 1 9
[85] 6 10 4 10 7 2 9 10 7 2 4 9 6 3 8 1
R>
R> MTCARS <- ore.push(mtcars)
R> by_cyl <- group_by(MTCARS, cyl)
R>
R> # Using ranking functions with an ore.frame
R> head(mutate(MTCARS, rank = row_number(hp)))
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb rank
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   12
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   13
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    7
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   14
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   20
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   10
R>
R> head(mutate(MTCARS, rank = min_rank(hp)))
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb rank
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   12
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   12
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    7
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   12
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   20
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   10
R>
R> head(mutate(MTCARS, rank = dense_rank(hp)))
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb rank
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   11
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   11
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    6
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   11
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   15
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1    9
R>
R> # Using ranking functions with a grouped ore.frame
R> head(mutate(by_cyl, rank = row_number(hp)))
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb rank
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4    2
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4    3
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    7
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1    4
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2    3
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1    1
R>
R> head(mutate(by_cyl, rank = min_rank(hp)))
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb rank
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4    2
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4    2
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    7
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1    2
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2    3
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1    1
R>
R> head(mutate(by_cyl, rank = dense_rank(hp)))
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb rank
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4    2
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4    2
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1    6
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1    2
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1    1
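The grouped ranks in the listing can be cross-checked locally with a base R sketch. It uses the built-in mtcars data frame rather than an ore.frame, and ave() to apply a ranking function within each cyl group. The variable names are illustrative, and this is an approximation of what the grouped mutate computes, not the OREdplyr implementation.

```r
# Base R cross-check on the local mtcars data; no database connection.
# ave() applies FUN separately to the hp values of each cyl group.
grouped_min_rank <- ave(
  mtcars$hp, mtcars$cyl,
  FUN = function(v) rank(v, ties.method = "min")
)
grouped_dense_rank <- ave(
  mtcars$hp, mtcars$cyl,
  FUN = function(v) match(v, sort(unique(v)))
)

# Collect the results next to the grouping and ranking columns.
check <- data.frame(
  cyl = mtcars$cyl,
  hp = mtcars$hp,
  min_rank = grouped_min_rank,
  dense_rank = grouped_dense_rank,
  row.names = rownames(mtcars)
)

print(head(check))
```

For the six rows shown by head, the min_rank and dense_rank columns agree with the rank values in the grouped listing, for example 7 and 6 for Datsun 710 within the four-cylinder group.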