5.2.7 Rank Data
The ore.rank function analyzes the distribution of values in numeric columns of an ore.frame.
The ore.rank function supports useful functionality, including:
- Ranking within groups
- Partitioning rows into groups based on rank tiles
- Calculation of cumulative percentages and percentiles
- Treatment of ties
- Calculation of normal scores from ranks
The ore.rank function syntax is simpler than the corresponding SQL queries. The function returns an ore.frame in all instances.
You can use these R scoring methods with ore.rank:
- To compute exponential scores from ranks, use savage.
- To compute normal scores, use one of blom, tukey, or vw (van der Waerden).
For details about the function arguments, call help(ore.rank).
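The scoring methods named above can be illustrated in base R without a database connection. The following is a minimal sketch that assumes the methods follow the conventional textbook definitions of Blom, Tukey, van der Waerden, and Savage scores; the formulas are not taken from the ore.rank implementation, and the age vector is made-up sample data.

```r
# Base R sketch of conventional rank-score formulas (assumed, not taken
# from ore.rank itself). The sample values are hypothetical.
age <- c(23, 35, 35, 41, 58)
n   <- length(age)
r   <- rank(age)                 # ties get the average rank: 1, 2.5, 2.5, 4, 5

# Normal scores: each method maps a rank to a normal quantile
blom  <- qnorm((r - 3/8) / (n + 1/4))
tukey <- qnorm((r - 1/3) / (n + 1/3))
vw    <- qnorm(r / (n + 1))      # van der Waerden

# Savage (exponential) scores use integer ranks, so ties are broken
# by order of appearance here
ri     <- rank(age, ties.method = "first")
savage <- sapply(ri, function(i) sum(1 / (n - i + 1):n) - 1)
```

With integer ranks, the Savage scores sum to zero, which is a quick sanity check on the formula.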
The following examples illustrate using ore.rank. The examples use the NARROW data set.
Example 5-45 Ranking Two Columns
This example ranks the two columns AGE and CLASS and reports the results as derived columns; values are ranked in the default order, which is ascending.
x <- ore.rank(data=NARROW, var='AGE=RankOfAge, CLASS=RankOfClass')
Example 5-46 Handling Ties in Ranking
This example ranks the two columns AGE and CLASS. If there is a tie, the lowest of the tied ranks is assigned to all tied values.
x <- ore.rank(data=NARROW, var='AGE=RankOfAge, CLASS=RankOfClass', ties='low')
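The effect of a tie-handling rule is easiest to see on a small local vector. The sketch below uses base R's rank function, on the assumption that ties='low' corresponds to the minimum-rank convention (ties.method = "min" in base R); the values are made up for illustration.

```r
# Hypothetical values containing ties
age <- c(30, 25, 30, 41, 25, 30)

low  <- rank(age, ties.method = "min")      # 3 1 3 6 1 3
avg  <- rank(age, ties.method = "average")  # 4.0 1.5 4.0 6.0 1.5 4.0
high <- rank(age, ties.method = "max")      # 5 2 5 6 2 5
```

The three tied values of 30 occupy sorted positions 3 through 5, so the minimum-rank rule assigns 3 to each of them.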
Example 5-47 Ranking by Groups
This example ranks the two columns AGE and CLASS separately within each group defined by COUNTRY.
x <- ore.rank(data=NARROW, var='AGE=RankOfAge, CLASS=RankOfClass', group.by='COUNTRY')
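As a local analogue of ranking within groups, the following base R sketch ranks a column separately inside each group by using ave. The data frame is hypothetical and only mimics the COUNTRY and AGE columns; it is not the NARROW data set, and the code does not call ore.rank.

```r
# Hypothetical local data frame standing in for two columns of NARROW
df <- data.frame(
  COUNTRY = c("US", "US", "US", "FR", "FR"),
  AGE     = c(40, 25, 33, 51, 29)
)

# Rank AGE within each COUNTRY; ranks restart at 1 in every group
df$RankOfAge <- ave(df$AGE, df$COUNTRY, FUN = rank)
df$RankOfAge   # 3 1 2 2 1
```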
Example 5-48 Partitioning into Deciles
This example ranks the two columns AGE and CLASS and partitions the columns into deciles (10 partitions). To partition the columns into a different number of partitions, change the value of groups. For example, groups=4 partitions into quartiles.
x <- ore.rank(data=NARROW, var='AGE=RankOfAge, CLASS=RankOfClass', groups=10)
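To make the idea of rank tiles concrete, the sketch below assigns values to k groups with one common convention, floor(rank * k / (n + 1)), which numbers groups from 0 to k - 1. Whether ore.rank uses this exact formula and numbering is an assumption; the sample values are made up, and k = 4 (quartiles) keeps the output short.

```r
# Hypothetical values; 12 observations split into k = 4 rank groups
age <- c(18, 22, 25, 31, 34, 38, 42, 47, 53, 60, 66, 71)
n   <- length(age)
k   <- 4

# One common tile convention (assumed here): groups numbered 0 .. k - 1
grp <- floor(rank(age) * k / (n + 1))
table(grp)   # three observations in each of groups 0, 1, 2, 3
```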
Example 5-49 Estimating Cumulative Distribution Function
This example ranks the two columns AGE and CLASS and estimates the cumulative distribution function for both columns.
x <- ore.rank(data=NARROW, var='AGE=RankOfAge, CLASS=RankOfClass', nplus1=TRUE)
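The link between ranks and a cumulative distribution estimate can be checked locally. In the base R sketch below, dividing the highest tied rank by n reproduces the empirical distribution function, and dividing by n + 1 instead keeps every estimate strictly below 1. Treating nplus1=TRUE as the n + 1 denominator is an assumption based on the argument name; the values are made up.

```r
# Hypothetical values containing one tie
age <- c(23, 35, 35, 41, 58)
n   <- length(age)
r   <- rank(age, ties.method = "max")   # 1 3 3 4 5

cdf_n   <- r / n          # 0.2 0.6 0.6 0.8 1.0, same as ecdf(age)(age)
cdf_np1 <- r / (n + 1)    # same ordering, but the largest value stays below 1
```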
Example 5-50 Scoring Ranks
This example ranks the two columns AGE and CLASS and scores the ranks in two different ways. The first command groups the rows by COUNTRY, partitions the columns into percentiles (100 groups), and uses savage scoring, which calculates exponential scores. The second command uses blom scoring, which calculates normal scores.
x <- ore.rank(data=NARROW, var='AGE=RankOfAge, CLASS=RankOfClass', score='savage', groups=100, group.by='COUNTRY')
x <- ore.rank(data=NARROW, var='AGE=RankOfAge, CLASS=RankOfClass', score='blom')