clustering
This example shows how to run agglomerative clustering with regionalization over a given dataset, specifying the number of clusters and the type of spatial weights.
The clustering algorithm is set in the method parameter, while the number of clusters and the spatial weights are defined in the n_clusters and weights_def parameters, respectively. The features considered for clustering are specified in the columns parameter.
select *
from table(
  pyqEval(
    '{
      "oml_connect": true,
      "table": "oml_user.la_block_groups",
      "columns": ["median_income"],
      "method": "AGGLOMERATIVE",
      "n_clusters": 6,
      "key_column": "geoid",
      "weights_def": {"type": "Queen"}
    }',
    '{ "geoid": "VARCHAR2(50)", "label": "NUMBER" }',
    'clustering'
  )
);
The result contains the key column specified in the key_column parameter and a label for each row, indicating the cluster to which that row belongs.
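The first argument of the query is plain JSON, so it can be generated rather than typed by hand when the same clustering is run over several tables or cluster counts. The following is a minimal Python sketch of that idea. The helper name build_clustering_params is hypothetical and is not part of any Oracle API; it only assembles the same keys used in the query above.

```python
import json


def build_clustering_params(table, columns, key_column, n_clusters, weights_type="Queen"):
    """Return the JSON text used as the first pyqEval argument (hypothetical helper)."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be a positive integer")
    params = {
        "oml_connect": True,
        "table": table,
        "columns": list(columns),              # features considered for clustering
        "method": "AGGLOMERATIVE",
        "n_clusters": n_clusters,
        "key_column": key_column,              # returned alongside each label
        "weights_def": {"type": weights_type}, # spatial weights definition
    }
    return json.dumps(params)


par_lst = build_clustering_params(
    table="oml_user.la_block_groups",
    columns=["median_income"],
    key_column="geoid",
    n_clusters=6,
)
print(par_lst)
```

The printed string can be pasted between the single quotes of the first pyqEval argument. Using json.dumps guarantees that booleans are written as true and false, as JSON requires.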
You can visualize the clusters using the select IMAGE clause with the oml_graphics_flag parameter set to true. In the following code, the plot parameter requests a basemap as the background. Also, note that the output format (out_fmt) is set to PNG.
select IMAGE
from table(
  pyqEval(
    par_lst => '{
      "oml_connect": true,
      "oml_graphics_flag": true,
      "table": "oml_user.la_block_groups",
      "columns": ["median_income"],
      "method": "AGGLOMERATIVE",
      "n_clusters": 6,
      "key_column": "geoid",
      "weights_def": {"type": "Queen"},
      "plot": {"with_basemap": true}
    }',
    out_fmt => 'PNG',
    scr_name => 'clustering'
  )
);
The result is a map with the observations colored according to the cluster to which they are assigned. Note that there are six clusters, as specified in the n_clusters parameter. Because spatial weights are defined, the agglomerative clustering algorithm performs regionalization. This means that observations assigned to the same cluster share common characteristics and are geographically connected.
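To illustrate what the connectivity constraint does, the following self-contained Python sketch runs a simplified regionalization over a small grid. It is not the implementation behind the clustering script: the function names queen_neighbors and regionalize are hypothetical, the data is made up for the example, and the merge rule (join the two adjacent clusters with the closest mean value) is an assumption chosen for brevity. Queen contiguity here means that two grid cells are neighbors if they share an edge or a corner.

```python
from itertools import product


def queen_neighbors(rows, cols):
    """Map each grid cell index to the cells sharing an edge or a corner with it."""
    neighbors = {}
    for r, c in product(range(rows), range(cols)):
        adjacent = set()
        for dr, dc in product((-1, 0, 1), repeat=2):
            rr, cc = r + dr, c + dc
            if (dr, dc) != (0, 0) and 0 <= rr < rows and 0 <= cc < cols:
                adjacent.add(rr * cols + cc)
        neighbors[r * cols + c] = adjacent
    return neighbors


def regionalize(values, neighbors, n_clusters):
    """Merge adjacent clusters with the closest means until n_clusters remain."""
    members = {i: [i] for i in range(len(values))}         # cluster id -> cell ids
    links = {i: set(adj) for i, adj in neighbors.items()}  # cluster id -> adjacent cluster ids

    def mean(cluster_id):
        cells = members[cluster_id]
        return sum(values[i] for i in cells) / len(cells)

    while len(members) > n_clusters:
        # Only pairs of clusters that touch each other are merge candidates.
        candidates = [(abs(mean(a) - mean(b)), a, b)
                      for a in links for b in links[a] if a < b]
        if not candidates:  # disconnected graph: no further merges are possible
            break
        _, keep, drop = min(candidates)
        members[keep].extend(members.pop(drop))
        # Re-point every neighbor of the dropped cluster to the merged cluster.
        for other in links.pop(drop):
            links[other].discard(drop)
            if other != keep:
                links[other].add(keep)
                links[keep].add(other)

    labels = [0] * len(values)
    for label, cells in enumerate(members.values()):
        for i in cells:
            labels[i] = label
    return labels


# Made-up 4x4 grid: low values in the two left columns, high values on the right.
values = [
    10, 11, 50, 52,
    12, 10, 51, 53,
    11, 12, 52, 50,
    10, 11, 53, 51,
]
labels = regionalize(values, queen_neighbors(4, 4), n_clusters=2)
for row in range(4):
    print(labels[row * 4:(row + 1) * 4])
```

With this data, the two left columns form one cluster and the two right columns form the other. Because only neighboring clusters are ever merged, every resulting cluster is a single connected region, which is the property that distinguishes regionalization from unconstrained clustering.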