Run the Post-Processing Steps

Perform the following post-processing steps once the model is trained:
  1. Store the trained model in a datastore so that it can be reused later without retraining.

    Use the oml.ds.save function, passing a dictionary that maps a name to the model object, followed by the name of the datastore, as shown in the following code:

    oml.ds.save({'lisa_model': lisa_model}, 'spatial_ai_ds', description='Hotspot Clustering for Median Income', overwrite=True)
    print(oml.ds.dir())

    The call to oml.ds.dir() lists the available datastores. Locate the newly created datastore, spatial_ai_ds, in the output:

            datastore_name  ...                           description
    0     agglomerative_ds  ...                                      
    1  dbscan_accidents_ds  ...                                      
    2     sai_regressor_ds  ...                      some description
    3              spatial  ...                                  None
    4        spatial_ai_ds  ...  Hotspot Clustering for Median Income
    5     spatial_error_ds  ...                      some description
    
    [6 rows x 5 columns]
  2. Load the model from a datastore.

    Use the oml.ds.load function to load the model, specifying the name of the datastore and the name of the Python object that holds the trained model. With to_globals=False, the function returns a dictionary of the loaded objects keyed by name.

    ds_objs = oml.ds.load('spatial_ai_ds', objs=['lisa_model'], to_globals=False)
    lisa_model_loaded = ds_objs['lisa_model']
    
    print(lisa_model_loaded._labels[:10])

    After loading the trained clustering model from the datastore, use the _labels property to obtain the label assigned to each observation. The preceding code prints the labels of the first ten observations:

    [ 2  2  1  1  1  1  1 -1 -1 -1]
  3. Create and store a user-defined Python function (UDF) that loads the trained model from a datastore and returns the labels assigned to the training data.

    Define the function source as a string and register it with OML under the name lisaLabels using oml.script.create.

    udf = """def get_lisa_labels_():
        import oml
        ds_objs = oml.ds.load('spatial_ai_ds', objs=['lisa_model'], to_globals=False)
        lisa_model = ds_objs['lisa_model']
        
        return lisa_model._labels.tolist()""" 
    
    oml.script.create("lisaLabels", udf, is_global=True, overwrite=True)
  4. Run the Python UDF from SQL.

    The following code uses pyqEval to run the Python UDF lisaLabels in SQL.

    select *  
        from table(pyqEval(
            par_lst => '{}', 
            out_fmt => 'JSON',  
            scr_name => 'lisaLabels'  
        )  
    );

    The response contains the labels that the clustering algorithm assigned to all the observations. For brevity, the following output shows only the first ten labels.

    [2,2,1,1,1,1,1,-1,-1,-1,…]
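
Any client can consume the JSON array that pyqEval returns. As a minimal sketch, assuming the response has already been fetched as a string, the following Python code parses the array and counts how many observations received each label. The helper name summarize_labels and the variable sample_response are hypothetical and not part of OML. The sample string contains only the first ten labels shown above, so the counts describe that excerpt and not the full result.

```python
import json
from collections import Counter


def summarize_labels(json_text):
    """Parse a JSON array of cluster labels and count observations per label.

    Hypothetical helper for illustration; it is not part of the OML API.
    """
    labels = json.loads(json_text)
    return dict(Counter(labels))


# Assumed input: the first ten labels from the pyqEval output, as a JSON string.
sample_response = "[2, 2, 1, 1, 1, 1, 1, -1, -1, -1]"

counts = summarize_labels(sample_response)

# Print one line per label, in ascending label order.
for label in sorted(counts):
    print(label, counts[label])
```

The same function works on the full response because it makes no assumption about the number of labels or about which label values occur.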