Oracle PL/SQL: CLUSTER SET - CLUSTER SET returns a varray of objects containing all possible clusters that a given row belongs to

Quick Search:

Oracle PL/SQL: CLUSTER_SET

Jump to:

Category: >> Oracle PL/SQL >> CLUSTER_SET

<< last

next >>

Snippet Name: CLUSTER_SET

Description: CLUSTER_SET returns a varray of objects containing all possible clusters that a given row belongs to. Each object in the varray is a pair of scalar values containing the cluster ID and the cluster probability. The object fields are named CLUSTER_ID and PROBABILITY, and both are Oracle NUMBER.

For the optional topN argument, specify a positive integer. Doing so restricts the set of predicted clusters to those that have one of the top N probability values. If you omit topN or set it to NULL, then all clusters are returned in the collection. If multiple clusters are tied for the Nth value, the database still returns only N values.

For the optional cutoff argument, specify a positive integer to restrict the returned clusters to those with a probability greater than or equal to the specified cutoff. You can filter only by cutoff by specifying NULL for topN and the desired cutoff value for cutoff.

You can specify topN and cutoff together to restrict the returned clusters to those that are in the top N and have a probability that passes the threshold.

This example, and the prerequisite data mining operations, including the creation of the dm_sh_clus_sample model and the views and type, can be found in the demo file $ORACLE_HOME/rdbms/demo/dmkmdemo.sql.

Also see:
» PREDICTION_SET
» PREDICTION_PROBABILITY
» PREDICTION_DETAILS
» PREDICTION_COST
» PREDICTION_BOUNDS
» PREDICTION
» FEATURE_SET
» FEATURE_ID
» CLUSTER_PROBABILITY
» CLUSTER_ID

Comment: (none)

Language: PL/SQL
Highlight Mode: PLSQL
Last Modified: March 07th, 2009

CLUSTER_SET(<schema.model>, <top N>, <cutoff>
<mining_attribute_clause>)
 
WITH
clus_tab AS (
SELECT id,
       A.attribute_name aname,
       A.conditional_operator op,
       NVL(A.attribute_str_value,
         ROUND(DECODE(A.attribute_name, N.col,
                      A.attribute_num_value * N.scale + N.shift,
                      A.attribute_num_value),4)) val,
       A.attribute_support support,
       A.attribute_confidence confidence
  FROM TABLE(DBMS_DATA_MINING.GET_MODEL_DETAILS_KM('km_sh_clus_sample')) T,
       TABLE(T.rule.antecedent) A,
       km_sh_sample_norm N
 WHERE A.attribute_name = N.col (+) AND A.attribute_confidence > 0.55
),
clust AS (
SELECT id,
       CAST(COLLECT(Cattr(aname, op, TO_CHAR(val), support, confidence))
         AS Cattrs) cl_attrs
  FROM clus_tab
GROUP BY id
),
custclus AS (
SELECT T.cust_id, S.cluster_id, S.probability
  FROM (SELECT cust_id, CLUSTER_SET(km_sh_clus_sample, NULL, 0.2 USING *) pset
          FROM km_sh_sample_apply_prepared
         WHERE cust_id = 101362) T,
       TABLE(T.pset) S
)
SELECT A.probability prob, A.cluster_id cl_id,
       B.attr, B.op, B.val, B.supp, B.conf
  FROM custclus A,
       (SELECT T.id, C.*
          FROM clust T,
               TABLE(T.cl_attrs) C) B
 WHERE A.cluster_id = B.id
ORDER BY prob DESC, cl_id ASC, conf DESC, attr ASC, val ASC;
 
   PROB      CL_ID ATTR            OP  VAL                   SUPP    CONF
------- ---------- --------------- --- --------------- ---------- -------
  .7873          8 HOUSEHOLD_SIZE  IN  9+                     126   .7500
  .7873          8 CUST_MARITAL_ST IN  Divorc.                118   .6000
                   ATUS
 
  .7873          8 CUST_MARITAL_ST IN  NeverM                 118   .6000
                   ATUS
 
  .7873          8 CUST_MARITAL_ST IN  Separ.                 118   .6000
                   ATUS
 
  .7873          8 CUST_MARITAL_ST IN  Widowed                118   .6000
                   ATUS
 
  .2016          6 AGE             >=  17                     152   .6667
  .2016          6 AGE             <=  31.6                   152   .6667
  .2016          6 CUST_MARITAL_ST IN  NeverM                 168   .6667
                   ATUS
 
8 rows selected.