Quickstats Profiler

The Quickstats Profiler provides fundamental quality metrics for a number of records or transactions, highlighting:

  • Candidate key columns

  • Completeness and missing data

  • Duplication

  • Uniqueness and diversity of values

Each input attribute is profiled individually.

Quickstats is useful for establishing a picture of the fundamentals of the data and its quality.

Documentation and metadata are often missing, incomplete, out of date, or not trusted. It is therefore important to build an unequivocal picture of the data from the data itself, so that false assumptions do not lead to inadvertent mistakes.

The following table describes the configuration options:

Configuration    Description

Inputs           Specify any attributes from which you want to obtain quick profiling statistics.

Options          None.

Outputs          Describes any data attribute or flag attribute outputs.

Data Attributes  None.

Flags            The following flag is output:

                   • [Attribute name].Populated: indicates which reports are populated. Possible values are Y/N.

The Quickstats Profiler requires a batch of records to produce its statistics (for example, to tell how many duplicate values there are for each attribute analyzed). It must therefore run to completion before its results are available, and is not suitable for a process that requires a real-time response.

When executed against a batch of transactions from a real-time data source, it finishes processing when the commit point (transaction or time limit) configured on the Read Processor is reached.

The following table describes the statistics produced by the profiler for each attribute:

Statistic     Description

With data     The number of records with data in that attribute.

Without data  The number of records without data in that attribute. This includes records that had a
              NULL value, and those that contained other types of No Data, such as only white space or
              non-printing characters. Drill down on the number to see a breakdown of the types of No
              Data found.

Singletons    The number of records with values that were found only once in that attribute.

Duplicates    The number of records with values that were found more than once in that attribute.

Distinct      The number of different values that were found in the attribute. Drill down on the number
              to see a breakdown of these values by their frequency of occurrence.

Comments      Automated comments based on the findings of the Quickstats profiler. See below.

Click the Additional Information button to show the above statistics as percentages of the total number of records analyzed.
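The statistics described above can be approximated outside EDQ to make their definitions concrete. The following is a minimal Python sketch, not EDQ's implementation: the function names (has_data, quickstats, as_percentages) are hypothetical, and the rule used for No Data (None, empty strings, and strings made up only of white space or non-printing characters) is an assumption based on the description of the Without data statistic.

```python
from collections import Counter


def has_data(value):
    """Assumed No Data rule: None, empty, white space only, or non-printing only."""
    if value is None:
        return False
    text = str(value)
    return any(ch.isprintable() and not ch.isspace() for ch in text)


def quickstats(values):
    """Profile one attribute, given the list of its values across all records."""
    populated = [v for v in values if has_data(v)]
    counts = Counter(populated)  # distinct value -> number of records holding it
    total = len(values)
    return {
        "total": total,
        "with_data": len(populated),
        "without_data": total - len(populated),
        # Records whose value occurs exactly once in the attribute.
        "singletons": sum(1 for c in counts.values() if c == 1),
        # Records whose value occurs more than once in the attribute.
        "duplicates": sum(c for c in counts.values() if c > 1),
        "distinct": len(counts),
    }


def as_percentages(stats):
    """Express each statistic as a percentage of the total number of records."""
    total = stats["total"]
    return {
        name: (100.0 * count / total if total else 0.0)
        for name, count in stats.items()
        if name != "total"
    }


if __name__ == "__main__":
    sample = ["A", "B", "B", None, "  ", "C", "B"]
    print(quickstats(sample))
    print(as_percentages(quickstats(sample)))
```

Note that Singletons and Duplicates count records, while Distinct counts values, so Singletons plus Duplicates equals With data but is generally larger than Distinct.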

Automated Comments

Automated Comments are generated in order to highlight potential areas of interest in the data. For example:

  • Where an attribute is 100% complete and unique, it is identified as a possible key

  • Where an attribute is nearly 100% complete and unique, it is highlighted as a possibly damaged key

  • Where an attribute is nearly 100% complete (suggesting blanks are not expected), the comment prompts the user to investigate nulls

  • Where an attribute is nearly 100% unique (suggesting duplicates are not expected), the comment prompts the user to investigate duplicates

  • Where an attribute has only one distinct value, the comment suggests that the attribute may be redundant

Note that where more than one of the above comments applies, the comments are concatenated.
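The rules listed above can be sketched as a small rule set. This is an illustration only: the source does not state what "nearly 100%" means, so the 99% threshold (the near parameter) is an assumption, as are the definition of uniqueness (singletons as a share of records with data), the function name automated_comments, and the wording of the comment strings.

```python
def automated_comments(stats, near=0.99):
    """Return concatenated comments for one attribute.

    `stats` is a dict with the keys total, with_data, singletons and distinct.
    `near` is an assumed threshold for "nearly 100%"; EDQ's actual value is not
    given in the source.
    """
    total = stats["total"]
    with_data = stats["with_data"]
    singletons = stats["singletons"]
    if total == 0:
        return ""

    complete = with_data == total
    unique = with_data > 0 and singletons == with_data
    nearly_complete = not complete and with_data >= near * total
    nearly_unique = not unique and with_data > 0 and singletons >= near * with_data

    comments = []
    if complete and unique:
        comments.append("Possible key")
    elif (complete or nearly_complete) and (unique or nearly_unique):
        comments.append("Possibly damaged key")
    if nearly_complete:
        comments.append("Investigate nulls")
    if nearly_unique:
        comments.append("Investigate duplicates")
    if stats["distinct"] == 1:
        comments.append("Attribute may be redundant")
    # Where several rules apply, the comments are concatenated.
    return "; ".join(comments)


if __name__ == "__main__":
    # Hypothetical attribute: 2001 records, one blank, three records sharing a value.
    example = {"total": 2001, "with_data": 2000, "singletons": 1997, "distinct": 1998}
    print(automated_comments(example))
```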

Example

In this example, the Quickstats Profiler is used to gain an initial overview of a table of Customer records.

Table 1-123 Quickstats Profiler Example

Input Field        Record Total  With Data  Without Data  Singletons  Duplications  Distinct Values

CU_NO                      2001       2000             1        1997             3             1998
CU_ACCOUNT                 2001       2000             1        2000             0             2000
TITLE                      2001       1862           139           3          1859                8
NAME                       2001       2000             1        1980            20             1990
GENDER                     2001       1853           148           0          1853                2
BUSINESS                   2001       1670           331        1629            41             1649
ADDRESS1                   2001       1999             2        1926            73             1954
ADDRESS2                   2001       1921            80         554          1367              839
ADDRESS3                   2001       1032           969         278           754              379
POSTCODE                   2001       1762           239        1604           158             1672
AREA_CODE                  2001       1884           117          64          1820              270
TEL_NO                     2001       1994             7        1875           119             1934
EMAIL                      2001       1936            65        1904            32             1920
ACC_MGR                    2001       1996             5           0          1996               30
DT_PURCHASED               2001       1998             3        1090           908             1499
DT_ACC_OPEN                2001       1998             3        1093           905             1500
DT_LAST_PAYMENT            2001       1997             4        1026           971             1425
DT_LAST_PO_RAISED          2001       1998             3        1003           995             1433
BALANCE                    2001       1999             2           7          1992               10
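The figures in the example follow two relationships implied by the definitions of the statistics: With Data plus Without Data equals the record total, and Singletons plus Duplications equals With Data. The sketch below checks a few rows copied from the table and derives two further figures. The function name check_row and the derived field names are hypothetical, and the percentage is simply With Data divided by the record total.

```python
# Rows copied from the example table:
# (record total, with data, without data, singletons, duplications, distinct values)
ROWS = {
    "CU_NO": (2001, 2000, 1, 1997, 3, 1998),
    "TITLE": (2001, 1862, 139, 3, 1859, 8),
    "BUSINESS": (2001, 1670, 331, 1629, 41, 1649),
    "POSTCODE": (2001, 1762, 239, 1604, 158, 1672),
}


def check_row(total, with_data, without_data, singletons, duplicates, distinct):
    """Verify the identities for one row and derive two extra figures."""
    return {
        # Both identities follow from the definitions of the statistics.
        "consistent": (
            with_data + without_data == total
            and singletons + duplicates == with_data
        ),
        # Share of records that hold data in the attribute.
        "completeness_pct": round(100.0 * with_data / total, 2),
        # Distinct values that occur more than once.
        "repeated_values": distinct - singletons,
    }


if __name__ == "__main__":
    for name, row in ROWS.items():
        print(name, check_row(*row))
```

For CU_NO, for instance, 1998 distinct values minus 1997 singletons leaves one value that is shared by the three duplicate records.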

In most cases, drilling down on the numbers in the Summary View will take you directly to the records. However, some numbers take you to an interim view.

  • If you drill down on the 41 duplicate BUSINESS values in the Summary View, EDQ shows the frequency of each duplicate value.

  • If you drill down on the 8 distinct TITLE values in the Summary View, EDQ shows the frequency of each distinct value.

  • If you drill down on the 239 POSTCODE values without any data in the Summary View, EDQ shows a summary view of the different types of No Data found (though note that all of these will be Null values if the default No Data Handling Reference Data map is used in the Snapshot).
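The interim views described above are, in effect, frequency breakdowns. The following Python sketch shows one way such breakdowns could be computed. It is not EDQ's logic: the No Data category names and the classification order are assumptions, and the function names (no_data_type, no_data_breakdown, duplicate_frequencies) are hypothetical.

```python
from collections import Counter


def no_data_type(value):
    """Classify a value into an assumed No Data category, or None if it has data."""
    if value is None:
        return "Null"
    text = str(value)
    if text == "":
        return "Empty string"
    if all(ch.isspace() for ch in text):
        return "White space only"
    if all(ch.isspace() or not ch.isprintable() for ch in text):
        return "Non-printing characters"
    return None  # the value contains data


def no_data_breakdown(values):
    """Count the records without data, grouped by No Data category."""
    kinds = (no_data_type(v) for v in values)
    return Counter(kind for kind in kinds if kind is not None)


def duplicate_frequencies(values):
    """Return each value found more than once, with its number of occurrences."""
    counts = Counter(v for v in values if no_data_type(v) is None)
    return {value: count for value, count in counts.most_common() if count > 1}


if __name__ == "__main__":
    sample = ["Ltd", None, "", "  ", "\x00", "Ltd", "Inc", None]
    print(no_data_breakdown(sample))
    print(duplicate_frequencies(sample))
```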