![]() Previous |
![]() Next |
The DEFINE command with the VARIABLE keyword adds a new variable object to an analytic workspace. Variables store one type of data, which can be numeric, text, Boolean, or dates. Beside the data type of a variable, the definition that you create for a variable also determines the following characteristics of the variable:
The number of elements that are actually created in the array that is the variable.
The logical order of the variable's elements.
Whether the variable's data is stored permanently or is only available for the session.
The number of LOBs that Oracle OLAP creates for the variable's data.
You can also define local program variables using a VARIABLE command. These variables exist only when the program is running.
Note: Defining a variable merely adds the definition of the variable to the analytic workspace; it does not populate the variable. To populate variables using the OLAP DML, you can issue OLAP DML SQL, FILEREAD, SET, or SET1 statements. |
Syntax
DEFINE name [VARIABLE] datatype [<dims...>] [WITH NULLTRACKING] [WITH AGGCOUNT] -
[PERMANENT | TEMP ] -
[ RANSPACE64] [(partition-instance...)] [WIDTH n] [AW workspace] [SESSION]
where:
dims are the dimensions of the variable separated by commas. For a dimension of a variable you can specify a dimension object, a partition template object, a named uncompressed composite, a compressed composite, or an unnamed uncompressed composite using one of the following:
Note: The order in which you list the dims of a variable is the default order of the dimensions and behavior of a variety of statements (such as REPORT and UNRAVEL) and affects how the data for the variable is stored (as discussed in "Effect of Dimension Order on Variable Storage and Statement Looping". Also, when you define multiple objects with the same dimensions, most operations work much more efficiently when you list the dimensions in the same order in each definition. |
partition-instance are the partitions of the variable separated by commas. Use the following syntax to specify a partition.
PARTITION partition-name INTERNAL [TEMP | PERMANENT]
Parameters
The name of the variable you are defining. For general information about this argument, see the main entry for the DEFINE command.
The object type when you are defining a variable. You do not have to include the word VARIABLE, because it is the default.
The data type of the data to be stored in the variable. The data types, their abbreviations, and the range of acceptable values are shown in Table: Summary of OLAP DML Data Types.
The name of a simple, concat, conjoint, or alias dimension that you have previously defined using a DEFINE DIMENSION statement. In this case, you specify the name of the dimension.
When defining a TEXT
, NTEXT
, or RAW
variable, specify this keyword to increase the maximum number of characters for the values of the variable from nearly 2**32
to nearly 2**64
.
The name of a partition template object that you have previously defined using a DEFINE PARTITION TEMPLATE statement. For dims, specify the names of the dimensions of the partition template object. These dimensions must be the same dimensions as those used to define the partition template object.
The name of an uncompressed composite previously defined using a DEFINE COMPOSITE statement. For the optional basedims argument, specify the names, separated by commas, of the dimensions used to define the composite.
The name of a compressed composite previously defined using a DEFINE COMPOSITE statement. For the optional basedims argument, specify the names, separated by commas, of the dimensions used to define the composite.
When defining a variable that is dimensioned by a compressed composite, keep the following points in mind:
A compressed composite can dimension only one variable or one partition of a variable. A compressed composite cannot be a shared composite.
The compressed composite must be the last dimension in the variable's dimension list of the DEFINE VARIABLE statement that defines the variable.
Indicates that you want Oracle OLAP to create an unnamed composite and use it when dimensioning the variable. For the basedims argument, specify the names of the dimensions, separated by commas, for which the unnamed composite is created.
When the variable is dimensioned by a composite, specifies that Oracle OLAP create NA2 bits for the cells of the variable.
See also: For more information on:
|
Specifies that Oracle OLAP automatically creates an INTEGER
variable in which it stores the non-NA
counts of the number of leaf nodes that contributed to aggregate values calculated for RELATION statements that have an AVERAGE, HWAVERAGE, or WAVERAGE operator. You must include this phrase to calculate average aggregations for a variable dimensioned by a compressed composite. For more information on Aggcount variables, see "Aggcount Variables".
Specifies that a variable or a partition of a variable is either permanent or temporary. After you update and commit, the definition of both permanent and temporary variables and partitions is always saved between sessions. Specifying permanent or temporary determines whether or not the values of a variable or partition of a variable are saved or discarded, after you update and commit, when you leave end your session or switch to another workspace:
Permanent variables and partitions—Oracle OLAP saves the data values or a permanent variable or permanent partitions. When you start the workspace, the data values or a permanent variable or permanent partitions are the same as they were at the last commit.
Temporary variables and partitions—Oracle OLAP discards the data values of a temporary variable or temporary partition. Each time you start the workspace, the values of a temporary variable or temporary partition are NA
.
Keep the following points in mind when specifying the PERMANENT and TEMP keywords:
By default, a variable is permanent.
Temporary variables can be dimensioned by partition template objects or by temporary dimensions.
By default, a top-level partition of a variable has the same permanence as the variable that contains it. Specifically, a partition of a temporary variable is a temporary partition unless you use the PEMANENT keyword to make it a permanent partition, and a partition of a permanent variable is a permanent partition unless you use the TEMPORARY keyword to make it a temporary partition. To indicate different behavior, use either the PERMANENT or TEMP keyword.
By default, a subpartition has the same permanence as its parent partition. To indicate different behavior, use either the PERMANENT or TEMP keyword.
(You can abbreviate WIDTH as W.) The width, in bytes, of the storage area for each value of a variable. When you are using a multibyte character set, be sure to specify the number of bytes, not characters.
You specify fixed widths to create faster and more compact data storage formats. You can specify fixed widths for dimensioned TEXT, NTEXT, and INTEGER variables only, as described in the following list:
For dimensioned TEXT and NTEXT variables, you can specify a width from 1 byte through 4,000 bytes. Specify a fixed width for such variables only when you are certain that the values of a particular variable are of similar size. You cannot assign a width to a scalar variable.
For dimensioned INTEGER variables, you can specify a width of 1 byte only. Define a fixed width INTEGER variable only when you are certain that all the values for that variable are between -128 and 127.
The default widths for variables are: 2 bytes for SHORTINTEGER, 4 bytes for DATE, INTEGER, and SHORTDECIMAL, and 8 bytes for DECIMAL and ID. TEXT and NTEXT variables that do not have fixed widths are stored on two sets of pages. The first set contains 4-byte cells, each of which points to the actual text value that is stored in the other set of pages. The default width of 4 bytes for TEXT and NTEXT variables is for these 4-byte cells.
Specifies a partition of the variable where partition-name is the name of the partition.
When defining the partitions of a variable dimensioned by a compressed composite, keep the following points in mind:
A compressed composite can dimension only one partition.
The partitions of a variable dimensioned by a compressed composite must respect the parent-child relationships of the hierarchical dimensions. When an AGGREGATE command executes, data cannot be aggregated across partitions. To verify if a variable is partitioned correctly, use the PARTITIONCHECK function.
The name of an attached workspace in which you want to define the variable. When the variable is dimensioned, it must be defined in the same workspace as its dimensions. For general information about this argument, see the main entry for the DEFINE command.
Specifies that the object exists only in the current session. When the session ends, the object no longer exists. The behavior specified by SESSION is different than the be3havior specified by the TEMP keyword which is that the values are temporary but the object definition remains in the workspace in which you create it.
Usage Notes
Aggcount Variables
When you include the WITH AGGCOUNT phrase in a DEFINE VARIABLE statement, Oracle OLAP automatically creates the variable specified in the DEFINE statement and a secondary variable (often called the Aggcount variable). The Aggcount variable is an INTEGER
variable that Oracle OLAP uses when performing average aggregations for the defined variable. When resolving RELATION statements that have an AVERAGE, HAVERAGE, WAVERAGE, or HWAVERAGE operator and that do not have a COUNT NO phrase, Oracle OLAP stores the non-NA
counts of the number of leaf nodes that contribute to the average aggregate values in the Aggcount variable.
Most statements that maintain a variable also automatically maintain an associated Aggcount variable. For example, an EXPORT statement exports both a variable and its associated Aggcount variable, and a CLEAR statement clears both the variable and the related portions of the associated Aggcount variable. Additionally, some OLAP DML statements are specific to the use of Aggcount objects. Table: OLAP DML Statements for Aggcount Variables lists these statements.
OLAP DML Statements for Aggcount Variables
Statement | Keywords | Description |
---|---|---|
|
WITH AGGCOUNT |
Defines a variable and an associated Aggcount variable. |
|
Retrieves the values of the Aggcount variable associated with the specified variable. |
|
|
ADD|DROP AGGCOUNT |
Adds or drops an Aggcount variable for the specified variable. |
|
HASAGGCOUNT |
Returns a |
NA2 Bits and Null Tracking
Relational fact tables sometimes have null facts (that is, facts that have a null value). Typically, when OLAP DML creates a variable dimensioned by a composite, it does not create a composite tuple for an NA (or null) value. Given this typical behavior, OLAP DML variables would not correspond to their base relational fact table because the variables would eliminate the null facts.
To support OLAP DML composite-dimensioned variables that correspond to relational fact tables with null facts, OLAP has a special NA bit called an NA2 bit. These NA2 bits tracks whether or not each cell of the variable has null value because the underlying relational table has a null fact. When the corresponding fact table has a null fact, you want Oracle OLAP to intentionally include an NA value in the composite tuples for the variable and NA2 bits are used by Oracle OLAP to do just that. NA2 bits are used by Oracle OLAP when it populates variables using the SQL IMPORT command, the AGGREGATE command, and variables that were created as materialized views. It is also used by Oracle OLAP when it populates a relational table using the OLAP_TABLE SQL function. Additionally, Oracle OLAP recognizes NA2 values when evaluating expressions using arithmetic and Boolean operators.
The OLAP DML provides the following statements for working with variables that have NA2 bits:
To create a variable with NA2 bits, use the DEFINE VARIABLE statement with the NULLTRACKING phrase.
To add NA2 bits to a variable that does not have NA2 bits, use the CHGDFN statement with the NULLTRACKING phrase.
To remove NA2 bits from a variable that has NA2 bits, use the CHGDFN statement with the DROP NULLTRACKING phrase.
For testing and debugging purposes, use the NA2 function to set one or more of the NA2 bit of a variable to TRUE. Use the NAFLAG function to identify if one or more values of a variable are NA values and, if so, if the NA value is just the typical NA values that OLAP should ignore or both the typical NA value and also an NA2 value.
Defining Very Large Variables
Theoretically, a variable can contain up to 2**63
cells and a TEXT or NTEXT variable can contain up to 2 billion bytes. However, the page size determines if a variable can be stored entirely on a page or how many variables can be stored on a page. To calculate the maximum number of values for a variable of a given width that fit on one page, use the VALSPERPAGE program.
Effect of Dimension Order on Variable Storage and Statement Looping
The order in which you list the dimensions of a variable definition determines the order in which the elements of the variable are stored and, consequently, how the data is accessed. The first dimension in the variable definition is the fastest-varying dimension, and the last dimension is the slowest-varying dimension.
For example, assume your analytic workspace has an opcosts
variable that contains the operating costs, by month, of each city in which you have offices. In the following definition for the opcosts
variable, month
is the fastest-varying dimension and city
is the slowest-varying dimension.
DEFINE opcosts VARIABLE DECIMAL <month city>
The data for a multidimensional variable is stored as a linear stream of values, in which the values of the fastest-varying dimension are clustered. For example, for the opcosts
variable, the values for Boston for all the months are stored in a sequence, and then it stores the values for Chicago for all the months in a sequence, and so on. Thus the month values vary fastest in the opcosts
variable, as shown in the following table.
When you define variables and other dimensioned objects, and when you write programs that loop over multidimensional expressions in nested loops, always try to maximize performance by matching the fastest-varying dimension with the inner loop.
Unnamed Composites
Oracle OLAP automatically defines an unnamed composite when a DEFINE VARIABLE statement with a SPARSE <dimlist> phrase executes. An unnamed composite can have either a b-tree or hash index. The type of index is determined by the value of the SPARSEINDEX option when Oracle OLAP defines an unnamed composite.
Once Oracle OLAP has created a definition for an unnamed composite for a certain dimension list, it uses that composite any time you define a variable with the same SPARSE <dimlist> phrase. Thus all variables that are defined with the same SPARSE <dimlist> phrase share the same unnamed composite. For more information on sharing composites, see "Shared Composites".
Variable Segments
Within a partition, variable data is stored in analytic workspace segments. An analytic workspace segment is a group of logically contiguous analytic workspace pages. By default, the segment sizes of a variable are automatically determined by Oracle OLAP. Each segment is the exactly the number of analytic workspace pages needed to store the values assigned by the one OLAP DML statement. You can explicitly specify a segment size for a variable using the SEGWIDTH keyword of the CHGDFN command. In this case, when you assign values to a variable, Oracle OLAP stores the data assigned by multiple OLAP DML statements into a segment until the segment is full.
Examples
Defining an INTEGER Variable with One Regular Dimension
This example adds the variable population
to an analytic workspace. It is dimensioned by city
, which has already been defined in the workspace. The LD Statement attaches a description to the object. The statements
DEFINE population INTEGER <city> LD Population in each city DESCRIBE population
produce the following description.
DEFINE POPULATION VARIABLE INTEGER <CITY> LD Population in each city
Defining a Single-Cell Variable
The following is a definition for a variable named newdata
which is a single Boolean value. It has no dimensions. An application might set it to YES
when new data is added to the workspace and to NO
after a user views the data.
DEFINE newdata BOOLEAN newdata = YES
Defining NUMBER Variables
The following statement defines a NUMBER
variable named sales
and dimensioned by product
and geography
with a precision of 16 digits and a scale of 4 digits.
DEFINE sales VARIABLE NUMBER (16,4) <product, geography>
The following statements define a NUMBER
variable named numvar
with 5 significant digits and 2 decimal places. The number 1234567 is out of its range.
DEFINE numvar VARIABLE NUMBER (5, 2) numvar = 1234567 SHOW numvar NA
A negative scale defines a NUMBER
variable named numnegvar
with 5 significant digits and 2 rounded digits to the left of the decimal point. The number 1,234,567 is rounded up.
DEFINE numnegvar VARIABLE NUMBER (5, -2) numnegvar = 1234567 SHOW numnegvar 1,234,600.00
Defining a Variable Dimensioned by Two Regular Dimensions
Assume that you have an analytic workspace that contains the following definitions for dimensions, relations, and aggmaps.
DEFINE GEOG_CITY DIMENSION TEXT DEFINE GEOG_STATE DIMENSION TEXT DEFINE GEOG_AREA DIMENSION TEXT DEFINE GEOG_CONT DIMENSION TEXT DEFINE GEOG DIMENSION CONCAT (GEOG_CITY GEOG_STATE GEOG_AREA GEOG_CONT) DEFINE PROD_UPC DIMENSION TEXT DEFINE PROD_FAMILY DIMENSION TEXT DEFINE PROD_DIV DIMENSION TEXT DEFINE PROD_TOP DIMENSION TEXT DEFINE PROD DIMENSION CONCAT (PROD_UPC PROD_FAMILY PROD_DIV PROD_TOP) DEFINE GEOGLEVEL DIMENSION TEXT DEFINE PRODLEVEL DIMENSION TEXT DEFINE GEOG.PARENT RELATION GEOG <GEOG> DEFINE PROD.PARENT RELATION PROD <PROD> DEFINE GEOG.LEVELREL RELATION GEOGLEVEL <GEOG> DEFINE PROD.LEVELREL RELATION PRODLEVEL <PROD> DEFINE GEOG.FAMILYREL RELATION GEOG <GEOG GEOGLEVEL> DEFINE PROD.FAMILYREL RELATION PROD <PROD PRODLEVEL> DEFINE SALES_DIMS_REG VARIABLE NUMBER (12,0) <PROD GEOG> DEFINE SALES_AGGMAP AGGMAP AGGMAP RELATION geog.parent RELATION prod.parent END
The two parent relations (prod.parent
and geog.parent
) have the following values.
PROD PROD.PARENT ------------------------- ------------------------- <PROD_UPC: ColorTV> <PROD_FAMILY: TV> <PROD_UPC: BWTV> <PROD_FAMILY: TV> <PROD_UPC: StndVCR> <PROD_FAMILY: VCR> <PROD_UPC: StrVCR> <PROD_FAMILY: VCR> <PROD_FAMILY: VCR> <PROD_DIV: VideoDiv> <PROD_FAMILY: TV> <PROD_DIV: VideoDiv> <PROD_DIV: VideoDiv> <PROD_TOP: Total Prod> <PROD_TOP: Total Prod> NA GEOG GEOG.PARENT ------------------------- ------------------------- <GEOG_CITY: Canberra> <GEOG_STATE: ACT> <GEOG_CITY: Sydney> <GEOG_STATE: NSW> <GEOG_CITY: Darwin> <GEOG_STATE: NT> <GEOG_CITY: Brisbane> <GEOG_STATE: QLD> <GEOG_CITY: Adelaide> <GEOG_STATE: SA> <GEOG_CITY: Hobart> <GEOG_STATE: TAS> <GEOG_CITY: Melbourne> <GEOG_STATE: VIC> <GEOG_CITY: Perth> <GEOG_STATE: WA> <GEOG_STATE: ACT> <GEOG_AREA: Aust Terr> <GEOG_STATE: NSW> <GEOG_AREA: Aust State> <GEOG_STATE: NT> <GEOG_AREA: Aust Terr> <GEOG_STATE: QLD> <GEOG_AREA: Aust State> <GEOG_STATE: SA> <GEOG_AREA: Aust State> <GEOG_STATE: TAS> <GEOG_AREA: Aust State> <GEOG_STATE: VIC> <GEOG_AREA: Aust State> <GEOG_STATE: WA> <GEOG_AREA: Aust State> <GEOG_AREA: Aust State> <GEOG_CONT: Australia> <GEOG_AREA: Aust Terr> <GEOG_CONT: Australia> <GEOG_CONT: Australia> NA
Assume that you aggregate sales_dims_reg
using sales_aggmap
). Now assume that you issue the following REPORT statement for a report of the sales_dims_reg
variable.
REPORT sales_dims_reg->REPORT sales_dims_reg
As you can see from the output of the REPORT statement, the sales_dims_reg
variable is a sparsely populated variable with 152 cells, many of which contain NA
values.
----------------------------SALES_DIMS_REG----------------------------- ---------------------------------PROD---------------------------------- <PROD_DI <PROD_UP <PROD_UP <PROD_UP <PROD_FA <PROD_FA V: <PROD_TO C: <PROD_UP C: C: MILY: MILY: VideoDiv P: Total GEOG ColorTV> C: BWTV> StndVCR> StrVCR> VCR> TV> > Prod> ------------------------- -------- -------- -------- -------- -------- -------- -------- -------- <GEOG_CITY: Canberra> 11,592.0 NA 38,356.0 3,444.00 41,800.0 11,592.0 53,392.0 53,392.0 <GEOG_CITY: Sydney> NA NA NA NA NA NA NA NA <GEOG_CITY: Darwin> 24,868.0 NA 22,104.0 32,667.0 54,771.0 24,868.0 79,639.0 79,639.0 <GEOG_CITY: Brisbane> 49,556.0 NA 48,239.0 24,285.0 72,524.0 49,556.0 122,080 122,080 <GEOG_CITY: Adelaide> NA NA NA NA NA NA NA NA <GEOG_CITY: Hobart> 17,223.0 NA 18,872.0 48,780.0 67,652.0 17,223.0 84,875.0 84,875.0 <GEOG_CITY: Melbourne> NA 22,000.0 NA NA NA 22,000.0 22,000.0 22,000.0 <GEOG_CITY: Perth> NA NA NA NA NA NA NA NA <GEOG_STATE: ACT> 11,592.0 NA 38,356.0 3,444.00 41,800.0 11,592.0 53,392.0 53,392.0 <GEOG_STATE: NSW> NA NA NA NA NA NA NA NA <GEOG_STATE: NT> 24,868.0 NA 22,104.0 32,667.0 54,771.0 24,868.0 79,639.0 79,639.0 <GEOG_STATE: QLD> 49,556.0 NA 48,239.0 24,285.0 72,524.0 49,556.0 122,080 122,080 <GEOG_STATE: SA> NA NA NA NA NA NA NA NA <GEOG_STATE: TAS> 17,223.0 NA 18,872.0 48,780.0 67,652.0 17,223.0 84,875.0 84,875.0 <GEOG_STATE: VIC> NA 22,000.0 NA NA NA 22,000.0 22,000.0 22,000.0 <GEOG_STATE: WA> NA NA NA NA NA NA NA NA <GEOG_AREA: Aust State> 66,779.0 22,000.0 67,111.0 73,065.0 140,176 88,779.0 228,955 228,955 <GEOG_AREA: Aust Terr> 36,460.0 NA 60,460.0 36,111.0 96,571.0 36,460.0 133,031 133,031 <GEOG_CONT: Australia> 103,239 22,000.0 127,571 109,176 236,747 125,239 361,986 361,986
Because the sales_dims_reg
variable is dimensioned by two regular dimensions (rather than by composites or concat dimensions), the values of all of its cells (even those with an NA
value) are stored in variable. You can confirm the number of physical values stored in the workspace by issuing the following statement.
SHOW OBJ(NUMVALS 'sales_dims_reg') 152.00
The result of the statement is that the value 152.00
displays which indicates that every value in the 152 cells of the sales_dims_reg
variable (even the NA
values) are stored as part of the variable.
Defining a Variable Dimensioned by an Uncompressed Composite
Assume that you have created an analytic workspace with the same dimensions, relations, and aggmap as those in Example: Defining a Variable Dimensioned by Two Regular Dimensions. Now assume that you define a composite and a variable dimensioned by that composite by issuing the following statements.
DEFINE COMP_PROD_GEOG COMPOSITE <PROD GEOG> DEFINE SALES_DIMS_COMPOSITE VARIABLE NUMBER (12,2) <COMP_PROD_GEOG <PROD GEOG>>
Assume that you populate sales_dims_composite
with the same base values as you did sales_dims_reg
in Example: Defining a Variable Dimensioned by Two Regular Dimensions, and that you aggregate sales_dims_composite
using the same aggmap (that is, sales_aggmap
) and issue the following. REPORT statement for the sales_dims_composite
variable.
REPORT sales_dims_composite
A report for the sales_dims_composite
variable displays the same 152 cells as the report for the sales_dims_reg
variable.
-------------------------SALES_DIMS_COMPOSITE-------------------------- ---------------------------------PROD---------------------------------- <PROD_DI <PROD_UP <PROD_UP <PROD_UP <PROD_FA <PROD_FA V: <PROD_TO C: <PROD_UP C: C: MILY: MILY: VideoDiv P: Total GEOG ColorTV> C: BWTV> StndVCR> StrVCR> VCR> TV> > Prod> ------------------------- -------- -------- -------- -------- -------- -------- -------- -------- <GEOG_CITY: Canberra> 11,592.0 NA 38,356.0 3,444.00 41,800.0 11,592.0 53,392.0 53,392.0 <GEOG_CITY: Sydney> NA NA NA NA NA NA NA NA <GEOG_CITY: Darwin> 24,868.0 NA 22,104.0 32,667.0 54,771.0 24,868.0 79,639.0 79,639.0 <GEOG_CITY: Brisbane> 49,556.0 NA 48,239.0 24,285.0 72,524.0 49,556.0 122,080 122,080 <GEOG_CITY: Adelaide> NA NA NA NA NA NA NA NA <GEOG_CITY: Hobart> 17,223.0 NA 18,872.0 48,780.0 67,652.0 17,223.0 84,875.0 84,875.0 <GEOG_CITY: Melbourne> NA 22,000.0 NA NA NA 22,000.0 22,000.0 22,000.0 <GEOG_CITY: Perth> NA NA NA NA NA NA NA NA <GEOG_STATE: ACT> 11,592.0 NA 38,356.0 3,444.00 41,800.0 11,592.0 53,392.0 53,392.0 <GEOG_STATE: NSW> NA NA NA NA NA NA NA NA <GEOG_STATE: NT> 24,868.0 NA 22,104.0 32,667.0 54,771.0 24,868.0 79,639.0 79,639.0 <GEOG_STATE: QLD> 49,556.0 NA 48,239.0 24,285.0 72,524.0 49,556.0 122,080 122,080 <GEOG_STATE: SA> NA NA NA NA NA NA NA NA <GEOG_STATE: TAS> 17,223.0 NA 18,872.0 48,780.0 67,652.0 17,223.0 84,875.0 84,875.0 <GEOG_STATE: VIC> NA 22,000.0 NA NA NA 22,000.0 22,000.0 22,000.0 <GEOG_STATE: WA> NA NA NA NA NA NA NA NA <GEOG_AREA: Aust State> 66,779.0 22,000.0 67,111.0 73,065.0 140,176 88,779.0 228,955 228,955 <GEOG_AREA: Aust Terr> 36,460.0 NA 60,460.0 36,111.0 96,571.0 36,460.0 133,031 133,031 <GEOG_CONT: Australia> 103,239 22,000.0 127,571 109,176 236,747 125,239 361,986 361,986
However, because the sales_dims_comp
variable is dimensioned by a composite, the 65 cells that display as NA
values are not stored in variable. You can confirm the number of physical values stored in the workspace by issuing the following statement that calls the OBJ function with the NUMVALS keyword on sales_dims_composite
.
SHOW OBJ(NUMVALS 'sales_dims_composite') 87.00
The result of the statement is that the value 87.00
displays which indicates that only the 87 non-NA
values are stored as part of the sales_dims_composite
variable.
Defining a Variable Dimensioned by a Compressed Composite
Assume that you have created an analytic workspace with the same dimensions, relations, and aggmap as those in Example: Defining a Variable Dimensioned by Two Regular Dimensions. Now assume that you define a composite and a variable dimensioned by that composite by issuing the following statements.
DEFINE CC_COMP_PROD_GEOG COMPOSITE <PROD GEOG> COMPRESSED DEFINE SALES_DIMS_COMP_COMPOSITE VARIABLE NUMBER (12,0) <CC_COMP_PROD_GEOG <PROD GEOG>>
Assume that you populate sales_dims_composite
with the same base values as you did sales_dims_reg
in Example: Defining a Variable Dimensioned by Two Regular Dimensions, and that you aggregate sales_dims_comp_composite
using the same aggmap (that is, sales_aggmap
). Now you issue the following statement.
REPORT sales_dims_comp_composite
A report for the sales_dims_comp_comp_composite
variable displays the same 152 cells as the report for the sales_dims_reg
variable.
-----------------------SALES_DIMS_COMP_COMPOSITE----------------------- ---------------------------------PROD---------------------------------- <PROD_DI <PROD_UP <PROD_UP <PROD_UP <PROD_FA <PROD_FA V: <PROD_TO C: <PROD_UP C: C: MILY: MILY: VideoDiv P: Total GEOG ColorTV> C: BWTV> StndVCR> StrVCR> VCR> TV> > Prod> ------------------------- -------- -------- -------- -------- -------- -------- -------- -------- <GEOG_CITY: Canberra> 11,592.0 NA 38,356.0 3,444.00 41,800.0 11,592.0 53,392.0 53,392.0 <GEOG_CITY: Sydney> NA NA NA NA NA NA NA NA <GEOG_CITY: Darwin> 24,868.0 NA 22,104.0 32,667.0 54,771.0 24,868.0 79,639.0 79,639.0 <GEOG_CITY: Brisbane> 49,556.0 NA 48,239.0 24,285.0 72,524.0 49,556.0 122,080 122,080 <GEOG_CITY: Adelaide> NA NA NA NA NA NA NA NA <GEOG_CITY: Hobart> 17,223.0 NA 18,872.0 48,780.0 67,652.0 17,223.0 84,875.0 84,875.0 <GEOG_CITY: Melbourne> NA 22,000.0 NA NA NA 22,000.0 22,000.0 22,000.0 <GEOG_CITY: Perth> NA NA NA NA NA NA NA NA <GEOG_STATE: ACT> 11,592.0 NA 38,356.0 3,444.00 41,800.0 11,592.0 53,392.0 53,392.0 <GEOG_STATE: NSW> NA NA NA NA NA NA NA NA <GEOG_STATE: NT> 24,868.0 NA 22,104.0 32,667.0 54,771.0 24,868.0 79,639.0 79,639.0 <GEOG_STATE: QLD> 49,556.0 NA 48,239.0 24,285.0 72,524.0 49,556.0 122,080 122,080 <GEOG_STATE: SA> NA NA NA NA NA NA NA NA <GEOG_STATE: TAS> 17,223.0 NA 18,872.0 48,780.0 67,652.0 17,223.0 84,875.0 84,875.0 <GEOG_STATE: VIC> NA 22,000.0 NA NA NA 22,000.0 22,000.0 22,000.0 <GEOG_STATE: WA> NA NA NA NA NA NA NA NA <GEOG_AREA: Aust State> 66,779.0 22,000.0 67,111.0 73,065.0 140,176 88,779.0 228,955 228,955 <GEOG_AREA: Aust Terr> 36,460.0 NA 60,460.0 36,111.0 96,571.0 36,460.0 133,031 133,031 <GEOG_CONT: Australia> 103,239 22,000.0 127,571 109,176 236,747 125,239 361,986 361,986
However, because the sales_dims_comp_comp
variable is dimensioned by a compressed composite not all of values in all of the cells are stored in the variable. The 65 cells that display as NA
values are not stored in variable, Also, the values that are "passed up" the hierarchy are stored only once — at the lowest level of the hierarchy.
You can confirm the number of physical values stored in the workspace by issuing the following statement that calls the OBJ function with the NUMVALS keyword on sales_dims_comp_composite
.
SHOW OBJ(NUMVALS 'sales_dims_comp_composite') 38.00
The result of the statement is that the value 38.00
displays which indicates that only 38 values are stored as part of the sales_dims_comp_composite
variable. These values are shown in the following table.
GEOG | PROD_UPC:ColorTV | PROD_UPC:BWTV | PROD_UPC:StandVCR | PROD_UPC:StrVCR | PROD_FAMILY: VCR | PROD_FAMILY: TV | PROD_DIV: VideoDiv |
---|---|---|---|---|---|---|---|
GEOG_CITY: Canberra | 11,592.0 | 38,356.0 | 3,444.00 | 41,800.0 | 53,392.0 | ||
GEOG_CITY: Darwin | 24,868.0 | 22,104.0 | 32,667.0 | 54,771.0 | 79,639.0 | ||
GEOG_CITY: Brisbane | 49,556.0 | 48,239.0 | 24,285.0 | 72,524.0 | 122,080 | ||
GEOG_CITY: Hobart | 17,223.0 | 18,872.0 | 48,780.0 | 67,652.0 | 84,875.0 | ||
GEOG_CITY: Melbourne | 22,000.0 | ||||||
GEOG_AREA: Aust State | 66,779.0 | 67,111.0 | 73,065.0 | 140,176 | 88,779.0 | 228,955 | |
GEOG_AREA: Aust Terr | 36,460.0 | 60,460.0 | 36,111.0 | 96,571.0 | 133,031 | ||
GEOG_Cont: Australia | 103,239 | 127,57 | 109,176 | 236,747 | 125,239 | 361,986 |
Defining a Variable with Partitions
Assume that you want to define a sales
variable that is dimensioned by product and time and that is partitioned so that each year's data is in a separate partition.
Assume that the analytic workspace contains a products
dimension, a time
dimension that is a simple hierarchical dimension with three levels of data (day, month, and year), and a time_parentrel
relation that represents the child-parent relationships between the values of time.
DEFINE TIME DIMENSION TEXT DEFINE PRODUCT DIMENSION TEXT DEFINE TIME_PARENTREL RELATION TIME <TIME>
For simplicity's sake, in this example the time
and product
dimensions are only partially populated and have only the following values.
TIME -------------- 2003 2002 Dec2003 Jan2003 Dec2002 Jan2002 31Dec2003 01Dec2003 31Jan2003 01Jan2003 31Dec2002 01Dec2002 31Jan2002 01Jan2002 PRODUCT ------- 00001 00002
To create the partitioned variable, take the following steps:
Define a partition template that defines one partition for each year's data.
DEFINE partition_sales_by_year PARTITION TEMPLATE <time product> - PARTITION BY LIST (time)(- PARTITION time_2003 VALUES ('2003', 'Dec2003', 'Jan2003', '31Dec2003', '01Dec2003', '31Jan2003', '01Jan2003') <time product>- PARTITION time_2002 VALUES ('2002', 'Dec2002', 'Jan2002', '31Dec2002', '01Dec2002', '31Jan2002', '01Jan2002') <time product>)
(note that for simplicity's sake, only some of each year's dimension values are specified for each partition in this example. Typically, when you want to specify a large number of values for a partition, you do not do so within the DEFINE PARTITION STATEMENT statement. Instead, you define the partition without specifying any values, and then later specify the values using MAINTAIN ADD TO PARTITION or MAINTAIN MOVE TO PARTITION statements as illustrated in Example: Specifying the Values of a Partition Using Valuesets.)
Define a partitioned sales
variable with the partitions defined by the partition template named partition_sales_by_year
.
DEFINE sales DECIMAL <partition_sales_by_year<time product>>
After you populate sales with day values, you can issue the following REPORT statement to see which sales
values are in which partition.
REPORT DOWN PARTITION(partition_sales_by_year) time product sales PARTITION(PARTITION_SALES_BY_YEAR) TIME PRODUCT SALES ----------------------------------- ---------- ---------- ---------- TIME_2003 2003 00001 NA TIME_2003 Dec2003 00001 NA TIME_2003 Jan2003 00001 NA TIME_2003 31Dec2003 00001 14.78 TIME_2003 01Dec2003 00001 15.52 TIME_2003 31Jan2003 00001 13.61 TIME_2003 01Jan2003 00001 10.39 TIME_2003 2003 00002 NA TIME_2003 Dec2003 00002 NA TIME_2003 Jan2003 00002 NA TIME_2003 31Dec2003 00002 16.05 TIME_2003 01Dec2003 00002 12.27 TIME_2003 31Jan2003 00002 10.83 TIME_2003 01Jan2003 00002 11.07 TIME_2002 2002 00001 NA TIME_2002 Dec2002 00001 NA TIME_2002 Jan2002 00001 NA TIME_2002 31Dec2002 00001 18.80 TIME_2002 01Dec2002 00001 13.64 TIME_2002 31Jan2002 00001 12.41 TIME_2002 01Jan2002 00001 16.97 TIME_2002 2002 00002 NA TIME_2002 Dec2002 00002 NA TIME_2002 Jan2002 00002 NA TIME_2002 31Dec2002 00002 17.47 TIME_2002 01Dec2002 00002 16.58 TIME_2002 31Jan2002 00002 18.94 TIME_2002 01Jan2002 00002 18.36
Defining a Fixed-Width TEXT Variable
The following statement defines a TEXT variable named lastname
dimensioned by employee
. Values in lastname
are limited to 20 characters, so that longer values are truncated.
DEFINE lastname TEXT <employee> WIDTH 20
Defining a Variable That Uses a Named B-Tree Composite
Assume that you have the following dimensions in your analytic workspace.
DEFINE month DIMENSION TEXT DEFINE product DIMENSION TEXT DEFINE region DIMENSION TEXT
When your company does promotional marketing for certain products in some but not all regions, then your variable data is sparse along the product
and region
dimensions. Therefore, suppose you define a composite named proddist
, whose base dimensions are product
and region
. There are dimension-value combinations in the composite only for those values that have data. For example, when you run a promotion for tents but not skis, then the composite includes the tents and region combinations, but not the skis and region combinations.
The following statement creates a b-tree composite named proddist
whose base dimensions are product
and district
, and a variable called promo
that is dimensioned by month
and proddist
.
DEFINE proddist COMPOSITE <product region> DEFINE promo VARIABLE INTEGER <month proddist <product district>>
For simplicity's sake assume that you have only stored the following dimension data in your analytic workspace.
PRODUCT -------------- Tents Skis REGION -------------- Northeast Southwest MONTH -------------- Jan2003 Feb2003 Mar2003 Apr2003 May2003 Jun2003 Jul2003 Aug2003 Sep2003 Oct2003 Nov2003 Dec2003
You decide to run a promotional sales for skis in the Northeast region in the month of September, 2003 at a cost of $5,000. Once you populate promo
with this, promo
contains only 12 cells—each cell is dimensioned by a value of month
and the composite tuple value of <'Skis' 'Northeast'>
for proddist
. The cell for September 2003 contains the value $5,000, and all of the other cells contain NA
. No other NA values are stored in promo; there are no cells are created for any other values of product
or region
.
Defining a Variable with Null Tracking
Assume that you have the following objects defined in your analytic workspace.
DEFINE GEOG DIMENSION TEXT LD A dimension with a simple hierarchy for geography DEFINE geog_levellist DIMENSION TEXT LD List of Levels in in the hierarchy of the geog dimension DEFINE GEOG_PARENTREL RELATION GEOG <GEOG> LD Self-relation for geog showing parents of each value in the hierarchy DEFINE GEOG_LEVELREL RELATION GEOG_LEVELLIST <GEOG> LD Level of each dimension member for geog DEFINE product DIMENSION TEXT LD A nonhierarchical dimension DEFINE time DIMENSION TEXT LD A hierarchical text dimension for time DEFINE time_levellist DIMENSION TEXT LD List of Levels in hierarchy of the time dimension DEFINE time_parentrel RELATION time <time> LD A self-relation for time show parents of each value in the hierarchy DEFINE TIME_LEVELREL RELATION TIME_LEVELLIST <TIME> LD Level of each dimension member for time DEFINE prod_geog COMPOSITE <product geog> COMPRESSED
Now assume that you define a sales
variable that you want to have dimensioned by time
and the prod_geog
composite. You want this variable to have null tracking because you eventually populate it using SQL IMPORT and you know that some facts in the fact table have null values. To do this you issue the following statement that includes the WITH NULLTRACKING phrase.
DEFINE sales VARIABLE DECIMAL <time prod_geog<product geog>> WITH NULLTRACKING
For testing purposes, you populate the variable using the RANDOM function. After you populate the variable in this way, you issue a report on it that shows the NA values in the variable.
REPORT DOWN time ACROSS geog: sales PRODUCT: TVs -----------------------SALES----------------------- -----------------------GEOG------------------------ TIME Boston Springfield Hartford All Places -------------- ------------ ------------ ------------ ------------ 2007 NA NA NA NA 2008 NA NA NA NA All years NA NA NA NA Jan07 NA NA NA NA Feb07 NA NA NA NA Mar07 NA NA NA NA Apr07 NA NA NA NA May07 NA NA NA NA Jun07 NA NA NA NA Jul07 NA NA NA NA Aug07 NA NA NA NA Sep07 NA NA NA NA Oct07 NA NA NA NA Nov07 NA NA NA NA Dec07 NA NA NA NA Jan08 NA NA NA NA Feb08 NA NA NA NA Mar08 NA NA NA NA Apr08 NA NA NA NA May08 NA NA NA NA Jun08 NA NA NA NA Jul08 NA NA NA NA Aug08 NA NA NA NA Sep08 NA NA NA NA Oct08 NA NA NA NA Nov08 NA NA NA NA Dec08 NA NA NA NA PRODUCT: Radios -----------------------SALES----------------------- -----------------------GEOG------------------------ TIME Boston Springfield Hartford All Places -------------- ------------ ------------ ------------ ------------ 2007 NA NA NA NA 2008 NA NA NA NA All years NA NA NA NA Jan07 24.59 23.70 33.12 28.65 Feb07 22.78 21.42 26.28 37.06 Mar07 25.74 32.08 22.75 24.62 Apr07 22.23 23.21 20.79 28.68 May07 20.51 29.71 30.35 33.05 Jun07 34.43 35.96 33.85 39.34 Jul07 24.86 38.02 36.78 31.22 Aug07 39.05 21.08 35.80 33.81 Sep07 34.38 21.69 25.04 33.40 Oct07 33.82 39.27 20.28 24.39 Nov07 25.48 23.03 32.45 39.94 Dec07 25.14 30.66 33.75 23.37 Jan08 NA NA NA NA Feb08 NA NA NA NA Mar08 NA NA NA NA Apr08 NA NA NA NA May08 NA NA NA NA Jun08 NA NA NA NA Jul08 NA NA NA NA Aug08 NA NA NA NA Sep08 NA NA NA NA Oct08 NA NA NA NA Nov08 NA NA NA NA Dec08 NA NA NA NA
For testing purposes, you also generate a report using the NAFLAG function to retrieve the type of NAs that are in the variable. As the following report shows, because it was populated using RANDOM, all of the NAs are the typical NA values; they are not NA2 values.
REPORT DOWN time ACROSS geog: NAFLAG(sales) PRODUCT: TVs -------------------NAFLAG(SALES)------------------- -----------------------GEOG------------------------ TIME Boston Springfield Hartford All Places -------------- ------------ ------------ ------------ ------------ 2007 1 1 1 1 2008 1 1 1 1 All years 1 1 1 1 Jan07 1 1 1 1 Feb07 1 1 1 1 Mar07 1 1 1 1 Apr07 1 1 1 1 May07 1 1 1 1 Jun07 1 1 1 1 Jul07 1 1 1 1 Aug07 1 1 1 1 Sep07 1 1 1 1 Oct07 1 1 1 1 Nov07 1 1 1 1 Dec07 1 1 1 1 Jan08 1 1 1 1 Feb08 1 1 1 1 Mar08 1 1 1 1 Apr08 1 1 1 1 May08 1 1 1 1 Jun08 1 1 1 1 Jul08 1 1 1 1 Aug08 1 1 1 1 Sep08 1 1 1 1 Oct08 1 1 1 1 Nov08 1 1 1 1 Dec08 1 1 1 1 PRODUCT: Radios -------------------NAFLAG(SALES)------------------- -----------------------GEOG------------------------ TIME Boston Springfield Hartford All Places -------------- ------------ ------------ ------------ ------------ 2007 1 1 1 1 2008 1 1 1 1 All years 1 1 1 1 Jan07 0 0 0 0 Feb07 0 0 0 0 Mar07 0 0 0 0 Apr07 0 0 0 0 May07 0 0 0 0 Jun07 0 0 0 0 Jul07 0 0 0 0 Aug07 0 0 0 0 Sep07 0 0 0 0 Oct07 0 0 0 0 Nov07 0 0 0 0 Dec07 0 0 0 0 Jan08 1 1 1 1 Feb08 1 1 1 1 Mar08 1 1 1 1 Apr08 1 1 1 1 May08 1 1 1 1 Jun08 1 1 1 1 Jul08 1 1 1 1 Aug08 1 1 1 1 Sep08 1 1 1 1 Oct08 1 1 1 1 Nov08 1 1 1 1 Dec08 1 1 1 1
Again, for testing purposes, you use the NA function to set an NA2 bit on the variable cells dimensioned by the months of 2008. The following code shows the result of issuing a SHOW of the NA2 function and using that function to set the NA2 bit on the cells dimensioned by the months in 2008.
SHOW NA2 NA LIMIT time TO 'Jan08' 'Feb08' 'Mar08' 'Apr08' 'May08' 'Jun08' 'Jul08' 'Aug08' 'Sep08' 'Oct08' 'Nov08' 'Dec08' saleswithnull= NA2
For brevity's sake assume that your test now issues the following three LIMIT statements and then reports on the sales variable and the NAFLAG function against the sales variable. As the NAFLAG report illustrate, the value Jan08
which is a month to which an NA2 value was assigned returns the value of 2
for NAFLAG, while the NAFLAG report still returns the value of 1
for the year 2008
.
LIMIT product to 'Radios' LIMIT time TO 'Jan08' '2008' LIMIT geog TO 'Boston' 'All Places' REPORT DOWN time ACROSS geog: sales PRODUCT: Radios ----------SALES---------- ----------GEOG----------- TIME Boston All Places -------------- ------------ ------------ Jan08 NA NA 2008 NA NA REPORT DOWN time ACROSS geog: NAFLAG(sales)
PRODUCT: Radios ------NAFLAG(SALES)------ ----------GEOG----------- TIME Boston All Places -------------- ------------ ------------ Jan08 2 2 2008 1 1
Now assume that you issue the following code to remove the NA2 bits from the sales variable.
CHGDFN sales DROP NULLTRACKING
A DESCRIBE of the sales variable shows that it no longer has the WITH NULLTRACKING phrase in its definition while a report of the results of NAFLAG show that the NA values are now just the typical NA values without an NA2 bit.
DESCRIBE sales DEFINE SALES VARIABLE DECIMAL <TIME PROD_GEOG <PRODUCT GEOG>> REPORT DOWN time ACROSS geog: sales PRODUCT: Radios ----------SALES---------- ----------GEOG----------- TIME Boston All Places -------------- ------------ ------------ Jan08 NA NA 2008 NA NA "Report on the type of NA values in the sales variable REPORT DOWN time ACROSS geog: NAFLAG(sales) PRODUCT: Radios ------NAFLAG(SALES)------ ----------GEOG----------- TIME Boston All Places -------------- ------------ ------------ Jan08 1 1 2008 1 1