Previous
Previous
 
Next
Next


IMPORT (text)

You can use the IMPORT (from text) command to copy data from a text file into an Oracle OLAP worksheet object. A worksheet's rows are similar to the lines of a text file.

IMPORT is commonly used to copy text files into an analytic workspace from other software products.

Typically, you use a FILEREAD statement for text files instead of IMPORT. FILEREAD is more efficient and does not require a worksheet object or separate handling of each column of data.

Syntax

IMPORT worksheetname FROM [TEXT|STRUCTURED|RULED [RULER ruler-exp] - 

  PRN FILE file-id [STOPAFTER n] [TEXTSTART schar] [TEXTEND echar] -

  [DELIMITER dchar] [NLS_CHARSET charset-exp]

Parameters

worksheetname

A text expression that specifies the name of an Oracle OLAP worksheet object. When you have not previously defined worksheetname in your workspace, IMPORT defines it for you automatically, using the default dimensions WKSCOL and WKSROW. Any previous contents of worksheetname are overwritten. In any one IMPORT statement, you can import only one worksheetname from one text file.

FROM . . . PRN

Indicates that you want to import your Oracle OLAP worksheet from a text file.

TEXT

Imports a whole source file as-is into an Oracle OLAP worksheet on a line-by-line basis. The source file is copied into a single wide worksheet column with a data type of TEXT. The column is always column 1 of the worksheet. Each line in the source file is imported into a separate cell on a separate row in the first column, using as many rows as there are lines in the source file. A blank line in the source file produces a TEXT value with zero characters (a null) in the corresponding row of the worksheet's first column. (TEXT is the default.)

STRUCTURED

Imports a source file into a target worksheet on a cell-by-cell basis, automatically performing three functions:

  1. Each line of characters in the source file is copied into a single row of the target worksheet.

  2. Each group of characters on a line in the source file is copied into a separate TEXT cell on the target worksheet row. A group of characters is defined by two conditions: an uninterrupted (except by a decimal point) sequence of numbers, or enclosure in double quotes. Consequently, numbers containing commas to mark off thousands are split up into different cells unless the commas are first removed.

  3. Any non-numeric characters not enclosed in double quotes are ignored, except minus signs that immediately precede numbers and so are copied into the same TEXT cell along with the numbers. (Be sure there are no spaces between a minus sign and its number in the source file.)

A blank line in the source file results in an NA in the first cell of the corresponding worksheet row.

When your file format does not conform to the pattern described here, you can use the TEXTSTART, TEXTEND, and DELIMITER keywords. These arguments let you customize the delimiters IMPORT uses to identify the start and end of each field.

RULED

Indicates import of a file on a column-by-column basis into worksheet cells of various data types. Every line in the source file must follow the same pattern of data along its length as every other line in the file. You describe this data pattern to Oracle OLAP in the one-line ruler-exp using the RULER keyword. IMPORT loops over each line in the source file and copies its contents into a corresponding pattern of cells on a row of the target worksheet, one row for each line. As ruler-exp loops over successive lines in the source file, it adds row after row to the target worksheet, building vertical columns of similar cells as it goes along. Status messages are sent to the current outfile every 20 rows, starting with the message 20 rows processed.

When the source file contains records that follow several different patterns of character groups, use the less exacting options, STRUCTURED or TEXT, to import the data.

RULER ruler-exp

Used only with the RULED keyword to specify the data type, length, and repeat count of each character group in the record pattern of the source file. Ruler-exp consists of a list of character-group specifications. Each character-group specification must be separated by a comma (,), by backslash N (\n), or by a space( ). You do not have to include enough specifications to account for all the characters in the basic record pattern (or line pattern) of the source file; RULER steps to the next record as soon as it runs out of specifications on each line, regardless of how far it is from the end of the current record. Remember to enclose literal text in single quotes.

The specifications for groups of characters are of three types: T for TEXT, A for numeric (INTEGER or DECIMAL), and S for skip or ignore. The formats for these types are shown in Table: Character-Group Specifications for IMPORT from Text.

Character-Group Specifications for IMPORT from Text

Format Description

[mm]Tnn

Specifies that Oracle OLAP should copy mm groups (default = 1) of nn characters (bytes) apiece as TEXT. Specifying a group (or groups) of 0 characters leaves an empty cell(s) in the corresponding position in the worksheet. Each group may consist of up to 400098 characters. Trailing blanks are stripped.

[mm]Ann

Specifies that Oracle OLAP should copy mm groups (default = 1) of nn characters (bytes) apiece and try to convert each group to a number. When a character group cannot be converted to a number, it is copied into a TEXT cell and any trailing blanks are stripped. A valid number includes anything you can type for a GET(DECIMAL) function except NA.

Commas embedded in a number before a period (decimal point) are ignored. Consequently, multiple numbers separated only by commas or two numbers separated only by a single period are treated as parts of a single number (when you want the numbers treated separately, insert spaces between them in the source file). Leading dollar signs ($) and trailing percent signs (%) are ignored, and leading and trailing spaces are stripped. Multiple periods are treated as excess decimal points and ignored (to undo the effects of dotfill). For example,...17... is treated as though the field is 17.

Numbers preceded by a hyphen, or a hyphen and spaces, and numbers enclosed in parentheses, are treated as negative. Specifying a group (or groups) of 0 (zero) characters leaves an empty cell (or cells) in the corresponding position in the worksheet. Each group may consist of up to 4,000 characters.

[mm]Snn

Specifies that Oracle OLAP should skip or ignore mm groups of nn characters (bytes). The limit for nn is 32,767. (You would probably only use mm to expand this limit to handle a very long record.)


FILE file-id

Identifies the file you want to import. File-id is a text expression that represents the name of the file. The name must be in a standard format for a file identifier.

STOPAFTER n

Specifies that no more than n records should be read from the file. When STOPAFTER is omitted, Oracle OLAP reads the whole file.

TEXTSTART schar

The schar argument is a text expression that specifies a single character that you want Oracle OLAP to interpret as the start of a text field in a structured file. The default character is a double quote (").

TEXTEND echar

The echar argument is a text expression that specifies a single character that you want Oracle OLAP to interpret as the end of a text field in a structured file. The default character is a double quote (").

DELIMITER dchar

The dchar argument is a text expression that specifies a single character that you want Oracle OLAP to interpret as the general field delimiter in a structured file. Oracle OLAP uses the general field delimiter to identify both numeric and text fields. The default character is a comma (,).

NLS_CHARSET charset-exp

Specifies the character set that Oracle OLAP uses when importing text data from the file specified by file-id which allows Oracle OLAP to convert the data accurately from that character set. This argument must be the last one specified. When this argument is omitted, and Oracle OLAP cannot determine the character set from the file itself, then Oracle OLAP imports the data in the database character set, which is recorded in the NLS_LANG option.

Usage Notes

WKSROW and WKSCOL Dimensions

The WKSROW (the default worksheet row) dimension of an Oracle OLAP worksheet object corresponds to the lines of a text file. The WKSCOL (the default worksheet column) dimension of a worksheet divides each row into cells that can be used to separate data types when there are potentially several types on each line of the source file. WKSROW and WKSCOL are INTEGER dimensions with values of 1, 2, 3, and so on.

Minimum Worksheet Size

Oracle OLAP sets up a minimum-size worksheet that is 63 cells square, regardless of whether or not all the cells are used. When the source text file requires an Oracle OLAP worksheet larger than the minimum, IMPORT automatically increases the dimension values of WKSCOL and WKSROW as needed.

Importing Numbers

When importing a number from a text file, IMPORT gives it an INTEGER data type.

File Transfer to Another Computer

When the file you are importing originated on another computer, ensure that its character set is appropriate. When you transfer a text file to another computer, the communications program handling the transfer makes any necessary character translations; for example, from ASCII to EBCDIC. Set the parameters of the transfer program so that the resulting file is in the correct character set for the receiving computer.

Examples

Importing Columns Without the RULER Keyword

Suppose you have a file named abctxt in your current directory. It has 10 five-digit groups of INTEGER values, followed by a group of 20 characters of text. To import this file into an Oracle OLAP worksheet called sheet1, use the following statement.

IMPORT sheet1 FROM RULED PRN FILE 'abctxt' ruler '10a5, t20'

The actual format for the file name must follow the conventions for your operating system.

Importing Columns with the RULER Keyword

Suppose a file called mix has no line delimiters, with records containing 100 characters apiece. Each record has the character distribution illustrated in the following table.

Character Content
1 - 10 To be ignored
11 - 17 Decimal number
18 - 28 To be ignored
29 - 30 Two single-character code
31 - 35 Integer
36 - 100 To be ignored

To import this file into an Oracle OLAP worksheet called sheet2, use the following statement.

DEFINE sheet2 WORKSHEET temp
IMPORT sheet2 FROM RULED PRN FILE 'mix' RULER -
   's10, a7, s11, 2t1, a5'