===Accounts===
Systems are made of one or more accounts. Accounts are directories stored on the host operating system that initially contain the set of files needed for the system to function properly. This includes the system's VOC (vocabulary) file, which contains all commands, filenames, keywords, aliases, scripts, and other pointers. Each of these classes of VOC entries can also be created by a user.
===Files===
Files are similar to tables in a relational database in that each file has a unique name to distinguish it from other files and holds zero or more uniquely identified records that are logically related to each other. Files are made of two parts: a data file and a file dictionary (DICT). The data file contains records that store the actual data. The file dictionary may contain metadata that describes the contents of the file or controls how those contents are output.
====Hashed files====
For hashed files, a U2 system uses a hashing algorithm to allocate the file's records into groups based on the record IDs. When searching for data in a hashed file, the system searches only the group where the record ID is stored, which is faster than searching through the whole file.
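The group lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ToyHashedFile`, `group_number`, the use of CRC-32 as the hash, and the modulo of 7 are all hypothetical choices made for the example, not the hashing algorithms or file layout that U2 actually uses.

```python
import zlib
from typing import List, Optional, Tuple


def group_number(record_id: str, modulo: int) -> int:
    """Map a record ID to a group (hypothetical hash: CRC-32 modulo the group count)."""
    return zlib.crc32(record_id.encode("ascii")) % modulo


class ToyHashedFile:
    """In-memory stand-in for a hashed file: a fixed number of groups of records."""

    def __init__(self, modulo: int = 7) -> None:
        self.modulo = modulo
        self.groups: List[List[Tuple[str, str]]] = [[] for _ in range(modulo)]

    def write(self, record_id: str, record: str) -> None:
        group = self.groups[group_number(record_id, self.modulo)]
        for i, (existing_id, _) in enumerate(group):
            if existing_id == record_id:
                group[i] = (record_id, record)  # overwrite an existing record
                return
        group.append((record_id, record))

    def read(self, record_id: str) -> Optional[str]:
        # Only the one group selected by the hash is scanned, never the whole file.
        group = self.groups[group_number(record_id, self.modulo)]
        for existing_id, record in group:
            if existing_id == record_id:
                return record
        return None
```

Reading a record costs one hash computation plus a scan of a single group, so lookup time depends on the size of that group rather than on the total number of records in the file.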
====Nonhashed files====
Nonhashed files are used to store data with little or no logical structure, such as program source code, XML, or plain text. This type of file is stored as a subdirectory within the account directory on the host operating system and may be read or edited using appropriate tools.
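Because a nonhashed file is an ordinary subdirectory, its contents can be reached with general-purpose tools. The sketch below assumes that each record is stored as one operating-system file named after its record ID; that layout, the function names, and the example paths are assumptions made for illustration.

```python
from pathlib import Path
from typing import List


def list_record_ids(account_dir: str, file_name: str) -> List[str]:
    """Return the names of the OS files inside the nonhashed file's subdirectory."""
    directory = Path(account_dir) / file_name
    return sorted(p.name for p in directory.iterdir() if p.is_file())


def read_record(account_dir: str, file_name: str, record_id: str) -> str:
    """Read one record as text (assumes one OS file per record, named by record ID)."""
    return (Path(account_dir) / file_name / record_id).read_text(encoding="ascii")
```

For example, `read_record("/path/to/account", "BP", "HELLO")` would return the text of a hypothetical program source record `HELLO` in a hypothetical nonhashed file `BP`.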
===Records===
Files are made of records, which are similar to rows within tables of a relational database. Each record has a unique key (called a "record ID") to distinguish it from other records in the file. These record IDs are typically hashed so that data can be retrieved quickly and efficiently.

Records (including record IDs) store the actual data as pure ASCII strings; there is no binary data stored in U2. For example, the hardware representation of a floating-point number would be converted to its ASCII equivalent before being stored.

Usually these records are divided into fields (which are sometimes called "attributes" in U2). Each field is separated by a "field mark" (hexadecimal character FE). Thus this string:
: 123-45-6789^JOHN JONES^jjones@example.com^$4321.00
might represent a record in the EMPLOYEE file with 123-45-6789 as the record ID, JOHN JONES as the first field, jjones@example.com as the second field and $4321.00 as a monthly salary stored in the third field. (The up-arrow (^) above is the standard Pick notation for a field mark; that is, xFE.) Thus the record ID and the first three fields of this record, together with the field mark that follows each of them, would use 51 bytes of storage. A given value uses only as many bytes as needed. For example, in another record of the same file, JOHN JONES (10 bytes) may be replaced by MARJORIE Q. HUMPERDINK (22 bytes), yet each name uses only as much storage as it needs, plus one byte for the field mark.

Fields may be broken down into values and even subvalues. Values are separated by value marks (character xFD); subvalues are separated by subvalue marks (character xFC). Thus, if John Jones happened to get a second email address, the record may be updated to:
: 123-45-6789^JOHN JONES^jjones@example.com](second email address)^$4321.00
where the close bracket (]) represents a value mark. Because a single field can hold several values, each of which can be the ID of a record in a separate file (in SQL terms, an outer join; in U2 terms, a "translate"), U2 may be classified as a MultiValued database.
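The delimiter scheme can be illustrated with a short Python sketch that splits a raw record string on the field, value, and subvalue marks (xFE, xFD, xFC). The helper names are hypothetical, the Pick-notation conversion covers only the `^` and `]` symbols mentioned in the text, and the second address `jj@example.net` used in the usage note is invented purely for the example.

```python
from typing import List, Tuple

FM = "\xfe"  # field mark
VM = "\xfd"  # value mark
SM = "\xfc"  # subvalue mark


def from_pick_notation(text: str) -> str:
    """Turn printable Pick notation (^ and ]) into the real mark characters."""
    return text.replace("^", FM).replace("]", VM)


def parse_record(raw: str) -> Tuple[str, List[List[List[str]]]]:
    """Split a raw string into (record ID, fields -> values -> subvalues)."""
    record_id, *fields = raw.split(FM)
    nested = [[value.split(SM) for value in field.split(VM)] for field in fields]
    return record_id, nested


def stored_size(raw: str) -> int:
    """Bytes used by the string plus one trailing field mark, at one byte per character."""
    return len((raw + FM).encode("latin-1"))
```

With these helpers, `parse_record(from_pick_notation("123-45-6789^JOHN JONES^jjones@example.com]jj@example.net^$4321.00"))` yields the record ID followed by three fields, the second of which contains two values. No schema is consulted during parsing: the structure of the record is carried entirely by the mark characters embedded in the string.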
===Data===
Raw information is called data. A record is a set of logically grouped data; for example, an employee record stores data in fields (attributes) such as the employee's name and address.

==Programmability==