Universally unique identifier

A universally unique identifier (UUID) is a 128-bit number used to identify information in computer systems. The term globally unique identifier (GUID) is also used, typically in software created by Microsoft.

History
In the 1980s, Apollo Computer originally used UUIDs in the Network Computing System (NCS). Later, the Open Software Foundation (OSF) used UUIDs for their Distributed Computing Environment (DCE). The design of the DCE UUIDs was partly based on the NCS UUIDs, whose design was in turn inspired by the (64-bit) unique identifiers defined and used pervasively in Domain/OS, an operating system designed by Apollo Computer. Later, in the early 1990s, the Microsoft Windows platforms adopted the DCE design as "Globally Unique IDentifiers" (GUIDs).
Standards
UUIDs were standardized by various bodies, starting with the Open Software Foundation in 1996–1997, as part of the Distributed Computing Environment (DCE). The definition was also documented in 1996 as part of ISO/IEC 11578:1996 "Information technology – Open Systems Interconnection – Remote Procedure Call". In July 2005, the Internet Engineering Task Force (IETF) published the Standards-Track RFC 4122, which also registered a URN namespace for UUIDs. The ITU had also standardized UUIDs, based on the previous standards and early versions of RFC 4122, in ITU-T Rec. X.667 | ISO/IEC 9834-8, which is technically equivalent to RFC 4122. In May 2024, RFC 9562 was published, introducing three new "versions", clarifying some ambiguities, and superseding RFC 4122. RFC 9562 covers only "variant 1" of the RFC 4122 UUID definition; the other variants are out of its scope.
Format
A UUID is a 128-bit number. The meaning of the bits is determined by the variant, of which four are defined. The two most common variants further define eight versions.
Variants
The variant field occupies a variable number of the most-significant bits of the ninth byte. It indicates the format of the UUID. The following variants are defined:
• The Apollo NCS variant 0 (indicated by the one-bit pattern 0xxx₂) is for backwards compatibility with the now-obsolete Apollo Network Computing System 1.5 UUID format developed around 1988. The variant field of current UUIDs overlaps the address family octet in NCS UUIDs in such a way that any NCS UUIDs still in use have a 0 in the first bit of the variant field.
• The OSF DCE variant 1 (10₂) UUIDs are referred to as RFC 4122/DCE 1.1 UUIDs, or "Leach–Salz" UUIDs, after the authors of the original Internet Draft.
• The Microsoft COM/DCOM variant 2 (110₂) is characterized in the RFC as "reserved, Microsoft Corporation backward compatibility" and was used for early GUIDs on the Microsoft Windows platform. The main difference between this variant and variant 1, aside from the extra variant bit, is byte-ordering within the UUID. Current Microsoft tools do not generate this variant. Also, RFC 9562, which added versions 6, 7, and 8, states that the variants other than variant 1 are out of its scope, though this is unlikely to result in interoperability problems in practice. The versions applicable to the legacy Microsoft variant 2 are therefore somewhat unclear, but likely include only versions 1, 3, and 4.
• Variant 3 (111₂) is reserved.
Versions
The OSF DCE and Microsoft COM/DCOM variants (1 and 2) have versions, indicated by the value of the high 4 bits of the 7th byte of the UUID. In textual representations of the UUID, this is the character after the second hyphen.
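As a minimal sketch of the bit layout described above, the following Python reads the variant from the top bits of the ninth byte and the version from the high nibble of the seventh byte. The function name `variant_and_version` is illustrative only; the standard `uuid` module is used just to parse the text form.

```python
import uuid

def variant_and_version(text):
    """Return (variant, version) read directly from the raw 16 bytes."""
    raw = uuid.UUID(text).bytes
    ninth = raw[8]              # the variant lives in the top bits of the 9th byte
    if ninth & 0x80 == 0x00:
        variant = 0             # 0xxx: Apollo NCS
    elif ninth & 0xC0 == 0x80:
        variant = 1             # 10xx: OSF DCE (RFC 4122 / RFC 9562)
    elif ninth & 0xE0 == 0xC0:
        variant = 2             # 110x: Microsoft COM/DCOM
    else:
        variant = 3             # 111x: reserved
    version = raw[6] >> 4       # high 4 bits of the 7th byte
    return variant, version

print(variant_and_version("550e8400-e29b-41d4-a716-446655440000"))  # (1, 4)
```

The version is only meaningful for variants 1 and 2; for the other variants the same nibble is simply part of the payload.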
Versions 1 and 6 (date-time and MAC address)
Version 1 concatenates the 48-bit MAC address of the "node" (that is, the computer generating the UUID) with a 60-bit timestamp. On systems with 64-bit EUI-64 "MAC addresses", the least significant 48 bits are used. A 48-bit random number may also be used in place of the MAC address. The timestamp is the number of 100-nanosecond intervals since midnight 15 October 1582 Coordinated Universal Time (UTC), the date on which the Gregorian calendar was first adopted. RFC 4122 states that the time value rolls over around 3400 AD.
The uniqueness of UUIDs based on network-card MAC addresses also depends on network-card manufacturers properly assigning unique MAC addresses to their cards, which like other manufacturing processes is subject to error. Also, MAC addresses may not come from network cards. For example, virtual machines receive a MAC address from a range that is configurable in the hypervisor, and some operating systems, notably OpenWrt, permit the end user to customise the MAC address. When a device has an EUI-64 64-bit "MAC address", using the least significant 48 bits of it, as recommended by the RFC, may result in the node ID part of the UUID being duplicated. Thus, node IDs based on MAC addresses may not be globally unique.
Using the node's network card MAC address for the node ID often means that version 1, 2, and 6 UUIDs can be tracked back to the computer that created them. Documents can sometimes be traced to the computers where they were created or edited through UUIDs embedded into them by word processing software. This privacy hole was used when locating the creator of the Melissa virus.
RFC 9562 does allow the MAC address in a version 1, 2 or 6 UUID to be replaced by a random 48-bit node ID, either because the node does not have a MAC address, or because it is not desirable to include it. In that case, the RFC requires that the least significant bit of the first octet of the node ID should be set to 1. This corresponds to the multicast bit in MAC addresses, and setting it serves to differentiate UUIDs where the node ID is randomly generated from UUIDs based on MAC addresses from network cards, which typically have unicast MAC addresses.
Versions 3 and 5 (namespace name-based)
Version 3 and version 5 UUIDs are generated by hashing a namespace identifier and name. Version 3 uses MD5 as the hashing algorithm, and version 5 uses SHA-1.
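For the version 1 layout described above, the sketch below generates a UUID with a random node ID whose multicast bit is set, then decodes the timestamp back into a calendar date. It assumes Python's standard `uuid` module; the helper names `v1_timestamp` and `node_is_random` are hypothetical and exist only for this illustration.

```python
import secrets
import uuid
from datetime import datetime, timedelta, timezone

# Start of the version 1 timestamp: midnight, 15 October 1582 UTC.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)

def v1_timestamp(u):
    """Convert the 60-bit count of 100 ns intervals to a UTC datetime."""
    # timedelta has microsecond resolution, so the last decimal digit is dropped.
    return GREGORIAN_EPOCH + timedelta(microseconds=u.time // 10)

def node_is_random(u):
    """True if the multicast bit (LSB of the first node octet) is set."""
    return bool((u.node >> 40) & 1)

# Random 48-bit node ID with the multicast bit forced to 1.
random_node = secrets.randbits(48) | (1 << 40)
u = uuid.uuid1(node=random_node)

print(u, v1_timestamp(u), node_is_random(u))
```

A set multicast bit only suggests that the node ID was not taken from a network card; it is a convention, not a guarantee.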
Special values
The "nil" UUID is 00000000-0000-0000-0000-000000000000 (that is, all clear bits), which can be useful to express the concept of "no such value". The "max" UUID, sometimes also called the "omni" UUID, is FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF (that is, all set bits), which is reserved for expressing "end of UUID list".
Encoding
Binary representation
Initially, Apollo Computer designed the UUID with a wire format very similar to version 1. Later, the UUID was extended by overlapping the new variant field with the legacy family field. The highest family field value defined had been 13, meaning that the most significant bit of the field in a legacy UUID had always been 0. Variant 0 would therefore correspond to legacy UUIDs. The legacy Apollo NCS UUID uses this original format. The OSF DCE UUID variant is described in RFC 9562. The Microsoft COM/DCOM UUID has its variant described in the Microsoft documentation.
Endianness
When UUIDs are saved in binary format, they are encoded sequentially in big-endian order. For example, 00112233-4455-6677-8899-aabbccddeeff, a variant 1 UUID, is encoded as the bytes 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff. An exception is Microsoft's variant 2 UUIDs ("GUIDs"): historically used in COM/OLE libraries, they use a little-endian format, but appear mixed-endian, with the first three components of the UUID little-endian and the last two big-endian. Microsoft's GUID structure defines the last eight bytes as an 8-byte array, which is serialized in ascending order, and this makes the byte representation appear mixed-endian. For example, variant 2 UUID 00112233-4455-6677-8899-ccddeeffaabb is encoded as the bytes 33 22 11 00 55 44 77 66 88 99 cc dd ee ff aa bb.
Textual representation
In most cases, UUIDs are represented as hexadecimal values separated by hyphens. The most common form is the 8-4-4-4-12 format, a string of 32 hexadecimal digits with four hyphens: xxxxxxxx-xxxx-vxxx-wxxx-xxxxxxxxxxxx. The hyphens separate the version 1 fields, but the same format is commonly used for all versions. Every hexadecimal digit represents 4 bits; v represents the version nibble; and the high-order one to three bits of w are the variant. The Windows registry format is the same but wraps the UUID in {} braces.
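The two byte orders described under Endianness, and the braced registry form, can be reproduced with Python's standard `uuid` module, whose `bytes` attribute gives the big-endian encoding and whose `bytes_le` attribute gives the mixed-endian GUID layout. This is a small sketch for illustration; the variable names are arbitrary.

```python
import uuid

u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

big = u.bytes        # sequential big-endian encoding
mixed = u.bytes_le   # first three fields byte-swapped, last eight bytes unchanged

print(big.hex(" "))    # 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff
print(mixed.hex(" "))  # 33 22 11 00 55 44 77 66 88 99 aa bb cc dd ee ff

# Both encodings decode back to the same UUID when read with the matching keyword.
same_from_big = uuid.UUID(bytes=big)
same_from_mixed = uuid.UUID(bytes_le=mixed)

# Windows registry style: the usual 8-4-4-4-12 text wrapped in braces.
braced = "{" + str(u) + "}"
print(braced)
```

Reading mixed-endian bytes with the big-endian decoder (or the reverse) silently yields a different UUID, so the byte order has to be agreed on out of band.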
The format with hyphens was introduced with the newer variant system, though the hyphens are still occasionally omitted. Before that, the legacy Apollo format used a slightly different notation: 34dc23469000.0d.00.00.7c.5f.00.00.00. The first part is the time (time_high and time_low combined). The reserved field is skipped. The family field comes directly after the first dot, in this case 0d (13 in decimal) for DDS (Data Distribution Service). The remaining parts, each separated with a dot, are the node bytes.
Lowercase hexadecimal digits are preferred. ITU-T Rec. X.667 requires lowercase on generation, but also requires the uppercase version to be accepted on input. Since UUIDs are 128-bit numbers, other formats are possible, and occasionally seen, such as decimal digits or binary.
RFC 9562 registers the "uuid" namespace. This makes it possible to make URNs out of UUIDs, like urn:uuid:550e8400-e29b-41d4-a716-446655440000. The normal 8-4-4-4-12 format is used for this. It is also possible to make an OID URN out of a UUID, like urn:oid:2.25.113059749145936325402354257176981405696. In that case, the unsigned decimal format is used. The "uuid" URN is recommended over the "oid" URN.
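The two URN forms mentioned above can be derived from the same 128-bit value. The sketch below assumes Python's standard `uuid` module: its `urn` attribute produces the "uuid" URN, and the "oid" URN is built by hand from the unsigned decimal value under the 2.25 arc.

```python
import uuid

u = uuid.UUID("550e8400-e29b-41d4-a716-446655440000")

uuid_urn = u.urn                    # 8-4-4-4-12 text with a urn:uuid: prefix
oid_urn = f"urn:oid:2.25.{u.int}"   # unsigned decimal form of the same 128 bits

print(uuid_urn)
print(oid_urn)

# Uppercase input is accepted, while generated text is lowercase.
from_upper = uuid.UUID("550E8400-E29B-41D4-A716-446655440000")
print(from_upper == u, str(from_upper))
```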
Collisions
A collision occurs when the same UUID is generated more than once and is assigned to different referents. In the case of standard version 1, 2, or 6 and some version 7 UUIDs using unique MAC addresses and/or timestamps, collisions can occur only as a result of error, such as manufacturing problems, skewed clocks, or software bugs. In contrast, with UUID versions generated using processes such as random number generation or hashing, collisions can occur without error, due to chance. The probability of this is normally so small that it can be ignored, and it can be computed precisely based on analysis of the birthday problem. For example, the number of random version 4 UUIDs which need to be generated in order to have a 50% probability of at least one collision is 2.71 quintillion, computed as follows:
$$ n \approx \frac{1}{2} + \sqrt{\frac{1}{4} + \ln(2) \times 2^{123}} \approx 2.71 \times 10^{18}. $$
This number would be equivalent to generating 1 billion UUIDs per second for about 86 years. A file containing this many UUIDs, at 16 bytes per UUID, would be about 43.4 exabytes (37.7 EiB). The smallest number of version 4 UUIDs which must be generated for the probability of finding at least one collision to be p is approximated by the formula
$$ \sqrt{2^{123} \times \ln\frac{1}{1 - p}}. $$
Thus, the probability of finding a duplicate within 103 trillion properly generated version 4 UUIDs is one in a billion.
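The figures quoted above can be checked numerically. The following sketch evaluates the approximation for a given probability p; the function name is hypothetical, and the factor 2^123 is 2 × 2^122, reflecting the 122 random bits left in a version 4 UUID after the version and variant bits are fixed.

```python
import math

def v4_count_for_collision_probability(p):
    """Approximate count of version 4 UUIDs giving collision probability p."""
    # -log1p(-p) equals ln(1 / (1 - p)) but stays accurate for very small p.
    log_term = -math.log1p(-p)
    return math.sqrt(2.0 ** 123 * log_term)

half = v4_count_for_collision_probability(0.5)
rare = v4_count_for_collision_probability(1e-9)

seconds_per_year = 365.25 * 24 * 3600
print(f"50% collision chance: {half:.3e} UUIDs")
print(f"  at 1e9 per second:  {half / 1e9 / seconds_per_year:.0f} years")
print(f"  at 16 bytes each:   {half * 16 / 1e18:.1f} EB")
print(f"one-in-a-billion:     {rare:.3e} UUIDs")
```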
Uses
Filesystems
Several filesystem types (for example, ext4 and Btrfs) use a UUID to uniquely identify each filesystem to the operating system. (NTFS and FAT32 do not, utilising a shorter UID (unique identifier) instead.) Filesystem userspace tools, most of which are derived from the original implementation by Theodore Ts'o, therefore make use of UUIDs. An /etc/fstab file might assign mount points based on these UUIDs (or a UID for a FAT32 EFI system partition (ESP)):
    # device-uuid                              mount-point  fs-type  options   dump  pass
    UUID=b18e3b6c-ccb7-4308-b527-35e5e6ee2145  /            btrfs    defaults  0     0
    UUID=103C-86D6                             /efi         vfat     utf8      0     2
    UUID=64f3cb6a-e70e-45e5-8b90-d86cddbab7bb  swap         swap     defaults  0     0
    UUID=eda746c6-1f1b-4cf1-9225-d8b0b46511cc  /mnt/Stuff   btrfs    defaults  0     0
Partition tables
The GUID Partition Table (GPT) uses UUIDs (called "GUIDs" there) to identify partitions and partition types. Unique partition IDs are assigned locally by the operating system. Partition type IDs are well-known numbers, usually assigned by operating-system or hardware vendors.
Microsoft COM
There are several flavors of GUIDs used in Microsoft's Component Object Model (COM):
• IID – interface identifier (those registered on a system are stored in the Windows Registry)
• CLSID – class identifier. In practice it is not entirely separate from the IID space, because remoting the interface can require a proxy/stub object, which some toolsets used to create with a CLSID equal to the interface's IID.
• LIBID – type library identifier
• CATID – category identifier (its presence on a class identifies it as belonging to certain class categories)
Databases
UUIDs are commonly used as a unique key in database tables. The NEWID function in Microsoft SQL Server version 4 Transact-SQL returns standard random version 4 UUIDs, while the NEWSEQUENTIALID function returns 128-bit identifiers similar to UUIDs which are committed to ascend in sequence until the next system reboot.
The Oracle Database SYS_GUID function does not return a standard GUID, despite the name. Instead, it returns a 16-byte 128-bit RAW value based on a host identifier and a process or thread identifier, somewhat similar to a GUID. PostgreSQL contains a UUID datatype and can generate most versions of UUIDs through the use of functions from modules. MySQL provides a UUID function, which generates standard version 1 UUIDs.
Combined Time-GUID
The random nature of standard UUIDs of versions 3, 4, and 5, and the ordering of the fields within standard versions 1 and 2, may create problems with database locality or performance when UUIDs are used as primary keys. For example, in 2002 Jimmy Nilsson reported a significant improvement in performance with Microsoft SQL Server when the version 4 UUIDs being used as keys were modified to include a non-random suffix based on system time. This so-called "COMB" (combined time-GUID) approach made the UUIDs significantly more likely to be duplicated, as Nilsson acknowledged, but Nilsson only required uniqueness within the application. Reordering and encoding version 1 and 2 UUIDs so that the timestamp comes first avoids the loss of insertion performance. COMB-like arrangements of UUID payloads were eventually standardized in RFC 9562 as versions 6 and 7.
Other examples
UEFI and ACPI also use GUIDs.