Main Types of Data
There are three broad types of data: structured, semi-structured and unstructured. Data may have the following characteristics:
- Primary data is from an original source, such as a weighing scale.
- Secondary data comes from a secondary source, such as a report that interprets the original data.
- Qualitative data is subjective in nature.
- Quantitative data is a numerical value such as a score.
- Discrete data takes distinct, countable values, such as a whole-number count.
- Continuous data can take any value within a range, such as a measurement that is rounded when it is recorded.
Actian’s Types of Data
In this article, we will focus on data types that Actian’s database management systems (DBMSs) can access. These fall into the following five categories:
- Character
- Numeric
- Date and Time
- Abstract
- Boolean
Character Data
Character data types are strings of ASCII characters, both printable and non-printable. Uppercase and lowercase alphabetic characters are accepted literally. Character data can be of fixed or variable length. A variable-length column carries extra storage overhead compared with a fixed-length column because a length specifier must be stored with each value. If a column can contain a null value, an additional byte is used to store a null indicator.
Spaces in character strings are treated as part of the string. A value stored in a fixed-length column such as CHAR(4) is padded with trailing spaces, so ABC is stored as “ABC ”. Leading and trailing blanks are significant when comparing values.
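The padding behavior described above can be sketched in Python. This only illustrates the semantics in the text, not how the database stores the value internally. The helper name pad_char is hypothetical.

```python
def pad_char(value: str, width: int) -> str:
    """Mimic a CHAR(width) column: pad with trailing spaces to a fixed width."""
    if len(value) > width:
        raise ValueError(f"{value!r} does not fit in CHAR({width})")
    return value.ljust(width)


stored = pad_char("ABC", 4)
print(repr(stored))      # 'ABC '
print(len(stored))       # 4
print(stored == "ABC")   # False: in this sketch the trailing blank counts
print(" ABC" == "ABC ")  # False: a leading blank differs from a trailing one
```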
As with fixed-length CHAR strings, variable-length (VARCHAR) strings can contain any character, including non-printing characters, except the ASCII null character, which occupies an additional byte if allowed. Blank characters are significant when stored or compared. The Actian Data Platform uses the NCHAR and NVARCHAR data types to store UTF-8 encoded characters.
JSON Data
An example of a semi-structured data type is JSON. JSON does not use its own data type. Instead, JSON values are stored in any string column, such as CHAR, VARCHAR, NCHAR, or NVARCHAR. A value can be a scalar, an array, or a JSON object.
A JSON object is a comma-separated list of key:value pairs surrounded by curly braces {}.
A key must be a double-quoted string. A value can be any JSON value, including a JSON object or JSON array, but it cannot be blank. Whitespace in a JSON object string is ignored except within the double quotes of a string.
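The JSON rules above can be checked with Python's standard json module. The sample text and its field names are hypothetical. The sketch shows only how a JSON string parses, not how an Actian database queries it.

```python
import json

# Hypothetical JSON text as it might be held in a VARCHAR column.
# Extra whitespace between tokens is ignored by the parser.
text = '{ "name": "scale 7",   "readings": [70.5, 71], "meta": {"ok": true} }'

doc = json.loads(text)
print(doc["name"])        # scale 7 (whitespace inside the quotes is kept)
print(doc["readings"])    # [70.5, 71]
print(doc["meta"]["ok"])  # True

# A key that is not a double-quoted string is rejected.
try:
    json.loads('{name: "scale 7"}')
except json.JSONDecodeError as err:
    print("rejected:", err.msg)
```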
XML and JSON semi-structured data strings are stored as variable-length strings.
Numeric Data
Integer Data Types
Four integer data types are used to hold whole numbers. The more bytes a data type uses, the larger the number it can hold. The four integer types that the Actian Data Platform uses are:
- INTEGER1 or TINYINT (one-byte)
- INTEGER2 or SMALLINT (two-byte)
- INTEGER4 or INTEGER (four-byte)
- INTEGER8 or BIGINT (eight-byte)
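To make the link between byte width and range concrete, the following sketch computes the range of each type. It assumes each type is a signed two's-complement integer, which the text above does not state. The helper name signed_range is hypothetical.

```python
# Assumption: every type is a signed two's-complement integer.
INTEGER_TYPES = {"TINYINT": 1, "SMALLINT": 2, "INTEGER": 4, "BIGINT": 8}


def signed_range(num_bytes: int) -> tuple:
    """Return (lowest, highest) value for a signed integer of num_bytes bytes."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, size in INTEGER_TYPES.items():
    low, high = signed_range(size)
    print(f"{name:<8} {size} byte(s): {low} to {high}")
```

Under that assumption, a one-byte value spans -128 to 127 and a four-byte value spans -2147483648 to 2147483647.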
Decimal Data
The decimal data type stores fractional numbers by specifying the total number of digits (the precision) and the number of decimal places (the scale). For example, DECIMAL(20,5) stores a number with 20 digits of precision, 5 of which are to the right of the decimal point.
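A minimal sketch of the precision and scale rule, using Python's decimal module. The helper name fits_decimal is hypothetical, and rounding half up to the declared scale is an assumption, not documented database behavior.

```python
from decimal import Decimal, ROUND_HALF_UP


def fits_decimal(value: str, precision: int, scale: int) -> bool:
    """Return True if value, held at `scale` decimal places,
    needs at most `precision` digits in total."""
    quantum = Decimal(1).scaleb(-scale)  # e.g. 0.00001 for scale 5
    # Assumption: excess fractional digits are rounded half up.
    held = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    return len(held.as_tuple().digits) <= precision


# 15 digits before the point plus 5 after it: exactly 20 digits.
print(fits_decimal("123456789012345.12345", 20, 5))  # True
# 16 digits before the point leaves no room for 5 decimal places.
print(fits_decimal("1234567890123456.1", 20, 5))     # False
```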
Floating Point Data Type
Floating-point values can be stored as FLOAT4, which uses four bytes, or FLOAT8, which uses eight bytes. The exact precision of four-byte numbers is processor dependent. Internally, eight-byte numbers are rounded to fifteen decimal digits.
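The difference between four-byte and eight-byte storage can be illustrated with Python's struct module. This sketch assumes IEEE 754 encodings, which the text above does not confirm for any particular platform.

```python
import struct

value = 0.1

# Round-trip the value through a 4-byte and an 8-byte encoding.
# Assumption: both follow IEEE 754 (struct formats "f" and "d").
as_float4 = struct.unpack("<f", struct.pack("<f", value))[0]
as_float8 = struct.unpack("<d", struct.pack("<d", value))[0]

print(len(struct.pack("<f", value)), len(struct.pack("<d", value)))  # 4 8
print(as_float4 == value)  # False: precision is lost in four bytes
print(as_float8 == value)  # True

# Showing an eight-byte value to fifteen significant digits.
print(f"{1 / 3:.15g}")     # 0.333333333333333
```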
Money Data Type
MONEY is an example of an abstract data type. Stored values are rounded to 2 decimal places. Values must be in the range of $-999,999,999,999.99 to $999,999,999,999.99. The currency symbol is optional.
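The rounding and range rules for MONEY can be sketched as follows. The helper name to_money is hypothetical, and the half-up rounding mode is an assumption. The limits come from the range stated above.

```python
from decimal import Decimal, ROUND_HALF_UP

MONEY_MAX = Decimal("999999999999.99")
MONEY_MIN = -MONEY_MAX


def to_money(text: str) -> Decimal:
    """Parse text with an optional currency symbol, round to 2 places,
    and check the result against the MONEY range."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    # Assumption: values round half up to two decimal places.
    amount = Decimal(cleaned).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    if not MONEY_MIN <= amount <= MONEY_MAX:
        raise ValueError(f"{text!r} is outside the MONEY range")
    return amount


print(to_money("$1,234.565"))  # 1234.57
print(to_money("19.999"))      # 20.00
```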
Date and Time Data
Timestamp Data Type
The TIMESTAMP data type is used to record when events happen. It consists of a date and time, with an optional time zone. For example, TIMESTAMP(5) WITH TIME ZONE could look like this:
2023-12-20 09:30:55.12345-08:00, where the -08:00 offset corresponds to Pacific Standard Time.
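As an illustration of the parts of such a value, the sketch below parses a timestamp with five fractional digits and a UTC offset using Python's datetime module. The sample value is hypothetical, and the format string is Python's, not the database's.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical value: date, time with five fractional digits, UTC offset.
text = "2023-12-20 09:30:55.12345-08:00"
ts = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f%z")

print(ts.microsecond)                          # 123450
print(ts.utcoffset() == timedelta(hours=-8))   # True
print(ts.astimezone(timezone.utc).isoformat()) # 2023-12-20T17:30:55.123450+00:00
```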
Abstract Data
Boolean Data Type
BOOLEAN columns contain the literal values TRUE or FALSE, which are represented internally as 1 and 0, respectively.
IP Network Address Data Type
An abstract data type for IPv4 and IPv6 addresses is very useful when storing and manipulating weblogs. An IPv4 address might look like 176.12.254.1. The newer IPv6 has a far larger address space, so an address is written as eight groups of hexadecimal digits, as in the following format: 2101:0cb8:8ca3:0d42:1900:8d2e:0e70:7734.
Using the IPv4 and IPv6 data types provides input error checking and supports specialized operators and functions.
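The benefit of a dedicated address type over a plain string can be illustrated with Python's standard ipaddress module. This is an analogy for the input checking and specialized operations mentioned above, not the database's own operators. The network 176.12.0.0/16 is a made-up example.

```python
import ipaddress

v4 = ipaddress.ip_address("176.12.254.1")
v6 = ipaddress.ip_address("2101:0cb8:8ca3:0d42:1900:8d2e:0e70:7734")
print(v4.version, v6.version)  # 4 6
print(v6.compressed)           # 2101:cb8:8ca3:d42:1900:8d2e:e70:7734

# Input error checking: an octet above 255 is rejected.
try:
    ipaddress.ip_address("176.12.254.300")
except ValueError as err:
    print("rejected:", err)

# A specialized operation: network membership.
print(v4 in ipaddress.ip_network("176.12.0.0/16"))  # True
```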
Universal Unique Identifier (UUID)
A Universal Unique Identifier (UUID) is a 128-bit unique identifier that is generated by the local system upon request or loaded from external sources. UUIDs are suitable for reliably identifying persistent objects across a network or for generating unique values such as transaction IDs.
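Both ways of obtaining a UUID, generating one locally and loading one from an external string, can be sketched with Python's uuid module. Using a random (version 4) UUID is an assumption made for the example.

```python
import uuid

# Generated by the local system on request (random, version 4).
generated = uuid.uuid4()
print(len(generated.bytes) * 8)  # 128

# Loaded from an external source as text, then compared.
loaded = uuid.UUID(str(generated))
print(loaded == generated)       # True
```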
Geospatial Data
The Ingres Transactional Database provides deep support for geospatial data types. All spatial data types store features using the Well-Known Binary (WKB) format, a specification of the Open Geospatial Consortium (OGC).
2D data types exist in a two-dimensional coordinate space represented by X (longitude) and Y (latitude) coordinates. Examples include geometries and line strings. 3D data types add a third coordinate, Z, to form an X, Y, and Z coordinate space. 4D data adds a fourth, application-dependent dimension to a 3D coordinate.
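To show what a WKB value looks like, the sketch below encodes and decodes a 2D point by hand. The layout follows the OGC WKB definition of a point: one byte-order byte, a four-byte geometry type, then two eight-byte doubles. The function names and coordinates are hypothetical, and the sketch does not cover how the database exposes these values.

```python
import struct

WKB_LITTLE_ENDIAN = 1  # byte-order marker
WKB_POINT = 1          # geometry type code for a 2D point


def point_to_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    return struct.pack("<BIdd", WKB_LITTLE_ENDIAN, WKB_POINT, x, y)


def wkb_to_point(blob: bytes) -> tuple:
    """Decode a little-endian WKB 2D point back to (x, y)."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", blob)
    if byte_order != WKB_LITTLE_ENDIAN or geom_type != WKB_POINT:
        raise ValueError("not a little-endian WKB point")
    return x, y


blob = point_to_wkb(-122.33, 47.61)  # X = longitude, Y = latitude
print(len(blob))           # 21
print(blob.hex())
print(wkb_to_point(blob))  # (-122.33, 47.61)
```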
Unstructured Data
Unstructured data, such as text, is stored in CHAR or VARCHAR formats in the database. Video and audio data are generally accessed as externally stored objects in a file system using a database connector like Spark.
Actian and Supported Data Formats
You can learn more about Actian transactional databases by visiting our website.