Data Types
This page documents the data types supported by vtlengine, covering input formats, internal representation, output formats, and type casting rules based on the VTL 2.2 specification.
See also
VTL Data Types — Full type system in the VTL 2.2 User Manual
Scalar type definitions — Detailed scalar type descriptions
Type conversion: cast — Cast operator reference
Type Conversion and Formatting Mask — Conversion rules and masks
Type Hierarchy
The VTL 2.2 specification defines a hierarchy of scalar types:
Scalar
├── String
├── Number
│   └── Integer (subtype of Number)
├── Time
│   ├── Date (subtype of Time)
│   └── Time_Period (subtype of Time)
├── Duration
└── Boolean
Note
In vtlengine, the VTL Time type is implemented as
TimeInterval, and Time_Period as TimePeriod.
The user-facing names remain Time and Time_Period.
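The subtype relationships in the tree can be sketched as a parent lookup. This is a minimal illustration only: the `PARENT` mapping and the `is_subtype` helper are hypothetical names, not part of vtlengine's API or its internal classes.

```python
# Illustrative sketch of the scalar hierarchy; not vtlengine's implementation.
# Each type maps to its direct parent, using the user-facing VTL names.
PARENT = {
    "String": "Scalar",
    "Number": "Scalar",
    "Integer": "Number",
    "Time": "Scalar",
    "Date": "Time",
    "Time_Period": "Time",
    "Duration": "Scalar",
    "Boolean": "Scalar",
}


def is_subtype(candidate: str, expected: str) -> bool:
    """Return True if `candidate` is `expected` or descends from it."""
    current = candidate
    while current is not None:
        if current == expected:
            return True
        # "Scalar" has no parent, so the walk ends with None at the root.
        current = PARENT.get(current)
    return False
```

Under this sketch, `is_subtype("Integer", "Number")` and `is_subtype("Date", "Time")` hold, while `is_subtype("Number", "Integer")` does not.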
Data Types Reference
Each type below describes how vtlengine handles input, storage, and output. For the formal VTL definitions, see External representations and literals.
String
| Input (CSV) | Any text value. Surrounding double quotes are stripped automatically. |
| Input (DataFrame) | Any value (all values pass validation). |
| Internal representation | Python |
| Output dtype | |
Integer
| Input (CSV) | Whole numbers: |
| Input (DataFrame) | Values are cast via |
| Internal representation | Python |
| Output dtype | |
Integer is a subtype of Number — anywhere a Number is expected, an Integer is accepted automatically.
Number
| Input (CSV) | Decimal or integer numbers: |
| Input (DataFrame) | Values are cast via |
| Internal representation | Python |
| Output dtype | |
Boolean
| Input (CSV) | |
| Input (DataFrame) | Same string values or native Python |
| Internal representation | Python |
| Output dtype | |
Date
| Input (CSV) | ISO 8601 date: |
| Input (DataFrame) | String values validated against the same ISO 8601 formats. |
| Internal representation | Python |
| Output dtype | |
Date is a subtype of Time — anywhere a Time value is expected, a Date is accepted automatically.
Time_Period
| Input (CSV/DataFrame) | Multiple formats accepted (see tables below). |
| Internal representation | Hyphenated string (e.g. |
| Output dtype | |
Accepted input formats:
| Period | Formats | Examples |
|---|---|---|
| Annual | | |
| Semester | | |
| Quarter | | |
| Monthly | | |
| Weekly | | |
| Daily | | |
Output formats (controlled by the time_period_output_format parameter):
| Format | Annual | Semester | Quarter | Month | Week | Day |
|---|---|---|---|---|---|---|
| | | | | | | |
| | | | | | | |
| | | Not supported | Not supported | | Not supported | |
| | | | | | | |
Time_Period is a subtype of Time — anywhere a Time value is expected, a Time_Period is accepted automatically.
Time (TimeInterval)
| Input (CSV/DataFrame) | ISO 8601 interval: |
| Internal representation | Python |
| Output dtype | |
Duration
| Input (CSV/DataFrame) | Single-letter period indicator: |
| Internal representation | Python |
| Output dtype | |
Null Handling
All VTL scalar types support null values (represented as pd.NA / None), with one exception:
Identifiers cannot be null: loading data with null identifiers raises an error.
Measures and Attributes can be nullable (controlled by the nullable flag in the data structure definition).
During operations, null propagates: any operation involving a null operand typically produces a null result.
The Null type is compatible with all other types for implicit promotion.
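The two behaviours above, null propagation and the ban on null identifiers, can be sketched in plain Python with `None` standing in for null. The function names and the use of `ValueError` are assumptions made for illustration; the error type that vtlengine actually raises is not stated here.

```python
from typing import Dict, Iterable, List, Optional


def add_with_null(left: Optional[float], right: Optional[float]) -> Optional[float]:
    """Add two values, returning None (null) if either operand is null."""
    if left is None or right is None:
        return None
    return left + right


def check_identifiers(rows: List[Dict[str, object]], identifier_names: Iterable[str]) -> None:
    """Reject any row whose identifier component is null.

    ValueError is a stand-in; the real engine may raise a different error.
    """
    names = list(identifier_names)
    for index, row in enumerate(rows):
        for name in names:
            if row.get(name) is None:
                raise ValueError(f"Null identifier {name!r} in row {index}")
```

In this sketch a null measure passes `check_identifiers` untouched, while a row with a null identifier is rejected before any operation runs.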
Type Casting
Implicit Casting (Automatic)
Implicit casts happen automatically when operators receive operands of different but compatible types. The engine resolves the common type using the type promotion rules defined in VTL 2.2.
| From / To | String | Number | Integer | Boolean | Time | Date | Time_Period | Duration |
|---|---|---|---|---|---|---|---|---|
| String | ✅ | — | — | — | — | — | — | — |
| Number | — | ✅ | ✅ | — | — | — | — | — |
| Integer | — | ✅ | ✅ | — | — | — | — | — |
| Boolean | ✅ | — | — | ✅ | — | — | — | — |
| Time | — | — | — | — | ✅ | — | — | — |
| Date | — | — | — | — | ✅ | ✅ | — | — |
| Time_Period | — | — | — | — | ✅ | — | ✅ | — |
| Duration | — | — | — | — | — | — | — | ✅ |
Key rules:
Integer / Number: Both directions are implicit (Integer is a subtype of Number).
Date to Time: A Date is implicitly converted to a Time interval ("2020-01-15" becomes "2020-01-15/2020-01-15").
Time_Period to Time: A Time_Period is implicitly converted to a Time interval ("2020-Q1" becomes "2020-01-01/2020-03-31").
Boolean to String: true becomes "True", false becomes "False".
Null to any type: Null is compatible with every type.
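The key rules can be illustrated with a few standalone helpers. This is a sketch under stated assumptions: the function names are hypothetical, only the hyphenated quarterly form ("2020-Q1") is handled, and none of this is vtlengine's own code.

```python
import calendar
from datetime import date


def date_to_interval(value: str) -> str:
    """Date to Time: a single day becomes an interval that starts and ends on it."""
    date.fromisoformat(value)  # raises ValueError if not an ISO 8601 date
    return f"{value}/{value}"


def quarter_to_interval(value: str) -> str:
    """Time_Period to Time for a hyphenated quarterly period such as "2020-Q1"."""
    year_text, quarter_text = value.split("-Q")
    year, quarter = int(year_text), int(quarter_text)
    if quarter not in (1, 2, 3, 4):
        raise ValueError(f"Invalid quarter in {value!r}")
    first_month = 3 * (quarter - 1) + 1
    last_month = first_month + 2
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, last_month)[1]
    start = date(year, first_month, 1)
    end = date(year, last_month, last_day)
    return f"{start.isoformat()}/{end.isoformat()}"


def boolean_to_string(value: bool) -> str:
    """Boolean to String: true becomes "True", false becomes "False"."""
    return "True" if value else "False"
```

For example, `date_to_interval("2020-01-15")` gives `"2020-01-15/2020-01-15"` and `quarter_to_interval("2020-Q1")` gives `"2020-01-01/2020-03-31"`, matching the rules above.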
Explicit Casting (cast operator)
The cast operator converts values from one type to another:
/* Without mask */
DS_r <- cast(DS_1, integer);
/* With mask */
DS_r <- cast(DS_1, date, MASK);
Note
VTL type names in the cast operator are lowercase:
string, integer, number, boolean,
time, date, time_period, duration.
Supported conversions without mask
| From / To | String | Number | Integer | Boolean | Time | Date | Time_Period | Duration |
|---|---|---|---|---|---|---|---|---|
| String | ✅ | ✅ | ✅ | — | ✅ | ✅ | ✅ | ✅ |
| Number | ✅ | ✅ | ✅ | ✅ | — | — | — | — |
| Integer | ✅ | ✅ | ✅ | ✅ | — | — | — | — |
| Boolean | ✅ | ✅ | ✅ | ✅ | — | — | — | — |
| Time | ✅ | — | — | — | ✅ | — | — | — |
| Date | ✅ | — | — | — | — | ✅ | ✅ | — |
| Time_Period | ✅ | — | — | — | — | — | ✅ | — |
| Duration | ✅ | — | — | — | — | — | — | ✅ |
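As a sketch of a few of the non-mask conversions in this section (numeric to Boolean, Boolean to numeric, and String to Integer), the helpers below mirror the documented rules in plain Python. The function names are hypothetical, and whether the engine tolerates a leading sign or surrounding whitespace in integer strings is an assumption of this sketch.

```python
import re

# Assumption: an optional sign followed by digits counts as a valid integer string.
_INTEGER_PATTERN = re.compile(r"[+-]?\d+")


def number_to_boolean(value: float) -> bool:
    """Number/Integer to Boolean: 0 becomes false, anything else true."""
    return value != 0


def boolean_to_integer(value: bool) -> int:
    """Boolean to Integer: true becomes 1, false becomes 0."""
    return 1 if value else 0


def boolean_to_number(value: bool) -> float:
    """Boolean to Number: true becomes 1.0, false becomes 0.0."""
    return 1.0 if value else 0.0


def string_to_integer(value: str) -> int:
    """String to Integer: accept only integer strings, so "3.5" is rejected."""
    text = value.strip()
    if not _INTEGER_PATTERN.fullmatch(text):
        raise ValueError(f"Not a valid integer string: {value!r}")
    return int(text)
```

Rejecting `"3.5"` outright, rather than truncating it to 3, keeps the cast from silently losing information.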
Conversion details:
Number/Integer to Boolean: 0 becomes false, any other value becomes true.
Boolean to Number/Integer: true becomes 1 (or 1.0), false becomes 0 (or 0.0).
String to Integer: Must be a valid integer string (rejects "3.5").
Date to Time_Period: Converts to daily period (e.g. "2020-01-15" becomes "2020D15" with the default vtl output format).
Supported conversions with mask
| From / To | String | Number | Time | Date | Time_Period | Duration |
|---|---|---|---|---|---|---|
| String | — | ⏳ | ⏳ | ⏳ | ⏳ | ⏳ |
| Time | ⏳ | — | — | — | — | — |
| Date | ⏳ | — | — | — | — | — |
| Time_Period | — | — | — | ⏳ | — | — |
| Duration | ⏳ | — | — | — | — | — |
Legend: ✅ = implemented, ⏳ = defined in VTL 2.2 but not
yet implemented (raises NotImplementedError).
Note
Formal definition of masks is still to be decided.
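The status of masked casts can be sketched as a lookup over the pairs marked ⏳ in the table. The names `MASKED_CASTS_DEFINED` and `cast_with_mask` are hypothetical, and raising `ValueError` for pairs that VTL 2.2 does not define is an assumption; only the `NotImplementedError` for defined pairs comes from this page.

```python
# (source type, target type) pairs that VTL 2.2 defines for casts with a mask.
MASKED_CASTS_DEFINED = frozenset({
    ("String", "Number"),
    ("String", "Time"),
    ("String", "Date"),
    ("String", "Time_Period"),
    ("String", "Duration"),
    ("Time", "String"),
    ("Date", "String"),
    ("Time_Period", "Date"),
    ("Duration", "String"),
})


def cast_with_mask(value: object, source: str, target: str, mask: str) -> object:
    """Sketch of the current behaviour: no masked cast is implemented yet."""
    if (source, target) in MASKED_CASTS_DEFINED:
        raise NotImplementedError(
            f"cast from {source} to {target} with a mask is not yet implemented"
        )
    # Assumption: undefined pairs are rejected with a different error.
    raise ValueError(f"cast from {source} to {target} with a mask is not defined")
```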
Cast on datasets
When cast is applied to a Dataset, it must have exactly
one measure (monomeasure). The measure is renamed to a
generic name based on the target type:
| Target type | Renamed measure |
|---|---|
| String | |
| Number | |
| Integer | |
| Boolean | |
| Time | |
| Time_Period | |
| Date | |
| Duration | |
Note
When the source type can be implicitly promoted to the target type (e.g. Boolean to String, Integer to Number, or Number to Integer), the measure is not renamed.