@@ -73,122 +73,118 @@ should not be used in production. The feature will be enabled in a future releas
7373
7474Cast operations in Comet fall into three levels of support:
7575
76- - **Compatible**: The results match Apache Spark
77- - **Incompatible**: The results may match Apache Spark for some inputs, but there are known issues where some inputs
76+ - **C (Compatible)**: The results match Apache Spark
77+ - **I (Incompatible)**: The results may match Apache Spark for some inputs, but there are known issues where some inputs
7878 will result in incorrect results or exceptions. The query stage will fall back to Spark by default. Setting
7979 `spark.comet.expression.Cast.allowIncompatible=true` will allow all incompatible casts to run natively in Comet, but this is not
8080 recommended for production use.
81- - **Unsupported**: Comet does not provide a native version of this cast expression and the query stage will fall back to
81+ - **U (Unsupported)**: Comet does not provide a native version of this cast expression and the query stage will fall back to
8282 Spark.
83+ - **N/A**: Spark does not support this cast.
8384
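The fallback rule described by the support levels above can be sketched as a small decision function. This is an illustrative Python sketch, not Comet's implementation: `runs_natively` is a hypothetical helper, and its `allow_incompatible` parameter merely stands in for the `spark.comet.expression.Cast.allowIncompatible` setting.

```python
def runs_natively(level: str, allow_incompatible: bool = False) -> bool:
    """Return True if a cast with the given support level would run natively.

    Hypothetical helper mirroring the rules listed above:
      C   -> always native
      I   -> native only when incompatible casts are explicitly allowed
      U   -> never native (the query stage falls back to Spark)
      N/A -> never native (Spark itself does not support the cast)
    """
    if level == "C":
        return True
    if level == "I":
        return allow_incompatible
    if level in ("U", "N/A"):
        return False
    raise ValueError(f"unknown support level: {level!r}")


print(runs_natively("C"))                           # True
print(runs_natively("I"))                           # False
print(runs_natively("I", allow_incompatible=True))  # True
print(runs_natively("U", allow_incompatible=True))  # False
```

Note that the flag only affects casts marked `I`; it does not make unsupported casts run natively.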
84- ### Compatible Casts
85+ ### Legacy Mode
8586
86- The following cast operations are generally compatible with Spark except for the differences noted here.
87+ <!-- WARNING! DO NOT MANUALLY MODIFY CONTENT BETWEEN THE BEGIN AND END TAGS -->
88+
89+ <!-- BEGIN:CAST_LEGACY_TABLE-->
90+ <!-- prettier-ignore-start -->
91+ | | binary | boolean | byte | date | decimal | double | float | integer | long | short | string | timestamp |
92+ | ---| ---| ---| ---| ---| ---| ---| ---| ---| ---| ---| ---| ---|
93+ | binary | - | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | C | N/A |
94+ | boolean | N/A | - | C | N/A | U | C | C | C | C | C | C | U |
95+ | byte | U | C | - | N/A | C | C | C | C | C | C | C | U |
96+ | date | N/A | U | U | - | U | U | U | U | U | U | C | U |
97+ | decimal | N/A | C | C | N/A | - | C | C | C | C | C | C | U |
98+ | double | N/A | C | C | N/A | I | - | C | C | C | C | C | U |
99+ | float | N/A | C | C | N/A | I | C | - | C | C | C | C | U |
100+ | integer | U | C | C | N/A | C | C | C | - | C | C | C | U |
101+ | long | U | C | C | N/A | C | C | C | C | - | C | C | U |
102+ | short | U | C | C | N/A | C | C | C | C | C | - | C | U |
103+ | string | C | C | C | C | I | C | C | C | C | C | - | I |
104+ | timestamp | N/A | U | U | C | U | U | U | U | C | U | C | - |
105+ <!-- prettier-ignore-end -->
106+
107+ **Notes:**
108+
109+ - **decimal -> string**: There can be formatting differences in some cases due to Spark using scientific notation where Comet does not
110+ - **double -> decimal**: There can be rounding differences
111+ - **double -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
112+ - **float -> decimal**: There can be rounding differences
113+ - **float -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
114+ - **string -> date**: Only supports years between 262143 BC and 262142 AD
115+ - **string -> decimal**: Does not support fullwidth Unicode digits (e.g. \\uFF10)
116+ or strings containing null bytes (e.g. \\u0000)
117+ - **string -> timestamp**: Not all valid formats are supported
118+ <!-- END:CAST_LEGACY_TABLE-->
119+
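A matrix like the one above can be read programmatically. The following is a minimal Python sketch, assuming that rows are the source ("from") type and columns are the target ("to") type, which is consistent with the notes (for example, `string -> timestamp` is marked `I`). `parse_matrix` is a hypothetical helper, and the table is a three-row excerpt of the Legacy Mode matrix.

```python
# Excerpt of the Legacy Mode matrix (three of the twelve rows).
TABLE = """
| | binary | boolean | byte | date | decimal | double | float | integer | long | short | string | timestamp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| double | N/A | C | C | N/A | I | - | C | C | C | C | C | U |
| string | C | C | C | C | I | C | C | C | C | C | - | I |
| timestamp | N/A | U | U | C | U | U | U | U | C | U | C | - |
"""


def parse_matrix(markdown: str) -> dict[str, dict[str, str]]:
    """Parse a Markdown support matrix into {from_type: {to_type: level}}."""
    lines = [ln for ln in markdown.strip().splitlines() if ln.strip()]
    # Drop the outer pipes, then split each line into trimmed cells.
    rows = [[cell.strip() for cell in ln.strip().strip("|").split("|")] for ln in lines]
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator
    return {row[0]: dict(zip(header[1:], row[1:])) for row in body}


matrix = parse_matrix(TABLE)
print(matrix["string"]["timestamp"])  # I
print(matrix["double"]["decimal"])    # I
print(matrix["timestamp"]["date"])    # C
```

Keeping the lookup keyed by source type first matches the `from -> to` wording used in the notes.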
120+ ### Try Mode
87121
88122<!-- WARNING! DO NOT MANUALLY MODIFY CONTENT BETWEEN THE BEGIN AND END TAGS -->
89123
90- <!-- BEGIN:COMPAT_CAST_TABLE -->
124+ <!-- BEGIN:CAST_TRY_TABLE -->
91125<!-- prettier-ignore-start -->
92- | From Type | To Type | Notes |
93- | -| -| -|
94- | boolean | byte | |
95- | boolean | short | |
96- | boolean | integer | |
97- | boolean | long | |
98- | boolean | float | |
99- | boolean | double | |
100- | boolean | string | |
101- | byte | boolean | |
102- | byte | short | |
103- | byte | integer | |
104- | byte | long | |
105- | byte | float | |
106- | byte | double | |
107- | byte | decimal | |
108- | byte | string | |
109- | short | boolean | |
110- | short | byte | |
111- | short | integer | |
112- | short | long | |
113- | short | float | |
114- | short | double | |
115- | short | decimal | |
116- | short | string | |
117- | integer | boolean | |
118- | integer | byte | |
119- | integer | short | |
120- | integer | long | |
121- | integer | float | |
122- | integer | double | |
123- | integer | decimal | |
124- | integer | string | |
125- | long | boolean | |
126- | long | byte | |
127- | long | short | |
128- | long | integer | |
129- | long | float | |
130- | long | double | |
131- | long | decimal | |
132- | long | string | |
133- | float | boolean | |
134- | float | byte | |
135- | float | short | |
136- | float | integer | |
137- | float | long | |
138- | float | double | |
139- | float | string | There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45 |
140- | double | boolean | |
141- | double | byte | |
142- | double | short | |
143- | double | integer | |
144- | double | long | |
145- | double | float | |
146- | double | string | There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45 |
147- | decimal | boolean | |
148- | decimal | byte | |
149- | decimal | short | |
150- | decimal | integer | |
151- | decimal | long | |
152- | decimal | float | |
153- | decimal | double | |
154- | decimal | decimal | |
155- | decimal | string | There can be formatting differences in some case due to Spark using scientific notation where Comet does not |
156- | string | boolean | |
157- | string | byte | |
158- | string | short | |
159- | string | integer | |
160- | string | long | |
161- | string | float | |
162- | string | double | |
163- | string | binary | |
164- | string | date | Only supports years between 262143 BC and 262142 AD |
165- | binary | string | |
166- | date | string | |
167- | timestamp | long | |
168- | timestamp | string | |
169- | timestamp | date | |
126+ | | binary | boolean | byte | date | decimal | double | float | integer | long | short | string | timestamp |
127+ | ---| ---| ---| ---| ---| ---| ---| ---| ---| ---| ---| ---| ---|
128+ | binary | - | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | C | N/A |
129+ | boolean | N/A | - | C | N/A | U | C | C | C | C | C | C | U |
130+ | byte | U | C | - | N/A | C | C | C | C | C | C | C | U |
131+ | date | N/A | U | U | - | U | U | U | U | U | U | C | U |
132+ | decimal | N/A | C | C | N/A | - | C | C | C | C | C | C | U |
133+ | double | N/A | C | C | N/A | I | - | C | C | C | C | C | U |
134+ | float | N/A | C | C | N/A | I | C | - | C | C | C | C | U |
135+ | integer | U | C | C | N/A | C | C | C | - | C | C | C | U |
136+ | long | U | C | C | N/A | C | C | C | C | - | C | C | U |
137+ | short | U | C | C | N/A | C | C | C | C | C | - | C | U |
138+ | string | C | C | C | C | I | C | C | C | C | C | - | I |
139+ | timestamp | N/A | U | U | C | U | U | U | U | C | U | C | - |
170140<!-- prettier-ignore-end -->
171- <!-- END:COMPAT_CAST_TABLE-->
172141
173- ### Incompatible Casts
142+ **Notes:**
174143
175- The following cast operations are not compatible with Spark for all inputs and are disabled by default.
144+ - **decimal -> string**: There can be formatting differences in some cases due to Spark using scientific notation where Comet does not
145+ - **double -> decimal**: There can be rounding differences
146+ - **double -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
147+ - **float -> decimal**: There can be rounding differences
148+ - **float -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
149+ - **string -> date**: Only supports years between 262143 BC and 262142 AD
150+ - **string -> decimal**: Does not support fullwidth Unicode digits (e.g. \\uFF10)
151+ or strings containing null bytes (e.g. \\u0000)
152+ - **string -> timestamp**: Not all valid formats are supported
153+ <!-- END:CAST_TRY_TABLE-->
154+
155+ ### ANSI Mode
176156
177157<!-- WARNING! DO NOT MANUALLY MODIFY CONTENT BETWEEN THE BEGIN AND END TAGS -->
178158
179- <!-- BEGIN:INCOMPAT_CAST_TABLE -->
159+ <!-- BEGIN:CAST_ANSI_TABLE -->
180160<!-- prettier-ignore-start -->
181- | From Type | To Type | Notes |
182- | -| -| -|
183- | float | decimal | There can be rounding differences |
184- | double | decimal | There can be rounding differences |
185- | string | decimal | Does not support fullwidth unicode digits (e.g \\ uFF10)
186- or strings containing null bytes (e.g \\ u0000) |
187- | string | timestamp | Not all valid formats are supported |
161+ | | binary | boolean | byte | date | decimal | double | float | integer | long | short | string | timestamp |
162+ | ---| ---| ---| ---| ---| ---| ---| ---| ---| ---| ---| ---| ---|
163+ | binary | - | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | C | N/A |
164+ | boolean | N/A | - | C | N/A | U | C | C | C | C | C | C | U |
165+ | byte | U | C | - | N/A | C | C | C | C | C | C | C | U |
166+ | date | N/A | U | U | - | U | U | U | U | U | U | C | U |
167+ | decimal | N/A | C | C | N/A | - | C | C | C | C | C | C | U |
168+ | double | N/A | C | C | N/A | I | - | C | C | C | C | C | U |
169+ | float | N/A | C | C | N/A | I | C | - | C | C | C | C | U |
170+ | integer | U | C | C | N/A | C | C | C | - | C | C | C | U |
171+ | long | U | C | C | N/A | C | C | C | C | - | C | C | U |
172+ | short | U | C | C | N/A | C | C | C | C | C | - | C | U |
173+ | string | C | C | C | C | I | C | C | C | C | C | - | I |
174+ | timestamp | N/A | U | U | C | U | U | U | U | C | U | C | - |
188175<!-- prettier-ignore-end -->
189- <!-- END:INCOMPAT_CAST_TABLE-->
190176
191- ### Unsupported Casts
177+ **Notes:**
178+
179+ - **decimal -> string**: There can be formatting differences in some cases due to Spark using scientific notation where Comet does not
180+ - **double -> decimal**: There can be rounding differences
181+ - **double -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
182+ - **float -> decimal**: There can be rounding differences
183+ - **float -> string**: There can be differences in precision. For example, the input "1.4E-45" will produce 1.0E-45 instead of 1.4E-45
184+ - **string -> date**: Only supports years between 262143 BC and 262142 AD
185+ - **string -> decimal**: Does not support fullwidth Unicode digits (e.g. \\uFF10)
186+ or strings containing null bytes (e.g. \\u0000)
187+ - **string -> timestamp**: ANSI mode not supported
188+ <!-- END:CAST_ANSI_TABLE-->
192189
193- Any cast not listed in the previous tables is currently unsupported. We are working on adding more. See the
194- [tracking issue](https://github.com/apache/datafusion-comet/issues/286) for more details.
190+ See the [tracking issue](https://github.com/apache/datafusion-comet/issues/286) for more details.