Compatibility Reference

Apache Spark supports a large number of operations and data types for structured data processing. This page documents how data transformation results produced by the Xonai Accelerator are compatible with Spark and details the support status of each operation.

Compatibility of Results

The Xonai Accelerator produces the same data transformation results as Apache Spark, except where Spark itself does not guarantee deterministic results or where the behavior is left to the implementation to define.

In practice, this means that inconsistency between results is already present in Spark itself, even if the user is unaware of it. The following sections document specific cases where this occurs but is nevertheless expected.

Floating Point Arithmetic

Floating-point calculations are not expected to be exact, because real numbers have to be approximated with a fixed number of bits.

The Xonai Accelerator supports the same 4-byte single-precision and 8-byte double-precision floating-point types as Spark, but the two engines may produce results that differ by very small amounts, because the order in which machine instructions are executed is implementation-defined.

For example, Spark itself may produce different results for the same application when the JDK version changes, because each JDK may order floating-point instructions differently or implement math-related builtins differently.
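
The effect of evaluation order can be reproduced outside of Spark. The following is a minimal Python sketch, not Spark or Xonai Accelerator code; the values and the tolerance of 1e-9 are illustrative assumptions. It adds the same three doubles in two different orders and then compares the results with a tolerance instead of exact equality.

```python
import math

# Illustrative values (assumption): doubles with no exact binary representation.
a, b, c = 0.1, 0.2, 0.3

# Two mathematically equivalent evaluation orders.
left_to_right = (a + b) + c
right_to_left = a + (b + c)

# Each intermediate result is rounded, so the two orders do not match exactly.
orders_agree_exactly = left_to_right == right_to_left

# A relative tolerance (assumed here to be 1e-9) treats the results as equal.
orders_agree_within_tolerance = math.isclose(
    left_to_right, right_to_left, rel_tol=1e-9
)

print(orders_agree_exactly, orders_agree_within_tolerance)
```

When validating floating-point output from two engines, a tolerance-based comparison of this kind is usually more appropriate than a bit-for-bit comparison.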

When maintaining arithmetic precision is critical, such as in currency calculations, the Spark Decimal type should be used instead.
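
As a rough illustration of why a decimal type avoids this problem, the sketch below uses Python's standard `decimal` module as a stand-in for the Spark Decimal type. This is an analogy only: the amounts are made up, and Spark's Decimal type has its own precision and scale rules that this example does not model.

```python
from decimal import Decimal

# Binary doubles cannot represent 0.10 or 0.20 exactly, so the sum is rounded.
float_total = 0.10 + 0.20

# Decimal values built from strings keep the exact decimal digits.
decimal_total = Decimal("0.10") + Decimal("0.20")

float_is_exact = float_total == 0.30
decimal_is_exact = decimal_total == Decimal("0.30")

print(float_is_exact, decimal_is_exact)
```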

Ordering of Results

Rows with the same sorting or grouping values may be returned in a different order than in default Spark, but the SQL standard is always adhered to.

In particular, the order of aggregation results is not guaranteed, because the order of elements in the underlying hash table is implementation-defined. The same applies to other operations such as sort-merge join.

When sorting, both Spark and the Xonai Accelerator comply with the SQL standard and do not guarantee a stable sort, meaning the order of rows with equal sort keys may differ between Spark engines.
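
The following Python sketch illustrates the point with made-up data; `engine_a` and `engine_b` are hypothetical outputs, not results captured from Spark or the Xonai Accelerator. Both lists are valid results of sorting by score in descending order, yet they differ in how the tie is broken. Comparing them as multisets, or adding a unique tie-breaker to the sort key, removes the spurious difference.

```python
from collections import Counter

# Hypothetical result rows of (name, score); two rows tie on the sort key `score`.
engine_a = [("alice", 10), ("bob", 10), ("carol", 7)]
engine_b = [("bob", 10), ("alice", 10), ("carol", 7)]


def is_sorted_desc(rows, key_index):
    """Return True if rows are in non-increasing order of the given column."""
    keys = [row[key_index] for row in rows]
    return all(x >= y for x, y in zip(keys, keys[1:]))


def with_tie_breaker(rows):
    """Sort by score descending, then by name, so ties have a defined order."""
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Both outputs are valid answers to an ORDER BY on score, descending.
both_valid = is_sorted_desc(engine_a, 1) and is_sorted_desc(engine_b, 1)

# Comparing the sequences directly reports a difference caused only by the tie.
same_sequence = engine_a == engine_b

# Comparing as multisets ignores row order and confirms the same rows were returned.
same_rows = Counter(engine_a) == Counter(engine_b)

# With a unique tie-breaker in the sort key, both outputs normalize to one order.
same_after_tie_breaker = with_tie_breaker(engine_a) == with_tie_breaker(engine_b)

print(both_valid, same_sequence, same_rows, same_after_tie_breaker)
```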

Bug Fixes

When the Xonai Accelerator adds support for a new Spark release, it also backports bug fixes from that release, which can fix results-related bugs present in previous Spark versions (see this as an example).

Additionally, the accelerator may fix bugs that are not publicly identified.


Operation Support

The Xonai Accelerator is regularly updated to support new operations. Components that are not yet supported fall back to the default Spark execution engine.

This section documents the support status of every operation and is updated with each new release.

Support Status Symbols

The following table describes the meaning of each support status symbol.

| Symbol | Description |
| --- | --- |
| | Supported |
| | Unsupported at the moment |
| | Not applicable (type does not apply to the corresponding plan or expression) |
| | Undetermined |

Data Types

The following table lists each abbreviated type name used in the support tables, the corresponding Spark type, and a brief description of the type. See the official documentation for more information.

| Type Name | Spark Type | Description |
| --- | --- | --- |
| byte | ByteType | Represents 1-byte signed integer numbers. |
| short | ShortType | Represents 2-byte signed integer numbers. |
| int | IntegerType | Represents 4-byte signed integer numbers. |
| long | LongType | Represents 8-byte signed integer numbers. |
| float | FloatType | Represents 4-byte single-precision floating point numbers. |
| double | DoubleType | Represents 8-byte double-precision floating point numbers. |
| decimal | DecimalType | Represents arbitrary-precision signed decimal numbers. |
| string | StringType, VarcharType, CharType | Represents character string values. |
| bin | BinaryType | Represents byte sequence values. |
| bool | BooleanType | Represents boolean values. |
| tstamp | TimestampType | Represents values with year, month, day, hour, minute, and second fields. |
| date | DateType | Represents values with year, month and day fields, without a time zone. |
| calendar | CalendarInterval | Represents calendar intervals. |
| array | ArrayType | Represents a sequence of elements of a specific type. |
| map | MapType | Represents a set of key-value pairs. |
| struct | StructType | Represents a sequence of named fields. |
| udt | - | User-defined types and Java objects (non-standard SQL types). |

SQL Operators

| Operator | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| AggregateInPandasExec | To be supported soon |
| ArrowEvalPythonExec | To be supported soon |
| BatchScanExec | Output |
| BroadcastExchangeExec | Input/Output |
| BroadcastHashJoinExec | Input/Output |
| BroadcastNestedLoopJoinExec | To be supported soon |
| CartesianProductExec | To be supported soon |
| CoalesceExec | Input/Output |
| CollectLimitExec | To be supported soon |
| CollectMetricsExec | To be supported soon |
| CollectTailExec | To be supported soon |
| DataWritingCommandExec | To be supported soon |
| DebugExec | To be supported soon |
| ExpandExec | Input/Output |
| FileSourceScanExec (Parquet) | Output |
| FileSourceScanExec (JSON) | Output |
| FileSourceScanExec (ORC) | Output |
| FilterExec | Input/Output |
| FlatMapGroupsInPandasExec | To be supported soon |
| GenerateExec | Input/Output |
| GlobalLimitExec | To be supported soon |
| HashAggregateExec | Input/Output |
| InMemoryTableScanExec | To be supported soon |
| LocalLimitExec | To be supported soon |
| MapInPandasExec | To be supported soon |
| ObjectHashAggregateExec | Input/Output |
| ProjectExec | Input |
| | Output |
| RangeExec | To be supported soon |
| SampleExec | To be supported soon |
| ShuffleExchangeExec | Input/Output |
| ShuffledHashJoinExec | Input/Output |
| SortAggregateExec | Input/Output |
| SortExec | Input/Output |
| SortMergeJoinExec | Input/Output |
| SubqueryBroadcastExec | Input/Output |
| SubqueryExec | To be supported soon |
| TakeOrderedAndProjectExec | Input/Output |
| UnionExec | Input/Output |
| WindowExec | To be supported soon |
| WindowInPandasExec | To be supported soon |
| WriteFilesExec | To be supported soon |

SQL Expressions

Aggregate Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| AggregateExpression | aggFunc |
| | filter |
| | Output |
| AnyValue | To be supported soon |
| ApproxCountDistinctForIntervals | To be supported soon |
| ApproximatePercentile | To be supported soon |
| Average | Input |
| | Output |
| BitAndAgg | Input/Output |
| BitOrAgg | Input/Output |
| BitXorAgg | Input/Output |
| BloomFilterAggregate | To be supported soon |
| BloomFilterMightContain | To be supported soon |
| CollectList | To be supported soon |
| CollectSet | To be supported soon |
| CollectTopK | To be supported soon |
| Corr | Input/Output |
| Count | Input |
| | Output |
| CountMinSketchAgg | To be supported soon |
| CovPopulation | Input/Output |
| CovSample | Input/Output |
| First | Input/Output |
| HistogramNumeric | To be supported soon |
| HyperLogLogPlusPlus | To be supported soon |
| Kurtosis | Input/Output |
| Last | To be supported soon |
| Max | Input/Output |
| MaxBy | To be supported soon |
| Min | Input/Output |
| MinBy | To be supported soon |
| Mode | To be supported soon |
| PandasCovar | To be supported soon |
| PandasKurtosis | To be supported soon |
| PandasMode | To be supported soon |
| PandasProduct | To be supported soon |
| PandasSkewness | To be supported soon |
| PandasStddev | To be supported soon |
| PandasVariance | To be supported soon |
| Percentile | To be supported soon |
| PercentileDisc | To be supported soon |
| PivotFirst | To be supported soon |
| Product | To be supported soon |
| RegrIntercept | Input/Output |
| RegrR2 | Input/Output |
| RegrReplacement | Input/Output |
| RegrSXY | Input/Output |
| RegrSlope | Input/Output |
| Skewness | Input/Output |
| StddevPop | Input/Output |
| StddevSamp | Input/Output |
| Sum | Input |
| | Output |
| VariancePop | Input/Output |
| VarianceSamp | Input/Output |

Alchemy Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| HyperLogLogCardinality | Input |
| | Output |
| HyperLogLogInitSimpleAgg | Input/Output |

Arithmetic Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| Abs | Input/Output |
| Add | Input/Output |
| Divide | Input/Output |
| IntegralDivide | Input |
| | Output |
| Multiply | Input/Output |
| Pmod | Input/Output |
| Remainder | Input/Output |
| Subtract | Input/Output |
| UnaryMinus | Input/Output |
| UnaryPositive | Input/Output |

Array Type Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| ArrayAggregate | To be supported soon |
| ArrayContains | To be supported soon |
| ArrayDistinct | To be supported soon |
| ArrayExcept | To be supported soon |
| ArrayExists | To be supported soon |
| ArrayFilter | To be supported soon |
| ArrayForAll | To be supported soon |
| ArrayIntersect | To be supported soon |
| ArrayJoin | To be supported soon |
| ArrayMax | To be supported soon |
| ArrayMin | To be supported soon |
| ArrayPosition | To be supported soon |
| ArrayRemove | To be supported soon |
| ArrayRepeat | To be supported soon |
| ArraySort | To be supported soon |
| ArrayTransform | To be supported soon |
| ArrayUnion | To be supported soon |
| ArraysOverlap | To be supported soon |
| ArraysZip | To be supported soon |
| CreateArray | To be supported soon |
| Flatten | To be supported soon |
| GetArrayItem | To be supported soon |
| GetArrayStructFields | To be supported soon |
| Sequence | To be supported soon |
| Shuffle | To be supported soon |
| Slice | To be supported soon |
| SortArray | To be supported soon |
| ZipWith | To be supported soon |

Bitmap Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| BitmapBitPosition | To be supported soon |
| BitmapBucketNumber | To be supported soon |
| BitmapConstructAgg | To be supported soon |
| BitmapCount | To be supported soon |
| BitmapOrAgg | To be supported soon |

Bitwise Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| BitwiseAnd | Input/Output |
| BitwiseCount | Input |
| | Output |
| BitwiseGet | Input 1 |
| | Input 2 |
| | Output |
| BitwiseNot | Input/Output |
| BitwiseOr | Input/Output |
| BitwiseReverse | Input/Output |
| BitwiseXor | Input/Output |

Core Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| Alias | Input/Output |
| AttributeReference | Output |
| Cast | Input/Output |
| Literal | Output |
| ScalarSubquery | Output |

Collection Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| ElementAt | To be supported soon |
| Size | Input |
| | Output |

Conditional Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| CaseWhen | when |
| | then |
| | else |
| | Output |
| If | predicate |
| | trueValue |
| | falseValue |
| | Output |

Constraint Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| KnownFloatingPointNormalized | Input/Output |
| KnownNotNull | Input/Output |

CSV Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| CsvToStructs | To be supported soon |
| SchemaOfCsv | To be supported soon |
| StructsToCsv | To be supported soon |

Datetime Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| AddMonths | Input 1 |
| | Input 2 |
| | Output |
| CurrentBatchTimestamp | To be supported soon |
| CurrentDate | To be supported soon |
| CurrentTimeZone | To be supported soon |
| CurrentTimestamp | To be supported soon |
| DateAdd | Input 1 |
| | Input 2 |
| | Output |
| DateAddInterval | To be supported soon |
| DateDiff | Input |
| | Output |
| DateFormatClass | To be supported soon |
| DateFromUnixDate | Input |
| | Output |
| DateSub | Input 1 |
| | Input 2 |
| | Output |
| DayOfMonth | Input |
| | Output |
| DayOfWeek | Input |
| | Output |
| DayOfYear | Input |
| | Output |
| FromUTCTimestamp | To be supported soon |
| FromUnixTime | To be supported soon |
| Hour | To be supported soon |
| LastDay | Input/Output |
| MakeDate | Input |
| | Output |
| MakeTimestamp | To be supported soon |
| MicrosToTimestamp | Input |
| | Output |
| MillisToTimestamp | Input |
| | Output |
| Minute | To be supported soon |
| Month | Input |
| | Output |
| MonthsBetween | To be supported soon |
| NextDay | Input 1 |
| | Input 2 |
| | Output |
| Now | To be supported soon |
| Quarter | Input |
| | Output |
| Second | To be supported soon |
| SecondWithFraction | To be supported soon |
| SecondsToTimestamp | Input |
| | Output |
| SubtractDates | To be supported soon |
| SubtractTimestamps | To be supported soon |
| TimeAdd | To be supported soon |
| ToUTCTimestamp | To be supported soon |
| ToUnixTimestamp | To be supported soon |
| TruncDate | Input 1 |
| | Input 2 |
| | Output |
| TruncTimestamp | To be supported soon |
| UnixDate | Input |
| | Output |
| UnixMicros | Input |
| | Output |
| UnixMillis | Input |
| | Output |
| UnixSeconds | Input |
| | Output |
| UnixTimestamp | To be supported soon |
| WeekDay | Input |
| | Output |
| WeekOfYear | Input |
| | Output |
| Year | Input |
| | Output |
| YearOfWeek | Input |
| | Output |

Decimal Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| CheckOverflow | To be supported soon |
| CheckOverflowInSum | To be supported soon |
| MakeDecimal | To be supported soon |
| UnscaledValue | To be supported soon |

Generator Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| Explode | Input |
| | Output |
| GeneratorOuter | To be supported soon |
| Inline | To be supported soon |
| PosExplode | To be supported soon |
| ReplicateRows | To be supported soon |
| Stack | To be supported soon |
| UserDefinedGenerator | To be supported soon |

Hash Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| Crc32 | Input |
| | Output |
| HiveHash | To be supported soon |
| Md5 | Input |
| | Output |
| Murmur3Hash | Input |
| | Output |
| Sha1 | Input |
| | Output |
| Sha2 | Input 1 |
| | Input 2 |
| | Output |
| XxHash64 | Input |
| | Output |

Hll Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| HllSketchAgg | To be supported soon |
| HllSketchEstimate | To be supported soon |
| HllUnion | To be supported soon |
| HllUnionAgg | To be supported soon |

Input File Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| InputFileBlockLength | To be supported soon |
| InputFileBlockStart | To be supported soon |
| InputFileName | To be supported soon |

Interval Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| DivideInterval | To be supported soon |
| ExtractIntervalDays | To be supported soon |
| ExtractIntervalHours | To be supported soon |
| ExtractIntervalMinutes | To be supported soon |
| ExtractIntervalMonths | To be supported soon |
| ExtractIntervalSeconds | To be supported soon |
| ExtractIntervalYears | To be supported soon |
| MakeInterval | To be supported soon |
| MultiplyInterval | To be supported soon |

JSON Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| GetJsonObject | To be supported soon |
| JsonObjectKeys | To be supported soon |
| JsonToStructs | To be supported soon |
| JsonTuple | To be supported soon |
| LengthOfJsonArray | To be supported soon |
| SchemaOfJson | To be supported soon |
| StructsToJson | To be supported soon |

Lambda Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| LambdaFunction | To be supported soon |
| NamedLambdaVariable | To be supported soon |

Map Type Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| CreateMap | To be supported soon |
| GetMapValue | To be supported soon |
| MapConcat | To be supported soon |
| MapEntries | To be supported soon |
| MapFilter | To be supported soon |
| MapFromArrays | To be supported soon |
| MapFromEntries | To be supported soon |
| MapKeys | To be supported soon |
| MapValues | To be supported soon |
| MapZipWith | To be supported soon |
| StringToMap | To be supported soon |
| TransformKeys | To be supported soon |
| TransformValues | To be supported soon |

Math Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| Acos | Input/Output |
| Acosh | Input/Output |
| Asin | Input/Output |
| Asinh | Input/Output |
| Atan | Input/Output |
| Atan2 | Input/Output |
| Atanh | Input/Output |
| BRound | Input/Output |
| Bin | Input |
| | Output |
| Cbrt | Input/Output |
| Ceil | Input |
| | Output |
| Conv | To be supported soon |
| Cos | Input/Output |
| Cosh | Input/Output |
| Cot | Input/Output |
| Csc | Input/Output |
| EulerNumber | Output |
| Exp | Input/Output |
| Expm1 | Input/Output |
| Factorial | Input |
| | Output |
| Floor | Input |
| | Output |
| Hex | Input |
| | Output |
| Hypot | Input/Output |
| Log | Input/Output |
| Log10 | Input/Output |
| Log1p | Input/Output |
| Log2 | Input/Output |
| Logarithm | Input/Output |
| NormalizeNaNAndZero | Input/Output |
| Pi | Output |
| Pow | Input/Output |
| Rint | Input/Output |
| Round | Input/Output |
| Sec | Input/Output |
| ShiftLeft | Input 1 |
| | Input 2 |
| | Output |
| ShiftRight | Input 1 |
| | Input 2 |
| | Output |
| ShiftRightUnsigned | Input 1 |
| | Input 2 |
| | Output |
| Signum | Input/Output |
| Sin | Input/Output |
| Sinh | Input/Output |
| Sqrt | Input/Output |
| Tan | Input/Output |
| Tanh | Input/Output |
| ToDegrees | Input/Output |
| ToRadians | Input/Output |
| Unhex | Input |
| | Output |
| WidthBucket | value |
| | minValue |
| | maxValue |
| | numBucket |
| | Output |

Miscellaneous Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| MonotonicallyIncreasingID | Output |
| PrintToStderr | To be supported soon |
| PythonUDF | To be supported soon |
| RaiseError | To be supported soon |
| Rand | Output |
| Randn | Output |
| ScalaUDF | To be supported soon |
| SparkPartitionID | Output |
| SparkVersion | Output |
| TypeOf | To be supported soon |
| Uuid | Output |

Null Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| AtLeastNNonNulls | Input |
| | Output |
| Coalesce | Input/Output |
| IsNaN | Input |
| | Output |
| IsNotNull | Input |
| | Output |
| IsNull | Input |
| | Output |
| NaNvl | Input/Output |

Ordering Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| Greatest | Input/Output |
| Least | Input/Output |
| SortOrder | Input/Output |

Predicate Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| And | Input/Output |
| EqualNullSafe | Input |
| | Output |
| EqualTo | Input |
| | Output |
| GreaterThan | Input |
| | Output |
| GreaterThanOrEqual | Input |
| | Output |
| In | Input |
| | Output |
| InSet | Input |
| | Output |
| InSubquery | To be supported soon |
| LessThan | Input |
| | Output |
| LessThanOrEqual | Input |
| | Output |
| Not | Input/Output |
| Or | Input/Output |

Regex Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| Like | Input |
| | Output |
| LikeAll | To be supported soon |
| LikeAny | To be supported soon |
| NotLikeAll | To be supported soon |
| NotLikeAny | To be supported soon |
| RLike | To be supported soon |
| RegExpExtract | To be supported soon |
| RegExpExtractAll | To be supported soon |
| RegExpReplace | To be supported soon |
| StringSplit | To be supported soon |

String Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| Ascii | Input |
| | Output |
| Base64 | To be supported soon |
| BitLength | Input |
| | Output |
| Chr | Input |
| | Output |
| Concat | Input/Output |
| ConcatWs | sep |
| | strings |
| | Output |
| Contains | Input |
| | Output |
| Decode | To be supported soon |
| Elt | To be supported soon |
| Encode | To be supported soon |
| EndsWith | Input |
| | Output |
| EphemeralSubstring | str |
| | pos |
| | len |
| | Output |
| FindInSet | To be supported soon |
| FormatNumber | To be supported soon |
| FormatString | To be supported soon |
| InitCap | Input/Output |
| Length | Input |
| | Output |
| Levenshtein | To be supported soon |
| Lower | Input/Output |
| OctetLength | Input |
| | Output |
| Overlay | To be supported soon |
| ParseUrl | To be supported soon |
| Reverse | Input/Output |
| Sentences | To be supported soon |
| SoundEx | Input/Output |
| StartsWith | Input |
| | Output |
| StringInstr | Input |
| | Output |
| StringLPad | To be supported soon |
| StringLocate | substr |
| | str |
| | pos |
| | Output |
| StringRPad | To be supported soon |
| StringRepeat | str |
| | times |
| | Output |
| StringReplace | To be supported soon |
| StringSpace | Input |
| | Output |
| StringTranslate | Input/Output |
| StringTrim | Input/Output |
| StringTrimLeft | Input/Output |
| StringTrimRight | Input/Output |
| Substring | str |
| | pos |
| | len |
| | Output |
| SubstringIndex | To be supported soon |
| UnBase64 | To be supported soon |
| Upper | Input/Output |

Struct Type Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| CreateNamedStruct | To be supported soon |
| GetStructField | Input |
| | Output |

Window Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| CumeDist | To be supported soon |
| DenseRank | To be supported soon |
| NTile | To be supported soon |
| NthValue | To be supported soon |
| PercentRank | To be supported soon |
| PreciseTimestampConversion | To be supported soon |
| Rank | To be supported soon |
| RowNumber | To be supported soon |

XML Expressions

| Expression | Param(s) | Numeric Types (byte, short, int, long, float, double, decimal) | Misc. Types (string, bin, bool, null) | Date/Time Types (tstamp, date, calendar) | Complex Types (array, map, struct, udt) |
| --- | --- | --- | --- | --- | --- |
| XPathBoolean | To be supported soon |
| XPathDouble | To be supported soon |
| XPathFloat | To be supported soon |
| XPathInt | To be supported soon |
| XPathList | To be supported soon |
| XPathLong | To be supported soon |
| XPathShort | To be supported soon |
| XPathString | To be supported soon |


Last update: Apr 02, 2025