Typecasting in SQL could force database does not take into account index or sort column
I faced this issue while working with Redshift. But it can be found in any SQL database engine.
Redshift is known as special column database, that allows run query in parallel way that to distribution and sort key. Of course, the most widely used sort key is date and should really careful with it in queries. Sort key is associated only with specific date/time format, so converting this field to any format will cause huge drop of performance. Redshift have to convert all stored values to new format during query execution and new converted values have no sort index, so no advantages of sort key.
As an example, let's imagine that there is a column with timestamp type without time zone. In SQL query we need to convert string to timestampt, so let's use function TO_TIMESTAMP
(http://docs.aws.amazon.com/redshift/latest/dg/r_TO_TIMESTAMP.html):
select * from
some_table
where timestamp_field > TO_TIMESTAMP('2018-03-24', 'yyyy-MM-dd')
Actually, in this case Redshift should compare field timestamp_field
with type TIMESTAMP
against value 2018-03-24
in format of timestampTZ
, because TO_TIMESTAMP
function returns timestampTZ
value. It will require implicit casting of timestamp_field
to type timestampTZ
and it will be absolutely new field without any sort key.
To avoid performance drop can be used ::TIMESTAMP
or directly value 2018-03-24
with no casting, it will be casted implicitly to TIMESTAMP
.
I have faced such issue on table with ~60M of rows and to_timestamp
function cased drop of performance to 90 seconds per query against 6 seconds in case of using ::TIMESTAMP
.