pyspark.sql.functions.timestamp_diff

pyspark.sql.functions.timestamp_diff(unit, start, end)

Returns the difference between two timestamps, measured in the specified unit, with any fractional part truncated.

New in version 4.0.0.

Parameters
unit : str

The unit in which the difference between the given timestamps is measured. Supported values (case insensitive) are: “YEAR”, “QUARTER”, “MONTH”, “WEEK”, “DAY”, “HOUR”, “MINUTE”, “SECOND”, “MILLISECOND” and “MICROSECOND”.

start : Column or str

The timestamp that is subtracted from the end timestamp.

end : Column or str

The timestamp from which the start timestamp is subtracted.

Returns
Column

The difference end - start, expressed as a whole number of the specified units. The result is negative when end is earlier than start.

Examples

>>> import datetime
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame(
...     [(datetime.datetime(2016, 3, 11, 9, 0, 7), datetime.datetime(2024, 4, 2, 9, 0, 7))],
... ).toDF("start", "end")
>>> df.select(sf.timestamp_diff("year", "start", "end")).show()
+-------------------------------+
|timestampdiff(year, start, end)|
+-------------------------------+
|                              8|
+-------------------------------+
>>> df.select(sf.timestamp_diff("WEEK", "start", "end")).show()
+-------------------------------+
|timestampdiff(WEEK, start, end)|
+-------------------------------+
|                            420|
+-------------------------------+
>>> df.select(sf.timestamp_diff("day", "end", "start")).show()
+------------------------------+
|timestampdiff(day, end, start)|
+------------------------------+
|                         -2944|
+------------------------------+
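
The truncation seen in the WEEK and reversed DAY results above can be reproduced in plain Python for the fixed-length units. The following is a minimal sketch, not Spark's implementation: the helper name `truncated_diff` and the table `MICROS_PER_UNIT` are hypothetical, it assumes the fraction is truncated toward zero (consistent with the outputs shown above), and it leaves out the calendar-based units (YEAR, QUARTER, MONTH), whose length varies.

```python
import datetime

# Microseconds in each fixed-length unit. Calendar-based units (YEAR, QUARTER,
# MONTH) are deliberately omitted because their length is not constant.
MICROS_PER_UNIT = {
    "WEEK": 7 * 86_400 * 1_000_000,
    "DAY": 86_400 * 1_000_000,
    "HOUR": 3_600 * 1_000_000,
    "MINUTE": 60 * 1_000_000,
    "SECOND": 1_000_000,
    "MILLISECOND": 1_000,
    "MICROSECOND": 1,
}


def truncated_diff(unit, start, end):
    """Hypothetical helper: whole units from start to end, truncated toward zero."""
    delta = end - start
    # Exact integer microseconds, avoiding float rounding from total_seconds().
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    size = MICROS_PER_UNIT[unit.upper()]  # unit lookup is case insensitive
    whole = abs(micros) // size
    # Assumption: a negative difference is truncated toward zero, not floored.
    return whole if micros >= 0 else -whole


start = datetime.datetime(2016, 3, 11, 9, 0, 7)
end = datetime.datetime(2024, 4, 2, 9, 0, 7)

print(truncated_diff("WEEK", start, end))  # 420 (2944 days / 7 = 420.57...)
print(truncated_diff("day", end, start))   # -2944
```

The two timestamps are 2944 days apart, which is 420 full weeks plus 4 days, so the WEEK result drops the remainder rather than rounding up to 421.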