pyspark.sql.functions.sequence

pyspark.sql.functions.sequence(start, stop, step=None)

Array function: Generate a sequence of integers from start to stop, incrementing by step. If step is not set, the function increments by 1 if start is less than or equal to stop, otherwise it decrements by 1.

New in version 2.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
start : Column or str

The starting value (inclusive) of the sequence.

stop : Column or str

The last value (inclusive) of the sequence.

step : Column or str, optional

The value to add to the current element to get the next element in the sequence. The default is 1 if start is less than or equal to stop, otherwise -1.

Returns
Column

A new column that contains an array of sequence values.
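The default-step rule described above (increment by 1 when start <= stop, otherwise decrement by 1) can be modeled in plain Python. This is only a sketch of the function's semantics for illustration, not the actual Spark implementation, which operates on Column expressions:

```python
def sequence(start, stop, step=None):
    """Model of Spark's sequence() semantics for integer inputs."""
    # When step is omitted, default to 1 for ascending ranges, -1 otherwise.
    if step is None:
        step = 1 if start <= stop else -1
    if step == 0:
        raise ValueError("step must not be zero")
    result = []
    current = start
    # stop is inclusive: continue while current has not passed stop.
    while (step > 0 and current <= stop) or (step < 0 and current >= stop):
        result.append(current)
        current += step
    return result
```

With these semantics, `sequence(-2, 2)` yields `[-2, -1, 0, 1, 2]` and `sequence(4, -4, -2)` yields `[4, 2, 0, -2, -4]`, matching the DataFrame examples below.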

Examples

Example 1: Generating a sequence with default step

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(-2, 2)], ['start', 'stop'])
>>> df.select(sf.sequence(df.start, df.stop)).show()
+---------------------+
|sequence(start, stop)|
+---------------------+
|    [-2, -1, 0, 1, 2]|
+---------------------+

Example 2: Generating a sequence with a custom step

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(4, -4, -2)], ['start', 'stop', 'step'])
>>> df.select(sf.sequence(df.start, df.stop, df.step)).show()
+---------------------------+
|sequence(start, stop, step)|
+---------------------------+
|          [4, 2, 0, -2, -4]|
+---------------------------+

Example 3: Generating a descending sequence with step -1

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(5, 1, -1)], ['start', 'stop', 'step'])
>>> df.select(sf.sequence(df.start, df.stop, df.step)).show()
+---------------------------+
|sequence(start, stop, step)|
+---------------------------+
|            [5, 4, 3, 2, 1]|
+---------------------------+