pyspark.sql.functions.slice
- pyspark.sql.functions.slice(x, start, length)
Array function: Returns a new array column by slicing the input array column, beginning at a start index and taking up to a given number of elements. Indices start at 1, and a negative start index counts from the end of the array. The length is the maximum number of elements in the resulting array; fewer are returned when the input array ends first.
New in version 2.4.0.
Changed in version 3.4.0: Supports Spark Connect.
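The indexing rules described above can be sketched in plain Python, without a Spark session. This is only an illustrative model: `slice_model` is a hypothetical helper, not part of PySpark, and its handling of a zero start, a negative length, and an out-of-range start are assumptions about the underlying behavior that should be checked against your Spark version.

```python
def slice_model(arr, start, length):
    """Hypothetical pure-Python model of 1-based array slicing."""
    # Assumption: a start of 0 is invalid because indices are 1-based.
    if start == 0:
        raise ValueError("start must not be 0; indices are 1-based")
    # Assumption: a negative length is rejected.
    if length < 0:
        raise ValueError("length must be non-negative")
    # Translate the 1-based (or negative) start into a 0-based offset.
    begin = start - 1 if start > 0 else len(arr) + start
    # Assumption: a start that falls outside the array gives an empty result.
    if begin < 0 or begin >= len(arr):
        return []
    # Python slicing truncates at the end of the list, so a short
    # array yields fewer than `length` elements.
    return arr[begin:begin + length]


print(slice_model([1, 2, 3], 2, 2))   # [2, 3]
print(slice_model([4, 5], 2, 2))      # [5]
print(slice_model([1, 2, 3], -1, 1))  # [3]
```

The three calls mirror the rows used in the examples on this page: a positive start, a length that runs past the end of the array, and a negative start.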
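- Parameters
x : Column or str
Input array column, or the name of the column, to be sliced.
start : Column, str, or int
Start index of the slice (1-based). A negative value counts from the end of the array.
length : Column, str, or int
Number of elements in the resulting array.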
- Returns
Column
A new Column object of Array type, where each value is a slice of the corresponding array from the input column.
Examples
Example 1: Basic usage of the slice function.
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, 2, 3],), ([4, 5],)], ['x'])
>>> df.select(sf.slice(df.x, 2, 2)).show()
+--------------+
|slice(x, 2, 2)|
+--------------+
|        [2, 3]|
|           [5]|
+--------------+
Example 2: Slicing with negative start index.
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, 2, 3],), ([4, 5],)], ['x'])
>>> df.select(sf.slice(df.x, -1, 1)).show()
+---------------+
|slice(x, -1, 1)|
+---------------+
|            [3]|
|            [5]|
+---------------+
Example 3: Slice function with column inputs for start and length.
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([([1, 2, 3], 2, 2), ([4, 5], 1, 3)], ['x', 'start', 'length'])
>>> df.select(sf.slice(df.x, df.start, df.length)).show()
+-----------------------+
|slice(x, start, length)|
+-----------------------+
|                 [2, 3]|
|                 [4, 5]|
+-----------------------+