pyspark.pandas.groupby.GroupBy.quantile
- GroupBy.quantile(q=0.5, accuracy=10000)
Return group values at the given quantile.
New in version 3.4.0.
- Parameters
- q : float, default 0.5 (50% quantile)
Value between 0 and 1 providing the quantile to compute.
- accuracy : int, optional
Accuracy of the approximation; defaults to 10000. A larger value gives better accuracy. The relative error is 1.0 / accuracy. This is a pandas-on-Spark specific parameter.
- Returns
- pyspark.pandas.Series or pyspark.pandas.DataFrame
Return type determined by caller of GroupBy object.
Notes
Unlike pandas, quantile in pandas-on-Spark uses a distributed percentile approximation algorithm, so the result may differ from the pandas result. The interpolation parameter is not supported yet.
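To illustrate why results can differ from pandas, the following pure-Python sketch contrasts a quantile computed with linear interpolation (the pandas default) with one that always returns an observed value. The helper names `quantile_linear` and `quantile_observed` are hypothetical, and `quantile_observed` is only a simplified stand-in for an approximation that returns a value from the data; it is not the algorithm Spark actually runs.

```python
import math


def quantile_linear(values, q):
    """Quantile with linear interpolation between neighbouring sorted values."""
    data = sorted(values)
    pos = q * (len(data) - 1)      # fractional position in the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return data[lo] + (data[hi] - data[lo]) * frac


def quantile_observed(values, q):
    """Hypothetical stand-in: return the observed value at rank ceil(q * n)."""
    data = sorted(values)
    rank = max(1, math.ceil(q * len(data)))   # 1-based rank
    return float(data[rank - 1])


group = [1, 2, 3, 4]
linear = quantile_linear(group, 0.5)      # interpolates between 2 and 3
observed = quantile_observed(group, 0.5)  # always one of the input values
```

For an even-sized group such as this one, the interpolated median falls between two data points, while the rank-based variant returns one of the data points. For odd-sized groups like those in the example below, the two approaches agree.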
Examples
>>> df = ps.DataFrame([
...     ['a', 1], ['a', 2], ['a', 3],
...     ['b', 1], ['b', 3], ['b', 5]
... ], columns=['key', 'val'])
Groupby one column and return the quantile of the remaining columns in each group.
>>> df.groupby('key').quantile()
     val
key
a    2.0
b    3.0
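The effect of the accuracy parameter described above can be estimated with a small calculation. This sketch computes the relative error 1.0 / accuracy and, under the assumption that the error is interpreted as a bound on rank within a group, how many positions the returned value may be off. The helper names are hypothetical and the rank interpretation is an assumption, not something this page states.

```python
import math


def relative_error(accuracy):
    """Relative error of the approximation, as documented: 1.0 / accuracy."""
    return 1.0 / accuracy


def max_rank_error(n_rows, accuracy):
    """Assumed interpretation: worst-case rank offset in a group of n_rows."""
    return math.ceil(n_rows / accuracy)


default_error = relative_error(10000)            # default accuracy
rank_slack = max_rank_error(1_000_000, 10000)    # group of one million rows
```

Under this assumption, the default accuracy allows a rank offset of roughly 100 positions in a group of one million rows; raising accuracy narrows that bound.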