actio_python_utils.spark_functions.serialize_array_field

actio_python_utils.spark_functions.serialize_array_field(self, column, new_column, dtype, struct_columns_to_use=None)

Serializes a pyspark.sql.types.ArrayType field for output.

Parameters:
  • self (DataFrame) – The dataframe to use

  • column (str) – The name of the column to serialize

  • new_column (str) – The name to give the new serialized column

  • dtype (ArrayType) – The type definition of column

  • struct_columns_to_use (Optional[Container], default: None) – The struct fields to include (applies when the elements of column are structs)

Raises:

NotImplementedError – If the type in the array is a nested struct

Return type:

DataFrame

Returns:

A new dataframe with the serialized column
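
Example:

The following is a minimal plain-Python sketch of the contract described above, not the library's implementation. It models a dataframe as a list of dicts and a struct as a dict so that it runs without Spark. The delimiters (ELEMENT_SEP, FIELD_SEP), the helper serialize_array_value, the parameter name struct_fields_to_use, and the choice to keep the original column next to the new one are all assumptions made for illustration; the actual output format of serialize_array_field is not specified on this page.

```python
from typing import Any, Container, Dict, List, Optional

# Hypothetical delimiters: the real serialized format is not documented here.
ELEMENT_SEP = "|"
FIELD_SEP = ":"


def serialize_array_value(
    values: Optional[List[Any]],
    struct_fields_to_use: Optional[Container] = None,
) -> Optional[str]:
    """Flatten one array cell into a delimited string (illustrative only)."""
    if values is None:
        return None
    parts = []
    for value in values:
        if isinstance(value, dict):  # a dict stands in for a struct element
            kept = {
                name: field
                for name, field in value.items()
                if struct_fields_to_use is None or name in struct_fields_to_use
            }
            if any(isinstance(field, dict) for field in kept.values()):
                # Mirrors the documented failure mode for nested structs.
                raise NotImplementedError("nested structs are not supported")
            parts.append(FIELD_SEP.join(str(field) for field in kept.values()))
        else:
            parts.append(str(value))
    return ELEMENT_SEP.join(parts)


def serialize_array_field(
    rows: List[Dict[str, Any]],
    column: str,
    new_column: str,
    struct_fields_to_use: Optional[Container] = None,
) -> List[Dict[str, Any]]:
    """Return new rows with new_column added; the input rows are not modified."""
    return [
        {**row, new_column: serialize_array_value(row[column], struct_fields_to_use)}
        for row in rows
    ]


rows = [
    {
        "id": 1,
        "hits": [
            {"gene": "BRCA1", "score": 3, "note": "x"},
            {"gene": "TP53", "score": 7, "note": "y"},
        ],
    }
]
out = serialize_array_field(rows, "hits", "hits_serialized", {"gene", "score"})
print(out[0]["hits_serialized"])  # BRCA1:3|TP53:7
```

As with the documented function, the sketch returns a new collection instead of modifying its input, and it raises NotImplementedError when a selected struct field is itself a struct.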