pyspark.sql.streaming.DataStreamReader.load
DataStreamReader.load(path: Optional[str] = None, format: Optional[str] = None, schema: Union[pyspark.sql.types.StructType, str, None] = None, **options: OptionalPrimitiveType) → DataFrame
Loads a data stream from a data source and returns it as a DataFrame.
New in version 2.0.0.
Parameters
path : str, optional
    Path for file-system backed data sources.
format : str, optional
    Format of the data source. Defaults to ‘parquet’.
schema : pyspark.sql.types.StructType or str, optional
    Input schema, given either as a pyspark.sql.types.StructType or as a DDL-formatted string (for example, col0 INT, col1 DOUBLE).
**options : dict
    All other string options.
Notes
This API is evolving.
Examples
>>> json_sdf = spark.readStream.format("json") \
...     .schema(sdf_schema) \
...     .load(tempfile.mkdtemp())
>>> json_sdf.isStreaming
True
>>> json_sdf.schema == sdf_schema
True
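
The same call can also be written with keyword arguments, passing the schema as a DDL-formatted string instead of a StructType. The sketch below is illustrative only: the helper names ddl_schema and load_json_stream are hypothetical and not part of PySpark, and spark is assumed to be an existing SparkSession supplied by the caller.

```python
def ddl_schema(columns):
    """Build a DDL-formatted schema string such as 'col0 INT, col1 DOUBLE'.

    `columns` is an iterable of (name, type) pairs. Hypothetical helper,
    not part of PySpark.
    """
    return ", ".join(f"{name} {dtype}" for name, dtype in columns)


def load_json_stream(spark, path, columns, **options):
    """Call DataStreamReader.load with keyword arguments.

    `spark` is assumed to be an active SparkSession. Any extra keyword
    arguments are forwarded unchanged as data source options.
    """
    return spark.readStream.load(
        path=path,                    # directory watched by the file source
        format="json",                # overrides the default format
        schema=ddl_schema(columns),   # DDL string instead of a StructType
        **options,
    )
```

With a live session, a call such as load_json_stream(spark, tempfile.mkdtemp(), [("col0", "INT"), ("col1", "DOUBLE")], maxFilesPerTrigger=1) should return a DataFrame whose isStreaming attribute is True. maxFilesPerTrigger is used here only as an example of a file source option.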