Tag description for apache-spark-dataset
A Spark Dataset is a strongly typed collection of objects mapped to a relational schema. It supports optimizations similar to those of Spark DataFrames while also providing a type-safe programming interface.
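As a rough illustration of the type-safe interface, here is a minimal sketch that assumes Spark 2.x or later (where `SparkSession` is the entry point) and a local master. The `Person` case class, the `DatasetExample` object, and the `adultNames` helper are hypothetical names introduced only for this example.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical record type used only for this illustration.
case class Person(name: String, age: Int)

object DatasetExample {
  // Typed transformations: `_.age` and `_.name` are checked at compile time,
  // so a misspelled field name fails to compile instead of failing at runtime.
  def adultNames(people: Dataset[Person]): Dataset[String] = {
    // Brings the implicit Encoder[String] needed by `map` into scope.
    import people.sparkSession.implicits._
    people.filter(_.age >= 18).map(_.name)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local, in-process Spark session is enough for a demo.
    val spark = SparkSession.builder()
      .appName("dataset-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Build a Dataset[Person] from a local collection.
    val people: Dataset[Person] = Seq(Person("Ann", 34), Person("Bob", 17)).toDS()

    adultNames(people).show()
    spark.stop()
  }
}
```

The same filter written against a DataFrame would refer to the column by a string name (`"age"`), which is resolved only when the query is analyzed at runtime.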
External links:
- SPARK-9999 - Dataset API on top of Catalyst/DataFrame
- Michael Armbrust, Wenchen Fan, Reynold Xin and Matei Zaharia. Introducing Spark Datasets. https://databricks.com/blog/2016/01/04/introducing-spark-datasets.html
Related tags: apache-spark, apache-spark-sql, spark-dataframe, rdd