


Spark SQL

Posted on 2019-05-31 | In Apache Spark

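RDDs are unstructured: Spark knows nothing about the fields inside each record. The Structured APIs (DataFrames, added in Spark 1.3, and Datasets, added in Spark 1.6) are the better fit for most workloads.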

Spark SQL is built on top of Spark Core and consists of two parts:

  • DataFrame & Dataset
  • Catalyst optimizer

These structured data types carry a schema definition: each column has a name and a data type.

Creating DataFrames

Anders Cui

© June 2017 - 2021 Anders Cui