Spark Documentation Table of Contents
July 2021
Let's take a look at the Spark documentation as a whole: https://spark.apache.org/docs/latest/
- Launching on a Cluster https://spark.apache.org/docs/latest/spark-standalone.html
- RDD Programming Guide: RDDs, accumulators, broadcast variables https://spark.apache.org/docs/latest/rdd-programming-guide.html
- Spark SQL, Datasets, DataFrames: processing structured data with relational queries https://spark.apache.org/docs/latest/sql-programming-guide.html
- Structured Streaming: processing structured data streams with relational queries https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html
- Spark Streaming: processing data streams with DStreams https://spark.apache.org/docs/latest/streaming-programming-guide.html
- MLlib: machine learning https://spark.apache.org/docs/latest/ml-guide.html
- GraphX: graph processing https://spark.apache.org/docs/latest/graphx-programming-guide.html
- PySpark: processing Spark data with Python https://spark.apache.org/docs/latest/api/python/getting_started/index.html
- Spark Python API(Sphinx) https://spark.apache.org/docs/latest/api/python/index.html
- Configuration https://spark.apache.org/docs/latest/configuration.html
- Monitoring: web-based monitoring https://spark.apache.org/docs/latest/monitoring.html
Working through the documentation in this order seems like a reasonable way to understand Spark.