Complete PySpark Developer Course - Panter - 12.10.2021

Complete PySpark Developer Course
Genre: eLearning | MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz
Language: English | Size: 5.34 GB | Duration: 22h 36m

Learn PySpark in depth with hundreds of practical examples. Be a complete PySpark Developer.

What you'll learn
Complete Curriculum for a successful PySpark Developer
Complete Flow of Installation of PySpark
Introduction to Spark
Understand SparkSession
Spark RDD Fundamentals, Operations, Persistence. Practical examples to solve problems.
Spark Cluster Architecture - Execution, YARN, JVM Processes, DAG Scheduler, Task Scheduler
Spark Shared Variables
Spark SQL Architecture, Catalyst Optimizer, Volcano Iterator Model, Tungsten Execution Engine
DataFrame Fundamentals
DataFrame Rows, Columns and DataTypes. Practical examples.
ETL Using DataFrame (Extraction APIs, Transformation APIs, and Loading APIs). Practical examples.
Optimization and Management - Join Strategies, Driver Conf, Executor Conf etc.
HDFS Commands (Will be added shortly)
Python Fundamentals (Will be added shortly)

Description
This is a complete PySpark Developer course for Data Engineers, Data Scientists, and others who want to process Big Data effectively.
We will cover the topics below and more:
Complete Curriculum for a successful PySpark Developer
Complete Flow of Installation of PySpark
Introduction to Spark (Why Spark was Developed, Spark Features, Spark Components)
Understand SparkSession
Spark RDD Fundamentals
How to Create RDDs
RDD Operations (Transformations & Actions)
Spark Cluster Architecture - Execution, YARN, JVM Processes, DAG Scheduler, Task Scheduler
RDD Persistence
Spark Shared Variables (Broadcast and Accumulators)
Spark SQL Architecture, Catalyst Optimizer, Volcano Iterator Model, Tungsten Execution Engine, Different Benchmarks
Spark Commonly Used Functions - version, range, createDataFrame, sql, table, SparkContext, conf, read, udf, newSession, stop, catalog etc.
DataFrame Built-in Functions - new column, encryption, string, regexp, date, null, collection, na, math and statistics, explode, flatten, formatting and json
What is Partition, Repartition and Coalesce
Repartition vs Coalesce
Extraction - CSV file, text file, Parquet file, ORC file, JSON file, Avro file, Hive, JDBC
DataFrame Fundamentals (What is a DataFrame, DataFrame Sources, DataFrame Features, DataFrame Organization)
DataFrame Rows, Columns and DataTypes. Practical examples.
ETL Using DataFrame (Extraction APIs, Transformation APIs, and Loading APIs). Practical examples.
Optimization and Management - Join Strategies, Driver Conf, Parallelism Configurations, Executor Conf etc.
HDFS Commands (Will be added shortly)
Python Fundamentals (Will be added shortly)
More will be added

Who this course is for:
Any IT professional willing to learn advanced Big Data technologies like PySpark.
Python developers who want to learn Spark.
Data Engineers and Data Scientists.