Spark : big data cluster computing in production / Ilya Ganelin [and others].

By:

Ganelin, Ilya

Contributor(s):

Content type:

text

Media type:

computer

Carrier type:

online resource

ISBN:

9781119254805
1119254809
9781119254041
1119254043
9781119254058
1119254051

Subject(s):

Genre/Form:

Electronic books.

Additional physical formats: Print version: Spark : Big Data Cluster Computing in Production.

DDC classification:

006.3/12 23

LOC classification:

QA76.9.D343

Online resources:

Wiley Online Library

Contents:

Spark : Big Data Cluster Computing in Production; About the Authors; About the Technical Editors; Credits; Acknowledgments; Contents at a glance; Contents; Introduction; Chapter 1 Finishing Your Spark Job; Installation of the Necessary Components; Native Installation Using a Spark Standalone Cluster; The History of Distributed Computing That Led to Spark; Enter the Cloud; Understanding Resource Management; Using Various Formats for Storage; Text Files; Sequence Files; Avro Files; Parquet Files; Making Sense of Monitoring and Instrumentation; Spark UI; Spark Standalone UI; Metrics REST API.

Metrics System; External Monitoring Tools; Summary; Chapter 2 Cluster Management; Background; Spark Components; Driver; Workers and Executors; Configuration; Spark Standalone; Architecture; Single-Node Setup Scenario; Multi-Node Setup; YARN; Architecture; Dynamic Resource Allocation; Scenario; Mesos; Setup; Architecture; Dynamic Resource Allocation; Basic Setup Scenario; Comparison; Summary; Chapter 3 Performance Tuning; Spark Execution Model; Partitioning; Controlling Parallelism; Partitioners; Shuffling Data; Shuffling and Data Partitioning; Operators and Shuffling.

Shuffling Is Not That Bad After All; Serialization; Kryo Registrators; Spark Cache; Spark SQL Cache; Memory Management; Garbage Collection; Shared Variables; Broadcast Variables; Accumulators; Data Locality; Summary; Chapter 4 Security; Architecture; Security Manager; Setup Configurations; ACL; Configuration; Job Submission; Web UI; Network Security; Encryption; Event logging; Kerberos; Apache Sentry; Summary; Chapter 5 Fault Tolerance or Job Execution; Lifecycle of a Spark Job; Spark Master; Spark Driver; Spark Worker; Job Lifecycle; Job Scheduling; Scheduling within an Application.

Scheduling with External Utilities; Fault Tolerance; Internal and External Fault Tolerance; Service Level Agreements (SLAs); Resilient Distributed Datasets (RDDs); Batch versus Streaming; Testing Strategies; Recommended Configurations; Summary; Chapter 6 Beyond Spark; Data Warehousing; Spark SQL CLI; Thrift JDBC/ODBC Server; Hive on Spark; Machine Learning; DataFrame; MLlib and ML; Mahout on Spark; Hivemall on Spark; External Frameworks; Spark Package; XGBoost; spark-jobserver; Future Works; Integration with the Parameter Server; Deep Learning; Enterprise Usage.

Collecting User Activity Log with Spark and Kafka; Real-Time Recommendation with Spark; Real-Time Categorization of Twitter Bots; Summary; Index; EULA.

No physical items for this record

Print version record.
