Mastering Apache Spark is published by Packt Publishing in October 2015. This book has 341 pages in English, ISBN-13 978-1783987146.
Gain expertise in processing and storing data by using advanced techniques with Apache Spark About This Book Explore the integration of Apache Spark with third party applications such as H20, Databricks and Titan Evaluate how Cassandra and Hbase can be used for storage An advanced guide with a combination of instructions and practical examples to extend the most up-to date Spark functionalities Who This Book Is For If you are a developer with some experience with Spark and want to strengthen your knowledge of how to get around in the world of Spark, then this book is ideal for you. Basic knowledge of Linux, Hadoop and Spark is assumed. Reasonable knowledge of Scala is expected. What You Will Learn Extend the tools available for processing and storage Examine clustering and classification using MLlib Discover Spark stream processing via Flume, HDFS Create a schema in Spark SQL, and learn how a Spark schema can be populated with data Study Spark based graph processing using Spark GraphX Combine Spark with H20 and deep learning and learn why it is useful Evaluate how graph storage works with Apache Spark, Titan, HBase and Cassandra Use Apache Spark in the cloud with Databricks and AWS In Detail Apache Spark is an in-memory cluster based parallel processing system that provides a wide range of functionality like graph processing, machine learning, stream processing and SQL. It operates at unprecedented speeds, is easy to use and offers a rich set of data transformations. This book aims to take your limited knowledge of Spark to the next level by teaching you how to expand Spark functionality. The book commences with an overview of the Spark eco-system. You will learn how to use MLlib to create a fully working neural net for handwriting recognition. You will then discover how stream processing can be tuned for optimal performance and to ensure parallel processing. The book extends to show how to incorporate H20 for. 🐧 @iranopensource
Gain expertise in processing and storing data by using advanced techniques with Apache Spark About This Book Explore the integration of Apache Spark with third party applications such as H20, Databricks and Titan Evaluate how Cassandra and Hbase can be used for storage An advanced guide with a combination of instructions and practical examples to extend the most up-to date Spark functionalities Who This Book Is For If you are a developer with some experience with Spark and want to strengthen your knowledge of how to get around in the world of Spark, then this book is ideal for you. Basic knowledge of Linux, Hadoop and Spark is assumed. Reasonable knowledge of Scala is expected. What You Will Learn Extend the tools available for processing and storage Examine clustering and classification using MLlib Discover Spark stream processing via Flume, HDFS Create a schema in Spark SQL, and learn how a Spark schema can be populated with data Study Spark based graph processing using Spark GraphX Combine Spark with H20 and deep learning and learn why it is useful Evaluate how graph storage works with Apache Spark, Titan, HBase and Cassandra Use Apache Spark in the cloud with Databricks and AWS In Detail Apache Spark is an in-memory cluster based parallel processing system that provides a wide range of functionality like graph processing, machine learning, stream processing and SQL. It operates at unprecedented speeds, is easy to use and offers a rich set of data transformations. This book aims to take your limited knowledge of Spark to the next level by teaching you how to expand Spark functionality. The book commences with an overview of the Spark eco-system. You will learn how to use MLlib to create a fully working neural net for handwriting recognition. You will then discover how stream processing can be tuned for optimal performance and to ensure parallel processing. The book extends to show how to incorporate H20 for. 🐧 @iranopensource
Author: Timothy Boronczyk
Pub Date: 2016
ISBN: 978-1-78328-888-5
Pages: 406
Language: English
Format: EPUB, PDF (conv)
Book Details
CentOS is derived from Red Hat Enterprise Linux (RHEL) sources and is widely used as a Linux server. This book will help you to better configure and manage Linux servers in varying scenarios and business requirements. Starting with installing CentOS, this book will walk you through the networking aspects of CentOS. You will then learn how to manage users and their permissions, software installs, disks, filesystems, and so on. You’ll then see how to secure connection to remotely access a desktop and work with databases. Toward the end, you will find out how to manage DNS, e-mails, web servers, and more. You will also learn to detect threats by monitoring network intrusion. Finally, the book will cover virtualization techniques that will help you make the most of CentOS.
Table of Contents
What You Will Learn
See how to deploy CentOS easily and painlessly, even in multi-server environments
Configure various methods of remote access to the server so you don’t always have to be in the data center
Make changes to the default configuration of many services to harden them and increase the security of the system
Learn to manage DNS, emails and web servers
Protect yourself from threats by monitoring and logging network intrusion and system intrusion attempts, rootkits, and viruses
Take advantage of today’s powerful hardware by running multiple systems using virtualization
Authors
Timothy Boronczyk
Timothy Boronczyk is a native of Syracuse, New York, where he works as a lead developer at Optanix, Inc. (formerly ShoreGroup, Inc.). He’s been involved with web technologies since 1998, has a degree in Software Application Programming, and is a Zend Certified Engineer. In what little spare time he has left, Timothy enjoys hanging out with friends, studying Esperanto, and sleeping with his feet off the end of the bed. He’s easily distracted by shiny objects. 🐧 @iranopensource
Pub Date: 2016
ISBN: 978-1-78328-888-5
Pages: 406
Language: English
Format: EPUB, PDF (conv)
Book Details
CentOS is derived from Red Hat Enterprise Linux (RHEL) sources and is widely used as a Linux server. This book will help you to better configure and manage Linux servers in varying scenarios and business requirements. Starting with installing CentOS, this book will walk you through the networking aspects of CentOS. You will then learn how to manage users and their permissions, software installs, disks, filesystems, and so on. You’ll then see how to secure connection to remotely access a desktop and work with databases. Toward the end, you will find out how to manage DNS, e-mails, web servers, and more. You will also learn to detect threats by monitoring network intrusion. Finally, the book will cover virtualization techniques that will help you make the most of CentOS.
Table of Contents
What You Will Learn
See how to deploy CentOS easily and painlessly, even in multi-server environments
Configure various methods of remote access to the server so you don’t always have to be in the data center
Make changes to the default configuration of many services to harden them and increase the security of the system
Learn to manage DNS, emails and web servers
Protect yourself from threats by monitoring and logging network intrusion and system intrusion attempts, rootkits, and viruses
Take advantage of today’s powerful hardware by running multiple systems using virtualization
Authors
Timothy Boronczyk
Timothy Boronczyk is a native of Syracuse, New York, where he works as a lead developer at Optanix, Inc. (formerly ShoreGroup, Inc.). He’s been involved with web technologies since 1998, has a degree in Software Application Programming, and is a Zend Certified Engineer. In what little spare time he has left, Timothy enjoys hanging out with friends, studying Esperanto, and sleeping with his feet off the end of the bed. He’s easily distracted by shiny objects. 🐧 @iranopensource