Learning Apache Kafka - Second Edition

Author: Nishant Garg

Publisher: Packt Publishing Ltd

ISBN: 1784390275

Category: Computers

Page: 112

View: 675

This book is for readers who want to know more about Apache Kafka at a hands-on level; the key audience is those with software development experience but no prior exposure to Apache Kafka or similar technologies. It is also useful for enterprise application developers and big data enthusiasts who have worked with other publisher-subscriber-based systems and want to explore Apache Kafka as a futuristic solution.
Posted in Computers

Kafka: The Definitive Guide

Real-Time Data and Stream Processing at Scale

Author: Neha Narkhede,Gwen Shapira,Todd Palino

Publisher: "O'Reilly Media, Inc."

ISBN: 1491936118

Category: Computers

Page: 322

View: 8924

Every enterprise application creates data, whether it’s log messages, metrics, user activity, outgoing messages, or something else. And how to move all of this data becomes nearly as important as the data itself. If you’re an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds. Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you’ll learn Kafka’s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer. Understand publish-subscribe messaging and how it fits in the big data ecosystem. Explore Kafka producers and consumers for writing and reading messages Understand Kafka patterns and use-case requirements to ensure reliable data delivery Get best practices for building data pipelines and applications with Kafka Manage Kafka in production, and learn to perform monitoring, tuning, and maintenance tasks Learn the most critical metrics among Kafka’s operational measurements Explore how Kafka’s stream delivery capabilities make it a perfect source for stream processing systems
Posted in Computers

Building Data Streaming Applications with Apache Kafka

Author: Manish Kumar,Chanchal Singh

Publisher: Packt Publishing Ltd

ISBN: 1787287637

Category: Computers

Page: 278

View: 4244

Design and administer fast, reliable enterprise messaging systems with Apache Kafka About This Book Build efficient real-time streaming applications in Apache Kafka to process data streams of data Master the core Kafka APIs to set up Apache Kafka clusters and start writing message producers and consumers A comprehensive guide to help you get a solid grasp of the Apache Kafka concepts in Apache Kafka with pracitcalpractical examples Who This Book Is For If you want to learn how to use Apache Kafka and the different tools in the Kafka ecosystem in the easiest possible manner, this book is for you. Some programming experience with Java is required to get the most out of this book What You Will Learn Learn the basics of Apache Kafka from scratch Use the basic building blocks of a streaming application Design effective streaming applications with Kafka using Spark, Storm &, and Heron Understand the importance of a low -latency , high- throughput, and fault-tolerant messaging system Make effective capacity planning while deploying your Kafka Application Understand and implement the best security practices In Detail Apache Kafka is a popular distributed streaming platform that acts as a messaging queue or an enterprise messaging system. It lets you publish and subscribe to a stream of records, and process them in a fault-tolerant way as they occur. This book is a comprehensive guide to designing and architecting enterprise-grade streaming applications using Apache Kafka and other big data tools. It includes best practices for building such applications, and tackles some common challenges such as how to use Kafka efficiently and handle high data volumes with ease. This book first takes you through understanding the type messaging system and then provides a thorough introduction to Apache Kafka and its internal details. The second part of the book takes you through designing streaming application using various frameworks and tools such as Apache Spark, Apache Storm, and more. Once you grasp the basics, we will take you through more advanced concepts in Apache Kafka such as capacity planning and security. By the end of this book, you will have all the information you need to be comfortable with using Apache Kafka, and to design efficient streaming data applications with it. Style and approach A step-by –step, comprehensive guide filled with practical and real- world examples
Posted in Computers

Apache Kafka

Author: Nishant Garg

Publisher: Packt Pub Limited

ISBN: 9781782167938

Category: Computers

Page: 88

View: 5931

The book will follow a step-by-step tutorial approach which will show the readers how to use Apache Kafka for messaging from scratch.Apache Kafka is for readers with software development experience, but no prior exposure to Apache Kafka or similar technologies is assumed. This book is also for enterprise application developers and big data enthusiasts who have worked with other publisher-subscriber based systems and now want to explore Apache Kafka as a futuristic scalable solution.
Posted in Computers

Learning Apache Cassandra

Author: Sandeep Yarabarla

Publisher: Packt Publishing Ltd

ISBN: 1787128407

Category: Computers

Page: 360

View: 4715

Build a scalable, fault-tolerant and highly available data layer for your applications using Apache Cassandra About This Book Install Cassandra and set up multi-node clusters Design rich schemas that capture the relationships between different data types Master the advanced features available in Cassandra 3.x through a step-by-step tutorial and build a scalable, high performance database layer Who This Book Is For If you are a NoSQL developer and new to Apache Cassandra who wants to learn its common as well as not-so-common features, this book is for you. Alternatively, a developer wanting to enter the world of NoSQL will find this book useful. It does not assume any prior experience in coding or any framework. What You Will Learn Install Cassandra Create keyspaces and tables with multiple clustering columns to organize related data Use secondary indexes and materialized views to avoid denormalization of data Effortlessly handle concurrent updates with collection columns Ensure data integrity with lightweight transactions and logged batches Understand eventual consistency and use the right consistency level for your situation Understand data distribution with Cassandra Develop simple application using Java driver and implement application-level optimizations In Detail Cassandra is a distributed database that stands out thanks to its robust feature set and intuitive interface, while providing high availability and scalability of a distributed data store. This book will introduce you to the rich feature set offered by Cassandra, and empower you to create and manage a highly scalable, performant and fault-tolerant database layer. The book starts by explaining the new features implemented in Cassandra 3.x and get you set up with Cassandra. Then you'll walk through data modeling in Cassandra and the rich feature set available to design a flexible schema. Next you'll learn to create tables with composite partition keys, collections and user-defined types and get to know different methods to avoid denormalization of data. You will then proceed to create user-defined functions and aggregates in Cassandra. Then, you will set up a multi node cluster and see how the dynamics of Cassandra change with it. Finally, you will implement some application-level optimizations using a Java client. By the end of this book, you'll be fully equipped to build powerful, scalable Cassandra database layers for your applications. Style and approach This book takes a step-by- step approach to give you basic to intermediate knowledge of Apache Cassandra. Every concept is explained in depth, and is supplemented with practical examples when required.
Posted in Computers

Apache Kafka Cookbook

Author: Saurabh Minni

Publisher: Packt Publishing Ltd

ISBN: 1785880187

Category: Computers

Page: 128

View: 5122

Over 50 hands-on recipes to efficiently administer, maintain, and use your Apache Kafka installation About This Book Quickly configure and manage your Kafka cluster Learn how to use the Apache Kafka cluster and connect it with tools for big data processing A practical guide to monitor your Apache Kafka installation Who This Book Is For If you are a programmer or big data engineer using or planning to use Apache Kafka, then this book is for you. This book has several recipes which will teach you how to effectively use Apache Kafka. You need to have some basic knowledge of Java. If you don't know big data tools, this would be your stepping stone for learning how to consume the data in these kind of systems. What You Will Learn Learn how to configure Kafka brokers for better efficiency Explore how to configure producers and consumers for optimal performance Set up tools for maintaining and operating Apache Kafka Create producers and consumers for Apache Kafka in Java Understand how Apache Kafka can be used by several third party system for big data processing, such as Apache Storm, Apache Spark, Hadoop, and more Monitor Apache Kafka using tools like graphite and Ganglia In Detail This book will give you details about how to manage and administer your Apache Kafka Cluster. We will cover topics like how to configure your broker, producer, and consumer for maximum efficiency for your situation. Also, you will learn how to maintain and administer your cluster for fault tolerance. We will also explore tools provided with Apache Kafka to do regular maintenance operations. We shall also look at how to easily integrate Apache Kafka with big data tools like Hadoop, Apache Spark, Apache Storm, and Elasticsearch. Style and approach Easy-to-follow, step-by-step recipes explaining from start to finish how to accomplish real-world tasks.
Posted in Computers

Conversations with Kafka (Second Edition)

Author: Gustav Janouch

Publisher: New Directions Publishing

ISBN: 0811221024

Category: Biography & Autobiography

Page: 228

View: 1067

A literary gem – a portrait from life of Franz Kafka – now with an ardent preface by Francine Prose, avowed “fan of Janouch’s odd and beautiful book.” Gustav Janouch met Franz Kafka, the celebrated author of The Metamorphosis, as a seventeen-year-old fledgling poet. As Francine Prose notes in her wonderful preface, “they fell into the habit of taking long strolls through the city, strolls on which Kafka seems to have said many amazing, incisive, literary, and per- things to his companion and interlocutor, the teenage Boswell of Prague. Crossing a windswept square, apropos of something or other, Kafka tells Janouch, ‘Life is infinitely great and profound as the immensity of the stars above us. One can only look at it through the narrow keyhole of one’s personal experience. But through it one perceives more than one can see. So above all one must keep the keyhole clean.’” They talk about writing (Kafka’s own, but also that of his favorite writers: Poe, Kleist, and Rimbaud, who “transforms vowels into colors”) as well as technology, film, crime, Darwinism, Chinese philosophy, carpentry, insomnia, street fights, Hindu scripture, art, suicide, and prayer. “Prayer,” Kafka notes, brings “its infinite radiance to bed in the frail little cradle of one’s own existence.”
Posted in Biography & Autobiography

Streaming Architecture

New Designs Using Apache Kafka and Mapr Streams

Author: Ted Dunning,Ellen Friedman, M.D.

Publisher: "O'Reilly Media, Inc."

ISBN: 149195390X

Category:

Page: 120

View: 1660

More and more data-driven companies are looking to adopt stream processing and streaming analytics. With this concise ebook, you'll learn best practices for designing a reliable architecture that supports this emerging big-data paradigm. Authors Ted Dunning and Ellen Friedman (Real World Hadoop) help you explore some of the best technologies to handle stream processing and analytics, with a focus on the upstream queuing or message-passing layer. To illustrate the effectiveness of these technologies, this book also includes specific use cases. Ideal for developers and non-technical people alike, this book describes: Key elements in good design for streaming analytics, focusing on the essential characteristics of the messaging layerNew messaging technologies, including Apache Kafka and MapR Streams, with links to sample codeTechnology choices for streaming analytics: Apache Spark Streaming, Apache Flink, Apache Storm, and Apache ApexHow stream-based architectures are helpful to support microservicesSpecific use cases such as fraud detection and geo-distributed data streams Ted Dunning is Chief Applications Architect at MapR Technologies, and active in the open source community. He currently serves as VP for Incubator at the Apache Foundation, as a champion and mentor for a large number of projects, and as committer and PMC member of the Apache ZooKeeper and Drill projects. Ted is on Twitter as @ted_dunning. Ellen Friedman, a committer for the Apache Drill and Apache Mahout projects, is a solutions consultant and well-known speaker and author, currently writing mainly about big data topics. With a PhD in Biochemistry, she has years of experience as a research scientist and has written about a variety of technical topics. Ellen is on Twitter as @Ellen_Friedman.
Posted in

Fast Data Processing With Spark

Author: Holden Karau

Publisher: Packt Publishing Ltd

ISBN: 1782167072

Category: Computers

Page: 120

View: 8410

This book will be a basic, step-by-step tutorial, which will help readers take advantage of all that Spark has to offer.Fastdata Processing with Spark is for software developers who want to learn how to write distributed programs with Spark. It will help developers who have had problems that were too much to be dealt with on a single computer. No previous experience with distributed programming is necessary. This book assumes knowledge of either Java, Scala, or Python.
Posted in Computers

Apache Kafka 1.0 Cookbook

Over 100 practical recipes on using distributed enterprise messaging to handle real-time data

Author: Raúl Estrada

Publisher: Packt Publishing Ltd

ISBN: 178728218X

Category: Computers

Page: 250

View: 3488

Simplify real-time data processing by leveraging the power of Apache Kafka 1.0 Key Features Use Kafka 1.0 features such as Confluent platforms and Kafka streams to build efficient streaming data applications to handle and process your data Integrate Kafka with other Big Data tools such as Apache Hadoop, Apache Spark, and more Hands-on recipes to help you design, operate, maintain, and secure your Apache Kafka cluster with ease Book Description Apache Kafka provides a unified, high-throughput, low-latency platform to handle real-time data feeds. This book will show you how to use Kafka efficiently, and contains practical solutions to the common problems that developers and administrators usually face while working with it. This practical guide contains easy-to-follow recipes to help you set up, configure, and use Apache Kafka in the best possible manner. You will use Apache Kafka Consumers and Producers to build effective real-time streaming applications. The book covers the recently released Kafka version 1.0, the Confluent Platform and Kafka Streams. The programming aspect covered in the book will teach you how to perform important tasks such as message validation, enrichment and composition.Recipes focusing on optimizing the performance of your Kafka cluster, and integrate Kafka with a variety of third-party tools such as Apache Hadoop, Apache Spark, and Elasticsearch will help ease your day to day collaboration with Kafka greatly. Finally, we cover tasks related to monitoring and securing your Apache Kafka cluster using tools such as Ganglia and Graphite. If you're looking to become the go-to person in your organization when it comes to working with Apache Kafka, this book is the only resource you need to have. What you will learn -Install and configure Apache Kafka 1.0 to get optimal performance -Create and configure Kafka Producers and Consumers -Operate your Kafka clusters efficiently by implementing the mirroring technique -Work with the new Confluent platform and Kafka streams, and achieve high availability with Kafka -Monitor Kafka using tools such as Graphite and Ganglia -Integrate Kafka with third-party tools such as Elasticsearch, Logstash, Apache Hadoop, Apache Spark, and more Who this book is for This book is for developers and Kafka administrators who are looking for quick, practical solutions to problems encountered while operating, managing or monitoring Apache Kafka. If you are a developer, some knowledge of Scala or Java will help, while for administrators, some working knowledge of Kafka will be useful.
Posted in Computers

Mastering Apache Spark 2.x

Author: Romeo Kienzler

Publisher: Packt Publishing Ltd

ISBN: 178528522X

Category: Computers

Page: 354

View: 7595

Advanced analytics on your Big Data with latest Apache Spark 2.x About This Book An advanced guide with a combination of instructions and practical examples to extend the most up-to date Spark functionalities. Extend your data processing capabilities to process huge chunk of data in minimum time using advanced concepts in Spark. Master the art of real-time processing with the help of Apache Spark 2.x Who This Book Is For If you are a developer with some experience with Spark and want to strengthen your knowledge of how to get around in the world of Spark, then this book is ideal for you. Basic knowledge of Linux, Hadoop and Spark is assumed. Reasonable knowledge of Scala is expected. What You Will Learn Examine Advanced Machine Learning and DeepLearning with MLlib, SparkML, SystemML, H2O and DeepLearning4J Study highly optimised unified batch and real-time data processing using SparkSQL and Structured Streaming Evaluate large-scale Graph Processing and Analysis using GraphX and GraphFrames Apply Apache Spark in Elastic deployments using Jupyter and Zeppelin Notebooks, Docker, Kubernetes and the IBM Cloud Understand internal details of cost based optimizers used in Catalyst, SystemML and GraphFrames Learn how specific parameter settings affect overall performance of an Apache Spark cluster Leverage Scala, R and python for your data science projects In Detail Apache Spark is an in-memory cluster-based parallel processing system that provides a wide range of functionalities such as graph processing, machine learning, stream processing, and SQL. This book aims to take your knowledge of Spark to the next level by teaching you how to expand Spark's functionality and implement your data flows and machine/deep learning programs on top of the platform. The book commences with an overview of the Spark ecosystem. It will introduce you to Project Tungsten and Catalyst, two of the major advancements of Apache Spark 2.x. You will understand how memory management and binary processing, cache-aware computation, and code generation are used to speed things up dramatically. The book extends to show how to incorporate H20, SystemML, and Deeplearning4j for machine learning, and Jupyter Notebooks and Kubernetes/Docker for cloud-based Spark. During the course of the book, you will learn about the latest enhancements to Apache Spark 2.x, such as interactive querying of live data and unifying DataFrames and Datasets. You will also learn about the updates on the APIs and how DataFrames and Datasets affect SQL, machine learning, graph processing, and streaming. You will learn to use Spark as a big data operating system, understand how to implement advanced analytics on the new APIs, and explore how easy it is to use Spark in day-to-day tasks. Style and approach This book is an extensive guide to Apache Spark modules and tools and shows how Spark's functionality can be extended for real-time processing and storage with worked examples.
Posted in Computers

Learning Storm

Author: Ankit Jain,Anand Nalya

Publisher: N.A

ISBN: 9781783981328

Category: Computers

Page: 252

View: 5210

If you are a Java developer who wants to enter into the world of real-time stream processing applications using Apache Storm, then this book is for you. No previous experience in Storm is required as this book starts from the basics. After finishing this book, you will be able to develop not-so-complex Storm applications.
Posted in Computers

Learning Apache Karaf

Author: Johan Edstrom,Jamie Goodyear,Heath Kesler

Publisher: Packt Publishing Ltd

ISBN: 178217205X

Category: Computers

Page: 128

View: 2593

The book is a fast-paced guide full of step-by-step instructions covering all aspects of application development using Apache Karaf.Learning Apache Karaf will benefit all Java developers and system administrators who need to develop for and/or operate Karaf’s OSGi-based runtime. Basic knowledge of Java is assumed.
Posted in Computers

Mastering Apache Cassandra

Author: Nishant Neeraj

Publisher: Packt Publishing Ltd

ISBN: 1782162690

Category: Computers

Page: 340

View: 9866

Mastering Apache Cassandra is a practical, hands-on guide with step-by-step instructions. The smooth and easy tutorial approach focuses on showing people how to utilize Cassandra to its full potential.This book is aimed at intermediate Cassandra users. It is best suited for startups where developers have to wear multiple hats: programmer, DevOps, release manager, convincing clients, and handling failures. No prior knowledge of Cassandra is required.
Posted in Computers

Mastering Apache Cassandra - Second Edition

Author: Nishant Neeraj

Publisher: Packt Publishing Ltd

ISBN: 1784396257

Category: Computers

Page: 350

View: 2923

The book is aimed at intermediate developers with an understanding of core database concepts who want to become a master at implementing Cassandra for their application.
Posted in Computers

Spark: The Definitive Guide

Big Data Processing Made Simple

Author: Bill Chambers,Matei Zaharia

Publisher: "O'Reilly Media, Inc."

ISBN: 1491912294

Category: Computers

Page: 606

View: 1519

Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library. Get a gentle overview of big data and Spark Learn about DataFrames, SQL, and Datasets—Spark’s core APIs—through worked examples Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames Understand how Spark runs on a cluster Debug, monitor, and tune Spark clusters and applications Learn the power of Structured Streaming, Spark’s stream-processing engine Learn how you can apply MLlib to a variety of problems, including classification or recommendation
Posted in Computers

Hadoop Real-World Solutions Cookbook

Author: Tanmay Deshpande

Publisher: Packt Publishing Ltd

ISBN: 1784398004

Category: Computers

Page: 290

View: 3181

Over 90 hands-on recipes to help you learn and master the intricacies of Apache Hadoop 2.X, YARN, Hive, Pig, Oozie, Flume, Sqoop, Apache Spark, and Mahout About This Book Implement outstanding Machine Learning use cases on your own analytics models and processes. Solutions to common problems when working with the Hadoop ecosystem. Step-by-step implementation of end-to-end big data use cases. Who This Book Is For Readers who have a basic knowledge of big data systems and want to advance their knowledge with hands-on recipes. What You Will Learn Installing and maintaining Hadoop 2.X cluster and its ecosystem. Write advanced Map Reduce programs and understand design patterns. Advanced Data Analysis using the Hive, Pig, and Map Reduce programs. Import and export data from various sources using Sqoop and Flume. Data storage in various file formats such as Text, Sequential, Parquet, ORC, and RC Files. Machine learning principles with libraries such as Mahout Batch and Stream data processing using Apache Spark In Detail Big data is the current requirement. Most organizations produce huge amount of data every day. With the arrival of Hadoop-like tools, it has become easier for everyone to solve big data problems with great efficiency and at minimal cost. Grasping Machine Learning techniques will help you greatly in building predictive models and using this data to make the right decisions for your organization. Hadoop Real World Solutions Cookbook gives readers insights into learning and mastering big data via recipes. The book not only clarifies most big data tools in the market but also provides best practices for using them. The book provides recipes that are based on the latest versions of Apache Hadoop 2.X, YARN, Hive, Pig, Sqoop, Flume, Apache Spark, Mahout and many more such ecosystem tools. This real-world-solution cookbook is packed with handy recipes you can apply to your own everyday issues. Each chapter provides in-depth recipes that can be referenced easily. This book provides detailed practices on the latest technologies such as YARN and Apache Spark. Readers will be able to consider themselves as big data experts on completion of this book. This guide is an invaluable tutorial if you are planning to implement a big data warehouse for your business. Style and approach An easy-to-follow guide that walks you through world of big data. Each tool in the Hadoop ecosystem is explained in detail and the recipes are placed in such a manner that readers can implement them sequentially. Plenty of reference links are provided for advanced reading.
Posted in Computers

Mastering Apache Spark

Author: Mike Frampton

Publisher: Packt Publishing Ltd

ISBN: 1783987154

Category: Computers

Page: 318

View: 430

Gain expertise in processing and storing data by using advanced techniques with Apache Spark About This Book Explore the integration of Apache Spark with third party applications such as H20, Databricks and Titan Evaluate how Cassandra and Hbase can be used for storage An advanced guide with a combination of instructions and practical examples to extend the most up-to date Spark functionalities Who This Book Is For If you are a developer with some experience with Spark and want to strengthen your knowledge of how to get around in the world of Spark, then this book is ideal for you. Basic knowledge of Linux, Hadoop and Spark is assumed. Reasonable knowledge of Scala is expected. What You Will Learn Extend the tools available for processing and storage Examine clustering and classification using MLlib Discover Spark stream processing via Flume, HDFS Create a schema in Spark SQL, and learn how a Spark schema can be populated with data Study Spark based graph processing using Spark GraphX Combine Spark with H20 and deep learning and learn why it is useful Evaluate how graph storage works with Apache Spark, Titan, HBase and Cassandra Use Apache Spark in the cloud with Databricks and AWS In Detail Apache Spark is an in-memory cluster based parallel processing system that provides a wide range of functionality like graph processing, machine learning, stream processing and SQL. It operates at unprecedented speeds, is easy to use and offers a rich set of data transformations. This book aims to take your limited knowledge of Spark to the next level by teaching you how to expand Spark functionality. The book commences with an overview of the Spark eco-system. You will learn how to use MLlib to create a fully working neural net for handwriting recognition. You will then discover how stream processing can be tuned for optimal performance and to ensure parallel processing. The book extends to show how to incorporate H20 for machine learning, Titan for graph based storage, Databricks for cloud-based Spark. Intermediate Scala based code examples are provided for Apache Spark module processing in a CentOS Linux and Databricks cloud environment. Style and approach This book is an extensive guide to Apache Spark modules and tools and shows how Spark's functionality can be extended for real-time processing and storage with worked examples.
Posted in Computers

Mastering NGINX

Author: Dimitri Aivaliotis

Publisher: Packt Publishing Ltd

ISBN: 1785283766

Category: Computers

Page: 320

View: 8718

An in-depth guide to configuring NGINX for your everyday server needs About This Book Get tips, tricks, and master insight to help you configure NGINX for any server situation Integrate NGINX into your applications architecture with is, using hands-on guidance and practical code samples that are free to use Troubleshoot configuration problems before and as they arise, for a seamless NGINX server experience Who This Book Is For This book is for system administrators and engineers who want to personalize NGINX, and design a robust configuration module to solve their hosting problems. Some knowledge of NGINX is a plus, but is not a prerequisite. What You Will Learn Compile the right third-party module to meet your needs Write an authentication server to use with the mail proxy module Create your own SSL certificates to encrypt connections Use try_files to solve your file-existence check problems Cache and compress responses to get speedier user interaction Integrate popular PHP frameworks with the FastCGI module Construct useful logging configurations In Detail NGINX is a high-performance HTTP server and mail proxy designed to use very few system resources. But despite its power it is often a challenge to properly configure NGINX to meet your expectations. Mastering Nginx is the solution – an insider's guide that will clarify the murky waters of NGINX's configuration. Tune NGINX for various situations, improve your NGINX experience with some of the more obscure configuration directives, and discover how to design and personalize a configuration to match your needs. To begin with, quickly brush up on installing and setting up the NGINX server on the OS and its integration with third-party modules. From here, move on to explain NGINX's mail proxy module and its authentication, and reverse proxy to solve scaling issues. Then see how to integrate NGINX with your applications to perform tasks. The latter part of the book focuses on working through techniques to solve common web issues and the know-hows using NGINX modules. Finally, we will also explore different configurations that will help you troubleshoot NGINX server and assist with performance tuning. Style and approach This is a mastering guide where you will follow an instructional, conversational approach working through problems and their solutions.
Posted in Computers

Apache Spark 2.x Cookbook

Author: Rishi Yadav

Publisher: Packt Publishing Ltd

ISBN: 1787127516

Category: Computers

Page: 294

View: 8648

Over 70 recipes to help you use Apache Spark as your single big data computing platform and master its libraries About This Book This book contains recipes on how to use Apache Spark as a unified compute engine Cover how to connect various source systems to Apache Spark Covers various parts of machine learning including supervised/unsupervised learning & recommendation engines Who This Book Is For This book is for data engineers, data scientists, and those who want to implement Spark for real-time data processing. Anyone who is using Spark (or is planning to) will benefit from this book. The book assumes you have a basic knowledge of Scala as a programming language. What You Will Learn Install and configure Apache Spark with various cluster managers & on AWS Set up a development environment for Apache Spark including Databricks Cloud notebook Find out how to operate on data in Spark with schemas Get to grips with real-time streaming analytics using Spark Streaming & Structured Streaming Master supervised learning and unsupervised learning using MLlib Build a recommendation engine using MLlib Graph processing using GraphX and GraphFrames libraries Develop a set of common applications or project types, and solutions that solve complex big data problems In Detail While Apache Spark 1.x gained a lot of traction and adoption in the early years, Spark 2.x delivers notable improvements in the areas of API, schema awareness, Performance, Structured Streaming, and simplifying building blocks to build better, faster, smarter, and more accessible big data applications. This book uncovers all these features in the form of structured recipes to analyze and mature large and complex sets of data. Starting with installing and configuring Apache Spark with various cluster managers, you will learn to set up development environments. Further on, you will be introduced to working with RDDs, DataFrames and Datasets to operate on schema aware data, and real-time streaming with various sources such as Twitter Stream and Apache Kafka. You will also work through recipes on machine learning, including supervised learning, unsupervised learning & recommendation engines in Spark. Last but not least, the final few chapters delve deeper into the concepts of graph processing using GraphX, securing your implementations, cluster optimization, and troubleshooting. Style and approach This book is packed with intuitive recipes supported with line-by-line explanations to help you understand Spark 2.x's real-time processing capabilities and deploy scalable big data solutions. This is a valuable resource for data scientists and those working on large-scale data projects.
Posted in Computers