Spark: The Definitive Guide

Spark: The Definitive Guide

Author: Bill Chambers

Publisher: "O'Reilly Media, Inc."

Published: 2018-02-08

Total Pages: 594

ISBN-13: 1491912294

DOWNLOAD EBOOK

Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. Youâ??ll explore the basic operations and common functions of Sparkâ??s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Sparkâ??s scalable machine-learning library. Get a gentle overview of big data and Spark Learn about DataFrames, SQL, and Datasetsâ??Sparkâ??s core APIsâ??through worked examples Dive into Sparkâ??s low-level APIs, RDDs, and execution of SQL and DataFrames Understand how Spark runs on a cluster Debug, monitor, and tune Spark clusters and applications Learn the power of Structured Streaming, Sparkâ??s stream-processing engine Learn how you can apply MLlib to a variety of problems, including classification or recommendation


Programming Hive

Programming Hive

Author: Edward Capriolo

Publisher: "O'Reilly Media, Inc."

Published: 2012-09-26

Total Pages: 351

ISBN-13: 1449319335

DOWNLOAD EBOOK

Need to move a relational database application to Hadoop? This comprehensive guide introduces you to Apache Hive, Hadoop’s data warehouse infrastructure. You’ll quickly learn how to use Hive’s SQL dialect—HiveQL—to summarize, query, and analyze large datasets stored in Hadoop’s distributed filesystem. This example-driven guide shows you how to set up and configure Hive in your environment, provides a detailed overview of Hadoop and MapReduce, and demonstrates how Hive works within the Hadoop ecosystem. You’ll also find real-world case studies that describe how companies have used Hive to solve unique problems involving petabytes of data. Use Hive to create, alter, and drop databases, tables, views, functions, and indexes Customize data formats and storage options, from files to external databases Load and extract data from tables—and use queries, grouping, filtering, joining, and other conventional query methods Gain best practices for creating user defined functions (UDFs) Learn Hive patterns you should use and anti-patterns you should avoid Integrate Hive with other data processing programs Use storage handlers for NoSQL databases and other datastores Learn the pros and cons of running Hive on Amazon’s Elastic MapReduce


Cassandra: The Definitive Guide

Cassandra: The Definitive Guide

Author: Jeff Carpenter

Publisher: "O'Reilly Media, Inc."

Published: 2016-06-29

Total Pages: 369

ISBN-13: 1491933631

DOWNLOAD EBOOK

Imagine what you could do if scalability wasn't a problem. With this hands-on guide, you’ll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This expanded second edition—updated for Cassandra 3.0—provides the technical details and practical examples you need to put this database to work in a production environment. Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra’s non-relational design, with special attention to data modeling. If you’re a developer, DBA, or application architect looking to solve a database scaling issue or future-proof your application, this guide helps you harness Cassandra’s speed and flexibility. Understand Cassandra’s distributed and decentralized structure Use the Cassandra Query Language (CQL) and cqlsh—the CQL shell Create a working data model and compare it with an equivalent relational model Develop sample applications using client drivers for languages including Java, Python, and Node.js Explore cluster topology and learn how nodes exchange data Maintain a high level of performance in your cluster Deploy Cassandra on site, in the Cloud, or with Docker Integrate Cassandra with Spark, Hadoop, Elasticsearch, Solr, and Lucene


Programmer's Guide to Apache Thrift

Programmer's Guide to Apache Thrift

Author: William Abernethy

Publisher: Simon and Schuster

Published: 2019-03-17

Total Pages: 878

ISBN-13: 1638351643

DOWNLOAD EBOOK

Summary Programmer's Guide to Apache Thrift provides comprehensive coverage of the Apache Thrift framework along with a developer's-eye view of modern distributed application architecture. Foreword by Jens Geyer. About the Technology Thrift-based distributed software systems are built out of communicating components that use different languages, protocols, and message types. Sitting between them is Thrift, which handles data serialization, transport, and service implementation. Thrift supports many client and server environments and a host of languages ranging from PHP to JavaScript, and from C++ to Go. About the Book Programmer's Guide to Apache Thrift provides comprehensive coverage of distributed application communication using the Thrift framework. Packed with code examples and useful insight, this book presents best practices for multi-language distributed development. You'll take a guided tour through transports, protocols, IDL, and servers as you explore programs in C++, Java, and Python. You'll also learn how to work with platforms ranging from browser-based clients to enterprise servers. What's inside Complete coverage of Thrift's IDL Building and serializing complex user-defined types Plug-in protocols, transports, and data compression Creating cross-language services with RPC and messaging systems About the Reader Readers should be comfortable with a language like Python, Java, or C++ and the basics of service-oriented or microservice architectures. About the Author Randy Abernethy is an Apache Thrift Project Management Committee member and a partner at RX-M. Table of Contents Introduction to Apache Thrift Apache Thrift architecture Building, testing, and debugging Moving bytes with transports Serializing data with protocols Apache Thrift IDL User-defined types Implementing services Handling exceptions Servers Building clients and servers with C++ Building clients and servers with Java Building C# clients and servers with .NET Core and Windows Building Node.js clients and servers Apache Thrift and JavaScript Scripting Apache Thrift Thrift in the enterprise


Kafka: The Definitive Guide

Kafka: The Definitive Guide

Author: Neha Narkhede

Publisher: "O'Reilly Media, Inc."

Published: 2017-08-31

Total Pages: 315

ISBN-13: 1491936118

DOWNLOAD EBOOK

Every enterprise application creates data, whether it’s log messages, metrics, user activity, outgoing messages, or something else. And how to move all of this data becomes nearly as important as the data itself. If you’re an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds. Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you’ll learn Kafka’s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer. Understand publish-subscribe messaging and how it fits in the big data ecosystem. Explore Kafka producers and consumers for writing and reading messages Understand Kafka patterns and use-case requirements to ensure reliable data delivery Get best practices for building data pipelines and applications with Kafka Manage Kafka in production, and learn to perform monitoring, tuning, and maintenance tasks Learn the most critical metrics among Kafka’s operational measurements Explore how Kafka’s stream delivery capabilities make it a perfect source for stream processing systems


Programmer's Guide to Apache Thrift

Programmer's Guide to Apache Thrift

Author: Randy Abernethy

Publisher: Manning Publications

Published: 2019-04-14

Total Pages: 592

ISBN-13: 9781617296161

DOWNLOAD EBOOK

Summary Programmer's Guide to Apache Thrift provides comprehensive coverage of the Apache Thrift framework along with a developer's-eye view of modern distributed application architecture. Foreword by Jens Geyer. Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications. About the Technology Thrift-based distributed software systems are built out of communicating components that use different languages, protocols, and message types. Sitting between them is Thrift, which handles data serialization, transport, and service implementation. Thrift supports many client and server environments and a host of languages ranging from PHP to JavaScript, and from C++ to Go. About the Book Programmer's Guide to Apache Thrift provides comprehensive coverage of distributed application communication using the Thrift framework. Packed with code examples and useful insight, this book presents best practices for multi-language distributed development. You'll take a guided tour through transports, protocols, IDL, and servers as you explore programs in C++, Java, and Python. You'll also learn how to work with platforms ranging from browser-based clients to enterprise servers. What's inside Complete coverage of Thrift's IDL Building and serializing complex user-defined types Plug-in protocols, transports, and data compression Creating cross-language services with RPC and messaging systems About the Reader Readers should be comfortable with a language like Python, Java, or C++ and the basics of service-oriented or microservice architectures. About the Author Randy Abernethy is an Apache Thrift Project Management Committee member and a partner at RX-M. Table of Contents PART 1 - APACHE THRIFT OVERVIEW Introduction to Apache Thrift Apache Thrift architecture Building, testing, and debugging PART 2 - PROGRAMMING APACHE THRIFT Moving bytes with transports Serializing data with protocols Apache Thrift IDL User-defined types Implementing services Handling exceptions Servers PART 3 - APACHE THRIFT LANGUAGES Building clients and servers with C++ Building clients and servers with Java Building C# clients and servers with .NET Core and Windows Building Node.js clients and servers Apache Thrift and JavaScript Scripting Apache Thrift Thrift in the enterprise


gRPC: Up and Running

gRPC: Up and Running

Author: Kasun Indrasiri

Publisher: O'Reilly Media

Published: 2020-01-23

Total Pages: 205

ISBN-13: 1492058300

DOWNLOAD EBOOK

Get a comprehensive understanding of gRPC fundamentals through real-world examples. With this practical guide, you’ll learn how this high-performance interprocess communication protocol is capable of connecting polyglot services in microservices architecture, while providing a rich framework for defining service contracts and data types. Complete with hands-on examples written in Go, Java, Node, and Python, this book also covers the essential techniques and best practices to use gRPC in production systems. Authors Kasun Indrasiri and Danesh Kuruppu discuss the importance of gRPC in the context of microservices development.


Client-Server Web Apps with JavaScript and Java

Client-Server Web Apps with JavaScript and Java

Author: Casimir Saternos

Publisher: "O'Reilly Media, Inc."

Published: 2014-03-28

Total Pages: 259

ISBN-13: 1449369316

DOWNLOAD EBOOK

As a Java programmer, how can you tackle the disruptive client-server approach to web development? With this comprehensive guide, you’ll learn how today’s client-side technologies and web APIs work with various Java tools. Author Casimir Saternos provides the big picture of client-server development, and then takes you through many practical client-server architectures. You’ll work with hands-on projects in several chapters to get a feel for the topics discussed. User habits, technologies, and development methods have drastically altered web app design in recent years. But the Web itself hasn’t changed. This book shows you how to build apps that conform to the web’s underlying architecture. Learn the advantages of using separate client and server tiers, including code organization and speedy prototyping Explore the major tools, frameworks, and starter projects used in JavaScript development Dive into web API design and REST style of software architecture Understand Java’s alternatives to traditional packaging methods and application server deployment Build projects with lightweight servers, using jQuery with Jython, and Sinatra with Angular Create client-server web apps with traditional Java web application servers and libraries


Software Architecture with C++

Software Architecture with C++

Author: Adrian Ostrowski

Publisher: Packt Publishing Ltd

Published: 2021-04-23

Total Pages: 522

ISBN-13: 1789612462

DOWNLOAD EBOOK

Apply business requirements to IT infrastructure and deliver a high-quality product by understanding architectures such as microservices, DevOps, and cloud-native using modern C++ standards and features Key FeaturesDesign scalable large-scale applications with the C++ programming languageArchitect software solutions in a cloud-based environment with continuous integration and continuous delivery (CI/CD)Achieve architectural goals by leveraging design patterns, language features, and useful toolsBook Description Software architecture refers to the high-level design of complex applications. It is evolving just like the languages we use, but there are architectural concepts and patterns that you can learn to write high-performance apps in a high-level language without sacrificing readability and maintainability. If you're working with modern C++, this practical guide will help you put your knowledge to work and design distributed, large-scale apps. You'll start by getting up to speed with architectural concepts, including established patterns and rising trends, then move on to understanding what software architecture actually is and start exploring its components. Next, you'll discover the design concepts involved in application architecture and the patterns in software development, before going on to learn how to build, package, integrate, and deploy your components. In the concluding chapters, you'll explore different architectural qualities, such as maintainability, reusability, testability, performance, scalability, and security. Finally, you will get an overview of distributed systems, such as service-oriented architecture, microservices, and cloud-native, and understand how to apply them in application development. By the end of this book, you'll be able to build distributed services using modern C++ and associated tools to deliver solutions as per your clients' requirements. What you will learnUnderstand how to apply the principles of software architectureApply design patterns and best practices to meet your architectural goalsWrite elegant, safe, and performant code using the latest C++ featuresBuild applications that are easy to maintain and deployExplore the different architectural approaches and learn to apply them as per your requirementSimplify development and operations using application containersDiscover various techniques to solve common problems in software design and developmentWho this book is for This software architecture C++ programming book is for experienced C++ developers looking to become software architects or develop enterprise-grade applications.


Hadoop: The Definitive Guide

Hadoop: The Definitive Guide

Author: Tom White

Publisher: "O'Reilly Media, Inc."

Published: 2012-05-10

Total Pages: 687

ISBN-13: 1449338771

DOWNLOAD EBOOK

Ready to unlock the power of your data? With this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters. You’ll find illuminating case studies that demonstrate how Hadoop is used to solve specific problems. This third edition covers recent changes to Hadoop, including material on the new MapReduce API, as well as MapReduce 2 and its more flexible execution model (YARN). Store large datasets with the Hadoop Distributed File System (HDFS) Run distributed computations with MapReduce Use Hadoop’s data and I/O building blocks for compression, data integrity, serialization (including Avro), and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Design, build, and administer a dedicated Hadoop cluster—or run Hadoop in the cloud Load data from relational databases into HDFS, using Sqoop Perform large-scale data processing with the Pig query language Analyze datasets with Hive, Hadoop’s data warehousing system Take advantage of HBase for structured and semi-structured data, and ZooKeeper for building distributed systems