Explains how software reliability can be applied to software programs of all sizes, functions, and languages, and across all kinds of businesses. This text provides real-life examples from industries such as defence engineering and finance. It is aimed at software and quality assurance engineers and graduate students.
To make communication and computation secure against catastrophic failure and malicious interference, it is essential to build secure software systems and methods for their development. This book describes ideas for meeting these challenges in software engineering.
Can a system be considered truly reliable if it isn't fundamentally secure? Or can it be considered secure if it's unreliable? Security is crucial to the design and operation of scalable systems in production, as it plays an important part in product quality, performance, and availability. In this book, experts from Google share best practices to help your organization design scalable and reliable systems that are fundamentally secure. Two previous O’Reilly books from Google, Site Reliability Engineering and The Site Reliability Workbook, demonstrated how and why a commitment to the entire service lifecycle enables organizations to successfully build, deploy, monitor, and maintain software systems. In this latest guide, the authors offer insights into system design, implementation, and maintenance from practitioners who specialize in security and reliability. They also discuss how building and adopting their recommended best practices requires a culture that’s supportive of such change. You’ll learn about secure and reliable systems through:
- Design strategies
- Recommendations for coding, testing, and debugging practices
- Strategies to prepare for, respond to, and recover from incidents
- Cultural best practices that help teams across your organization collaborate effectively
The overwhelming majority of a software system’s lifespan is spent in use, not in design or implementation. So, why does conventional wisdom insist that software engineers focus primarily on the design and development of large-scale computing systems? In this collection of essays and articles, key members of Google’s Site Reliability Team explain how and why their commitment to the entire lifecycle has enabled the company to successfully build, deploy, monitor, and maintain some of the largest software systems in the world. You’ll learn the principles and practices that enable Google engineers to make systems more scalable, reliable, and efficient, lessons that are directly applicable to your organization. This book is divided into four sections:
- Introduction: Learn what site reliability engineering is and why it differs from conventional IT industry practices
- Principles: Examine the patterns, behaviors, and areas of concern that influence the work of a site reliability engineer (SRE)
- Practices: Understand the theory and practice of an SRE’s day-to-day work: building and operating large distributed computing systems
- Management: Explore Google's best practices for training, communication, and meetings that your organization can use
Deals constructively with recognized software problems. Focuses on the unreliability of computer programs and offers state-of-the-art solutions. Covers software development, software testing, structured programming, composite design, language design, proofs of program correctness, and mathematical reliability models. Written in an informal style for anyone whose work is affected by the unreliability of software. Examples illustrate key ideas, and the text includes over 180 references.
Recent Advances in System Reliability Engineering describes and evaluates the latest tools, techniques, strategies, and methods in this topic for a variety of applications. Special emphasis is put on simulation and modelling technology which is growing in influence in industry, and presents challenges as well as opportunities to reliability and systems engineers. Several manufacturing engineering applications are addressed, making this a particularly valuable reference for readers in that sector.
- Contains comprehensive discussions on state-of-the-art tools, techniques, and strategies from industry
- Connects the latest academic research to applications in industry including system reliability, safety assessment, and preventive maintenance
- Gives an in-depth analysis of the benefits and applications of modelling and simulation to reliability
A high percentage of defense systems fail to meet their reliability requirements. This is a serious problem for the U.S. Department of Defense (DOD), as well as the nation. Those systems are not only less likely to successfully carry out their intended missions, but they also could endanger the lives of the operators. Furthermore, reliability failures discovered after deployment can result in costly and strategic delays and the need for expensive redesign, which often limits the tactical situations in which the system can be used. Finally, systems that fail to meet their reliability requirements are much more likely to need additional scheduled and unscheduled maintenance and to need more spare parts and possibly replacement systems, all of which can substantially increase the life-cycle costs of a system. Beginning in 2008, DOD undertook a concerted effort to raise the priority of reliability through greater use of design for reliability techniques, reliability growth testing, and formal reliability growth modeling, by both the contractors and DOD units. To this end, handbooks, guidances, and formal memoranda were revised or newly issued to reduce the frequency of reliability deficiencies for defense systems in operational testing and the effects of those deficiencies. "Reliability Growth" evaluates these recent changes and, more generally, assesses how current DOD principles and practices could be modified to increase the likelihood that defense systems will satisfy their reliability requirements. This report examines changes to the reliability requirements for proposed systems; defines modern design and testing for reliability; discusses the contractor's role in reliability testing; and summarizes the current state of formal reliability growth modeling. The recommendations of "Reliability Growth" will improve the reliability of defense systems and protect the health of the valuable personnel who operate them.
SOFTWARE RELIABILITY TECHNIQUES FOR REAL-WORLD APPLICATIONS
Authoritative resource providing step-by-step guidance for producing reliable software to be tailored for specific projects
Software Reliability Techniques for Real-World Applications is a practical, up-to-date, go-to source that can be referenced repeatedly to efficiently prevent software defects, find and correct defects if they occur, and create a higher level of confidence in software products. From content development to software support and maintenance, the author creates a depiction of each phase in a project, such as design and coding, operation and maintenance, management, product production, and concept development, and describes the activities and products needed for each. Software Reliability Techniques for Real-World Applications introduces clear ways to understand each process of software reliability and explains how it can be managed effectively and reliably. The book is supported by a plethora of detailed examples and systematic approaches, covering analogies between hardware and software reliability to ensure a clear understanding. Overall, this book helps readers create a higher level of confidence in software products.
In Software Reliability Techniques for Real-World Applications, readers will find specific information on:
- Defects, including where defects enter the project system, effects, detection, and causes of defects, and how to handle defects
- Project phases, including concept development and planning, requirements and interfaces, design and coding, and integration, verification, and validation
- Roadmap and practical guidelines, including at the start of a project, as a member of an organization, and how to handle troubled projects
- Techniques, including an introduction to techniques in general, plus techniques by organization (systems engineering, software, and reliability engineering)
Software Reliability Techniques for Real-World Applications is a practical text on software reliability, providing over sixty-five different techniques and step-by-step guidance for producing reliable software. It is an essential and complete resource on the subject for software developers, software maintainers, and producers of software.