
INFORMATION SAFETY


IMPROVING TECHNOLOGY THROUGH LESSONS FROM SAFETY.



SECURE360 2022

30 May 2022 · @jabenninghoff

A couple of weeks ago, I spoke at Secure360 2022! My talk, “What Safety Science
taught me about Information Risk” was an updated version of my SiRAcon 2021 talk
(available in the members area at https://www.societyinforisk.org).


SESSION DESCRIPTION

> Two years of study and research have changed how I see risk. Safety science
> taught me that improving performance is the key to managing risk, and studying
> successes is the key to risk analysis. The ‘New School’ of safety argues that
> you can’t have a science of non-events; safety comes through being successful
> more often, not failing less. Research in DevOps, Software Security, and
> Security Programs shows a strong link between general and security performance.
> In many (but not all) cases, organizations most effectively reduce
> cybersecurity risk by improving general performance, not by improving
> one-dimensional security or reliability performance.
> 
> This talk presents a new model for security performance that informs how we
> can maximize the value of our security investments: focusing on improving
> existing organizational capabilities, or creating new ones, in response to new
> and emerging threats where general performance falls short. It reviews the
> theory that improving performance improves safety, how that relates to
> cybersecurity risk, evidence from my own and others’ research that supports
> this theory, and how it can be used to analyze and manage risk more
> effectively.


TALK

The talk is broken into three sections, covering both the theory and how to
apply it to best improve security performance.

 * Assumptions backed by accepted theory
   * Assumption 1: organizations are sociotechnical systems
   * Assumption 2: all failures are systems failures
 * Arguments for a new theoretical model backed by evidence
   * Argument 1: resilience improves through performance
   * Argument 2: security performance is correlated with general performance
 * Implications of the model for information risk management: optimize risk
   management based on your performance mode
   * Mode 1: improve general performance
   * Mode 2: add security enhancements to general performance
   * Mode 3: create security-specific systems
   * Guided Adaptability
   * Work against the adversary

Overall, I think the talk went better than expected. While the theory supports
some potentially controversial conclusions, like “retire your vulnerability
management program”, I had good engagement from the audience, ran out of time
for questions, and spent some time afterwards talking with a few attendees in
the hall.

I got the survey results back pretty quickly. Only 9 people responded, maybe
10-20% of the audience (I’m not a good judge of crowd size), but those responses
were very positive, with ~90% of attendees saying they would attend my future
talks. My weakest score was “I am interested in hearing more of this topic”,
which scored just below “agree”.


SLIDES

My slides with notes, including references, are here.

Slides from all presenters at Secure360 (who provided them) are available here,
and most of my past talks and security blog posts are available at
https://transvasive.com.



WHAT IS RESILIENCE ENGINEERING?

06 Apr 2021 · @jabenninghoff

Last August, I took on a new role at my company, and changed my title to
Resilience Engineer. Which leads to an obvious question, what is Resilience
Engineering?

Resilience Engineering (RE) as a concept emerged from safety science in the
early 2000s. While the oldest reference to “Resilience Engineering” appears to
be a paper written by David Woods in 2003,[1] the most-cited work is the book
Resilience Engineering: Concepts and Precepts, a collection of chapters from the
first Resilience Engineering symposium in 2004.[2] In that book and in subsequent
publications, there have been many definitions of RE. This post is my attempt to
succinctly define Resilience Engineering as I practice it, which is:

Resilience Engineering is the practice of working with people and technology to
build software systems that fail less often and recover faster by improving
system performance.

Let’s break that definition down further:


RESILIENCE AND RESILIENCE ENGINEERING

Resilience is a concept from ecology that describes a system’s ability to
dynamically withstand and recover from unexpected disruptions, rather than
maintain a predictable, static state.[3] Whereas resilience in ecological systems
is the result of the interplay between variability and natural selection,
Resilience Engineering seeks to achieve the same results through deliberate
management of the variability of performance:

> “Since both failures and successes are the outcome of normal performance
> variability, safety cannot be achieved by constraining or eliminating that.
> Instead, it is necessary to study both successes and failures, and to find
> ways to reinforce the variability that leads to successes as well as dampen
> the variability that leads to adverse outcomes.”[4]

As both definitions make clear, resilience isn’t achieved through stability;
rather, it is achieved through variability.


WORKING WITH PEOPLE AND TECHNOLOGY

Systems safety recognizes that people are an integral part of the system; one
can’t talk about aviation safety without talking about the technology (the plane
and air traffic control), the people (the pilots and controllers), and the
interplay between the people and the technology. Similarly, the software systems
I work with consist of the code, the machines running the code, and the people
who write and maintain the code. The software engineers and the systems they
build comprise a sociotechnical system, with both technological/process and
social/psychological components.

Further, while technology can’t be ignored, beyond a baseline level of
technology, people are the main contributor to resilience or lack thereof; most
advances in aviation safety over the past 50+ years have come from human factors
research, and it is not by accident that safety science is usually part of the
psychology department. For this reason, I focus my efforts on people, and the
relationship between people and technology.


SYSTEMS THAT FAIL LESS OFTEN AND RECOVER FASTER

‘Systems that fail less often and recover faster’ is an over-simplification of
resilience, but that statement accurately describes the value proposition of
Resilience Engineering in technology; organizations are increasingly reliant on
software systems, to the point where software has become safety-critical. We
have come to expect that our software systems just work, so that failures are
infrequent and systems (the software and the people together) are able to
recover from unexpected disruptions quickly.

This is a distinctly different goal from ecological resilience: it isn’t enough
to build systems that simply survive; they also need to be productive. This is a
challenge unique to Resilience Engineering, as it requires both limiting and
encouraging variability.


IMPROVING SYSTEM PERFORMANCE

For me, the key to understanding Resilience Engineering is HOW to achieve
resilience. Historically within technology, security and operations have sought
to prevent failures (outages, breaches) through centralized control, which does
work, but suffers from limitations that RE seeks to overcome.[5] The shift in
approach starts with the premise that we can’t have a science of non-events, a
science of accidents that don’t happen.[6] Safety-II (an alternative to
traditional ‘Safety-I’) proposes that resilience is the result of factors that
make things go right more often: working safely, something that can be studied.
Under this model, there is no safety-productivity tradeoff, since improving
outcomes leads to improvements in both productivity and resilience.

The work of the DevOps Research and Assessment (DORA) group at Google
demonstrates this concept within software: as organizations improve performance
(deployment frequency and lead time for changes), they also improve resilience
(time to restore service, change failure rate).[7] I’ve found that this approach
works more generally, and through RE I seek to help teams improve their
performance and help leaders improve performance across teams by managing
organizational factors.


OTHER PERSPECTIVES

Resilience Engineering is a diverse space, and there is a small but growing
group of practitioners and researchers applying it to software systems. Two
notable groups are the Resilience Engineering Association and the Learning From
Incidents community. I’ve also recently discovered the work of Dr Drew Rae and
Dr David Provan through their Safety of Work podcast. Their paper on Resilience
Engineering in practice is aimed at traditional safety professionals, but I’ve
found its ideas easily adapted to software systems.

As a practitioner-researcher myself, I’m hoping to adapt and apply the science
to software systems, to improve the profession as well as contribute to the
collective knowledge of Resilience Engineering.


FUTURE ARTICLES

Update: I’ve been asked to elaborate on the ideas behind Resilience Engineering,
so I’ve added this section to cover a plan for future articles on the topic:

 * The origins and history of Resilience Engineering
 * Parallels between Cybersecurity, Operations, and Safety
 * Is DevOps culture High Reliability culture?
 * My research in software systems resilience

Updates and links will be posted here.

 1. Woods, D., & Wreathall, J. (2003). Managing Risk Proactively: The Emergence
    of Resilience Engineering.
    https://www.researchgate.net/publication/228711828_Managing_Risk_Proactively_The_Emergence_of_Resilience_Engineering ↩

 2. Hollnagel, E., Woods, D. D., & Leveson, N. (2006). Resilience engineering:
    Concepts and precepts. Ashgate. ↩

 3. Holling, C. S. (1973). Resilience and stability of ecological systems.
    Annual Review of Ecology and Systematics, 4, 1-23.
    https://doi.org/10.1146/annurev.es.04.110173.000245 ↩

 4. Hollnagel, E. (2008). Preface: Resilience engineering in a nutshell. In E.
    Hollnagel, C. P. Nemeth, & S. Dekker (Eds.), Resilience Engineering
    Perspectives, Volume 1: Remaining Sensitive to the Possibility of Failure
    (pp. ix-xii). Ashgate. ↩

 5. Provan, D. J., Woods, D. D., Dekker, S. W. A., & Rae, A. J. (2020). Safety
    II professionals: How resilience engineering can transform safety practice.
    Reliability Engineering & System Safety, 195, 106740.
    https://doi.org/10.1016/j.ress.2019.106740 ↩

 6. Hollnagel, E. (2014). Is safety a subject for science? Safety Science, 67,
    21-24. https://doi.org/10.1016/j.ssci.2013.07.025 ↩

 7. Forsgren, N., Smith, D., Humble, J., & Frazelle, J. (2019). 2019 Accelerate
    State of DevOps Report. DORA & Google Cloud.
    https://research.google/pubs/pub48455/ ↩



WORKING WITH R

11 Sep 2020 · @jabenninghoff

Around the time of SIRACon 2020, I decided to start using R. I needed a data
analysis tool that would allow me to conduct traditional statistical analysis,
and I wanted a tool that would be valuable to learn and one that would allow me
to do exploratory analysis as well. Originally I considered SPSS (free to
students) and RStudio. The tradeoffs between the two were pretty clear: SPSS is
very easy to use, but expensive, proprietary, and old. RStudio and R have a
tougher learning curve, but are free and open source, under active development,
and have a large online community. After reading a thread on the SIRA mailing
list, I was leaning towards R, and re-watched Elliot Murphy’s 2019 SIRAcon
presentation on using notebooks, which led me to consider both R Markdown and
Python Jupyter Notebooks. I did more searching and reading, and finally settled
on R Notebooks for a few reasons: R Notebooks are more disciplined (no strange
side effects from running code out of order), fewer environment problems, the
support of the RStudio company, better visualizations, and just because R is the
more data-sciency language.

The SIRA community was quite supportive of this idea when I asked for
suggestions on getting started in the BOF session, and recommended Teacup
Giraffes and Tidy Tuesday for learning R, and on my own I found RStudio
recommendations. Of course, being a sysadmin at heart, I set out to figure out
how exactly to best install R and RStudio, and manage the notebooks in git.

Installation on macOS was easy enough: just brew install r and brew cask install
rstudio. GitHub published a tutorial in 2018 on getting RStudio integrated with
GitHub, and I started working on that. I quickly discovered that while the
tutorial was helpful, it wasn’t quite the setup I wanted; it published R
Markdown through GitHub Pages, but didn’t directly support the automatically
generated HTML of R Notebooks. Side note: the consensus was to use html_notebook
as a working document, and html_document to publish. After more searching, I was
able to get Notebooks working on GitHub using the method described in
rstudio/rmarkdown #1020: checking the .nb.html into git and using GitHub Pages
so that you can view the rendered HTML instead of just the HTML source.
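The whole setup can be sketched as a short shell recipe (the notebook filenames
are hypothetical, and the Homebrew syntax is the 2020-era form used above; newer
Homebrew replaces brew cask install with brew install --cask):

```shell
# Install R and RStudio with Homebrew (2020-era syntax)
brew install r
brew cask install rstudio

# Per rstudio/rmarkdown #1020: commit the rendered .nb.html alongside
# the .Rmd source so GitHub Pages can serve the rendered HTML
git add analysis.Rmd analysis.nb.html
git commit -m "Add notebook with rendered HTML for GitHub Pages"
```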

Working through this, I noted that RStudio is quite good at automatically
downloading and installing packages as needed; it triggered installation of
rmarkdown and supporting packages when creating a new R Notebook, and also readr
when importing data from csv. Which got me thinking: what about package
management? While it seems that R doesn’t have the level of challenge posed by
Python or Ruby, managing packages on a per-project basis is a best practice I
learned from using Bundler to manage the code of this site (the only gem I
install outside a project is bundler). So I went looking for the R equivalent…

I first found Packrat and then its replacement, renv (Packrat is maintained, but
all new development has shifted to renv). Setting it up is as simple as
install.packages("renv") and renv::init(), and RStudio has published:

 * A talk at rstudio::conf 2020, renv: Project Environments for R
 * A blog post, renv: Project Environments for R
 * An Introduction to renv
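Putting the two commands above together, a minimal renv bootstrap looks
something like this (a sketch, run from the project root, assuming R and
Rscript are already on the PATH):

```shell
# One-time renv setup for per-project package management
Rscript -e 'install.packages("renv", repos = "https://cloud.r-project.org")'
Rscript -e 'renv::init()'      # creates renv/ and a project-local library

# After adding or updating packages, record the versions in renv.lock
Rscript -e 'renv::snapshot()'
```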

This left one final question: how exactly to install R? Homebrew offers two
methods: installing the official binaries using brew cask install r, or just
brew install r. Poking around further, I found that the cask method was
sub-optimal, as it installs in /usr/local, which causes issues with brew doctor.
Interestingly, I also found that Homebrew’s R doesn’t include all R features,
but the same author, Luis Puerto, offered a solution to install all the things.
I haven’t tried it yet, but I may go with homebrew-r-srf as suggested by Luis
(or a fork of it).

What’s next? At some point I plan to try to integrate GitHub actions for
testing, and create a CI/CD pipeline of sorts for Pages, using GitHub actions.
And, of course, actually using R for data analysis…

Update: I tested homebrew-r-srf, and am going with homebrew r. There was some
weirdness with the install/uninstall (/usr/local/lib/R left over), I don’t know
if I’ll need the optional features, and homebrew r now uses openblas. If I find
I actually need any of the missing capabilities, I’ll likely write my own
formula.

© 2022. All rights reserved.