Friday, February 16, 2018

Robustness Testing of Autonomy Software (ASTAA Paper Published)

I'm very pleased that our research team will present a paper on Robustness Testing of Autonomy Software at the ICSE Software Engineering in Practice session in a late May. You can see a preprint of the paper here:

The work summarizes what we've learned across several years of research stress testing many robots, including self-driving cars.

As robotic and autonomy systems become progressively more present in industrial and human-interactive applications, it is increasingly critical for them to behave safely in the presence of unexpected inputs. While robustness testing for traditional software systems is long-studied, robustness testing for autonomy systems is relatively uncharted territory. In our role as engineers, testers, and researchers we have observed that autonomy systems are importantly different from traditional systems, requiring novel approaches to effectively test them. We present Automated Stress Testing for Autonomy Architectures (ASTAA), a system that effectively, automatically robustness tests autonomy systems by building on classic principles, with important innovations to support this new domain. Over five years, we have used ASTAA to test 17 real-world autonomy systems, robots, and robotics-oriented libraries, across commercial and academic applications, discovering hundreds of bugs. We outline the ASTAA approach and analyze more than 150 bugs we found in real systems. We discuss what we discovered about testing autonomy systems, specifically focusing on how doing so differs from and is similar to traditional software robustness testing and other high-level lessons.

Casidhe Hutchison
Milda Zizyte
Patrick Lanigan
David Guttendorf
Mike Wagner
Claire Le Guoes
Philip Koopman

Monday, January 29, 2018

New Peer Review Checklist for Embedded C Code

Here's a new peer review checklist to help improve the quality of your embedded C code.

To use the checklist, you should do a sit-down meeting with, ideally, three reviewers not including the code author. Divide the checklist up into three portions as indicated.  Be sure to run decent static analysis before the review to safe reviewer time -- let the tools find the easy stuff before spending human time on the review.

After an initial orientation to what the code is supposed to do and relevant background, the review process is:
  1. The review leader picks the next few lines of code to be reviewed and makes sure everyone is ONLY focused on those few lines.  Usually this is 5-10 lines encompassing a conditional structure, a basic block, or other generally unified small chunk within the code.
  2. Reviewers identify any code problems relevant to their part of the checklist.  It's OK if they notice others, but they should focus on individually considering each item in their part of the checklist and ask "do I see a violation of this item" in just the small chunk of code being considered.
  3. Reviewer comments should be recorded in the form: "Line X seems to violate Checklist Item Y for the following reason." Do NOT suggest a fix -- just record the issue.
  4. When all comments have been recorded, go back to step 1.  Continue to review up to a maximum of 2 hours. You should be covering about 100-200 lines of code per hour. Too fast and too slow are both a problem.
A text version of the checklist is below. You can also download an acrobat version here.  Additional pointers to support materials are after the checklist. If you have a static analysis tool that automates any of the checklist item, feel free to replace that item with something else that's important to you.

Peer Review Checklist: Embedded C Code
Before Review:
0    _____    Code compiles clean with extensive warning checks (e.g. MISRA C rules)
Reviewer #1:       
1    _____    Commenting:  top of file, start of function, code that needs an explanation
2    _____    Style is consistent and follows style guidelines
3    _____    Proper modularity, module size, use of .h files and #includes
4    _____    No orphans (redundant, dead, commented out, unused code & variables)
5    _____    Conditional expressions evaluate to a boolean value; no assignments
6    _____    Parentheses used to avoid operator precedence confusion
7    _____    All switch statements have a default clause; preferably an error trap
Reviewer #2:       
8    _____    Single point of exit from each function
9    _____    Loop entry and exit conditions correct; minimum continue/break complexity
10    _____    Conditionals should be minimally nested (generally only one or two deep)
11    _____    All functions can be unit tested; SCC or SF complexity less than 10 to 15
12    _____    Use const and inline instead of #define; minimize conditional compilation
13    _____    Avoid use of magic numbers (constant values embedded in code)
14    _____    Use strong typing (includes: sized types, structs for coupled data, const)
15    _____    Variables have well chosen names and are initialized at definition
Reviewer #3:       
16    _____    Minimum scope for all functions and variables; essentially no globals
17    _____    Concurrency issues? (locking, volatile keyword, minimize blocking time)
18    _____    Input parameter checking is done (style, completeness)
19    _____    Error handling for function returns is appropriate
20    _____    Null pointers, division by zero, null strings, boundary conditions handled
21    _____    Floating point use is OK (equality, NaN, INF, roundoff); use of fixed point
22    _____    Buffer overflow safety (bound checking, avoid unsafe string operations)
All Reviewers:     
23    _____    Does the code match the detailed design (correct functionality)?
24    _____    Is the code as simple, obvious, and easy to review as possible?
        For TWO Reviewers assign items:   Reviewer#1:  1-11; 23-24    Reviewer#2: 12-24
        Items that are covered with static analysis can be removed from checklist
        Template 1/28/2018:  Copyright CC BY 4.0, 2018, Philip Koopman

Additional material to help you with successful peer reviews:

Tuesday, December 26, 2017

Embedded System Safety Lecture Video Series

I've posted the full series of embedded system safety lecture videos on YouTube.  These are full-length narrated slides of the core set of safety topics from my new course.  They concentrate on getting the big picture about failures, redundancy management, and safety integrity.
Each of the videos is posted to YouTube as a playlist, with each video covering one substantive slide. The full lecture consists of playing the entire play list, with most lectures being 5-7 videos in sequence. (The slide download has been updated for my Fall 2017 course, so in general has a little more material than the original video. They'll get synchronized eventually, but for now this is what I have.)

Obviously there is more to safety than just these topics. Supporting topics such as code quality and development approaches are currently just available as slides.  You can see the full set of course slides here:

Monday, November 27, 2017

Embedded Software Course Notes On-Line

I'm just wrapping up my first semester teaching a new course on embedded system software. It covers code quality, safety, and security. Below is table of lecture handouts.

18-642 Embedded System Software Engineering
Prof. Philip Koopman, Carnegie Mellon University, Fall 2017

1Course IntroductionSoftware is eating the world; embedded applications and markets; bad code is a problem; coding is 0% of software; truths and management misconceptions
2Software Development ProcessesWaterfall; swiss cheese model; lessons learned in software; V model; design vs. code; agile methods; agile for embedded
3Global VariablesGlobal vs. static variables; avoiding and removing globals
4Spaghetti CodeMcCabe Cyclomatic Complexity (MCC); SCC; Spaghetti Factor (SF)
5Unit TestingBlack box testing; white box testing; unit testing strategies; MCDC coverage; unit testing frameworks (cunit)
6Modal Code/StatechartsStatechart elements; statechart example; statechart implementation
7Peer ReviewsEffective code quality practices, peer review efficiency and effectiveness; Fagan inspections; rules for peer review; review report; perspective-based reviews; review checklist; case study; economics of peer review
8Code Style/HumansMaking code easy to read; good code hygiene; avoiding premature optimization; coding style
9Code Style/LanguagePitfalls and problems with C; language use guidelines and analysis tools; using language wisely (strong typing); Mars Climate Orbiter; deviations & legacy code
10Testing QualitySmoke testing, exploratory testing; methodical test coverage; types of testing; testing philosophy; coverage; testing resources
11RequirementsAriane 5 flight 501; rules for good requirements; problematic requirements; extra-functional requirements; requirements approaches; ambiguity
12System-Level TestFirst bug story; effective test plans; testing won't find all bugs; F-22 Raptor date line bug; bug farms; risks of bad software
13SW ArchitectureHigh Level Design (HLD); boxes and arrows; sequence diagrams (SD); statechart to SD relationship; 2011 Health Plan chart
14Integration TestingIntegration test approaches; tracing integration tests to SDs; network message testing; using SDs to generate unit tests
15TraceabilityTraceability across the V; examples; best practices
16SQA isn't testingSQA elements; audits; SQA as coaching staff; cost of defect fixes over project cycle
17Lifecycle CMA400M crash; version control; configuration management; long lifecycles
18MaintenanceBug fix cycle; bug prioritization; maintenance as a large cost driver; technical debt
19Process Key MetricsTester to developer ratio; code productivity; peer review effectiveness
33Date Time ManagementKeeping time; time terminology; clock synchronization; time zones; DST; local time; sunrise/sunset; mobility and time; date line; GMT/UTC; leap years; leap seconds; time rollovers; Zune leap year bug; internationalization.
21Floating Point PitfallsFloating point formats; special values; NaN and robots; roundoff errors; Patriot Missile mishap
23Stack OverflowStack overflow mechanics; memory corruption; stack sentinels; static analysis; memory protection; avoid recursion
25Race ConditionsTherac 25; race condition example; disabling interrupts; mutex; blocking time; priority inversion; priority inheritance; Mars Pathfinder
27Data IntegritySources of faults; soft errors; Hamming distance; parity; mirroring; SECDED; checksum; CRC
20Safety+Security OverviewChallenges of embedded code; it only takes one line of bad code; problems with large scale production; your products live or die by their software; considering the worst case; designing for safety; security matters; industrial controls as targets; designing for security; testing isn't enough
Fiat Chrysler jeep hack; Ford Mytouch update; Toyota UA code quality; Heartbleed; Nest thermostats; Honda UA recall; Samsung keyboard bug; hospital infusion pumps; LIFX smart lightbulbs; German steel mill hack; Ukraine power hack; SCADA attack data; Shodan; traffic light control vulnerability; hydroelectric plant vulnerability; zero-day shopping list
22DependabilityDependability; availability; Windows 2000 server crash; reliability; serial and parallel reliability; example reliability calculation; other aspects of dependability
24Critical SystemsSafety critical vs. mission critical; worst case and safety; HVAC malfunction hazard; Safety Integrity Levels (SIL); Bhopal; IEC 61508; fleet exposure
26Safety PlanSafety plan elements; functional safety approaches; hazards & risks; safety goals & safety requirements; FMEA; FTA; safety case (GSN)
28Safety RequirementsIdentifying safety-related requirements; safety envelope; Doer/Checker pattern
29Single Points of FailureFault containment regions (FCR); Toyota UA single point failure; multi-channel pattern; monitor pattern; safety gate pattern; correlated & accumulated faults
30SIL IsolationIsolating different SILs, mixed-SIL interference sources; mitigating cross-SIL interference; isolation and security; CarShark hack
31Redundancy ManagementBellingham WA gasoline pipeline mishap; redundancy for availability; redundancy for fault detection; Ariane 5 Flight 501; fail operational; triplex modular redundancy (TMR) 2-of-3 pattern; dual 2-of-2 pattern; high-SIL Doer/Checker pattern; diagnostic effectiveness and proof tests
32Safety Architecture PatternsSupplemental lecture with more detail on patterns: low SIL; self-diagnosis; partitioning; fail operational; voting; fail silent; dual 2-of-2; Ariane 5 Flight 501; fail silent patterns (low, high, mixed SIL); high availability mixed SIL pattern
34Security PlanSecurity plan elements; Target Attack; security requirements; threats; vulnerabilities; mitigation; validation
35CryptographyConfusion & diffusion; Caesar cipher; frequency analysis; Enigma; Lorenz & Colossus; DES; AES; public key cryptography; secure hashing; digital signatures; certificates; PKI; encrypting vs. signing for firmware update
36Security ThreatsStuxnet; attack motivation; attacker threat levels; DirectTV piracy; operational environment; porous firewalls; Davis Besse incident; BlueSniper rifle; integrity; authentication; secrecy; privacy; LG Smart TV privacy; DoS/DDos; feature activation; St. Jude pacemaker recall
37Security VulnerabilitiesExploit vs. attack; Kettle spambot; weak passwords; master passwords; crypto key length; Mirai botnet attack; crypto mistakes; LIFX revisited; CarShark revisited; chip peels; hidden functionality; counterfeit systems; cloud connected devices; embedded-specific attacks
38Security Mitigation ValidationPassword strength; storing passwords & salt/pepper/key stretching; Adobe password hack; least privilege; Jeep firewall hack; secure update; secure boot; encryption vs. signing revisited; penetration testing; code analysis; other security approaches; rubber hose attack
39Security PitfallsKonami code; security via obscurity; hotel lock USB hack; Kerckhoff's principle; hospital WPA setup hack; DECSS; Lodz tram attack; proper use of cryptography; zero day exploits; security snake oil; realities of in-system firewalls; aircraft infotainment and firewalls; zombie road sign hack

Note that in Spring 2018 these are likely to be updated, so if want to see the latest also check the main course page:   For other lectures and copyright notes, please see my general lecture notes & video page:

Friday, November 17, 2017

Highly Autonomous Vehicle Validation

Here are the slides from my TechAD talk today.

Highly Autonomous Vehicle Validation from Philip Koopman

Highly Autonomous Vehicle Validation: it's more than just road testing!
- Why a billion miles of testing might not be enough to ensure self-driving car safety.
- Why it's important to distinguish testing for requirements validation vs. testing for implementation validation.
- Why machine learning is the hard part of mapping autonomy validation to ISO 26262