This content originally appeared on DEV Community and was authored by Kyle
I personally know somebody who broke production, costing their company hundreds of thousands of dollars in a matter of hours. Definitely not a good look, but it wasn’t really their fault: it happened because that person’s IT Director and senior engineer were completely fine with having no unit or component tests in the front end. As developers, we’re really only as good as the systems we find ourselves in. Regardless, I know that person felt terrible about what happened. Still, however horrible they must have felt, it couldn’t have been any worse than how the Lockheed Martin engineer who introduced a $500 million bug in 1998 must have felt.
$500 Million Down the Drain
On December 11, 1998, the Mars Climate Orbiter took off for space from the launch pad in Cape Canaveral near Orlando, Florida. About nine months later, on September 23, 1999, the orbiter dove into Mars’ atmosphere at a top speed of 12,000 mph and was likely ripped apart in seconds. All communication with it was lost and $500 million was flushed down the drain. The work of 200+ engineers, many of whom considered this their life’s work, was gone. While tragic at the time, this story highlights an area of software engineering whose importance I think is often overlooked.
The Origins of C++ Object Orientation
If you’re a developer, you know what object-oriented programming is and you know about the four pillars and how they can be used to keep our applications modular and flexible. Up until recently, I didn’t truly understand what this “modularity” and “flexibility” actually provide for our applications. At its core, object-oriented principles give us a safety net, both literally and figuratively. If you know anything about the relation between C and C++, you know that C++ is basically C with object orientation. If you know that, you probably also know that the creator of C++ was none other than Bjarne Stroustrup, who was very intentional about what he wanted to achieve with his “object-oriented version” of C.
I recently completed Dr. Chuck’s C Programming for Everybody course and ran into fascinating insights regarding the true impact of OOP within our software. In the last section, I learned how, in theory, you could replicate object orientation in C, and how languages like Python use object-oriented patterns to implement their data structures. That section also included a video in which Bjarne was asked, “Why did you create C++ and what were you trying to achieve?” He basically said he wanted to deliver true reliability for the applications that need it the most. By “the most,” Bjarne was referring to software whose failure could mean life or death for human beings or cause catastrophic events such as the loss of the Mars Climate Orbiter.
Bjarne, in addition to using the Mars Climate Orbiter example, mentioned the systems that our phones and cars run on. What if the software running our cell phones goes out? People might not be able to reach 911. And the brakes in our cars? We wouldn’t want those to go out. These are just a couple of the scenarios that Bjarne built C++ for, and I think they make its importance feel palpable. If you’re a programmer, this makes you appreciate the reason behind object-oriented principles (including interfaces) beyond just memorizing them for a job interview.
The Mars Climate Orbiter
The two main organizations involved in the launch and management of the Mars Climate Orbiter were Lockheed Martin (located in Denver, Colorado) and NASA’s Jet Propulsion Laboratory (JPL) (located in Pasadena, California). In typical space-exploration scenarios, everything is measured in metric units. Here, the project specifications called for metric units, but Lockheed Martin wrote code that sent imperial units to NASA, and NASA fed those numbers into its model, which expected metric units. When NASA thought the orbiter was in Mars’ orbit, it was actually in the atmosphere, and by then the situation was unrecoverable.
Throughout the process, it was Lockheed Martin’s job to control the “attitude” of the orbiter, which is basically just the direction it’s facing. For example, you are looking at a computer screen right now. You would change the attitude of your head by keeping it in the same spot but just twisting your neck to the left or right. In space, when Lockheed Martin adjusted the attitude, that would affect the momentum of the orbiter, which was measured as an “impulse”. Lockheed would store information about the “impulse” in a file, send it to NASA over their internal networks, and then NASA would process that data into their “orbital dynamics model.”
In the postmortem of the incident, it would have been easy to point a finger at Lockheed considering that the specs called for metric units. However, software development is a two-way street, so both parties correctly took ownership.
The Fix
In Bjarne’s own words, “This could have been avoided by an ever so slight improvement in the interfaces between the parts of that program.” So, what might this have looked like in practice? Since C# is my go-to, we’ll roll with that here. If you’re not interested in the code, skip past this section for a high-level overview.
In physics terms, Impulse (J) = Force (F) × Time (Δt).
The following is a BAD example from the vantage point of Lockheed Martin. Here they pass a bare double into the impulse calculation and then send the result to NASA’s model, which leaves no way of checking the unit of measurement of the incoming value:
public static class LockheedBad
{
    public static double GetThrusterForce() => 55.0; // lbf, but the type says nothing
    public static double GetBurnDuration() => 2.0;   // s

    // 55 * 2 = 110, which is lbf*s, but the return type is just a double
    public static double ComputeImpulse() => GetThrusterForce() * GetBurnDuration();

    public static void WriteImpulseForNasa()
        => System.IO.File.WriteAllText("impulse.txt", ComputeImpulse().ToString()); // "110" with no unit
}
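To make the danger concrete, here is a minimal sketch of what the receiving side of that file might look like. This is not actual NASA or JPL code: the `NasaBad` class, its method names, and the assumption that the reader treats the number as N*s are all hypothetical, chosen only to mirror the `LockheedBad` example above.

```csharp
using System.Globalization;

// Hypothetical receiver of the unit-less file; not actual NASA/JPL code.
public static class NasaBad
{
    public const double NewtonsPerPoundForce = 4.4482216152605;

    // The model expects N*s. A bare number cannot contradict that assumption,
    // so "110" is accepted as 110 N*s without complaint.
    public static double ReadImpulseNewtonSeconds(string fileContents) =>
        double.Parse(fileContents, CultureInfo.InvariantCulture);

    // What the sender's lbf*s figure is actually worth in N*s.
    public static double PoundForceSecondsToNewtonSeconds(double poundForceSeconds) =>
        poundForceSeconds * NewtonsPerPoundForce;
}
```

With the numbers from the example, `ReadImpulseNewtonSeconds("110")` returns 110, while the 110 lbf*s that was actually meant is about 489.3 N*s. The model would underestimate every such impulse by a factor of roughly 4.45, and nothing in the code or the file would flag it.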
On the flip side, the following would be a GOOD example of what Lockheed could and should have done. If you read the code below, focus on the IQuantity interface, the enums, and the Force and Impulse structs, since these show how type safety might have been implemented.
In theory, the process would have had three main steps (to be clear, this is not what actually happened):
1. Lockheed would verify the units of measurement in its own system. Each quantity type implements the IQuantity interface, and its factory method requires the caller to state the unit and converts the value to SI units, no matter which unit was supplied. That Force, now certain to be in SI units, would then be used to calculate the impulse.
2. Lockheed would pass the impulse data to NASA. In the code below, the unit of measure travels with the value in whatever data transfer method they used (JSON, XML, a file, a message, Protobuf, etc.).
3. NASA would receive this information (the impulse value and its unit of measurement) and apply similar type checking on its side. If both Lockheed and NASA were using the same language, they could have shared one library for this.
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Shared contract: "every quantity must expose its SI value"
public interface IQuantity<TUnit>
{
    double ValueInBaseUnit { get; } // stored in SI (base unit)
}

public enum ForceUnit { Newton, PoundForce }
public enum TimeUnit { Second }
public enum ImpulseUnit { NewtonSecond, PoundForceSecond }

public readonly struct Force : IQuantity<ForceUnit>
{
    public double ValueInBaseUnit { get; } // Newtons

    private Force(double newtons) => ValueInBaseUnit = newtons;

    // Caller must declare the unit; the factory converts to SI (Newtons).
    public static Force From(double value, ForceUnit unit) => unit switch
    {
        ForceUnit.Newton => new Force(value),
        ForceUnit.PoundForce => new Force(value * 4.4482216152605),
        _ => throw new ArgumentOutOfRangeException(nameof(unit))
    };
}

public readonly struct Duration : IQuantity<TimeUnit>
{
    public double ValueInBaseUnit { get; } // Seconds

    private Duration(double seconds) => ValueInBaseUnit = seconds;

    public static Duration From(double value, TimeUnit unit) => unit switch
    {
        TimeUnit.Second => new Duration(value),
        _ => throw new ArgumentOutOfRangeException(nameof(unit))
    };
}

public readonly struct Impulse : IQuantity<ImpulseUnit>
{
    public double ValueInBaseUnit { get; } // Newton-seconds

    private Impulse(double newtonSeconds) => ValueInBaseUnit = newtonSeconds;

    public static Impulse From(double value, ImpulseUnit unit) => unit switch
    {
        ImpulseUnit.NewtonSecond => new Impulse(value),
        ImpulseUnit.PoundForceSecond => new Impulse(value * 4.4482216152605),
        _ => throw new ArgumentOutOfRangeException(nameof(unit))
    };

    // Safe composition from Force × Duration (already SI inside those)
    public static Impulse From(Force force, Duration time) =>
        new Impulse(force.ValueInBaseUnit * time.ValueInBaseUnit);
}
At this point, Lockheed would have been able to use the structs (and methods within) that implemented the IQuantity interface. They would have then sent the correct data over to NASA, much like the below code:
// ================= Lockheed ground software =================
public static class LockheedGood
{
    // Hardware gives Lockheed lbf and seconds. Factories convert to SI on entry.
    public static Force ReadThrusterForce() => Force.From(55.0, ForceUnit.PoundForce); // 55 lbf
    public static Duration ReadBurnDuration() => Duration.From(2.0, TimeUnit.Second);  // 2 s

    public static Impulse ComputeImpulse()
    {
        var F = ReadThrusterForce(); // internally stored in N
        var t = ReadBurnDuration();  // stored in s
        return Impulse.From(F, t);   // internally stored in N*s
    }

    // Handoff to NASA in canonical SI (N*s)
    public static void WriteImpulseForNasa()
    {
        var J = ComputeImpulse();
        var payload = new ImpulsePayload
        {
            Value = J.ValueInBaseUnit, // always N*s
            Unit = "N*s"
        };
        string json = JsonSerializer.Serialize(payload);
        System.IO.File.WriteAllText("impulse.json", json);
    }

    // PascalCase properties in C#; the JSON keys stay "value" and "unit".
    public record ImpulsePayload
    {
        [JsonPropertyName("value")] public double Value { get; init; }
        [JsonPropertyName("unit")] public string Unit { get; init; } = "N*s";
    }
}
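The third step, NASA checking the unit on its side, could be sketched roughly as follows. Everything here is an assumption made for illustration: the `NasaGood` class, its own copy of `ImpulsePayload`, and the `"lbf*s"` unit string are hypothetical, and only the `"value"` and `"unit"` JSON keys and the `"N*s"` string come from the Lockheed example above.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical receiving side; type and method names are illustrative only.
public record ImpulsePayload
{
    [JsonPropertyName("value")] public double Value { get; init; }
    // Defaults to empty so that a payload without a unit is rejected below.
    [JsonPropertyName("unit")] public string Unit { get; init; } = "";
}

public static class NasaGood
{
    private const double NewtonsPerPoundForce = 4.4482216152605;

    // Returns the impulse in N*s, or throws if the unit is missing or unknown.
    public static double ReadImpulseNewtonSeconds(string json)
    {
        ImpulsePayload payload = JsonSerializer.Deserialize<ImpulsePayload>(json)
            ?? throw new FormatException("Empty impulse payload.");

        return payload.Unit switch
        {
            "N*s" => payload.Value,
            "lbf*s" => payload.Value * NewtonsPerPoundForce, // assumed unit string
            _ => throw new FormatException($"Unknown impulse unit '{payload.Unit}'.")
        };
    }
}
```

The design choice that matters is that the reader never guesses. A payload that states its unit is converted or accepted, and a payload with no unit (or one the reader does not recognize) fails loudly instead of being fed into the model.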
Conclusion
Hindsight is 20/20, but this is a great example of how a simple interface, or the lack of one, could have completely changed the course of an entire project. At the end of the day, both NASA and Lockheed were at fault. If just one of the programs had checked the unit of measurement, this would never have happened. I think this is a really cool example that highlights the importance of object-oriented programming.
When building programs that humans rely on for survival, object orientation becomes a much more serious topic. This, at its core, is what Bjarne was trying to achieve back in the 1980s, and I think it’s necessary to step back and appreciate the “why” behind much of what we’re taught to know as developers, including OOP.