What is binary API compatibility?

So imagine you are a crusty long-term user of the internet, one of the early adopters of Pegasus Mail, and you haven’t wanted to move with the times and switch to one of the more mainstream email clients.

How does an operating system vendor like Microsoft make it possible for you to install your ancient version of Pegasus Mail on Windows 10 and not have it croak with an error saying it cannot find the version of the Windows socket API it relied on to implement its TCP/IP logic?

The answer is preserving binary API compatibility. How do you keep a C API binary compatible?

What it comes down to:

  • Do not remove or rename old C API calls.

  • Keep the arguments and calling convention exactly the same.

That can be an interesting design challenge: which techniques allow an operating system API to be upgraded to handle new requirements in a manner that doesn’t break old applications?
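As a rough illustration of those two rules, here is a minimal sketch of one common approach: leave the old call alone and add a new entry point next to it. Everything named mylib_* is hypothetical and made up for this example, and the return value is only a stand-in (a real library would open a connection here).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical options struct for the newer call. It starts with its own
   size, so later versions can append fields without breaking callers that
   were compiled against this layout. */
struct mylib_channel_opts {
    size_t struct_size;  /* caller sets this to sizeof(struct mylib_channel_opts) */
    int    timeout_ms;   /* field added in "version 2" */
};

/* Hypothetical v2 call: new requirements get a NEW function name.
   Returns the timeout that would be used, or -1 on bad arguments. */
int mylib_open_channel_ex(const char *host, int port,
                          const struct mylib_channel_opts *opts)
{
    int timeout_ms = 5000;  /* default when the caller supplies no options */

    /* Precondition: a supplied struct must at least hold its size field. */
    assert(opts == NULL || opts->struct_size >= sizeof(size_t));

    if (host == NULL || port <= 0 || port > 65535)
        return -1;

    /* Only read a field if the caller's struct is big enough to contain it. */
    if (opts != NULL &&
        opts->struct_size >= offsetof(struct mylib_channel_opts, timeout_ms)
                             + sizeof opts->timeout_ms)
        timeout_ms = opts->timeout_ms;

    return timeout_ms;
}

/* Hypothetical v1 call: same name, same arguments, same calling convention
   as the day it shipped. It simply forwards to the newer implementation. */
int mylib_open_channel(const char *host, int port)
{
    return mylib_open_channel_ex(host, port, NULL);
}
```

An old binary keeps calling mylib_open_channel and never notices the change, while new code can opt in to mylib_open_channel_ex. The size-prefixed struct is one design choice among several, not the only way to do it.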

A good example is the Berkeley sockets API. This API was purposely designed as an abstraction that could fit over many different networking protocols with different implementations and addressing schemes. One problem it was able to solve was supporting two different address conventions for TCP/IP:

  • IPv4 - the traditional four octets, each written as a number from 0 to 255, e.g. 244.12.223.122 (32 bits)

  • IPv6 - which extends the addressable range, e.g. 2001:db8::1 (128 bits)

Berkeley socket API calls use these three patterns when dealing with socket addresses:

  • They take a sockaddr*, a pointer to a generic address struct that stands in for whichever concrete address type is actually being passed.

  • They take the length of the given sockaddr structure.

  • The family field at the start of the sockaddr gives a value which identifies the struct type. The operating system can use this to understand what type of address it has been given and alter its logic accordingly.
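The three patterns above can be sketched from the receiving side as follows. This is a minimal sketch assuming a POSIX system. The function describe_sockaddr is a hypothetical helper written for this example and is not part of the sockets API, while inet_ntop, ntohs and the struct names are the standard ones.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: takes the generic pointer plus a length, looks at the
   family field, and only then treats the bytes as a concrete address type.
   Returns 0 on success, -1 for an unknown family or a length that is too short. */
int describe_sockaddr(const struct sockaddr *sa, socklen_t len,
                      char *out, size_t out_size)
{
    char host[INET6_ADDRSTRLEN];

    assert(sa != NULL && out != NULL);

    /* Too short to be either of the address types handled here. */
    if (len < (socklen_t)sizeof(struct sockaddr))
        return -1;

    switch (sa->sa_family) {
    case AF_INET: {
        const struct sockaddr_in *in4 = (const struct sockaddr_in *)sa;
        if (len < (socklen_t)sizeof *in4)
            return -1;
        if (inet_ntop(AF_INET, &in4->sin_addr, host, sizeof host) == NULL)
            return -1;
        snprintf(out, out_size, "IPv4 %s port %u",
                 host, (unsigned)ntohs(in4->sin_port));
        return 0;
    }
    case AF_INET6: {
        const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)sa;
        if (len < (socklen_t)sizeof *in6)
            return -1;
        if (inet_ntop(AF_INET6, &in6->sin6_addr, host, sizeof host) == NULL)
            return -1;
        snprintf(out, out_size, "IPv6 %s port %u",
                 host, (unsigned)ntohs(in6->sin6_port));
        return 0;
    }
    default:
        return -1;  /* a family this sketch does not know about */
    }
}
```

The function signature never mentions IPv4 or IPv6, which is why a new address family could be added later without changing any existing call.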

So socket applications then have to explicitly pick between the struct used for an IPv4 address:

 

    struct sockaddr_in {
        sa_family_t    sin_family; /* address family: AF_INET */
        in_port_t      sin_port;   /* port in network byte order */
        struct in_addr sin_addr;   /* internet address */
    };

    /* Internet address. */
    struct in_addr {
        uint32_t       s_addr;     /* address in network byte order */
    };

versus an IPv6 address:

    struct sockaddr_in6 {
        sa_family_t     sin6_family;   /* AF_INET6 */
        in_port_t       sin6_port;     /* port number */
        uint32_t        sin6_flowinfo; /* IPv6 flow information */
        struct in6_addr sin6_addr;     /* IPv6 address */
        uint32_t        sin6_scope_id; /* Scope ID (new in 2.4) */
    };

    struct in6_addr {
        unsigned char   s6_addr[16];   /* IPv6 address */
    };
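From the application side, filling in one struct or the other might look like the sketch below, again assuming a POSIX system. The helper fill_address is hypothetical. struct sockaddr_storage is not mentioned above: it is the standard "big enough for any address" struct, used here so that one buffer can hold either type.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: tries to parse text as IPv4, then as IPv6, fills *ss
   with the matching struct, and returns the length that must be passed
   alongside the pointer. Returns 0 if the text is neither kind of address. */
socklen_t fill_address(const char *text, unsigned short port,
                       struct sockaddr_storage *ss)
{
    struct sockaddr_in  *in4 = (struct sockaddr_in *)ss;
    struct sockaddr_in6 *in6 = (struct sockaddr_in6 *)ss;

    assert(text != NULL && ss != NULL);

    memset(ss, 0, sizeof *ss);
    if (inet_pton(AF_INET, text, &in4->sin_addr) == 1) {
        in4->sin_family = AF_INET;       /* tells the OS: read me as sockaddr_in */
        in4->sin_port   = htons(port);   /* port in network byte order */
        return (socklen_t)sizeof *in4;
    }

    memset(ss, 0, sizeof *ss);           /* start clean before the second attempt */
    if (inet_pton(AF_INET6, text, &in6->sin6_addr) == 1) {
        in6->sin6_family = AF_INET6;     /* tells the OS: read me as sockaddr_in6 */
        in6->sin6_port   = htons(port);
        return (socklen_t)sizeof *in6;
    }

    return 0;
}
```

The result would then be handed to the same call regardless of family, for example connect(fd, (struct sockaddr *)&ss, len), where fd is a socket created with the matching family.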

As for sa_family_t, POSIX only requires it to be an unsigned integer type. On Linux it is a 16-bit unsigned integer, and Windows also uses a 16-bit unsigned integer for the family field; see https://stackoverflow.com/questions/11924068/what-is-sa-family-t. The purpose is that if it equals AF_INET6 then the operating system knows it is an IPv6 address, and if it is AF_INET then it is an IPv4 address.
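Since the width of sa_family_t differs between platforms, the simplest way to settle it is to ask the compiler. A minimal sketch, assuming a POSIX system; family_field_bits is a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: width in bits of the family field on the platform
   this file is compiled for. */
size_t family_field_bits(void)
{
    /* An unsigned type turns -1 into its largest value, which is positive. */
    assert((sa_family_t)-1 > 0);
    return sizeof(sa_family_t) * CHAR_BIT;
}
```

Application code rarely needs this number. What matters is that the field is wide enough to hold AF_INET and AF_INET6 and sits where the operating system expects it.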