Imagine if Gmail was unencrypted… like Ebay

Update 03/07/2017: https is starting to appear on webmail, but http still seems to work in places. Until it is removed and cookies are secure, it is probably best to avoid using eBay.
eBay mocks Data Protection laws by sending emails over http while its site states that its email service is secure.


This is already illegal under Data Protection laws. It doesn’t even need upcoming GDPR: the way eBay describes their security means they are misleading users and even without GDPR, this kind of messaging system should be governed under laws like PECR.

eBay https security is vulnerable to attack from http

This problem is fixable by marking session cookies as Secure (so they are only ever sent over https) and upgrading to https with HSTS, which reduces the risk of the problem returning.
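As a rough illustration of what that fix looks like on the wire, here is a minimal Java sketch that builds the two response headers involved. The cookie name `session`, the one-year `max-age` and the class name are assumptions made for the example, not values taken from eBay.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SecureHeadersSketch {

    /** Builds the response headers for an https-only session (illustrative values). */
    static Map<String, String> secureHeaders(String sessionId) {
        Map<String, String> headers = new LinkedHashMap<>();
        // Secure: the browser only sends this cookie over https.
        // HttpOnly: page scripts cannot read it.
        headers.put("Set-Cookie", "session=" + sessionId + "; Path=/; Secure; HttpOnly");
        // HSTS: for max-age seconds the browser upgrades http requests to https itself.
        headers.put("Strict-Transport-Security", "max-age=31536000; includeSubDomains");
        return headers;
    }

    public static void main(String[] args) {
        secureHeaders("abc123").forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

Without the Secure attribute, a single plain http request to the same host is enough to leak the session cookie, which is what the copy-as-curl test demonstrates.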

I tested this against instead of (made using an /etc/hosts entry to localhost and a SimpleHTTPServer) and a request to that works too. However, I’m on a shared network (at Fosdem) and pretty sure they don’t want me to run Wireshark on their wifi, so I can only demonstrate it using copy as curl from Chrome to show the cookies sent in plaintext.

NPM is lying to you and Facebook misses copyright attribution

Update: Originally titled “NPM is lying to you and Facebook is stealing copyright” I’ve amended it out of respect to those who weren’t happy with this, but this error should reflect on Facebook audit processes (due diligence) of copyright attribution, which would hopefully have caught this. Regarding concerns about attribution to Mozilla in the issue ( I think there is a misrepresentation of CC0/dedicated to public domain in the comments: it is not the same as copyright expiry and it’s important that the rights holder (which I believe is still Mozilla) is tracked by Facebook even if not attributed in published bundles. If nobody tracked that came from Mozilla, then when the page goes, the first to copy the page can sue everyone.

Firstly, copyright is complicated and getting this right is difficult and I don’t believe that the npm website is trying to lie to you, but that some of the projects on there are (hopefully accidentally) doing so.

No billion dollar company has the right to get this wrong, and they should all be running regular audits. Even so, they might slip up: SCO vs open source and Google’s 9 lines were painful moments, so it would be great if they could lead by example. I do hope everyone believes individual developers should be given a little room on accuracy in this domain, as we’re unlikely to be lawyers. But if you do spot this kind of thing… please, please, please let the parties affected know in a respectful fashion that allows them to resolve it sooner rather than later. It is one thing to slip up for a short period of time and another as it gets longer: the longer it is left without resolution, the more dependent projects might be affected too.

When you look at the licences in a library in npm, you think great it is Apache, BSD, MIT, etc and I can probably use it pretty freely.

When it’s LGPL, GPL, AGPL or EPL it gets more complicated, but may not be impossible… it might even be okay if you wish to adopt these.

Well, those licences aren’t complete in npm for many libraries. Partly because of wonderful technologies like webpack that bundle your code with your dependent code, but don’t, by default, facilitate creating a combined licence file in the process.

npm isn’t the only party getting this wrong, too many open source tools encourage you to label a project as one licence, when in truth it is more likely that your project’s direct code is one licence, but when packaged it is a multi-licence project.

To make matters more complicated, some source code repositories include third party code directly in their source repository (perhaps because it isn’t available from the repository they choose for the project, like npm) and this results in the source code repository itself being mixed licence… how do I fit that in the Github licence option?

If you publish code that is a mix of others work, including in a bundle or even as just accompanying assets, please ensure that the licences are published too. At least we don’t have to make printable booklets to ship with physical products.


Facebook is a big multinational software company. They obviously know about copyright law in their legal teams.

Well they’ve missed something… their current version of the React website uses this wonderful JavaScript file  which is full of copyright statements about Facebook, but none for third party libraries.

Hmm… strange, their library has dependencies on object-assign (amongst others).

Let’s npm install it and see what’s in the dist folder. There’s a basic react.min.js file and there’s an add-ons one that’s also available online at the version I’m seeing locally: 15.4.2

Strange, again it only has Facebook copyright in, but no third parties.

Their add-ons page doesn’t exactly tell you about the embedded object-assign copyright licence which is MIT and requires that if you include object-assign in your own works you need to include their MIT licence with it so that users know that parts of the React software include object-assign.

Bad Facebook: not only are they breaching copyright, but as developers often use them as a reference for how to build web pages, they risk setting a bad example for how to manage copyright. Their legal team should be on top of this, ensuring a regular audit happens and helping to oversee it.

They have a similar issue with Draft.js


I spotted jsrsasign did this, but I’ve seen it before. Sorry to out jsrsasign, it looks like a great project… Javascript encryption enables client-side private keys and object level security instead of passwords over only network level https (mutual auth is great for your enterprises’ servers, but isn’t catching on for the open web).

Make sure you understand encryption export law if you wish to use it, I won’t pretend I know enough to offer advice and ThoughtWorks have been good enough to offer some, but you should check with a legal expert.

It has an ext folder that is easy to miss when you are trying to determine which open source licences you would need to publish with your end product, because it isn’t referenced in npm. I think it can be, but unluckily jsrsasign haven’t done so yet… hopefully they will soon.





Complex Primitives

I have a crazy idea: create a cross-platform language, no not Java: something better. Primitives are supposed to be the simplest form of data in a programming language. So how hard can it be to work with them…

Typical representations

  • References (pointers)
  • Boolean
  • Integer numbers
  • Floating point numbers (binary and sometimes decimal)
  • Primitive structures (array, list)
  • Character(s)

Boolean is complex

In typical computing systems, everything is a 0 or a 1, except usually nothing is.

CPUs typically look at numbers as bit sets of length 8, 16, 32 or 64: not usually 1.

Although most have somewhere that this doesn’t hold true, either with longer primitive sizes (128/256), special floating point versions (56, 80, …) or slightly weird 31 bitset sizes (bad IBM).

The easiest way to manage boolean is to choose 0 as true or false (often false) and anything else as the opposite. However, what size of bitset do you use? If you use the de facto int then it might be different in different compilers (32bit vs 64bit).

Luckily, all you need to know is the size of the bitset and the offset in memory.

Integer is much more complex

So boolean is an offset in ram and a size of the bitset to use: all 0s then it’s false and anything else it is true.
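A minimal Java sketch of that rule, treating a byte array as the "ram". The method name `isTrue` and the 4-byte slot size are assumptions chosen for the example.

```java
class BooleanBitsetSketch {

    /** Returns true if any bit is set in the size bytes starting at offset. */
    static boolean isTrue(byte[] memory, int offset, int size) {
        for (int i = offset; i < offset + size; i++) {
            if (memory[i] != 0) {
                return true; // anything other than all zeros counts as true
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] ram = {0, 0, 0, 0, 0, 0, 0, 4};
        System.out.println(isTrue(ram, 0, 4)); // first 32-bit slot: all zeros
        System.out.println(isTrue(ram, 4, 4)); // second 32-bit slot: one bit set
    }
}
```

Reading the same memory with the wrong size gives a different answer: `isTrue(ram, 4, 2)` only sees the two zero bytes, which is why the size of the bitset matters as much as the offset.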

Integers share the problem that you need to know the size of the bitset, but suffer a further problem: order and signing.

The signed number part is simple: the most significant bit is reserved to represent positive (0) or negative (1, with two’s complement), but order gets complex.
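A small sketch of the sign bit and two’s complement using Java’s 8-bit `byte`; the helper names are made up for the example.

```java
class TwosComplementSketch {

    /** The most significant bit of an 8-bit value marks it as negative. */
    static boolean isNegative(byte b) {
        return (b & 0x80) != 0;
    }

    /** Two's complement negation: invert every bit, then add one. */
    static byte negate(byte b) {
        return (byte) (~b + 1);
    }

    public static void main(String[] args) {
        byte five = 5;                                              // 00000101
        byte minusFive = negate(five);
        System.out.println(minusFive);                              // -5
        System.out.println(Integer.toBinaryString(minusFive & 0xFF)); // 11111011
        System.out.println(isNegative(minusFive));                  // true
    }
}
```

One consequence of this representation is that the range is asymmetric: an 8-bit value runs from -128 to 127, so negating -128 overflows back to -128.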

Bits and Byte Ordering

So what is order: well that “most significant bit” is the problem. Endian of bits and bytes comes into play (and they’re not always the same as each other).

The order of bits varies between processors and usually this problem is something that is more likely to affect you at a very low level (drivers, hardware, etc): to make it more fun most computers have more than one cpu. Your sound card, graphics card, network card, etc might all see bits a different way around: nevermind the busses.

At the software level you usually find everything is the same (let me know if I’ve got this wrong), but here you suffer byte order differences where different protocols (network, inter-process, file  formats) can each represent things larger than a byte in different order.  This isn’t too difficult to solve ( you just need to remember to use it everywhere your program interfaces with the world.
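One way to sketch the byte order problem in Java is with `java.nio.ByteBuffer`, which lets you state the order explicitly at every boundary. The class and method names below are assumptions for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteOrderSketch {

    /** Writes a 32-bit integer as 4 bytes in the given byte order. */
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(Integer.BYTES).order(order).putInt(value).array();
    }

    /** Reads a 32-bit integer from 4 bytes, interpreting them in the given byte order. */
    static int decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        int value = 0x0A0B0C0D;
        byte[] big = encode(value, ByteOrder.BIG_ENDIAN);       // bytes 0A 0B 0C 0D
        byte[] little = encode(value, ByteOrder.LITTLE_ENDIAN); // bytes 0D 0C 0B 0A
        System.out.println(big[0] + " " + little[0]);
        // Decoding with the wrong order silently gives a different number.
        System.out.println(Integer.toHexString(decode(big, ByteOrder.LITTLE_ENDIAN)));
    }
}
```

Nothing fails when the order is wrong; you just get a different value, which is why the order has to be stated wherever data crosses a network, process or file boundary.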

Floating Points

Luckily floating points are strictly described in IEEE-754, well I say luckily: except the complexities of implementing it mean that not all languages actually adhere to it:


Characters and Strings are terrifying.

There are hundreds (maybe thousands) of character sets: in a large part because some base character sets (Latin1) have multiple versions for different languages. Not all are easy to work with (I remember something odd about Turkish EBCDIC and xml processing problems as symbols can be remapped). The simplest solution is to make everyone fit into a box and force UTF-8: then hope that nobody adds a BOM, let’s hope UTF-8 never gets deprecated.
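A short Java sketch of two of those traps: the same character has a different byte length in different character sets, and Java’s UTF-8 decoder keeps a leading BOM as a character unless you strip it yourself. The class and method names are assumptions for the example.

```java
import java.nio.charset.StandardCharsets;

class Utf8Sketch {

    /** Decodes UTF-8 bytes, skipping a leading byte order mark (EF BB BF) if present. */
    static String decodeUtf8(byte[] bytes) {
        boolean hasBom = bytes.length >= 3
                && bytes[0] == (byte) 0xEF
                && bytes[1] == (byte) 0xBB
                && bytes[2] == (byte) 0xBF;
        int start = hasBom ? 3 : 0;
        return new String(bytes, start, bytes.length - start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // e with an acute accent
        System.out.println(eAcute.getBytes(StandardCharsets.ISO_8859_1).length); // 1 byte in Latin1
        System.out.println(eAcute.getBytes(StandardCharsets.UTF_8).length);      // 2 bytes in UTF-8

        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(new String(withBom, StandardCharsets.UTF_8).length()); // 3: BOM kept
        System.out.println(decodeUtf8(withBom).length());                         // 2: BOM stripped
    }
}
```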


So you have a reference to some data…. how do you reference it and what kind of data might you have to reference:

The reference could be a nice compile time fixed size (like a pair of integers, I mean a pair of 64 bit integers).

It might be a variable size (String of characters) holding some JSON: so maybe a block of memory.

Or it might be a continuous stream (/dev/urandom)

Or it might be a channel of offset data (File on disk) with parts that might no longer be available later in the day or new parts that arrive whilst reading.

It’s easiest to manage the fixed size case (c style) and then re-use the fixed size blocks for streams of data, but sometimes you need more complex references like File handles.

So a VLQ ( might do for the simple case, and then a VLQ that contains References to further VLQs might be usable for the rest of the use cases.
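For readers who haven’t met a VLQ (variable-length quantity) before, here is a minimal Java sketch of the common form: 7 bits of payload per byte, most significant group first, with the high bit of each byte meaning "more bytes follow". The class and method names are assumptions for the example, and it only handles non-negative values.

```java
import java.util.Arrays;

class VlqSketch {

    /** Encodes a non-negative value as 7-bit groups, most significant group first. */
    static byte[] encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("only non-negative values are supported");
        }
        byte[] buffer = new byte[10];                 // 63 bits need at most 9 groups
        int i = buffer.length - 1;
        buffer[i] = (byte) (value & 0x7F);            // last byte: continuation bit clear
        value >>>= 7;
        while (value != 0) {
            buffer[--i] = (byte) ((value & 0x7F) | 0x80); // earlier bytes: continuation bit set
            value >>>= 7;
        }
        return Arrays.copyOfRange(buffer, i, buffer.length);
    }

    /** Decodes one VLQ from the start of the array. */
    static long decode(byte[] bytes) {
        long value = 0;
        for (byte b : bytes) {
            value = (value << 7) | (b & 0x7F);
            if ((b & 0x80) == 0) {
                break;                                // continuation bit clear: last byte
            }
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(encode(127).length);       // 1 byte
        System.out.println(encode(128).length);       // 2 bytes: 0x81 0x00
        System.out.println(decode(encode(300)));      // 300
    }
}
```

The appeal for references is that small offsets stay small on the wire while large ones are still representable, without fixing a size at compile time.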

Arrays and Lists

I don’t think I really consider these to be primitive types

Great, so I don’t need to write these in my language: can I borrow them? Well, maybe, except when it comes to mixing primitives with polymorphic types. Even then I can probably still use them: I guess I can just put the type into the box as the first entry.

It’s all in the bus

Eventually, much of the data you use ends up moving through the buses on your system, and they have different sizes. On top of that you get fixed-size pages that move over buses, which you hope are an integer multiple of the bus size. The typical exception that everyone knows is the MTU (which varies between broadband, ethernet and modem systems).

So when you use these complex primitives you might not want to just use the language primitives: but optimise for the bus/packet/page sizes involved. Should these be primitives? Well that might depend on your architecture and for a cross platform language I guess you should let it be a language specific optimization to curry in an outside primitive.

So how to solve this for writing a new language?

Copy Scala and Groovy: use the JVM to solve this for you, give you a consistent view of the world and force everyone to map using Java data structures until later… although I’m tempted to check out the CLR/Mono too.

Keeping in Time

Now it is 15:31 on the 11-09-2016 and I’m in London.

Writing and Reading dates

Always use the order of full year, month, then day of month.

  • Text ordering is now time ordered
    • 2016-06-06 is always after 2015-06-07 in text and time
    • 06-07-2016 is before 07-06-2015 and after 05-06-2014 in text ordering, but not in time.
    • This is useful in table ordering, like when listing files.
  • Local differences don’t matter so much (Europe vs USA format)
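The ordering claim above can be checked with a small Java sketch that sorts the same dates once as plain text and once as parsed dates. The class and method names are assumptions for the example.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class DateOrderSketch {

    /** Sorts the strings as plain text, the way a file listing would. */
    static List<String> sortAsText(List<String> dates) {
        List<String> copy = new ArrayList<>(dates);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    /** Sorts the strings by the date they represent in the given format. */
    static List<String> sortAsDates(List<String> dates, DateTimeFormatter format) {
        List<String> copy = new ArrayList<>(dates);
        copy.sort(Comparator.comparing((String s) -> LocalDate.parse(s, format)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> yearFirst = Arrays.asList("2016-06-06", "2015-06-07", "2014-06-05");
        List<String> dayFirst = Arrays.asList("06-07-2016", "07-06-2015", "05-06-2014");
        DateTimeFormatter dayFirstFormat = DateTimeFormatter.ofPattern("dd-MM-uuuu");

        // Year-first: text order and time order agree.
        System.out.println(sortAsText(yearFirst)
                .equals(sortAsDates(yearFirst, DateTimeFormatter.ISO_LOCAL_DATE)));
        // Day-first: they disagree.
        System.out.println(sortAsText(dayFirst).equals(sortAsDates(dayFirst, dayFirstFormat)));
    }
}
```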

Offset Time

The UK has British Summer Time, so that time point is 1 hour ahead of where it should be, so the time is actually 2016-09-11T15:31:00+01:00 (in ISO 8601 format).
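In Java this maps onto `java.time.OffsetDateTime`. A minimal sketch, with the class and method names as assumptions for the example:

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class OffsetTimeSketch {

    /** The moment from the text: 15:31 on 2016-09-11, one hour ahead of UTC. */
    static OffsetDateTime bstExample() {
        return OffsetDateTime.of(2016, 9, 11, 15, 31, 0, 0, ZoneOffset.ofHours(1));
    }

    public static void main(String[] args) {
        OffsetDateTime local = bstExample();
        // ISO 8601 with the offset attached.
        System.out.println(DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(local));
        // The same instant expressed in UTC is one hour earlier on the clock.
        System.out.println(local.toInstant());
    }
}
```

The offset pins down the instant, but it says nothing about which rules produced that offset; that is the gap a zone fills.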

Zoned Time

If I asked you to call me at this time in 6 months, I might not appreciate a call at 2017-03-11T15:31:00+01:00 because I’m then out of Daylight Savings, so instead it’s useful to capture the time zone.

Like this, 2017-03-11T15:31:00 [Europe/London] where Europe/London is an official designation of my time zone. Except this isn’t quite good enough…

What if I wanted a call at 1:30am on 29th of October, 2017?

2017-10-29T01:30:00 [Europe/London] happens twice (the clocks go back from 02:00 BST to 01:00 GMT, so the hour from 01:00 repeats), and so to be as distinct as possible perhaps 2017-10-29T01:30:00+01:00 [Europe/London] and 2017-10-29T01:30:00+00:00 [Europe/London] would be enough to know which is which.
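A sketch of the repeated hour using `java.time`. In the tz database the repeated local hour for Europe/London on that date runs from 01:00 to 01:59, so the example uses 01:30; the class and method names are assumptions for the example.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.List;

class ZonedOverlapSketch {

    static final ZoneId LONDON = ZoneId.of("Europe/London");

    /** All offsets that are valid for this local time in London (two during an overlap). */
    static List<ZoneOffset> validOffsets(LocalDateTime local) {
        return LONDON.getRules().getValidOffsets(local);
    }

    public static void main(String[] args) {
        LocalDateTime repeated = LocalDateTime.of(2017, 10, 29, 1, 30);
        System.out.println(validOffsets(repeated).size());   // 2: this local time happens twice

        // Zone plus offset together pick out exactly one of the two instants.
        ZonedDateTime firstPass = ZonedDateTime.of(repeated, LONDON).withEarlierOffsetAtOverlap();
        ZonedDateTime secondPass = firstPass.withLaterOffsetAtOverlap();
        System.out.println(firstPass.getOffset() + " then " + secondPass.getOffset());

        // An ordinary afternoon in March has exactly one valid offset.
        System.out.println(validOffsets(LocalDateTime.of(2017, 3, 11, 15, 31)));
    }
}
```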

GPS Time is ordered?

Be wary of using GPS time. Although GPS time is an unadjusted series of seconds since a point in time, will it always be so? Its use case is positioning, and if there ever were a need, I would guess adjusting time to fix positioning would be preferred to adjusting positioning to fix time.

Use TAI time when you need time order or better use incrementing ids.

You want to know about the series of events, such as, in a server log or transaction log.

The problem is UTC goes back in time and as Unix time typically uses UTC, your logs will too.

But it doesn’t have to be this way: TAI time is ordered and not affected by adjustments to keep time in step with our solar cycle, so no leap seconds… that doesn’t mean there aren’t adjustments, but the adjustments should be much smaller fractions of a second.

But, even with TAI, that fraction of a second can still get bigger: you don’t poll NTP at a rate of a fraction of a second and PC clocks fall out of sync and on VMs sometimes more than expected.

So why not just use an incrementing counter? The order of log lines effectively does this, but assumes that they are written in order and kept in order. So I guess the question is: now that you aggregate your log files to central servers (Splunk, Logstash, etc), are they still in order of events, or are they in order of adjustable time stamps?

Summary

  • For everyday usage use a Zoned Offset
    • 1999-12-31T23:59:59-05:00 [America/New_York]
  • For scientific calculations use TAI, but how to distinguish from UTC?
    • 2017-03-11T14:31:36
  • For strict ordering of events, don’t use time: but keep a reference for indexing (finding the log lines)
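A minimal Java sketch of that last point: each log entry gets a sequence number that defines the order, while the timestamp is kept only as a reference for finding the line later. The class, field and method names are assumptions for the example, and it only orders events within one process.

```java
import java.time.Instant;
import java.util.concurrent.atomic.AtomicLong;

class SequencedLogSketch {

    /** One log entry: sequence defines the order, timestamp is only for lookup. */
    static final class Entry {
        final long sequence;
        final Instant timestamp;
        final String message;

        Entry(long sequence, Instant timestamp, String message) {
            this.sequence = sequence;
            this.timestamp = timestamp;
            this.message = message;
        }
    }

    private final AtomicLong next = new AtomicLong();

    /** Records an event; the sequence only ever goes up, even if the clock is adjusted. */
    Entry record(String message) {
        return new Entry(next.incrementAndGet(), Instant.now(), message);
    }

    public static void main(String[] args) {
        SequencedLogSketch log = new SequencedLogSketch();
        Entry a = log.record("order received");
        Entry b = log.record("order shipped");
        System.out.println(a.sequence < b.sequence); // true regardless of clock adjustments
    }
}
```

Across several servers a single counter is not enough on its own; the sketch only shows why the ordering key and the timestamp are worth keeping separate.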

ICO has no powers over webcams

ICO published a letter to Webcam manufacturers… well you don’t have to pay much attention to it if you are one.

Dell decided to break https encryption on their laptops by installing a vulnerable root certificate.

If you run a business and store personal data, you must go through heaps of hoops to ensure you are compliant with data protection law. But the manufacturer of the server, network equipment and laptops you have to use has no requirements: so they can be as insecure as they like and you pick up the bill when the ICO chases down their breach.

Case Reference Number RFA0606701

I write in relation to your concerns about Dell’s new equipment security fault, about which my colleague has previously responded to you.

The DPA works by placing obligations on organisations that hold personal information. The DPA does not however place any obligations on the manufacturers of equipment that may be used for storing personal information.

The security requirement of the DPA (the seventh data protection principle) requires an organisation holding personal data to have adequate technical and organisational measures in place to protect the personal data (taking account of the nature of the information being held, the availability of technology, and the cost of implementing those measures).

As such, an organisation that has purchased Dell equipment subject to the fault for the storage of personal data may be contravening the DPA if they have failed to keep personal data secure as a result of their use of insecure equipment for the storage of personal data.

Dell is not contravening any requirements of the DPA by selling insecure equipment. The DPA does not, in any way, require suppliers of equipment to ensure their products are secure. The obligations arising from the DPA are for organisations using the equipment for the storage of personal data.

Because our powers are specific to the DPA there is therefore no punitive or other action we can take against Dell over its failure to sell secure computer equipment.

CompletableFuture: does it block?

What happens with this (full code below)?

    public void obviouslySecondFirst() {
        allOf(
                supplyAsync(() -> first).thenApply(addDelayed(concurrentLinkedQueue, delay)),
                supplyAsync(() -> second).thenAccept(concurrentLinkedQueue::add)
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, first)));
    }

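This will randomly swap between returning ["second", "first"] and ["first", "second"], and therefore randomly blocks second. A non-async stage such as thenApply runs on whichever thread completes the previous stage, or on the calling thread if that stage has already completed; in that second case the delay holds up the caller before second has even been submitted.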
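A small sketch for checking which thread runs a dependent stage. It uses an already completed future to make the "previous stage finished first" case deterministic; the class and method names are assumptions for the example, and the behaviour described is what the OpenJDK implementation does rather than a guarantee spelled out for every case in the Javadoc.

```java
import java.util.concurrent.CompletableFuture;

class WhichThreadSketch {

    /** Name of the thread that runs a non-async dependent stage. */
    static String threadOfSyncStage(CompletableFuture<String> future) {
        return future.thenApply(value -> Thread.currentThread().getName()).join();
    }

    /** Name of the thread that runs an async dependent stage. */
    static String threadOfAsyncStage(CompletableFuture<String> future) {
        return future.thenApplyAsync(value -> Thread.currentThread().getName()).join();
    }

    public static void main(String[] args) {
        CompletableFuture<String> done = CompletableFuture.completedFuture("first");
        String caller = Thread.currentThread().getName();

        // Already completed: the non-async stage runs right here, on the caller.
        System.out.println(threadOfSyncStage(done).equals(caller));
        // The async variant is handed to the default executor instead.
        System.out.println(threadOfAsyncStage(done).equals(caller));
    }
}
```

With `supplyAsync(() -> first)` the future may or may not be complete by the time `thenApply` is called, which is where the randomness in the test comes from.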


Repeat it…

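    public void obviouslySecondFirstWithWaitBeforeCall() {
        final CompletableFuture<String> suppliedFirst = supplyAsync(() -> first);
        delay(delay * 2); // give the first future time to complete before chaining
        allOf(suppliedFirst.thenApply(addDelayed(concurrentLinkedQueue, delay)),
                supplyAsync(() -> second).thenAccept(concurrentLinkedQueue::add)
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, first)));
    }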

This is a bad design!


If the methods were split between two classes (AsyncCompletableFuture and SyncCompletableFuture) then I might forgive this as I could easily code review the differences, but they’re all thrown in the same one.

To make matters worse, some methods don’t explicitly have an async option.

So there’s a method exceptionally(), but no exceptionallyAsync(), will that block when you do supplyAsync(()->x).exceptionally(t->blockingLogging(t))?

EDIT: 22/06/16:10am

Confusing chaining

    public void secondsecondShouldBeFirstFirst() {
        allOf(
                supplyAsync(() -> first).thenApply(addDelayed(concurrentLinkedQueue, delay * 2)).thenApply(addDelayed(concurrentLinkedQueue, delay * 2)),
                supplyAsync(() -> second).thenApply(addDelayed(concurrentLinkedQueue, delay)).thenApply(addDelayed(concurrentLinkedQueue, delay * 2))
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, second, first, first)));
    }

This will randomly block and randomly not: sometimes returning [first,first,second,second] and sometimes [second,first,second,first]

The code…

import org.junit.Test;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Function;
import java.util.function.Supplier;
// Assumption: the targets of the next two static imports were lost; Guava's ImmutableList fits the copyOf/of calls below.
import static com.google.common.collect.ImmutableList.copyOf;
import static com.google.common.collect.ImmutableList.of;
import static java.util.concurrent.CompletableFuture.allOf;
import static java.util.concurrent.CompletableFuture.supplyAsync;
import static java.util.concurrent.TimeUnit.SECONDS;
import static org.hamcrest.CoreMatchers.equalTo;
import static org.junit.Assert.assertThat;

public class SupplyItAsyncMaybe {

    private void delay(int seconds) {
        try {
            SECONDS.sleep(seconds);
        } catch (InterruptedException e1) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
        }
    }

    final String first = "first", second = "second";
    final int delay = 2;
    final ConcurrentLinkedQueue<String> concurrentLinkedQueue = new ConcurrentLinkedQueue<>();

    private Supplier<String> supplyFirstAfterDelay(int seconds, final String initialValue) {
        return () -> {
            delay(seconds); // the supplier itself is slow
            return initialValue;
        };
    }

    private Function<String, String> addDelayed(final ConcurrentLinkedQueue<String> concurrentLinkedQueue, final int seconds) {
        return (e) -> {
            delay(seconds); // blocks whichever thread runs this stage
            concurrentLinkedQueue.add(e);
            return e;
        };
    }

    @Test
    public void secondShouldBeFirst() {
        allOf(
                supplyAsync(() -> first).thenApply(addDelayed(concurrentLinkedQueue, delay)),
                supplyAsync(() -> second).thenApply(concurrentLinkedQueue::add)
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, first)));
    }

    @Test
    public void secondsecondShouldBeFirstFirst() {
        allOf(
                supplyAsync(() -> first).thenApply(addDelayed(concurrentLinkedQueue, delay * 2)).thenApply(addDelayed(concurrentLinkedQueue, delay * 2)),
                supplyAsync(() -> second).thenApply(addDelayed(concurrentLinkedQueue, delay)).thenApply(addDelayed(concurrentLinkedQueue, delay * 2))
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, second, first, first)));
    }

    @Test
    public void secondsecondShouldBeFirstFirstAlways() {
        CompletableFuture<String> stringCompletableFuture = supplyAsync(() -> first);
        allOf(
                stringCompletableFuture.thenApply(addDelayed(concurrentLinkedQueue, delay * 2)).thenApply(addDelayed(concurrentLinkedQueue, delay * 2)),
                supplyAsync(() -> second).thenApply(addDelayed(concurrentLinkedQueue, delay)).thenApply(addDelayed(concurrentLinkedQueue, delay * 2))
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, second, first, first)));
    }

    @Test
    public void secondsecondShouldBeFirstFirstDelayedFutureSupplier() {
        allOf(
                supplyAsync(supplyFirstAfterDelay(delay, first)).thenApply(addDelayed(concurrentLinkedQueue, delay * 2)).thenApply(addDelayed(concurrentLinkedQueue, delay * 2)),
                supplyAsync(supplyFirstAfterDelay(delay, second)).thenApply(addDelayed(concurrentLinkedQueue, delay)).thenApply(addDelayed(concurrentLinkedQueue, delay * 2))
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, second, first, first)));
    }

    @Test
    public void secondIsNeverFirst() {
        final CompletableFuture<String> suppliedFirst = supplyAsync(() -> first);
        allOf(
                suppliedFirst.thenApply(addDelayed(concurrentLinkedQueue, delay)),
                supplyAsync(() -> second).thenAccept(concurrentLinkedQueue::add)
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, first)));
    }

    @Test
    public void secondIsNeverFirstWhenDelayIsLonger() {
        final CompletableFuture<String> suppliedFirst = supplyAsync(supplyFirstAfterDelay(delay, first));
        delay(delay * 2);
        allOf(
                suppliedFirst.thenApply(addDelayed(concurrentLinkedQueue, delay)),
                supplyAsync(() -> second).thenAccept(concurrentLinkedQueue::add)
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, first)));
    }

    @Test
    public void asyncSimple() {
        allOf(
                supplyAsync(() -> first).thenApply(addDelayed(concurrentLinkedQueue, delay)),
                supplyAsync(() -> second).thenAcceptAsync(concurrentLinkedQueue::add)
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, first)));
    }

    @Test
    public void asyncWithADelay() {
        final CompletableFuture<String> suppliedFirst = supplyAsync(() -> first);
        delay(delay * 2);
        allOf(
                suppliedFirst.thenApply(addDelayed(concurrentLinkedQueue, delay)),
                supplyAsync(() -> second).thenAcceptAsync(concurrentLinkedQueue::add)
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, first)));
    }

    @Test
    public void asyncWithMultipleDelays() {
        CompletableFuture<String> stringCompletableFuture = supplyAsync(supplyFirstAfterDelay(delay, first));
        delay(delay * 2);
        allOf(
                stringCompletableFuture.thenApply(addDelayed(concurrentLinkedQueue, delay)),
                supplyAsync(() -> second).thenApply(concurrentLinkedQueue::add)
        ).join();
        assertThat(copyOf(concurrentLinkedQueue), equalTo(of(second, first)));
    }
}

Programming Languages are Broken

In the context of this post, immutability is the surface of the feature that stays the same, allowing it to be reused with reliability.

It’s not just Left-pad

It’s left-pad; dependency, jar or dll hell; segfault, a conflict warning, a NoClassDefFoundError or unexpected behaviour; HTTP errors, marshalling errors and can even be unexpected timeouts, infinite loops and any other unexpected behaviour.

What did SQL get right?

If you model an order system in SQL, you could contract a SQL guru to do it in 2003 and years later it’d likely still work. I’d be surprised if the average Node app can last months without some form of npm dependency problem.

Banks didn’t trust us

The enterprise software industry was maybe making progress on this: W3C (in the old days), JSRs, OMG and OASIS were making immutable standards with backwards compatibility.

But outside the “Enterprise” umbrella, the rest decided to shun strict xhtml, ebXML, SOAP, CORBA IDL and jumped into HTML5, REST, JSON and agile moving targets that steer and depend on many open source software projects.

Most businesses aren’t going to have a business model that changes very much; so why does the software that supposedly represents it?

Microsoft did do something good (never let me say that again), with relatively immutable contracts in their API layers, and they weren’t the only ones, resulting in the famous acronym: VRMF.

Enterprise computing was dominated by some degree of immutable contracts from Oracle, Microsoft, IBM, Sun, Intel, etc and we still enjoy their efforts. Now, they weren’t doing all of this for fun… regulators liked interfaces.

So what is relatively immutable?

Most of these…

  • Instruction Sets
  • Assembly
  • Enterprise programming languages
  • Enterprise Document Formats
  • Network protocols
  • Enterprise database interfaces
  • Filesystem data structures
  • Games console libraries

They share something in common. They are either used in regulated environments like for major industries’ core business (banking systems, medical uses, etc) or governments, run on embedded systems that are hard to update or tied to hardware.

And the rest …

  • Application code: but to be fair, this may not have any consumers except Human Beings
  • Custom, internal integration services and models developed internally in companies. From startup to multinational, their internal services are only as good as the care dedicated to the project. Sometimes you’re lucky to have a spec and other times that specification isn’t as long lasting as you’d hoped.
  • … and, I hate to say it, but it feels like most open source libraries and applications.

Open source projects seem to thrive on the ability to break users of their interfaces and those that don’t often have strong relationships with Enterprise businesses… not always, sometimes ties to Enterprise don’t help either.

Hacking safety in

Some build systems (usually rather poorly) try to enforce immutable versions on top of a programming language, and at runtime plugin systems can try to do the same. But it’s not an easy process to work with. Both often require a lot of hand-holding to ensure that migration between immutable models, interfaces or services happens without disruption. If you’re lucky, they’ll warn you about problems and that is great… Maven would be a lot worse without Enforcer, but even in this case the tools aren’t always there by default. They also highlight another strange behaviour: why would you ever allow multiple versions of a library to be imported at the same time? This isn’t unique to Java or Maven; in fact they are probably a better pairing than most.

Why are programming languages to blame?

Although the languages themselves are relatively immutable (when did java.lang.String not have backwards compatibility?) they encourage software development that isn’t. The first mistake is to use text files for programming languages and to depend on REST for build systems. Neither is immutable, and yet both usually underpin the dependencies for a language: either the packaged library modules or the import/require statements use them. If you imagine a language that only imported by torrent hashes, then breaking compatibility would be much harder, maybe impossible. Using a hash based database might work quite well, although the code might be unreadable:

import sha512[23123213...]

But then you can map readability:

identify sha512[23123213...] as listbuilder
import listbuilder
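To make the idea a little more concrete, here is a minimal Java sketch of a hash-addressed module store with readable aliases layered on top. Everything in it (the class name, `publish`, `identify`, `importByAlias`) is hypothetical and only illustrates the shape of the idea; it is not an existing build tool.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class HashImportSketch {

    private final Map<String, byte[]> store = new HashMap<>();   // hash -> module bytes
    private final Map<String, String> aliases = new HashMap<>(); // readable name -> hash

    /** Hex-encoded SHA-512 of the content: the module's immutable identity. */
    static String sha512Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-512").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 not available in this JVM", e);
        }
    }

    /** Stores a module under its own hash; the same bytes always get the same name. */
    String publish(byte[] content) {
        String hash = sha512Hex(content);
        store.putIfAbsent(hash, content.clone());
        return hash;
    }

    /** The "identify sha512[...] as listbuilder" step: a readable name for a fixed hash. */
    void identify(String hash, String alias) {
        aliases.put(alias, hash);
    }

    /** The "import listbuilder" step: resolve the alias, then verify the bytes. */
    byte[] importByAlias(String alias) {
        String hash = aliases.get(alias);
        byte[] content = hash == null ? null : store.get(hash);
        if (content == null || !sha512Hex(content).equals(hash)) {
            throw new IllegalArgumentException("no verified module for alias " + alias);
        }
        return content.clone();
    }

    public static void main(String[] args) {
        HashImportSketch registry = new HashImportSketch();
        String hash = registry.publish("list builder v1".getBytes());
        registry.identify(hash, "listbuilder");
        System.out.println(Arrays.equals(registry.importByAlias("listbuilder"),
                "list builder v1".getBytes()));
    }
}
```

Publishing a changed module produces a different hash, so an existing import can never silently start pointing at new behaviour; only the alias mapping can move, and that move is explicit.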

Another problem is that programming languages encourage mutable design patterns with abstract classes, implicits and annotation preprocessors or dynamic dependency injection.

Things often get worse when you start using a language’s custom DSL for XML or JSON: are you creating an XSD first?

Sometimes they embed auto serialization/deserialization to object formats, so code (that mutates) becomes the contract for middleware services, taking a problem at a language level and turning it into one that affects libraries and service layers alike.

Fixing it

  • Let’s create immutable build dependencies and imports.
  • Ban version conflicts
  • Let’s drop JSON and REST
  • Code to interfaces, don’t interface from code
  • Ban inheritance of mutable features and implicits that can modify the runtime behaviour unexpectedly
  • Ban Javascript
  • Simplify languages
  • Aim for code to last 10 years for business domain logic, or maybe just ask the business more often about whether they realise the risks in the choices the development team is making.