Types upon types: Making plaintext coding safer

2021-05-16

In a previous post, I described how we made coding in the Azure Data Factory (ADF) dynamic expression language safer by extracting the plain-text language into equivalent TypeScript functions. In practice, this means that something like "concat()" can instead be written as concat() (notice the lack of quotes, meaning it's a real TypeScript function).

Because the TypeScript function will eventually just "compile" down to the equivalent string, the immediate benefit is that it's easier to code: the TypeScript compiler catches glaring issues (parentheses, commas, misspellings, etc.). It also turned out to be really easy to add a layer on top of this that provides type safety.

The crux of our type safety is the use of the following classes:

export class ADFExpression {
    constructor(private value: string) {}

    toString(): string {
        return this.value;
    }
}

export class StringExpression extends ADFExpression {
    private readonly _phantom: "string" = "string";
}
export const rawString = (str: string): StringExpression =>
    new StringExpression(`'${str}'`);

export class IntegerExpression extends ADFExpression {
    private readonly _phantom: "integer" = "integer";
}
export const rawInt = (n: number): IntegerExpression =>
    new IntegerExpression(n.toString());

// and so on for float, boolean, etc.
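
As a rough idea of what "and so on" could look like, here is a minimal sketch of a boolean and a float variant following the same pattern. The names BooleanExpression, FloatExpression, rawBool and rawFloat are assumptions made for illustration, as is the finite-number check; the base class is repeated (in longhand form) only so the snippet runs on its own.

```typescript
// Same behaviour as ADFExpression above, written longhand so the sketch is self-contained.
class ADFExpression {
    private value: string;
    constructor(value: string) {
        this.value = value;
    }
    toString(): string {
        return this.value;
    }
}

// Hypothetical name: a boolean counterpart to StringExpression.
class BooleanExpression extends ADFExpression {
    private readonly _phantom: "boolean" = "boolean";
}
// Assumes boolean literals are written as bare true / false in the expression.
const rawBool = (b: boolean): BooleanExpression =>
    new BooleanExpression(b ? "true" : "false");

// Hypothetical name: a float counterpart to IntegerExpression.
class FloatExpression extends ADFExpression {
    private readonly _phantom: "float" = "float";
}
const rawFloat = (n: number): FloatExpression => {
    // Assumption: NaN and Infinity have no sensible literal form, so reject them.
    if (!Number.isFinite(n)) {
        throw new Error("rawFloat requires a finite number");
    }
    return new FloatExpression(n.toString());
};

console.log(rawBool(true).toString()); // true
console.log(rawFloat(1.5).toString()); // 1.5
```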

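The key thing to note here is the _phantom field on each subclass. TypeScript's type checking is structural, so without that field StringExpression and IntegerExpression would have the same shape and be interchangeable. Because each class declares its own private _phantom, the compiler treats them as distinct types. With the above, it then becomes really easy to create the aforementioned concat function: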

/**
 * Combine two or more strings, and return the combined string.
 * @param args At least one string to combine
 * @returns The string created from the combined input strings
 */
export const concat = (...args: StringExpression[]): StringExpression => {
    if (args.length < 1) {
        throw new Error("must provide at least 1 argument to concat");
    }
    return new StringExpression(
        "concat(" + args.map((x) => x.toString()).join(", ") + ")"
    );
};

test("concat() throws", () => {
    expect(concat).toThrowError("must provide at least 1 argument to concat");
});

test("concat('hello')", () => {
    expect(concat(rawString("hello")).toString()).toEqual("concat('hello')");
});

test("concat('hello', 'world')", () => {
    expect(concat(rawString("hello"), rawString("world")).toString()).toEqual(
        "concat('hello', 'world')"
    );
});

The trick here is that the concat function accepts only StringExpression arguments and returns only a StringExpression. Consider how we can then combine these functions in a type-safe manner:

// Compiles successfully
concat(rawString("prefix"), rawString("-"), rawString("suffix"));

// Fails to compile with:
//   Argument of type 'IntegerExpression' is not assignable to parameter of type 'StringExpression'.
//   Types have separate declarations of a private property '_phantom'.ts(2345)
concat(rawString("suffix"), rawInt(2));
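
The second line of that error mentions _phantom, which hints at what the field is doing. As a rough illustration, here is the same setup without it. PlainExpression, PlainString, PlainInteger and plainConcat are hypothetical names used only for this sketch; because the two subclasses share the same shape, the compiler accepts one where the other is expected.

```typescript
// Hypothetical classes with no _phantom field (not part of the real code).
class PlainExpression {
    private value: string;
    constructor(value: string) {
        this.value = value;
    }
    toString(): string {
        return this.value;
    }
}
class PlainString extends PlainExpression {}
class PlainInteger extends PlainExpression {}

const plainConcat = (...args: PlainString[]): PlainString =>
    new PlainString(
        "concat(" + args.map((x) => x.toString()).join(", ") + ")"
    );

// Compiles without complaint: PlainInteger is structurally identical to
// PlainString, so nothing stops an integer from reaching concat.
const leaked = plainConcat(new PlainString("'suffix'"), new PlainInteger("2"));

console.log(leaked.toString()); // concat('suffix', 2)
```

The mistake would only surface once the generated expression is evaluated, which is exactly the feedback loop the typed classes are meant to shorten.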

This becomes particularly valuable in the larger dynamic property sections which might look something like this:

// This code creates a path to a blob storage endpoint using a json object
concat(
    if(
        equals(json(pipeline<Params>.parameters.config).targetIsSecure),
        'https://',
        'http://'
    ),
    json(pipeline<Params>.parameters.config).targetEndpoint,
    rawString("/"),
    json(pipeline<Params>.parameters.config).targetPath,
    rawString("/"),
    guid('D'),
)
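
To make the composition a little more concrete, here is a minimal, self-contained sketch of how typed wrappers for a conditional could fit together. Everything beyond the classes shown earlier is an assumption: BooleanExpression, rawBool, boolParam, stringParam and ifElse are hypothetical names (the wrapper cannot be called if, since that is a reserved word in TypeScript), and the json() property access from the example above is left out.

```typescript
// Base classes, equivalent to the ones shown earlier (written longhand).
class ADFExpression {
    private value: string;
    constructor(value: string) {
        this.value = value;
    }
    toString(): string {
        return this.value;
    }
}
class StringExpression extends ADFExpression {
    private readonly _phantom: "string" = "string";
}
// Hypothetical: a boolean counterpart to StringExpression.
class BooleanExpression extends ADFExpression {
    private readonly _phantom: "boolean" = "boolean";
}

const rawString = (str: string): StringExpression =>
    new StringExpression(`'${str}'`);
const rawBool = (b: boolean): BooleanExpression =>
    new BooleanExpression(b ? "true" : "false");

// Hypothetical helpers that reference a pipeline parameter by name.
const stringParam = (name: string): StringExpression =>
    new StringExpression(`pipeline().parameters.${name}`);
const boolParam = (name: string): BooleanExpression =>
    new BooleanExpression(`pipeline().parameters.${name}`);

// T ties both arguments to one expression type; the result is a boolean.
const equals = <T extends ADFExpression>(a: T, b: T): BooleanExpression =>
    new BooleanExpression(`equals(${a.toString()}, ${b.toString()})`);

// String-only conditional: the condition must be a BooleanExpression.
const ifElse = (
    condition: BooleanExpression,
    whenTrue: StringExpression,
    whenFalse: StringExpression
): StringExpression =>
    new StringExpression(
        `if(${condition.toString()}, ${whenTrue.toString()}, ${whenFalse.toString()})`
    );

const concat = (...args: StringExpression[]): StringExpression =>
    new StringExpression(
        "concat(" + args.map((x) => x.toString()).join(", ") + ")"
    );

const url = concat(
    ifElse(
        equals(boolParam("targetIsSecure"), rawBool(true)),
        rawString("https://"),
        rawString("http://")
    ),
    stringParam("targetEndpoint")
);

console.log(url.toString());
```

In this sketch, passing a StringExpression as the condition of ifElse would be rejected by the compiler, in the same way the concat example above rejects an IntegerExpression.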

Writing the above as a single plain-text string would be hard to read and maintain, but with our approach, we get the following:

Code is type-checked
We can make use of native testing facilities
IntelliSense for all ADF functions
JSDoc documentation provides explanations of functions in code
Additional checks can be written as TS code and run when the expression string is generated (see the args.length < 1 check in concat)
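
The args.length check runs when the expression string is built. As one possible refinement (not something the current code does), the same rule can be expressed in the function signature so that the TypeScript compiler reports it instead. The sketch below assumes a variant of concat with a required first parameter; the classes are repeated in longhand form only to keep it self-contained.

```typescript
// Base classes, equivalent to the ones shown earlier (written longhand).
class ADFExpression {
    private value: string;
    constructor(value: string) {
        this.value = value;
    }
    toString(): string {
        return this.value;
    }
}
class StringExpression extends ADFExpression {
    private readonly _phantom: "string" = "string";
}
const rawString = (str: string): StringExpression =>
    new StringExpression(`'${str}'`);

// Assumed variant of concat: the required `first` parameter encodes
// "at least one argument" in the type, so no length check is needed.
const concat = (
    first: StringExpression,
    ...rest: StringExpression[]
): StringExpression =>
    new StringExpression(
        "concat(" + [first, ...rest].map((x) => x.toString()).join(", ") + ")"
    );

const one = concat(rawString("hello"));
const two = concat(rawString("hello"), rawString("world"));

// concat();  // rejected by the compiler: nothing supplied for `first`

console.log(one.toString()); // concat('hello')
console.log(two.toString()); // concat('hello', 'world')
```

The trade-off is that the "concat() throws" test no longer applies, since a call with no arguments never gets past the compiler.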

In the future, we might revisit this and consider creating a parser and interpreter, allowing us to execute these expressions locally (instead of relying on the ADF service). For now, however, this approach took only a small amount of time and gave us a significant maintainability improvement.