Normalising messy API payloads

Take a raw third-party response and reshape it into a consistent internal format — trim strings, cast types, rename keys, drop nulls — as one reusable compose() pipeline. The same pipeline works for one record or a list via array_map.

Functions used

The problem

External APIs rarely hand back data in the shape your app wants. Fields are differently cased, numbers arrive as strings, optional values show up as empty strings instead of null, and deeply nested bits need flattening. You typically end up with a page of foreach ($data as &$row) { ... } scaffolding.
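For contrast, here is a minimal sketch of the imperative scaffolding that paragraph describes, in plain PHP. The `$rows` input is hypothetical and borrows a few field names from the sample payload; the point is only how validation, casting, and renaming end up tangled in a single loop body.

```php
<?php
// Hypothetical input: two rows in the style of a third-party payload.
$rows = [
    ['ID' => '1', 'FullName' => '  Ada Lovelace  ', 'Age' => '42'],
    ['ID' => '2', 'FullName' => 'Bea Smith',        'Age' => 'unknown'],
];

$clean = [];
foreach ($rows as $row) {
    // Validation, casting and renaming all happen inline.
    if (!is_numeric($row['Age'])) {
        continue; // drop rows with an unusable age
    }
    $clean[] = [
        'id'   => (int) $row['ID'],
        'name' => trim($row['FullName']),
        'age'  => (int) $row['Age'],
    ];
}
```

Every new rule (another cast, another reason to drop a row) grows this loop, which is the pressure the pipeline approach is meant to relieve.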

Source data

$raw = [
    [
        'ID'         => '1',
        'FullName'   => '  Ada Lovelace  ',
        'EmailAddr'  => 'ADA@EXAMPLE.COM ',
        'Age'        => '42',
        'Notes'      => '',           // empty string — treat as null
    ],
    [
        'ID'         => '2',
        'FullName'   => 'Bea Smith',
        'EmailAddr'  => 'bea@example.com',
        'Age'        => 'unknown',     // invalid age — drop the row
        'Notes'      => 'VIP',
    ],
];

Target shape:

[
    'id'    => 1,          // cast to int
    'name'  => 'Ada Lovelace',
    'email' => 'ada@example.com',
    'age'   => 42,
    'notes' => null,
]

Build the per-row pipeline from small steps

Each step is a pure function named for what it does, and the pipeline needs no intermediate variables to carry data from one step to the next.

use PinkCrab\FunctionConstructors\GeneralFunctions as F;
use PinkCrab\FunctionConstructors\Arrays as A;

// Rename + shape: source field → output field, with a value transform.
$shape = fn($row) => [
    'id'    => (int) $row['ID'],
    'name'  => trim($row['FullName']),
    'email' => strtolower(trim($row['EmailAddr'])),
    'age'   => is_numeric($row['Age']) ? (int) $row['Age'] : null,
    'notes' => $row['Notes'] === '' ? null : $row['Notes'],
];

// The row is useful only if the age cast succeeded.
$hasValidAge = fn($row) => $row['age'] !== null;

// One pipeline that shapes every row, then drops the invalid ones.
$normalise = F\compose(
    A\map($shape),
    A\filter($hasValidAge)
);

$clean = $normalise($raw);

Result:

[
    ['id' => 1, 'name' => 'Ada Lovelace', 'email' => 'ada@example.com', 'age' => 42, 'notes' => null],
]

Bea is dropped because her age failed the numeric check.
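To make the data flow concrete, the sketch below reproduces the same map-then-filter sequence with PHP built-ins only. `composeSketch` is a hypothetical stand-in written for this example, not the library's implementation, and the two-field rows are cut-down sample data. It also shows one thing worth checking in your own setup: `array_filter` keeps the original keys, so if `A\filter` behaves the same way (an assumption here), a dropped first row leaves the result starting at key 1 unless you reindex with `array_values`.

```php
<?php
// Hypothetical stand-in for a left-to-right compose; not the library's code.
function composeSketch(callable ...$steps): Closure
{
    return function ($value) use ($steps) {
        foreach ($steps as $step) {
            $value = $step($value);
        }
        return $value;
    };
}

$shape = fn(array $row): array => [
    'id'  => (int) $row['ID'],
    'age' => is_numeric($row['Age']) ? (int) $row['Age'] : null,
];
$hasValidAge = fn(array $row): bool => $row['age'] !== null;

$normalise = composeSketch(
    fn(array $rows): array => array_map($shape, $rows),
    fn(array $rows): array => array_filter($rows, $hasValidAge),
    'array_values' // reindex, because array_filter preserves keys
);

// The invalid row comes first, so without array_values the survivor would sit at key 1.
$result = $normalise([
    ['ID' => '2', 'Age' => 'unknown'],
    ['ID' => '1', 'Age' => '42'],
]);
```

Whether to reindex is a design choice: keep the keys if callers need to trace a clean row back to its position in the raw payload, reindex if they expect a plain list.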

Why this reads well

Using it on a single record

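pipe is the immediate-value counterpart to compose: it takes the value first, then the functions to run it through.

$one = F\pipe(
    $raw[0],
    $shape
);  // shaped row, not yet validated

pipe does not short-circuit on a bool, so appending $hasValidAge as a final step would return true or false rather than the row. ifThen with an identity callback does not gate either, because it hands back the row whether or not the predicate passes. For a single-record version, use ifElse to return null for an invalid record:

$normaliseOne = F\compose(
    $shape,
    F\ifElse($hasValidAge, fn($row) => $row, fn($row) => null)   // shaped row when valid, null when invalid
);

$normaliseOne($raw[0]);  // shaped row
$normaliseOne($raw[1]);  // null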
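A plain-PHP sketch of why a predicate cannot act as a gate inside a pipe: each step's return value replaces the previous one, so a bool-returning step hands back the bool. `pipeSketch` and `gate` are hypothetical helpers written for this example, not library functions. `gate` shows one way to keep the row when the predicate passes and return null otherwise.

```php
<?php
// Hypothetical stand-in: thread a value through each callable in order.
function pipeSketch($value, callable ...$steps)
{
    foreach ($steps as $step) {
        $value = $step($value);
    }
    return $value;
}

// Hypothetical helper: pass the value through when the predicate holds, else null.
function gate(callable $predicate): Closure
{
    return fn($value) => $predicate($value) ? $value : null;
}

$shape = fn(array $row): array => [
    'id'  => (int) $row['ID'],
    'age' => is_numeric($row['Age']) ? (int) $row['Age'] : null,
];
$hasValidAge = fn(array $row): bool => $row['age'] !== null;

$ada = ['ID' => '1', 'Age' => '42'];
$bea = ['ID' => '2', 'Age' => 'unknown'];

$asBool  = pipeSketch($ada, $shape, $hasValidAge);        // the predicate's result, not the row
$valid   = pipeSketch($ada, $shape, gate($hasValidAge));  // the shaped row
$invalid = pipeSketch($bea, $shape, gate($hasValidAge));  // null
```

Returning null for an invalid record is only one convention; throwing an exception or returning a result object would work as well, depending on how callers prefer to handle rejects.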

Adding a new field later

Six months from now, the API gains a Department field. Only $shape changes; the filter, the compose call, and every caller stay the same:

$shape = fn($row) => [
    /* existing fields ... */
    'department' => trim($row['Department'] ?? 'Unknown'),
];
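As a quick check of that claim, the sketch below runs an old-style payload (no Department key) and a new-style one through an extended shape function. The two-field `$shape` here is a cut-down, hypothetical version kept short for illustration. The `??` fallback is what lets older payloads keep working.

```php
<?php
// Cut-down, hypothetical shape function: one existing field plus the new one.
$shape = fn(array $row): array => [
    'id'         => (int) $row['ID'],
    // ?? supplies the default when the key is missing or null.
    'department' => trim($row['Department'] ?? 'Unknown'),
];

$old = $shape(['ID' => '1']);                                   // payload from before the change
$new = $shape(['ID' => '2', 'Department' => '  Engineering ']); // payload with the new field
```

Note that `??` only covers a missing or null key. An empty-string Department would pass through as '', so add an explicit check if that case should also map to 'Unknown'.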