<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Strata Blog]]></title><description><![CDATA[Building reliable agents for API integrations]]></description><link>https://blog.connectstrata.com</link><image><url>https://substackcdn.com/image/fetch/$s_!Q3DZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58b0c53c-45be-4bfd-b46d-b27ada7af172_300x300.png</url><title>Strata Blog</title><link>https://blog.connectstrata.com</link></image><generator>Substack</generator><lastBuildDate>Sat, 30 May 2026 06:11:03 GMT</lastBuildDate><atom:link href="https://blog.connectstrata.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Julian Mclain]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[connectstrata@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[connectstrata@substack.com]]></itunes:email><itunes:name><![CDATA[Julian Mclain]]></itunes:name></itunes:owner><itunes:author><![CDATA[Julian Mclain]]></itunes:author><googleplay:owner><![CDATA[connectstrata@substack.com]]></googleplay:owner><googleplay:email><![CDATA[connectstrata@substack.com]]></googleplay:email><googleplay:author><![CDATA[Julian Mclain]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Slack killed their OpenAPI spec, so we built one from their TypeScript SDK]]></title><description><![CDATA[272 endpoints, 134 webhooks, zero web scraping &#8212; generated from the TypeScript AST]]></description><link>https://blog.connectstrata.com/p/slack-killed-their-openapi-spec-so</link><guid 
isPermaLink="false">https://blog.connectstrata.com/p/slack-killed-their-openapi-spec-so</guid><dc:creator><![CDATA[Ilya Brin]]></dc:creator><pubDate>Wed, 15 Apr 2026 14:31:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Q3DZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58b0c53c-45be-4bfd-b46d-b27ada7af172_300x300.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We&#8217;re building <a href="https://connectstrata.com">Strata</a>, a tool that lets you connect software providers and plumb data between them. Under the hood, Strata&#8217;s AI agents rely on OpenAPI schemas to understand what each provider can do &#8212; what API endpoints are available, what the request/response shapes look like, and which webhooks you can subscribe to. These agents allow even non-technical users to quickly (and most importantly, accurately) connect various service providers and translate fields between them.</p><p>This works great for providers that publish Swagger/OpenAPI specs. Unfortunately, Slack happens to be amongst those that don&#8217;t.</p><div><hr></div><p><strong>Just here for the spec?</strong> Grab it directly: <a href="https://github.com/connectstrata/slack-openapi-generator/blob/main/slack-openapi-spec.json">slack-openapi-spec.json</a> &#8212; 272 endpoints, 134 webhooks, 1,526 schemas, updated weekly. </p><p><strong>Or, play around with it, directly in your browser</strong>, in <a href="https://editor.swagger.io/?url=https://raw.githubusercontent.com/connectstrata/slack-openapi-generator/main/slack-openapi-spec.json">Swagger&#8217;s online editor</a>.</p><h2>The problem</h2><p>In 2024, Slack officially &#8220;<a href="https://github.com/slackapi/node-slack-sdk/issues/1836#issuecomment-2198906867">terminated</a>&#8221; their <a href="https://github.com/slackapi/slack-api-specs">public OpenAPI specification</a>. 
If you go looking for one now, you&#8217;ll find a graveyard of <a href="https://github.com/APIs-guru/openapi-directory/blob/main/APIs/slack.com/openai/v1/openapi.yaml">stale community specs</a>, half-working scrapers, and specs that silently drop fields because Slack&#8217;s API types are genuinely complex &#8212; intersection types, literal string discriminators, deeply nested unions, dynamic RPC payloads.</p><p>We needed a complete, accurate spec for Strata&#8217;s Slack integration. Nothing we found was complete and up-to-date. So we built a generator!</p><h2>The approach: read the source of truth</h2><p>Here&#8217;s the key insight: Slack may have abandoned their OpenAPI spec, but their official Node SDK (<a href="https://www.npmjs.com/package/@slack/web-api">@slack/web-api</a>, <a href="https://www.npmjs.com/package/@slack/bolt">@slack/bolt</a>, <a href="https://www.npmjs.com/package/@slack/types">@slack/types</a>) is actively maintained and meticulously typed. The TypeScript type definitions <em>are</em> the spec &#8212; they&#8217;re just in an inconvenient format.</p><p>So rather than scraping docs or reverse-engineering API responses, the generator uses <a href="https://github.com/dsherret/ts-morph">ts-morph</a> to parse the Abstract Syntax Tree of the SDK&#8217;s TypeScript declarations and transpile them into OpenAPI 3.1.0.</p><p>No web scraping. No API calls. No guessing. Just reading the types (and docs!) that Slack&#8217;s own engineers maintain.</p><h2>How it works under the hood</h2><p>The generator runs through a few stages:</p><p><strong>1. Load the SDK&#8217;s type declarations.</strong> ts-morph ingests all the <code>.d.ts</code> files from the three Slack packages into an in-memory AST with strict null checks enabled.</p><p><strong>2. Walk the </strong><code>Methods</code><strong> class.</strong> The SDK declares every API endpoint as a nested property on a <code>Methods</code> class. 
The generator recursively walks this tree &#8212; each leaf is typed as <code>MethodWithRequiredArgument&lt;Args, Response&gt;</code> or <code>MethodWithOptionalArgument&lt;Args, Response&gt;</code>, so extracting the request and response types is straightforward. Property paths map directly to endpoint names: <code>files.uploadV2</code> becomes <code>/api/files.uploadV2</code>.</p><p><strong>3. Convert TypeScript types to OpenAPI schemas.</strong> This is the hard part. The type mapper handles:</p><ul><li><p><strong>Unions</strong> &#8594; <code>anyOf</code>, with optimizations like collapsing string literal unions into a single <code>enum</code> and hoisting common properties across branches</p></li><li><p><strong>Intersections</strong> &#8594; <code>allOf</code>, with inline object squashing so you don&#8217;t get needlessly nested schemas</p></li><li><p><strong>Generics</strong> &#8594; resolved by following defaults and constraints</p></li><li><p><strong>Circular references</strong> &#8594; broken with placeholder schemas and lazy thunks</p></li><li><p><strong>Name collisions</strong> &#8594; when two packages export a type with the same name (e.g., Bolt&#8217;s <code>Authorization</code> vs. web-api&#8217;s <code>Authorization</code>), the registry auto-disambiguates with numbered variants</p></li></ul><p><strong>4. Extract webhooks.</strong> Interfaces ending in <code>Event</code>, <code>Action</code>, or <code>Shortcut</code> from Bolt&#8217;s type definitions become webhook schemas, wrapped in the standard <code>EnvelopedEvent</code> envelope.</p><p><strong>5. Detect binary fields.</strong> The generator recursively scans schemas for <code>format: "binary"</code> and automatically adds <code>multipart/form-data</code> as a content type for those endpoints &#8212; so file upload endpoints Just Work.</p><h2>Staying up to date</h2><p>A schema is only useful if it&#8217;s current. 
A <a href="https://github.com/connectstrata/slack-openapi-generator/blob/main/.github/workflows/update-spec.yml">GitHub Action</a> runs weekly: it installs the latest versions of all three Slack packages, regenerates the spec, and auto-commits if anything changed. No manual maintenance required.</p><h2>Why we open-sourced it</h2><p>We built this for Strata, but there&#8217;s no reason to keep it proprietary. We&#8217;ve pulled numerous community-maintained OpenAPI specs for use in our product, and we&#8217;d love to be a good neighbor and return the favor.</p><p>Anyone building Slack integrations &#8212; whether you&#8217;re generating client SDKs, feeding API context to LLMs, setting up Postman collections, or building documentation &#8212; can use the generated spec directly.</p><p>You can grab the latest spec from the repo: <a href="https://github.com/connectstrata/slack-openapi-generator/blob/main/slack-openapi-spec.json">slack-openapi-spec.json</a></p><p>Or clone the repo and generate it yourself:</p><pre><code><code>git clone https://github.com/connectstrata/slack-openapi-generator.git
cd slack-openapi-generator
npm install
node generate.js</code></code></pre><h2>What&#8217;s next</h2><p>If you use the spec and run into edge cases &#8212; missing endpoints, type inaccuracies, TypeScript patterns that aren&#8217;t handled &#8212; we&#8217;d love to hear about it. <a href="https://github.com/connectstrata/slack-openapi-generator/issues">Open an issue</a> or submit a PR.</p><p>And if the idea of connecting APIs without writing glue code sounds interesting, check out <a href="https://connectstrata.com">Strata</a> &#8212; it&#8217;s the project that started all of this.</p>]]></content:encoded></item><item><title><![CDATA[Data Transformation Benchmark]]></title><description><![CDATA[Can LLMs reliably transform data between SaaS APIs?]]></description><link>https://blog.connectstrata.com/p/data-transformation-benchmark</link><guid isPermaLink="false">https://blog.connectstrata.com/p/data-transformation-benchmark</guid><dc:creator><![CDATA[Julian Mclain]]></dc:creator><pubDate>Thu, 19 Feb 2026 23:11:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f68f205f-45dc-483f-807d-c22c6fb79d78_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div 
class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tR0G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tR0G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!tR0G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!tR0G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!tR0G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tR0G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png" width="470" height="470" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:470,&quot;bytes&quot;:608031,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.connectstrata.com/i/188554476?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tR0G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!tR0G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!tR0G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!tR0G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e43ab8f-8314-4aaf-be1c-1fd89811d054_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Data transformations are the heart of any integration. A good integration developer builds a mental model for each system and creates pipelines that account for data in all of its glorious messiness.</p><p>We built an evaluation suite that tests LLMs on their ability to assume the integration developer role. We constructed 20 real-world data transformation scenarios spanning popular marketing platforms like Shopify, Salesforce, HubSpot, Braze, Klaviyo, and more. 
Each scenario requires the model to read source and target JSON schemas, interpret natural-language mapping instructions, and produce a working transformation that passes strict schema validation.</p><h2>Best models for data integration mapping</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LKl_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LKl_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png 424w, https://substackcdn.com/image/fetch/$s_!LKl_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png 848w, https://substackcdn.com/image/fetch/$s_!LKl_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png 1272w, https://substackcdn.com/image/fetch/$s_!LKl_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LKl_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png" width="1512" height="1311" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1311,&quot;width&quot;:1512,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:157565,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.connectstrata.com/i/188554476?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff09dfcaa-3ae7-4c9d-a87c-010f6f04d4a6_1600x1412.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LKl_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png 424w, https://substackcdn.com/image/fetch/$s_!LKl_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png 848w, https://substackcdn.com/image/fetch/$s_!LKl_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png 1272w, https://substackcdn.com/image/fetch/$s_!LKl_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5601d5a-55cf-41a8-a855-676845bff379_1512x1311.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>You don&#8217;t always need the biggest model</h2><p>With accurate schemas, frontier LLMs reliably generate correct transformation code, even across complex, deeply nested structures. Surprisingly, Claude Haiku 4.5 scored 98.3%, matching the larger and more capable Opus 4.6 model. This suggests that these data transformation tasks are unlikely to benefit much from further model improvements. As model capabilities continue to improve, reaching for the most capable option will become increasingly unnecessary. 
Evals are critical for identifying the right level of capability for a given task.</p><h2>Evaluation structure</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wYOu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wYOu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png 424w, https://substackcdn.com/image/fetch/$s_!wYOu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png 848w, https://substackcdn.com/image/fetch/$s_!wYOu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png 1272w, https://substackcdn.com/image/fetch/$s_!wYOu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wYOu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png" width="1456" height="657" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:657,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:299273,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://connectstrata.substack.com/i/188554476?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wYOu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png 424w, https://substackcdn.com/image/fetch/$s_!wYOu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png 848w, https://substackcdn.com/image/fetch/$s_!wYOu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png 1272w, https://substackcdn.com/image/fetch/$s_!wYOu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a7e947d-54d0-4efe-a1dd-190242ceb873_2048x924.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The evaluation measures whether an LLM can produce a JavaScript function that correctly transforms a set of SaaS webhook payload and API response objects into a desired SaaS API request body.</p><p>For each completion, the LLM is provided with:</p><ol><li><p>Source schemas &#8212; JSON schemas for one or more integration resources, such as a Shopify customer created webhook payload or a Contact object from the Salesforce API</p></li><li><p>Target schema &#8212; The JSON schema for an integration resource to which the output must conform (e.g. a Braze update user API request body)</p></li><li><p>Natural-language instructions &#8212; Plain English instructions describing which fields to map. 
The instructions are intentionally terse and do not use precise field names in order to reflect real-world user prompts.</p></li><li><p>Common data mapping guidelines &#8212; A checklist of things to ensure before responding to the user (e.g. type conversions, date-time formatting, null handling)</p></li></ol><p>The response is scored by executing the generated JavaScript function against real example data from the source API or webhook. A binary pass/fail score is assigned based on whether the resulting object validates against the target JSON schema. To measure reliability rather than single-pass results, each of the 20 test cases is evaluated three times, for a total of 60 test case evaluations per model.</p><h3>Test case design</h3><p>The test cases are drawn from real customer integration scenarios and span three difficulty levels:</p><ul><li><p>Low (7 tests): Single-source, straightforward field mapping. Tests basic schema comprehension and field correspondence.</p></li><li><p>Medium (7 tests): Multi-source merging, date formatting, type conversion, array construction, and distractor filtering (irrelevant sources included in the input).</p></li><li><p>High (6 tests): Complex nested structures, multi-source cross-referencing, deep flattening, dual-target outputs, and date/format transformations.</p></li></ul><h3>Future plans</h3><p>This benchmark is a starting point, and several improvements are planned, including:</p><ul><li><p>More models &#8212; Expand to include the Gemini model family and GLM-5</p></li><li><p>More test cases &#8212; Add scenarios covering additional API platforms and edge cases like malformed data</p></li><li><p>More inputs &#8212; Execute the LLM-generated data transformation against more than one example input</p></li></ul><p>We&#8217;re also exploring whether a purpose-built DSL for JSON transformations can match the performance of general-purpose programming languages while providing better static analysis 
capabilities.</p><p><strong>The full evaluation suite and results are available on <a href="https://github.com/connectstrata/data-transformation-benchmark">GitHub</a>.</strong></p>]]></content:encoded></item></channel></rss>