docs.snowplow.io
Effective URL: https://docs.snowplow.io/docs/understanding-tracking-design/versioning-your-data-structures/amending/?utm_campaign=INB.T3...
Submission: On December 06 via api from ES — Scanned from ES
AMENDING SCHEMAS

Info: This documentation only applies to Snowplow BDP Enterprise and Snowplow Community Edition. See the feature comparison page for more information about the different Snowplow offerings.

Sometimes, small mistakes creep into your schemas. For example, you might mark an optional field as required, or make a typo in the name of one of the fields. In these cases, you will want to update the schema to correct the mistake.

TREAT SCHEMAS AS IMMUTABLE

It might be tempting to somehow “overwrite” the schema without updating the version. But this can bring several problems:

* Events that were previously valid could become invalid against the new changes.
* Your warehouse loader, which updates the table according to the schema, could get stuck if it’s not possible to cast the data in the existing table column to the new definition (e.g. if you change a field type from a string to a number).
* Similarly, data models or other applications consuming the data downstream might not be able to deal with the changes.

The best approach is to create a new schema version and update your tracking code to use it. However, there are two alternatives for when that is not practical.

PATCHING THE SCHEMA

If you are working on a new schema version in a development environment, there is usually little risk in overwriting the schema instead of creating a new version. That’s because the new schema version has not made it to production, so changing it will not corrupt any production data.

Moreover, if you overwrite all incorrect schema versions, you will be left with a neat and tidy schema version history.

Before: 1-0-0, 1-0-1, 1-0-2 (incorrect)
After: 1-0-0, 1-0-1, 1-0-2 (corrected)

We call this approach “patching”. To patch the schema, i.e. apply changes to it without updating the version:

* If you are using Snowplow BDP, select the “Patch” option in the UI when saving the schema.
* If you are using Snowplow Community Edition, do not increment the schema version when uploading it with igluctl.

Danger: Never patch schemas in a production environment. This can break your loading, especially if your patch contains breaking changes (see above). Also, never patch a schema version that exists in a production environment, even if you are doing the patching in a development environment. This will lead to problems later when you try to promote that schema to production.

For Snowplow BDP customers, patching is disabled for production pipelines. Community Edition users have to explicitly enable patching (if desired) in the Iglu Server configuration (patchesAllowed) at their own risk.

MARKING THE SCHEMA AS SUPERSEDED

If your events are failing in production because of an incorrect schema, you might not be able to instantly update the tracking code to use a new schema version. This is a common situation for mobile tracking, for example.
You can resolve this by marking the old schema version as superseded by the new schema version.

Note: You need to be on Enrich 3.8.0+ and Iglu Server 0.11.0+ to use this feature. Additionally, if you are using Snowplow Mini or Snowplow Micro, you will need version 0.17.0+ or 1.7.1+ respectively.

Before: 1-0-0, 1-0-1, 1-0-2 (incorrect)
After: 1-0-0, 1-0-1, 1-0-2 (incorrect), 1-0-3 (corrected), where 1-0-3 supersedes 1-0-2

Here’s how this works, at a glance:

* Suppose schema 1-0-2 is wrong.
* Draft a new schema version correcting the issue.
* In the new schema, add the following field at the root: "$supersedes": ["1-0-2"].
* Set the version of the new schema as usual, i.e. 1-0-3 if there are no breaking changes or 2-0-0 if there are.
* Add the new schema to your production environment.
* Events or entities that use schema 1-0-2 will now be automatically updated (in the Enrich application) to use version 1-0-3, and will be validated against that version. (A special entity will be added to these events to record this fact.)

EXAMPLE

Let’s say we have a mobile application. We are sending certain events from this application, and these events contain entities with the following schema:

Geolocation 1-0-2

    {
      "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
      "description": "Schema for client geolocation contexts",
      "self": {
        "vendor": "com.acme",
        "name": "geolocation",
        "format": "jsonschema",
        "version": "1-0-2"
      },
      "type": "object",
      "properties": {
        "latitude": {
          "type": "number"
        },
        "longitude": {
          "type": "number"
        }
      },
      "additionalProperties": false
    }

Later, we realize that when implementing tracking, we have mistakenly included an altitude field in the entity objects:

Wrong tracking code (iOS)

    let event = ScreenView(name: "Screen")
    event.entities.add(
      SelfDescribingJson(schema: "iglu:com.acme/geolocation/jsonschema/1-0-2",
        andDictionary: [
          "latitude": 38.7223,
          "longitude": 9.1393,
          "altitude": 20 // extra field not defined in the schema
        ])!)
    tracker.track(event)

Since additionalProperties is set to false, all events with the altitude field end up as failed events.

We can create a new schema with version 1-0-3 that contains the altitude field and then use this schema in the next version of the application. This would make the events valid. However, users will not update their application to the new version all at once. Events from the older version will continue to arrive, so there will still be failed events until all users move to a newer version.

To solve this problem, we simply add the $supersedes definition to the new schema.

Geolocation 1-0-3 with $supersedes

    {
      "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
      "$supersedes": ["1-0-2"],
      "description": "Schema for client geolocation contexts",
      "self": {
        "vendor": "com.acme",
        "name": "geolocation",
        "format": "jsonschema",
        "version": "1-0-3"
      },
      "type": "object",
      "properties": {
        "latitude": {
          "type": "number"
        },
        "longitude": {
          "type": "number"
        },
        "altitude": {
          "type": "number"
        }
      },
      "additionalProperties": false
    }

Now, when we receive events from the mobile application that use schema 1-0-2, these events will be updated to use schema 1-0-3 and will be validated against that schema. Therefore, these events will be valid.

To record this fact, an extra entity will be added to all such events:

    {
      "schema": "iglu:com.snowplowanalytics.iglu/validation_info/jsonschema/1-0-0",
      "data": {
        "originalSchema": "iglu:com.acme/geolocation/jsonschema/1-0-2",
        "validatedWith": "1-0-3"
      }
    }

Finally, if we browse schema version 1-0-2, we will see that Iglu Server automatically keeps track of which schema supersedes which.
Specifically, it will now contain a $supersededBy definition:

Geolocation 1-0-2 with $supersededBy

    {
      "$schema": "http://iglucentral.com/schemas/com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0#",
      "$supersededBy": "1-0-3",
      "description": "Schema for client geolocation contexts",
      "self": {
        "vendor": "com.acme",
        "name": "geolocation",
        "format": "jsonschema",
        "version": "1-0-2"
      },
      "type": "object",
      "properties": {
        "latitude": {
          "type": "number"
        },
        "longitude": {
          "type": "number"
        }
      },
      "additionalProperties": false
    }

USAGE

The $supersedes field states that the schema version defined in the self part supersedes the schema versions listed in the $supersedes field (one or more). Its value must be an array of strings (even if it only includes one item). For example:

    ...
    "$supersedes": ["1-0-2", "1-0-3"],
    ...

Patching and superseding

Once you’ve defined the $supersedes field for a schema version, you can’t update it, even in the development environment where patching is allowed. However, you can change which schema version supersedes which by creating new schema versions. For example, if version 1-0-2 is defined to supersede version 1-0-1, and you create version 1-0-3 which also supersedes 1-0-1, then 1-0-1 will be superseded by the newest version, i.e. 1-0-3. See the rules below for more information on how this is determined.

RULES

A SCHEMA VERSION CAN ONLY SUPERSEDE PREVIOUS VERSIONS

For example, 1-0-2 can supersede 1-0-1, but can’t supersede 1-0-3, 1-1-0, or 2-0-0. Iglu Server will reject a schema with a definition that breaks this rule.

✅ OK: 1-0-2 supersedes 1-0-1
❌ Invalid: 1-0-2 supersedes 2-0-0

A SCHEMA VERSION CAN SUPERSEDE MULTIPLE PREVIOUS VERSIONS AT ONCE

Events referencing either of those previous versions will be treated as explained above.

✅ OK: 1-0-3 supersedes 1-0-1 and 1-0-2

AT ANY GIVEN MOMENT, A SCHEMA VERSION CAN ONLY BE SUPERSEDED BY A SINGLE SCHEMA VERSION

Iglu Server automatically upholds this rule.
For example, if you specify that 1-0-3 supersedes 1-0-2 and (later) that 1-0-4 also supersedes 1-0-2, the latest schema, 1-0-4, will automatically become the one that supersedes 1-0-2.

Specified: 1-0-3 supersedes 1-0-2; 1-0-4 supersedes 1-0-2
Becomes: 1-0-4 supersedes 1-0-2

The same happens if you specify “chains”, e.g. 1-0-3 supersedes 1-0-2 and 1-0-4 supersedes 1-0-3. This will be automatically updated so that 1-0-4 supersedes 1-0-2 and 1-0-3.

Specified: 1-0-3 supersedes 1-0-2; 1-0-4 supersedes 1-0-3
Becomes: 1-0-4 supersedes 1-0-2 and 1-0-3

Last updated on Nov 27, 2023
Copyright © 2023 Snowplow Analytics Ltd.
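The superseding rules described on this page can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not Iglu Server's actual implementation. The function names (parse, check_supersedes, resolve_superseded_by) are hypothetical, and the sketch assumes that "previous version" means a strictly smaller MODEL-REVISION-ADDITION triple.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Turn a SchemaVer string such as '1-0-2' into a comparable tuple."""
    model, revision, addition = (int(part) for part in version.split("-"))
    return (model, revision, addition)


def check_supersedes(version: str, supersedes: List[str]) -> None:
    """Rule: a schema version can only supersede previous versions.

    Assumption: 'previous' is modelled as a strictly smaller version tuple.
    """
    for old in supersedes:
        if parse(old) >= parse(version):
            raise ValueError(f"{version} cannot supersede {old}")


def resolve_superseded_by(declared: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each superseded version to the single version that replaces it.

    `declared` maps a schema version to the contents of its $supersedes field.
    """
    for version, olds in declared.items():
        check_supersedes(version, olds)

    # Rule: only one superseding version at a time, and the newest one wins.
    # Visiting declarers in ascending order lets newer ones overwrite older ones.
    direct: Dict[str, str] = {}
    for version in sorted(declared, key=parse):
        for old in declared[version]:
            direct[old] = version

    # Collapse chains: if 1-0-3 replaces 1-0-2 and 1-0-4 replaces 1-0-3,
    # then 1-0-2 ends up superseded by 1-0-4. The loop terminates because
    # every hop moves to a strictly greater version.
    resolved: Dict[str, str] = {}
    for old, target in direct.items():
        while target in direct:
            target = direct[target]
        resolved[old] = target
    return resolved
```

With the "chain" case from the rules, resolve_superseded_by({"1-0-3": ["1-0-2"], "1-0-4": ["1-0-3"]}) returns {"1-0-2": "1-0-4", "1-0-3": "1-0-4"}, and check_supersedes("1-0-2", ["2-0-0"]) raises ValueError, mirroring the invalid case above.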