Centralize your analytics using Segment proxies
--
Why?
Certain Segment destinations, such as Braze and Intercom, are more restrictive than others when it comes to how much data they can process for you. This can mean a limited number of unique events, arbitrary property rules, etc.
Segment offers a few different tools to deal with such problems:
- Destination Filters API, which allows you to create allow-lists and block-lists for both user properties and event payloads per destination.
- Protocols API (business plan), which allows you to run transformations on user attributes and event payloads.
Both tools can be configured either from the Segment dashboard or through the REST API, using JSON payloads written in a custom syntax.
This may be fine for simple cases but can quickly become hard to manage whenever you have more than a few sources, especially combined with different environments (development, staging and production).
Keep in mind that you might still need to send a single common payload that satisfies every destination, and then trim it per destination using the aforementioned tools.
This means adjusting every destination filter, once per environment, each time you want to track a new event or a new user property.
This is where having a single point in your infrastructure through which all Segment events pass starts to look appealing.
Here are some benefits of using this approach:
- Single source of truth for your data
- Much more scalable than the built-in Segment solution
- Ability to perform any operation on any event (transformations, rejections, etc.)
You will keep the ability to use any Segment API on top of your implementation, for hot fixes and quicker modifications from your non-technical team members.
How?
The first step is to read through Segment’s documentation and look for more information regarding their proxy feature. You’ll quickly notice that the few available resources are not very helpful for what we want to accomplish.
They show how to use proxies as a way to route all events to your domain in addition to the normal behaviour, which means you will not be able to properly intercept them before they are sent to Segment.
What we need is actually much simpler than what is shown in the docs and the example repository.
Let’s start with our client side application(s).
All Segment libraries (analytics.js, analytics-react-native, analytics-node, etc.) allow you to change the destination of events. Although the API is not consistent across the board, it generally looks like this:
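As a rough sketch of the idea, the snippet below builds options in the style of analytics-node, where the `host` option replaces the default `https://api.segment.io`. The proxy URL and the helper names `buildNodeOptions` and `resolveBatchUrl` are assumptions made for illustration; other SDKs expose the same setting under a different option name, so check the documentation of the library you use.

```typescript
// Hypothetical proxy base URL on your own server; replace with your own.
const PROXY_BASE_URL = "https://analytics.example.com/segment";

// Options in the style of analytics-node, whose `host` option replaces the
// default https://api.segment.io. Other SDKs name this option differently.
function buildNodeOptions(proxyBaseUrl: string): { host: string } {
  // Drop a trailing slash so the appended path does not produce "//".
  return { host: proxyBaseUrl.replace(/\/$/, "") };
}

// The URL the SDK ends up calling once it appends its batch path.
function resolveBatchUrl(options: { host: string }): string {
  return `${options.host}/v1/batch`;
}

// With the analytics-node package installed, the options would be passed as:
//   const analytics = new Analytics("YOUR_WRITE_KEY", buildNodeOptions(PROXY_BASE_URL));
```

Keeping the proxy URL in one constant makes it easy to switch between environments without touching the tracking calls themselves.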
All you need to do is point it to an endpoint on your server capable of handling POST requests with a JSON body. All events will now automatically be sent to your server, complete with Segment’s configurable batching and offline mode capabilities.
You can optionally forward your client write key as part of the event.
Now let’s look at what we can do on the server.
Here is a minimal implementation that only accepts identify events, adds a few common properties to all of them, and restricts which destinations will receive them.
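One possible shape for that logic, written as framework-agnostic functions so it can sit behind any POST handler. The route, the common traits, the allowed destination name and all function names are hypothetical; the only Segment conventions relied on are the `batch` array sent by the SDKs and the `integrations` object, where `All: false` disables every destination that is not explicitly enabled.

```typescript
// Minimal shape of the events this sketch cares about.
interface SegmentEvent {
  type: string;
  traits?: Record<string, unknown>;
  integrations?: Record<string, boolean>;
  [key: string]: unknown;
}

interface BatchBody {
  batch: SegmentEvent[];
  [key: string]: unknown;
}

// Hypothetical route: the client-side proxy path followed by /v1/batch.
const BATCH_ROUTE = "/segment/v1/batch";

// Hypothetical values; adjust to your own project.
const COMMON_TRAITS: Record<string, unknown> = { app: "my-app", environment: "production" };
const ALLOWED_DESTINATIONS = ["Intercom"]; // must match the names in your workspace

// Only POST requests on the batch route are handled.
function isBatchRequest(method: string, path: string): boolean {
  return method === "POST" && path === BATCH_ROUTE;
}

// Disable every destination, then re-enable the allowed ones.
function restrictDestinations(names: string[]): Record<string, boolean> {
  const integrations: Record<string, boolean> = { All: false };
  for (const name of names) integrations[name] = true;
  return integrations;
}

// Keep identify events only, merge in the common traits (they win on
// conflict) and restrict the destinations of each remaining event.
function processBatch(body: BatchBody): BatchBody {
  const batch = body.batch
    .filter((event) => event.type === "identify")
    .map((event) => ({
      ...event,
      traits: { ...event.traits, ...COMMON_TRAITS },
      integrations: restrictDestinations(ALLOWED_DESTINATIONS),
    }));
  return { ...body, batch };
}
```

The result of `processBatch` would then be forwarded to Segment's own batch endpoint, `https://api.segment.io/v1/batch`, authenticated with the write key of the source. That forwarding step is left out here because it depends on the HTTP client you already use.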
You may have noticed that the endpoint URL is different from what we specified on the client side. This is because the Segment SDKs automatically append /v1/batch to the end of the provided proxy path.
That’s it! You can now intercept every event sent from any of your sources, without losing the built-in features provided by the Segment SDKs such as batching and auto retry.
This opens up a whole new realm of possibilities for your data, especially when it comes to implementing custom logic on a per-destination basis.
I hope you enjoyed this article and learned something new. I wrote it to help others, since I had to implement a complex system using this mechanism, which is poorly documented.
I’m considering publishing a part two where we can dive a little deeper into custom destination implementations and how to control your events in a more scalable way.
Is this something you’d like to read? Let me know in the comments!