Why does every shared tweet need to end in ?s=20? Why has Amazon been tacking hasWorkingJavaScript=1 onto URLs for nearly 15 years?
If you’re like me, there’s nothing you love more than a tidy URL. So if there are any query parameters dangling on the end (e.g. ?variables=like&this=etc), they had better be meaningful. This isn’t about aesthetics: promoting users' ability to comprehend and work with URLs is a vital component of preserving the Open Web. That so many developers litter so much nonsense into their users' location bars is a source of endless irritation when I surf the web.
For most of Test Double’s history, a combination of principled minimalism and our own incompetence with respect to marketing resulted in our site’s URLs not using any query parameters at all. But earlier this year, our Marketing Director Cathy embarked on a mission to incorporate an ethical analytics platform, which led to our adoption of Matomo, along with a handful of other tools like Rebrandly and Campaign Monitor.
[Note: Matomo has the distinction of enabling the option to use only first-party cookies and anonymize IP addresses so it's not collecting personally identifiable information, or sharing that info with other companies.]
Unfortunately, the only way for us to have any idea whether somebody is checking out a blog post because @testdouble tweeted about it or because we shared it in our newsletter is to append one or more query parameters (usually starting with utm_, like utm_campaign or utm_source). If you’re not going to rely on privacy-invading third-party cookies, query parameters seem to be an ugly-but-necessary ingredient in understanding how our message is reaching people. (And, let’s be honest, Cathy still wants to know things about what content is being read, and in what context, because marketing.)
But, here’s the thing: once a user has loaded the page and the analytics platform has logged the visit, what’s the point of leaving all those utm_ query parameters to clog up the browser’s location bar?
Couldn’t you fix this in 17 lines of JavaScript?
That question led to another: was there anything preventing us from reading any analytics-related query parameters on page load and then immediately rewriting the URL to clean it up?
Turns out, no. If you don’t care about your analytics-specific query parameters after the visit is initially logged, there’s nothing stopping you from filtering them out and rewriting the URL using History.replaceState().
Here’s the snippet I added to our existing Matomo JavaScript tracker code (but this approach could probably be adapted to Google Analytics, Plausible, etc.).
// EXISTING tracker code, copy-and-pasted from the analytics platform:
var _paq = window._paq = window._paq || [];
_paq.push(["setDocumentTitle", document.domain + "/" + document.title]);
_paq.push(["setCookieDomain", "*.testdouble.com"]);
_paq.push(["setDomains", "*.testdouble.com"]);
// NEW snippet I added to clean up the URL if necessary:
if (window.URLSearchParams && window.history) {
  var originalUrl = window.location.href
  var params = new URLSearchParams(window.location.search)
  var urlShouldBeCleaned = false
  Array.from(params.keys()).forEach(function (key) {
    if (key.indexOf('utm_') === 0) {
      urlShouldBeCleaned = true
      params.delete(key)
    }
  })
  if (urlShouldBeCleaned) {
    var query = params.toString() ? '?' + params.toString() : ''
    var cleanUrl = window.location.pathname + query + window.location.hash
    window.history.replaceState(null, '', cleanUrl)
    _paq.push(['setCustomUrl', originalUrl])
  }
}
// EXISTING tracker code to wrap things up:
_paq.push(['trackPageView']);
_paq.push(['enableLinkTracking']);
(function() {
  var u="https://testdouble.matomo.cloud/";
  _paq.push(['setTrackerUrl', u+'matomo.php']);
  _paq.push(['setSiteId', '1']);
  var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
  g.type='text/javascript'; g.async=true; g.src='//cdn.matomo.cloud/testdouble.matomo.cloud/matomo.js'; s.parentNode.insertBefore(g,s);
})();
If you happen to be using Matomo and are able to just copy-paste that little if block, you’re probably already off to the races. But if you’re using another platform or want to understand a bit about what’s going on here, let’s break down what the code above is actually doing.
What the code above is actually doing
First of all, since this approach relies on relatively new browser features, URLSearchParams and the History API, we wrap everything in an if statement that detects the presence of both, to try to guard against any errors that users might encounter on older browsers.
if (window.URLSearchParams && window.history) {
  // The good stuff
}
Before we muck around with changing the URL, we store the originalUrl from the Window.location API and instantiate a search params object using the actual query string (available via window.location.search):
var originalUrl = window.location.href
var params = new URLSearchParams(window.location.search)
Next, we grab all the parameter keys (i.e. 'foo' and 'bar' given foo=4&bar=24) using URLSearchParams.keys(), convert the resulting Iterator into an array using Array.from, and iterate over that array with forEach.
The function we pass to forEach simply checks to see if the given parameter key starts with 'utm_' and, if it does, ❶ sets a boolean that indicates we need to change the URL and ❷ deletes it from our URLSearchParams instance (params):
var urlShouldBeCleaned = false
Array.from(params.keys()).forEach(function (key) {
  if (key.indexOf('utm_') === 0) {
    urlShouldBeCleaned = true
    params.delete(key)
  }
})
After checking the keys, we only proceed if we actually deleted any utm_ query parameters:
if (urlShouldBeCleaned) {
  // The main event
}
Inside that nested if, we do two last things. First, we build a new URL and swap it into the location bar by calling window.history.replaceState(). (The query variable produced via a ternary is just there because we don’t want to erroneously tack '?' characters onto URLs that never had them.)
The immediate effect of the code below will be a newly tidy URL in the browser’s location bar:
var query = params.toString() ? '?' + params.toString() : ''
var cleanUrl = window.location.pathname + query + window.location.hash
window.history.replaceState(null, '', cleanUrl)
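If you want to poke at the filtering and URL-building steps outside a browser (in Node or a unit test, say), they can be wrapped in a pure function. This is only a minimal sketch: stripUtmParams is a hypothetical name I made up for illustration, not part of Matomo or any library, and it assumes the global URL and URLSearchParams classes are available.

```javascript
// Hypothetical helper (not from Matomo or any library): returns the
// path + query + hash of `href` with every utm_-prefixed query parameter
// removed, or null when there was nothing to clean.
function stripUtmParams(href) {
  var url = new URL(href)
  var params = new URLSearchParams(url.search)
  var removedAny = false

  // Copy the keys into an array first so deleting doesn't disturb iteration
  Array.from(params.keys()).forEach(function (key) {
    if (key.indexOf('utm_') === 0) {
      removedAny = true
      params.delete(key)
    }
  })

  if (!removedAny) return null

  // Only add '?' back when some non-utm_ parameters survived
  var query = params.toString() ? '?' + params.toString() : ''
  return url.pathname + query + url.hash
}
```

Returning null when nothing was removed mirrors the urlShouldBeCleaned flag: the caller can skip replaceState entirely on URLs that were already tidy.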
And finally, we set a special variable (actually, queue up a function call) with the Matomo JavaScript tracker that allows us to override the URL. This way, Matomo will track a page view for the originally-loaded URL (utm_ query parameters and all):
_paq.push(['setCustomUrl', originalUrl])
If you’re using an analytics platform other than Matomo, figuring out how to accomplish the above is the only work that remains. Maybe your platform provides an option analogous to Matomo’s setCustomUrl. Maybe you’ll be able to simply defer the call to replaceState using setTimeout. Maybe you’ll need to wait some additional amount of time until you’re sure your tracker’s home has been phoned.
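For a platform with no URL override, the "track first, tidy later" idea might look something like the sketch below. Everything here is an assumption: trackThenClean is a hypothetical helper, trackPageView stands in for whatever call your platform actually uses, and the 500 ms default delay is a guess rather than a measured value. The optional win argument only exists so the function can be exercised with a fake window object.

```javascript
// Hypothetical helper: fire the platform's own tracking call while the
// original URL is still in place, then rewrite the URL after a delay.
// Returns the timer id so the caller can cancel the rewrite if needed.
function trackThenClean(trackPageView, cleanUrl, delayMs, win) {
  win = win || window
  var delay = typeof delayMs === 'number' ? delayMs : 500 // assumed default

  // Let the tracker see the utm_ parameters first
  trackPageView()

  // Tidy the location bar only after the delay has passed
  return setTimeout(function () {
    win.history.replaceState(null, '', cleanUrl)
  }, delay)
}
```

A fixed delay can't guarantee the tracker's request has actually gone out, so if your platform offers a callback for a completed page view, calling replaceState from there would be the safer trigger.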
Little touches like this are worth it
This entire endeavor only took me an hour or two, which honestly just leaves me more confused as to why more web sites and applications don’t do more to actively tidy up the implementation details they tack onto their URLs.
Writing great software is all about sweating the small stuff, and savvy users appreciate the attention to detail—even the ones they’ll never explicitly notice, like the absence of bogus query parameters that don’t mean anything to them.