Tracking the trackers. Draw connections between scripts and domains on website.
18th February 2019
Script is here → https://github.com/woj-ciech/kupa3
Example graph for Reddit.com
I’m still amazed at how many advertising scripts run on almost any website. They track your every click, create heat maps of your mouse movements, and gather information about your plugins, browser, screen resolution and battery, among other things. With enough information about a user, it’s easy to deanonymise them based on multiple factors that are collected with every visit. Cross-referencing with IP addresses additionally allows trackers to follow you regardless of incognito mode or which browser you use. Do you know which companies are the biggest fish in the business? What do they do with the collected information? What do their scripts look like?
The tool makes it easier to follow each script on a website together with all of its dependencies. Let’s start from the beginning and see how kupa3 can help in your investigation.
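To make the idea of following scripts concrete, here is a minimal sketch of the first step: collecting the scripts a page loads and turning them into graph edges. It is not kupa3’s actual code. The names `ScriptCollector` and `script_edges` and the example.com URLs are made up for illustration, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ScriptCollector(HTMLParser):
    """Collects the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:  # inline scripts have no src and are skipped
                self.sources.append(src)


def script_edges(page_url, html):
    """Return (page host, script host, script URL) tuples for one page."""
    collector = ScriptCollector()
    collector.feed(html)
    page_host = urlparse(page_url).netloc
    edges = []
    for src in collector.sources:
        absolute = urljoin(page_url, src)  # resolves relative and //host paths
        edges.append((page_host, urlparse(absolute).netloc, absolute))
    return edges


# Hypothetical page content, used only to exercise the functions above.
sample = """
<html><head>
<script src="/static/app.js"></script>
<script src="//cdn.example.net/tracker.js"></script>
<script>var inline = 1;</script>
</head></html>
"""
edges = script_edges("https://www.example.com/", sample)
for edge in edges:
    print(edge)
```

Each tuple is one edge from the page to a script. Repeating the same step on every downloaded script would give the dependency chain that the graph visualises.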
Some of the scripts are legitimate: they genuinely improve performance and do not collect information.
As an example, let’s try Vice and its main website.
At first sight the graph may be a little unreadable. The best way to interpret it is the Overview tab in Gephi. You can hover over one of the nodes and the rest will be grayed out, which helps you trace connections and read the node labels clearly. We extracted the subdomain prometheus.vice.com and, as you may notice, there are more subdomains referenced in the JS code.
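Pulling subdomains out of JS code can be sketched with a loose regular expression. This is only an illustration under my own assumptions, not the way kupa3 necessarily does it. The helper names and the sample script are hypothetical, and only prometheus.vice.com comes from the graph above.

```python
import re

# Loose pattern for absolute or protocol-relative URLs inside script text.
# It will also match things that merely look like URLs, so treat hits as leads.
URL_HOST = re.compile(r"(?:https?:)?//([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def hosts_in_script(js_code):
    """Return the sorted set of host names referenced in a script body."""
    return sorted({host.lower() for host in URL_HOST.findall(js_code)})


def subdomains_of(hosts, base_domain):
    """Keep only hosts that are strict subdomains of base_domain."""
    suffix = "." + base_domain
    return [host for host in hosts if host.endswith(suffix)]


# Hypothetical script body; only the prometheus host is taken from the article.
js = '''
var api = "https://prometheus.vice.com/v1/events";
var img = "//images.example-cdn.net/pixel.gif";
var home = "https://vice.com/about";
'''
hosts = hosts_in_script(js)
print(hosts)
print(subdomains_of(hosts, "vice.com"))
```

The second call filters the hosts down to subdomains of the site under investigation, which is how a node like prometheus.vice.com would end up in the graph.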
Let’s jump into the main topic, i.e. tracking advertising companies via their scripts and the connections between them. As previously mentioned, trackers are placed everywhere; whatever site you go to, someone will be tracking you. Before this research I knew about Google Analytics as a key platform for tracking and serving ads, but it turns out that there are a lot of players in this field. The majority of them are hard to trace to a specific owner or company. Moreover, some of the scripts return 404 or 403 errors, which means that they are not accessible directly or that the request lacks a proper cookie or Referer header. Interestingly, some scripts only start tracking you when you add something to your basket or interact with the website in some other significant way. One of the coolest graphs is for nike.com. It has so many trackers that it makes a good example.
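For scripts that answer with 403 or 404 when requested directly, one thing worth trying is to send the request as if it came from the embedding page. The sketch below assumes the standard library’s `urllib`. The helper names and both URLs are hypothetical, and there is no guarantee that any particular tracker will accept such a request.

```python
import urllib.error
import urllib.request


def build_script_request(script_url, page_url, cookie=None):
    """Build a GET request that looks like it came from the embedding page."""
    headers = {"Referer": page_url, "User-Agent": "Mozilla/5.0"}
    if cookie:
        headers["Cookie"] = cookie
    return urllib.request.Request(script_url, headers=headers)


def fetch_status(request, timeout=10):
    """Return the HTTP status code, reporting 403/404 instead of raising."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


request = build_script_request(
    "https://tracker.example.net/t.js",  # hypothetical script URL
    "https://www.example.com/",          # hypothetical embedding page
)
print(request.get_header("Referer"))
```

Calling `fetch_status(request)` with and without the extra headers shows whether the server’s answer depends on them.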
One of the first scripts loaded is for bot detection; in this case Akamai is used. It is highly obfuscated and makes your eyes bleed at first sight. I’ve actually learned that a small war is going on between people who create bots to generate verified Nike+ accounts and the bot detection mechanisms. These accounts can be resold in order to get discounts.
The funny thing was that when I searched for an advertising domain, Google gave me results like „How to remove
The link that aroused my suspicion the most was https://gridsumdissector[.]com/js/Clients/GWD-000673-204DB5/gs.js, which does not respond and is loaded from web.nike.com/neo/main/neo.js and nike.com/neo/main/neo.js. It is probably meant only for Chinese visitors.
The main website redirects to an SSO login, and it’s all in Chinese by default. The last announcement on their website is from 2016, and the API documentation is also in Chinese. The domain is registered to Beijing Innovative Linkage Technology Ltd, located in Beijing. The registrar has earned its place in fraud-reports here: http://fraud-reports.wikia.com/wiki/Beijing_Innovative. One of the subdomains allows directory listing and exposes a ton of scripts that contain references to cntv.cn, whose registrar is listed as CCTV International Network Co., Ltd. https://www.bloomberg.com/research/stocks/private/snapshot.asp?privcapId=99156085 .
Moreover, the IP 218.202.xxx.xx was found as a link, and it belongs to China Mobile. With this evidence, we can say that these two companies cooperate with each other. It’s tough to get any information about Chinese companies, especially when it relates to marketing, advertising and tracking.
Google has indexed some of the documents from their SSO login page as well as the theme of the customer dashboard. Thanks to this, we can take a look at what is behind the curtain.
It’s not the only company that is hard to trace or whose scripts are hard to make full sense of. Other tracking companies that are connected with nike.com:
Do you recognize any of these companies? Do you trust them enough to let them execute code in your browser?
gridsumdissector was used only as an example; there are a lot more with shady connections and adware activity. You can check for yourself by looking into the adrttt[.]com domain (registered in Istanbul). With a little OSINT skill, you can take a look at their infrastructure and linked companies, and then follow the rabbit hole.
The moral of this story is that you never know where your data is going. A lot of companies are engaged in the race for your browsing habits in order to make advertising more and more personal and targeted. Additionally, you can’t be 100% sure what data they collect, because the JavaScript code is obfuscated. It can be reversed, but that’s very time-consuming and not worth the effort. My suggestion is to block them all.
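One simple way to act on that suggestion is a hosts-file blocklist. The sketch below turns defanged domain names like the ones in this post into hosts-file lines. The helper names are hypothetical, the `0.0.0.0` sink address is just a common convention, and a hosts entry only covers the exact name listed, not its subdomains.

```python
def refang(domain):
    """Turn a defanged name such as example[.]com back into example.com."""
    return domain.replace("[.]", ".").strip().lower()


def hosts_entries(domains, sink="0.0.0.0"):
    """Return hosts-file lines that point each domain at a sink address."""
    unique = sorted({refang(domain) for domain in domains})
    return [f"{sink} {domain}" for domain in unique]


# The two domains mentioned in this post, written defanged as above.
blocklist = hosts_entries(["gridsumdissector[.]com", "adrttt[.]com"])
print("\n".join(blocklist))
```

The printed lines can be appended to the system hosts file or fed to a DNS-based blocker. A browser extension with maintained filter lists covers far more ground with less manual work.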
Tracking the trackers. Draw connections between scripts and domains on website. was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.