Archive for the 'Censorship' Category

Browser fingerprinting attack and defense with PhantomJS

May 18 2015 Published by antitree under Censorship,Intelligence,privacy

PhantomJS is a headless browser that, when driven with Selenium, turns into a powerful, scriptable tool for scraping or automated web testing, even in JavaScript-heavy applications. We’ve known for a long time that browsers are being fingerprinted to identify individual visits to a website. This technology is a common feature of web analytics tools: they want to know as much as possible about their users, so why not collect identifying information?

Attack (or active defense)

The scenario here is that you, as a privacy-conscious Internet user, have taken various steps to hide your IP, maybe using Tor or a VPN service, and you’ve changed the default UserAgent string in your browser. But by using the same browser and visiting similar pages across different IPs, you let the web site track your activities even when your IP changes. Say, for instance, you go on Reddit and you have the same 5 subreddits that you look at. You switch IPs all the time, but because your browser is so individualistic, they can track your visits across multiple sessions.

Lots of interesting companies are jumping on this, not only for web analytics but also from the security point of view. The now Juniper-owned company, Mykonos, built its business around this idea. It would fingerprint individual users, and if one of them launched an attack, it would be able to track them across multiple sessions or IPs by fingerprinting those browsers. They call this an active defense tactic because they are actively collecting information about you and defending the web application.

The best proof-of-concepts I know of are BrowserSpy.dk and the EFF’s Panopticlick project. These sites show what kind of passive information can be collected from your browser and used to connect you to an individual browsing session.
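The tracking described in this section can be sketched in a few lines of Python. This is only an illustration of the idea, not how Mykonos or any analytics product actually works: the attribute names and values are hypothetical, and the IPs come from the documentation address ranges.

```python
import hashlib
import json


def fingerprint(attributes):
    """Hash a dict of browser attributes into a short, stable identifier."""
    # Sort keys so the same attributes always serialize the same way.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical attribute values, for illustration only.
browser = {
    "user_agent": "Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0",
    "screen": "1920x1080",
    "fonts": ["Arial", "Comic Sans MS", "Times New Roman"],
    "timezone_offset": -240,
}

# Two visits from different IPs, made with the same browser.
visit_a = {"ip": "198.51.100.7", "fp": fingerprint(browser)}
visit_b = {"ip": "203.0.113.42", "fp": fingerprint(browser)}

# The IP changed, but the fingerprint links the two visits.
print(visit_a["ip"] != visit_b["ip"], visit_a["fp"] == visit_b["fp"])
```

Changing any single attribute (one extra font, for instance) produces a different hash, which is why real fingerprinting systems tend to compare attributes individually rather than rely on one exact match.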

Defense

The defense against these fingerprinting attacks is, in a lot of cases, to disable JavaScript. But as the Tor Project accepts, disabling JavaScript is in itself a fingerprintable property. The Tor Browser has been working on this problem for years; it’s a difficult game. If you look through BrowserSpy’s library of examples, there are common and tough-to-fight POCs. One is to read the fonts installed on your computer. If you’ve ever installed that cute custom font, it suddenly makes your browser far more identifiable. One of my favorites is the screen resolution. This doesn’t refer to the window size, which is separate; it means the resolution of your monitor or screen. Unfortunately, in the standard browser there’s no way to control this beyond running your system at a different resolution. You might say this isn’t that big of a deal because you’re running at 1920×1080, but think about mobile devices, which have model-specific resolutions that could tell an attacker the exact make and model of your phone.
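To put a number on “more identifiable”, Panopticlick reports each attribute in bits of identifying information. A minimal sketch of that arithmetic follows; the shares used here are made up for illustration and are not measured data.

```python
import math


def surprisal_bits(share):
    """Bits of identifying information for a value seen in `share` of browsers."""
    if not 0 < share <= 1:
        raise ValueError("share must be in the range (0, 1]")
    return -math.log2(share)


# Hypothetical shares: a common desktop resolution vs. a rare phone resolution.
common = surprisal_bits(1 / 4)      # value shared by 1 in 4 browsers
rare = surprisal_bits(1 / 4096)     # value shared by 1 in 4096 browsers

print(common, rare)  # 2.0 12.0
```

If attributes are independent their bits add up, so a handful of rare values is enough to single out one browser among millions.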

PhantomJS

There’s no fix. But like all fix-less things, it’s fun to at least try. I’ve used PhantomJS in the past for automating interactions with web applications. You can write scripts for Selenium to automate all kinds of stuff like visiting a web page, clicking a button, and taking a screenshot of the result. Security Bods (as they’re calling them now) have been using it for years.

To create a simple web page screen scraper, it’s as easy as a few lines of Python. This ends up being pretty nice, especially when your friends send you all kinds of malicious stuff to see if you’ll click it. 🙂 This is very simple in Selenium, but I wanted to attempt to not look so script-y. The example below is how you would change the UserAgent string using Selenium:

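        # Needs these imports at the top of the file (Selenium 2/3 only;
        # webdriver.PhantomJS was removed in Selenium 4):
        #   from selenium import webdriver
        #   from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
        ua = 'Mozilla/5.0 (Windows NT 6.1; rv:31.0) Gecko/20100101 Firefox/31.0'
        # Copy the default PhantomJS capabilities and override the user agent
        dc = dict(DesiredCapabilities.PHANTOMJS)
        dc["phantomjs.page.settings.userAgent"] = ua
        browser = webdriver.PhantomJS(
            "phantomjs",                      # path to the phantomjs binary
            service_args=self.proxysettings,  # PhantomJS command-line flags set elsewhere on this object
            desired_capabilities=dc
        )
        browser.get("")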

Playing around with this started to bring up questions like: since PhantomJS doesn’t in fact have a screen, what would my screen resolution be? The answer is 1024×768.

This arbitrarily assigned value is pretty great, because it means we can replace it with something else. It should be noted that even if you set this value to something different, it doesn’t affect the size of your window. To defend against being “Actively Defended” against, you can change the PhantomJS code and recompile.

PhantomIntegration::PhantomIntegration()
{
    PhantomScreen *mPrimaryScreen = new PhantomScreen();

    // Simulate typical desktop screen: pick one width/height pair at random
    int widths [5] = { 1024, 1920, 1366, 1280, 1600 };
    int heights [5] = { 768, 1080, 768, 1024, 900};
    // rand() % 5 gives an index in 0..4; adding 1 would read past the end
    // of the arrays and never select the first entry.
    // Note: rand() needs to be seeded with srand() somewhere at startup,
    // otherwise every run picks the same sequence.
    int ranres = rand() % 5;
    int width = widths[ranres];
    int height = heights[ranres];

    int dpi = 72;
    qreal physicalWidth = width * 25.4 / dpi;
    qreal physicalHeight = height * 25.4 / dpi;
    mPrimaryScreen->mGeometry = QRect(0, 0, width, height);
    mPrimaryScreen->mPhysicalSize = QSizeF(physicalWidth, physicalHeight);

    mPrimaryScreen->mDepth = 32;
    mPrimaryScreen->mFormat = QImage::Format_ARGB32_Premultiplied;

This picks one of a few common screen resolutions at random every time a new webdriver browser is created. You can test it back at BrowserSpy.

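[Screenshots: BrowserSpy screen resolution results before the change (old) and after it (new), and so on for each new browser.]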
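The same check can also be scripted instead of eyeballing BrowserSpy. The sketch below asks the page’s JavaScript what it sees through Selenium’s `execute_script`. The helper names and the `StubDriver` class are hypothetical; the stub only exists so the example runs without a PhantomJS install, and with a real session you would pass the `browser` object instead.

```python
def reported_screen(driver):
    """Return (width, height) of the screen as the page's JavaScript sees it."""
    width, height = driver.execute_script(
        "return [window.screen.width, window.screen.height];"
    )
    return int(width), int(height)


def reported_window(driver):
    """Return (width, height) of the window viewport, which is a separate value."""
    width, height = driver.execute_script(
        "return [window.innerWidth, window.innerHeight];"
    )
    return int(width), int(height)


class StubDriver:
    """Hypothetical stand-in for a webdriver session, used only for this demo."""

    def __init__(self, screen, window):
        self.screen = screen
        self.window = window

    def execute_script(self, script):
        # Answer screen queries and viewport queries with different values.
        return list(self.screen if "screen" in script else self.window)


demo = StubDriver(screen=(1366, 768), window=(400, 300))
print(reported_screen(demo), reported_window(demo))  # (1366, 768) (400, 300)
```

Creating several browsers in a row and collecting `reported_screen` for each should show the value changing between sessions while the window size stays whatever you set it to.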

And we’ve now spoofed a single fingerprintable value; only another few thousand to go. In the end, is this better than scripting something like Firefox? Unknown. But the offer still stands: if someone at Juniper wants to provide me with a demo, I’ll provide free feedback on how well it stands up to edge cases like me.

Meek Protocol

Sep 07 2014 Published by antitree under Censorship,Tor

The Meek Protocol has recently been getting a lot of attention since the Tor Project made a few blog posts about it. Meek is a censorship evasion protocol that uses a tactic called “domain fronting” to evade DPI-based censorship. The idea is that by proxying connections through a CDN such as Google, Akamai, or Cloudflare (relying on how these services handle the TLS SNI extension and the HTTP Host header), an adversary who wanted to block or drop your connection would need to block connections to the whole CDN, like Google: mutually assured destruction. The goal is a way of connecting to the Tor network that is unblockable even by nation-state adversaries.

SNI and Domain Fronting

SNI is a TLS extension that’s been around for about nine years and has been implemented in all modern browsers at this point. It is the TLS version of virtual hosting: the client names the host it wants during the handshake, so a single server IP can answer for many domains. Similar to virtual hosting’s Host header, SNI provides a hostname inside its extension in the ClientHello message:

Extension: server_name
  Type: server_name (0x0000)
  Length: 21
  Server Name Indication extension
    Server Name list length: 19
    Server Name Type: host_name (0)
    Server Name length: 16
    Server Name: www.antitree.com
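The three length fields in that dump nest inside each other, which is easier to see when the extension is built by hand. The following sketch packs a server_name extension with Python’s `struct` module, following the layout shown above; `build_sni_extension` is a name made up for this example.

```python
import struct


def build_sni_extension(hostname):
    """Pack a TLS server_name extension carrying a single host_name entry."""
    name = hostname.encode("ascii")
    # One list entry: name type (1 byte, 0 = host_name), name length (2 bytes), name.
    entry = struct.pack("!BH", 0, len(name)) + name
    # The list of entries is prefixed with its own 2-byte length.
    server_name_list = struct.pack("!H", len(entry)) + entry
    # Extension header: type 0x0000 (server_name) and the payload length.
    return struct.pack("!HH", 0x0000, len(server_name_list)) + server_name_list


ext = build_sni_extension("www.antitree.com")
ext_type, ext_len = struct.unpack("!HH", ext[:4])
(list_len,) = struct.unpack("!H", ext[4:6])
name_type, name_len = struct.unpack("!BH", ext[6:9])

print(ext_len, list_len, name_len, ext[9:].decode("ascii"))
# 21 19 16 www.antitree.com
```

The hostname is 16 bytes, the entry adds 3 bytes of header to make 19, and the list length field adds 2 more to reach the extension length of 21. All of it travels in the clear in the ClientHello.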

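The dump above shows an ordinary SNI value. In a fronted request the two names are split: the SNI value, which is visible on the wire, names the front domain (for example www.google.com), while the Host header inside the encrypted HTTPS request names the real destination. The server receiving the request looks up that Host value to see if it is a domain it fronts for, and forwards the request to that host.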

You can try this using the actual Meek server that Tor uses:

wget -O - -q https://www.google.com/ --header 'Host: meek-reflect.appspot.com'

You should get a response of “I’m just a happy little web server.” which is what the meek-server default response is.
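For scripting, the same request can be sketched with Python’s standard `http.client`. The helper names are made up for this example, and nothing here is taken from the meek source. It relies on `http.client` not adding its own Host header when the caller supplies one, so the TLS connection goes to the front domain while the Host header names the hidden one. Whether a given front still honors such requests is up to the CDN.

```python
import http.client


def fronted_request_parts(front_domain, hidden_host, path="/"):
    """Return (connect_to, method, path, headers) for a fronted HTTPS request."""
    # The TCP/TLS connection targets the front; only the Host header differs.
    return front_domain, "GET", path, {"Host": hidden_host}


def fetch_fronted(front_domain, hidden_host, path="/", timeout=10):
    """Send the fronted request and return (status, body). Needs network access."""
    connect_to, method, path, headers = fronted_request_parts(
        front_domain, hidden_host, path
    )
    # TLS, and therefore the SNI value, is negotiated with the front domain.
    conn = http.client.HTTPSConnection(connect_to, timeout=timeout)
    try:
        # Our own Host header replaces the one http.client would add.
        conn.request(method, path, headers=headers)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

Calling `fetch_fronted("www.google.com", "meek-reflect.appspot.com")` mirrors the wget command above.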

In terms of Internet censorship, the idea of naming one domain in the SNI and proxying the request through a CDN to another is called Domain Fronting and, AFAIK, is currently only implemented by the Meek Protocol. (That being said, the idea can apply to just about any other protocol or tool. I’ve seen other projects use Meek or something like it.) What Meek provides is a way of using Domain Fronting to create a tunnel for any protocol that needs to be proxied.

Tor and Meek

The Meek Protocol was designed by some of the people involved with the Tor Project as one of the pluggable transports and is currently used to send the entire Tor protocol over a Meek tunnel. It does this using a little bit of infrastructure:

meek-client: This is what a client will use to initiate a tunnel over the Meek protocol.
meek-server: The corresponding server portion that will funnel requests and responses back over the Meek tunnel.
web reflector: In its current form, this takes a fronted request, sees that it is a Meek request, and redirects it to the meek-server. This also makes sure that the tunnel is still running using polling requests.
CDN: The important cloud service that will be fronting the domain. The most common example is Google’s App Engine (appspot.com).
Meek Browser Plugin: In order to make a meek-client request look like a standard TLS request (same TLS extensions) that your browser would make, a browser plugin is used.

[Diagram: all of the above wrapped together.]

This is just how a request is made to a Tor bridge node that’s running the meek-server software. Right now, if you download the latest alpha release of the Tor Browser Bundle, this is how you could optionally connect using Meek.

Polling

You might notice that HTTP, by design, doesn’t maintain any kind of state to keep a connection open for as long as you would like to tunnel your Tor traffic, so the Meek protocol needs to compensate. It does this with a polling method: the client sends a POST request to the server at a specified (algorithmic) interval. This is the main way that data is delivered once the connection has been established. If the server has something to send, it goes in the POST response body; otherwise the response is still sent with a 0-byte body.
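A toy model of that polling loop is sketched below. It is only an illustration: the server is an in-memory stand-in rather than meek-server, and the interval constants and backoff rule are assumptions chosen for the example, not values taken from the meek source.

```python
import collections

# Assumed, illustrative values (not taken from meek's code).
INITIAL_INTERVAL = 0.1   # seconds
MAX_INTERVAL = 5.0       # seconds
MULTIPLIER = 1.5


def next_interval(current, active):
    """Poll quickly while data is flowing, back off while the tunnel is idle."""
    if active:
        return INITIAL_INTERVAL
    return min(current * MULTIPLIER, MAX_INTERVAL)


class FakeMeekServer:
    """In-memory stand-in that holds data waiting to be sent to the client."""

    def __init__(self):
        self.downstream = collections.deque()
        self.received = []

    def handle_post(self, body):
        if body:
            self.received.append(body)
        # Reply with pending data, or a 0-byte body when there is none.
        return self.downstream.popleft() if self.downstream else b""


def poll_once(server, outgoing, interval):
    """Send one POST (possibly empty) and return (reply, next interval)."""
    body = outgoing.popleft() if outgoing else b""
    reply = server.handle_post(body)
    return reply, next_interval(interval, bool(body) or bool(reply))


server = FakeMeekServer()
server.downstream.append(b"cell-from-bridge")
outgoing = collections.deque([b"cell-from-client"])

interval = INITIAL_INTERVAL
for _ in range(3):
    reply, interval = poll_once(server, outgoing, interval)
    print(len(reply), round(interval, 3))
```

The first poll carries data both ways and keeps the interval short; the next two are empty POSTs with empty replies, and the interval grows each time.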

Success Rate

You might notice that there are a few extra hops in your circuit, and it’s true that there is a decent amount of overhead, but for those in China, Iran, Egypt, or the ever-expanding list of other nations implementing DPI-based blocking as well as active probing, this is the difference between being able to use Tor and not. The benefit here is that if you’re watching the connection, all you’ll be able to see is that a client IP made an HTTPS connection to a server IP owned by Google or Akamai. The SNI “server_name” value in the ClientHello names only the front domain, and you cannot see the Host header that names the real destination because it is sent inside the encrypted session. Without that, the connection is indistinguishable from a request to, say, YouTube or Google.

As of now, there do not seem to be a lot of users connecting over the Meek bridge (compared to all Tor users), but it does seem to be increasing in popularity.

Updated Graph

Attacks

While no known attacks exist (besides an adversary blocking the entire CDN), there are some potential weaknesses that are being reviewed. One of the interesting ones is that if an adversary is able to inject a RST packet into the connection, the tunnel would collapse and not re-establish itself. This is unlike a normal HTTP/S client, which would just re-issue the request and not care. This may be a way of fingerprinting the connections over time, but there would be a fairly large cost to other connections in order to perform an attack like this. The other attack of note is traffic correlation based on the polling interval. If the polling interval were static at, for example, 50ms, it would be fairly easy to define a pattern for the Meek protocol over time. Of course that’s not the case in the current implementation, as the polling interval changes dynamically. The other attacks and mitigations can be found on the Tor wiki page.
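To see why a static interval would be easy to spot, here is a small sketch of the measurement an observer might make: the spread of the gaps between requests, relative to their mean. The function name and the sample timings are invented for the example; real traffic would be far noisier.

```python
import itertools
import statistics


def interval_regularity(timestamps):
    """Coefficient of variation of the gaps between requests (0 = perfectly regular)."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")
    return statistics.pstdev(gaps) / statistics.mean(gaps)


# Request times in milliseconds. Static polling: one request every 50 ms.
static_poll = [i * 50 for i in range(20)]

# Dynamic polling: made-up gaps that grow while the tunnel is idle.
dynamic_poll = list(itertools.accumulate([0, 100, 150, 225, 337, 506]))

print(interval_regularity(static_poll))
print(round(interval_regularity(dynamic_poll), 2))
```

The static series scores exactly zero, which is the kind of pattern a classifier can lock onto, while the varying series scores well above it.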

Resources:

https://trac.torproject.org/projects/tor/wiki/doc/meek – main wiki page documenting how to use Tor with Meek

https://trac.torproject.org/projects/tor/wiki/doc/AChildsGardenOfPluggableTransports#meek – in depth explanation of the protocol compared to a standard Tor connection