Added Accept header to image requests

The canonical bot tweeted an image (https://twitter.com/differencebot/status/721395886291558400) containing an advertisement instead of the requisite object. Previously, the only defense against servers serving the wrong image was that we ignore 300 response codes.

When the image URL was loaded in Google Chrome, the server returned a document with a content type of text/html (a type that difference also ignores), and that document executed JavaScript redirecting Chrome to a malware-infested page. difference, however, received a response with content type image/gif, notably different from the URL, which indicated a JPEG image.

It turned out that Chrome sends an Accept header that prioritizes text/html documents over most other content types, and the malicious server used that header to decide what content to serve. Changing difference to send the same header caused the malicious server to serve the text/html document to difference as well, which difference then discarded.

Although the Accept header now in use prioritizes text/html documents over images, servers with legitimate content are not expected to use that information when deciding what document to serve.

The malicious test URL is http://www.northvalleymedicalsupply.com/shop/products_pictures/adj%20hinge%20knee%20brace.jpg.
author: Kelly Rauchenberger <fefferburbia@gmail.com> 2016-04-16 14:22:25 -0400
committer: Kelly Rauchenberger <fefferburbia@gmail.com> 2016-04-16 14:22:25 -0400
commit: ff6d29e7f6b587a2536227834950986dbbcd580b (patch)
tree: 11baa1d8a29ccbc6bdfd7a1323361ab28a26d499
parent: f7b91944738e732ab4bfea50ea0a2fffd92a51a6 (diff)
download: difference-ff6d29e7f6b587a2536227834950986dbbcd580b.tar.gz
difference-ff6d29e7f6b587a2536227834950986dbbcd580b.tar.bz2
difference-ff6d29e7f6b587a2536227834950986dbbcd580b.zip
1 files changed, 9 insertions, 0 deletions
diff --git a/difference.cpp b/difference.cpp
index 66e4550..7ea8b74 100644
--- a/difference.cpp
+++ b/difference.cpp

@@ -94,6 +94,11 @@ int main(int argc, char** argv)
    }
  }
  
+  // Accept string from Google Chrome
+  std::string accept = "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
+  curl::curl_header headers;
+  headers.add(accept);
+  
  std::cout << "Started!" << std::endl;
  for (;;)
  {
@@ -153,6 +158,8 @@ int main(int argc, char** argv)
      curl::curl_ios<std::ostringstream> img1ios(img1buf);
      curl::curl_easy img1handle(img1ios);
      std::string img1url = lstvec[curind];
+      
+      img1handle.add<CURLOPT_HTTPHEADER>(headers.get());
      img1handle.add<CURLOPT_URL>(img1url.c_str());
      img1handle.add<CURLOPT_CONNECTTIMEOUT>(30);
      
@@ -202,6 +209,8 @@ int main(int argc, char** argv)
      curl::curl_ios<std::ostringstream> img2ios(img2buf);
      curl::curl_easy img2handle(img2ios);
      std::string img2url = lstvec[curind];
+      
+      img2handle.add<CURLOPT_HTTPHEADER>(headers.get());
      img2handle.add<CURLOPT_URL>(img2url.c_str());
      img2handle.add<CURLOPT_CONNECTTIMEOUT>(30);
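
The commit message relies on difference discarding any response that is not an image. The diff does not show that check, so the following is only a minimal sketch of what such a filter could look like, using the standard library alone. The helper names `mediaType` and `looksLikeImage` are hypothetical and do not come from difference.cpp, and accepting only status 200 is an assumption rather than the project's actual rule.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not from difference.cpp): reduces a Content-Type
// header value such as " Text/HTML; charset=UTF-8" to "text/html".
std::string mediaType(const std::string& contentType)
{
  // Drop parameters after the first ';' (find returns npos if there is none).
  std::string type = contentType.substr(0, contentType.find(';'));

  // Trim leading and trailing whitespace.
  auto notSpace = [](unsigned char c) { return !std::isspace(c); };
  type.erase(type.begin(), std::find_if(type.begin(), type.end(), notSpace));
  type.erase(std::find_if(type.rbegin(), type.rend(), notSpace).base(), type.end());

  // Media types are case-insensitive, so normalize to lower case.
  std::transform(type.begin(), type.end(), type.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return type;
}

// Hypothetical filter: keep a response only if it succeeded and the server
// labelled it as some kind of image. Assumption: only status 200 is accepted.
bool looksLikeImage(long status, const std::string& contentType)
{
  if (status != 200)
  {
    return false;
  }

  return mediaType(contentType).rfind("image/", 0) == 0;
}
```

With these helpers, `looksLikeImage(200, "text/html; charset=UTF-8")` returns false and `looksLikeImage(200, "image/gif")` returns true. A check of this kind would not have caught the original response on its own, since the server labelled it image/gif; it only helps once the Chrome-style Accept header makes the server serve the text/html document instead.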