Difference between revisions of "Dropbox Crawler"
Revision as of 21:07, 17 February 2013
Personal cloud storage is becoming more and more popular, with Dropbox certainly being the best-known example. It generates a huge amount of Internet traffic, but how does it work? How is it used? What are the possible improvements?
We have been doing research on the usage of Dropbox (see our results here). As a next step, we need to know what type of files people store in the service. This would allow us to understand the impact of some technologies on the system performance and on network traffic, among other things.
We are looking for volunteers to provide us with basic statistics (see below) about the files stored in their Dropbox folders.
Be part of the crowd - Click on the logos to download our client
|Windows - 8.2M||Mac OS X - 34M||Linux 32bits - 7.6M|
|Linux 64bits - 8.4M|
How to run it
- Download the application by clicking on the logo of your operating system
- Decompress the client (only Linux and OS X)
- Double click on the file to run it
If you have OS X Mountain Lion, you may need to right-click on the application after decompressing it, select "Open", and confirm that you want to run the application.
If you have difficulties with the native versions, you can try the Java version. To run it, you need the Java Runtime Environment 6+ on your computer. This version does not support manual proxy configuration. If you are behind a proxy, try the native versions or contact us.
|Java (requires JRE) - 270K|
What will our application do?
- Scan your Dropbox folder
- Calculate basic statistics
- Show you what has been collected for your approval
- Send the statistics to us
The application has been designed to be as simple as possible. If you have any difficulty, please contact us.
What will be logged?
For each file/folder in your Dropbox, the program will collect:
- Size in bytes
- Last modification time
- Mime type of the file
- File extension
- MD5 Hash of both initial and final 8 kbytes of the file
- MD5 Hash of the file name
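The per-file fields listed above could be computed roughly as in the following Python sketch. It is an illustration only, not the client's actual code. The function name `file_record`, the dictionary keys, and the constant `CHUNK` are hypothetical. It assumes that "8 kbytes" means 8192 bytes, that the initial and final blocks are hashed separately, and that the mime type is guessed from the file name (the real client may inspect the file content instead).

```python
import hashlib
import mimetypes
import os

CHUNK = 8 * 1024  # assumption: "8 kbytes" is read as 8192 bytes


def md5_hex(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def file_record(path: str) -> dict:
    """Collect the per-file fields for one file (hypothetical record layout)."""
    st = os.stat(path)
    with open(path, "rb") as fh:
        head = fh.read(CHUNK)
        # Jump to the last CHUNK bytes; for short files this is offset 0,
        # so head and tail cover the same bytes.
        fh.seek(max(st.st_size - CHUNK, 0))
        tail = fh.read(CHUNK)
    name = os.path.basename(path)
    mime, _ = mimetypes.guess_type(path)  # extension-based guess, may be None
    return {
        "size": st.st_size,
        "mtime": int(st.st_mtime),
        "mime": mime,
        "ext": os.path.splitext(name)[1].lower(),
        "md5_head": md5_hex(head),
        "md5_tail": md5_hex(tail),
        # Only the hash of the name is kept, never the name itself.
        "md5_name": md5_hex(name.encode("utf-8")),
    }
```

Note that the record contains neither the file name nor any file content beyond the two digests, which is consistent with the list above.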
The program will also send to us:
- MD5 Hash of Dropbox configuration files (or MAC address if we cannot read the former)
- MD5 Hash of the path of your Dropbox home folder
- Your IP address and operating system version
- Error logs, in case something goes wrong during the data collection
Collected information is sent via plain HTTP (let Wireshark be with you!) to a centralized collection server.
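As a rough illustration of the per-run fields described above, the following Python sketch derives an installation identifier, the hashed home-folder path, and the operating system version. It is not the client's actual code. The names `installation_id` and `run_metadata`, the dictionary keys, and the use of a single configuration file path are assumptions made for this example. The IP address is not gathered here, since the collection server can observe it from the connection.

```python
import hashlib
import os
import platform
import uuid
from typing import Optional


def installation_id(config_path: Optional[str]) -> str:
    """MD5 of the configuration file, or the MAC address as a fallback."""
    if config_path is not None:
        try:
            with open(config_path, "rb") as fh:
                return hashlib.md5(fh.read()).hexdigest()
        except OSError:
            pass  # unreadable configuration: fall through to the MAC address
    # uuid.getnode() returns the hardware address as a 48-bit integer
    # (or a random value if no address can be determined).
    return "%012x" % uuid.getnode()


def run_metadata(dropbox_home: str, config_path: Optional[str] = None) -> dict:
    """Collect the per-run fields (hypothetical record layout)."""
    home = os.path.abspath(dropbox_home)
    return {
        "install_id": installation_id(config_path),
        # Only the hash of the path is kept, not the path itself.
        "md5_home": hashlib.md5(home.encode("utf-8")).hexdigest(),
        "os": platform.platform(),
    }
```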
How will we use this information?
Collected data, postprocessing scripts, and all results will be submitted for publication and made freely available on this website. Thus, anyone will be able to use our data sources for further research.
We will, however, take extra steps to ensure that these datasets contain no sensitive information. Note that the only information that could potentially reveal your identity is your IP address, which we will anonymize. All other statistics cannot be related to the person owning the files.
What will this program NOT do?
- Copy any file or folder off your computer
- Copy any information other than what is listed above
- Install or store anything on your computer
If you have any doubts about the program, you can also take a look at the source code and recompile it on your own (and improve it :))
Client source code
- You can find more information about our work in this paper:
Drago, I., Mellia, M., Munafò, M. M., Sperotto, A., Sadre, R., and Pras, A. (2012). Inside Dropbox: Understanding Personal Cloud Storage Services. Proceedings of the 12th ACM Internet Measurement Conference - IMC'12, Boston, Nov. 2012.
- This page has more information about the data we used in our research so far.
These institutes are running this research: