29 Aug 2014

A Terabyte of Cloud Storage!

(Disclosure: I work for Google, but not for the Drive team.)

With Dropbox upgrading all paying users to 1TB quota, I realised that I now have a terabyte of quota on two cloud storage services: Dropbox and Google Drive. Funnily enough, my Macbook has only a 256GB SSD [1] [2].

Imagine for a second the idea of having more storage on the cloud than on your local hard disc. Cloud storage used to be expensive and limited, but the cloud has evolved very fast. I remember when, at the turn of the century, I had a few MB of quota on Yahoo Mail and Hotmail, which was an order or two of magnitude less than the 100 - 200 MB hard disc in the computer I used then. Now the cloud offers a lot more storage, at an affordable price, than my laptop has, and my laptop is a high-end Retina Macbook.

This means that I can’t make use of my full quota by syncing files using the Dropbox or Google Drive app. The only place where I can locally store hundreds of gigabytes, to say nothing of a terabyte, is my external hard discs. And the Dropbox and Drive sync apps don’t work, and aren’t designed to work, with external hard discs [3].

Which means having to use the web UI to drag and drop folders to upload them. Google Drive’s web UI successfully uploads folders that are a few hundred gigs. It may take a few days, even on a high-bandwidth connection, but as long as you keep the tab open, and you’re on a stable network, and you’re using a desktop rather than a laptop that goes to sleep often, the upload will eventually complete.

Dropbox is not as successful. When I drag in a folder that’s more than 200 GB, Dropbox claims it’s 180 bytes, takes quite a while to upload it, and fails. And when I try it in Chrome, it immediately refuses, saying it can upload only 300 files at once.

I have had to come up with hacky workarounds to get the native app to work. Before I explain how I did it, you need to understand that I have a folder named “backups not synced down” in my Dropbox, which I use for backing up stuff, and which itself does not sync down to the Dropbox folder on my machines. I wanted to add a few hundred gigs to this folder. The web UI didn’t work, and the native app wouldn’t let me add things to this folder either, because the folder itself is not available in my local Dropbox.

The hack I came up with was to create a temporary folder in Dropbox on my desktop, configure my laptop to not sync that folder (because the laptop doesn’t have that much space), and then, on my desktop, copy the few hundred gigs from my external hard disc to the temporary folder in my Dropbox. Thankfully, the desktop has almost a terabyte of free space on its internal hard disc (where my Dropbox is stored), for this game to work. Then, after a few days, when the upload completes, I use the web UI to move the newly uploaded stuff to another folder, and delete the temporary folder.

Dropbox should just make its web UI work. And not just for uploading files. The Dropbox and Drive web UIs lack basic functionality such as being able to see the size of a folder. If users are going to be extensively using the web UIs, the web UIs should become as powerful as native file managers: being able to see the size of a folder, being able to sort folders by size or by any other criterion, spring-loaded folders, tabs, and so on.

Not only is local storage far more limited than cloud storage, but the pipe between them is a bigger bottleneck than either of the ends [4]. This means that even if I somehow manage to upload a terabyte to Dropbox, it will be hard to download it. I have an Internet connection with a 40GB fair-use limit, which means that it would take me more than two years to download my entire Dropbox, even if I used my connection for nothing else, and used my entire fair-use limit each month for downloading Dropbox.
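
For the curious, the two-year figure is simple division. Here is a minimal sketch of the arithmetic, assuming a terabyte is counted as 1000GB and that the data cap is the only limit (the function name is just something I made up for illustration):

```python
import math

def months_to_transfer(total_gb: float, monthly_cap_gb: float) -> int:
    """Whole months needed to move total_gb when only the monthly cap limits you."""
    if monthly_cap_gb <= 0:
        raise ValueError("monthly_cap_gb must be positive")
    return math.ceil(total_gb / monthly_cap_gb)

months = months_to_transfer(1000, 40)  # 1 TB taken as 1000 GB, 40 GB cap
years = months / 12
print(months, round(years, 2))  # 25 2.08
```

Twenty-five months, and that is with the connection used for nothing else.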

If it’s hard to download your files, it forces you into a backup rather than a sync usage model. Sure, you could sync a few tens of GB, but not a terabyte. The rest will just be uploaded once and left alone after that, to be restored in an emergency [5].

Finally, ISPs should also evolve. Connections that are slow and have low data caps prevent users from taking full advantage of the cloud storage they paid for [6] [7]. A slow connection, like 10 Mbps, that is unmetered may not be ideal, but it would at least be acceptable. A 100 Mbps connection with a 300GB or so limit is also acceptable. It’s the combination of low speeds and low data caps that really hurts [8] [9] [10].
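
To put rough numbers on those two cases, here is a back-of-the-envelope sketch. It assumes decimal units (1GB = 10^9 bytes), a link that actually sustains its advertised speed, and no protocol overhead, none of which is quite true in practice. The function names are hypothetical.

```python
def days_by_bandwidth(total_gb: float, mbps: float) -> float:
    """Days needed to move total_gb over a link sustaining mbps megabits/second."""
    bits = total_gb * 8e9            # decimal gigabytes to bits
    seconds = bits / (mbps * 1e6)    # megabits/second to bits/second
    return seconds / 86400

def months_by_cap(total_gb: float, cap_gb: float) -> float:
    """Months of data allowance that total_gb would consume."""
    return total_gb / cap_gb

slow_unmetered_days = days_by_bandwidth(1000, 10)   # a terabyte at 10 Mbps
fast_link_days = days_by_bandwidth(1000, 100)       # a terabyte at 100 Mbps
fast_capped_months = months_by_cap(1000, 300)       # but under a 300 GB cap

print(round(slow_unmetered_days, 1))   # 9.3
print(round(fast_link_days, 1))        # 0.9
print(round(fast_capped_months, 1))    # 3.3
```

So the slow unmetered line moves a terabyte in under ten days, and the fast capped one needs a bit over three months of allowance. Either is tolerable next to 25 months.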

Taking a step back from these details, it’s amazing that you can now get a terabyte of cloud storage at an affordable price.

[1] To say nothing of tablets and phones. Despite splurging for 64GB tablets, that’s still a fraction of a terabyte.

[2] OS X should support compression of internal storage. With SSDs, we have limited space, at a high price. Make the most of this space, using the relatively abundant RAM and CPU to compress the storage. And it should be automatic. If my mom buys a Macbook at a shop, she should be able to add more data than fits, and have the OS automatically compress it to fit it all in.

[3] For good reasons: external discs keep appearing and disappearing, they can be connected to different machines at different times, all of which may have Dropbox running, maybe even a Dropbox signed in to another account. Even if a given external disc is always connected to the same machine, rather than to different machines at different times, the simplicity of having a single folder that syncs goes away. What data from Dropbox should sync with your internal storage, and what with your external storage?

And would this mapping be the same on all your devices? If I have a Macbook with 256GB internal storage and a 2TB external hard disc, I might choose to put all the big folders on the external disc. Now suppose I buy a desktop PC with a 4TB internal hard disc. There, I may want to sync my entire Dropbox to the internal hard disc.

And so on. Supporting external hard discs natively in the Dropbox and Drive sync apps opens a can of worms that destroys the simplicity of these services, resulting in something even geeks will find hard to use. In other words, this will make the services fail.

[4] Dropbox thankfully uses the network efficiently, by compressing files before transferring them, and by using delta encoding to transfer only changed parts of files.

[5] As a backup tool, Dropbox should not charge your quota twice if you upload the same large file twice, say by backing up your external hard disc every few months. Since Dropbox dedupes the files you upload anyway, and so does not incur additional server-side storage or network costs for the second copy, they should pass the savings on to you.
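
What I have in mind for [5] can be sketched in a few lines. This is only an illustration of the accounting I’m asking for, not how Dropbox actually works: it assumes whole-file hashing, and `QuotaLedger` is a made-up name.

```python
import hashlib

class QuotaLedger:
    """Charges quota once per distinct file content (hypothetical sketch)."""

    def __init__(self):
        self._sizes = {}  # SHA-256 hex digest -> size in bytes

    def upload(self, data: bytes) -> int:
        """Record an upload and return how many bytes were charged to quota."""
        digest = hashlib.sha256(data).hexdigest()
        if digest in self._sizes:
            return 0  # content already stored, so nothing extra is charged
        self._sizes[digest] = len(data)
        return len(data)

    @property
    def used(self) -> int:
        return sum(self._sizes.values())

ledger = QuotaLedger()
backup = b"x" * 1000
first = ledger.upload(backup)    # charged in full
second = ledger.upload(backup)   # same content again, charged nothing
print(first, second, ledger.used)  # 1000 0 1000
```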

[6] Dropbox consolidated all their tiers of paid accounts into only one: 1 TB for $100 per year. I don’t need a terabyte, and I would prefer to pay a smaller amount of money than $100. Say $50 per year. The storage can be proportionately less: 500GB. Or even 250GB — that’s more cloud storage than I really need, anyway. Google Drive does the right thing by offering 100GB of storage for $24 per year.

[7] Dropbox and Google Drive should also handle trash better. When I move something into the trash, keep it in trash indefinitely, instead of for just a month or something, as long as I’m under quota. After all, I’ve already paid for the storage, so let me use it. Trash should get emptied only when you’re about to exceed your quota, and never otherwise.
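
The policy in [7] is easy to state in code. A minimal sketch, assuming trash counts towards quota and the oldest trashed items go first; `make_room` and its parameters are names I invented for illustration.

```python
from collections import deque

def make_room(live_bytes, trash, quota_bytes, incoming_bytes):
    """Purge the oldest trashed items only until the incoming upload fits.

    trash is a deque of (name, size) pairs, oldest first, and is modified
    in place. Returns the names of the items that were purged.
    """
    purged = []
    used = live_bytes + sum(size for _, size in trash)
    while trash and used + incoming_bytes > quota_bytes:
        name, size = trash.popleft()
        used -= size
        purged.append(name)
    return purged

trash = deque([("old-photos", 30), ("old-videos", 30)])
print(make_room(30, trash, 100, 5))    # [] because everything still fits
print(make_room(30, trash, 100, 20))   # ['old-photos']
```

Nothing is purged while you are under quota, and when you are about to exceed it, only as much as is needed.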

[8] Your unused data limit should get rolled over to the next month, and the month after that, for at least a year. If I have a 200GB fair use cap, but I use only 20GB, except for once a year when I upload a terabyte, that should be okay. After all, the point of data caps is to deter excessive use of the network. ISPs should do that with the minimum pain caused to customers.
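
Here is a quick check that the example in [8] works out. It is a sketch under my own assumptions: unused allowance stays valid for 12 months including the current one, the oldest allowance is spent first, and a terabyte is 1000GB. The function name is hypothetical.

```python
def within_rollover_cap(monthly_cap_gb, usage_gb, window_months=12):
    """True if every month's usage fits the cap plus rolled-over allowance."""
    bank = []  # unused allowance per month, oldest first
    for used in usage_gb:
        # Expire allowance that would fall outside the window this month.
        while len(bank) >= window_months:
            bank.pop(0)
        bank.append(monthly_cap_gb)
        remaining = used
        for i in range(len(bank)):  # spend the oldest allowance first
            take = min(bank[i], remaining)
            bank[i] -= take
            remaining -= take
        if remaining > 0:
            return False
    return True

usage = [20] * 11 + [1000]  # eleven light months, then a terabyte upload
print(within_rollover_cap(200, usage))                    # True
print(within_rollover_cap(200, usage, window_months=1))   # False
```

With rollover, the eleven light months leave 1980GB banked, so the terabyte fits comfortably. Without it (a window of one month), the same customer blows through the cap.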

[9] Dropbox and friends should have a way for me to tell them which networks are metered, and which are not, like my office network. If Dropbox is going to sync a hundred GB, it had better not use up my entire month’s quota for it.

[10] There should be a way for ISPs to work with these sync apps to schedule background transfers during off-peak times, when there’s plenty of network capacity, in return for the transfers not counting towards your fair use limit. Of course, it would be completely under your control as a user whether you want to take advantage of this offer. But it will help a lot of use cases: Dropbox syncs, OS and app updates, downloading movies and the like, whether licensed or not…
