From File Systems To The Cloud And Back

Cloud storage solutions today are a great alternative to storing data on a local computer or on a NAS. Starting with Amazon S3, such solutions are now offered by a dozen companies, including Microsoft with its Azure Blob Storage.

The advantages of cloud storage are nearly unlimited capacity (use as much as you need, not as much as you have), the distance between the storage and your location (the data won’t be lost in an accident or fire, and access of third parties to your data is severely limited), and a lower cost of data management.

At the same time, cloud storage works in a way that doesn’t match the usual approaches to storage access, such as hierarchical file systems and relational databases. Internally designed as huge tables with an index and a BLOB field for data, cloud storage services don’t give the developer and the user the flexibility that file systems or database management systems offer. The developer has to translate between the data held in the application and the back-end cloud storage.

One more significant disadvantage is the difference between the APIs offered by different services. While most services offer a so-called REST API, this is in fact only a format for requests and responses sent over HTTP. The request commands, parameters and functions offered by the services differ significantly. Because of this, switching between cloud services requires writing separate code for each API.

Finally, the main factor in the acceptance (or rejection) of cloud storage-based solutions is the question of guaranteeing data safety. Though service providers tell us about the encryption used on their side, that encryption is performed on their systems, and there’s no guarantee that it is really reliable or that it is performed at all. So the safety of the data is a real problem and not a fantasy of cloud storage opponents.

Luckily, all of the above problems can be addressed in a simple and very cost-effective way.

Solid File System (SolFS) offers the missing pieces that fit well into cloud storage architecture.

Like most file systems, SolFS is page-based. This means that it operates not on arbitrary sequences of bytes, but on blocks of fixed size (sectors on the disk, pages in memory). This makes it easy to back SolFS with almost any storage.
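
To illustrate what page-based access means in practice, here is a minimal sketch of how a byte-range request maps onto fixed-size pages. The page size of 4096 bytes and the function name `pages_for_range` are assumptions made for this example only; they are not taken from SolFS.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def pages_for_range(offset: int, length: int, page_size: int = PAGE_SIZE) -> range:
    """Return the page numbers touched by reading `length` bytes at `offset`."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if length == 0:
        return range(0)  # nothing to read, so no pages are touched
    first = offset // page_size
    last = (offset + length - 1) // page_size  # page holding the final byte
    return range(first, last + 1)
```

Whatever the application asks for, the storage behind it only ever sees whole pages identified by number, which is why almost any back end will do.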

To make such backing possible, SolFS supports a callback mode, in which it asks your application to store a block to, or retrieve a block from, the back-end storage. So all you need to do is implement two simple functions in your code, “put page #X to the cloud storage” and “retrieve page #X from the storage”, and that’s it: you have a file system in the cloud!

But that’s not all SolFS can offer. The file system offers several advanced features, such as built-in encryption and compression (performed on your side, if you remember the cloud security problem mentioned above), nearly unlimited possibilities for storing metadata (various supplementary information about the main file or data), and SQL-like search for files. Moreover, if you need custom encryption (e.g. using keys stored on cryptographic hardware tokens), this is possible with two other callbacks, “encrypt page #X” and “decrypt page #X”.
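
Returning to the two page callbacks, the following is a rough sketch of their shape, not the actual SolFS callback API. The names `PageBackend`, `InMemoryBackend`, `put_page` and `get_page`, the page size, and the key format are all hypothetical; a dictionary stands in for the cloud bucket, and returning zeros for a page that was never written is an assumption of this sketch.

```python
from typing import Dict, Protocol

PAGE_SIZE = 4096  # assumed page size, for illustration only


class PageBackend(Protocol):
    """Hypothetical shape of the two callbacks: store and fetch one page."""

    def put_page(self, page_no: int, data: bytes) -> None: ...

    def get_page(self, page_no: int) -> bytes: ...


class InMemoryBackend:
    """Stand-in for a cloud bucket: one object per page, keyed by page number."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    @staticmethod
    def _key(page_no: int) -> str:
        # Zero-padded so that object names sort in page order.
        return f"pages/{page_no:012d}"

    def put_page(self, page_no: int, data: bytes) -> None:
        if len(data) != PAGE_SIZE:
            raise ValueError("a page must be exactly PAGE_SIZE bytes")
        self._objects[self._key(page_no)] = bytes(data)

    def get_page(self, page_no: int) -> bytes:
        # Assumption of this sketch: an unwritten page reads back as zeros.
        return self._objects.get(self._key(page_no), bytes(PAGE_SIZE))
```

With a real service, the bodies of `put_page` and `get_page` would become one upload and one download request for the object named by `_key`.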

And what if you need not a file system, but a relational database? No problem either! You can use your favorite DBMS and have it store its files on the virtual disk created by SolFS (System Edition). This way the database files are stored in the cloud storage, and your application works with them via the database management system of your choice.

One more benefit of SolFS is that moving from one cloud storage service to another is as simple as rewriting the two basic functions that store pages to and retrieve pages from the cloud storage.
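
A minimal sketch of that idea, with hypothetical names throughout: two interchangeable back ends expose the same pair of functions, and a small helper copies pages from one to the other. A dictionary and a local directory stand in for two different cloud services; none of this is SolFS code.

```python
import os
import tempfile
from typing import Dict, Iterable

PAGE_SIZE = 4096  # assumed page size, for illustration only


class DictBackend:
    """Stand-in for cloud service A: pages kept in a dictionary."""

    def __init__(self) -> None:
        self._pages: Dict[int, bytes] = {}

    def put_page(self, page_no: int, data: bytes) -> None:
        self._pages[page_no] = bytes(data)

    def get_page(self, page_no: int) -> bytes:
        return self._pages[page_no]


class DirectoryBackend:
    """Stand-in for cloud service B: one file per page in a directory."""

    def __init__(self, root: str) -> None:
        self._root = root

    def _path(self, page_no: int) -> str:
        return os.path.join(self._root, f"{page_no:012d}.page")

    def put_page(self, page_no: int, data: bytes) -> None:
        with open(self._path(page_no), "wb") as f:
            f.write(data)

    def get_page(self, page_no: int) -> bytes:
        with open(self._path(page_no), "rb") as f:
            return f.read()


def migrate(src, dst, page_numbers: Iterable[int]) -> None:
    """Copy the listed pages from one back end to another."""
    for page_no in page_numbers:
        dst.put_page(page_no, src.get_page(page_no))


if __name__ == "__main__":
    old = DictBackend()
    old.put_page(0, b"a" * PAGE_SIZE)
    with tempfile.TemporaryDirectory() as root:
        new = DirectoryBackend(root)
        migrate(old, new, [0])
        print(new.get_page(0) == old.get_page(0))
```

Everything above the two functions stays unchanged when the service changes; only the bodies of `put_page` and `get_page` differ between the two classes.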

You may say that you still need code that works with the cloud. This is correct, but it’s much easier to write code that stores and retrieves fixed-size objects (each page has the same size) by page number than to try to implement a relational database or a file system in the cloud yourself.

If you don’t want to write cloud-specific code at all, we have a solution for you too: CloudBlackbox, a set of components that provide uniform access to various cloud storage services (Amazon S3 and Microsoft Azure at the moment, with more to come). These components also provide enhanced encryption capabilities, such as certificate-based encryption of data.

So if you are moving to the cloud, you don’t need to discard established paradigms and existing code. Adapting them to modern industry offerings is easy and fast.
