Saturday, April 12, 2014

Shredded storage does not work well when restoring a previous version

As you may know, SharePoint 2013 has a cool new feature called Shredded Storage. It basically means that when you save a document it is decomposed into shreds, and only these shreds are saved in the SQL database. For example, a 1 MB document can be decomposed into 10 shreds. If you then modify the file, only the "affected shreds" are written again. So if you changed only a small portion of the document, only 1 shred would be affected, and instead of storing 2 MB (1 MB for each version) only 1.1 MB is stored.
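The arithmetic above can be checked with a small Python sketch. It is only a toy model, not SharePoint's actual algorithm: the fixed 100 KB shred size, the position-by-position comparison, and the names `shred` and `new_bytes_for_version` are all assumptions made for illustration, and the document is 1,000 KB so that it splits into exactly 10 shreds.

```python
SHRED_SIZE = 100 * 1024  # hypothetical shred size (100 KB), chosen for the example


def shred(data: bytes, size: int = SHRED_SIZE) -> list:
    """Split a document into fixed-size shreds."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def new_bytes_for_version(previous: bytes, current: bytes) -> int:
    """Bytes that must be written if only the changed shreds are stored."""
    old = shred(previous)
    written = 0
    for index, piece in enumerate(shred(current)):
        # A shred is written again only if it differs from the shred
        # at the same position in the previous version.
        if index >= len(old) or old[index] != piece:
            written += len(piece)
    return written


# Version 1: a 1,000 KB document, which gives 10 shreds of 100 KB each.
v1 = bytes(1000 * 1024)

# Version 2: the same document with a few bytes changed inside one shred.
edited = bytearray(v1)
edited[350 * 1024:350 * 1024 + 5] = b"hello"
v2 = bytes(edited)

stored = len(v1) + new_bytes_for_version(v1, v2)
print(len(shred(v1)))     # 10
print(stored / len(v1))   # 1.1
```

Storing both versions in full would cost 2.0 times the document size; with one affected shred out of ten it costs 1.1 times.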

The technology came from SharePoint 2010, where it had a different name (Cobalt) and Office used the shreds to send only the "delta" to the server. As you can see, in SP2010 the focus was on bandwidth rather than on storage: the server had to do more I/O to merge the delta with the previous file version, and the entire file was still saved in storage. It was not a good implementation.

In SP2013 it works fine, but it suffers from what is, in my opinion, a big logical flaw: it does not work when you restore a previous version.

Let me show it with a simple test.
I have two files that are completely different: one is 1 MB and the other is 5 MB. Both files have the same name, test_shreds.docx.

I first upload the 5 MB file, which creates version 1. A query in MS SQL Server shows that at the SQL level it consumes 5 MB.
Then I upload the 1 MB file as a second version. The query shows that the SQL storage has increased by 1 MB. That's fine so far.
I now upload the 1 MB file again, creating version 3, which is the same as version 2. If Shredded Storage works correctly, the size in SQL should not increase. Running the query again confirms it: it works as expected!

Now let me upload (not modify, just upload) the initial 5 MB file as version 4.

Remember that version 4 is exactly the same as version 1. However, the SQL query now shows that 5 MB have been added to the storage. In my opinion this is not good: it would be great if Shredded Storage worked across ALL versions and not only against the last version.

However, the situation gets even worse. I take version 3 and restore it, and this restore operation creates version 5.

Running the SQL query again shows that the storage has grown once more. It means that Shredded Storage does not work when restoring a version. Just to confirm, I also restored version 4.

I then repeated the SQL query, and the result makes it clear that when restoring a file Shredded Storage is not working.

When you restore a file you NEVER restore the last version but ALWAYS a previous version. So by definition Shredded Storage NEVER works when restoring a file. It is a shame, because each file a user restores could cost nothing in terms of storage, and today it costs 100% of the file size. Even more, in some organizations (maybe in most organizations) the savings from making Shredded Storage work on restore could be bigger than what Shredded Storage saves now. Today SS (Shredded Storage) only helps when version N shares a meaningful part of its content with version N-1, while a restored version always shares all of its content with a version that is already stored.

In my opinion, Microsoft should enable Shredded Storage to work across ALL versions and not only on the last version.
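To make the difference concrete, here is a toy simulation in Python of the six versions from the test above. It is a sketch under assumptions, not SharePoint's real implementation: shreds are fixed 64 KB chunks identified by a SHA-256 hash, the two files are random bytes, and `simulate` and its `all_versions` flag are hypothetical names. With `all_versions=False` a new version is compared only with the previous one, which reproduces the growth described in this post; with `all_versions=True` it is compared with every shred already stored.

```python
import hashlib
import os

MB = 1024 * 1024
SHRED_SIZE = 64 * 1024  # hypothetical shred size, chosen for the example


def shred_ids(data: bytes) -> list:
    """Return (hash, size) for each fixed-size shred of the document."""
    pieces = [data[i:i + SHRED_SIZE] for i in range(0, len(data), SHRED_SIZE)]
    return [(hashlib.sha256(p).hexdigest(), len(p)) for p in pieces]


def simulate(versions: list, all_versions: bool) -> list:
    """Return the number of bytes each version adds to the storage."""
    stored = set()    # shred hashes from every version saved so far
    previous = set()  # shred hashes of the latest version only
    added = []
    for data in versions:
        baseline = stored if all_versions else previous
        current = set()
        cost = 0
        for digest, size in shred_ids(data):
            # Write a shred only if the baseline does not already have it.
            if digest not in baseline and digest not in current:
                cost += size
            current.add(digest)
        added.append(cost)
        previous = current
        stored |= current
    return added


big = os.urandom(5 * MB)    # stands in for the 5 MB test_shreds.docx
small = os.urandom(1 * MB)  # stands in for the 1 MB test_shreds.docx

# v1 big, v2 small, v3 small, v4 big, v5 restore of v3, v6 restore of v4
history = [big, small, small, big, small, big]

print([c // MB for c in simulate(history, all_versions=False)])  # [5, 1, 0, 5, 1, 5]
print([c // MB for c in simulate(history, all_versions=True)])   # [5, 1, 0, 0, 0, 0]
```

In this model the six versions cost 17 MB when only the previous version is considered and 6 MB when all versions are considered, so every re-upload or restore of an older version would be free.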


