This is purely off topic, not Dropbox-related at all; really it's a sync design question. (If this is against the rules, my apologies to the mods.)
I've been struggling with an idea to avoid saving duplicate DBRecords.
We know that record.recordID is the main identifier: it's created once in the record's lifetime and never changes. (Obviously, why would anyone change it?)
Each DBRecord has a text field, record["text"]. All incoming DBRecords are deposited into my local container, keyed by their recordID.
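To make the setup concrete, here's a minimal sketch of that local container, keyed by recordID. The names (`local_store`, `deposit`, the sample IDs) are my own illustration, not the actual SDK:

```python
# Hypothetical sketch of the local cache described above: records keyed
# by their Dropbox-assigned recordID, with the text stored alongside.
local_store = {}

def deposit(record_id, text):
    """Store an incoming record under its recordID (last write wins)."""
    local_store[record_id] = {"text": text}

deposit("rec_001", "buy milk")
deposit("rec_001", "buy milk and eggs")  # same recordID overwrites; no duplicate
```

Keyed this way, a record that arrives twice with the same recordID can't duplicate itself; the problem only appears when the same content comes back under a fresh recordID.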
At some point the User:
- Delinks Dropbox
- Links Dropbox
My understanding is that delinking deletes the whole cache (which is good), and a newly relinked DBDatastore will try to send all those records again.
It's here that I don't really know what the right move is, the one in line with the end user's expectations.
So this is what I planned.
Upon receiving incoming datastores and their records:
- Check whether I have any local data at all. If I don't, assume it's a fresh app and/or device installation: add all DBRecords and check for nothing.
If I do have some local data, the next move is either:
- Add all DBRecords as if they're new, OR
- Check for duplication and only add those that are completely new.
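The branches above could be sketched like this. This is my own illustration of the logic, assuming records are a simple mapping of recordID to text, with duplication checked by comparing text content (since relinked records may carry fresh recordIDs):

```python
def merge_incoming(local, incoming, dedupe=True):
    """Merge incoming records into the local store.

    local / incoming: dicts mapping recordID -> text.
    - Empty local store: fresh install, take everything as-is.
    - dedupe=False: add all incoming records as if they're new.
    - dedupe=True: skip incoming records whose text already exists
      locally (possibly under a different recordID), so only
      completely new content is added.
    """
    if not local or not dedupe:
        local.update(incoming)
        return local
    existing_texts = set(local.values())  # content-based duplicate check
    for rid, text in incoming.items():
        if text not in existing_texts:
            local[rid] = text
    return local
```

For example, merging `{"b": "milk", "c": "eggs"}` into a store already holding `{"a": "milk"}` with `dedupe=True` would add only the "eggs" record.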
Both are doable. But now I'm thinking to myself: what if the user intentionally wants duplicated entries, to make modifications later? So I finally arrive at a conclusion: give users a choice. "Same entries found. Continue adding them as new, or cancel?"
So my question is: is this unnecessary complexity? Should I even be bothering with this? It bothers me because I don't like having these duplicates; I use this app myself, and I don't like deleting them later.
How did the Dropbox engineers handle sync (local datastore transfer, etc.)?
And on a side note: is MD5 still fast enough for a simple text check, or should I look at SHA or something?
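For what it's worth, here's a small sketch of what that check might look like with Python's stdlib `hashlib` (the function name and sample text are my own). For non-security duplicate detection, MD5 is still fast and fine; SHA-256 is barely slower on short strings and avoids collision worries. And for short texts, comparing the strings directly may be cheaper than hashing at all:

```python
import hashlib

def text_digest(text, algo="sha256"):
    """Hash a record's text for a quick equality check.

    This is for duplicate detection, not security, so either
    algorithm works; equal texts always produce equal digests.
    """
    return hashlib.new(algo, text.encode("utf-8")).hexdigest()

text_digest("buy milk", "md5")      # 32 hex chars
text_digest("buy milk", "sha256")   # 64 hex chars
```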
Thanks