I'm writing a file-transfer service that uses the Dropbox API via the Python SDK (8.0.0 release). Most of the time it works, but a couple of times I've encountered errors that I'd like to handle gracefully. I'm looking to see how others might have handled them, and also for some potential reasons why they occur.
Namely, I've received two errors that I'd like to handle:
1. Timeout errors when transferring larger chunks (here 140MB, although the API states 150MB is the maximum). The timeout has not yet occurred when I reduce the chunk size (to, say, 100MB), but obviously I need something more robust than a heuristic. The Python requests library has timeout kwargs, but I don't see a way to pass those through as extra kwargs via the dropbox library. Is there any way to control that? StackOverflow suggests there's an OS-level setting I could change to alter the default timeout (I would rather not do that...).
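One workaround I've been considering, though I haven't verified that the 8.0.0 SDK constructor actually accepts a custom `requests.Session` via a `session` keyword, is a `Session` subclass that injects a default timeout into every request:

```python
import requests


class TimeoutSession(requests.Session):
    """A requests.Session that applies a default timeout to every
    request unless the caller passes one explicitly."""

    def __init__(self, timeout=120):
        super(TimeoutSession, self).__init__()
        self.default_timeout = timeout

    def request(self, method, url, **kwargs):
        # Inject the default timeout only if none was given.
        kwargs.setdefault('timeout', self.default_timeout)
        return super(TimeoutSession, self).request(method, url, **kwargs)


# Hypothetical usage -- ONLY valid if dropbox.Dropbox takes a `session` kwarg:
# client = dropbox.dropbox.Dropbox(token, session=TimeoutSession(timeout=300))
```

This is just a sketch; if the SDK doesn't expose the session, it obviously won't help.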
2. The more common error I've received (even on very small chunks of 10MB) is a connection reset "error", e.g. from my logger:
2017-08-17 15:50:49,218:ERROR:<class 'requests.exceptions.ConnectionError'>
2017-08-17 15:50:49,218:ERROR:('Connection aborted.', error(104, 'Connection reset by peer'))
It's so hit-or-miss that it's difficult to debug, especially since I'm not sure what happens to the upload session (see below).
Now that the problems are described, here is a minimal working snippet (before adding any exception handling):
import os
import dropbox

DEFAULT_CHUNK_SIZE = 100 * 1024 * 1024  # 100 MB

token = 'api token goes here'
client = dropbox.dropbox.Dropbox(token)

local_filepath = '/home/foo/bar/baz.txt'
file_size = os.path.getsize(local_filepath)
path_in_dropbox = '/%s' % os.path.basename(local_filepath)

file_obj = open(local_filepath, 'rb')  # binary mode: the API expects bytes
if file_size <= DEFAULT_CHUNK_SIZE:
    client.files_upload(file_obj.read(), path_in_dropbox)
else:
    i = 1
    session_start_result = client.files_upload_session_start(file_obj.read(DEFAULT_CHUNK_SIZE))
    cursor = dropbox.files.UploadSessionCursor(session_id=session_start_result.session_id,
                                               offset=file_obj.tell())
    commit = dropbox.files.CommitInfo(path=path_in_dropbox)
    while file_obj.tell() < file_size:
        print 'Sending chunk %s' % i
        if (file_size - file_obj.tell()) <= DEFAULT_CHUNK_SIZE:
            print 'Finishing transfer and committing'
            client.files_upload_session_finish(file_obj.read(DEFAULT_CHUNK_SIZE), cursor, commit)
        else:
            print 'Before append, cursor.offset=%d, file_obj is at %d' % (cursor.offset, file_obj.tell())
            client.files_upload_session_append_v2(file_obj.read(DEFAULT_CHUNK_SIZE), cursor)
            cursor.offset = file_obj.tell()
        i += 1
file_obj.close()
For the case of the connection reset (item 2), my main question hinges on what happens to the upload session if the ConnectionError is raised. Is the session "preserved" or do I need to start a new one? If the former, I figure I can do something like the following:
<...snip...>
try:
    current_offset = file_obj.tell()
    client.files_upload_session_append_v2(file_obj.read(DEFAULT_CHUNK_SIZE), cursor)
except requests.exceptions.ConnectionError as ex:
    file_obj.seek(current_offset)
    cursor.offset = current_offset
<...snip...>
The idea here is that if the current chunk fails due to the reset error, I catch the exception and ensure that both the cursor (dropbox.files.UploadSessionCursor) and the file object get reset to the previous offset so the chunk can be tried again. This assumes that the failed call to files_upload_session_append_v2 moves the pointer in the file object (file_obj); I'm not sure whether it does, but the seek seems harmless even if it doesn't. Is there a better/more graceful way to get the chunk to retry?
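For what it's worth, one pattern I've been considering is wrapping that seek-and-reset logic in a generic retry helper with exponential backoff, so each chunk gets several attempts before the whole transfer fails. A sketch (all names here are my own, nothing Dropbox-specific):

```python
import time


def retry_on(exc_types, max_tries=5, base_delay=1.0):
    """Decorator: retry the wrapped callable when it raises one of
    exc_types, sleeping base_delay * 2**attempt between attempts."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            for attempt in range(max_tries):
                try:
                    return fn(*args, **kwargs)
                except exc_types:
                    if attempt == max_tries - 1:
                        raise  # out of retries; surface the error
                    time.sleep(base_delay * (2 ** attempt))
        return wrapper
    return decorator


# Hypothetical chunk sender: rewinds file and cursor before each attempt.
# @retry_on((requests.exceptions.ConnectionError,))
# def send_chunk(client, file_obj, cursor, chunk_size):
#     file_obj.seek(cursor.offset)
#     client.files_upload_session_append_v2(file_obj.read(chunk_size), cursor)
#     cursor.offset = file_obj.tell()
```

This keeps the retry policy in one place, though it still hinges on the session itself surviving the ConnectionError.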