Hi everyone,
I hope I can get some ideas here as I can't figure this out.
Background
We upload all our projects to Dropbox. Since we use software for project creation, the projects generally have similar folder structures (though they can vary slightly depending on the version). The software creates SQLite databases along with additional files in subfolders (Video, Pictures, PDF Reports). The database holds the name of each file created for each inspection. Each project has a unique ID (internal at the company level), and I have a list of the active project IDs for which I need to create links on an hourly/daily basis, or sometimes after a QC for delivery of the whole project. However, I usually don't know who did the project or which folder it is in, as they are a different department and a pretty ****** team spread across many states.
To solve this, I wrote an initial version of the scripts to run on a local server where the Dropbox app syncs constantly, and it works really fast. I used the os module to search by keyword and retrieve the paths to the databases and to each file created (so it is a fast keyword search over anything in the project folders), and I write the results into the exports.
DB Search:
import os

# Globals populated by the search (defined elsewhere in the script)
location = []      # root folders to scan
filelist_gc = []   # full paths of matching .db3 files
databases = []     # database names without extension

def find_files():
    excluded_folders = ['BACKUP', 'DOCU', 'MISC']
    excluded_file_keywords = ['COPY', 'TRASH', 'ISSUED WORKS']
    included_file_keyword = ['N1023011']  # change the keywords
    for x in location:
        for root, dirnames, files in os.walk(x):
            for file in files:
                if file.endswith(".db3"):
                    fname = os.path.splitext(file)[0].upper()
                    root = root.upper()
                    filelocation = os.path.join(root, file).upper()
                    # Skip excluded folders/file keywords, keep only the wanted project IDs
                    if all(keyword not in root for keyword in excluded_folders) \
                            and all(keyword not in filelocation for keyword in excluded_file_keywords):
                        for keyword in included_file_keyword:
                            if keyword in fname:
                                if filelocation not in filelist_gc:
                                    filelist_gc.append(filelocation)
                                if fname not in databases:
                                    databases.append(fname)

find_files()
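Once a .db3 is found, the stored file names can be pulled with the sqlite3 module. The table and column names below are placeholders, since the real schema varies by software version; list the tables first to find the right one:

```python
import sqlite3

def list_tables(db_path):
    """List the tables in a .db3 so the right one can be queried for file names."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name").fetchall()
    return [r[0] for r in rows]

def read_file_names(db_path, table, column):
    """Pull the stored file names; table/column are placeholders for the real schema."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(f'SELECT "{column}" FROM "{table}"').fetchall()
    return [r[0] for r in rows]
```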
Here is the search for the files that I've exported from the DB:
file_paths = {}  # file name -> full local path
for directory in directories:
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.lower().endswith(('.pdf', '.mpg', '.mp4', '.jpg', '.png', '.avi')):
                file_paths[file] = os.path.join(root, file)

# Iterate over the rows to update
for index, row in rows_to_update.iterrows():
    # Get the project name and file name
    project_name = row['Project_Name']
    file_name = row['FileName']
    # Check if the file is in the dictionary
    if file_name in file_paths:
        # Found the file, update the Local_Path column
        local_path = file_paths[file_name]
        # Keep the path only if it belongs to the project and is not in the deleted files
        if str(project_name) in local_path and "Trash" not in local_path:
            df.at[index, 'Local_Path'] = local_path
Afterward, I remove the local prefix (for example C:/Users/ScriptingUser), leaving me with the Dropbox path /root/folder/folder/file.pdf, and I use that to create the shared links with settings. Here is the code for that (I am sharing them for a period of 10 years):
if pd.isna(dropbox_url):
    expires = datetime.datetime.now() + datetime.timedelta(days=3650)  # ~10 years
    requested_visibility = dropbox.sharing.RequestedVisibility.public
    desired_shared_link_settings = dropbox.sharing.SharedLinkSettings(
        requested_visibility=requested_visibility, expires=expires)
    try:
        shared_link = create_shared_link_with_retry(dbx, account_id, path, desired_shared_link_settings)
        if shared_link != "AuthError":
            df.at[index, 'Dropbox_URL'] = shared_link
            print(f"{row['Local_Path']}|{shared_link}", '|',
                  f"Processed row {index + 1}/{len(project_group)}. URL {'created' if shared_link else 'failed'}.")
            success_links[path] = shared_link
        else:
            # Auth failed: apply the links collected so far, save, and stop
            print(f"Authentication error {path}.")
            for success_path, link in success_links.items():
                df.loc[df['Local_Path'] == success_path, 'Dropbox_URL'] = link
            with pd.ExcelWriter(excel_file_path) as writer:
                df.to_excel(writer, sheet_name='Sheet1', index=False)
            break
    except Exception as e:
        # Apply the collected links before saving, otherwise the write loses them
        for success_path, link in success_links.items():
            df.loc[df['Local_Path'] == success_path, 'Dropbox_URL'] = link
        with pd.ExcelWriter(excel_file_path) as writer:
            df.to_excel(writer, sheet_name='Sheet1', index=False)
        print(f"Error occurred while processing row {index}: {e}")
        break
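For reference, the prefix stripping is just path arithmetic; a pathlib helper does it cleanly (the sync root below is a placeholder based on my example above):

```python
from pathlib import PurePosixPath, PureWindowsPath

# Placeholder; point this at your actual Dropbox sync folder.
SYNC_ROOT = "C:/Users/ScriptingUser/Dropbox"

def to_dropbox_path(local_path: str, sync_root: str = SYNC_ROOT) -> str:
    """Strip the local sync-root prefix and return a Dropbox-style /folder/file path."""
    relative = PureWindowsPath(local_path).relative_to(PureWindowsPath(sync_root))
    return "/" + str(PurePosixPath(*relative.parts))
```

relative_to raises ValueError if the file is not under the sync root, which is a useful sanity check before calling the sharing API.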
As I am the only person in the company doing this, and I have a lot of additional tasks, I started rewriting all the automations for a cloud environment, since we want to shut down the server. Long story short, I managed to migrate everything except the link creation (we need the links for invoicing, analytics, etc.).
Any ideas on how I can achieve the same using only the API without the local searches?
I tried this, but it takes a long time to search anything, as the total Projects folder contains millions of files and more than 70 TB of data:
def list_folder_contents(dbx, folder_path, extensions):
    contents = []
    try:
        result = dbx.files_list_folder(folder_path, recursive=True)
        while True:
            for entry in result.entries:
                if isinstance(entry, dropbox.files.FileMetadata):
                    if any(entry.path_display.lower().endswith(ext) for ext in extensions):
                        contents.append({
                            'path': entry.path_display,
                            'name': entry.name,
                            'type': 'file'
                        })
            if not result.has_more:
                break
            result = dbx.files_list_folder_continue(result.cursor)
    except dropbox.exceptions.ApiError as err:
        print(f"Failed to list folder contents: {err}")
    return contents

def search_file(contents, folder_name, file_name):
    for item in contents:
        if folder_name in item['path'] and file_name in item['name']:
            return item['path']
    return None

def create_shared_link(dbx, file_path):
    try:
        link_metadata = dbx.sharing_create_shared_link_with_settings(file_path)
        return link_metadata.url
    except dropbox.exceptions.ApiError as err:
        print(f"Failed to create shared link: {err}")
        return None
Another thing I was thinking about is triggering the link creation with webhooks: whenever a .db3, .pdf, .avi, .mp4, or .mpg file is uploaded to Dropbox, run the script and store the results in the DB (filename, project, link).
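If I go the webhook route, the endpoint has to answer Dropbox's GET verification by echoing the challenge query parameter, and each POST notification should be authenticated by comparing the X-Dropbox-Signature header against an HMAC-SHA256 of the raw request body keyed with the app secret. A framework-agnostic stdlib sketch of those two pieces (the secret is a placeholder):

```python
import hashlib
import hmac

APP_SECRET = b"your-app-secret"  # placeholder; load the real secret from config

def handle_challenge(challenge):
    """GET verification: Dropbox expects the 'challenge' parameter echoed back as-is."""
    return challenge

def verify_dropbox_signature(signature_header, raw_body, secret=APP_SECRET):
    """POST notifications: X-Dropbox-Signature is the hex HMAC-SHA256 of the raw body."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)
```

After a verified notification, the handler would call files_list_folder_continue with a stored cursor to see which files changed, then create links only for the new ones.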