feat: truncate parsed uploads to prevent database and frontend blocking caused by excessively large files #3914

Merged
Cristhianzl merged 53 commits into main from cz/limitCsvView on Sep 27, 2024

Conversation

Collaborator

@Cristhianzl Cristhianzl commented Sep 25, 2024

This pull request introduces a feature to limit the size of parsed files during uploads. The primary goal is to prevent the database and the front end from being blocked by excessively large files, which can cause performance issues and degrade user experience.

  • Implemented a recursive function that walks schemas and truncates any field whose value exceeds the maximum character length.

  • Introduced frontend and backend safeguards that truncate excessively long values, preventing potential crashes in the user interface.

  • Added a new .env variable LANGFLOW_MAX_FILE_SIZE_UPLOAD with a default value of 100 MB. This allows users to customize the maximum file upload size according to their needs.
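
As a rough illustration of the recursive truncation described above, here is a minimal sketch. The names `truncate_long_strings` and `MAX_TEXT_LENGTH` come from the commit messages in this PR, but the body is an assumption and not the merged implementation: the value of the constant, the trailing ellipsis, and the handling of negative limits are all guesses made for the example.

```python
from typing import Any

# Assumed value: 99999 is one of the limits mentioned in the commits,
# but the constant was changed during the PR and its final value is not shown.
MAX_TEXT_LENGTH = 99999


def truncate_long_strings(data: Any, max_length: int = MAX_TEXT_LENGTH) -> Any:
    """Return a copy of `data` with every over-long string shortened.

    Dictionaries and lists are walked recursively; other values are
    returned unchanged.
    """
    if max_length < 0:
        # Assumption: a negative limit disables truncation entirely.
        return data
    if isinstance(data, str):
        if len(data) > max_length:
            # Assumption: mark the cut with an ellipsis.
            return data[:max_length] + "..."
        return data
    if isinstance(data, dict):
        return {key: truncate_long_strings(value, max_length) for key, value in data.items()}
    if isinstance(data, list):
        return [truncate_long_strings(item, max_length) for item in data]
    return data


payload = {"text": "a" * 20, "rows": [{"cell": "b" * 5}, 42]}
result = truncate_long_strings(payload, max_length=10)
```

In this sketch only `text` is shortened; the nested `cell` value and the integer pass through untouched, and the input object is not mutated.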

…t message to display the correct file size limit of 100 bytes instead of 10 bytes
…n VertexBuildResponse class

📝 (schemas.py): Add a new truncate_text helper function to safely truncate text in nested dictionaries
📝 (model.py): Add a new field_serializer method to serialize outputs in TransactionBase class
📝 (model.py): Add a new truncate_text helper function to safely truncate text in nested dictionaries
📝 (model.py): Add a new field_serializer method to serialize data and artifacts in VertexBuildBase class
… instead of 99999

🐛 (model.py): fix truncation length of text fields to 10 characters instead of 99999
🐛 (index.tsx): truncate resultMessage to 99999 characters and add message if text is too long
🐛 (model.py): Fix typo in the path for 'base_retriever' data field
🐛 (index.tsx): Fix logic to correctly handle resultMessageMemoized when it is an object
…s for better clarity and consistency

📝 (model.py): update serialize_outputs and serialize_artifacts functions to use truncate_long_strings for string truncation
📝 (model.py): introduce MAX_TEXT_LENGTH constant for defining the maximum length of text to truncate in the models
… class to use a new helper function truncate_long_strings for better code readability and maintainability
@Cristhianzl Cristhianzl self-assigned this Sep 25, 2024
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Sep 25, 2024
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 25, 2024
…te module to improve code organization and reusability

🔧 (model.py): Import the `truncate_long_strings` function from the correct module to fix the reference error
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Sep 25, 2024
…te long strings in dictionaries and lists to prevent exceeding the maximum text length.
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Sep 25, 2024
@Cristhianzl Cristhianzl marked this pull request as draft September 25, 2024 17:53
Cristhianzl and others added 5 commits September 25, 2024 15:03
…function truncate_long_strings to ensure correct behavior when truncating long strings in various data structures

🐛 (switchOutputView/index.tsx): Fix truncation logic to correctly truncate long strings by adding ellipsis at the end instead of displaying additional text about truncation.
…truncate_long_strings function

📝 (test_truncate_long_strings_on_objects.py): Add additional tests for handling negative, zero, and small max_length values in truncate_long_strings function
♻️ (AssemblyAIFormatTranscript.py): Change typing annotations to lowercase for consistency
♻️ (AssemblyAIListTranscripts.py): Change typing annotations to lowercase for consistency
♻️ (LangChainHubPrompt.py): Remove duplicate import statement to improve code readability
♻️ (model.py): Remove unnecessary blank lines to enhance code readability
♻️ (constants.py): Update MAX_TEXT_LENGTH constant to have consistent spacing
♻️ (util.py): Update string quotes to be consistent throughout the file
♻️ (test_truncate_long_strings_on_objects.py): Update string quotes to be consistent throughout the file
…ettings function to allow configuring maximum file size for uploads
… size upload from utility store to improve code modularity and reusability

🐛 (inputFileComponent/index.tsx): fix error handling logic to display error message when uploading a file fails
…ger used and update MAX_TEXT_LENGTH constant to a higher value
…ileSizeUpload function to handle maximum file size upload in bytes
…Upload method to handle maximum file size upload functionality in the UtilityStoreType
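
The upload limit introduced by these commits might be resolved on the backend roughly as follows. This is a hedged sketch, not the code from the PR: only the variable name `LANGFLOW_MAX_FILE_SIZE_UPLOAD` and its 100 MB default come from the description; the helper names, the 1 MB = 1024 × 1024 bytes conversion, the fallback behaviour for invalid values, and the error message are assumptions.

```python
import math
import os
from typing import Mapping, Optional

ENV_VAR = "LANGFLOW_MAX_FILE_SIZE_UPLOAD"  # name taken from the PR description
DEFAULT_MAX_FILE_SIZE_MB = 100             # default stated in the PR description
BYTES_PER_MB = 1024 * 1024                 # assumption: binary megabytes


def max_file_size_bytes(env: Optional[Mapping[str, str]] = None) -> int:
    """Hypothetical helper: read the limit (in MB) and convert it to bytes."""
    env = os.environ if env is None else env
    try:
        size_mb = float(env.get(ENV_VAR, ""))
    except ValueError:
        size_mb = DEFAULT_MAX_FILE_SIZE_MB
    # Assumption: non-positive or non-finite values fall back to the default.
    if not math.isfinite(size_mb) or size_mb <= 0:
        size_mb = DEFAULT_MAX_FILE_SIZE_MB
    return int(size_mb * BYTES_PER_MB)


def check_upload_size(size_in_bytes: int, env: Optional[Mapping[str, str]] = None) -> None:
    """Hypothetical helper: raise ValueError if a file exceeds the limit."""
    limit = max_file_size_bytes(env)
    if size_in_bytes > limit:
        raise ValueError(
            f"File size exceeds the maximum allowed size of {limit / BYTES_PER_MB:g} MB"
        )
```

Parsing the value as a float keeps fractional limits usable, such as the 0.001 MB limit mentioned in the end-to-end test later in this thread.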
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 27, 2024
Contributor

@ogabrielluiz ogabrielluiz left a comment

LGTM

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 27, 2024
@Cristhianzl Cristhianzl enabled auto-merge (squash) September 27, 2024 14:22
…e is larger than the maximum allowed size in MB
…pload state

🐛 (chatView/index.tsx): add validation to check if file size exceeds maxFileSizeUpload limit before uploading
🐛 (chatView/index.tsx): handle error response when uploading file and display error message
…yStore to check file size before uploading to prevent exceeding the maximum allowed file size

📝 (chatInput/index.tsx): add error handling for file size exceeding the maximum allowed size and display an error alert with the appropriate message
…ng file size validation errors

🔧 (FileInput/index.tsx): Update file input component to use global alert store and utility store for error handling and max file size configuration
…load a file larger than the specified limit of 0.001MB. This test includes mocking API response, checking for required environment variables, interacting with the page elements, and validating error message display.
@Cristhianzl Cristhianzl merged commit 948b150 into main Sep 27, 2024
28 checks passed
@Cristhianzl Cristhianzl deleted the cz/limitCsvView branch September 27, 2024 15:44
Labels
enhancement New feature or request lgtm This PR has been approved by a maintainer size:L This PR changes 100-499 lines, ignoring generated files.

2 participants