Tags: ding113/dify-ooxml-tool
Update version to 0.1.0 and enhance OOXML text extraction and rebuilding tools

- Incremented the version from 0.0.8 to 0.1.0 to reflect new features and improvements.
- Added a new parameter for text extraction granularity in the OOXML text extraction tool, allowing users to choose between 'element', 'run', and 'paragraph' levels for better control over text processing.
- Enhanced logging in the OOXML document rebuilding process to provide detailed progress updates during document reconstruction.
- Updated the OOXML parser to support the new text unit level parameter, ensuring consistent extraction behavior across different document types.

These changes allow more precise text handling during the extraction and rebuilding processes.

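To illustrate what the three granularity levels could mean, here is a minimal sketch for WordprocessingML only. The function name `extract_units` and the `level` argument are hypothetical and are not taken from the plugin; the actual parser may split text differently.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (used by .docx document parts).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _joined_text(node):
    """Concatenate the text of every <w:t> descendant of node."""
    return "".join(t.text or "" for t in node.iter(f"{{{W}}}t"))


def extract_units(xml_text, level="element"):
    """Return text units at the requested level (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    if level == "element":      # one unit per <w:t>
        return [t.text or "" for t in root.iter(f"{{{W}}}t")]
    if level == "run":          # one unit per <w:r>
        return [_joined_text(r) for r in root.iter(f"{{{W}}}r")]
    if level == "paragraph":    # one unit per <w:p>
        return [_joined_text(p) for p in root.iter(f"{{{W}}}p")]
    raise ValueError(f"unknown level: {level!r}")


SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    "<w:r><w:t>Hel</w:t><w:t>lo </w:t></w:r>"
    "<w:r><w:t>world</w:t></w:r>"
    "</w:p></w:body></w:document>"
)
```

Coarser levels give a translator more context per unit, while finer levels keep each unit tied to a single XML node.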
Refine space addition logic in OOXMLRebuilder to handle digit adjacency

- Enhanced the space addition logic to prevent adding spaces between adjacent digits, improving text formatting accuracy.
- Introduced a new variable to streamline the space addition checks.
- Updated logging to report space rule overrides for digit-to-digit transitions.

These changes improve the handling of numerical sequences during translation.

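The digit override can be sketched as a check on the two characters where adjacent segments meet. The function name `needs_space` and the ASCII-only restriction are assumptions made for this example, not the rebuilder's actual rules.

```python
def needs_space(prev_text, next_text):
    """Decide whether to insert a space between two adjacent segments.

    Hypothetical sketch: only the last character of prev_text and the
    first character of next_text are inspected.
    """
    if not prev_text or not next_text:
        return False
    left, right = prev_text[-1], next_text[0]

    # Never double up on existing whitespace.
    if left.isspace() or right.isspace():
        return False

    # Digit-to-digit override: "20" + "25" must stay "2025".
    if left.isdigit() and right.isdigit():
        return False

    # Assumption: only ASCII letters/digits on both sides call for a space,
    # so punctuation and CJK boundaries are left untouched.
    def ascii_alnum(ch):
        return ch.isascii() and ch.isalnum()

    return ascii_alnum(left) and ascii_alnum(right)
```

Under these assumptions, `needs_space("Hello", "world")` is true, while `needs_space("20", "25")` and `needs_space("Hello,", "world")` are false.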
Enhance OOXMLRebuilder with detailed logging for space processing and XML replacements

- Configured a dedicated logger for the OOXMLRebuilder to enable detailed debug output during space processing and XML text replacements.
- Added debug statements to track the space check logic, including character analysis and the resulting decisions.
- Improved logging for XML replacements to capture the original and final text, as well as any issues encountered during the replacement process.

These changes make the OOXML document rebuilding process easier to trace, debug, and maintain.

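A dedicated debug logger of this kind can be set up with the standard `logging` module. The logger name `ooxml_rebuilder_sketch` and the message format below are illustrative assumptions, not the names used in the plugin.

```python
import logging


def get_rebuilder_logger(name="ooxml_rebuilder_sketch"):
    """Return a DEBUG-level logger with its own handler (hypothetical name)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # Add the handler only once so repeated calls do not duplicate output.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    # Keep the debug output separate from the root logger's handlers.
    logger.propagate = False
    return logger


log = get_rebuilder_logger()
log.debug("replace: original=%r final=%r", "Hello", "Bonjour")
```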
Refine segment validation in UpdateTranslationsTool for improved logging

- Updated the segment validation logic to ensure that only segments with a valid ID are processed.
- Adjusted warning messages to specify when a segment is invalid due to a missing ID, making such cases easier to debug.

These changes make segment handling in the translation update process easier to follow.

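The validation described above might look roughly like the following. The segment shape (a dict with an `id` key) and the function name `filter_valid_segments` are assumptions for illustration only.

```python
import logging

logger = logging.getLogger("update_translations_sketch")  # hypothetical name


def filter_valid_segments(segments):
    """Keep only segments that carry a non-empty ID; warn about the rest."""
    valid = []
    for index, segment in enumerate(segments):
        segment_id = segment.get("id") if isinstance(segment, dict) else None
        if segment_id is None or segment_id == "":
            # Name the exact reason so the log is actionable.
            logger.warning("Skipping segment at index %d: missing ID", index)
            continue
        valid.append(segment)
    return valid
```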
Refine space addition logic in OOXMLRebuilder for improved accuracy

- Updated the space addition rules to consider only the characters at the connection points when deciding whether to add a space.
- Removed the previous "word-like" check and simplified the logic, avoiding unnecessary spaces at punctuation.
- Adjusted preprocessing of segments to align with the new space detection logic, ensuring consistent formatting in translated text.

These changes improve text processing accuracy during translation.

Refine space addition logic in OOXMLRebuilder for improved text handling

- Updated the space addition rules to consider "word-like" forms, enhancing support for common cases such as abbreviations and contractions (e.g., "don't", "Mr.").
- Introduced a new helper function to check if text starts and ends with alphanumeric characters, replacing the previous simple word/digit check.
- Enhanced preprocessing of segments to ensure correct space handling in translated text.

These changes improve the accuracy of text formatting during the translation process.

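A helper that checks only the first and last characters might be sketched as below. The name `is_word_like` is hypothetical, and how the real helper treats trailing punctuation (as in "Mr.") is not stated in the commit, so this sketch does not attempt it.

```python
def is_word_like(text):
    """True if text starts and ends with an alphanumeric character.

    Unlike str.isalnum() on the whole string, inner punctuation such as
    the apostrophe in "don't" does not disqualify the text.
    """
    if not text:
        return False
    return text[0].isalnum() and text[-1].isalnum()
```

For comparison, `"don't".isalnum()` is false because of the apostrophe, while `is_word_like("don't")` is true.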
Enhance GetTranslationTextsTool with new chunking strategy and parameters

- Introduced a new chunking strategy parameter to allow intelligent chunk creation based on total chunk count.
- Added parameters for minimum segments per chunk, maximum total chunks, target total chunks, and overlap settings.
- Updated logging to reflect the new chunking strategy and parameters used during processing.
- Improved chunk creation methods to support both legacy and new strategies.
- Enhanced output schema to include details about the chunking strategy used and average segments per chunk.

These changes let the translation process adapt to varying segment counts and translation requirements.
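One way a count-based strategy with these parameters could work is sketched below. The function name `plan_chunks`, the parameter names, and the defaults are assumptions; the tool's actual algorithm and parameter semantics may differ.

```python
def plan_chunks(segments, target_total_chunks=5, min_segments_per_chunk=2,
                max_total_chunks=10, overlap=0):
    """Split segments into roughly equal chunks (hypothetical sketch).

    The chunk count starts from the target, is capped by the maximum, and is
    reduced further if chunks would fall below the minimum size. Each chunk
    after the first repeats the last `overlap` segments of its predecessor.
    """
    if min_segments_per_chunk < 1 or target_total_chunks < 1 or max_total_chunks < 1:
        raise ValueError("chunk parameters must be >= 1")
    if overlap < 0:
        raise ValueError("overlap must be >= 0")

    n = len(segments)
    if n == 0:
        return []

    chunk_count = min(target_total_chunks, max_total_chunks)
    chunk_count = max(1, min(chunk_count, n // min_segments_per_chunk))

    # Spread the remainder over the first chunks so sizes differ by at most 1.
    base, extra = divmod(n, chunk_count)
    chunks, start = [], 0
    for i in range(chunk_count):
        size = base + (1 if i < extra else 0)
        begin = max(0, start - overlap)  # reach back for overlapping context
        chunks.append(segments[begin:start + size])
        start += size
    return chunks
```

With ten segments and `target_total_chunks=3`, this sketch yields chunks of 4, 3, and 3 segments; the average segments per chunk reported in the output schema would then be the segment count divided by the chunk count.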