A GUI-based utility for combining multiple Excel files (.xlsx) into a single, master workbook. The tool is designed for performance and usability, featuring a responsive, non-blocking interface that preserves all cell formatting, styles, and merged regions 📊🔗
➡️ Read more about the project, its features, and development in my Medium story.
- Overview
- Key Features
- Installation
- Usage
- Project Structure
- Development
- Known Issues
- Contributing
- License
- Contact
CombineXL tackles the tedious task of manually merging data from multiple Excel spreadsheets. This tool provides an intuitive graphical interface to seamlessly combine any number of .xlsx files into one master file, giving you full control over the process.
The core strength of CombineXL is its meticulous preservation of the original files' integrity. It copies not just the data but also all cell formatting, styles, comments, merged cell ranges, and column/row dimensions. The entire operation runs in a background thread, ensuring the application never freezes, and you can monitor its progress or cancel it at any time.
A typical workflow involves:
- Launching the script and selecting the Excel files to combine.
- Interactively reordering the files to control the sequence of combination.
- Choosing an output location and filename.
- Selecting between an Optimized (fast) or Standard processing method.
- Configuring combination parameters: header rows to keep from the first file, header rows to discard from subsequent files, and whether to add a source filename column.
- Deciding whether to keep formulas or convert them to static values.
- Monitoring the non-blocking progress window.
- Receiving a summary report upon completion, cancellation, or error.
- Optimized Processing Engine: Features a high-performance mode that uses style caching to dramatically speed up the combination process on files with extensive formatting.
- Interactive File Ordering: A unique dialog allows you to drag and drop or move files up and down to define the exact order of combination.
- Rich Formatting Preservation: Faithfully copies fonts, colors, borders, fills, number formats, and alignments.
- Non-Blocking GUI: The combination process runs in a separate thread, keeping the UI fully responsive.
- Real-Time Progress Tracking: A dedicated window shows the current status, a progress bar, and which file is being processed.
- Graceful Cancellation: A "Cancel" button allows the user to stop the process at any point, preventing an incomplete output file.
- Formula or Value Preservation: Users can choose to either keep formulas intact or save only their calculated static values to ensure data integrity.
- Flexible Header Control: Specify how many header rows to retain from the first file and how many to discard from all subsequent files.
- Source File Tracking: Optionally adds a new column at the beginning of the master file indicating the source filename for each row.
- Comprehensive Metadata Copying: Preserves column widths, row heights, and merged cells from the source files.
- Python 3.6+
- A graphical desktop environment is required to run the
tkinter
-based GUI.
git clone https://github.com/sztaroszta/CombineXL.git
cd CombineXL
You can install the required dependency using pip:
pip install -r requirements.txt
Alternatively, install the dependency manually:
pip install openpyxl
1. Run the application:
python excel_combiner.py
2. Follow the GUI Prompts:
- Select Input File: A dialog will ask you to choose the .xlsx files you want to combine. You can select multiple files.
- Order Files: A unique dialog appears where you can reorder the selected files. The final combination will follow this sequence.
- Choose Processing Method: You will be prompted to select between an "Optimized" (recommended for speed) and "Standard" cell copying method.
- Select Output File: Choose a folder and a name for the final combined .xlsx file.
- Configure Headers: Enter the number of header rows to keep from the first file and the number of header rows to discard from the others.
-
Include Filename: A dialog will ask if you want to add a new first column containing the original filename for each row.
-
Preserve Formulas: A dialog box will ask if you want to keep formulas or save their calculated values. Choosing 'No' is safer for preventing broken references.
3. Monitor Progress:
- A progress window will appear, showing the current status and a progress bar. You can click "Cancel" at any time to halt the operation.
4. Verify Success:
- Upon completion, a summary dialog will report the outcome (success, cancellation, or error). The combined file will be located in your chosen output directory.
CombineXL/
├── excel_combiner.py # Main script for running the tool
├── README.md # Project documentation
├── requirements.txt # List of dependencies
├── .gitignore # Git ignore file for Python projects
├── assets/ # Contains screenshots of the application's UI
└── LICENSE # GNU AGPLv3 License File
Guidelines for contributors:
If you wish to contribute or enhance CombineXL:
- Coding Guidelines: Follow Python best practices (PEP 8). Use meaningful variable names and add clear comments or docstrings.
- Testing: Before submitting changes, please test them locally by running the script with various Excel files to ensure existing functionality is not broken.
- Issues/Pull Requests: Please open an issue or submit a pull request on GitHub for enhancements or bug fixes.
- Formula References: If you choose to preserve formulas, any formulas that reference cells in other workbooks or worksheets that are not part of the combination will result in #REF! or other errors.
- Structural Mismatches: Combining files with vastly different column structures may lead to data that is difficult to analyze in the final sheet. The optional "Include Filename" feature can help identify the source of each row.
- Platform Dependency: The script requires a desktop environment with a graphical display to run, as it is built with tkinter. It will not run in a headless environment (like a standard server SSH session).
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch for your feature or fix.
- Commit your changes with descriptive messages.
- Push to your fork and submit a pull request.
For major changes, please open an issue first to discuss the proposed changes.
Distributed under the GNU Affero General Public License v3 (AGPLv3) License. See LICENSE for full details.
For questions, feedback, or support, please open an issue on the GitHub repository or contact me directly:
Project Showcase: sztaroszta.github.io
Combine smarter, not harder!
Version: 4
Concept Date: 2024-05-22