Features:
- Add support for loops (for and while), including unrolling, type speculation, and the break/continue keywords
- Add preliminary support for reading/writing Apache ORC files
- Add support for the built-in iterator functions iter, zip, enumerate, next, and reversed
- Add support for the is keyword
- Add list functionality to both POSIX and AWS S3 file systems
- Package WebUI with auto-start (requires working installation of MongoDB)
- Add option to package experimental Lambda runner with pip package/wheel
- Add auto-setup functionality for deploying Lambda runner to AWS
- Add support for running Tuplex in Google Colab Notebooks
- Add macOS wheel
- Switch to PyBind11 from Boost Python
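
As a rough illustration of the language features listed above, the sketch below shows plain Python functions that use for/while loops, break/continue, the is keyword, and the built-in iterator functions. The function names are hypothetical and chosen only for this example. The code runs in standard Python without Tuplex installed, and it does not confirm which of these constructs compile on Tuplex's fast path in this release.

```python
# Hypothetical UDF-style functions; the names are illustrative only
# and are not part of Tuplex.

def sum_until_negative(values):
    """for loop with continue/break and the `is` keyword."""
    total = 0
    for v in values:
        if v is None:      # skip missing entries
            continue
        if v < 0:          # stop at the first negative value
            break
        total += v
    return total


def weighted_sum(xs, ws):
    """enumerate and zip: sum of index * x * w over paired elements."""
    acc = 0
    for i, (x, w) in enumerate(zip(xs, ws)):
        acc += i * x * w
    return acc


def count_digits(n):
    """while loop: number of decimal digits of a positive integer."""
    count = 0
    while n > 0:
        n = n // 10
        count += 1
    return count


def last_and_first(xs):
    """iter, next and reversed: return (last, first) of a non-empty list."""
    first = next(iter(xs))
    last = next(reversed(xs))
    return last, first
```

Assuming the usual Tuplex pipeline style, a function like these could be passed as a UDF to an operator such as map on a dataset created with parallelize.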
Bug Fixes:
- Auto-unpacking of dictionaries with string keys now also works in fallback mode
- Test suite now also works when invoked using multiple processes/threads
- Update various links to software in Dockerfiles and install scripts
- Fix reference count issues when invoking parallelize on lists
- Fix issue with file output deleting the output directory; the directory is now validated first
- Fix crash on non-conforming rows when the majority type is a simple tuple containing variable-length fields, caused by a missing type check in parallelize
- Fix building Tuplex on Ubuntu 20.04, add missing dependencies
- Fix auto-complete bug when using Tuplex in shell mode
- Fix calling len on empty lists
- Fix bug in aggregateByKey where source code was not correctly extracted
- Free local partitions when the Python context object is destroyed
Committers:
Ben Givertz, Yash Gotmare, Zain Ruan, Malte Schwarzkopf, Yunzhi Shao, Leonhard Spiegelberg, Rahul Yesantharao