1
EvoSort: A Genetic-Algorithm-Based Adaptive Parallel Sorting Framework for Large-Scale High Performance Computing
arXiv:2505.18681v2 Announce Type: replace
Abstract: We present EvoSort, a general-purpose adaptive parallel parallel sorting framework accessible at the Python level. EvoSort employs a Genetic Algorithm (GA) to automatically discover and refine critical parameters, including insertion sort thresholds and algorithm selection (e.g., versus LSD radix sort). By adapting continuously to input data and system architecture, EvoSort provides a drop-in replacement for standard Python routines like NumPy and Pandas. Experiments up to10 billion elements across nine data distributions and two hardware platforms demonstrate that EvoSort consistently outperforms competing methods. Results show speedups of up to 225x, exemplifying a powerful auto-tuning solution for large-scale data processing.
Abstract: We present EvoSort, a general-purpose adaptive parallel parallel sorting framework accessible at the Python level. EvoSort employs a Genetic Algorithm (GA) to automatically discover and refine critical parameters, including insertion sort thresholds and algorithm selection (e.g., versus LSD radix sort). By adapting continuously to input data and system architecture, EvoSort provides a drop-in replacement for standard Python routines like NumPy and Pandas. Experiments up to10 billion elements across nine data distributions and two hardware platforms demonstrate that EvoSort consistently outperforms competing methods. Results show speedups of up to 225x, exemplifying a powerful auto-tuning solution for large-scale data processing.