Viability of Recursive SQL Functions

DSpace Repository


Dokumentart: PhDThesis
Date: 2022-04-18
Language: English
Faculty: 7 Mathematisch-Naturwissenschaftliche Fakultät
Department: Informatik
Advisor: Grust, Torsten (Prof. Dr.)
Day of Oral Examination: 2023-04-05
DDC Classifikation: 004 - Data processing and computer science
Keywords: Compiler , SQL , Rekursion , Relationale Datenbank , Funktionale Programmierung , Program Slicing , Lineare Rekursion
Other Keywords: Endrekursion
Tail Recursion
Order a printed copy: Print-on-Demand
Show full item record


The inclusion of recursive common table expressions (recursive CTEs) in the SQL:1999 standard enabled the developer to implement complex in-database computations, e.g., graph algorithms and others. However, their awkward syntax and rigid fixed- point evaluation method requires highly-specialized expert knowledge in SQL to implement complex computations. This sets the bar quite high for developers to consider implementing such queries. We propose an alternative way to implement complex in-database computations, which we call functional-style SQL UDFs. UDFs implemented in functional-style allow recursive self-invocation inside their own function body. In this publication, we measure the viability of functional-style UDFs from two angles: readability and runtime performance. To measure the readability of functional-style UDFs, we conducted a user study in 2020. We presented the participants with tasks from the following topics: - Choose the correct implementations of the algorithms for the fibonacci numbers and greatest common divisor formulated in functional-style and recursive CTE. - Describe and evaluate two unknown UDFs. One is formulated in functional-style and the other as a recursive CTE. - Implement the 0-1 Knapsack algorithm based on its textbook style formulation in either functional-style or recursive CTE formulation. Each participant is then graded based on the score and time needed. In Chapter 2, we present the user study results and compare how well the participants handle these functional-style UDFs compared to recursive CTEs. We discuss if functional- style UDFs can help improve readability for such complex queries. Besides readability, developers are also interested in the runtime performance of functional-style UDFs. We find that some RDBMSs such as PostgreSQL, Oracle, Microsoft SQL Server, MySQL and SQLite have trouble handling functional-style UDFs as-is. Performance issues, strict recursion depth limitations, or even outright denying evaluation of functional-style UDFs are the main issues we encountered. In Chapter 4, we describe a SQL-to-SQL compiler which accepts functional-style UDFs, as defined in Chapter 3. The compiler produces standard SQL:1999 recursive CTEs, which replaces the function body of functional-style UDFs so that it no longer exhibits recursive self-invocations. The compiled function body evaluates in a two-phased fashion that (1) constructs a call graph top-down and then (2) traverses the call graph bottom-up until its root node is reached and the result is returned. Indeed, the compiler does not rely on intrusive changes to the underlying database engine. Furthermore, this two-phased approach enables optimizations (call sharing, ref- erence counting, linear- and tail-recursion detection, memoization, batching) through small tweaks and improvements to the compiler, which we describe in Chapter 5. We also measure the execution times of various algorithms before and after compilation and discuss the results. We find the runtime performance a developer experiences when compiling functional-style UDFs can improve which is supported by the experiments.

This item appears in the following Collection(s)