Tag Archives: XML Streaming

XML stream processing using a lazy concurrent language

S-C. Mu, T-C. Tsai, and K. Nakano. In Programming Language Techniques for XML (PLAN-X 2008). January 2008. [PDF]

Motivated by previous work on XML stream processing, we noticed that programmers need concurrency to save space, especially in a lazy language. User-controllable concurrency provides the possibility of reducing space usage in many programs. With lower garbage-collection overhead resulting from concurrent execution, the elapsed time of programs, stream processing ones in particular, is tremendously decreased.

The challenge is how to encapsulate concurrency without compromising expressiveness and flexibility of languages. We propose the idea of pushing datatypes — when a pushing closure is demanded, all expressions referring to it are evaluated concurrently to weak head normal forms. The closure is no more alive and may thus be garbage collected. Semantically, it is a non-intruding extension because it does not change the denotational semantics of an expression. It is also easy to be implemented on top of a language providing concurrent threads and inter-thread synchronisation. We have developed a prototype using Haskell and showed pushing datatypes can be used to effectively reduce space usage and thus result in shorter elapsed time in many programs.

Push: Improving Heap Residency for Lazy Stream Processing by Concurrency

Prototype implementation of a language Push, accompanying our recently submitted paper. The prototype is implemented and prepared by Ta-Chung Tsai.

While studying XML stream processing, we noticed that programmers need concurrency to save space, especially in a lazy language. We propose the idea of pushing datatypes — when a pushing closure is demanded, all expressions referring to it are evaluated concurrently to weak head normal forms. The closure is no more alive and may thus be garbage collected.

[GZipped Tarball]

A pushdown machine for recursive XML processing

K. Nakano and S-C. Mu. In The Fourth Asian Symposium on Programming Language and Systems, LNCS 4279, pp. 340-356, November 2006.
[PDF]

XML transformations are most naturally defined as recursive functions on trees. A naive implementation, however, would load the entire input XML tree into memory before processing. In contrast, programs in stream processing style minimise memory usage since it may release the memory occupied by the processed prefix of the input, but they are harder to write because the programmer is left with the burden to maintain a state. In this paper, we propose a model for XML stream processing and show that all programs written in a particular style of recursive functions on XML trees, the macro forest transducer, can be automatically translated to our stream processors.
The stream processor is declarative in style, but can be implemented efficiently by a pushdown machine. We thus get the best of both worlds — program clarity, and efficiency in execution.