Type of publication | Title |
---|---|
Soon... | PhD Thesis |
Later | Distributed OMEGA-storage: Scalable High Performance Distributed Storage |
In-the-pipe | Live Optimization in SDDS Join Operations |
Paper | OMEGA-storage: A Self Indexing Multi-Attribute Storage for Very Large Main Memories |
Tech Report | OMEGA-storage: A Self Indexing Multi-Attribute Storage for Very Large Main Memories |
Paper | hQT*: A Scalable Distributed Data Structure for High-Performance Spatial Access |
Paper | Transparent Distribution in a Storage Manager |
Tech Report | Scalable Storage for a DBMS Using Transparent Distribution |
Internal report | Synergy effects when using Linear Hashing (LH) Locally for Distributed Linear Hashing (LH*) on a Massive Parallel Machine (Parsytec GC) |
Lic Thesis | A Scalable Data Structure for a Parallel Data Server |
Paper | LH*LH: A Scalable High Performance Data Structure for Switched Multicomputers
Presentation of project at "Ramkonferense" | spAMOS: Scalable Parallel AMOS using LH* |
Tech Report | LH*LH: A Scalable High Performance Data Structure for Switched Multicomputers
Unpublished | Light Weight Thread Implementation and Integration into AMOS DBMS |
Internal Paper | Implementing C-Portable Light Weight Threads |
CAELAB Internal Document | AMOS Programmer's Hackbook - a Guideline to the Source |
CAELAB Memo | AMOS.v1 User's Guide |
M Sc Thesis | An Implementation of Transaction Logging and Recovery in a Main Memory Resident Database System |
Jonas S Karlsson, Martin L. Kersten
OMEGA-storage: A Self Indexing Multi-Attribute Storage for Very Large Main Memories
To be presented at the Australian Database Conference, Canberra, Australia, in January.
Abstract: Main memory storage is continuously improving, both in its price and its capacity. With this come new storage problems and new directions of possible usage. Just before the millennium, several main memory database systems are becoming commercially available. The hot areas for their deployment include boosting the performance of web-enabled systems, such as search engines and electronic auctioning systems. We present a novel data storage structure, the Omega-storage structure, a high performance data structure to index very large amounts of multi-attribute data. The experiments show excellent performance for point retrieval and highly efficient pruning for pattern searches. It provides the balanced storage previously achieved by random kd-trees, but avoids their increased pattern match search times through an effective assignment of attribute bits to the index. Moreover, it avoids the sensitivity of the kd-tree to insert orders.
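The abstract does not spell out the bit-assignment scheme, so the following is only an illustrative C sketch of the general idea of building one composite index key from the bits of several attributes (z-order style interleaving). The actual Omega-storage assignment described in the paper may differ, and all names here are made up.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative sketch only: form a composite index key by interleaving
 * the bits of two attributes (z-order style). The Omega-storage paper
 * defines its own bit assignment, which may differ from this. */
static uint64_t interleave_bits(uint32_t a, uint32_t b)
{
    uint64_t key = 0;
    for (int i = 0; i < 32; i++) {
        key |= (uint64_t)((a >> i) & 1u) << (2 * i);      /* even bits: attribute a */
        key |= (uint64_t)((b >> i) & 1u) << (2 * i + 1);  /* odd bits:  attribute b */
    }
    return key;
}

int main(void)
{
    /* a point query supplies both attributes; a pattern search with one
     * attribute unknown can still prune on the known attribute's bits */
    printf("%016llx\n", (unsigned long long)interleave_bits(42u, 7u));
    return 0;
}
```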
Jonas S Karlsson
hQT*: A Scalable Distributed Data Structure for High-Performance Spatial Access
Presented at the International Conference on Foundations of Data Organization, pp. 37-46, Kobe, Japan, November.
Abstract: Spatial data storage stresses the capability of conventional DBMSs. We present a scalable distributed data structure, hQT*, which offers support for efficient spatial point and range queries using order preserving hashing. It is designed to deal with skewed data and extends results obtained with scalable distributed hash files, LH*, and other hashing schemes. Performance analysis shows that an hQT* file is a viable scheme for distributed data access, and in contrast to traditional quad-trees it avoids long traversals of hierarchical structures. Furthermore, the novel data structure is a complete design addressing scalable data storage, local server storage management, and client addressing. We investigate several different client updating schemes, enabling better access load distribution for many "slow" clients.
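As a hypothetical illustration of the quad-tree side of such a scheme, the C sketch below computes a fixed-depth quadrant path ("quad code") for a 2-D point in the unit square; an order-preserving hash could then be applied to such codes. The real hQT* addressing is defined in the paper and is not reproduced here.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical illustration: compute a quadrant path ("quad code") of a
 * given depth for a point in the unit square. hQT*'s real addressing and
 * order-preserving hashing are defined in the paper and may differ. */
static uint32_t quad_code(double x, double y, int depth)
{
    uint32_t code = 0;
    double x0 = 0.0, y0 = 0.0, size = 1.0;
    for (int d = 0; d < depth; d++) {
        size /= 2.0;
        int qx = (x >= x0 + size);   /* 0 = west,  1 = east  */
        int qy = (y >= y0 + size);   /* 0 = south, 1 = north */
        code = (code << 2) | (uint32_t)((qy << 1) | qx);
        x0 += qx * size;
        y0 += qy * size;
    }
    return code;
}

int main(void)
{
    printf("%u\n", (unsigned)quad_code(0.7, 0.2, 4));  /* quadrant path of depth 4 */
    return 0;
}
```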
J. S. Karlsson, M. L. Kersten
Transparent Distribution in a Storage Manager
Presented at the International Conference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, NV, USA, July 1998.
Abstract: Scalable Distributed Data Structures (SDDSs) provide a self-managing and self-organizing data storage of potentially unbounded size. This stands in contrast to common distribution schemes deployed in conventional distributed DBMSs. SDDSs, however, have mostly been used in synthetic scenarios to investigate their properties. In this paper we concentrate on the integration of the LH* SDDS into Monet, our efficient and extensible DBMS. We show that this merge provides high performance processing and scalable storage of very large sets of distributed data. In our implementation we extended the Monet language interpreter's operators in such a way that access to data, whether it is distributed or locally stored, is transparent to the user. Performance measurements show the viability of our approach for queries using a number of operators over distributed data on a number of nodes.
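LH* addressing itself is the standard scheme from the SDDS literature: each client keeps a possibly outdated image of the file level and split pointer and computes a server address with no central directory, while servers forward misdirected requests and send image adjustments. A minimal C sketch of the client-side calculation, with hypothetical names and a single initial bucket assumed:

```c
#include <stdint.h>
#include <stdio.h>

/* Minimal sketch of LH*-style client addressing (standard SDDS scheme).
 * The client image (level i', split pointer n') may lag behind the real
 * file state; servers forward misdirected requests. Names are hypothetical. */
struct lh_client_image {
    unsigned level;   /* i': client's view of the file level    */
    unsigned split;   /* n': client's view of the split pointer */
};

static unsigned h(unsigned level, uint64_t key)
{
    return (unsigned)(key % (1u << level));   /* h_i(key) = key mod 2^i */
}

static unsigned lh_client_address(const struct lh_client_image *img, uint64_t key)
{
    unsigned a = h(img->level, key);
    if (a < img->split)                 /* bucket already split at this level */
        a = h(img->level + 1, key);
    return a;
}

int main(void)
{
    struct lh_client_image img = { 3, 2 };   /* i' = 3, n' = 2 */
    printf("key 42 -> server %u\n", lh_client_address(&img, 42));
    return 0;
}
```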
J. S. Karlsson, M. L. Kersten
Scalable Storage for a DBMS Using Transparent Distribution
Technical Report INS-R9710, CWI, Amsterdam, The Netherlands, 1997.
Abstract: Scalable Distributed Data Structures (SDDSs) provide a self-managing and self-organizing data storage of potentially unbounded size. This stands in contrast to common distribution schemes deployed in conventional distributed DBMSs. SDDSs, however, have mostly been used in synthetic scenarios to investigate their properties. In this paper we concentrate on the integration of the LH* SDDS into our efficient and extensible DBMS, called Monet (see http://www.cwi.nl/~monet). We show that this merge permits processing very large sets of distributed data. In our implementation we extended the relational algebra interpreter in such a way that access to data, whether it is distributed or locally stored, is transparent to the user. The on-the-fly optimization of operations, heavily used in Monet, to deploy different strategies and scenarios inside the primary operators associated with an SDDS adds self-adaptiveness to the query system; it dynamically adapts itself to unforeseen situations. We illustrate the performance efficiency by experiments on a network of workstations. The transparent integration of SDDSs opens new perspectives for very large self-managing database systems.
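How the strategies are chosen inside the operators is not detailed in the abstract; purely as a hypothetical sketch of the kind of transparent dispatch described, an operator could test at run time whether its operand is stored locally or as an SDDS partition and pick a plan accordingly. None of these names come from Monet.

```c
#include <stdio.h>

/* Hypothetical sketch: a primary operator checks at run time whether its
 * operand is local or SDDS-partitioned and chooses a strategy. The real
 * mechanism inside Monet is described in the report. */
enum storage_kind { LOCAL, SDDS_DISTRIBUTED };

struct table {
    const char *name;
    enum storage_kind kind;
    unsigned partitions;   /* > 1 only for distributed tables */
};

static void op_select(const struct table *t, long lo, long hi)
{
    if (t->kind == LOCAL) {
        printf("select [%ld,%ld] on %s: local scan\n", lo, hi, t->name);
    } else {
        /* e.g. fan out the range predicate and merge partial results */
        printf("select [%ld,%ld] on %s: fan out to %u SDDS servers\n",
               lo, hi, t->name, t->partitions);
    }
}

int main(void)
{
    struct table a = { "orders_local",  LOCAL, 1 };
    struct table b = { "orders_lhstar", SDDS_DISTRIBUTED, 16 };
    op_select(&a, 10, 20);   /* same call, different strategies */
    op_select(&b, 10, 20);
    return 0;
}
```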
Jonas S Karlsson (1997).
A Scalable Data Structure for a Parallel Data Server.
Licentiate Thesis No 609, 1997.
A Licentiate Thesis is a simpler form of PhD thesis, completed about three years after the MSc and awarded only in Sweden. A full PhD thesis takes another 2-3 years.
Abstract: In this thesis we identify the importance of appropriate data structures for parallel data servers. We focus on Scalable Distributed Data Structures for this purpose, in particular LH* and the new data structure LH*lh. An overview is given of related work and systems that have traditionally motivated the need for such data structures. We begin by discussing high-performance databases, and this leads us to database machines and parallel data servers. We sketch an architecture for an LH*lh-based file storage that we plan to use for a parallel data server. We also show performance measurements for LH*lh and present its algorithm in detail. The testbed, the Parsytec switched multicomputer, is described along with experience acquired during the implementation process. Parts of the thesis are based on the article on LH*lh published in the lecture notes of the 5th International Conference on Extending Database Technology, Avignon, France, 1996.
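The local building block of LH*lh is ordinary linear hashing (LH). As a reminder of the standard LH mechanics, under the usual textbook formulation rather than the thesis' exact code, a C sketch of the addressing rule and the split-pointer bookkeeping:

```c
#include <stdint.h>
#include <stdio.h>

/* Sketch of standard linear hashing (LH) control state, the local scheme
 * inside each LH*lh server. Textbook formulation; the thesis' actual
 * implementation details may differ. */
struct lh_file {
    unsigned level;   /* i: buckets 0 .. split + 2^i - 1 exist */
    unsigned split;   /* n: next bucket to split               */
};

static unsigned lh_bucket(const struct lh_file *f, uint64_t key)
{
    unsigned a = (unsigned)(key % (1u << f->level));
    if (a < f->split)
        a = (unsigned)(key % (1u << (f->level + 1)));
    return a;
}

/* Called when the file grows past its load threshold: split one bucket,
 * moving records that now hash to bucket split + 2^level. */
static void lh_split(struct lh_file *f)
{
    /* ... rehash records of bucket f->split with h_{level+1} ... */
    f->split++;
    if (f->split == (1u << f->level)) {   /* a full round of splits done */
        f->level++;
        f->split = 0;
    }
}

int main(void)
{
    struct lh_file f = { 2, 0 };
    printf("key 13 -> bucket %u\n", lh_bucket(&f, 13));
    lh_split(&f);
    printf("after split: level=%u split=%u\n", f.level, f.split);
    return 0;
}
```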
Karlsson, J. S., Litwin, W., and Risch, T. (1995).
LH*LH: A Scalable High Performance Data Structure for Switched Multicomputers.
Technical Report LiTH-IDA-R-95-25, Department of Computer and Information Science, Linköping University, Sweden. Accepted to EDBT-96, Avignon, France.
Abstract: LH*LH is a new data structure for scalable high-performance hash files on the increasingly popular switched multicomputers, i.e., MIMD multiprocessor machines with distributed RAM and without shared memory. An LH*LH file scales up gracefully over the available processors and the distributed memory, easily reaching Gbytes. Address calculation does not require any centralized component that could lead to a hot-spot. Access times to the file can be under a millisecond and the file can be used in parallel by several client processors. We show the LH*LH design and report on the performance analysis. This includes experiments on the Parsytec GC/PowerPlus multicomputer with up to 128 PowerPC processors and 32 MB of distributed RAM per node. We prove the efficiency of the method and justify various algorithmic choices that were made. LH*LH opens a new perspective for high-performance applications, especially for the database management of new types of data and in real-time environments.
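The defining idea of LH*LH is the two-level composition: LH* chooses the server for a key, and an ordinary local LH table inside that server chooses the RAM bucket. The C sketch below only illustrates that composition with made-up parameters; the report gives the real algorithms, including how the local hash uses the key bits left over by LH*.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative only: the two-level addressing idea behind LH*LH.
 * Parameters are made up; the report defines the real algorithms. */
struct lh_state { unsigned level, split; };

static unsigned lh_addr(const struct lh_state *s, uint64_t key)
{
    unsigned a = (unsigned)(key % (1u << s->level));
    if (a < s->split)
        a = (unsigned)(key % (1u << (s->level + 1)));
    return a;
}

int main(void)
{
    struct lh_state global = { 4, 3 };   /* LH*: client's image of the file */
    struct lh_state local  = { 7, 42 };  /* LH : one server's local table   */
    uint64_t key = 123456789ULL;

    unsigned server = lh_addr(&global, key);   /* node to ship the request to */
    /* hashing the remaining key bits locally is an assumption made here */
    unsigned bucket = lh_addr(&local, key >> global.level);
    printf("key -> server %u, local bucket %u\n", server, bucket);
    return 0;
}
```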
Karlsson, J. S. (1995).
An Implementation of Transaction Logging and Recovery in a Main Memory Resident Database System.
Master's Thesis LiTH-IDA-Ex-94-04, Department of Computer and Information Science, Linköping University, Sweden.
Abstract: This report describes an implementation of Transaction Logging and Recovery using Unix Copy-On-Write on spawned processes. The purpose of the work is to extend WS-Iris, a research project on Object-Oriented Main Memory Databases, with functionality for failure recovery. The presented work is a Master's Thesis within the Master of Science programme in Computer Science and Technology. The work has been commissioned by Tore Risch, Professor of Engineering Databases at the Computer Aided Engineering laboratory (CAElab), Linköping University (LiU/LiTH), Sweden.
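The general Unix technique behind this is well known: fork() gives the child a copy-on-write image of the parent's address space, so the child can write a consistent snapshot of the in-memory database to disk while the parent keeps processing updates. A minimal C sketch of that idea (not WS-Iris' actual code; all names are made up):

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

/* Minimal sketch of a fork-based, copy-on-write checkpoint of a
 * main-memory database image. Illustrates the general Unix technique
 * the thesis builds on, not its actual implementation. */
#define DB_SIZE (1 << 20)
static char db_image[DB_SIZE];           /* stand-in for the in-memory database */

static int checkpoint(const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                      /* child: sees a COW-frozen image */
        FILE *f = fopen(path, "wb");
        if (!f) _exit(1);
        fwrite(db_image, 1, DB_SIZE, f);
        fclose(f);
        _exit(0);
    }
    return 0;                            /* parent: continues updating db_image */
}

int main(void)
{
    strcpy(db_image, "some database state");
    checkpoint("/tmp/db.ckpt");
    strcpy(db_image, "updates after the fork do not affect the snapshot");
    wait(NULL);                          /* reap the checkpoint child */
    return 0;
}
```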
J.S. Karlsson, S. Flodin, K. Orsborn, T. Risch, M. Sköld, M. Werner (1994).
Amos.v1 User's Guide.
CAELAB Memo 94-01, Department of Computer and Information Science, Linköping University, Sweden, March 1994.
Abstract: AMOS (Active Mediating Object System) is an Object-Relational database system. AMOS differs from the first generation of Object-Oriented (OO) databases in that a relationally complete query language, AMOSQL, is available, which is more general than relational query languages such as SQL. Furthermore, AMOS is a main-memory database system, since the design of AMOS is optimized for efficient execution assuming that the entire database fits in main memory. For persistence, the system provides primitives for logging, saving, and restarting the database from disk. AMOS is implemented in C and runs on HP and SUN Unix platforms. This manual describes how to use the AMOSQL query language. For interfaces to C and Lisp, and a description of some internals, see the AMOS Systems Manual.