First University-Industry Meeting on Graph Databases (UIM-GDB)

Organized by DAMA-UPC in conjunction with Kobrix

The analysis and storage of data in the form of a graph has increased in recent years in many areas, and making it possible to extract relevant information from digital libraries is becoming crucial. These areas include social networking, where the nodes can be people and their hobbies or activities, and the edges are the relations between them (LinkedIn, Facebook, etc.); bibliographic databases with complex on-line queries, where the nodes are the authors or the papers written by them, and the edges are authorships or references to those papers; proteomic applications, which analyze the interactions between proteins in order to find new cures for diseases, and do so by mapping proteins and their interactions into very large graphs that have to be thoroughly analyzed; fraud detection applications in areas such as police investigation, where the nodes are the entities investigated and the actions taken by those entities, and the edges are the relations between those entities and the actions; and Wikipedia or similar wiki-like sources of information, where the nodes are the different keywords (including names, locations, URLs, etc.) and the edges are the relations between the places where those keywords are used or the URLs that they point to. Furthermore, one established and prominent application area for graph-oriented data is the semantic web and, more generally, knowledge management, AI systems and natural language processing. There are several semantic web standards and many independent knowledge management models which can ultimately be viewed as graphs, but are not grounded as such. Moreover, specification efforts have been geared towards the logical aspects of the models, while the (more practical) data management perspective has been left to implementers.

The need to manage knowledge represented as graphs and to launch ad-hoc queries against those large databases is an attractive challenge to both the research community and IT companies. In fact, several companies have proposed new graph database management systems in order to handle these graph-like datasets. From an academic perspective, graph databases lack a universally recognized abstract model. As a result, the need for building new graph database management engines using alternative data structures, different from those normally adopted in other data models (such as the relational or the object-oriented data model), remains an active debate. However, "navigational" models, of which both graph and some object-oriented databases are examples, are suitable and arguably more natural for many business applications, especially where complex domains with rapidly evolving schemas must be represented. But adoption is difficult because of this lack of standardization and the potential for vendor lock-in. On top of this, industry faces a real challenge when trying to generalize the use of such systems because of the lack of consensus on defining querying and manipulation capabilities over generic graphs.

The First University-Industry Meeting on Graph Databases (UIM-GDB) is an initiative to gather companies developing graph database management systems together with researchers from academia. The main objective is to join efforts to overcome some of the main obstacles in the development and acceptance of graph databases in industry.

Objectives

The main objectives for this meeting are:

Present the different graph database management systems on the market, offering vendors an opportunity to describe the tools available.
Discuss customer needs regarding graph-like data management and analyze the most common query types. To lay the groundwork for standardizing graph databases, during this workshop we will try to answer questions such as: what a graph is and what type of information it may contain; which queries are the most common operations; whether these systems should have extensive support for updates (e.g. large batch updates, ACID transactions) or should focus mainly on pure analysis/read queries; and whether all operations should be mandatory or some might be optional.
Foster the standardization of a query language on generic graphs. The term query is still ambiguous in the field of graph databases, mostly because the community has not been able to define and accept a standard language for expressing the most typical graph analysis operations. But is it necessary to define a new language (based on a standardized, general graph model), or would a set of APIs be sufficient? Either way, language or APIs, what programming paradigms would be appropriate to work in? What can be learned from analogous attempts such as SQL, SPARQL or XQuery?
Create a collaborative forum to exchange ideas related to graph management and foster research in this area.

Invited People

We have invited people who have been active in the area, both from industry and academia, as well as people who have participated in defining previous languages for semi-structured data. Nevertheless, the meeting is open to anyone else who might be interested in participating or contributing in any manner to this initiative.

Event Details and Dates

The event will take place on February 7th-8th, 2011, and it will be held at the Universitat Politècnica de Catalunya (UPC), in Barcelona (Catalonia, Spain).

The agenda for the meeting will be defined over the coming weeks. We are planning several talks, including presentations of industrial and research graph database projects, as well as other talks on topics related to the area.

Attendees are expected to confirm their attendance at the event before December 20th, 2010.

Online Discussion

An online forum has been set up for participants and other interested parties to engage in preliminary discussion on organizing the workshop and defining its scope and agenda. If you haven't received an invite already, please request one by going to the following link: http://groups.google.com/group/graph-databases?hl=en and clicking on "Apply for membership".

Meeting Organizers

Victor Muntés Mulero is an associate professor at the Universitat Politècnica de Catalunya (UPC) and a member of DAMA-UPC (http://www.dama.upc.edu), a research group that focuses on topics related to the management and retrieval of large data volumes, information quality and data exploration. Victor is also one of the founders of Sparsity S.L. (http://www.sparsity-technologies.com), a UPC spin-off that commercializes DEX, a graph database management system devised to handle massive graphs.

Borislav Iordanov is the founder of Kobrix Software, a consulting company. Borislav is the initiator of several open-source projects, including HyperGraphDB, a database for managing highly structured data. His interests include knowledge representation, engineering of complex systems and programming language design.

Schedule and Attendees

Venue and Accommodation

The meeting will take place in:

Universitat Politècnica de Catalunya
Dep. d'Arquitectura de Computadors
Building C6 Campus Nord
Jordi Girona 1-3
08034 Barcelona
Room: E101

For those who have chosen to stay in the residence close to UPC, directions are available at the following link:

http://www.resa.es/eng/residencias/torre_girona#

The residence is the RS building.

Monday's Dinner

We are planning to go for an informal dinner on Monday. Since there is no fee for attending the meeting, the dinner is not included. However, we think it would be nice to unwind after the meeting and enjoy the night in Barcelona.

Please confirm whether you plan to come, so that we can make a reservation at a nice restaurant.