Shortest Path (2016)
Archival Note
The original contest site is still available! If it’s unavailable in the future, it can be found on an Archive.org mirror instead.
The provided code for this contest is available at transactionalblog/sigmod-contest-2016. The exact provided code is preserved as commit 0c2aceb3. The main
branch contains changes make to fix build issues, improve the build system, update instructions, etc. Links to code in the copied text below have been changed to point to the GitHub repo.
This contest was organized by University of Waterloo’s Database Systems group. The winner of this contest was Takuto Ikuta, Takanori Hayashi, Yosuke Yano, and Yoichi Iwata from University of Tokyo. The leaderboard has the posters and submission from the five finalists.
Task Details
In graph theory, the shortest path problem is the problem of finding a path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized. This is a fundamental and well-studied combinatorial optimization problem with many practical uses: from GPS navigation to routing schemes in computer networks; search engines apply solutions to this problem on website interconnectivity graphs and social networks apply them on graphs of peoples' relationships.
In this contest, the task is to answer shortest path queries on a changing graph, as quickly as possible. We will provide an initial graph which you may process and index in any way you deem necessary. Once this is done, we will begin issuing a workload consisting of a series of sequential operation batches. Each operation is either a graph modification (insertion or removal) or a query about the shortest path between two nodes in the graph. Your program is expected to correctly answer all queries as if all operations had been executed in the order they were given.
The graphs are directed and unweighted. Input to your program will be provided via standard input, and the output must appear on the standard output.
Testing Protocol
Our test harness will first feed the initial graph to your program’s standard input. The graph is represented as a list of edges, each of which consists of a pair of node ids (the edge starting node, followed by the edge destination node) represented as non-negative integer numbers. Your program will receive multiple lines (each representing one edge) containing exactly 2 integer numbers in decimal ASCII representation separated by a single space. The initial graph ends with a line containing the character 'S'.
1 2 2 3 3 1 4 1 2 4 S
In the example above, the input to the left describes the graph presented at the right. Node IDs may appear in the input in any order, and there may be "gaps". That is, nodes in an N-node graph may have node IDs larger than N-1. The largest possible Node ID is 232-1. There may be at most one edge between any two nodes. If the same edge appears in the input more than once, only a single edge should be included in the graph.
After sending the initial graph input, our test harness will monitor your program’s standard output for a line containing the character 'R' (case insensitive, followed by the new line character '\n'). Your program uses this line to signal that it is done ingesting the original graph, has performed any processing and/or indexing on it and is now ready to receive the workload.
The test harness delivers the workload in batches. Each batch consists of a sequence of operations provided one per line followed by a line containing the single character 'F' that signals the end of the batch.
Each operation is represented by one character ('Q', 'A' or 'D') that defines the operation type, followed by a space and two positive integer numbers in decimal ASCII representation, also separated by a space. The two integer numbers represent node IDs.
The three operation types are as follows:
-
'Q'/query: this operation needs to be answered with the distance of the shortest (directed) path from the first node to the second node in the current graph. The answer should appear as output in the form of a single line containing the decimal ASCII representation of the integer distance between the two nodes, i.e., the number of edges on a shortest directed path between them. If there is no path between the nodes or if either of the nodes does not exist in the graph, the answer should be -1. The distance between any node and itself is always 0.
-
'A'/add: This operation requires you to modify your current graph by adding another edge from the first node in the operation to the second. As was the case during the input of the original graph input, if the edge already exists, the graph remains unchanged. If one (or both) of the specified endpoints of the new edge does not exist in the graph, it should be added. This operation should not produce any output.
-
'D'/delete: This operation requires you to modify your current graph by removing the edge from the first node in the operation to the second. If the specified edge does not exist in the graph, the graph should remain unchanged. This operation should not produce any output.
After the end of every batch, our test harness will wait for output from your program before providing the next batch. You need to provide as many lines of output as there are query ('Q') operations in the batch - each line containing the distance of the shortest path as described above. Your program is free to process operations in a batch concurrently. However, the query results in the output must reflect the order of the queries within the batch.
After the last batch, the standard input stream of your program will be closed by our test harness. Here are some example batches corresponding to the initial example graph above:
Batch 1
Q 1 3 A 4 5 Q 1 5 Q 5 1 F
Output:
2 3 -1
Batch 2
A 5 3 Q 1 3 D 2 3 Q 1 3 F
Output:
2 4
Try-It-Yourself Example
Input | Output |
---|---|
|
Reference Solution
We have created a simple reference solution, which you are welcome to download and modify. It is implemented in Python, using the networkx module.