IIT Database Group


Provenance for Updates and Transactions

In this project we explore how provenance computation can benefit from temporal database techniques. The project is funded by and executed in collaboration with Oracle Corporation. Starting from porting rewrite-based techniques, such as the ones used in Perm, to the Oracle SQL dialect, we study how to 1) compute the provenance of past queries and 2) compute the provenance of updates and transactions. This requires non-trivial extensions to current provenance techniques because, for example, concurrent transactions interact with each other under isolation levels weaker than serializable. Our solution can retroactively trace transaction provenance as long as an audit log and time travel functionality are available (both are supported by most DBMSs). One of the major outcomes so far is the concept of reenactment queries: queries that reenact the effects of a transaction. Reenactment queries are the main enabler of retroactive provenance computation for transactions.

Within this project we have made the following major contributions to provenance management:

  • Development of MV-relations, a provenance model for queries, updates, and transactional histories that extends the seminal semiring annotation model (defined for queries) with support for updates and transactions.
  • Development of reenactment, a declarative replay technique with provenance capture that enables tracking the provenance of a past update or transaction retroactively by executing a query.
  • Implementation of provenance tracking for transactions over a standard relational database as part of the GProM system.

MV-relations - A Provenance Model for Transactional Updates

As part of this project we have developed a provenance model that allows tracking the provenance of tuples through queries and transactional updates. In our model, the complete derivation history of a tuple (which update operations derived the tuple and on which inputs of these operations it depends) can be encoded in the annotation of the tuple.
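The idea of an annotation that records a derivation history can be illustrated with a small Python sketch. This is a simplification and not the formal definition of MV-relations: the class names `Var` and `Op`, the operation kinds `"U"` (update) and `"C"` (commit), and the transaction identifiers are assumptions made for illustration, and the sketch follows only a single chain of inputs per tuple.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    """Identifier of a tuple in the initial database state (hypothetical)."""
    name: str


@dataclass(frozen=True)
class Op:
    """An operation wrapped around the annotation of the tuple it read."""
    kind: str       # assumed kinds: "U" = update, "C" = commit
    xid: str        # transaction that executed the operation
    version: int    # version at which the operation took effect
    inp: Union["Var", "Op"]


def derivation(annotation):
    """Return the original tuple and the operations applied to it, oldest first."""
    steps = []
    while isinstance(annotation, Op):
        steps.append((annotation.kind, annotation.xid, annotation.version))
        annotation = annotation.inp
    steps.reverse()
    return annotation, steps


# A tuple x1 is updated and committed by T1, then updated and committed by T2.
x1 = Var("x1")
after_t1 = Op("C", "T1", 3, Op("U", "T1", 2, x1))
after_t2 = Op("C", "T2", 6, Op("U", "T2", 5, after_t1))

origin, steps = derivation(after_t2)
```

Reading the nested annotation from the inside out recovers which operations produced the current tuple version and which input tuple they started from.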

Reenactment - Declarative Replay with Provenance Capture

Reenactment is a declarative replay technique that enables a transactional history (or part thereof) to be repeated by executing a so-called reenactment query. We have proven that reenactment queries produce the same result and have the same provenance as the operation(s) they are replaying. Thus, a reenactment query can be used to retroactively compute the provenance of an operation executed some time in the past as long as the database state seen by this operation can be accessed.
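To make the idea concrete, the following Python sketch pairs single modifying statements with queries that return the same table contents when run over the unmodified state. It is a minimal illustration under stated assumptions: the `account` table, its data, and the use of SQLite are made up for this example, the queries do not capture provenance annotations, and the delete case assumes the condition never evaluates to NULL.

```python
import sqlite3

# Each entry: (a modifying statement, a query that reenacts it over the
# unmodified table "account_before"). Table and data are hypothetical.
PAIRS = {
    "update": (
        "UPDATE account SET bal = bal - 10 WHERE cust = 'Alice'",
        "SELECT cust, CASE WHEN cust = 'Alice' THEN bal - 10 ELSE bal END AS bal "
        "FROM account_before",
    ),
    "delete": (
        # Assumes bal is never NULL, so NOT (cond) keeps exactly the survivors.
        "DELETE FROM account WHERE bal < 40",
        "SELECT cust, bal FROM account_before WHERE NOT (bal < 40)",
    ),
    "insert": (
        "INSERT INTO account VALUES ('Carol', 70)",
        "SELECT cust, bal FROM account_before UNION ALL SELECT 'Carol', 70",
    ),
}


def compare(statement, reenactment):
    """Run the statement on a copy and the reenactment query on the original."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE account_before (cust TEXT, bal INTEGER);
        INSERT INTO account_before VALUES ('Alice', 50), ('Bob', 30);
        CREATE TABLE account AS SELECT * FROM account_before;
    """)
    conn.execute(statement)
    executed = sorted(conn.execute("SELECT cust, bal FROM account").fetchall())
    reenacted = sorted(conn.execute(reenactment).fetchall())
    conn.close()
    return executed, reenacted


results = {name: compare(stmt, query) for name, (stmt, query) in PAIRS.items()}
```

In each case the query only reads `account_before`, so the past operation can be replayed without modifying any data.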

Implementation in GProM

To retrieve the provenance of a past update (or of a transaction or history) we construct its reenactment query based on a log of SQL operations executed in the past (e.g., Oracle's audit log facility). Such a reenactment query needs to be executed over the database state seen by the operation(s) to be replayed. We use time travel to access such past database states. The techniques developed in this project have been integrated in the GProM system, a database-independent middleware application for computing provenance.
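The pipeline described above (log entries in, one query over a past state out) can be sketched in Python as follows. This is not GProM's code or API: `LoggedUpdate`, `reenactment_query`, and the `account_asof` table are hypothetical names, the log is hard-coded instead of read from an audit log, and `account_asof` is an ordinary table that stands in for a past state that a real system would reach through time travel. The sketch handles only a single transaction made of updates and ignores concurrency and provenance annotations.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class LoggedUpdate:
    """One UPDATE as it might be recovered from an audit log (hypothetical)."""
    column: str      # column being assigned
    new_value: str   # SQL expression assigned to the column
    condition: str   # WHERE condition of the update


def reenactment_query(snapshot, columns, log):
    """Nest one projection per logged update on top of the past state."""
    query = f"SELECT {', '.join(columns)} FROM {snapshot}"
    for i, op in enumerate(log):
        select_list = ", ".join(
            f"CASE WHEN {op.condition} THEN {op.new_value} ELSE {c} END AS {c}"
            if c == op.column else c
            for c in columns
        )
        # Each step reads the result of the previous step, in log order.
        query = f"SELECT {select_list} FROM ({query}) AS step{i}"
    return query


log = [
    LoggedUpdate("bal", "bal - 10", "cust = 'Alice'"),
    LoggedUpdate("bal", "bal + 10", "cust = 'Bob'"),
    LoggedUpdate("bal", "bal - 1", "bal >= 40"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account_asof (cust TEXT, bal INTEGER);  -- stand-in for the past state
    INSERT INTO account_asof VALUES ('Alice', 50), ('Bob', 30);
    CREATE TABLE account AS SELECT * FROM account_asof;
""")

# Execute the transaction for real on the live copy ...
for op in log:
    conn.execute(f"UPDATE account SET {op.column} = {op.new_value} WHERE {op.condition}")
live = sorted(conn.execute("SELECT cust, bal FROM account").fetchall())

# ... and replay it as a single query over the past state.
sql = reenactment_query("account_asof", ["cust", "bal"], log)
replayed = sorted(conn.execute(sql).fetchall())
```

The third update depends on the balances written by the first two, which is why the steps are nested in log order rather than applied independently to the past state.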

Collaborators

Funding

  • Oracle - Provenance using temporal databases (Extension) (2016 - 2017), $95,829, PI: Boris Glavic
  • Oracle - Provenance using temporal databases (2015 - 2016), $85,000, PI: Boris Glavic

Publications

  1. Efficient Answering of Historical What-if Queries
    Felix Campbell, Bahareh Arab and Boris Glavic
    Proceedings of the 48th International Conference on Management of Data (SIGMOD) (2022), pp. 1556–1569.
  2. Provenance For Transactional Updates
    Bahareh Arab
    Illinois Institute of Technology.
  3. Using Reenactment to Retroactively Capture Provenance for Transactions
    Bahareh Arab, Dieter Gawlick, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
    IEEE Transactions on Knowledge and Data Engineering. 30, 3 (2018), 599–612.
  4. Answering Historical What-if Queries with Provenance, Reenactment, and Symbolic Execution
    Bahareh Arab and Boris Glavic
    Proceedings of the 8th USENIX Workshop on the Theory and Practice of Provenance (2017).
  5. Debugging Transactions and Tracking their Provenance with Reenactment
    Xing Niu, Boris Glavic, Seokki Lee, Bahareh Arab, Dieter Gawlick, Zhen Hua Liu, Vasudha Krishnaswamy, Su Feng and Xun Zou
    Proceedings of the VLDB Endowment (Demonstration Track). 10, 12 (2017), 1857–1860.
  6. Reenactment for Read-Committed Snapshot Isolation
    Bahareh Arab, Dieter Gawlick, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
    Proceedings of the 25th ACM International Conference on Information and Knowledge Management (2016), pp. 841–850.
  7. Formal Foundations of Reenactment and Transaction Provenance
    Bahareh Arab, Dieter Gawlick, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
    Technical Report #IIT/CS-DB-2016-01
    Illinois Institute of Technology.
  8. Reenactment for Read-Committed Snapshot Isolation (long version)
    Bahareh Arab, Dieter Gawlick, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
    Illinois Institute of Technology.
  9. A Generic Provenance Middleware for Database Queries, Updates, and Transactions
    Bahareh Arab, Dieter Gawlick, Venkatesh Radhakrishnan, Hao Guo and Boris Glavic
    Proceedings of the 6th USENIX Workshop on the Theory and Practice of Provenance (2014).
  10. Reenacting Transactions to Compute their Provenance
    Bahareh Arab, Dieter Gawlick, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
    Technical Report #IIT/CS-DB-2014-02
    Illinois Institute of Technology.