Other: HVR feature request - add property to Integrate action to be able to limit router file size

Comments

2 comments

  • Mark Van de Wiel User

    Hi Nora,

    A 179 GB transaction file is HUGE, and I can totally see that we could not process this.

    However, transactional consistency is foundational to the way HVR functions. Way back it was decided that the transaction file is the lowest level of granularity for propagating changes. A single transaction file always contains at least one transaction, and a single source transaction never spans multiple transaction files.
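The invariant described above (every file holds at least one whole transaction, and no transaction spans files) can be illustrated with a minimal sketch. This is not HVR's implementation: the `Transaction` and `TransactionFile` types, the `pack_transactions` function, and the `target_file_bytes` parameter are all hypothetical names chosen for illustration, and the source does not describe how HVR actually decides when to start a new file.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    txn_id: int
    size_bytes: int


@dataclass
class TransactionFile:
    transactions: list = field(default_factory=list)

    @property
    def size_bytes(self) -> int:
        return sum(t.size_bytes for t in self.transactions)


def pack_transactions(transactions, target_file_bytes):
    """Group whole transactions into files without ever splitting one.

    target_file_bytes is a soft target (an assumption for this sketch):
    a transaction larger than the target still lands in one file.
    """
    files = []
    current = TransactionFile()
    for txn in transactions:
        would_overflow = current.size_bytes + txn.size_bytes > target_file_bytes
        # Only close a file that already holds at least one transaction.
        if current.transactions and would_overflow:
            files.append(current)
            current = TransactionFile()
        current.transactions.append(txn)
    if current.transactions:
        files.append(current)
    return files
```

Under this model, a very large transaction simply produces one very large file, which is why a file-size limit cannot be enforced without giving up the invariant.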

    Your request is to change this. Such a change would have major implications for the core of how HVR works. In particular, we need to worry about failure scenarios. Imagine we had split this transaction into 179 files of 1 GB each, and the system crashed for whatever reason after writing 100 of them. How do we recover from that? Of course we can build this, but it has major implications.

    Likewise, on the destination side, HVR maintains transactional consistency by default for historical reasons. Because transaction files always contain one or more transactions, and transactions never span transaction files, the integrate "simply" processes one or more transaction files on every cycle. If we allow arbitrary splitting of transactions across files, then we have to build infrastructure to maintain consistency (and the ability to recover) within a new paradigm. This is a massive overhaul.

    Finally, there is a strong argument that even if we had split this massive transaction into multiple files, you would still not be happy. Processing a single 1 GB transaction file is quite a hefty process that can easily take 30 minutes or more. At that rate, going through 179 of these before we can proceed with other transactions adds up to nearly four days of processing.
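The estimate above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function name `estimated_processing_days` is made up for illustration, and the 30 minutes per 1 GB file is the rough lower bound quoted in the comment, not a measured figure.

```python
import math


def estimated_processing_days(total_gb, gb_per_file, minutes_per_file):
    """Estimate serial processing time, in days, for a split transaction."""
    # Number of files needed if the transaction were split evenly.
    file_count = math.ceil(total_gb / gb_per_file)
    total_minutes = file_count * minutes_per_file
    return total_minutes / (60 * 24)


days = estimated_processing_days(total_gb=179, gb_per_file=1, minutes_per_file=30)
print(f"{days:.1f} days")
```

With these inputs the result is 179 × 30 = 5,370 minutes, or roughly 3.7 days, and longer if a file takes more than 30 minutes.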

    Really, the best way to deal with a transaction that effectively replaces most if not all of a massive table's data is to refresh the table. This is much quicker. Also, with 6.2.5 we introduced isolated table refresh, which allows you to refresh a table without interrupting replication for the remaining tables in the channel. This makes the refresh less disruptive (and/or you no longer have to jump through hoops to keep a refresh from impacting replication).

    I hope you can understand that we do not plan to allow you to split a single transaction over multiple transaction files.

    Thank you,
    Mark.

  • Mark Van de Wiel User

    Note that, alternatively, if the source transaction could commit more frequently, then of course we can send changes through the channel in smaller increments.

    Mark.
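As an illustration of the "commit more frequently" suggestion, here is a minimal sketch of a bulk update that commits in batches instead of in one large transaction. It uses Python's built-in `sqlite3` module purely as a stand-in for the real source database. The `items` table and the `make_demo_table` and `update_in_batches` helpers are hypothetical, and the right batch size depends on the workload.

```python
import sqlite3


def make_demo_table(row_count):
    """Create an in-memory table with row_count rows (demo setup only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.executemany(
        "INSERT INTO items (id, value) VALUES (?, 0)",
        [(i,) for i in range(row_count)],
    )
    conn.commit()
    return conn


def update_in_batches(conn, rows, batch_size):
    """Apply (value, id) updates, committing after each batch.

    Returns the number of commits, i.e. the number of separate source
    transactions a replication tool would see.
    """
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany("UPDATE items SET value = ? WHERE id = ?", batch)
        conn.commit()
        commits += 1
    return commits
```

The trade-off is that the bulk change is no longer atomic on the source: if the job fails partway through, earlier batches stay committed, so the job needs to be safe to resume or rerun.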