Reverse tree building (with an odd number of children)

I just found out about the AWS Glacier service and wanted to write a small Python application to upload files via the REST API. I took a look at the required headers and stumbled upon x-amz-sha256-tree-hash . I need to calculate the SHA-256 hash of the entire file as well as the hash of the parent of all hashes of each 1 MB chunk. This leads to the following tree:

AWS的SHA-256树哈希程序

(Image taken from here)

I already made a function that reads 1 MB chunks and a class which calculates their hashes on-the-fly but then I completely struggle:

In my application I made a class called chunk which takes data and calculates a hash in the __init__ method as well as holds parent and children (like a regular tree). When the user opens a file those chunks instances will be generated properly with their respective hashes (in this example that would be 7 chunk instances).

Now I have two big problems that are connected to each other:

  • How do I build this tree in reverse? I basically need to make a new chunk for each two chunk instances on the lowest layer and calculate a hash based on those two hashes. But where do I store that parent? In the children of the parent and do reverse tree walking?
  • How does that work with an odd number of children? If I have an algorithm that goes through each parent layer then I would miss the last (0.5 MB) chunk.
  • I checked out this topic on SO but that method only works with an even children count which is not always given.

    Can you help me finding a way/an algorithm/an approach on how to solve this issue?

    Thanks in advance!

    Paul


    First calculate the number of levels, then

    def proclevel(levels):
        if levels > 0:
            generator = proclevel(levels - 1)
            temp = None
            for firsthash, secondhash in generator:
                if not temp: temp = hashofthem(firsthash, secondhash)
                else: yield temp, hashofthem(firsthash, secondhash); temp = None
            #If odd number of packets
            if temp: yield temp, None
        else:
            temp = None
            for chunk in chunks:
                if not temp: temp = hash(chunk)
                else: yield temp, hash(chunk); temp = None
            if temp: yield temp, None
    

    Make sure to handle None as second argument in hashofthem :)

    链接地址: http://www.djcxy.com/p/62834.html

    上一篇: Backbone JS承诺在模型上设置属性之前解决

    下一篇: 反向树木建设(奇数个孩子)