Hashing polymorphic type the proper way

2018-07-04 20:55:42

I have a hash process implemented using Howard Hinnant's method (generic hash based on hash_append overloads).

The purpose of that method is to create hash of classes in order to "memoize" result of computations (see end of this answer), so I am facing some issue. In particular, consider the following possible Input class that needs to be hashed:

struct A {
    virtual int do_stuff() const = 0;
    virtual ~A(); 
};
struct B: A {
    int do_stuff() const override { return 0; }
};
struct C: A {
    const int u;
    int do_stuff() const override { return u; }
};

struct Input {
    A const& a; // store a reference to an instance of B or C
};

Now, if I want to hash Input , I will have something like:

template <class HashAlgorithm>
void hash_append(HashAlgorithm& h, Input const& input) {
    hash_append(h, typeid(input));
    hash_append(h, typeid(input.a));
}

So I need an overload of hash_append for A :

template <class HashAlgorithm>
void hash_append(HashAlgorithm& h, A const& a) {
    hash_append(h, typeid(a)); 
}

The problem here is that depending on the runtime type of a , I would need to add extra information to the hash, eg for C I would need to add u .

I thought about the following solutions (and drawbacks):

add a virtual method to A that returns a specific value that can be added to the typeid() hash, but:

this means adding a method inside A that is not related to the purpose of A , thus I don't really like this idea (in particular because I have multiple A -like classes);

this breaks the concept of hash_append since the method will have a unique return type for all inheriting classes.

do a bunch of dynamic_cast inside hash_append :

I found this pretty ugly... in particular if I have multiple classes similar to A ;

this is error-prone: if someone adds a new children of A and do not add a dynamic_cast inside hash_append .

Is there a way to hash a polymorphic type, without having to modify the type itself or rely on a bunch of dynamic_cast ?

The final goal of this is to be able to memoize results of some heavy functions. Let's sketch the basic structure of my application:

struct Input { };
struct Result { };

Result solve(Input const&);

The solve function is computationally-heavy, so I want to save the results of previous computation in file using hash of Input s, eg something like:

// depends on hash_append
std::string hash(Input const&);

Result load_or_solve(Input const& input) {
    auto h = hash(input);
    Result result;
    if (exists(h)) { // if result exists, load it
        result = load(h);
    }
    else { // otherwize, solve + store
        result = solve(input);
        store(h, result);
    }
    return result;
}

The load and store methods would load and store results from files, the goal is to memoize solutions between different runs.

If you have suggestion on how to memoize these results without having to deal with the above issues, I'll be glad to read them.

You can use double dispatching within the hash_append version of A and forward the request to the proper version (that is the one either for B or C ). The drawback is that you must add boilerplate to those classes to accept a visitor and I cannot say if it's acceptable for you.
Here is a bunch of code that should illustrate the idea:

struct B;
struct C;

struct Visitor {
    virtual void visit(const B &) = 0;
    virtual void visit(const C &) = 0;
};

template<typename T, typename... O>
struct HashVisitor: T, HashVisitor<O...> {
    template<typename U>
    std::enable_if_t<std::is_same<T, U>::value> tryVisit(const U &u) {
        T::operator()(u);
    }

    template<typename U>
    std::enable_if_t<not std::is_same<T, U>::value> tryVisit(const U &u) {
        HashVisitor<O...>::visit(u);
    }

    void visit(const B &b) override { tryVisit<B>(b); }
    void visit(const C &c) override { tryVisit<C>(c); }
};

template<>
struct HashVisitor<>: Visitor {};

template<typename... F
auto factory(F&&... f) {
    return HashVisitor<std::decay_t<F>>{std::forward<F>(f)...};
}

struct A {
    virtual void accept(Visitor &) = 0;
    virtual int do_stuff() const = 0;
    virtual ~A();
};

struct B: A {
    void accept(Visitor &v) override { v.visit(*this); }
    int do_stuff() const override { return 0; }
};

struct C: A {
    const int u;
    void accept(Visitor &v) override { v.visit(*this); }
    int do_stuff() const override { return u; }
};

template <class HashAlgorithm>
void hash_append(HashAlgorithm &, const B &) {
    // do something
}

template <class HashAlgorithm>
void hash_append(HashAlgorithm &, const C &) {
    // do something
}

template <class HashAlgorithm>
void hash_append(HashAlgorithm &h, const A &a) {
    auto vis = factory(
        [&h](const B &b){ hash_append(h, b); },
        [&h](const C &c){ hash_append(h, c); }
    );

    a.accept(vis);
}

链接地址: http://www.djcxy.com/p/97084.html

上一篇: 在Ruby中从文件名动态创建自动加载命令

下一篇: 散列多态类型是正确的方式