Finding values in a table with variable keys

There is a table:

The key consists of 3 parts: region + s1 + s2.

The region (for example US) is always specified, but the other two parts may be left unspecified, in which case * is used to mean "all".

For example, for key = "US_A_U" the value is 2, because:

  • full match: look up key "US_A_U" in the table - not found
  • one step less strict: look up key "US_A_*" - found, value = 2

For key = "US_Q_Q" the value is 3, because:

  • full match: look up key "US_Q_Q" in the table - not found
  • one step less strict: look up key "US_Q_*" - not found
  • look up key "US_*_Q" - not found
  • one step less strict again: look up key "US_*_*" - found, value = 3

For key = "US_O_P" the value is 3, because:

  • full match: look up key "US_O_P" in the table - not found
  • one step less strict: look up key "US_O_*" - not found
  • look up key "US_*_P" - not found
  • one step less strict again: look up key "US_*_*" - found, value = 3

So with the HashMap approach I may need to call map.get() up to 4 times to find a value, which is too many because this code will run very often.

    Is there a nicer or faster solution?

    package test;
    
    import java.util.HashMap;
    
    public class MainCLass {
    
        public static void main(String[] args) {
            // init map (assuming this code will be run only once)
            HashMap<String, String> map = new HashMap<>();
            map.put("US_A_B", "1");
            map.put("US_A_*", "2");
            map.put("US_*_*", "3");
            map.put("US_O_O", "4");
            map.put("US_*_W", "5");
            map.put("ASIA_*_*", "6");
    
            // now often called logic
            // incoming params, for this example hardcoded
            String reg = "US";
            String s1 = "O";
            String s2 = "P";
            String val = null;
            val = map.get(reg+"_"+s1+"_"+s2);
            if (val == null){
                val = map.get(reg+"_"+s1+"_*");
                if (val == null){
                    val = map.get(reg+"_"+"*_"+s2);
                    if (val == null){
                        val = map.get(reg+"_*_*");
                    }
                }
            }
            System.out.println(val);
        }
    }
    

    Update: there are always 3 incoming parameters (region, s1, s2). None of them is ever "*" or empty, so the full key always looks like US_J_K (never US_*_K, etc.).

    So from these 3 parameters I need to find the right value in the initial table.


    You could try a tiered (nested) map such as:

    Map<String, Map<String, Map<String, String>>> map;
    

    In this map the first key is the region, the second key is s1, and the third key is s2. This lets you search for region, s1, and s2 independently, without building a combined key string.

    EDIT:

    Example usage with searching for "US_O_P"

    public static void main(String[] args) {
        RegionMap map = new RegionMap();
        String region = "US";
        String s1 = "O";
        String s2 = "P";
        String val = map.search(region, s1, s2);
        System.out.println(val);
    }
    
    public class RegionMap {
        private Map<String, Map<String, Map<String, String>>> regionMap;
    
        public RegionMap() {
            init();
        }
    
        public String search(String region, String s1, String s2) {
            String val = searchS1(regionMap.get(region), s1, s2);
            if (val == null) {
                val = searchS1(regionMap.get("*"), s1, s2);
            }
            return val;
        }
    
        private String searchS1(Map<String, Map<String, String>> s1Map, String s1, String s2) {
            if (s1Map == null) {
                return null;
            }
            String val = searchS2(s1Map.get(s1), s2);
            if (val == null) {
                val = searchS2(s1Map.get("*"), s2);
            }
            return val;
        }
    
        private String searchS2(Map<String, String> s2Map, String s2) {
            if (s2Map == null) {
                return null;
            }
            String val = s2Map.get(s2);
            if (val == null) {
                val = s2Map.get("*");
            }
            return val;
        }
    
        private void init() {
            regionMap = new HashMap<>();
            addEntry("US", "A", "B", "1");
            addEntry("US", "A", "*", "2");
            addEntry("US", "*", "*", "3");
            addEntry("US", "O", "O", "4");
            addEntry("US", "*", "W", "5");
            addEntry("ASIA", "*", "*", "6");
        }
    
        private void addEntry(String region, String s1, String s2, String value) {
            // Create the intermediate maps on first use, then store the value.
            regionMap.computeIfAbsent(region, r -> new HashMap<>())
                     .computeIfAbsent(s1, s -> new HashMap<>())
                     .put(s2, value);
        }
    }
    

    EDIT: Benchmark results

    I ran tests for searching for "US_O_P" multiple times and found the following results for 1,000,000,000 searches

    Original: 9.7334702479 seconds
    Tiered: 2.471287074 seconds
    

    The following is the benchmark code

    public class RegionMapOrig {
        private Map<String, String> map;
    
        public RegionMapOrig() {
            init();
        }
    
        private void init() {
            map = new HashMap<>();
            map.put("US_A_B", "1");
            map.put("US_A_*", "2");
            map.put("US_*_*", "3");
            map.put("US_O_O", "4");
            map.put("US_*_W", "5");
            map.put("ASIA_*_*", "6");
        }
    
        public String search(String reg, String s1, String s2) {
            String val = null;
            val = map.get(reg + "_" + s1 + "_" + s2);
            if (val == null) {
                val = map.get(reg + "_" + s1 + "_*");
                if (val == null) {
                    val = map.get(reg + "_" + "*_" + s2);
                    if (val == null) {
                        val = map.get(reg + "_*_*");
                    }
                }
            }
            return val;
        }
    }
    
    private static final int N = 1000000000;
    
    public static void main(String[] args) {
        String region = "US";
        String s1 = "O";
        String s2 = "P";
    
        testOrig(region, s1, s2);
        test(region, s1, s2);
    }
    
    private static void testOrig(String region, String s1, String s2) {
        RegionMapOrig map = new RegionMapOrig();
    
        long start = System.nanoTime();
    
        for (int i = 0; i < N; ++i) {
            String val = map.search(region, s1, s2);
        }
    
        long end = System.nanoTime();
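        // Convert nanoseconds to seconds. Use 1E9 here: 10E9 is 1e10 and would under-report tenfold.
        System.out.println((end - start) / 1E9);
    }
    
    private static void test(String region, String s1, String s2) {
        RegionMap map = new RegionMap();
    
        long start = System.nanoTime();
    
        for (int i = 0; i < N; ++i) {
            String val = map.search(region, s1, s2);
        }
    
        long end = System.nanoTime();
        // Convert nanoseconds to seconds. Use 1E9 here: 10E9 is 1e10 and would under-report tenfold.
        System.out.println((end - start) / 1E9);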
    }
    

    Running this code multiple times has yielded the same results. However, this benchmark is simple and may not be definitive. To truly test performance you will need to measure with a real data set that represents your typical values. I believe your performance issue may lie in the string concatenation rather than in the number of calls to the map. Another reason mine may have performed better is that my inner maps may stay cached, making repeated retrievals faster.
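One caveat with a timing loop like the one above is that the looked-up value is never used, so the JIT may optimize part of the work away, and there is no warm-up phase. Below is a minimal sketch of a slightly more careful hand-rolled loop. The names `LookupBench`, `time`, and `sink` are made up for illustration, and a dedicated harness such as JMH would likely be more reliable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not part of the original benchmark.
final class LookupBench {
    // Updated on every iteration so the looked-up value is actually consumed.
    static int sink;

    // Runs `lookup` `iterations` times and returns the elapsed time in seconds.
    static double time(Supplier<String> lookup, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            String val = lookup.get();
            sink += (val == null) ? 0 : val.length();
        }
        return (System.nanoTime() - start) / 1e9; // 1e9 ns per second
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("US_*_*", "3");
        String region = "US";
        // The concatenation happens inside the lookup, as in the original code.
        Supplier<String> lookup = () -> map.get(region + "_*_*");

        time(lookup, 1_000_000);                   // warm-up, result discarded
        double seconds = time(lookup, 10_000_000); // measured run
        System.out.println(seconds + " s (sink=" + sink + ")");
    }
}
```

Printing `sink` at the end is just a cheap way to keep the result observable. It does not make the numbers authoritative.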

    EDIT: Benchmark update

    After further investigation, removing the string concatenation improved your original code, giving these results:

    Original (no concatenation): 1.2068575417 seconds
    Tiered: 2.2982665873 seconds
    

    The code changes are:

    public String searchNoCat(String cache1, String cache2, String cache3,  String cache4) {
        String val = null;
        val = map.get(cache1);
        if (val == null) {
            val = map.get(cache2);
            if (val == null) {
                val = map.get(cache3);
                if (val == null) {
                    val = map.get(cache4);
                }
            }
        }
        return val;
    }
    
    private static void testOrigNoCat(String region, String s1, String s2) {
        RegionMapOrig map = new RegionMapOrig();
    
        String cache1 = region + "_" + s1 + "_" + s2;
        String cache2 = region + "_" + s1 + "_*";
        String cache3 = region + "_" + "*_" + s2;
        String cache4 = region + "_*_*";
    
        long start = System.nanoTime();
    
        for (int i = 0; i < N; ++i) {
            String val = map.searchNoCat(cache1, cache2, cache3, cache4);
        }
    
        long end = System.nanoTime();
        System.out.println((end - start) / 1E9); // 1E9 ns per second (10E9 is 1e10 and would under-report tenfold)
    }
    

    However, the question remains of how to cache such keys efficiently, or otherwise reduce the number of concatenations, for generic input. I do not know of an efficient way to do this. Therefore, I think the tiered map is an efficient solution that avoids the concatenation problem.
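One possible way to sidestep the concatenation while keeping a single flat map is a composite key object with its own `equals` and `hashCode`. The sketch below assumes that allocating a small key object per probe is cheaper than building a string, which has not been measured here. `Key` and `FlatLookup` are hypothetical names, not taken from the code above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical composite key: three fields instead of one concatenated string.
final class Key {
    final String region, s1, s2;

    Key(String region, String s1, String s2) {
        this.region = region;
        this.s1 = s1;
        this.s2 = s2;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Key)) return false;
        Key k = (Key) o;
        return region.equals(k.region) && s1.equals(k.s1) && s2.equals(k.s2);
    }

    @Override
    public int hashCode() {
        // String caches its own hash, so this is a few multiplications.
        return 31 * (31 * region.hashCode() + s1.hashCode()) + s2.hashCode();
    }
}

final class FlatLookup {
    private final Map<Key, String> map = new HashMap<>();

    void put(String region, String s1, String s2, String value) {
        map.put(new Key(region, s1, s2), value);
    }

    // Same fallback order as the question: exact, s1_*, *_s2, *_*.
    String search(String region, String s1, String s2) {
        String val = map.get(new Key(region, s1, s2));
        if (val == null) val = map.get(new Key(region, s1, "*"));
        if (val == null) val = map.get(new Key(region, "*", s2));
        if (val == null) val = map.get(new Key(region, "*", "*"));
        return val;
    }

    public static void main(String[] args) {
        FlatLookup table = new FlatLookup();
        table.put("US", "A", "B", "1");
        table.put("US", "A", "*", "2");
        table.put("US", "*", "*", "3");
        table.put("US", "O", "O", "4");
        table.put("US", "*", "W", "5");
        table.put("ASIA", "*", "*", "6");
        System.out.println(table.search("US", "O", "P")); // prints 3
    }
}
```

Whether this beats the tiered map would have to be checked with the same benchmark and realistic data.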


    It looks like you need a tree structure to encapsulate the wildcard ("*") replacement logic when searching for a value.

    First I wrote some unit tests to describe the expected behaviour:

    import static org.junit.Assert.*;
    
    import org.junit.Before;
    import org.junit.Test;
    
    public class WildcardSearchSpec {
        private Node root;
    
        @Before
        public void before() {
            root = new WildcardSearch();
            root.add("US_A_B", "1");
            root.add("US_A_*", "2");
            root.add("US_*_*", "3");
            root.add("US_O_O", "4");
            root.add("US_*_W", "5");
            root.add("ASIA_*_*", "6");
        }
    
        @Test
        public void itShouldReturnFullWildcardCorrespondingValue() {
            String key = "US_Q_Q";
    
            String value = root.value(key);
    
            assertEquals("3", value);
        }
    
        @Test
        public void itShouldReturnNoWildcardCorrespondingValue() {
            String key = "US_A_B";
    
            String value = root.value(key);
    
            assertEquals("1", value);
        }
    
        @Test
        public void itShouldReturnS2WildcardCorrespondingValue() {
            String key = "US_A_U";
    
            String value = root.value(key);
    
            assertEquals("2", value);
        }
    
        @Test
        public void itShouldReturnS1WildcardCorrespondingValue() {
            String key = "US_W_W";
    
            String value = root.value(key);
    
            assertEquals("5", value);
        }
    
        @Test(expected=NoValueException.class)
        public void itShouldThrowWhenNoCorrespondingValue() {
            String key = "EU_A_B";
    
            root.value(key);
    
            fail();
        }
    }
    

    The interface one can extract from these tests is the following

    public interface Node {
        void add(String key, String value);
        String value(String key);
    }
    

    Which is implemented by WildcardSearch

    import java.util.HashMap;
    import java.util.Map;
    
    public final class WildcardSearch implements Node {
        private final Map<String, CountrySearch> children = new HashMap<>();
    
        @Override
        public void add(String key, String value) {
            String country = key.split("_")[0];
            String rest = key.substring(country.length() + 1);
    
            children.putIfAbsent(country, new CountrySearch());
            children.get(country).add(rest, value);
        }
    
        @Override
        public String value(String key) {
            String country = key.split("_")[0];
            String rest = key.substring(country.length() + 1);
    
            // Delegate to the country node when it exists; otherwise there is no value.
            if (children.containsKey(country)) {
                return children.get(country).value(rest);
            } else {
                throw new NoValueException();
            }
        }
    }
    

    WildcardSearch uses CountrySearch to delegate the search in each country.

    import java.util.HashMap;
    import java.util.Map;
    
    final class CountrySearch implements Node {
        private final Map<String, SuffixeSearch> children = new HashMap<>();
    
        @Override
        public void add(String key, String value) {
            String[] splittedKey = key.split("_");
            String s1 = splittedKey[0];
            String s2 = splittedKey[1];
            children.putIfAbsent(s1, new SuffixeSearch());
            children.get(s1).add(s2, value);
        }
    
        @Override
        public String value(String key) {
            String[] splittedKey = key.split("_");
            String s1 = splittedKey[0];
            String s2 = splittedKey[1];
    
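            // Prefer the exact s1 branch, but fall back to "*" when it has no match for s2
            // (e.g. "US_O_P" must still resolve through "US_*_*").
            if (children.containsKey(s1)) {
                try {
                    return children.get(s1).value(s2);
                } catch (NoValueException e) {
                    // no match under the exact s1; try the wildcard branch below
                }
            }
            if (children.containsKey("*")) {
                return children.get("*").value(s2);
            }
            throw new NoValueException();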
        }
    }
    

    CountrySearch uses SuffixeSearch to delegate the search in the suffixes.

    import java.util.HashMap;
    import java.util.Map;
    
    final class SuffixeSearch implements Node {
        private final Map<String, String> children = new HashMap<>();
    
        public void add(String key, String value) {
            children.put(key, value);
        }
    
        @Override
        public String value(String key) {
            if (children.containsKey(key)) {
                return children.get(key);
            } else if (children.containsKey("*")) {
                return children.get("*");
            } else {
                throw new NoValueException();
            }
        }
    }
    

    Note: NoValueException is a custom RuntimeException.
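The answer does not show that class, so the following is only an assumed minimal definition that would let the code above compile. The message text is made up, and the class is left package-private here purely to keep the sketch in one file.

```java
// Assumed definition; the original answer does not include this class.
// Unchecked, so Node.value(...) needs no throws clause.
class NoValueException extends RuntimeException {
    NoValueException() {
        super("no value found for the requested key");
    }
}
```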

    The point is that each responsibility is clearly separated.

    SuffixeSearch can only return the value for the given key or the value stored under "*". It knows nothing about how the overall key is structured, or that the values are grouped by country, etc.

    CountrySearch only knows about its level, delegating the rest to SuffixeSearch or ignoring what is above.

    WildcardSearch only knows about splitting in country and delegates to CountrySearch the responsibility to do the wildcard magic.


    The best and most general solution would be to use a search tree, which you could implement yourself fairly easily and which is a good programming exercise as well. There are also lots of tutorials and examples around on how to implement one.
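As a rough illustration of what such a tree could look like, here is a sketch that handles keys with any number of "_"-separated segments and backs off to the "*" branch at each level. `WildcardTree` and its methods are hypothetical names. Unlike the question, it would also honour a "*" in the region position if one were ever stored.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a search tree keyed by "_"-separated segments.
final class WildcardTree {
    private static final String WILDCARD = "*";

    private final Map<String, WildcardTree> children = new HashMap<>();
    private String value; // set only on nodes where a stored key ends

    void add(String key, String value) {
        WildcardTree node = this;
        for (String part : key.split("_")) {
            node = node.children.computeIfAbsent(part, p -> new WildcardTree());
        }
        node.value = value;
    }

    // Returns the most specific match, or null when nothing matches.
    String find(String key) {
        return find(key.split("_"), 0);
    }

    private String find(String[] parts, int depth) {
        if (depth == parts.length) {
            return value;
        }
        // Try the exact segment first, then back off to the wildcard branch.
        WildcardTree exact = children.get(parts[depth]);
        String found = (exact == null) ? null : exact.find(parts, depth + 1);
        if (found == null) {
            WildcardTree any = children.get(WILDCARD);
            if (any != null) {
                found = any.find(parts, depth + 1);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        WildcardTree tree = new WildcardTree();
        tree.add("US_A_B", "1");
        tree.add("US_A_*", "2");
        tree.add("US_*_*", "3");
        tree.add("US_O_O", "4");
        tree.add("US_*_W", "5");
        tree.add("ASIA_*_*", "6");
        System.out.println(tree.find("US_O_P")); // prints 3
        System.out.println(tree.find("US_W_W")); // prints 5
        System.out.println(tree.find("EU_A_B")); // prints null
    }
}
```

The backtracking in `find` is what lets "US_O_P" fall through the exact "O" branch, which only holds "O", and still reach "US_*_*".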

    For your specific use case you could use cascading Maps, as DragonAssassin already posted, which builds on what Java already offers.
