In-memory autocomplete implementation for a small static dataset in Java


I am trying to implement autocomplete for a small amount of static data (5K records).

I thought of using a prefix tree (trie) to support this kind of query:

import java.util.ArrayList;
import java.util.List;

class AutocompleteSearch {
    // A stored sentence together with the number of times it has been entered.
    static class Entry {
        String sentence;
        int times;

        Entry(String sentence, int times) {
            this.sentence = sentence;
            this.times = times;
        }
    }

    static class TrieNode {
        TrieNode[] children;
        int times; // > 0 only on nodes that end a stored sentence

        TrieNode() {
            // Slots 0-25 hold 'a'-'z'; slot 26 holds the space character.
            children = new TrieNode[27];
            times = 0;
        }
    }

    private TrieNode root;
    private TrieNode previous; // node reached by the query typed so far
    private String query;

    public AutocompleteSearch(String[] sentences, int[] times) {
        root = new TrieNode();
        query = "";

        for (int i = 0; i < sentences.length; i++) {
            insert(sentences[i], times[i]);
        }
    }

    // Feeds one typed character; '#' ends the current query and stores it.
    public List<String> input(char c) {
        List<String> result = new ArrayList<>();
        if (c == '#') {
            insert(query, 1);
            previous = null;
            query = "";
            return result;
        }

        query += c;
        List<Entry> history = lookup(c);
        // Most frequent first; ties are broken alphabetically.
        history.sort((a, b) -> {
            if (a.times == b.times) {
                return a.sentence.compareTo(b.sentence);
            }
            return Integer.compare(b.times, a.times);
        });
        for (int i = 0; i < Math.min(history.size(), 3); i++) {
            result.add(history.get(i).sentence);
        }
        return result;
    }

    // Assumes sentences contain only lowercase 'a'-'z' and spaces.
    private void insert(String sentence, int times) {
        TrieNode current = root;
        for (char c : sentence.toCharArray()) {
            int index = c == ' ' ? 26 : c - 'a';
            if (current.children[index] == null) {
                current.children[index] = new TrieNode();
            }
            current = current.children[index];
        }
        current.times += times;
    }

    private List<Entry> lookup(char c) {
        List<Entry> history = new ArrayList<>();
        // An earlier character already failed to match, so nothing can match now.
        if (previous == null && query.length() > 1) {
            return history;
        }

        TrieNode current = previous == null ? root : previous;
        int index = c == ' ' ? 26 : c - 'a';
        if (current.children[index] == null) {
            previous = null;
            return history;
        }

        previous = current.children[index];
        traverse(query, previous, history);
        return history;
    }

    // Collects every stored sentence in the subtree rooted at node.
    private void traverse(String s, TrieNode node, List<Entry> history) {
        if (node.times > 0) {
            history.add(new Entry(s, node.times));
        }

        for (int i = 0; i < 27; i++) {
            if (node.children[i] != null) {
                String next = i == 26 ? s + ' ' : s + (char) ('a' + i);
                traverse(next, node.children[i], history);
            }
        }
    }
}

The problem is that I want to extend the autocomplete solution on full_name to support other fields. For example, say I have records such as:

Student
id:
full_name:
gender:
subject_enrolled:

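I want not only autocomplete search but also filtering on subject_enrolled or other fields. I thought of using an inverted index that maps each field value to the list of matching student IDs, and then intersecting those IDs with the autocomplete results, but that is less efficient than autocomplete alone.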
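The inverted-index idea described above could be sketched roughly as follows. This is only an illustration under assumptions, not a tuned implementation: the class name StudentIndex, the nested Student type, and the complete method are hypothetical, a TreeMap range scan stands in for the trie (a swapped-in technique that gives the same prefix matches), and filters are equality-only. Each field value maps to a BitSet of record positions, so combining filters is a bitwise AND and checking a candidate is a single bit lookup.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class StudentIndex {
    // Hypothetical record type mirroring the fields listed in the question.
    static final class Student {
        final int id;
        final String fullName, gender, subjectEnrolled;

        Student(int id, String fullName, String gender, String subjectEnrolled) {
            this.id = id;
            this.fullName = fullName;
            this.gender = gender;
            this.subjectEnrolled = subjectEnrolled;
        }
    }

    private final List<Student> students = new ArrayList<>();
    // Lower-cased full name -> positions in `students`; a prefix query is a range scan.
    private final TreeMap<String, List<Integer>> byName = new TreeMap<>();
    // Inverted index: "field=value" -> bitmap of positions in `students`.
    private final Map<String, BitSet> inverted = new HashMap<>();

    void add(Student s) {
        int pos = students.size();
        students.add(s);
        byName.computeIfAbsent(s.fullName.toLowerCase(), k -> new ArrayList<>()).add(pos);
        index("gender", s.gender, pos);
        index("subject_enrolled", s.subjectEnrolled, pos);
    }

    private void index(String field, String value, int pos) {
        inverted.computeIfAbsent(field + "=" + value, k -> new BitSet()).set(pos);
    }

    // Returns up to `limit` full names starting with `prefix` that satisfy every filter.
    List<String> complete(String prefix, Map<String, String> filters, int limit) {
        List<String> out = new ArrayList<>();
        if (limit <= 0) {
            return out;
        }
        BitSet allowed = null; // null means "no filter applied"
        for (Map.Entry<String, String> f : filters.entrySet()) {
            BitSet bits = inverted.get(f.getKey() + "=" + f.getValue());
            if (bits == null) {
                return out; // no record has this field value
            }
            if (allowed == null) {
                allowed = (BitSet) bits.clone();
            } else {
                allowed.and(bits);
            }
        }
        String lo = prefix.toLowerCase();
        String hi = lo + Character.MAX_VALUE; // upper bound of the prefix range
        for (List<Integer> positions : byName.subMap(lo, true, hi, false).values()) {
            for (int pos : positions) {
                if (allowed == null || allowed.get(pos)) {
                    out.add(students.get(pos).fullName);
                    if (out.size() == limit) {
                        return out;
                    }
                }
            }
        }
        return out;
    }
}
```

With 5K records each BitSet is well under 1 KB, so even an index with many distinct field values stays small. Results here come back in alphabetical order of the lower-cased name; ranking by frequency as in the trie version would need a sort over the matches.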

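Does anyone have suggestions for data structures or, more specifically, libraries I could use? I am looking for an in-memory alternative to MongoDB / Elasticsearch with similar query support.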