Converting a string to bits and writing it to a file in Java

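I wrote a program for Huffman compression, and I can now take the string I read in and convert it to its Huffman code. For example, for the input string "abcdeffg" I get the string "1001010000010101111011". My problem is that I have to write this string to a file and save it. I know how to do file operations, but I don't know how to write this string as bits. The string "abcdeffg" that I read from the file is 8 bytes, while the string "1001010000010101111011" that I create is 22 bytes, because each '0' or '1' character takes a full byte. So if I write it to the file as is, nothing is compressed. How can I write this string of 0s and 1s to a file so that the Huffman compression actually works? The code I have written so far is below.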

HuffmanEncodedResult class:

class HuffmanEncodedResult {
    final Node root;
    final String encodedData;

    public HuffmanEncodedResult(final String encodedData,final Node root) {
        this.encodedData = encodedData;
        this.root = root;
    }

    public Node getRoot() {
        return this.root;
    }

    public String getEncodedData() {
        return this.encodedData;
    }
}

HuffmanEncoder class:

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class HuffmanEncoder {

    private static final int LETTER_SIZE = 256;

    public HuffmanEncodedResult compress(final String data){
        final int[] freqTable = buildFreqTable(data);
        final Node root = buildHuffmanTree(freqTable);
        final Map<Character,String> lookupTable = buildLookupTable(root);

        return new HuffmanEncodedResult(generateEncodedData(data,lookupTable),root);
    }

    private static String generateEncodedData(String data,Map<Character,String> lookupTable) {
        final StringBuilder builder = new StringBuilder();
        for(final char character : data.toCharArray()){
            builder.append(lookupTable.get(character));
        }
        return builder.toString();
    }

    private static Map<Character,String> buildLookupTable(final Node root){
        final Map<Character,String> lookupTable = new HashMap<>();
        buildLookupTableImpl(root,"",lookupTable);

        return lookupTable;
    }

    private static void buildLookupTableImpl(Node node,String s,Map<Character,String> lookupTable) {
        if(!node.isLeaf()){
            buildLookupTableImpl(node.leftChild,s + '0',lookupTable);
            buildLookupTableImpl(node.rightChild,s + '1',lookupTable);
        }else{
            lookupTable.put(node.character,s);
        }
    }

    private static Node buildHuffmanTree(int[] freq){
        final PriorityQueue<Node> pq = new PriorityQueue<>();
        for(char i = 0; i<LETTER_SIZE; i++){
            if(freq[i] > 0){
                pq.add(new Node(i,freq[i],null,null));
            }
        }

        if(pq.size() == 1){
            // Add a dummy leaf so that a single-symbol input still gets a one-bit code.
            pq.add(new Node('\0',1,null,null));
        }

        while (pq.size() > 1){
            final Node left = pq.poll();
            final Node right = pq.poll();
            assert right != null;
            assert left != null;
            final Node parent = new Node('\0',left.frequency + right.frequency,left,right);
            pq.add(parent);
        }
        return pq.poll();
    }

    private static int[] buildFreqTable(final String data){
        final int[] freq = new int[LETTER_SIZE];
        for(final char character : data.toCharArray()){
            freq[character]++;
        }

        return freq;
    }

    public String decompress(HuffmanEncodedResult result){

        final StringBuilder resultBuilder = new StringBuilder();
        Node current = result.getRoot();
        int i = 0;

        while(i<result.getEncodedData().length()){
            while(!current.isLeaf()){
                char bit = result.getEncodedData().charAt(i);
                if(bit == '1'){
                    current = current.rightChild;
                }else if(bit == '0'){
                    current = current.leftChild;
                }else{
                    throw new IllegalArgumentException("Invalid Bit in Message! " + bit);
                }
                i++;
            }
            resultBuilder.append(current.character);
            current = result.getRoot();
        }

        return resultBuilder.toString();
    }

    private byte[] toByteArray(String input){
        //to chararray
        char[] preBitChars = input.toCharArray();
        int bitShortage = (8 - (preBitChars.length%8));
        char[] bitChars = new char[preBitChars.length + bitShortage];
        System.arraycopy(preBitChars,0,bitChars,0,preBitChars.length);

        for (int  i= 0;  i < bitShortage;  i++) {
            bitChars[preBitChars.length + i]='0';
        }

        //to bytearray
        byte[] byteArray = new byte[bitChars.length/8];
        for(int i=0; i<bitChars.length; i++) {
            if (bitChars[i]=='1'){
                byteArray[byteArray.length - (i/8) - 1] |= 1<<(i%8);
            }
        }
        return byteArray;
    }

    public static void main(String[] args) {
        final String test = "abcdeffg";
        final HuffmanEncoder encoder = new HuffmanEncoder();
        final HuffmanEncodedResult result = encoder.compress(test);
        System.out.println("Encoded Message: " + result.encodedData);
        System.out.println("Unencoded Message: " + encoder.decompress(result));
    }
}

Node class:

public class Node implements Comparable<Node>{
    protected final char character;
    protected final int frequency;
    protected final Node rightChild;
    protected final Node leftChild;

    public Node(char character,int frequency,Node leftChild,Node rightChild) {
        this.character = character;
        this.frequency = frequency;
        this.rightChild = rightChild;
        this.leftChild = leftChild;
    }

    public boolean isLeaf(){
        return this.leftChild == null && this.rightChild == null;
    }

    @Override
    public int compareTo(Node o) {
        final int frequencyComparison = Integer.compare(this.frequency,o.frequency);
        if(frequencyComparison != 0){
            return frequencyComparison;
        }
        return Integer.compare(this.character,o.character);
    }
}

Console output

"C:\Program Files\Java\jdk-15.0.1\bin\java.exe" "-javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2020.3.1\lib\idea_rt.jar=62965:C:\Program Files\JetBrains\IntelliJ IDEA 2020.3.1\bin" -Dfile.encoding=UTF-8 -classpath C:\Users\ayber\Desktop\untitled\out\production\untitled HuffmanEncoder
Encoded Message: 1001010000010101111011
Unencoded Message: abcdeffg

Process finished with exit code 0

Solution

You need to use FileOutputStream.write to write a byte array, where the byte array holds the binary representation of your string of 1s and 0s.
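
To make the idea of a "binary representation" concrete, here is a minimal sketch that packs the character string into bytes by hand, eight characters per byte, most significant bit first. The class name BitStringPacker and the method pack are hypothetical names chosen for illustration; they are not part of the question's code.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration only.
final class BitStringPacker {

    /** Packs a string of '0'/'1' characters into bytes, most significant bit first. */
    static byte[] pack(String bits) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int current = 0; // bits collected so far for the byte being built
        int count = 0;   // number of bits currently held in 'current'
        for (int i = 0; i < bits.length(); i++) {
            char c = bits.charAt(i);
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException("Invalid bit: " + c);
            }
            current = (current << 1) | (c - '0');
            if (++count == 8) {
                out.write(current); // write(int) stores the low 8 bits
                current = 0;
                count = 0;
            }
        }
        if (count > 0) {
            // Pad the final, partial byte with zero bits on the right.
            out.write(current << (8 - count));
        }
        return out.toByteArray();
    }
}
```

For the 22-character string from the question this yields 3 bytes (0x94, 0x15, 0xEC) instead of 22. The last byte carries 2 padding bits, so a decoder also needs to know that only 22 bits are meaningful.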

One way to convert the string to bits is to use a BitSet.

To create a BitSet of the right size, use:

BitSet b = new BitSet(string.length());

Then you can iterate over the string and set the bits:

for (int i = 0; i < string.length(); i++) {
    if (string.charAt(i) == '1') {
        b.set(i);
    }
}

Then you can convert the bit set to a byte array and write it to a file:

byte[] bytes = b.toByteArray();
// try-with-resources closes the stream even if write() throws
try (FileOutputStream f = new FileOutputStream(filename)) {
    f.write(bytes);
}
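
One caveat with the BitSet route: BitSet.toByteArray() only covers bits up to the highest set bit, so trailing '0' bits are not represented in the output. A decoder therefore cannot tell how long the original bit string was unless the length is stored as well. The following is a minimal sketch of a round trip that writes the bit count as a 4-byte header before the data. The class name BitSetFileIO, its method names, and the header layout are assumptions made for illustration, not a standard file format.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.BitSet;

// Hypothetical helper for illustration only.
final class BitSetFileIO {

    /** Writes the bit count as an int header, followed by the packed bits. */
    static void write(String bits, Path file) throws IOException {
        BitSet set = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                set.set(i);
            }
        }
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeInt(bits.length());  // needed to restore trailing zero bits
            out.write(set.toByteArray());
        }
    }

    /** Reads the header and the packed bits back into a '0'/'1' string. */
    static String read(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            int bitCount = in.readInt();
            BitSet set = BitSet.valueOf(in.readAllBytes()); // readAllBytes needs Java 9+
            StringBuilder sb = new StringBuilder(bitCount);
            for (int i = 0; i < bitCount; i++) {
                sb.append(set.get(i) ? '1' : '0');
            }
            return sb.toString();
        }
    }
}
```

With the question's 22-bit string the file is 7 bytes: 4 for the header and 3 for the data. On an input this short the header (and the Huffman tree, which a real file would also need to store) outweighs the savings; the gain shows up on longer inputs.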

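However, I would note that creating a string, then converting it to a bit set, and then converting that to a byte array is not very efficient. It would probably be better to use a BitSet from the start.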
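
As a rough sketch of what "using a BitSet from the start" could look like, the encoder can set bits directly while walking the input, without building the long intermediate string. The names DirectBitSetEncoder and Encoded are hypothetical; the lookupTable parameter is assumed to have the same shape as the Map<Character,String> that buildLookupTable produces in the question.

```java
import java.util.BitSet;
import java.util.Map;

// Hypothetical helper for illustration only.
final class DirectBitSetEncoder {

    /** The encoded bits together with the number of meaningful bits. */
    static final class Encoded {
        final BitSet bits;
        final int bitCount;

        Encoded(BitSet bits, int bitCount) {
            this.bits = bits;
            this.bitCount = bitCount;
        }
    }

    /** Encodes data bit by bit using per-character codes made of '0'/'1'. */
    static Encoded encode(String data, Map<Character, String> lookupTable) {
        BitSet bits = new BitSet();
        int pos = 0; // index of the next bit to write
        for (char c : data.toCharArray()) {
            String code = lookupTable.get(c);
            if (code == null) {
                throw new IllegalArgumentException("No code for character: " + c);
            }
            for (int i = 0; i < code.length(); i++) {
                if (code.charAt(i) == '1') {
                    bits.set(pos);
                }
                pos++; // '0' bits only advance the position
            }
        }
        return new Encoded(bits, pos);
    }
}
```

The returned bitCount matters for the same reason as before: the BitSet alone does not record trailing zero bits, so the count has to travel with the data.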
