如何解决将字符串转换为位并将其写入Java文件中
我编写了一个关于 Huffman 压缩的程序,最终我可以操作读取的字符串并将其更改为特定于 Huffman。例如,对于输入的“abceffg”字符串,我可以获得“1001010000010101111011”字符串。我的问题是:我必须将这个字符串写入文件并保存。我也知道文件操作,但我不知道如何以位为单位编写这个字符串。我现在从文件中读取的字符串“abcdeffg”是 8 个字节,而我创建的字符串“1001010000010101111011”是 22 个字节。因此,如果我将其写入文件,则不会进行压缩。如何将这个 0 和 1 的字符串写入文件以使霍夫曼压缩工作?到目前为止我写的代码如下;
HuffmanEncodedResult 类:
class HuffmanEncodedResult {
final Node root;
final String encodedData;
public HuffmanEncodedResult(final String encodedData,final Node root) {
this.encodedData = encodedData;
this.root = root;
}
public Node getRoot() {
return this.root;
}
public String getEncodedData() {
return this.encodedData;
}
}
HuffmanEncoder 类:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;
public class HuffmanEncoder {
private static final int LETTER_SIZE = 256;
public HuffmanEncodedResult compress(final String data){
final int[] freqTable = buildFreqTable(data);
final Node root = buildHuffmanTree(freqTable);
final Map<Character,String> lookupTable = buildLookupTable(root);
return new HuffmanEncodedResult(generateEncodedData(data,lookupTable),root);
}
private static String generateEncodedData(String data,Map<Character,String> lookupTable) {
final StringBuilder builder = new StringBuilder();
for(final char character : data.tochararray()){
builder.append(lookupTable.get(character));
}
return builder.toString();
}
private static Map<Character,String> buildLookupTable(final Node root){
final Map<Character,String> lookupTable = new HashMap<>();
buildLookupTableImpl(root,"",lookupTable);
return lookupTable;
}
private static void buildLookupTableImpl(Node node,String s,String> lookupTable) {
if(!node.isLeaf()){
buildLookupTableImpl(node.leftChild,s + '0',lookupTable);
buildLookupTableImpl(node.rightChild,s + '1',lookupTable);
}else{
lookupTable.put(node.character,s);
}
}
private static Node buildHuffmanTree(int[] freq){
final PriorityQueue<Node> pq = new PriorityQueue<>();
for(char i = 0; i<LETTER_SIZE; i++){
if(freq[i] > 0){
pq.add(new Node(i,freq[i],null,null));
}
}
if(pq.size() == 1){
pq.add(new Node('\0',null));
}
while (pq.size() > 1){
final Node left = pq.poll();
final Node right = pq.poll();
assert right != null;
assert left != null;
final Node parent = new Node('\0',left.frequency + right.frequency,left,right);
pq.add(parent);
}
return pq.poll();
}
private static int[] buildFreqTable(final String data){
final int[] freq = new int[LETTER_SIZE];
for(final char character :data.tochararray()){
freq[character]++;
}
return freq;
}
public String decompress(HuffmanEncodedResult result){
final StringBuilder resultBuilder = new StringBuilder();
Node current = result.getRoot();
int i = 0;
while(i<result.getEncodedData().length()){
while(!current.isLeaf()){
char bit = result.getEncodedData().charat(i);
if(bit == '1'){
current = current.rightChild;
}else if(bit == '0'){
current = current.leftChild;
}else{
throw new IllegalArgumentException("Invalid Bit in Message! " + bit);
}
i++;
}
resultBuilder.append(current.character);
current = result.getRoot();
}
return resultBuilder.toString();
}
private byte[] toByteArray(String input){
//to chararray
char[] preBitChars = input.tochararray();
int bitShortage = (8 - (preBitChars.length%8));
char[] bitChars = new char[preBitChars.length + bitShortage];
System.arraycopy(preBitChars,bitChars,preBitChars.length);
for (int i= 0; i < bitShortage; i++) {
bitChars[preBitChars.length + i]='0';
}
//to bytearray
byte[] byteArray = new byte[bitChars.length/8];
for(int i=0; i<bitChars.length; i++) {
if (bitChars[i]=='1'){
byteArray[byteArray.length - (i/8) - 1] |= 1<<(i%8);
}
}
return byteArray;
}
public static void main(String[] args) {
final String test = "abcdeffg";
final HuffmanEncoder encoder = new HuffmanEncoder();
final HuffmanEncodedResult result = encoder.compress(test);
System.out.println("Encoded Message: " + result.encodedData);
System.out.println("Unencoded Message: " + encoder.decompress(result));
}
}
类节点:
public class Node implements Comparable<Node>{
protected final char character;
protected final int frequency;
protected final Node rightChild;
protected final Node leftChild;
public Node(char character,int frequency,Node leftChild,Node rightChild) {
this.character = character;
this.frequency = frequency;
this.rightChild = rightChild;
this.leftChild = leftChild;
}
public boolean isLeaf(){
return this.leftChild == null && this.rightChild == null;
}
@Override
public int compareto(Node o) {
final int frequencyComparison = Integer.compare(this.frequency,o.frequency);
if(frequencyComparison != 0){
return frequencyComparison;
}
return Integer.compare(this.character,o.character);
}
}
控制台输出:
"C:\Program Files\Java\jdk-15.0.1\bin\java.exe" "-javaagent:C:\Program Files\JetBrains\IntelliJ IDEA 2020.3.1\lib\idea_rt.jar=62965:C:\Program Files\JetBrains\IntelliJ IDEA 2020.3.1\bin" -Dfile.encoding=UTF-8 -classpath C:\Users\ayber\Desktop\untitled\out\production\untitled HuffmanEncoder
Encoded Message: 1001010000010101111011
Unencoded Message: abcdeffg
Process finished with exit code 0
解决方法
您需要使用 FileOutputStream.write
将数据写入字节数组,并将字节数组作为 1 和 0 字符串的二进制表示。
将字符串转换为位的一种方法是使用 BitSet
。
要创建正确大小的 BitSet,请使用:
BitSet b = new BitSet(string.length());
然后你可以遍历字符串并设置位:
for (int i = 0; i < string.length(); i++) {
if (string.charAt(i) == '1') {
b.set(i);
}
}
然后您可以将位集转换为字节数组并将其写入文件:
byte[] bytes = b.toByteArray();
FileOutputStream f = new FileOutputStream(filename);
f.write(bytes);
但是,我会注意到创建一个字符串,然后将其转换为位集,然后将其转换为字节数组并不是很有效。从一开始就使用 bitset 可能会更好。
版权声明:本文内容由互联网用户自发贡献,该文观点与技术仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 dio@foxmail.com 举报,一经查实,本站将立刻删除。