How can I convert TFRecords to NumPy?


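I did find something similar in this thread: How can I convert TFRecords into numpy arrays?. However, it covers the case where the data are images, not data like mine, which closely resembles an RNAseq matrix: the column names (from the original matrix) are stored as strings and the values as tf.float32. So the real question is how to get these two things as tensors and then combine them into the full matrix for further analysis. In the end, I need a 6020 x 1422 matrix, where 6020 is the number of rows (values) and 1422 is the number of columns.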

I use the following code to read the TFRecord file and get all of the data as tensors.

import tensorflow as tf

# Mapping function: parses one serialized Example into its tensors.
def parse_element(element):
    # Feature spec: variable-length indices and values, plus a scalar string of names.
    data = {
        'indices': tf.io.FixedLenSequenceFeature((), tf.int64, allow_missing=True),
        'values': tf.io.FixedLenSequenceFeature((), tf.float32, allow_missing=True),
        'names': tf.io.FixedLenFeature((), tf.string),
    }
    # Parse a single Example:
    content = tf.io.parse_single_example(element, data)
    # Extract each feature into its own tensor.
    indices = content['indices']
    values = content['values']
    colnames = content['names']

    return (indices, values, colnames)

# We create a TFRecordDataset by pointing it to the TFRecord file on disk, and
# then apply the parsing function above to every extracted Example.
# This returns a dataset:
#
def get_dataset(filename):
    # create the dataset
    dataset = tf.data.TFRecordDataset(filename,compression_type="GZIP")
    # pass every single feature through our mapping function.
    dataset = dataset.map(parse_element)
    # return the whole dataset
    return dataset

# Load the TFRecords using get_dataset() and parse_element() defined above.
dataset_valid = get_dataset("TF_records/validate.tfrecords")
# Print one record to have a look.
for data in dataset_valid.take(1):
    print(data[0])  # indices
    print(data[1])  # values
    print(data[2])  # colnames
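
To make the goal concrete, here is a rough sketch of the combining step I have in mind. It rests on assumptions I have not verified against my file: that each record holds one row, that indices[i] gives the column position of values[i], and that the names feature arrives as a bytes string. The helper records_to_matrix and the names in the demo data are hypothetical. With the real data, the records would presumably come from dataset_valid.as_numpy_iterator().

```python
import numpy as np

def records_to_matrix(records, n_cols):
    """Stack (indices, values, name) records into a dense 2-D array.

    Assumption: each record is one row, and indices[i] is the column
    position of values[i]; positions not listed stay 0.
    """
    rows, names = [], []
    for indices, values, name in records:
        row = np.zeros(n_cols, dtype=np.float32)
        row[np.asarray(indices, dtype=np.int64)] = values
        rows.append(row)
        # tf.string values show up as bytes once converted to NumPy.
        names.append(name.decode("utf-8") if isinstance(name, bytes) else str(name))
    return np.stack(rows), names

# Stand-in records (made-up values and names); with the real data one
# might pass dataset_valid.as_numpy_iterator() instead.
demo = [
    (np.array([0, 2]), np.array([1.5, 2.5], dtype=np.float32), b"gene_a"),
    (np.array([1]), np.array([4.0], dtype=np.float32), b"gene_b"),
]
matrix, names = records_to_matrix(demo, n_cols=3)
print(matrix.shape)  # (2, 3)
```

If the records turn out to be laid out differently (for example, one record per column rather than per row), the result would need to be transposed or the loop adapted.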

This may be a very naive question.