Automatic differentiation with TensorFlow's GradientTape, and incorporating it into a Keras NN

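I am trying to build a neural network with the Keras API in TensorFlow 2. The network takes the time variable as input and outputs two vectors, called "a" and "b", that come from a set of physical equations. I want the loss function to include those physical equations, which means computing the derivative of the vector a with respect to time. For this I use the GradientTape API for automatic differentiation in TensorFlow.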

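As far as I understand from the documentation, GradientTape computes derivatives of a non-scalar target through its jacobian method. However, when I pass a batch of size 10 and compute the Jacobian, I get derivatives with respect to all 10 time values, which is not what I want. For each of the 10 vectors "a" presented to the network, I only want the derivative with respect to its own corresponding time value.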
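To make the mismatch concrete, the structure of the full Jacobian can be written out. This is a sketch in notation introduced here (B for the batch size, N for the dimension of "a", J for the Jacobian, \delta for the Kronecker delta), and it assumes the network treats every sample independently, i.e. it contains no layer that mixes information across the batch, such as batch normalization:

```latex
J_{ijk} = \frac{\partial a_j(t_i)}{\partial t_k}
        = \delta_{ik}\,\frac{\mathrm{d} a_j}{\mathrm{d} t}(t_i),
\qquad i,k = 1,\dots,B,\quad j = 1,\dots,N,
\qquad\Longrightarrow\qquad
\left(\frac{\mathrm{d} a}{\mathrm{d} t}\right)_{ij} = J_{iji}.
```

Under that assumption every entry with i \neq k is zero, and the B x N array of per-sample time derivatives is the i = k "diagonal" of the full Jacobian.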

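The code is below; I train the network in the step function. As for the equations, you can see them in the attached figure: only the vectors "a" and "b" are variables, while all other terms are scalars, such as \nu, or matrices and tensors.

[Figure: The physical equations in question.]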

import sys

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import scipy.io
from scipy.interpolate import griddata
import time
from itertools import product,combinations
from mpl_toolkits.mplot3d import Axes3D
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
from mpl_toolkits.axes_grid1 import make_axes_locatable
from sklearn.preprocessing import MinMaxScaler
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation,Dense 
from tensorflow.keras.optimizers import Adam
import matplotlib.gridspec as gridspec
import matplotlib.pyplot as plt

np.random.seed(1234)
tf.random.set_seed(1234)

The neural network class:

class MyModel:
    # Initialize the class
    def __init__(self,t,a,b,nu,M,B,C,K,P):
        # Store the constant scalars, vectors and matrices involved in the
        # equations; t is the input data matrix, while a and b are the output data matrices
        self.t = t
        self.a = a
        self.b = b
        self.nu = nu
        self.M = M
        self.B = B
        self.C = C
        self.K = K
        self.P = P
        self.Nu = a.shape[1]
        self.Np = b.shape[1]
        
    def build_model(self):
        
        Nu = self.Nu
        Np = self.Np
        NoutputTotal = Nu + Np
        model1 = Sequential([
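            # The activations after the first layer were lost in the source;
            # 'sigmoid' hidden layers and a linear output layer are assumed here.
            Dense(units=100, input_shape=(1,), activation='sigmoid'),
            Dense(units=80, activation='sigmoid'),
            Dense(units=70, activation='sigmoid'),
            Dense(units=50, activation='sigmoid'),
            Dense(units=40, activation='sigmoid'),
            Dense(units=NoutputTotal, activation=None),
        ])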

        return model1

    def step(self,t,y):
        
        Nu = self.Nu
        Np = self.Np
        a_tf = y[:,0:Nu]
        b_tf = y[:,Nu:]
        model = self.build_model()
        # I create a TensorFlow variable that stores the time values so that I can
        # take the time derivatives
        x_in = tf.Variable(t)
        
        with tf.GradientTape() as tape:
            pred = model(x_in)
            a_pred = pred[:,0:Nu]
            b_pred = pred[:,Nu:]
        print(a_pred.shape)
        da_dt = tape.jacobian(a_pred,x_in)     
        print(da_dt.shape)
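        # f_1 corresponds to the first ODE equation, while f_2 corresponds to the second matrix equation.
        # The arguments C and a_pred were dropped from two calls in the source and are restored here as an assumption.
        f_1 = tf.tensordot(M,da_dt,1) - nu * tf.tensordot(B,a_pred,1) + tf.tensordot(tf.tensordot(a_pred.T,C,[[0],[1]]),a_pred,1) + tf.tensordot(K,b_pred,1)
        f_2 = tf.tensordot(P,a_pred,1)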
        # compute the loss as the MSE of the data on "a" and "b", plus the terms that come
        # from the physical equations described by f_1 and f_2
        loss = tf.reduce_sum(tf.square(a_tf - a_pred)) + \
                    tf.reduce_sum(tf.square(b_tf - b_pred)) + \
                    tf.reduce_sum(tf.square(f_1)) + \
                    tf.reduce_sum(tf.square(f_2))

        grads = tape.gradient(loss,model.trainable_variables)
        opt.apply_gradients(zip(grads,model.trainable_variables))

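In the main block I define the number of epochs and the batch size, and I use the step function to train the network. However, as mentioned above, I run into a problem when I try to compute the derivative of "a". The print commands give a shape of (10,25) for a_pred (25 is the dimension of "a") and a shape of (10,25,10,1) for da_dt, whereas I expected da_dt to have the same shape as a_pred. It is worth mentioning that da_dt is zero at the entries that do not correspond to the time value for which the derivative is needed.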

if __name__ == "__main__": 
      
    EPOCHS = 20000
    BS = 10
    INIT_LR = 1e-4
    print("Loading the data")
    # Load Data
    exec(open('MatricesPostProcessing/coeffL2U_mat.py').read())
    exec(open('MatricesPostProcessing/coeffL2P_mat.py').read())
    exec(open('Matrices/C_mat.py').read())
    exec(open('Matrices/B_mat.py').read())
    exec(open('Matrices/K_mat.py').read())
    exec(open('Matrices/P_mat.py').read())
    exec(open('Matrices/M_mat.py').read())
    exec(open('tSnapshots_mat.py').read())
    t = tSnapshots
    T = t.shape[0]
    Nu = coeffL2U.shape[1]
    Np = coeffL2P.shape[1]
    # scale the data to the range of [0,1]
    scaler = MinMaxScaler(feature_range=(0,1))
    t_scaled = scaler.fit_transform(t.reshape(-1,1))
    coeffL2U_scaled = scaler.fit_transform(coeffL2U)
    coeffL2P_scaled = scaler.fit_transform(coeffL2P)
    a_and_b = np.concatenate([coeffL2U_scaled,coeffL2P_scaled],1)
    # Create a tensor tf version of each constant matrix and tensor appearing in the equations
    M_T = tf.constant(M)
    B_T = tf.constant(B)
    C_T = tf.constant(C)
    K_T = tf.constant(K)
    P_T = tf.constant(P)
    t_T = tf.constant(t_scaled)
    output_T = tf.constant(a_and_b)
    a_T = output_T[:,0:Nu]
    b_T = output_T[:,Nu:]
    nu = 1e-4
    # Training
    model = MyModel(t_T,a_T,b_T,nu,M_T,B_T,C_T,K_T,P_T)
    # compute the number of batch updates per epoch
    numUpdates = int(t_T.shape[0] / BS)
    # loop over the number of epochs
    for epoch in range(0,EPOCHS):
        # show the current epoch number
        print("[INFO] starting epoch {}/{}...".format(
            epoch + 1,EPOCHS),end="")
        sys.stdout.flush()
        epochStart = time.time()
        # loop over the data in batch size increments
        for i in range(0,numUpdates):
            # determine starting and ending slice indexes for the current
            # batch
            start = i * BS
            end = start + BS
            # take a step
            model.step(t_T[start:end],output_T[start:end])
        # show timing information for the epoch
        epochEnd = time.time()
        elapsed = (epochEnd - epochStart) / 60.0
        print("took {:.4} minutes".format(elapsed))

Are there any corrections or useful comments on building this kind of neural network?
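On the shape mismatch described above, one possible direction is GradientTape's batch_jacobian method, which differentiates each sample's output only with respect to that sample's own input. The snippet below is a minimal sketch rather than a drop-in fix: toy_model is a hypothetical stand-in for the network in the question, the sizes 10 and 25 are taken from the reported shapes, and the physical equations are left out entirely.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)

BATCH, N_A = 10, 25  # batch size and dimension of "a", as reported in the question

# Hypothetical stand-in network: one time value in, N_A coefficients out.
toy_model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="sigmoid"),
    tf.keras.layers.Dense(N_A),
])

# One column of time values, shape (BATCH, 1).
t_batch = tf.reshape(tf.linspace(0.0, 1.0, BATCH), (BATCH, 1))

# persistent=True because the tape is queried twice below.
with tf.GradientTape(persistent=True) as tape:
    tape.watch(t_batch)  # t_batch is a plain tensor, so it must be watched
    a_pred = toy_model(t_batch)

full_jac = tape.jacobian(a_pred, t_batch)           # shape (10, 25, 10, 1)
per_sample = tape.batch_jacobian(a_pred, t_batch)   # shape (10, 25, 1)
da_dt = tf.squeeze(per_sample, axis=-1)             # shape (10, 25), same as a_pred
del tape

# The per-sample result should match the i == k "diagonal" of the full Jacobian.
diag = tf.stack([full_jac[i, :, i, 0] for i in range(BATCH)])
print(full_jac.shape, per_sample.shape, da_dt.shape)
print("matches diagonal:", np.allclose(diag.numpy(), da_dt.numpy(), atol=1e-5))
```

If da_dt is meant to enter the loss, the loss itself would also have to be computed while a tape is recording; the sketch only illustrates the shapes and does not cover that part.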
