What is the difference between Keras and TensorFlow?

Hello, everyone. I have just learned about neural networks, and I want to run a regression experiment to test whether a neural network can learn this special case. It is a simple experiment: the training data is 10,000 randomly generated samples, each with a 10×1 feature vector, and the label is the first element of the feature vector.

from numpy.random import RandomState
rdm=RandomState(1)
data_size=10000
xdim=10
X=rdm.rand(data_size,xdim)
Y = [x1[0] for x1 in X]

I train a single-layer network, and the expected result is Weights = [1, 0, 0, ..., 0] and bias = 0.

I wrote a TensorFlow version and a Keras version. Strangely, Keras gets the correct result, but the TensorFlow training does not converge, and the losses of the two versions are very different.

TensorFlow version:

import tensorflow as tf
x=tf.placeholder(tf.float64,shape=(None,xdim))
y=tf.placeholder(tf.float64,shape=(None))
Weights = tf.Variable(tf.random_normal([xdim, 1],dtype=tf.float64))
biases = tf.Variable(0.1,dtype=tf.float64)
y_predict = tf.matmul(x, Weights) + biases
loss = tf.reduce_mean(tf.square(y_predict - y))
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

batch_size=100
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10001):
        start = i * batch_size % data_size
        end = min(start + batch_size,data_size)
        sess.run(optimizer,feed_dict={x:X[start:end],y:Y[start:end]})
        if i % 1000 == 0:
            ypred,training_loss= sess.run([y_predict,loss],feed_dict={x:X,y:Y})
            print("Epoch %d: loss=%g"%(i,training_loss))

Output of the TensorFlow version:

Epoch 0: loss=1.0679
Epoch 1000: loss=0.11685
Epoch 2000: loss=0.0842979
Epoch 3000: loss=0.0827121
Epoch 4000: loss=0.0824983
Epoch 5000: loss=0.0824296
Epoch 6000: loss=0.0824021
Epoch 7000: loss=0.0823903
Epoch 8000: loss=0.0823851
Epoch 9000: loss=0.0823826
Epoch 10000: loss=0.0823814

Keras version:

from keras.models import Sequential
from keras.layers import Dense
import numpy as np

model = Sequential()
model.add(Dense(units=1, input_dim=xdim)) 
model.compile(loss="mse", optimizer="sgd")

batch_size=100
for i in range(10001):
    start = i * batch_size % data_size
    end = min(start + batch_size,data_size)
    cost = model.train_on_batch(X[start:end], np.array(Y[start:end]))
    if i % 1000 == 0:
        print("Epoch %d: loss=%g"%(i,cost))

Output of the Keras version:

Epoch 0: loss=0.261707
Epoch 1000: loss=0.00811771
Epoch 2000: loss=0.000325865
Epoch 3000: loss=2.21623e-05
Epoch 4000: loss=4.63907e-06
Epoch 5000: loss=1.66684e-06
Epoch 6000: loss=6.55329e-07
Epoch 7000: loss=2.61024e-07
Epoch 8000: loss=1.04213e-07
Epoch 9000: loss=4.16416e-08
Epoch 10000: loss=1.66369e-08

I think the two pieces of code should be equivalent; the only difference is that I don't know how to set the learning rate in Keras. Why does Keras get the right result but TensorFlow does not? Where did I make a mistake?


There is something wrong with the TF part of the code:

  • The shape of y is not the same as the shape of y_predict: the former is (batch_size,) and the latter is (batch_size, 1), so y_predict - y broadcasts to a (batch_size, batch_size) matrix (see the NumPy sketch after this list).
  • After computing wx + b, you can fix the shape of y_predict with tf.squeeze().
...
y_predict = tf.matmul(x, Weights)+biases
# squeeze y_predict from (batch_size, 1) to (batch_size,)
y_predict = tf.squeeze(y_predict)
loss = tf.reduce_mean(tf.square(y_predict - y))
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
...
  • Addendum 1: The MSE can be computed directly with tf.losses.mean_squared_error() instead of tf.reduce_mean(tf.square(y_predict - y)) (see the TF sketch below).
  • Addendum 2: Keras's sgd optimizer defaults to a learning rate of 0.01, which can be found in the documentation and source code. To keep it consistent with the TF version, you can also set the learning rate explicitly with optimizer=keras.optimizers.SGD(lr=0.01) (see the Keras sketch below).
  • Addendum 3: Your Keras Dense layer's initialization may differ from the TF version's, although both converge to the same result in the end. You can specify it with model.add(Dense(units=1, input_dim=xdim, kernel_initializer='<weights initializer>', bias_initializer='<biases initializer>')), also shown in the Keras sketch below.
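
A minimal NumPy sketch (not from the original answer) of what that shape mismatch does: subtracting a (batch_size,) vector from a (batch_size, 1) column broadcasts to a full (batch_size, batch_size) matrix, so the "MSE" is averaged over all sample pairs. Minimizing that pushes every prediction toward the mean of y, and the loss bottoms out near the variance of the labels, roughly 1/12 ≈ 0.083 for uniform [0, 1] labels, which matches the plateau in the TensorFlow output above.

import numpy as np

y_pred = np.full((4, 1), 0.5)            # column vector, shape (4, 1), like tf.matmul(x, Weights) + biases
y_true = np.array([0.1, 0.4, 0.6, 0.9])  # flat vector, shape (4,), like the y placeholder

diff = y_pred - y_true
print(diff.shape)                                  # (4, 4): broadcasting builds all pairwise differences
print(np.mean(np.square(diff)))                    # the loss the TF code actually minimized
print(np.mean(np.square(y_pred[:, 0] - y_true)))   # the per-sample MSE the question intended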
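
For addendum 1, here is a sketch of the corrected TF 1.x graph with the squeeze applied and the built-in MSE loss; it is only an illustration of the answer's suggestions and reuses xdim from the question's code:

import tensorflow as tf

x = tf.placeholder(tf.float64, shape=(None, xdim))
y = tf.placeholder(tf.float64, shape=(None,))
Weights = tf.Variable(tf.random_normal([xdim, 1], dtype=tf.float64))
biases = tf.Variable(0.1, dtype=tf.float64)
y_predict = tf.squeeze(tf.matmul(x, Weights) + biases)                 # now shape (batch_size,), same as y
loss = tf.losses.mean_squared_error(labels=y, predictions=y_predict)   # built-in MSE
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)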
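
And for addenda 2 and 3, a sketch of the Keras model with the learning rate and the initializers spelled out (old standalone Keras API, as in the question; the particular initializers below are just one way to mirror the TF code, not the answer's exact recommendation):

from keras.models import Sequential
from keras.layers import Dense
from keras import initializers, optimizers

model = Sequential()
model.add(Dense(units=1, input_dim=xdim,
                kernel_initializer=initializers.RandomNormal(stddev=1.0),  # roughly mirrors tf.random_normal
                bias_initializer=initializers.Constant(0.1)))              # mirrors biases = tf.Variable(0.1)
model.compile(loss="mse", optimizer=optimizers.SGD(lr=0.01))               # explicit learning rate, same as the TF code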