CNTK - 逻辑回归模型


本章讨论在 CNTK 中构建逻辑回归模型。

逻辑回归模型基础知识

逻辑回归是最简单的机器学习技术之一,是一种专门用于二元分类的技术。换句话说,在要预测的变量值可以是两个分类值之一的情况下创建预测模型。逻辑回归最简单的例子之一是根据人的年龄、声音、头发等来预测人是男性还是女性。

例子

让我们借助另一个例子从数学上理解逻辑回归的概念 -

假设,我们想要预测贷款申请的信用度;根据申请人的债务、收入信用评级,0 表示拒绝,1 表示批准。我们用 X1 表示债务,用 X2 表示收入,用 X3 表示信用评级。

在逻辑回归中,我们为每个特征确定一个权重值(用w表示)和一个偏差值(用b表示) 。

现在假设,

X1 = 3.0
X2 = -2.0
X3 = 1.0

假设我们确定权重和偏差如下 -

W1 = 0.65, W2 = 1.75, W3 = 2.05 and b = 0.33

现在,为了预测类别,我们需要应用以下公式 -

Z = (X1*W1)+(X2*W2)+(X3+W3)+b
i.e. Z = (3.0)*(0.65) + (-2.0)*(1.75) + (1.0)*(2.05) + 0.33
= 0.83

接下来,我们需要计算P = 1.0/(1.0 + exp(-Z))。这里,exp()函数是欧拉数。

P = 1.0/(1.0 + exp(-0.83)
= 0.6963

P 值可以解释为类别为 1 的概率。如果 P < 0.5,则预测为类别 = 0,否则预测 (P >= 0.5) 为类别 = 1。

为了确定权重和偏差的值,我们必须获得一组具有已知输入预测变量值和已知正确类别标签值的训练数据。之后,我们可以使用一种算法,通常是梯度下降,来找到权重和偏差的值。

LR模型实现示例

对于这个 LR 模型,我们将使用以下数据集 -

1.0, 2.0, 0
3.0, 4.0, 0
5.0, 2.0, 0
6.0, 3.0, 0
8.0, 1.0, 0
9.0, 2.0, 0
1.0, 4.0, 1
2.0, 5.0, 1
4.0, 6.0, 1
6.0, 5.0, 1
7.0, 3.0, 1
8.0, 5.0, 1

要在 CNTK 中启动 LR 模型实现,我们需要首先导入以下包 -

import numpy as np
import cntk as C

该程序的结构如下: main() 函数 -

def main():
print("Using CNTK version = " + str(C.__version__) + "\n")

现在,我们需要将训练数据加载到内存中,如下所示 -

data_file = ".\\dataLRmodel.txt"
print("Loading data from " + data_file + "\n")
features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[0,1])
labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[2], ndmin=2)

现在,我们将创建一个训练程序,该程序创建与训练数据兼容的逻辑回归模型 -

features_dim = 2
labels_dim = 1
X = C.ops.input_variable(features_dim, np.float32)
y = C.input_variable(labels_dim, np.float32)
W = C.parameter(shape=(features_dim, 1)) # trainable cntk.Parameter
b = C.parameter(shape=(labels_dim))
z = C.times(X, W) + b
p = 1.0 / (1.0 + C.exp(-z))
model = p

现在,我们需要创建 Lerner 和 trainer,如下所示 -

ce_error = C.binary_cross_entropy(model, y) # CE a bit more principled for LR
fixed_lr = 0.010
learner = C.sgd(model.parameters, fixed_lr)
trainer = C.Trainer(model, (ce_error), [learner])
max_iterations = 4000

LR模型训练

一旦我们创建了 LR 模型,接下来就该开始训练过程了 -

np.random.seed(4)
N = len(features_mat)
for i in range(0, max_iterations):
row = np.random.choice(N,1) # pick a random row from training items
trainer.train_minibatch({ X: features_mat[row], y: labels_mat[row] })
if i % 1000 == 0 and i > 0:
mcee = trainer.previous_minibatch_loss_average
print(str(i) + " Cross-entropy error on curr item = %0.4f " % mcee)

现在,借助以下代码,我们可以打印模型权重和偏差 -

np.set_printoptions(precision=4, suppress=True)
print("Model weights: ")
print(W.value)
print("Model bias:")
print(b.value)
print("")
if __name__ == "__main__":
main()

训练逻辑回归模型 - 完整示例

import numpy as np
import cntk as C
   def main():
print("Using CNTK version = " + str(C.__version__) + "\n")
data_file = ".\\dataLRmodel.txt" # provide the name and the location of data file
print("Loading data from " + data_file + "\n")
features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[0,1])
labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[2], ndmin=2)
features_dim = 2
labels_dim = 1
X = C.ops.input_variable(features_dim, np.float32)
y = C.input_variable(labels_dim, np.float32)
W = C.parameter(shape=(features_dim, 1)) # trainable cntk.Parameter
b = C.parameter(shape=(labels_dim))
z = C.times(X, W) + b
p = 1.0 / (1.0 + C.exp(-z))
model = p
ce_error = C.binary_cross_entropy(model, y) # CE a bit more principled for LR
fixed_lr = 0.010
learner = C.sgd(model.parameters, fixed_lr)
trainer = C.Trainer(model, (ce_error), [learner])
max_iterations = 4000
np.random.seed(4)
N = len(features_mat)
for i in range(0, max_iterations):
row = np.random.choice(N,1) # pick a random row from training items
trainer.train_minibatch({ X: features_mat[row], y: labels_mat[row] })
if i % 1000 == 0 and i > 0:
mcee = trainer.previous_minibatch_loss_average
print(str(i) + " Cross-entropy error on curr item = %0.4f " % mcee)
np.set_printoptions(precision=4, suppress=True)
print("Model weights: ")
print(W.value)
print("Model bias:")
print(b.value)
if __name__ == "__main__":
  main()

输出

Using CNTK version = 2.7
1000 cross entropy error on curr item = 0.1941
2000 cross entropy error on curr item = 0.1746
3000 cross entropy error on curr item = 0.0563
Model weights:
[-0.2049]
   [0.9666]]
Model bias:
[-2.2846]

使用经过训练的 LR 模型进行预测

一旦 LR 模型经过训练,我们就可以使用它进行预测,如下所示 -

首先,我们的评估程序导入 numpy 包并将训练数据加载到特征矩阵和类标签矩阵中,其方式与我们上面实现的训练程序相同 -

import numpy as np
def main():
data_file = ".\\dataLRmodel.txt" # provide the name and the location of data file
features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",",
skiprows=0, usecols=(0,1))
labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",",
skiprows=0, usecols=[2], ndmin=2)

接下来,是时候设置由我们的训练计划确定的权重和偏差的值了 -

print("Setting weights and bias values \n")
weights = np.array([0.0925, 1.1722], dtype=np.float32)
bias = np.array([-4.5400], dtype=np.float32)
N = len(features_mat)
features_dim = 2

接下来,我们的评估程序将通过遍历每个训练项目来计算逻辑回归概率,如下所示 -

print("item pred_prob pred_label act_label result")
for i in range(0, N): # each item
   x = features_mat[i]
   z = 0.0
   for j in range(0, features_dim):
   z += x[j] * weights[j]
   z += bias[0]
   pred_prob = 1.0 / (1.0 + np.exp(-z))
  pred_label = 0 if pred_prob < 0.5 else 1
   act_label = labels_mat[i]
   pred_str = ‘correct’ if np.absolute(pred_label - act_label) < 1.0e-5 \
    else ‘WRONG’
  print("%2d %0.4f %0.0f %0.0f %s" % \ (i, pred_prob, pred_label, act_label, pred_str))

现在让我们演示如何进行预测 -

x = np.array([9.5, 4.5], dtype=np.float32)
print("\nPredicting class for age, education = ")
print(x)
z = 0.0
for j in range(0, features_dim):
z += x[j] * weights[j]
z += bias[0]
p = 1.0 / (1.0 + np.exp(-z))
print("Predicted p = " + str(p))
if p < 0.5: print("Predicted class = 0")
else: print("Predicted class = 1")

完整的预测评估方案

import numpy as np
def main():
data_file = ".\\dataLRmodel.txt" # provide the name and the location of data file
features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",",
skiprows=0, usecols=(0,1))
labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",",
skiprows=0, usecols=[2], ndmin=2)
print("Setting weights and bias values \n")
weights = np.array([0.0925, 1.1722], dtype=np.float32)
bias = np.array([-4.5400], dtype=np.float32)
N = len(features_mat)
features_dim = 2
print("item pred_prob pred_label act_label result")
for i in range(0, N): # each item
   x = features_mat[i]
   z = 0.0
   for j in range(0, features_dim):
     z += x[j] * weights[j]
   z += bias[0]
   pred_prob = 1.0 / (1.0 + np.exp(-z))
   pred_label = 0 if pred_prob < 0.5 else 1
   act_label = labels_mat[i]
   pred_str = ‘correct’ if np.absolute(pred_label - act_label) < 1.0e-5 \
     else ‘WRONG’
  print("%2d %0.4f %0.0f %0.0f %s" % \ (i, pred_prob, pred_label, act_label, pred_str))
x = np.array([9.5, 4.5], dtype=np.float32)
print("\nPredicting class for age, education = ")
print(x)
z = 0.0
for j in range(0, features_dim):
   z += x[j] * weights[j]
z += bias[0]
p = 1.0 / (1.0 + np.exp(-z))
print("Predicted p = " + str(p))
if p < 0.5: print("Predicted class = 0")
else: print("Predicted class = 1")
if __name__ == "__main__":
  main()

输出

设置权重和偏差值。

Item  pred_prob  pred_label  act_label  result
0   0.3640         0             0     correct
1   0.7254         1             0      WRONG
2   0.2019         0             0     correct
3   0.3562         0             0     correct
4   0.0493         0             0     correct
5   0.1005         0             0     correct
6   0.7892         1             1     correct
7   0.8564         1             1     correct
8   0.9654         1             1     correct
9   0.7587         1             1     correct
10  0.3040         0             1      WRONG
11  0.7129         1             1     correct
Predicting class for age, education =
[9.5 4.5]
Predicting p = 0.526487952
Predicting class = 1