Both my own softmax and NNlib's softmax give an unexpected result

Question

I'm working through a book written in Python, but I'm using Julia instead, to learn the language. I've hit another area that I don't quite understand.

It fell apart once I started throwing more complex matrices at my softmax:

    include("activation_function_exercise/spiral_data.jl")
    include("activation_function_exercise/dense_layer.jl")
    include("activation_function_exercise/activation_relu.jl")
    include("activation_function_exercise/activation_softmax.jl")

    coords, color = spiral_data(100, 3)

    dense1 = LayerDense(2,3)
    dense2 = LayerDense(3,3)

    forward(dense1, coords)
    println("Forward 1 layer")
    activated_output = relu_activation(dense1.output)
    forward(dense2, activated_output)
    println("Forward 2 layer")
    activated_output2 = softmax_activation(dense2.output)

    println("\n", activated_output2)

I get back a properly formed matrix:

    julia> activated_output2
    300×3 Matrix{Float64}:
     0.00333346  0.00333337  0.00333335
     0.00333345  0.00333337  0.00333335
     0.00333345  0.00333336  0.00333335
     0.00333344  0.00333336  0.00333335
     0.00333343  0.00333336  0.00333334
     0.00333311  0.00333321  0.00333322

but the book has:

    >>>
    [[0.33333 0.3333 0.3333]
    ...

I seem to be two orders of magnitude below the book (0.0033 instead of 0.33), even when using FluxML's softmax function.

EDIT:

I thought that perhaps my ReLU activation code was causing the discrepancy, so I tried switching to the FluxML NNlib version, but I got the same activated_output2, with 0.0033333 instead of 0.333333.

I'll keep checking other parts, such as my forward function.

EDIT 2:

Adding my LayerDense implementation for completeness.

Dense layer

    # see https://github.com/FluxML/Flux.jl/blob/b78a27b01c9629099adb059a98657b995760b617/src/layers/basic.jl#L71-L111
    using Base: Integer, Float64

    mutable struct LayerDense
        weights::Matrix{Float64}   # num_inputs × num_neurons
        biases::Matrix{Float64}    # 1 × num_neurons row, broadcast over samples
        num_inputs::Integer
        num_neurons::Integer
        output::Matrix{Float64}    # left unset by the constructor; filled in by forward
        LayerDense(num_inputs::Integer, num_neurons::Integer) = new(0.01 * randn(num_inputs, num_neurons), zeros((1, num_neurons)), num_inputs, num_neurons)
    end


    # Computes the layer output for a batch with one sample per row.
    function forward(layer::LayerDense, inputs::Matrix{Float64})
        layer.output = inputs * layer.weights .+ layer.biases
    end
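
As a quick sanity check on the shapes involved, the sketch below repeats the arithmetic of forward on standalone arrays. The sizes (300 samples, 2 inputs, 3 neurons) are taken from the example in this question; the variable names are illustrative only, and the code assumes nothing beyond Julia's Base.

```julia
# Illustrative standalone arrays; these names are not part of the files above.
inputs  = randn(300, 2)           # 300 samples, 2 features per sample
weights = 0.01 .* randn(2, 3)     # 2 inputs feeding 3 neurons
biases  = zeros(1, 3)             # one bias per neuron, stored as a row

# Same expression as in forward: the 1×3 bias row is broadcast over all 300 rows.
output = inputs * weights .+ biases

println(size(output))             # (300, 3)
```

Each row of the output corresponds to one sample and each column to one neuron.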

EDIT 3:

Using the Python library as a reference, I started checking my spiral_data implementation. Its output looks reasonable.

Python

    import numpy as np
    import nnfs

    from nnfs.datasets import spiral_data

    nnfs.init()


    X, y = spiral_data(samples=100, classes=3)

    print(X[:4])  # just check the first few rows

    >>>
    [[0.         0.        ]
     [0.00299556 0.00964661]
     [0.01288097 0.01556285]
     [0.02997479 0.0044481 ]]

JuliaLang

    include("activation_function_exercise/spiral_data.jl")

    coords, color = spiral_data(100, 3)

    julia> coords
    300×2 Matrix{Float64}:
      0.0         0.0
     -0.00133462  0.0100125
      0.00346739  0.0199022
     -0.00126302  0.0302767
      0.00184948  0.0403617
      0.0113095   0.0492225
      0.0397276   0.0457691
      0.0144484   0.0692151
      0.0181726   0.0787382
      0.0320308   0.0850793

Tags: julia, flux-machine-learning

user389976, Jul 7 '21 at 13:55

1 answer

Answer 1 · user389976, Jul 7 '21 at 19:43

It turned out that I was calling NNlib's softmax on the whole matrix with its default dims (dims=1), so it normalized down each column, across all 300 samples. That is NOT what the Python book does. All I needed to do was change my softmax() call like this:

    using NNlib

    function softmax_activation(inputs)
        return softmax(inputs, dims=2)
    end

With that change, the result at the end of my long example comes out as expected.

    #using Pkg
    #Pkg.add("Plots")

    include("activation_function_exercise/spiral_data.jl")
    include("activation_function_exercise/dense_layer.jl")
    include("activation_function_exercise/activation_relu.jl")
    include("activation_function_exercise/activation_softmax.jl")

    coords, color = spiral_data(100, 3)

    dense1 = LayerDense(2,3)
    dense2 = LayerDense(3,3)

    # Julia doesn't lend itself to OO programming...
    # so the following will just be function
    # activation1 = activation_relu
    # activation2 = activation_softmax

    forward(dense1, coords)
    activated_output = relu_activation(dense1.output)
    forward(dense2, activated_output)
    activated_output2 = softmax_activation(dense2.output)


    using Plots

    #scatter(coords[:,1], coords[:,2])
    scatter(coords[:,1], coords[:,2], zcolor=color, framestyle=:box)

    display(activated_output2)

    300×3 Matrix{Float64}:
     0.333333  0.333333  0.333333
     0.333336  0.333334  0.33333
     0.333338  0.333339  0.333323
     0.33334   0.333344  0.333316
     0.333339  0.333361  0.3333
     0.333341  0.333365  0.333294
     0.333345  0.333362  0.333293
     0.333345  0.333374  0.333281
     0.333349  0.33337   0.333281
     0.333347  0.33339   0.333262
     ⋮
     0.333564  0.332673  0.333764
     0.333583  0.332885  0.333532
     0.333588  0.332967  0.333445
     0.333587  0.333148  0.333265
     0.333593  0.332935  0.333472
     0.333596  0.333006  0.333398
     0.333583  0.33333   0.333086
     0.3336    0.333062  0.333338
     0.333603  0.333082  0.333316
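
To see why the choice of dims changes the values from roughly 0.0033 to roughly 0.33, here is a minimal sketch written with Base Julia only. The helper softmax_along is a hypothetical name introduced for illustration; it is not NNlib's implementation. The random logits stand in for the nearly uniform output of a layer with tiny weights and are an assumption, not data from the example above.

```julia
# Hypothetical helper (not NNlib's code): softmax along the given dimension.
function softmax_along(x::AbstractMatrix; dims::Integer)
    # Subtract the maximum along `dims` so that exp cannot overflow.
    shifted = x .- maximum(x, dims=dims)
    exps = exp.(shifted)
    return exps ./ sum(exps, dims=dims)
end

# Assumed stand-in data: 300 samples, 3 classes, values very close to zero.
logits = 0.001 .* randn(300, 3)

by_column = softmax_along(logits, dims=1)  # normalizes over the 300 samples
by_row    = softmax_along(logits, dims=2)  # normalizes over the 3 classes

println(by_column[1, :])  # entries close to 1/300
println(by_row[1, :])     # entries close to 1/3
```

With dims=1 every column sums to 1, so 300 nearly equal entries each come out near 1/300. With dims=2 every row sums to 1, so the 3 class scores of each sample come out near 1/3, which matches the numbers shown in the book.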