I’m reading the “60 Minute Blitz” for Flux, and I’m trying to understand more deeply how the package works. I have the following code:

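```
using Flux: gradient, params   # `gradient` has to be imported as well

W = randn(3, 5)
b = zeros(3)
x = [3, 1, 0, 1, 2]
y(x) = sum(W * x .+ b)         # uses the globals W and b

# Zero-argument closure plus the parameters to differentiate with respect to
grads = gradient(() -> y(x), params([W, b]))
grads[W], grads[b]
```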

What I find odd in the code above is that `gradient` is supposedly taking a function, `()->y(x)`, that has no arguments. BUT, the function somehow knows that inside `y` I’m using the variables `W` and `b`. Why does Flux use this roundabout way? Why not just:

```
# Parameters passed as ordinary arguments (x as defined above)
func(W, b) = sum(W * x .+ b)
grads = gradient(func, W, b)
```

Also, with the first implementation, `grads = gradient(()->y(x), params([W, b]))`, I can use `grads[W]`, but if I change the value of `W`, then `grads[W]` won’t work anymore. So, what exactly is `grads` storing?
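To make that last point concrete, here is a minimal sketch in plain Julia of the behaviour I think I’m seeing. It assumes that `grads` acts like a dictionary keyed by the array object itself rather than by its name or contents. That is my guess, not something I’ve confirmed in the Flux source, and `store` is only a hypothetical stand-in for `grads`, not Flux’s actual type.

```julia
# Hypothetical stand-in for `grads`: a dictionary keyed by object identity.
# (Assumption for illustration; not Flux's real data structure.)
W = randn(3, 5)
store = IdDict{Any,Any}(W => ones(3, 5))

W .= 0.0                                   # in-place update: still the same array object
found_after_mutation = haskey(store, W)

W = randn(3, 5)                            # rebinding: `W` now names a different array
found_after_rebinding = haskey(store, W)

println((found_after_mutation, found_after_rebinding))
```

If the guess is right, this would explain why the lookup survives an in-place update such as `W .= 0.0` but breaks once `W` is assigned a new array.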