Yes, you can, if you use an activation function like ReLU (f(x) = max(0, x)).
An example of weights for such a network:
Layer1: [[-1, 1], [1, -1]]
Layer2: [[1], [1]]
For the first (hidden) layer:
- If the input is [0,0], both nodes will have an activation of 0: ReLU(-1*0 + 1*0) = 0, ReLU(1*0 + -1*0) = 0
- If the input is [1,0], one node will have an activation of 0: ReLU(-1*1 + 1*0) = 0, and the other an activation of 1: ReLU(1*1 + -1*0) = 1
- If the input is [0,1], one node will have an activation of 1: ReLU(-1*0 + 1*1) = 1, and the other an activation of 0: ReLU(1*0 + -1*1) = 0
- If the input is [1,1], both nodes will have an activation of 0: ReLU(-1*1 + 1*1) = 0, ReLU(1*1 + -1*1) = 0
For the second (output) layer:
Since the weights are [[1], [1]] (and the previous layer cannot produce negative activations because of ReLU), this layer simply sums the activations of layer 1:
- If the input is [0,0], the sum of activations in the previous layer is 0
- If the input is [1,0], the sum of activations in the previous layer is 1
- If the input is [0,1], the sum of activations in the previous layer is 1
- If the input is [1,1], the sum of activations in the previous layer is 0
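The walkthrough above can be checked with a short sketch in plain Python. It assumes the weight matrices are laid out as rows = inputs and columns = nodes, that there are no bias terms, and that the output layer has no activation (a plain sum). The helper names `relu`, `dense` and `xor_net` are made up for this illustration.

```python
def relu(x):
    # ReLU activation: f(x) = max(0, x)
    return max(0, x)

# Weights from the answer (assumed layout: rows = inputs, columns = nodes).
LAYER1 = [[-1, 1], [1, -1]]
LAYER2 = [[1], [1]]

def dense(inputs, weights, activation=None):
    # Bias-free fully connected layer: output j = sum_i inputs[i] * weights[i][j]
    outputs = []
    for j in range(len(weights[0])):
        total = sum(inputs[i] * weights[i][j] for i in range(len(inputs)))
        outputs.append(activation(total) if activation else total)
    return outputs

def xor_net(x):
    hidden = dense(x, LAYER1, relu)   # first (hidden) layer with ReLU
    return dense(hidden, LAYER2)[0]   # second (output) layer: plain sum (assumed)

for x in ([0, 0], [1, 0], [0, 1], [1, 1]):
    print(x, "->", xor_net(x))
# [0, 0] -> 0
# [1, 0] -> 1
# [0, 1] -> 1
# [1, 1] -> 0
```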
Note that this method only works because the False examples of the XOR problem are labeled zero (0): with no bias terms, the input [0,0] always produces an output of 0. If, for example, we used ones for False examples and twos for True examples, this approach would no longer work.
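One way to see the limitation, assuming "this approach" means a two-layer ReLU network with no bias terms: whatever the weights are, the input [0,0] is mapped to 0, so a nonzero label for that input can never be produced. The sketch below tries many random weight settings; `bias_free_net` and the weight range are hypothetical choices for illustration only.

```python
import random

def relu(x):
    return max(0.0, x)

def bias_free_net(x, w1, w2):
    # Hidden layer: ReLU of weighted sums, no bias (rows = inputs, columns = nodes).
    hidden = [relu(sum(x[i] * w1[i][j] for i in range(len(x))))
              for j in range(len(w1[0]))]
    # Output layer: weighted sum of hidden activations, no bias, no activation.
    return sum(hidden[j] * w2[j][0] for j in range(len(hidden)))

random.seed(0)  # fixed seed so the run is repeatable
outputs_at_origin = []
for _ in range(1000):
    w1 = [[random.uniform(-5, 5) for _ in range(2)] for _ in range(2)]
    w2 = [[random.uniform(-5, 5)] for _ in range(2)]
    outputs_at_origin.append(bias_free_net([0, 0], w1, w2))

# Every random weight setting gives 0 for the input [0, 0].
all_zero = all(out == 0 for out in outputs_at_origin)
print(all_zero)  # True
```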