This chapter gives a brief overview of neural networks and blockchain, from an introduction to neural networks through building one, and how blockchain links to neural networks. The chapter begins with a brief introduction to neural networks, then moves on to the terminology related to neural networks and the architecture of a neural network. Data propagation is the most important step in a neural network: the model is trained on the basis of data propagated forward and backward through the network. The final step toward a full understanding of neural networks is an implementation using Python libraries. The chapter concludes with a literature review of papers on the use of neural networks and blockchain technology, together with open challenges in the field.
Neural networks, or artificial neural networks, are, as the name suggests, networks of connected nodes called neurons or artificial neurons. The structure comprises many layers of artificial neurons connected together and is inspired by the structure of the human brain. Neural networks are considered superior to many other machine learning algorithms because of their ability to find complex relations between input data and output data.
Neural networks belong to deep learning, a subset of machine learning that tries to replicate the structure of the human brain. They are among the most widely used machine learning models because of their high computational power and wide range of applications, such as the convolutional neural network (CNN), used in deep learning problems involving images, and the recurrent neural network (RNN), used in problems involving speech.
Blockchain is an expanding list of records, termed blocks, linked together using cryptography; the resulting chain-like structure gives blockchain its name. Each block added to the chain stores a cryptographic hash of the previous block, a timestamp, and transaction data. Because a new block is added for each transaction, this structure cannot easily be altered and is designed to be resistant to modification. The most common use of blockchain technology is cryptocurrency, such as Bitcoin.
Blockchain enables data systems, in sectors such as healthcare or education, that make it easier to add user data and keep it secure. A neural network can then be trained on such data, providing an easier way to analyze it and fit a model than generic data-collection methods [1].
Objective: This chapter gives the reader a brief overview of neural networks, covering the terminology, the architecture of a neural network, and the derivation of the equations related to it. It also gives a brief overview of blockchain technology and real-life applications of blockchain technology and neural networks.
A neural network's architecture is built from basic components: layers containing neurons, and neurons having weights, biases, and activation functions.
Artificial neural networks have three basic types of layers:
The first layer of a neural network takes input from the data: each neuron receives one feature of the given data, so the number of input layer neurons equals the number of features.
The last layer of a neural network gives the output of the model, which can be either a regression value or a classification. A neural network with a regression output has only one neuron in the output layer, while in classification the number of output neurons depends on the number of classes. In regression a single value is returned; for n classes, n probabilities are given, where each probability signifies the chance of that class being the predicted one. The class with the maximum probability is taken as the final prediction.
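As a minimal sketch, taking the class with the maximum probability can be written as follows; the probability values here are made up for illustration:

```python
import numpy as np

# Hypothetical output-layer probabilities for a 3-class problem
# (one value per class, summing to 1).
probs = np.array([0.2, 0.7, 0.1])

# The index of the largest probability is the predicted class.
predicted_class = int(np.argmax(probs))
print(predicted_class)  # prints 1
```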
All the layers between the input layer and the output layer are considered hidden layers. Each hidden layer is connected to its previous and next layers, taking input from the former and giving output to the latter. These layers are not visible to the user, who only supplies input and expects output, so they are hidden from the user's perspective (Figure 3.1).
Figure 3.1 Neural network structure of a classifier.
Yellow-colored neurons represent the input layer, which takes the input; pink-colored neurons are the hidden layer, which takes input from the input layer; and the blue-colored neuron is the output layer, which gives the predicted class or regression output.
The neuron is the smallest unit of the neural network: it takes input from all neurons in the previous layer, processes it, and sends the result to all neurons in the next layer. When a neuron receives input, it first computes the sum of the products of each input value and the weight assigned to that value in the neuron, then adds a bias value; an activation function is then applied, and the result is sent on to the next layer's neurons. The activation function transforms this weighted sum (plus bias) so that the next layers can more easily recognize whether the input is useful or not (Figure 3.2).
Figure 3.2 Detailed look at neurons in layers.
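The computation described above can be sketched for a single neuron as follows; the input, weight, and bias values are illustrative, and a sigmoid activation function is assumed:

```python
import numpy as np

def neuron_output(inputs, weights, bias):
    """Weighted sum of inputs plus bias, passed through a sigmoid activation."""
    z = np.dot(inputs, weights) + bias   # summation of products, plus bias
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid activation

# Illustrative values (not taken from the chapter's figures).
x = np.array([0.5, -1.2, 3.0])   # inputs from the previous layer
w = np.array([0.4, 0.1, -0.2])   # one weight per input
b = 0.1                          # bias

out = neuron_output(x, w, b)
print(out)  # a value in (0, 1)
```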
A weight is attached to each feature given as input to a neuron, and the weights indicate which features are more important: if one feature's weight is larger than another's, that feature matters more when predicting the output. A negative weight indicates that the feature is inversely related to the target value being predicted.
Bias is a constant value added to the weighted sum of the features. It shifts the activation function to the left or to the right. As an example, suppose we decide that if the weighted sum is greater than 0.5 we assign class 1, and otherwise class 0; the bias effectively implements this threshold of 0.5, shifting the point at which the decision changes.
Activation functions are mathematical functions that determine the output of a neuron. Different activation functions suit different types of problems, and the choice also depends on whether the problem is a regression or a classification problem. The gradient of the weights and biases plays a very important role in choosing the activation function: a gradient is the rate of change of one variable with respect to another, and the gradient of the error with respect to the weights and biases, obtained during training, is used to update them so as to reduce the prediction error [2].
This is the simplest activation function: the input is simply multiplied by a constant.
The sigmoid function is a non-linear activation function that maps its input into the range [0, 1].
This is a non-linear function derived from the sigmoid function by small modifications; it maps its input into the range [-1, 1].
This is a non-linear function that returns y if y is greater than 0 and 0 otherwise. The advantage of the rectified linear unit is that it passes on only positive values or 0.
This is a non-linear function where, if y is greater than or equal to zero, it is left unchanged; otherwise it is transformed using an exponential function.
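The activation functions described above can be sketched in NumPy as follows; the linear constant and the ELU's alpha are illustrative parameters, not values prescribed by the chapter:

```python
import numpy as np

def linear(y, c=1.0):
    return c * y                          # input multiplied by a constant

def sigmoid(y):
    return 1.0 / (1.0 + np.exp(-y))       # squashes into (0, 1)

def tanh(y):
    return np.tanh(y)                     # squashes into (-1, 1)

def relu(y):
    return np.maximum(0.0, y)             # y if y > 0, else 0

def elu(y, alpha=1.0):
    # unchanged for y >= 0, exponential transform below 0
    return np.where(y >= 0, y, alpha * (np.exp(y) - 1))

sample = np.array([-2.0, 0.0, 2.0])
for f in (linear, sigmoid, tanh, relu, elu):
    print(f.__name__, f(sample))
```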
When an artificial neural network is trained, the data is transmitted through the model twice in each epoch: once forward and once backward. This transmission of data through the whole model is called propagation.
Data is transmitted from the input layer neurons to the output layer, where the output is predicted; the output given by the output layer marks the end of forward propagation. Each neuron performs the same operation, computing the weighted sum plus bias and passing it through the activation function, and its output is given to each neuron in the next layer [3] (Figure 3.3).
Figure 3.3 Forward propagation.
Yellow, pink, and blue arrows show how the data is transmitted in forward propagation.
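Forward propagation through a small network can be sketched as follows; the weights here are randomly initialized (not learned), and a sigmoid activation is assumed in every layer:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Propagate an input vector through a list of (weights, bias) layers."""
    a = x
    for W, b in layers:
        # weighted sum plus bias, then activation;
        # each layer's output becomes the next layer's input
        a = sigmoid(W @ a + b)
    return a

# Toy 3-4-1 network with illustrative, randomly initialized parameters.
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 3)), np.zeros(4)),  # hidden layer: 4 neurons, 3 inputs
    (rng.normal(size=(1, 4)), np.zeros(1)),  # output layer: 1 neuron
]

x = np.array([0.2, -0.5, 1.0])
out = forward(x, layers)
print(out)  # final prediction, a single value in (0, 1)
```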
This is the reverse of forward propagation: the process runs backward through the network. Backward propagation is based on the prediction error. We first calculate the error in our prediction, then propagate backward and determine how the error is affected by each neuron's weights and bias by computing the gradient of the error with respect to those weights and biases, and then update them accordingly [4].
Let’s consider an example in Figure 3.4:
Figure 3.4 Backward propagation.
Blue arrows show how data is transmitted from the output layer to the hidden layer with pink-colored neurons. The loss function is defined as the square of the difference between the actual value and the predicted value.
Figure 3.5 Backward propagation.
Consider the output neuron (the blue neuron): it receives two weighted inputs, with weights W_4 and W_3, and has bias value b_3, so during backpropagation these values are updated as follows:
The gradient of the loss with respect to W_4 is calculated as:
Let's consider a linear activation function:
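Under these assumptions (linear activation and squared-error loss), the gradient of the loss with respect to W_4 follows from the chain rule. The sketch below uses made-up values for the hidden-layer outputs and parameters, and verifies the analytic gradient with a numerical check:

```python
def predict(h3, h4, W3, W4, b3):
    # Output neuron with linear activation: y_hat = W3*h3 + W4*h4 + b3
    return W3 * h3 + W4 * h4 + b3

def loss(y, y_hat):
    return (y - y_hat) ** 2          # squared-error loss

# Illustrative values (not taken from Figure 3.4/3.5).
h3, h4 = 0.6, -0.3                   # hidden-layer outputs feeding the neuron
W3, W4, b3 = 0.5, -0.2, 0.1          # parameters to be updated
y = 1.0                              # actual value

y_hat = predict(h3, h4, W3, W4, b3)

# Chain rule: dL/dW4 = dL/dy_hat * dy_hat/dW4 = -2*(y - y_hat) * h4
grad_W4 = -2.0 * (y - y_hat) * h4

# Numerical check of the same gradient via central differences.
eps = 1e-6
num = (loss(y, predict(h3, h4, W3, W4 + eps, b3))
       - loss(y, predict(h3, h4, W3, W4 - eps, b3))) / (2 * eps)
print(grad_W4, num)                  # the two values agree

# Gradient-descent update with an illustrative learning rate of 0.1.
W4 = W4 - 0.1 * grad_W4
```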
There are many ways to implement a neural network, such as writing our own class and defining all the weights, biases, and activation functions; here we use predefined modules in Python instead. We use the Scikit-learn library to create the neural network model, and the data used in this implementation is the Titanic dataset. Many fields have practical implementations of ANNs [5, 7] (Figures 3.6–3.9).
Figure 3.7 Initializing the model and training it and testing it.
Figure 3.8 Using the info method for information about the data.
Figure 3.9 Using the describe method for statistics of the data.
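The workflow shown in Figures 3.6–3.9 can be sketched roughly as follows; synthetic data stands in for the Titanic dataset (its loading and preprocessing are omitted here), and the hyperparameters are illustrative rather than those actually used in the figures:

```python
# Sketch of the Scikit-learn workflow: split the data, initialize the
# model, train it, and evaluate its accuracy on held-out data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the preprocessed Titanic features.
X, y = make_classification(n_samples=800, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# Illustrative hyperparameters: two hidden layers with ReLU activation.
model = MLPClassifier(hidden_layer_sizes=(16, 8), activation="relu",
                      max_iter=500, random_state=42)
model.fit(X_train, y_train)

acc = accuracy_score(y_test, model.predict(X_test))
print("accuracy:", acc)
```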
The trained model obtained an accuracy of 60.81%. The accuracy depends on many factors, such as the choice of activation functions, the number of neurons in each layer, the number of layers, and the optimization algorithm.
There are many open challenges related to blockchain technology, but the most common one is convincing the public to adopt it: many users associate blockchain with cryptocurrency, which is widely perceived as an illegal currency or one used by hackers and fraudsters. Another major challenge is speed, as blockchain-based transaction systems are very slow compared with other modern transaction systems.