Binary convolutional neural networks are an optimized form of conventional convolutional neural networks. By binarizing both the weights and the activations, each floating-point number is converted to -1 (encoded as 0) or +1, so the heavy matrix multiplications can be replaced with weighted bitwise XNOR and bitcount operations [1], which are better suited to hardware implementation. Single flux quantum (SFQ) circuits, on the other hand, are known for their low power consumption and high-speed operation [2], and a parallel structure can realize high throughput. The parallel structure of SFQ circuits is therefore well suited to designing high-speed, low-energy binary convolutional neural networks.
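The replacement of multiplication by XNOR and bitcount can be illustrated with a minimal software sketch. It assumes that +1 is encoded as bit 1 and -1 as bit 0, and it omits the scaling factors implied by "weighted"; the function names are hypothetical and do not describe the SFQ circuit itself.

```python
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two length-n vectors over {-1, +1}.

    Each vector is packed into an integer, with bit 1 meaning +1 and
    bit 0 meaning -1 (an assumed encoding).
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask   # bit is 1 where the signs agree
    matches = bin(xnor).count("1")     # bitcount (population count)
    # Each agreement contributes +1 and each disagreement contributes -1.
    return 2 * matches - n


def reference_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Same dot product computed with explicit multiplications."""
    total = 0
    for i in range(n):
        a = 1 if (a_bits >> i) & 1 else -1
        w = 1 if (w_bits >> i) & 1 else -1
        total += a * w
    return total
```

For a 3×3 kernel, `n` is 9, so the result always lies between -9 and +9.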
In this paper, two max pooling circuits are proposed and compared. Max pooling, a crucial component placed after the convolution layer, is the most conventional pooling method; it effectively reduces the amount of calculation and helps avoid over-fitting. The logic function of the max pooling circuit is to compare all input data and output the maximum value. A 2×2 max pooling circuit placed after a 3×3 convolution circuit needs to compare four 5-bit signed data and output the maximum value.
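The following sketch models the function of 2×2 max pooling in software. One plausible reading of the 5-bit width, which the abstract does not state explicitly, is that a 3×3 binary convolution over a single channel sums nine ±1 products, giving values from -9 to +9; this range does not fit in 4 signed bits but does fit in 5. The non-overlapping windows (stride 2), the two's-complement range check, and all names are assumptions for illustration.

```python
def conv_output_range(kernel_size: int = 3):
    """Range of a sum of kernel_size**2 products, each equal to -1 or +1."""
    n = kernel_size * kernel_size
    return -n, n


def fits_signed(value: int, bits: int = 5) -> bool:
    """True if value is representable in two's complement with `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def max_pool_2x2(feature_map):
    """Maximum over each non-overlapping 2x2 window (stride 2 assumed)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]
```

Each output element is the maximum of four inputs, which is the comparison that one max pooling circuit performs.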
The first max pooling circuit uses the conventional SFQ method for comparing two data, which is based on the sign of their difference after subtraction. To output the maximum value, however, it needs a large number of shift registers to store the input data in each pipeline stage while waiting for the result of the subtraction. To alleviate this problem, a second method is proposed to compare two data and output the maximum value. Instead of subtraction, all data are compared bitwise in the first two pipeline stages. The output is then calculated bit by bit from the most significant bit to the least significant bit. In this way, only the high-order output bits need to be stored in shift registers while waiting for the low-order calculation, rather than all the input data, so the circuit can be realized in a smaller area. Three such numerical comparison circuits are then arranged in a tree structure to realize the 2×2 max pooling circuit.
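The two comparison methods and the three-comparator tree can be contrasted with a behavioral sketch. This is a functional model only, not the pipelined SFQ design: the two's-complement encoding, the sign-bit inversion used to handle signed data, and the "alive" flags that model resetting the low-order bits of the smaller input are all assumptions, and the function names are hypothetical.

```python
BITS = 5  # data width stated in the abstract


def to_bits(value: int) -> list:
    """Two's-complement bits of value, MSB first (assumed encoding)."""
    return [(value >> i) & 1 for i in range(BITS - 1, -1, -1)]


def from_bits(bits: list) -> int:
    """Inverse of to_bits."""
    raw = 0
    for b in bits:
        raw = (raw << 1) | b
    return raw - (1 << BITS) if bits[0] else raw


def max_by_subtraction(a: int, b: int) -> int:
    """Select the larger input from the sign of a - b.

    Both inputs must stay available until the sign is known. Python
    integers are unbounded, so overflow of the difference is not modeled.
    """
    return b if (a - b) < 0 else a


def max_bitwise(a: int, b: int) -> int:
    """Produce the maximum bit by bit, from MSB to LSB."""
    a_bits, b_bits = to_bits(a), to_bits(b)
    # Invert the sign bits so that a larger value has a larger bit pattern.
    a_bits[0] ^= 1
    b_bits[0] ^= 1
    a_alive = b_alive = True
    out = []
    for x, y in zip(a_bits, b_bits):
        x = x if a_alive else 0   # low-order bits of the loser are reset
        y = y if b_alive else 0
        out.append(x | y)
        if a_alive and b_alive and x != y:
            # The first differing bit decides which input is larger.
            a_alive, b_alive = (x == 1), (y == 1)
    out[0] ^= 1                   # restore the two's-complement sign bit
    return from_bits(out)


def max_pool_tree(d0, d1, d2, d3, compare=max_bitwise):
    """Three two-input comparators arranged as a tree."""
    return compare(compare(d0, d1), compare(d2, d3))
```

In the bitwise model, each output bit depends only on the current input bits and on the decisions made at higher-order positions, which is consistent with the statement that only the high-order output needs to be held while the low-order bits are calculated.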
The layouts of the two max pooling circuits, realized by the subtraction and bitwise comparison methods, are both designed for the AIST ADP2 process with a critical current density of 10 kA/cm² for simulation purposes. Both can compare four 5-bit input data and output the maximum value. Since almost every pipeline stage in the bitwise comparison method needs to reset the low-order data according to the result of the high-order comparison, its bias margin and maximum frequency are slightly lower than those of the subtraction method. The number of Josephson junctions (JJs) in the bitwise comparison circuit is 7951, which is 18.36% fewer than in the subtraction circuit, and its area is also smaller.
References
[1] Rastegari, M., et al. "XNOR-Net: ImageNet classification using binary convolutional neural networks." European Conference on Computer Vision, 2016.
[2] Likharev, K. K., and V. K. Semenov. "RSFQ logic/memory family: A new Josephson-junction technology for sub-terahertz-clock-frequency digital systems." IEEE Trans. Appl. Supercond., vol. 1, pp. 3–28, 1991.
Keywords: Single Flux Quantum Circuit, Binary Convolutional Neural Networks, Max Pooling