Hi! In this class we are going to introduce you to the algorithm we have chosen to demonstrate how it is possible to create an hardware implementation of a system based on FPGA technologies by using the Xilinx SDAccel design framework. The algorithm we are going to work with has been chosen among the ones used in the computational biology field. In such a context, the development of computational biology increased the complexity of the algorithms used by biologists. The huge amount of data they need to process and the complexity of these algorithms, raised the problem of increasing the amount of computational power needed to perform the computation. The main reason for this is that during the years, thanks to an increase in genome technologies, a huge amount of genomic data has been collected. This massive increase in available data, paired with the complexity of the algorithms that needs to process this huge amount of data, is the reason why also the computational power that is necessary to efficiently perform the computation is increased! In such a scenario, hardware accelerators revealed to be efficient in achieving a speed-up in the computation while, at the same time, saving power consumption. FPGA-based systems have proved to be effective in optimizing the ratio among the performance and the power consumption as these devices allow to obtain a huge amount of parallelism, while keeping a relative low power profile. Among the multiple tasks performed in bioinformatics to analyze DNAs, one of the most compute intensive one is the task of identifying regions of similarity between sequences of DNA, RNA or protein, called Sequence Alignment. A very famous algorithm that performs sequence alignment is called the Smith-Waterman algorithm. The Smith-Waterman algorithm is a dynamic programming algorithm, guaranteed to find the optimal local alignment between two strings that could be nucleotides or proteins. This algorithm has been firstly introduced in 1981 by Mr Smith and Mr Waterman and it is based on a previously published algorithm called Needleman-Wunsch. It is quite interesting to notice that one of the reasons why we choose the Smith-Waterman algorithm is also because this algorithm is not only widely used in bioinformatics to identify similarity of regions among genomes, but it is also quite widely used in multiple different fields. As an example We can consider the computer science field, generally speaking, where it is used for similar document retrieval or near duplicate web page detection, or even fingerprint matching. If this is not enough, we can also consider the computer security field. Within this context, this algorithm can be used when performing deep packet inspection, trying to find known threats to our systems. The algorithm performs local sequence alignment between two strings, usually referred as query, or read, and the database sequences. As previously mentioned, the Smith-Waterman is a dynamic programming algorithm as it decomposes a big problem into smaller ones, storing the intermediate results and using them again when solving the next sub-problem. In our examples, we can organise it into four macro phases: Read, Compute, Traceback, and Store or Write. Given a query N, a database M, a similarity system and an affine gap function, the algorithm calculates the scoring matrix S and the Traceback Matrix T, whose dimension is NxM each. Finally it traces back the path of elements starting from the index identifying the maximum element in the similarity matrix, and terminating whenever a value with value zero is found. It is worthy to remember that one of the most important features of this algorithms is that it is guaranteed to find the optimal local alignment between the two strings, with respect to the scoring system that is is provided in input. Because of the high computational needs required by the algorithm, it has been modified during the years and multiple heuristic methodologies has been introduced. These heuristics from the one hand allow to increase the overall performance of the system, but from the other hand decrease the precision of the algorithm, that does not guarantee anymore the optimality of the alignment found. Furthermore, they may not be able to find similarities for sequences that are very distant. This is important, and we do have always to keep this in mind and, by doing this, try to find the right tradeoff in between algorithm precision and system performance. As it can be easily understood, to avoid the lost in precision, several solutions have been tried over the years to achieve better performance but without loosing in precision. Because of this, in the state of the art it is possible to find multiple implementation of this algorithm using general purpose CPU, and different hardware accelerators such as Graphical Processing Unit, and FPGAs. The performance of this algorithms are calculated by means of GCUPS or Giga Cell Update per Seconds. This measure of performance can be obtained multiplying the size of the query by the size of the database, and finally the result can be divided for the total execution time. At the end, what we are going to present in this module is an analysis and successive FPGA-based hardware acceleration of the Smith-Watermanalgorithm used to perform pairwise alignment of DNA sequences. Now, after having understood why we chose the Smith-Waterman algorithm from a general context perspective, let us now enter more into the details of the specific characteristics of this algorithm and on how it is going to be efficiently implemented on an FPGA because of them.