ISSN:2249-5789 Deepa G M et al, International Journal of Computer Science & Communication Networks, Vol 3(1), 15-20

An Overview of Acoustic Side-Channel Attack
Deepa G.M., G SriTeja & Professor S Venkateswarlu
Department of Computer Science & Engineering, KL University, Andhra Pradesh, India

Abstract
The need for security is an important aspect of protecting private information. Side-channel attacks are a recent class of attacks that is very powerful in practice. Most side-channel attack research has focused on electromagnetic emanations (TEMPEST) and power consumption; however, one of the oldest eavesdropping channels is acoustic emanations. This paper focuses on the acoustic side-channel attack and surveys the methods and techniques employed in it; we also look at some of the devices that are under threat from such attacks and, finally, at countermeasures that help reduce the risk.

1. Introduction
Security is central to privacy, and a system is only as strong as its weakest link. We live in a world in which sensitive data is controlled and distributed using computer systems. Much effort has gone into protecting this information with a wide array of cryptographic schemes, protocols and security systems, but many concerns remain for systems whose physical implementations can be accessed. For example, systems such as ATMs, which we use in everyday life, are vulnerable to implementation attacks through
their cryptographic protocols. Physical attacks on cryptographic embedded devices take advantage of implementation-specific characteristics to recover the secret parameters involved in the computation. Side-channel attacks are a class of such physical attacks in which an attacker tries to exploit physical information leakage from those devices. Side-channel leakages include power, electromagnetic radiation, sound/acoustic emanation, light emanation, etc., as shown in Fig 1.1 [1]. Attacks on these leakages can be classified as invasive, semi-invasive and non-invasive:

Invasive attack: this attack involves tampering with the device to get direct access to its internal components.
Semi-invasive attack: this kind of attack involves access to the device, but without making any direct contact with it; an example is the fault-based attack.
Non-invasive attack: this attack involves close observation of externally available information, which is often leaked unintentionally. Examples are the electromagnetic attack, the power attack and the acoustic attack.

Fig 1.1 Different leakages of cryptographic device (an embedded cryptographic device leaking visible light, power consumption, execution time, electromagnetic radiation, faulty output and sound emanation)

1.1 Known Side-channel Attacks
The most popular known side-channel attacks are as follows:

Timing Attack: A timing attack is a way of obtaining a user's private information by carefully measuring the time it takes the user to carry out cryptographic operations. The objective of this attack is very simple: to exploit the timing variance in the operation.

Fault Attack: Fault attacks are practical and effective attacks against cryptographic hardware devices such as smart cards. The attacker intentionally induces a fault in the device to learn about the cryptographic operation and to retrieve the secret key to some extent. Countermeasures include protecting the device from faults and performing operations repeatedly.

Power Attack: The principle of the power attack is that the power consumption of a cryptographic device may reveal much information about the operations that are taking place and the parameters involved.

Electromagnetic attack: The components of electrical devices usually emit some electromagnetic radiation while operating. The EM attack is a non-invasive attack in which the electromagnetic radiation emitted by such a device can be used
to analyze the device's internal operation.

Acoustic Attack: Emanations produced by electronic devices have long been a source of attacks on the security of systems, and acoustic emanation is one such kind. This attack is based on the assumption that the sound of clicks differs slightly from key to key; by analyzing this difference, we can guess the text that is being produced. We discuss this in detail in later sections.

This paper outlines the concept of the acoustic side-channel attack. Section 2 gives a brief idea of the acoustic attack. Section 3 describes the technical requirements for carrying out this attack, and Section 4 lists the devices which can be attacked using acoustic emanations and, finally, ideas for countermeasures.

2. Acoustic Attack
Audio emanated by a device is a potential vulnerability for side-channel attacks. The acoustic attack is mainly based on the hypothesis that the sound produced by the keys differs slightly from key to key, although the clicks of different keys sound similar to the human ear. Figure 2.1 shows an overview of the acoustic attack. It has two main phases, called the training phase and the recognition phase.

Training phase: In this phase, a sequence of words from a dictionary is analyzed for its characteristic sound features, which are stored in a database. For best results, the setting should be close to the setting in which the actual attack is carried out. The main steps of the training phase are as follows:

1. Feature extraction: This technique of feature extraction is taken from speech recognition and music processing. The most interesting features for printed sounds occur above 20 kHz, and a logarithmic scale cannot be assumed for them. We therefore split the recording sample into single words based on the intensity of the frequency band between 20 kHz and 48 kHz, and spread the filter frequencies linearly over the frequency range. We subsequently use digital filter banks to perform sub-band decomposition on each word [3]. Sub-band
decomposition gives better results than simple resolution. The output of the sub-band decomposition is smoothed to make it more robust to environmental noise. The extracted features are stored in a database which is used in further processing.

2. Computation of language model: To support the next phase, we complement acoustic information with information about the occurrence likelihood of words in their linguistic context (e.g., the sequence "such as the" is much more likely than "such of the"). More specifically, we estimate for each word in our lexicon n-gram probabilities, i.e., the likelihood that the word occurs after a sequence of n − 1 given words. These probabilities make up a (statistical) language model. Probabilities are computed from frequency counts of n-place sequences (n-grams) in a corpus of text documents. We need to extract these frequencies from a sufficiently large corpus, which makes up the second step of the training phase.

Figure 2.1 Overview of acoustic attack (Training Phase: training data, acoustic feature extraction, features, database; Recognition Phase: attack data, acoustic feature extraction, features, select candidate words, words, ordered words)

Recognition phase: This phase uses the characteristic features of the trained words to recognize new sound recordings of printed text, supported by suitable language-correction techniques. The main steps are as follows:

1. Select candidate words: We start by extracting the features of the target text, as in the first step of the training phase. We then compare the features of the recorded attack data with the features of the words in the database. If the features extracted from different recordings of the same word were always identical, one would obtain a unique correspondence between trained features and target features. However, measurement
variations, environmental noise, etc. show that this is not the case. Multiple recordings of the same word sometimes yield different features; for example, printing the same word at different places in the document results in different acoustic emanations. Conversely, recordings of words that differ significantly in their spelling might yield almost identical sound features. We therefore treat the selected trained word as a random variable conditioned on the printed word, i.e., every trained word is a candidate with a certain probability. With sufficiently good feature extraction and distance computation between two features, the probabilities of one or a few such trained words will dominate for each printed word. The output of the first recognition step is a list of the most likely candidates, given the acoustic features of the target word.

2. Language-based reordering to reduce word error rate: Finally, we try to find the most likely sequence of printed words. Although always picking the most likely word based on the acoustic signal might already yield a suitable recognition quality, we employ Hidden Markov Model (HMM) technology, in particular language models and the Viterbi algorithm, which is regularly used in speech recognition, to determine the most likely sequence of printed words. Intuitively, this technology works well for us because most errors that we encounter in the recognition phase are due to incorrectly recognized words that do not fit the context; by making use of linguistic knowledge about likely and unlikely sequences of words, we have a good chance of detecting and correcting such errors [4]. The use of HMM technology yields word accuracy rates of 70 % on average for the general-purpose corpus, and up to 95 % for the domain-specific corpus.

3. Technical Requirements for Acoustic Attack
This section describes the technical requirements of an acoustic attack.

3.1 Analyzing Audio Frequency
Choosing correct features of keystroke is critical in
differentiating between the keys. Such features should be consistent for individual keystrokes, i.e., they should appear each time a given key is pressed; they should also be unique, varying from key to key. Experimental work in [6] shows that the best features for speech recognition are in the frequency domain, not the time domain. The difference between the frequency responses of different key presses comes from the physical location of the keys on the keyboard. To compare the frequency responses of different keys, we can follow a variety of ways to compare signals [6]:

i. Sum of squared differences: Given two arrays of FFT'd keystrokes, the difference of each pair of corresponding FFT values is squared and added to a cumulative sum. Lower sums mean a better match.
ii. Peak alignment: Given two arrays and one of various peak detection schemes, the arrays are aligned at the peaks, and then the sum of squared differences is computed. The goal is to minimize needless error resulting from skew.
iii. Sliding: One array is slid over the other array, and for each slide the sum of squared differences is taken.
iv. Convolution: The two arrays are convolved, and the index of the maximum value is then used as an offset to find where the two arrays best line up. This is nearly a repeat of the peak alignment method, but with a greater mathematical basis for believing that it results in a logical comparison of signals.
v. Compare: The compare method is used to compare unknown key presses to our known training data. The lowest result from the compare method with a known key indicates that the unknown key is the same as that known key.

3.2 Triangulation Method
The triangulation method is used to determine the position of the key. Given a source of sound, i.e., a key press (K5), and two distinct recording devices, i.e., microphones (M1 and M2) in different positions as shown in Figure 3.1, we can measure
the difference in the time it takes for the sound to reach each microphone. Assuming a good positioning of the microphones, each key will produce a unique difference in the time to arrival (TTA) at the two microphones. This difference in TTA is proportional to the difference in the distances from the key to each microphone. We can exploit this fact to guess which keys are pressed simply by listening for them.

Figure 3.1 Triangulation Method (key K5 and microphones M1 and M2)

3.3 Processing Technology – HMM
This section describes a technique based on language models that further improves the quality of reconstruction. This technique helps to improve the word recognition rate.

3.3.1 Hidden Markov models (HMMs)
HMMs are graphical models for recovering a sequence of random variables which cannot be observed directly from a sequence of observed variables. The random variables are modeled as hidden states, the output variables as observed states. HMMs have been used for many tasks that deal with language processing, such as speech recognition [7, 8, 9], handwriting recognition [11] or part-of-speech tagging [10, 12]. Formally, an HMM of order $d$ is defined by a five-tuple $(Q, O, A, B, I)$, where $Q = (q_1, q_2, \ldots, q_N)$ is the set of (hidden) states, $O = (o_1, o_2, \ldots, o_M)$ is the set of observations, $A : Q^{d+1} \to [0, 1]$ is the matrix of state transition probabilities (i.e., the probability of reaching state $q_{d+1}$ when being in state $q_d$ with history $q_1, \ldots, q_{d-1}$), $B : Q \times O \to [0, 1]$ are the emission probabilities (i.e., the probability of observing a specific output $o_i$ when being in state $q_j$), and $I : Q^d \to [0, 1]$ is the set of initial probabilities (i.e., the probability of starting in state $q_i$). Figure 3.2 shows a graphical representation of an HMM, where white circles represent hidden states and grey circles represent observed states.

Figure 3.2 HMM (hidden states q1, q2, ..., qN linked by transitions a12, a23, ..., aN-1N; observed states o1, o2, ..., oM linked to the hidden states by emissions b11, b22, ...)

We use HMMs in two phases: the training phase and the recognition phase. For the training phase, the initial probabilities, which model the probability of starting in a given
state, and the transition probabilities, which model the likelihood of different words following each other in an English text, can be obtained by building a language model from a large text corpus. To address the second phase, determining the most likely sequence of hidden states (i.e., the recorded text), we can use the Viterbi algorithm [13].

Building the language model: A language model of size n assigns a probability to each sequence of n words. The probability distribution can be estimated by computing the frequencies of all n-grams from a large text corpus. Since n-grams with probability zero will never be selected by the Viterbi algorithm, we smooth the probabilities by assigning a small probability to each unseen n-gram. The length of an n-gram determines how many words of context are taken into account by the language model. Higher values of n can lead to better models but also require exponentially larger corpora for an accurate estimation of the n-gram probabilities. The higher the value of n, the larger the likelihood that some n-grams never appear in the corpus, even though they are valid word sequences and thus may still appear in the text.

Reordering words based on the obtained language model: Having built the language model, we can reorder the candidate words using the model to select the most likely word sequence [4]. This task is addressed by the Viterbi algorithm [13], which takes as input an HMM $(Q, O, A, B, I)$ of order $d$ and a sequence of observations $a_1, \ldots, a_T \in O^T$. Its state consists of a table $\Psi$ indexed by $\{1, \ldots, T\} \times Q^d$. First, the $d$-th step is initialized (the earlier ones are unused) according to the initial distribution, weighted with the observations:

$$\Psi_{d,i_1,\ldots,i_d} = I_{i_1,\ldots,i_d} \prod_{k=1,\ldots,d} B_{i_k,a_k} \quad \forall\, 1 \le i_j \le N.$$

In the recursion, for increasing indices $s$, the maximum over all previous values is taken:

$$\Psi_{s,i_1,\ldots,i_d} = B_{i_d,a_s} \max_{i_0 \in Q} \left( A_{i_0,i_1,\ldots,i_d}\, \Psi_{s-1,i_0,\ldots,i_{d-1}} \right) \quad \forall\, s > d,\ 1 \le i_j \le N.$$

Finally, the sequence of hidden states can be obtained by backtracking the indices that contributed to the maximum value in each recursion step.

4. Devices under Threat and its Countermeasures
Secret information leakage caused by emanations from electronic devices has been a topic of concern for a long time. Emanations such as sound produced by electronic devices can come from different sources [5]. Sound as a wave carries information in the form of frequency, wavelength and amplitude, which can be measured by an audio capturing device such as a microphone. The most notable sources for acoustic attacks have been keyboards, ATM keypads and printers, and such attacks are mainly applied to capturing login details, passwords and other secret information.

An obvious countermeasure against the acoustic attack is a silent keyboard, which does not produce much sound. It can be a keyboard made of rubber or a touchpad [14], or a keyboard based on touchscreen or TouchStream technologies [15]. More recently, virtual keyboards have appeared that can be projected onto a flat surface [16], as have printers with acoustic shielding foam, which reduces the sound they emit. These choices are more expensive than the standard mechanical keyboard, and typing on a standard keyboard is much more comfortable than typing on a touchscreen or a rubber keyboard.

The options above are useful for avoiding the emanation of sound from devices, but there are also other ways to prevent these attacks from taking place:

Distance: the recognition rate drops substantially if the distance between the device and the microphone is increased.
Obstacle: any obstacle between the device and the microphone can prevent the sound from reaching the recording device.
Avoiding contact with microphones: the absence of microphones near the emanating device is sufficient to protect privacy.

Conclusion
This paper gives an overview of the acoustic side-channel attack and presents different techniques, such as HMMs (Hidden Markov Models), the triangulation method and the reordering of words using the Viterbi algorithm, for recognizing the data that has been recorded. Finally, it lists some countermeasures to avoid and overcome the attack.

Reference
[1] YongBin Zhou, DengGuo Feng. Side-Channel Attacks: Ten Years after Its Publication and the Impacts on Cryptographic Module Security Testing. State Key Laboratory of Information Security, Institute of Software, Chinese Academy of Sciences, Beijing, 100080, China.
[2] Thomas S. Messerges. Power Analysis Attack Countermeasures and Their Weaknesses. Security Technology Research Laboratory, Motorola Labs, Motorola.
[3] Meinard Müller. Information Retrieval for Music and Motion. Springer, 2007.
[4] Michael Backes, Markus Dürmuth, Sebastian Gerling, Manfred Pinkal, Caroline Sporleder. Acoustic Side-Channel Attacks on Printers. Saarland University, Computer Science Department, Saarbrücken, Germany; Saarland University, Computer Linguistics Department, Saarbrücken, Germany.
[5] Richard Frankland. Side-Channels, Compromising Emanations and Surveillance: Current and Future Technologies.
[6] Dmitri Asonov and Rakesh Agrawal. "Keyboard Acoustic Emanations". IBM.
[7] Lawrence R. Rabiner. "A tutorial on hidden Markov models and selected applications in speech recognition."
[8] Biing-Hwang Juang and Lawrence R. Rabiner. "Hidden Markov models for speech recognition."
[9] Frederick Jelinek. "Statistical Models for Speech Recognition". MIT Press.
[10] Kenneth W. Church. "A stochastic parts program and noun phrase parser for unrestricted text."
[11] R. Nag, Kin Hong Wong, and Frank Fallside. Script recognition using Hidden Markov Models.
[12] Steven DeRose. Grammatical category disambiguation by statistical optimization. Computational Linguistics.
[13] Hidden Markov Models. http://cs.brown.edu/research/ai/dynamics/tutorial/Documents/HiddenMarkovModels.html
[14] The virtually indestructible keyboard. http://www.grandtec.com/vik.html
[15] TouchStream keyboards. http://www.fingerworks.com/
[16] Canesta keyboards. http://www.canesta.com/products.html
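The language-based reordering of Section 3.3 can be illustrated with a minimal Python sketch. It assumes a first-order model (d = 1, i.e., bigrams) rather than the general order-d formulation, uses simple additive smoothing for unseen bigrams, and works in log space to avoid underflow. The function names, the smoothing constant `alpha`, and the candidate scores are illustrative assumptions and are not taken from [4].

```python
from collections import Counter
import math


def bigram_model(corpus_sentences, alpha=1.0):
    """Build a smoothed bigram model; returns a function log P(word | prev)."""
    prev_counts, pair_counts, vocab = Counter(), Counter(), set()
    for sentence in corpus_sentences:
        words = ["<s>"] + sentence.split()
        vocab.update(words)
        for prev, word in zip(words, words[1:]):
            prev_counts[prev] += 1
            pair_counts[(prev, word)] += 1
    vocab_size = len(vocab)

    def log_prob(prev, word):
        # Additive smoothing: unseen bigrams keep a small non-zero probability,
        # so the Viterbi search can still select them.
        numerator = pair_counts[(prev, word)] + alpha
        denominator = prev_counts[prev] + alpha * vocab_size
        return math.log(numerator / denominator)

    return log_prob


def viterbi_reorder(candidates, log_prob):
    """Pick the most likely word sequence.

    candidates: one dict per printed word, mapping each candidate word to its
    (positive) acoustic probability from the candidate-selection step.
    """
    # Initialisation: start-of-sentence transition times acoustic score.
    scores = {w: log_prob("<s>", w) + math.log(p)
              for w, p in candidates[0].items()}
    backpointers = []
    # Recursion: keep, for every candidate, the best predecessor.
    for position in candidates[1:]:
        new_scores, pointers = {}, {}
        for word, acoustic in position.items():
            best_prev = max(scores, key=lambda q: scores[q] + log_prob(q, word))
            new_scores[word] = (scores[best_prev] + log_prob(best_prev, word)
                                + math.log(acoustic))
            pointers[word] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Backtracking from the best final candidate.
    word = max(scores, key=scores.get)
    path = [word]
    for pointers in reversed(backpointers):
        word = pointers[word]
        path.append(word)
    return path[::-1]
```

With a toy corpus in which "such as the" occurs more often than any sequence containing "such of", the sketch prefers "as" over "of" for the second word even when "of" has the slightly higher acoustic score, which is the kind of context-based correction described in Section 2.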