Bristol Cryptography Blog: September 2011

Thursday, September 29, 2011

CHES Day 1, Session 3

This session was on elliptic curve cryptography. The first speaker (Jonathan Taverne) pointed out a long history on ECC at CHES. However, with new architectures come new challenges. For instance, past performance figures tended to concentrate on elliptic curves over prime fields, but with the new Intel PCLMULQDQ one would expect a performance boost for binary curves. When looking at the field arithmetic, this new operation will make multiplication less expensive compared to other operations, such as squaring or taking roots. When looking at curves, this has the effect that a double-and-add will gain more from the new instruction set than a halve-and-add method, so perhaps the former will once again be the new speed champion? This turned out not to be the case (and nor did the binary curves beat the prime curves yet), but along the way a very neat trick was presented: by cleverly writing the curve scalar multiplication, it turns out possible to perform half the computation using double-and-add and the other half using halve-and-add in parallel. This reminded me of a similar trick by Marcelo Kaihara and Naofumi Takagi, who split a modular multiplication in two (bipartite modular multiplication). Great fun!

Peter Schwabe continued the session in high speed. The topic here was high speed, high security signatures. A variant of Schnorr signatures were chosen, but with some interesting design choices. Remember that underlying a Schnorr signature is a (honest-verifier) zero-knowledge protocol of knowledge with the Fiat-Shamir heuristic applied to it. Conventional wisdom has it to send the hash value and the response value (both elements in the exponent group), but here instead the hash function was chosen to produce twice as many bits as one would expect plus group elements could be compressed efficiently. Thus sending the initial commitment suddenly made more sense than the hash value. Other small tweaks were the doubling up of the hash function as pseudorandom function to determine this initial commitment and the use of the hash function as pseudorandom generator to expand an initially smaller secret. The result were very fast and compact digital signatures. As a note on security, I think it was still random oracle, but generic collisions would not lead to a forgery. However, given the structure of the signature, my hunch is that chosen prefix collisions (as can be found for MD5) would lead to an existential forgery. In general, I'm somewhat skeptical of proofs in the random oracle where you can waiver some standard property, as usually whenever some non-trivial attack is presented, this is due to some structural flaw that might also be exploited otherwise.

The final talk of the session, and the first day of CHES'11, was by Junfeng Fan. He described an attack that combined fault injection and a side-channel to find a discrete logarithm. The underlying observation was very simple: if P is a point on a curve, change it by a fault injection into P' on the same curve, but such that P' has low order, say order 4. If P' is subsequently used in an exponentiation routine, repeated squarings lead to the point at infinity, which one can normally detect using a side-channel. A lot of the ideas that went into this attack seemed known to me, but the combination required an oracle that, given a curve, provided you with P and P' that were close together (to make the fault injection work). Even many implementations that are secure against side channels and fault injection separately, fall foul against this combination.

This was an enjoyable session and especially the trick of the first talk was rather neat!

Tuesday, September 27, 2011

Non-invasive attack testing workshop, day 2

Some interesting discussions were to be had today at NIAT about the challenges involved with designing testing regimes and metrics to assess the vulnerability a device has to side-channel attacks. Benjamin Jun and Pankaj Rohatgi of Cryptography Research gave two talks on their own testing methodologies for symmetric algorithms and for RSA, which essentially involved trying to test to see if the device produces any statistically significant information leakage by using hypothesis testing on acquired signals. The challenges for NIST to overcome are constructing a methodology that can be easily and quickly used across all testing laboratories, that can be used for conformance testing, and that does not require any exceptional operator skill.

A different perspective was provided in the final talk of the day was given by Francois-Xavier Standaert of the UCL Crypto Group at Université catholique de Louvain. The talk considered using the worst-case scenario for the device as a security metric. This is when the attacker has the most complete power model possible; essentially mounting an optimal template attack. In this way the evaluator can have some idea of the available 'security margin' for the device. Two particular analyses were proposed; one information theoretic analysis to assess the data complexity of the worst-case attack, and a second security analysis to assess the time complexity, because in practice a side-channel attack has to trade-off between the two. In the former analysis, an information theoretic metric based on mutual information was proposed as a better alternative to the success-rate of the adversary, because whilst success-rate is perfectly representative of all the available information, it is not as easy to interpret the results.

A cautionary note was given, which is that in practice we cannot exactly compute the mutual information because of the problem of needing to estimate probability density functions, an issue we've run into ourselves at Bristol. So instead MI was proposed to be renamed 'percieved information', to ensure that people remember that the estimation may not be perfect.

NIST's Non-Invasive Attack Workshop: Day 1

Luke Mather and I have been attending NIST's Non-Invasive Attack Workshop in Nara Japan. The major it of the presentations have been to do with the metrics and methods of evaluating secure implementations of cryptographic algorithms in microprocessors.

The discussion following a presentation on secure implementations of cryptographic algorithms on FPGAs was of particular interest. In response to the presentation Brad Hutchings from Brigham Young University, Utah pointed out that his team have developed a tool to control the routing on an FPGA. When implementing secure logic styles on an FPGS it is difficult to make sure that connections that are reproduced the same length (to ensure the power consumption is the same). Their tool provides a Java interface for imposing constraints on Xilinx designs. This is a source forge project called rapid smith.

Wednesday, September 21, 2011

ECRYPT yearly meeting, Leuven

Welcome back Leuven - this time for the ECRYPT annual meeting. I gave the reasearch talk on behalf of the MAYA working group which was an extended version of my talk at ESORICS last week. The VAMPIRE group presented "Another look at second order DPA".

SYMLAB talked about algebraic attacks on constructions like block ciphers or hash functions. Algebraic attacks model the primitive in question using a set of multivariate, nonlinear equations and then use generic techniques to solve these equations. The work complexity for this method is the complexity of the algebraic attack. Algebraic attacks are far from new but were dismissed early on as a problem that designers could easily deal with - indeed the generic approach for setting up equation systems is not particularly efficient.
But in 2003 algebraic attacks were rediscovered in the form of an efficient attack on a new stream cipher.

The example in the talk considered a keystream built from a n-bit LFSR initialised to the n-bit key whose outputs are fed through a compression function with one-bit range (so the construction produces one bit per clock of the LFSR). Writing this as a set of equations with L the LFSR and F the compression function, k the key and s the stream, the derived equation system looks something like

s[0] = F(L(k))
s[1] = F(L(L(k))
s[2] = F(L(L(L(k)))

and so on, which at some point becomes overdetermined and thus solvable in k (the s values are obviously known). L is linear so F is the main stumbling block here. In particular, creating and solving the equations takes polynomial time in the length of k but exponential in the degree of F.

It looks as if one can just pick the degree of F large enough to defeat this attack. The 2003 paper looked at an example where F had high degree but a low-degree annihilator (a function G with F.G = 0) and one could perform the attack using G instead.

The speaker argued that further such tricks could make algebraic attacks feasible against other ciphers; algebraic attacks should therefore not be written off too lightly.

Friday, September 16, 2011

CARDIS '11 - Invited speakers

CARDIS (Smart Card Research and Advanced Applications) was held this year at Katholieke Universiteit Leuven in Belgium. Helena Handschuh, chief technology officer of Intrinsic-ID, gave the first of two invited talks on SRAM PUFs.

A PUF is a physical unclonable function, a functional fingerprint for silicon chips designed to provide an unclonable means of identifying the device. The unclonable requirement extends as far as requiring that even the manufacturing process that created the original PUF should not be able to reproduce it for the same device. The way to achieve this is by creating a challenge-response situation by stimulating some physical structure on the device. The exact micro-structure is influenced by physical factors introduced during the manufacturing process, and is highly unpredictable. A PUF then has a way of mapping different stimuli (challenges) to unique reactions (responses).

A PUF can then be used to prevent a device being cloned, because the cloned PUF will react differently to the stimuli. With this property, it would be possible to create unclonable RFID tags, for instance. Some PUFs can also be used for secure key storage, by essentially one-time 'enrolling' a key on a device, and also in some particular cases as a way of generating entropy.

The physical structure that defines the PUF can be created in many different ways. For example, optical PUFs work by measuring how light from a laser is reflected when it of shone through randomly distributed glass scatterers; so the unclonability is achieved with explicitly-induced randomness. Additionally, by varying the angle of incidence of the laser light, a set of unique responses can be achieved. Alternatively, implicitly-induced randomness can be created with SRAM PUFs because the initial state of the start up cells at the device-power on is different for different devices. The rest of the talk concentrated on SRAM PUFs and their properties.

Helena spoke about the challenges involved in testing the effectiveness of their PUFs. One interesting comment was about how, over time, the physical structure that embodies the PUF can degrade through use. This could lead to the fingerprint of the device varying. To test this, a chip needs to be heated to 80c for 2000 hours to simulate 10 years of ageing. Factors that affect a PUF include the temperature of the chip, power-up times, and many others. Helena concluded with some interesting statistics illustrating just how much manpower and equipment have gone into assessing how these factors affect SRAM PUFs.

The second invited speaker was Olivier Ly of the Laboratoire Bordelais de Recherche en Informatique (LaBRI) at the University of Bordeaux. Olivier's talk revolved around the "Insight" framework, a piece of software designed for automatic analysis of programs in their binary state. The concept has applications in virus detection and assessing side-channel vulnerabilities.

Wednesday, September 14, 2011

ESORICS 2011 - Day 2

One talk I found particularly interesting was in yesterday's session on forensics, biometrics and software protection , and was entitled "Who Wrote This Code? Identifying the Authors Program Binaries", delivered by Nathan Rosenblum from the University of Wisconsin - Madison.

The talk considered how to recover provenance information from within program binaries and other forms of executable code. In this case, the goal is authorship attribution - identifying the author of a through the of a programmer's particular style. Trying to achieve this through an analysis of source code is perhaps easier; certain features are particularly strong at identifying a particular author. For instance, a programmer's indentation style, variable naming and use of standard library functions could quite heavily distinguish him or her from a different programmer.

However, source code is sent through many transformations (through a compiler, perhaps a code obfuscator) before being built into a executable program. The question to answer is how to indentify which features survive the most through this transformation process. The solution is a nice interplay between concepts in both security and machine learning. Author attribution is transformed into a machine learning problem, with control flow graphs and the underlying machine code forming the building blocks of the feature representation of a binary. A particularly interesting feature is the use of 'graphlets'; three-node subgraphs of the main control flow graph that reflect the local structure of a program.

The method was applied to two slightly different problems:
1) Identifying a particular author of a program;
2) Identifying similarities between programs of unknown origin,
and was tested on binaries from the Google Code Jam competition and submissions for a particular university course. The results, examined thoroughly in the paper, seem to indicate that programmer style is indeed preserved through the transformation process from source code into binary. Some nice applications of the technique could be in identification of known malware authors and detection of stolen software.

Tuesday, September 13, 2011

CryptoForma: David Nicol "Trade offs between privacy and performance in privacy-preserving traffic management"

As part of the CryptoForma meeting we were lucky to be at Kent at the same time as a visit for a departmental seminar by David Nicol of Univ Illinois. David's talk was about how information is leaked by connection information, even if the underlying data is encrypted. For example by correlating size or timing information of the encrypted packets.

Think of the problem of accessing say a banking web site. Even if the traffic is encrypted, a viewer may be able to determine you are performing a withdrawal simply due to the response time, or the size of the encrypted data being transferred. Thus the semantic security of the system is broken, since timing and message sizes reveal semantic information. Recall in standard definitions of semantic security, we only obtain security for equal length messages. Thus the security definition for semantic security does not meet the actual situation we see in the real world.

The main technique proposed was to insert cover traffic which looked like genuine encrypted traffic. The talk then discussed various experiments and analysis on the data. Looking not only at whether information could still be recovered, but also in terms of whether the process could be performed efficiently.

Monday, September 12, 2011

Esorics day 1

Today's sessions were mostly about web security - a welcome break from more theoretical cryptography and a reminder that crypto alone is not enough for building secure systems.

Code injection (usually SQL or HTML/javascript) is a major problem on the web despite its being known for a long time. As people will always make mistakes, several talks looked at automated approaches.

One proposed "complementary character coding" - each character (ASCII or unicode) gets two representations, a normal and a tainted one. Any data received from a client is automatically cast to the tainted representation by the server. The point is that only the non-tainted characters are parsed in SQL and the like, so a tainted quote character is just another character in a string. If web browsers adopted the same approach for HTML, it would get rid of javascript injection too.

Another talk looked at sanitation tools in popular web frameworks (PHP, django and so on) - the more traditional approach. All can do basic HTML sanitation but once you consider different contexts - URIs for example

<a href="$target" />

when

$target="javascript:alert('oh dear!');"

or even CSS (which can do JS too) it turns out only one framework of those studied can do it all - and that's one written in C++, not one of those fancy new languages!

Some authors looked at cross-site scripting in the context of scripts that are embedded in the page and thus don't fall under the same-origin policy (some advertisements, for example) but still shouldn't have access to all the page. Here the authors proposed a scheme in which each script runs in a "world" and can only access parts of the page/DOM that are declared writable in this world.

Yet another approach is client-side protection against cross-site requests as demonstrated by the authors of the CsFire firefox extension. The basic idea is to strip cookies from all requests that cross domain boundaries, thus defeating many attacks.
Some scenarios like single sign-on or delegated payment (think PayPal) require requests to pass between two domains with correct cookie handling but only when the delegated site returns control to the delegator - so CsFire allows them in this case only.

The day finished with a reception hosted by the mayor of Leuven.

Esorics, Day 1: Invited talk

Esorics opened today with a talk from Prof. Ross Anderson on "smart meters".

EU-wide, smart meters for electricity and gas are being installed in homes with a target of 100% coverage some time in the 2020s. They measure energy consumption in half-hourly intervals and can report this data over a wireless network. They can also be controlled remotely. This is the biggest europe-wide engineering project to date, dwarfing the channel tunnel. It's also supported by politicians across the board and by both energy companies and environmental organisations - as Ross Anderson says, that's a hint that not all is well.

The question is, who has access to the data and commands and how can they benefit from this?

Homeowners can track their energy usage and save energy or at least money by moving high-energy tasks (like charging an electric car) to times when energy is cheapest. The half-hourly metering allows for new tariff structures rather than today's day/night ones, in fact energy use could even be priced dynamically based on current demand and smart cars start to charge themselves when prices fall below a certain level.

Energy companies get much more exact data about their customers' energy usage, allowing them to devise new energy tariffs (which, if mobile phone tariffs are anything to go by, no-one will understand except the companies that make money off them). They can also remotely cut someone off (or switch them to prepayment only) if they fail to pay up, which is much cheaper than today's methods that may even involve sending in the police.

But anyone with access to this data can determine when a certain person is at home, how many people live in a home, when they usually work/cook/shower etc. (I personally suspect that even such factors like age/gender/"social class" could be guessed with some accuracy from energy graphs. A correlation between energy usage and certain TV series/sporting events would be an interesting research paper and a goldmine for advertisers.) So clearly this is private data that we may not want companies gathering on us.

And then there's the government, which in the UK at least has volunteered to build and run a giant database to hold all this information. This could allow targeted campaigns to reduce energy consumption. Or anything else they decide to do with the data. And of course, a government database with personal data of all its citizens could not possibly fall into the wrong hands.

Next up, enemies from the "chinese army" to the "militant wing of greenpeace" (both Ross' examples) have a potential "cybernuke" (again quoted from Ross) to shut down half the country's electricity for weeks.

Then there's conflicts of interest. Energy companies are there to make money, which they make best when people consume energy. Government is trying to reduce energy consumption and CO2. So who controls the smart meter, tariffs etc?

And finally (or perhaps encouragingly, looking at the potential problems with smart meters) the whole project looks like it will become "a classic government IT disaster" (Ross again).

O brave new world ...

Saturday, September 10, 2011

TGC 2011: Invited Talk -Resource Logics for Type-Based Authorization in Distributed Systems

The invited talk "Resource Logics for Type-Based Authorization in Distributed Systems" is given by Michele Bugliesi. The talk is related to authorization and attack in distribution systems.

Type systems for authorization are a popular device for the specification and verification of security properties in distributed systems. Resource logics are a crucial ingredient for the effectiveness of type-based authorization, and its ability to support the analysis of real-life applications, in which the freshness of communication and the effective number of transactions cannot be overlooked such ad e-banking, e-voting, etc.

Implementing resource-aware policies in a distributed environment is challenging, as exchanging transient resources over the network is exposed to the risk of replay attacks.

The speaker developed various patterns for the enforcement of resource-aware authorization by static typing. The speaker also showed the effectiveness of the approach on a number of applications, including cryptographic protocols for session-key establishment, and distributed protocols for e-payment and file hosting services.