A Supplementary Syllabus
If you're a professor teaching a database course, you may want to use
Translucent Databases as an additional textbook. You are welcome
to use this one-week module, which presents some of the most important
concepts from Translucent Databases. It consists of three parts
that roughly correspond to the three hours spent in a classroom in a typical
week.
Part I -- One-Way Functions
- One-way functions are easy to compute but hard to reverse.
- Some of the common ones are MD5, SHA, and raising a number to a
power modulo a prime number. This section will just use generic one-way
functions and call them h(x). There is no reason to do more with advanced
mathematics.
- Most common one-way functions are not truly impossible to reverse--
they're just practically impossible. Describe how hash functions like MD5
produce their answer. How long does it take to search for a collision? How
long does it take to mount a brute-force attack?
- Show how to protect passwords using this approach. Anyone can look
at the file and anyone can test a password presented as real. But no one
can take the password database and work backwards to determine the password.
- Show how to protect credit cards. (Some systems leave the last four
digits in the clear. Mention that this is a hint for how information is treated
in Part III.)
- Show how multiple people can use h(x) to look up information instead
of just x. This can be used to synchronize schedules or protect personal
information.
- Show how to design a store database that stores h(name) instead
of name.
- Emphasize that the regular SQL database features still work with
the fields of the database that aren't scrambled by h.
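The store example in Part I can be sketched as follows. This is only a minimal illustration, not the book's implementation: SHA-256 stands in for the generic h(x), and the table layout, the customer names, and the helper names build_store, orders_for, and total_sales are all hypothetical. The point is that lookups work by presenting h(name), while ordinary SQL such as SUM still runs on the columns left in the clear.

```python
import hashlib
import sqlite3


def h(x: str) -> str:
    """Generic one-way function. SHA-256 is an assumed stand-in for h(x)."""
    return hashlib.sha256(x.encode("utf-8")).hexdigest()


def build_store() -> sqlite3.Connection:
    """Create a hypothetical in-memory store that holds h(name), never name."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (customer TEXT, item TEXT, price REAL)")
    rows = [
        (h("Alice Smith"), "lamp", 20.0),
        (h("Bob Jones"), "desk", 150.0),
        (h("Alice Smith"), "chair", 45.0),
    ]
    db.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return db


def orders_for(db: sqlite3.Connection, name: str) -> list:
    """A customer who knows the name can find the rows by presenting h(name)."""
    cur = db.execute(
        "SELECT item FROM orders WHERE customer = ? ORDER BY item",
        (h(name),),
    )
    return [row[0] for row in cur.fetchall()]


def total_sales(db: sqlite3.Connection) -> float:
    """Regular SQL features still work on the unscrambled columns."""
    return db.execute("SELECT SUM(price) FROM orders").fetchone()[0]


if __name__ == "__main__":
    store = build_store()
    print(orders_for(store, "Alice Smith"))
    print(total_sales(store))
```

Someone browsing the table sees only hex digests in the customer column, so a query for the literal name matches nothing. A classroom discussion could follow up by asking how an attacker with a list of likely names might still test guesses against the stored values.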
Part II -- Determining Reality
- Digital signatures can use one-way functions. This section won't
use the more sophisticated, traditional versions like RSA or Diffie-Hellman,
although it could. It will only use simpler versions that are often called
Message Authentication Codes. Describe how these offer a weaker guarantee than true signatures.
- Someone can create a signature or MAC by computing h(password,document).
Only someone with the right password can check the signature and see if it
was generated from the document.
- Show how fake entries in the database can disguise the real ones.
- Only someone with the password can distinguish between the real
and the fake.
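The ideas in Part II can be sketched in a few lines. This example swaps in HMAC-SHA-256 from the Python standard library as one concrete way of computing h(password, document); the text itself only asks for a generic h. The row layout and the helper names mac, real_row, fake_row, and filter_real are hypothetical. Real rows carry a valid tag, decoy rows carry a random tag of the same length, and only someone holding the password can tell them apart.

```python
import hashlib
import hmac
import secrets


def mac(password: str, document: str) -> str:
    """Compute a MAC over the document. HMAC-SHA-256 is an assumed choice of h."""
    return hmac.new(
        password.encode("utf-8"), document.encode("utf-8"), hashlib.sha256
    ).hexdigest()


def real_row(password: str, document: str) -> tuple:
    """A genuine entry: the document plus its valid tag."""
    return (document, mac(password, document))


def fake_row(document: str) -> tuple:
    """A decoy entry: the tag is random but has the same length as a real one."""
    return (document, secrets.token_hex(32))


def filter_real(password: str, rows: list) -> list:
    """Keep only the documents whose tag verifies under the password."""
    return [
        doc
        for doc, tag in rows
        if hmac.compare_digest(tag, mac(password, doc))
    ]


if __name__ == "__main__":
    table = [
        real_row("s3cret", "ship at pier 4"),
        fake_row("ship at pier 9"),
        real_row("s3cret", "ship at pier 7"),
    ]
    print(filter_real("s3cret", table))
    print(filter_real("wrong guess", table))
```

Without the password every row looks equally plausible, which is the property the fake entries rely on. The same sketch also shows why a MAC is weaker than a signature: whoever can check a tag can also create one.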
Part III -- Blurring Reality with Quantization
- Quantization is the act of taking a number from a big set and assigning
it the closest value from a smaller subset.
- Rounding off values is one form of quantization.
- More sophisticated algorithms don't distribute the small set of
surrogates evenly over the larger set.
- Some basic algorithms block some fields if they make it too easy
to identify the human behind the record.
- Other algorithms add random amounts to the data to disguise the
true value.
- Some encrypt this random amount so some users can get the real values.
- Show how this can be applied to medical records used for research.
- Show how this can help hide the position of ships.
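Two of the techniques in Part III, rounding to an evenly spaced grid and adding a random offset, can be sketched as below. This is a minimal illustration under assumed parameters: the step sizes, the spread, the sample coordinates, and the function names quantize and blur are hypothetical and not taken from the book.

```python
import random


def quantize(value: float, step: float) -> float:
    """Replace a value with the nearest multiple of step (uniform quantization)."""
    return round(value / step) * step


def blur(value: float, spread: float, rng: random.Random) -> float:
    """Disguise a value by adding a random amount in [-spread, +spread]."""
    return value + rng.uniform(-spread, spread)


if __name__ == "__main__":
    # A ship's position reported only to the nearest half degree.
    latitude, longitude = 37.8, -122.4
    print(quantize(latitude, 0.5), quantize(longitude, 0.5))

    # A patient's age reported only to the nearest decade.
    print(quantize(47, 10))

    # A lab value with a random offset added before release.
    rng = random.Random()
    print(blur(98.6, 0.5, rng))
```

A coarser step or a larger spread hides more but leaves less for legitimate researchers, which is the trade-off the homework questions ask students to explore.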
Sample Homework Questions:
- Write a program to try random values of x until MD5(x) ends with
the eight-bit value FF. How many random values should it take? Run your
program. Do you come close? Repeat this 1000 times and report the average
number of samples that must be tested before one is found. Now, extrapolate
how long it will take your computer to find an input that
matches a complete 128-bit result from MD5.
- Create a tool for protecting medical records in a trial. Determine
which fields to scramble and which fields to leave in the clear.
- Describe some possible attacks against the scheduling algorithms
described in Chapter 4.
- Describe three ideal databases where one-way functions can prevent
abuse. Describe several examples where the technique will fail.
- Describe three ideal databases where false entries can distract attackers.
Describe several cases where the fake entries will corrupt the database.
Can this problem be avoided?
- Describe three examples where blurring data with quantization can
add enough confusion to block attackers. Can you think of examples where too
much confusion also confounds the regular users? Are there examples where
there's no middle ground?