question about md5, rainbow tables, salting, etc.

Darren Dirt · October 16, 2015, 08:48:17 AM

Hey techie buddies, I know MD5 is super-vulnerable and SHA-256 is recommended etc. but I have a question.

What if instead of storing your normal "hash" (whether or not it came as a result of adding salt etc.) you instead stored a string that APPEARS to be a hash but in fact has a few characters replaced with something else that is actually used to produce the original hash?

Like, what is the vulnerability/flaw in the pseudocode below? Wouldn't it be completely immune to "rainbow tables" since it's not just "md5(key)", and it's not just "md5(md5(key))"?

Code Select



function md5(key)
{
	//normal algorithm
};

function mdfive(key)
{
	strSalt="5 digit random hex value".split('');
	strHash=md5( md5( md5( md5( md5( key+strSalt[0] )+strSalt[1] )+strSalt[2] )+strSalt[3] )+strSalt[4] )
	return strSalt.join('')+strHash.substring(5);
};

Just curious, no need to get into a discussion of why SHA-x is superior to MD5 etc... this is really just a question in general about encryption and storing hashes that may or may not be easy to crack via rainbow tables.

Tom · October 16, 2015, 08:56:19 AM

Often they get your code too. So your clever shenanigans are for naught!

Also, even if they don't have your code, they can just start by brute forcing, and figure out the pattern that way.

Mr. Analog · October 16, 2015, 08:56:38 AM

Then you are storing a cipher as a hash and is therefore security by obscurity, use at own risk (scream at own ass)

EDIT:
Actually I found a stack exchange question that sort of goes in this direction (block cipher vs hash functions)

http://security.stackexchange.com/questions/8048/why-aes-is-not-used-for-secure-hashing-instead-of-sha-x

Tom · October 16, 2015, 09:05:20 AM

Well, his original data still isn't reversible. And he drops the first 5 chars from the original hash as well, so it's gone too. It's just a custom hash, that may or may not have additional collision attacks.

Mr. Analog · October 16, 2015, 09:15:45 AM

Quote from: Tom on October 16, 2015, 09:05:20 AM
Well, his original data still isn't reversible. And he drops the first 5 chars from the original hash as well, so it's gone too. It's just a custom hash, that may or may not have additional collision attacks.

Like you said before by storing information about they key construction in code makes it vulnerable. Ideally if somebody gets your source somehow and it doesn't weaken your security (thanks server config!)

Thorin · October 16, 2015, 10:53:36 AM

As Tom said, if they have your database full of salted+hashed passwords, they most likely have your code to salt and hash those passwords. If they have the code and you cleverly messed with the hash, they'll just use that clever messing to make their rainbow tables match.

The way to defeat rainbow table attacks is:
1. don't let them get your password database
2. when they get your password database, salt each password with a different added value so that they have to create rainbow tables for every password they're trying to crack (see wiki)
3. when they do end up creating rainbow tables for every salt, have the passwords be minimum 20 characters in length so that the tables are so huge that mutliple terabyte drives can't hold them and no machines have enough physical ram to load the whole table

Beyond that, I've never been a proponent of "clever" code when it comes to security. Every "clever" bit of code I've seen has subsequently been shown to be weak.

Tom · October 16, 2015, 10:58:55 AM

Hash algorithms have also been designed to have minimal collisions. Messing with the hash after the fact, can (and usually does) increase collisions. Collisions means it's far easier to crack.

Darren Dirt · October 17, 2015, 06:16:28 PM

after reading https://en.wikipedia.org/wiki/Rainbow_table#Defense_against_rainbow_tables I wonder, how is that different in a huge way vs. what I was posing? Because essentially I was saying as a theory: store the salt with the resulting hash -- like NORMAL salted hashes do -- however THEN replace part of that final hash WITH THE SALT ITSELF so the # of characters is typical for the hash but the result is not a true hash aka it is not what crackers would expect (they would expect it to be e.g. 256 bits = 32 bytes, however virtually every final hash that they try to crack would not show up in their rainbow tables since it's not exactly a "valid" hash that got stored.

Hope this makes sense what I am saying now. Essentially I am saying "why include the salt as part of the resulting hash in an obvious distinguishable way* -- thus making it possible to separate the salt from the hash relatively easily -- instead why not 'blend it in' with the hash so on the surface it appears to be an un-salted hash?"

*since otherwise you would not be able to compare the computed hash of the key to test vs. the stored hash, right? (Unless the "salt" is buried inside the code, which is definitely not a good idea for obvious reasons)

PS: to assist in ensuring users have "chosen" really strong passwords, and to reduce to almost nill the usefulness of rainbow tables, why don't all password-hashing algorithms just "add" some long symbol-heavy string to the "key" before salting/hashing? In other words, EVEN IF Stupid User enters a password of "Pwd#123!" which is of course not "weak" since 8 characters and has lowercase and uppercase and numbers and symbols -- but short so possible vulnerable to rainbow tables, why not INSTEAD make your code say "hash the value of ' [user entered password]+[@!%^@^@!!aZ.,Za$] ' ?? That wouldn't exactly add to the computation time much (not compared to doing multiple runs of the hashing algorithm as I did in my example above) and it would ensure every hash stored is not gonna show up in a rainbow table. Especially if it is 15+ characters long right? HECK that means the stored info would still be secure even if the CODE was also grabbed by the crackes, right?

I may be confused though.

Thorin · October 17, 2015, 07:04:07 PM

The crackers will have your code so will see that you're putting the salt in the hash. So they'll adjust for your "clever" scheme. Trying to hide the salt in the hash is really just trying to obscure your technique, and obscurity is not security. Security should work even when the crackers have your source code.

Adding a string to the password to make it longer, well, that just means the crackers will add that string to every password they're generating a hash for in their rainbow tables. They'll end up with the same count of records in their rainbow tables.

Having a different salt for each password means that crackers need to create a set of rainbow tables for every salt. If they're after one specific user's password, they will eventually get it if they're willing to spend the time and the computing power and the money to buy the computing power. Salting is just to make it take longer and take away the easy mass password cracking activities.

Having a forced long password means that crackers need to create very large rainbow tables with huge numbers of hashes in them. Again, if they're willing to spend the time and money, they'll get it eventually. The idea is to make it take a couple of years of constant computing, so that they move on to easier pickings.

Darren Dirt · October 17, 2015, 10:03:30 PM

Quote from: Thorin on October 17, 2015, 07:04:07 PM
Having a different salt for each password means that crackers need to create a set of rainbow tables for every salt. If they're after one specific user's password, they will eventually get it if they're willing to spend the time and the computing power and the money to buy the computing power. Salting is just to make it take longer and take away the easy mass password cracking activities.

Having a forced long password means that crackers need to create very large rainbow tables with huge numbers of hashes in them. Again, if they're willing to spend the time and money, they'll get it eventually. The idea is to make it take a couple of years of constant computing, so that they move on to easier pickings.

^ idk if you noticed but the code I put in OP = a random salt, it might not be clear but since the salt is stored in the result I was obviously saying it would be random each time therefore "a different salt for each password" would happen. And re. the 2nd paragraph above my most recent suggestion/question kinda covers that by artificially adding a blurb to the end of the actual key then salting it (with something random each time of course) would that not result in a rainbowtable-proof output? Since it would if the normal hash was stored, why would that be different if what was stored was [normal-length hash] minus [X characters that are replaced with the salt] (instead of [normal-length hash] plus [salt] which would thus be longer than "normal" so be obviously salted).

Again, not trying to JUST use obscurity/obfuscation in place of actual security, just looking at a way of possibly improving by modifying the output in such a way as to appear un-salted -- but especially (my later post) making it so the input data is "idiot proof". Ideas to be discussed not just quickly dismissed. I mighta been less clear than I'd like, above. Or I could be completely missing something. Or I just like typing the word "Or".

Mr. Analog · October 17, 2015, 11:08:51 PM

Quote from: Darren Dirt on October 17, 2015, 10:03:30 PM
artificially adding a blurb to the end of the actual key then salting it (with something random each time of course) would that not result in a rainbowtable-proof output?

Not really, what you are doing is artificially lengthening the key

Original key "bob"
"Enhanced" key: "bob695b190b-d4b6-4b87-b760-a3066ec462c1" (better? but for how long?)

If I know how you are generating the enhancement to the key I now know something about the key, so that half actually becomes a liability. While "bob" is for sure bad, if I know how what you are appending to the original key I might be able to narrow down my search to say GUIDs (which is what this obviously is; alphanumeric, few symbols, structured in a particular way)

So I might have one routine generating two halves where I know the limits of one half (I could probably work out how much computational power would be required to crack it if I knew its limits / structure)

Thorin · October 18, 2015, 01:54:26 AM

I dunno, Darren, I can clearly see the flaw in your thinking and have already pointed it out. If you feel like that's me quickly dismissing your thinking, then I don't know how else to answer you.

Let me ask you this simple question: would this technique you're suggesting still be secure when the crackers have your code and understand your technique?

Darren Dirt · October 19, 2015, 09:15:09 AM

re. "artificially lengthening the key" yeah you make a good point, it would possibly open a door for the cracker that didn't need to be opened, since users choosing bad passwords is certainly a possibility and now (if they have the code) they can "find" those bad passwords with more certainty maybe.

Quote from: Thorin on October 18, 2015, 01:54:26 AM
I dunno, Darren, I can clearly see the flaw in your thinking and have already pointed it out. If you feel like that's me quickly dismissing your thinking, then I don't know how else to answer you.

Let me ask you this simple question: would this technique you're suggesting still be secure when the crackers have your code and understand your technique?

Including the salt in the output is what seems acceptable based on the stuff I read as a result of this thread... whether that salt is part of the same FIELD or in a different field in that table does not seem to be important, because having the salt visible in the output is required to compare hashes, and is NOT insecure if you are using a secure hashing algorithm. For example, is this post accurate in describing the standard practice? (i.e. 2 fields in the table, one for the HASH and one for the SALT) because that is really no different than a single field that is just [HASH]concat[SALT] right? <-- Perhaps this is where I am wrong about how salting is properly done?

for example, is this post accurate in describing the standard practice? (i.e. 2 fields in the table, one for the HASH and one for the SALT) because that is really no different than a single field that is just [HASH]concat[SALT] right?

If not, then all I am saying is, IF the salt being visible is not a vulnerability, then if you make your random salt allow values other than "0..f" then it's gonna be obvious "where" your salt is, and thus you have made it easier for the cracker to figure out what attack approach to take.

So from there I thought about "what if you made the salt APPEAR to be part of the hash, giving the impression that you did not actually use salt?" That might make the cracker presume you are more vulnerable than you actually are, and they might waste some time trying the wrong approach and then [the benefit to my idea, maybe] they might give up when they get "false positives" in their rainbow table lookups and essentially get garbage results in their test runs, etc.

Basically I was asking "is there a way of misdirecting the cracker initially, so they take the wrong attack approach up front"? (This would of course be with the presumption that the code is not stolen by the cracker too. But the resulting output is not suddenly super-easily-cracked even if the code is stolen, since the hashing algorithm is still a secure one being used, and being salted, what I am asking about is really just a different way of *storing* the output.)

HECK, instead of "replacing" a few characters of the hash output, how about this? What aabout if the hash is added as part of the output but you make the length appear to be an expected length of a more secure version. For example, your actual algorithm is hashing with SHA-256 but you choose your random salt to be a specific length -- with values 0..f only -- so that combined with the hash you would result in a string that has a length exatly equal to what SHA-512 would produce.

So forget "taking away" some of the effectiveness of a secure hash (which would increase collision maybe) instead I am saying what about a different way of storing the result in a single place that confuses the human being initially... and has a non-zero probability of making the cracker think you have perhaps encrypted your values instead of just hashing them or something, since if there are any rainbow table matches they would appear to be gobbledygook. (Which is likely what you think of my idea anyway

)

Darren Dirt · October 19, 2015, 09:34:56 AM

update: THIS is a great post that kinda touches on everything I was wondering AND what you guys were saying in response.

And it's more than 5 years old. But still relevant, and very clearly worded.

http://php.net/manual/en/function.hash.php#94104

Quote
Kai Petzke ? 6 years ago

The well known hash functions MD5 and SHA1 should be avoided in new applications. Collission attacks against MD5 are well documented in the cryptographics literature and have already been demonstrated in practice. Therefore, MD5 is no longer secure for certain applications.

Collission attacks against SHA1 have also been published, though they still require computing power, which is somewhat out of scope. As computing power increases with time and the attacks are likely to get better, too, attacks against systems relying on SHA1 for security are likely to become feasible within the next few years.

There is no lack of potential alternative hash algorithms, as the many choices for the "algo" argument of PHPs hash() function already suggests. Unfortunately, there is lack of analysis, as to how secure these alternative algorithms are. It is rather safe to assume, though, that the SHA2 family with its most prominent members SHA-256 und SHA-512, is better than SHA1.

When storing password hashes, it is a good idea to prefix a salt to the password before hashing, to avoid the same passwords to hash to the same values and to avoid the use of rainbow tables for password recovery. Unlike suggested in other articles, there is no security advantage in putting the salt in the middle, or even at both the beginning and the end, of the combined salt-password-string.

Rather, there are two other factors, that determine the strength of the salt: Its length and its variability. For example, using the same salt for all passwords is easy to implement, but gives only very little additional security. In particular, if users type the same passwords, they will still hash to the same value!

Therefore, the salt should be random string with at least as many variable bits, as there are bits in the hash result. In the user database, store username, the randomly generated salt for that user, and the result of hashing the salt-password-string. Access authentication is then done by looking up the entry for the user, calculating the hash of the salt found in the database and the password provided by the user, and comparing the result with the one stored in the database.

Also, this PHP.net/manual page gives an excellent, color-coded example of the format that has apparently become the standard way of storing [algo]+[salt]+[hash] as a single string value.

So the wheel has been very thoroughly invented, did not know that, good to know for the future. (tbh not even sure WHY this issue popped into my head, it's not like I'm currently working on any project that deals with user authentication, lol)

Thorin · October 19, 2015, 11:53:31 AM

Quote from: Darren Dirt on October 19, 2015, 09:15:09 AM
Including the salt in the output is what seems acceptable based on the stuff I read as a result of this thread... whether that salt is part of the same FIELD or in a different field in that table does not seem to be important, because having the salt visible in the output is required to compare hashes, and is NOT insecure if you are using a secure hashing algorithm. For example, is this post accurate in describing the standard practice? (i.e. 2 fields in the table, one for the HASH and one for the SALT) because that is really no different than a single field that is just [HASH]concat[SALT] right? <-- Perhaps this is where I am wrong about how salting is properly done?

The post you link to is exactly right for standard practice. Two columns in the user table in the database, one column has the salt, the other column has the hash of the salt+password. When a new user logs in, the code looks up the salt, concatenates the salt and password, hashes the concatenation, then compares it to the stored hash. If they match, the password is considered correct.

Quote from: Darren Dirt on October 19, 2015, 09:15:09 AM
So forget "taking away" some of the effectiveness of a secure hash (which would increase collision maybe) instead I am saying what about a different way of storing the result in a single place that confuses the human being initially... and has a non-zero probability of making the cracker think you have perhaps encrypted your values instead of just hashing them or something, since if there are any rainbow table matches they would appear to be gobbledygook. (Which is likely what you think of my idea anyway )

See, this is just trying to be clever to trick the crackers. The best assumption is that the crackers are smarter than you, not that you're smarter than the crackers.

Quote from: Darren Dirt on October 19, 2015, 09:34:56 AM
Also, this PHP.net/manual page gives an excellent, color-coded example of the format that has apparently become the standard way of storing [algo]+[salt]+[hash] as a single string value.

So the wheel has been very thoroughly invented, did not know that, good to know for the future. (tbh not even sure WHY this issue popped into my head, it's not like I'm currently working on any project that deals with user authentication, lol)

Yes, storing the algorithm that the password was hashed with has become a thing. It allows you to change the algorithm used to hash the salt+password without having to invalidate everyone's password. When you've decided to change the algorithm, the next time the user logs in you check their password with the old algorithm, and if it's good, then you re-create the hash with the new algorithm and update the record.

As far as I know most developers still store the salt and hash in separate columns, but some have decided to store them together in one field. The one thing I don't like about joining it all together is that future developers have to figure out just how to separate the three fields from the single string. It's more straightforward if the three fields are separated in the databases.