this post was submitted on 05 Jul 2025
The article calls out hash functions and links to the relevant Wikipedia page, so I don’t think this is solely about cryptographic hash functions, though that seems to be what you were talking to the other user about.
I see what you are saying. But if you aren't using a cryptographic hash function, then collisions don't matter in your use case anyway; otherwise you'd be using a cryptographic hash function.
For example, you'd use a non-cryptographic hash function for a hashmap. While collisions aren't exactly desirable in that use case, they also aren't bad, and in fact the whole process is designed with them in mind. And it doesn't matter at all that the distribution might not be perfect.
So when we are talking about a context where collisions matter, there's no question whether you should use a cryptographic hash or not.
Why wouldn’t collisions matter in a hash map? They directly affect the speed of the hash map. In fact I would venture to say that collisions directly affect speed in all situations. That matters, right? Especially at the language level.
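To make the speed point concrete, here is a rough sketch that counts how many equality comparisons a Python `dict` performs when every key hashes to the same value versus when the hashes are spread out. The `Key` class and `count_eq_calls` helper are made up for illustration, and the exact counts depend on the dict implementation, so only the relative difference is meaningful.

```python
class Key:
    """Hypothetical key type that can be forced to collide on every hash."""

    eq_calls = 0  # shared counter of __eq__ invocations

    def __init__(self, value, constant_hash):
        self.value = value
        self.constant_hash = constant_hash

    def __hash__(self):
        # With constant_hash=True every key lands on the same hash value.
        return 0 if self.constant_hash else hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value


def count_eq_calls(n, constant_hash):
    """Insert n keys, look each one up again, and return the number of __eq__ calls."""
    Key.eq_calls = 0
    table = {Key(i, constant_hash): i for i in range(n)}
    for i in range(n):
        # Look up with a fresh but equal key, so the dict has to compare keys.
        assert table[Key(i, constant_hash)] == i
    return Key.eq_calls


distinct = count_eq_calls(200, constant_hash=False)
colliding = count_eq_calls(200, constant_hash=True)
print(f"distinct hashes: {distinct} comparisons")
print(f"all colliding:   {colliding} comparisons")
```

Both runs return correct results; the colliding run just has to do far more comparisons to get there.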
If you have a hash collision in a cryptography context, you have a broken system. E.g. MD5 became useless for validating files, because anyone can create collisions without a ton of effort, and thus comparing an MD5 sum doesn't tell you whether you have an unmodified file or not.
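As a minimal sketch of that kind of file check, assuming SHA-256 is used in place of MD5 and that the expected digest comes from a trusted source: the function names and sample bytes below are invented for the example.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """Check data against a digest that was published alongside the file."""
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(sha256_hex(data), published_hex.lower())


# Hypothetical file contents, standing in for a real download.
original = b"release-1.0 contents"
published = sha256_hex(original)
tampered = b"release-1.0 contents (modified)"

print(matches_published_digest(original, published))
print(matches_published_digest(tampered, published))
```

The check is only as good as the collision resistance of the hash: if an attacker can produce a second file with the same digest, a match tells you nothing.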
In a hash map, collisions are part of the system. Sure, you'd like to avoid collisions if possible, but if you can't, you'll just have two values in the same bucket, no big issue.
In fact, a more complex hashing algorithm that guaranteed no collisions would likely hurt your performance more than the collisions themselves, because calculating the hash would take so long.
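A toy illustration of the "two values in the same bucket" case, assuming separate chaining (real hash maps may use open addressing or other schemes instead). `ChainedMap` and `terrible_hash` are hypothetical names, and the hash function is deliberately awful so that every key collides.

```python
class ChainedMap:
    """Toy hash map using separate chaining: each bucket is a list of (key, value) pairs."""

    def __init__(self, bucket_count=8, hash_fn=hash):
        self._buckets = [[] for _ in range(bucket_count)]
        self._hash_fn = hash_fn

    def _bucket(self, key):
        return self._buckets[self._hash_fn(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))  # a collision just extends the chain

    def get(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)

    def bucket_sizes(self):
        return [len(bucket) for bucket in self._buckets]


def terrible_hash(key):
    """Deliberately bad hash: every key collides."""
    return 0


m = ChainedMap(hash_fn=terrible_hash)
m.put("apple", 1)
m.put("banana", 2)
print(m.get("apple"), m.get("banana"))
print(m.bucket_sizes())
```

Both keys end up in bucket 0 and both lookups still succeed; the cost is that each lookup has to walk the chain in that bucket.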