Talk:MD5

From Meta, a Wikimedia project coordination wiki

Why is this page on Meta? --Nemo 17:10, 4 April 2014 (UTC)

Most probably because it has/had links to it (I think it is related to the MediaWiki documentation). Look at the page creation date....
I don't think it requires more than what it contains: it just says that the algorithm is used in MediaWiki, that it is no longer secure but is sufficient for uses in MediaWiki where security is not needed, and it does not explain in detail how it works (it just links to more complete resources that are not specific to MediaWiki).
It is also still useful here because MD5 is (or has been) used for some security mechanisms, and the page recalls that this should no longer be the case on any wiki. MD5 just remains as a convenient hashing function that is fast and gives a good statistical distribution. Here in MediaWiki its role is to randomize the storage paths of media files across a fixed set of subdirectories. This allows faster accesses and modifications, with less I/O and less cache memory needed to browse the contents, and it limits the number of collisions between concurrent exclusive accesses to directories during modifications (which would otherwise cause excessive delays and multiply the number of busy but locked threads in the server, each waiting for the other concurrent exclusive accesses to complete).
Such a hashing function is used in browsers (to manage large caches and concurrent accesses to them by browser threads and processes) and in most media servers; they typically use about 256 subdirectories to manage very large quantities of files whose list is constantly changing concurrently.
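As a rough illustration of the directory sharding described above, here is a minimal Python sketch. The function names `hashed_subdir` and `hashed_path` and the sample file names are hypothetical. The single-level layout keyed on the first two hex digits of the MD5 digest (256 buckets) is an assumption chosen to match the "about 256 subdirectories" figure; it is not claimed to be MediaWiki's exact layout.

```python
import hashlib
from collections import Counter


def hashed_subdir(filename: str) -> str:
    """Return a bucket name ("00".."ff") taken from the MD5 of the file name.

    MD5 is used here only to spread names evenly, not for security.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return digest[:2]  # first two hex digits -> 16 * 16 = 256 possible buckets


def hashed_path(filename: str) -> str:
    """Build a relative storage path of the form "<bucket>/<filename>"."""
    return f"{hashed_subdir(filename)}/{filename}"


if __name__ == "__main__":
    # Hypothetical file names, only to look at how evenly the buckets fill up.
    names = [f"Example_{i}.jpg" for i in range(100_000)]
    counts = Counter(hashed_subdir(name) for name in names)
    print("buckets used:", len(counts))
    print("smallest bucket:", min(counts.values()))
    print("largest bucket:", max(counts.values()))
```

Because the bucket depends only on the file name, a reader can recompute the path without any lookup table, and writers working on different files mostly end up locking different directories.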
There are similar distribution mechanisms in databases when they use hash buckets or B-trees for storing collections: for the optimisation to be effective, we always need a good hashing function which is still fast to compute.
Historically MD5 has been used (it was much better than simple hash sums, whose distribution is not flat enough), but today SHA1 would be as fast as (or most probably even faster than) MD5 on today's processors. This was not true on first-generation processors, which did not have full barrel shifters allowing a flat 1-cycle time to shift or rotate a word by any number of bits. My long-running experiments with MD5 and SHA1, with implementations carefully optimized to the same level, have long shown that SHA1 was even a bit faster than MD5 (and the old MD4 algorithm has long been known to be completely broken; thankfully we don't use it!).
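Anyone wanting to check the relative speed on their own machine could start from a sketch like the following. It times whatever MD5 and SHA1 implementations Python's `hashlib` happens to be linked against, not the hand-optimized implementations mentioned above, so the outcome depends on the CPU and on the library build. The helper name `time_hash`, the 1 MiB payload, and the repeat count are arbitrary assumptions.

```python
import hashlib
import timeit


def time_hash(name: str, payload: bytes, repeats: int = 200) -> float:
    """Return the seconds needed to hash `payload` `repeats` times with `name`."""
    return timeit.timeit(lambda: hashlib.new(name, payload).digest(), number=repeats)


if __name__ == "__main__":
    payload = b"\x00" * (1 << 20)  # 1 MiB of zero bytes, an arbitrary test input
    for name in ("md5", "sha1"):
        print(f"{name}: {time_hash(name, payload):.4f} s")
```

Running it several times and comparing the lowest figures gives a steadier picture than a single run.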
This small page could stay there if we ever need to discuss other possible uses of this algorithm in MediaWiki, in extensions, or in some wiki-based applications (but IMHO I tend to think that we can obsolete MD5 completely, replacing it with SHA1; this has long been done already for security applications). verdy_p (talk) 21:32, 4 April 2014 (UTC)
I've moved it to MediaWiki, after rediscovering this during one of my periodic Meta cleanouts. * Pppery * it has begun 05:10, 29 March 2024 (UTC)