Talk:IP Editing: Privacy Enhancement and Abuse Mitigation/Archives/2022-01
Please do not post any new comments on this page. This is a discussion archive first created in January 2022, although the comments contained were likely posted before and after this date. See current discussion or the archives index.
Admins' responsibility
Hi. Does this come with any additional responsibility for administrators not to make this hidden information public in any possible way? Piastu βy język giętki… 18:37, 4 January 2022 (UTC)
- They already have that responsibility. 86.14.197.26 18:53, 4 January 2022 (UTC)
What if not? Preset sanctions
What happens if the people who promise to keep IPs confidential eventually publish this information?
I believe that there must be preset sanctions, so that users can lose their rights if they fail to keep the IP information confidential. --FocalPoint (talk) 18:58, 4 January 2022 (UTC)
Doubts
What worries me about this is that admins will have easy access to these IPs. I would rather not have it myself. Because what then? A time will come when the police, or some other agency, turns up and says "give yourself access, or else..." And will I stand up to them then? Forget it. I will give myself access as quickly as I can. So I would prefer that there be some procedure for this that is not entirely easy. --5.60.193.155 19:59, 4 January 2022 (UTC)
Cookie based behaviour
There are two questionable statements in the description of how cookie based identities work:
- It is likely that many users will end up with a semi-permanent talk page unless they specifically try not to.
- Another advantage to note is that talk page messages will no longer end up with the wrong recipient in any scenario.
Many web sites (using cookies for sessions) recommend that you clear your cookies afterwards. With normal browser setups, clearing cookies for that site only is difficult, so most people will clear all cookies, meaning they will have a new talk page every time. Of course, many will not, but the wording suggests that the system generally leads to persistent talk pages. Also people using a public computer will often get cookies cleared between sessions, even if they routinely use the same computer.
There are still scenarios with shared talk pages. The simplest is a desktop computer used by a family, with no individual accounts, or with family members occasionally using each other's accounts on the computer. I think this is quite common. Another scenario is a public computer, where cookies are not cleared between uses. People might not log out, so even where cookies are cleared on logout, unrelated people might use the same cookie.
If we want talk pages to remain stable for unregistered users, we might want to provide the cookie in plaintext on their screen, allowing them to save it, and allow restoring the identity by entering that cookie somewhere (it doesn't of course need to be the cookie itself, but I see no reason to use anything else). That would be a login without registration, but the first point suggests that that is what we want.
–LPfi (talk) 20:19, 4 January 2022 (UTC)
- Browsing in incognito mode, as many anonymous users do, completely negates any advantage that may have been imagined for this session-identity idea, because the entire session (cookies and all) disappears when the browser window is closed. Definitely not ready for rolling out yet. Needs much more thought. Anachronist (talk) 02:08, 5 January 2022 (UTC)
The encyclopedia that anyone can edit
It's not a good sign when we are asked to comply with a legal directive whose scope is unclear. By what authority is that directive issued? In whose name? See Instruction creep, and en:Instruction creep.
When an editor is used to jumping in to edit an article on a favorite topic, without logging in, they are currently exposing their IP address, which is of interest, for example, to authoritarian regimes. If WMF were to assign a time/device-based identifier to that editor, an edit session and cookie would get created on the editor's device (phone/tablet/computer) on their behalf, by default without any mental effort of their own. If the editor were to select their own identifier, that cookie would have their en:username embedded in it. A sock puppet investigation, or vandalism block, would be aided by a database of this kind of information. This would embargo the devices of those malicious users who have not selected a username.
At the very least, the editor is going to have to acknowledge that they have read, and intend to comply with, the encyclopedia's published standards, such as the en: five pillars. --Ancheta Wis (talk) 20:41, 4 January 2022 (UTC)
Impact on AbuseFilter
I think it would be best if the impact of IP masking on AbuseFilter were clearly explained on this page on Meta.
Moreover, if the session-based approach is used, it would be ideal if AbuseFilter could access both the IP and the session-based masked username in its rules. For instance, AbuseFilter currently allows throttling by IP, and it would make sense for it to be able both to throttle by IP (i.e. across masked usernames from the same IP) and to throttle by masked username (even if the user changes IP). Huji (talk) 01:29, 5 January 2022 (UTC)
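A minimal sketch of the dual throttle described above, written in Python rather than in AbuseFilter's own rule syntax. The class name `DualThrottle`, its parameters and the sliding-window policy are assumptions made for illustration; nothing here reflects how AbuseFilter is actually implemented.

```python
import time
from collections import defaultdict, deque


class DualThrottle:
    """Hypothetical sliding-window limit applied per IP and per masked username."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # key -> timestamps of recent actions

    def _count(self, key, now):
        # Drop timestamps that have left the window, then count the rest.
        q = self.hits[key]
        while q and now - q[0] >= self.window:
            q.popleft()
        return len(q)

    def allow(self, ip, masked_name, now=None):
        now = time.time() if now is None else now
        keys = (("ip", ip), ("name", masked_name))
        # Refuse if either the IP or the masked username is over the limit.
        if any(self._count(k, now) >= self.limit for k in keys):
            return False
        for k in keys:
            self.hits[k].append(now)
        return True
```

With this shape, clearing cookies (a new masked username on the same IP) still runs into the per-IP limit, and changing IP while keeping the cookie still runs into the per-name limit.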
How do we handle IP vandalism, and also shared IPs violations
My questions are ultimately about how users report IP violators at English Wikipedia: Administrator intervention against vandalism, and also how we handle Wikipedia: Template:Shared IP violations. Maile66 (talk) 01:43, 5 January 2022 (UTC)
Question
Hello! When is this project scheduled to be completed? AlPaD (talk) 20:37, 4 January 2022 (UTC)
- Per the message, they will decide on the final move after January 17. —Atcovi (Talk - Contribs) 21:16, 4 January 2022 (UTC)
- OK, thank you. AlPaD (talk) 07:08, 5 January 2022 (UTC)
For small wikis, I think the IP based approach is better
For small wikis, I think the IP-based approach is better because it is unlikely that two anonymous users will have the same IP, and for a vandal, changing their IP is more difficult than erasing cookies. --Gat lombard (talk) 22:23, 4 January 2022 (UTC)
In addition, small wikis need a heuristic system to recognize repeated vandalism, since there are few vandals but the same vandal can recur repeatedly with similar vandalism. We need an artificial intelligence that recognizes vandalism based on recent past vandalism on that wiki; most likely, for small wikis, it would only block the vandals and nothing else. A famous example to solve is the one that keeps happening on the Wikipedia in the Neapolitan language and occasionally on the Lombard one. The vandal is very harmful to the small Neapolitan (nap) Wikipedia. A heuristic system might help. A similar problem could happen to any small wiki. --Gat lombard (talk) 22:30, 4 January 2022 (UTC)
I agree with Gat lombard. In my opinion, the IP based approach is better for small wikis. --Starladin (talk) 10:56, 5 January 2022 (UTC)
Actually it's much easier to change IP than to erase cookies. Of course it depends on the settings, but at least in Finland we use mobile networks a lot, and every time we restart the connection the IP changes automatically. Stryn (talk) 11:31, 5 January 2022 (UTC)
Suggestion
"The path is to create a new identity for unregistered editors based on a cookie placed in their browser. In this approach there is an auto-generated username which their edits and actions are attributed to. For example, User:192.168.1.2 might be given the username: User:Anon3406.
In this approach, the user’s session will persist as long as they have the cookie, even when they change IP addresses."
OK, notwithstanding the fact that from where I sit there is no good reason for this change whatsoever (and never will be, owing to so-called legal and ethical reasons that consistently make any attempt to understand WMF, T&S, LEGAL, etc. proclamations useless from the start), would it at least be possible to create a session-based identity in a way that allows for some semblance of tracking? For example, a generated username like "Anon01.05.21.12.59.xxx" could be used to track edits by showing the date and time when the user's ID was generated, with the xxx part filled in with the order in which it was created at that time (for example, 001 for the first user for that date and time, 015 for the fifteenth, etc.). In this way we can at least attempt to keep track of who's editing without showing the actual IP address, while still allowing the community to keep tabs on unregistered benevolent/malevolent editors using a basic date/time setup. TomStar81 (talk) 08:07, 5 January 2022 (UTC)
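The naming scheme suggested above could be sketched roughly as follows. This is only an illustration: the class `TimestampNamer` is hypothetical, and reading "01.05.21.12.59" as month.day.year.hour.minute is an assumption, since the comment does not spell out the field order.

```python
from collections import Counter
from datetime import datetime, timezone


class TimestampNamer:
    """Hypothetical generator for names like Anon01.05.21.12.59.001."""

    def __init__(self):
        # Number of identities already created in each minute.
        self._per_minute = Counter()

    def new_name(self, created_at):
        # Assumed field order: month.day.year.hour.minute.
        stamp = created_at.strftime("%m.%d.%y.%H.%M")
        self._per_minute[stamp] += 1
        return f"Anon{stamp}.{self._per_minute[stamp]:03d}"


namer = TimestampNamer()
moment = datetime(2021, 1, 5, 12, 59, tzinfo=timezone.utc)
first = namer.new_name(moment)   # "Anon01.05.21.12.59.001"
second = namer.new_name(moment)  # "Anon01.05.21.12.59.002"
```

Note that a name built this way publishes the creation time of the session, and the three-digit counter would grow to four digits if more than 999 identities were created within one minute.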
Support for IP based identity
As an admin in German-language Wikipedia, of the two paths described here (IP based identity vs. session-based identity) I clearly prefer the IP based approach. It's just too easy to use a browser's privacy mode or to clear the cookies (I'm doing it myself all the time); changing your IP address at least requires a bit more effort, and we have already a policy against using open proxies in place. I agree with Beland that the session-based identity approach could probably make communication with well-meaning unregistered editors easier, but it just doesn't seem robust enough. Gestumblindi (talk) 08:28, 5 January 2022 (UTC)
- As an admin in German-language Wiktionary I also prefer the IP based approach for the same reasons. --Udo T. (talk) 14:04, 5 January 2022 (UTC)
- The way I understand the proposal, blocking by IP remains possible: if you get IP-blocked and clear your cookies, you'll get a new "user name" based on your session but still cannot edit with it, because your blocked IP is still the same. Session-based blocking could be additional, but I don't see many advantages in it, for the reasons you mentioned. hgzh 15:02, 5 January 2022 (UTC)
What's going to happen at switchover time?
Hello, I'm an admin on the English Wikipedia. We have countless discussion pages used to track vandals by their IP addresses. Suppose there's an ongoing discussion about the notorious 127.0.0.1 vandal. This new system is switched on; suddenly it's illegal (for lack of a better word, but you get me) to edit that discussion because IP addresses are private. That raises questions:
- Are we going to have to search and replace every instance of 127.0.0.1 with TemporaryName123 and revision-delete the entire page history up to that point?
- What about all the archives? A "sock puppet" log may be dormant for a long time until a vandal pops up again, and have a bunch of information about IP addresses that were used before the new system came in - what to do about those?
- Will the WMF have a cut-off date for how far back we'll have to manually edit IPs out of view?
- Will there be a tool that administrators can use to turn IP addresses into their masked versions, to facilitate discussions that were IP-based? Edit: I think this is absolutely essential to avoid huge confusion
I don't expect you to have answers to all these right away, and I support this change, but boy. This is going to be hard. — Scott • talk 18:55, 4 January 2022 (UTC)
- Hi Scott, I don't have a full answer for you, but we will not implement this retroactively, that would just create too much work and trouble for everyone. The already published IPs stay. Johan (WMF) (talk) 19:07, 4 January 2022 (UTC)
- It does, however, leave open the discussion about how particular ranges will get discussed and referred to. People who have the new right will see the same wave of vandalism (or what have you) and that it's TempName5 and similar. When they discuss it, are they going to have to just use the new naming - how to make people who haven't made the jump (or have, but haven't associated it to the new name-set) understand it's the same problematic editor? This seems to have a major ongoing risk of tying IPs to masked IPs, or hindering the process. Nosebagbear (talk) 21:37, 4 January 2022 (UTC)
- @Nosebagbear: That's exactly what I meant with my last question. For continuity we must have a tool that will "hash" IPs into their new, masked form. — Scott • talk 15:19, 5 January 2022 (UTC)
- Hello @Johan (WMF):, considering that sysops have the privilege to see the IPs that will soon be "private", are you going to implement a requirement for sysops and upwards, such as that they must be identifiable and not anonymous, at least to WMF staff? -- Blackcat 23:49, 4 January 2022 (UTC)
- That's not needed even for CUs - I believe only BOT (public-known) and MCDC (WMF-known) members fall into this camp currently. Nosebagbear (talk) 10:52, 5 January 2022 (UTC)
- @Blackcat the Wikimedia Foundation will not require identification of admins for this. Find more details about the new user right needed to view IP information here. –– STei (WMF) (talk) 11:41, 6 January 2022 (UTC)
- I think merely seeing the IPs should not require identification on the noticeboard, as this is time-consuming and not the best way, because IPs are sometimes not that powerful in determining someone's privacy status. Personally, I don't think that admins and IP viewers must be 18 years old, be experienced (we don't ban minors from being admins, provided they work in a mature way and do not act like a child), or live outside countries that ban Wikipedia. Thingofme (talk) 15:33, 5 January 2022 (UTC)
Global groups
Hello, my little contribution here is a mix of the questions from @MusikAnimal and Camouflaged Mirage. The implementation is still unclear to me. I will surely repeat what has already been said, but it seems essential to me. The interwiki patrol must be recognized by a GR to provide an acceptable "escalation" to the S or GS. That is to say, an SWMT patroller should not give more work to other user groups (S, GS) if he/she does not know the damage done by an IP, or even an IP range, on all wikis. Cordially, and happy new year 2022. —Eihel (talk) 21:15, 4 January 2022 (UTC)
- Kindly, can you write the 'S', 'GS', 'SWMT' in full so we are on the same page? –– STei (WMF) (talk) 12:12, 6 January 2022 (UTC)
- Hi @STei (WMF):, I'm pretty sure that they are: GR = Global Rollbacker, GS = Global Sysop, S = Sysop, SWMT = Small-Wiki Monitoring Team Nosebagbear (talk) 13:04, 6 January 2022 (UTC)
- Courtesy links: Stewards, Global rollback, Global sysops (the global groups) and Small Wiki Monitoring Team. —MarcoAurelio (talk) 13:05, 6 January 2022 (UTC)
Prefer IP
I am leaning towards the IP-based identities, even if encrypted, as cookies seem more complicated to deal with, and it is very bothersome to keep closing their annoying pop-ups (very standard in Europe). I should mention that I like that, to this day, one can use Wikipedia without cookies unless one wants to log in to edit under a username. --Mahmudmasri (talk) 01:36, 6 January 2022 (UTC)
- I also support the hashed IP approach. Wayne (talk) 09:52, 6 January 2022 (UTC)
I prefer IP
I prefer IP because a vandal might clear cookies to continue to vandalize. --Simonk (talk) 14:33, 6 January 2022 (UTC)
My own idea
After reading this page and thinking carefully about the IP editing problem, with masked IPs and recognition, some questions need answering:
- 1st: How do we "report" IPs at en:WP:AIV, where we currently report the full, unmasked IP address? If it is hidden from everyone, can the reports and the contributions pages on the wiki still be read by anyone else?
- 2nd: If we go with IP-based identification, we should make sure that IP block logs remain usable, shown as follows:
- All users, new accounts: Only the masked IP address. In this approach, only masked IP addresses are shown in the block log.
- Extended confirmed users: Can see the first part of the IP, with the last part hidden and replaced by part of the anon identity. For example, 115.68.37.254 turns into 115.68..h4jis where anons edit, and for new users turns into Anon-5hsjeh4jis. I think this should be a pattern hash, where two IPs in a small range map to nearly identical names, e.g. 223.0.1.15 --> Anon-bx7t7uv6h2 and 223.0.1.94 --> Anon-bx7t7uvmt5.
- Administrators and a new user right called "IP viewer": See the complete IP in the contribs and block logs.
- 3rd: If we allow the session-based approach, it would be a mess for patrollers and for extended confirmed users. So, in my opinion, it would not be feasible.
- 4th: If we allow a mix of session-based and IP-based identity, the username would be Anon-(10maskIP)-(15maskSE), in which 10 and 15 are the numbers of characters in each hash. In this case, everyone can "view" the contributions from the same IP (by requesting Anon-(10maskIP) and viewing the contributions). But leaving the contributions page as it is would be a "bad" idea, because we would not know whether the same IP address is being used.
- 5th: Account names in the format used for masked IP addresses must not be registrable. For this change, we would have to find accounts with the prefix Anon-...(10char) and rename them. Hopefully there are no accounts with such names, as they would be in violation of the username policy.
If you would like to discuss this idea, please leave your comments on my five points. Thingofme (talk) 15:28, 5 January 2022 (UTC)
- I am strongly opposed to any partial masking of IP addresses. If people cannot be trusted to see the entire IP address, we should not show them any of it. A semi-trusted IP Geolocation (what country/province the ISP is located in) could be interesting if the WMF can implement it. 力 (talk) 01:47, 7 January 2022 (UTC)
- I think partial masking is NOT OK, as it gives no privacy, and besides we can't implement it. So I think we should hash the IP in an IP-based approach: Anon-(10-20randomletters), and only checkusers can check the IP address used by an unregistered user. The IPs would be hashed with an algorithm based on prime numbers (RSA/SHA256), with some consequences: an admin can only block a single hashed-IP user, but checkusers can block an entire range and check range contributions. The rangeblock log would have to be a private log, so it's hard to handle rangeblocks. Thingofme (talk) 07:53, 7 January 2022 (UTC)
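For readers trying to picture the "pattern hash" from the second point of the original proposal, here is a rough Python sketch under several assumptions: IPv4 only, a /24 as the "small range", a server-side secret (`SECRET_KEY` is a placeholder), and a name made of 7 characters derived from the range plus 3 derived from the full address. None of this comes from the WMF proposal itself.

```python
import hashlib
import hmac
import ipaddress

# Placeholder secret; a real deployment would keep this private on the server.
SECRET_KEY = b"hypothetical-server-side-secret"


def _digest(label):
    # Keyed hash, so names cannot be recomputed without the secret.
    return hmac.new(SECRET_KEY, label.encode("ascii"), hashlib.sha256).hexdigest()


def pattern_mask(ip, prefix_len=24):
    """Return a name whose first 7 hash characters are shared within a range."""
    addr = ipaddress.ip_address(ip)
    network = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
    range_part = _digest(str(network))[:7]  # same for every IP in the range
    host_part = _digest(str(addr))[:3]      # depends on the full address
    return "Anon-" + range_part + host_part


a = pattern_mask("223.0.1.15")
b = pattern_mask("223.0.1.94")
# a and b share the prefix "Anon-" plus the same 7 range characters.
```

The three-character host part is far too short to be unique within a range and is kept only to mirror the example names. The shared prefix also reveals that two identities sit in the same range, which is exactly the kind of partial disclosure objected to in the replies above.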
Very confused
Hi. There are a lot of words on this talk page and on the subject-space page so perhaps these questions are already addressed.
If you're simply hashing IP addresses ("User:192.168.1.2 may appear as User:ca1f46"), isn't this reversible/decipherable? And it seems like the entire trust model is based on hundreds, maybe thousands of users, continuing to have access to the IP address anyway, via some kind of user preference checkbox? What is being improved here, what's the actual benefit?
For session-based identity, the page notes "vandals in privacy mode or who delete their cookies would get a new identity without changing their IP" and then this massive abuse vector doesn't seem to be addressed at all. How will you prevent someone from maliciously editing via dozens of sessions?
Regarding the entire implementation, why not just auto-create accounts for unregistered users? We already have that flow easily established (both user registration and logged-in user sessions) and it wouldn't require all this other work.
Thanks in advance for any guidance you can provide. --MZMcBride (talk) 17:42, 5 January 2022 (UTC)
- It would be technically possible to create a confidential database table containing a list of IP addresses and their "hash". This could be as simple as:
- 127.0.0.1 is "0"
- 192.168.0.1 is "1"
- 198.51.100.0 is "2"
- 198.51.100.4 is "3"
- 127.2.3.10 is "4"
- There would be no way to decipher "4" to "127.2.3.10" just from knowledge about other addresses. Classical hashing/encryption algorithms, on the other hand, may not be simply usable due to known-plaintext attacks. ToBeFree (talk) 18:05, 5 January 2022 (UTC)
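The confidential lookup table described in this reply can be sketched in a few lines of Python. The class name `MaskTable` and the in-memory dictionary are assumptions for illustration; an actual deployment would need a private, persistent database table.

```python
import itertools


class MaskTable:
    """Hypothetical confidential table mapping each IP to a sequential ID."""

    def __init__(self):
        self._ids = {}                  # ip -> assigned identifier
        self._next = itertools.count()  # 0, 1, 2, ...

    def mask(self, ip):
        # Assign the next free number the first time an IP is seen.
        if ip not in self._ids:
            self._ids[ip] = str(next(self._next))
        return self._ids[ip]


table = MaskTable()
for ip in ["127.0.0.1", "192.168.0.1", "198.51.100.0", "198.51.100.4", "127.2.3.10"]:
    table.mask(ip)

print(table.mask("127.2.3.10"))  # prints "4"
```

Because the identifier depends only on arrival order, knowing other address/identifier pairs says nothing about which address is behind "4"; the trade-off is that the table itself must be stored and protected.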
- This does sound like auto-creation. I'd like to see this whole initiative renamed to something like Nymity: accounts for readers --
- Readers are automatically assigned accounts (essentially what the session model gives you).
- This lets them set and preserve preferences
- If they want this to persist across machines / browsers, they can convert this to a user/pass
- On conversion, their old prefs carry over, and any old edits can be reattributed
- This gives us more visibility into usage patterns (as we tune the sites)
- Edit-analysis tools, from quality-assessment to vandal fighting, now have extra data: cookie in addition to IP + fingerprint
- Readers can see their nym, and choose to generate a new nym if they want (refreshing the cookie)
- Of course to customize and choose your own name, you are still welcome to make a user/pass account
- Update common edit-moderation tools to show nym edits clustered by {IP range, nym}
- I think combining the above (which would be a tooling + user-experience upgrade) with new NDA-style restrictions on what editors say to whom, is a very counterproductive idea. Don't add a new NDA or clickwrap honor pledge. It's not necessary or helpful for the core privacy concern; but it will cause headaches, confusion, and grief to our committed and often very-literal editors. Simply implementing the above will reduce by almost 100% the # of readers whose IP information is exposed to other casual readers, search-engine spiders, &c. –SJ talk 17:53, 7 January 2022 (UTC)
Idea
Hello. I would suggest that the masked identity be permanent for each anonymous user. AlPaD (talk) 20:29, 5 January 2022 (UTC)
- Maybe I'm missing something here. If any unregistered user wanted that, couldn't they just get a permanent masked identity – and even one of their own choice – by registering some user name? ◅ SebastianHelm (talk) 07:18, 6 January 2022 (UTC)
- Hello, yes, it is not needed. AlPaD (talk) 20:59, 7 January 2022 (UTC)
My two cents
I'm just going to dive in here. Maybe it's the wrong place - it's hard to tell.
As an en-wiki admin (but not a check-user) I couldn't give a monkey's cuss about the actual IP address I see. Encrypt it for all I care; it matters not one jot to me. All I need to worry about on a daily basis is seeing the edits made on the full IP range on which one individual is liable to be editing on, and to do the occasional geolocate when deciding if they're avoiding a previous block (also based on other editing characteristics).
What I currently absolutely hate is the inability to see IPv6 user contributions across the whole range (usually /64) by default. There is also no button or tool to let me display this - I have to insert "/64" manually, which is so absolutely frustrating. I believe /64 contributions should be shown together by default, even if anonymised, especially as it's totally impossible to add /64 to a url on my mobile phone because it doesn't have a 'Home'/'End' function. So I simply have to let IPv6 vandals get on with it until such time as I'm back on my desktop or laptop.
By default, I need to see all contributions made across an IP address range relevant to one individual. I really don't need to know genuine IP address details. I need a system to recommend/look up the most effective rangeblock for that person which has the least collateral damage. And I need some way to communicate with past and future addresses on that single user IP address. For example, if I warn or block a /64 range of IPv6 addresses, I'd like to post - in one go - a warning to all past addresses on that range, as well as ideally having a system that identifies new addresses on that range and automatically repeats my messaging over the next 24-48 hours. What I don't need to know for my routine admin work is their actual IP address. — The preceding unsigned comment was added by Nick Moyes (talk)
- This would be a really interesting set of functionality for the new tools - and a "default to the /64" position for ipv6 would be excellent, and indeed becomes more critical with masking. Nosebagbear (talk) 10:49, 5 January 2022 (UTC)
- Agree. I frequently block persistent disruptive users whose edits are on a /64 range. No need to see the actual addresses, just need to know that they are related. Bagumba (talk) 04:15, 9 January 2022 (UTC)
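As an illustration of the "default to the /64" idea in this thread, the following Python sketch groups edits by range using the standard `ipaddress` module. The function names and the sample (address, page) pairs are made up for the example; how the real tools would expose this is not specified on this page.

```python
import ipaddress
from collections import defaultdict


def range_key(ip):
    """Map an IPv6 address to its /64 network; leave IPv4 addresses as they are."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 6:
        return str(ipaddress.ip_network(f"{addr}/64", strict=False))
    return str(addr)


def group_by_range(edits):
    """Group (ip, page_title) pairs so one /64 shows up as a single contributor."""
    groups = defaultdict(list)
    for ip, title in edits:
        groups[range_key(ip)].append(title)
    return dict(groups)


edits = [
    ("2001:db8:1:2::1", "Page A"),
    ("2001:db8:1:2:ffff::9", "Page B"),
    ("192.0.2.5", "Page C"),
]
grouped = group_by_range(edits)
# Both IPv6 edits land under "2001:db8:1:2::/64".
```

A masked identifier could be derived from the range key instead of the full address, so that related addresses stay visibly related without the address itself being shown.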
What does WMF need us to read?
I received a notification at w:User talk:SebastianHelm: How we will see unregistered users from WMF that contained a request for feedback, about which I had a question. Since I have not received a reply, I'm reposting it here below.
Thanks for the notice, Johan. For the suggestions for which you would like feedback, you're linking to a page with over 9000 words. That's a reading time of over an hour, and the TOC doesn't list a basic explanation of the “two suggested ways”. Do we have to read it all, or can you narrow down what we have to read (at least the core text; of course it can use terms that are explained elsewhere and are linked to the appropriate places; that's the beauty of hypertexts) in order to be able to give you the feedback you're asking for? ◅ Sebastian 19:02, 4 January 2022 (UTC)
I might add that over at the English Wikipedia, we have nice recommendations such as w:Wikipedia:Writing better articles that would help for such a meta-article, too. ◅ SebastianHelm (talk) 07:48, 6 January 2022 (UTC)
- Yes, it would help to just get the current proposals to comment on and to provide a more concise version of the rest (I'll make a start: the WMF legal statement can be cut to "IPs must be masked for legal reasons but we can't tell you what these reasons are" without losing any real information). I am in favour of the still-mentioned proposal 3 (disable IP editing as on the Portuguese Wikipedia), which would free up developer time for more useful issues. Kusma (talk) 19:57, 6 January 2022 (UTC)
- @Kusma The legal statement is not just that, but also "If we told you these reasons, we would basically tell everyone about a plan to cause trouble to the projects (Wikimedia Foundation) and the users." I.e., if they disclosed the reasons, the Wikimedia Foundation could likely be sued for those reasons, even if the reasons are false. Techie3 (talk) 02:39, 11 January 2022 (UTC)
Is IP-based approach truly privacy-preserving?
If the same IP always resolves to the same masked username, then once one person reveals the association (let's assume for good reason), it'll be permanently revealed. Not sure how masking will be implemented (same hash for all projects, or different hash per project) but this can have broad impacts.
It would be great if the details of the approach are shared here on Meta (as opposed to just on Phab/code docs). Huji (talk) 01:31, 5 January 2022 (UTC)
- The premier example would be the IP of a company gateway not serving too many users: any colleague would see that somebody else in the office has made those edits, and knowing views, interests and language quirks, it might not be too difficult to guess who. The colleague may also have an account and once forget to log in, then signing their IP comment by their user name. Now everybody who knows them can guess where that masked IP is coming from, compromising the privacy of the other user. –LPfi (talk) 13:00, 13 January 2022 (UTC)
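The "same hash for all projects, or different hash per project" question raised at the top of this section can be made concrete with a small Python sketch. It assumes a keyed hash (HMAC-SHA256) over the project name and the IP; the function `masked_name`, the "Anon-" prefix and the 10-character length are illustrative choices, not details taken from the proposal.

```python
import hashlib
import hmac


def masked_name(ip, project, secret):
    """Derive a per-project masked name; `secret` is a server-side key (bytes)."""
    message = f"{project}|{ip}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return "Anon-" + digest[:10]


secret = b"hypothetical-secret"
on_enwiki = masked_name("192.0.2.5", "enwiki", secret)
on_fawiki = masked_name("192.0.2.5", "fawiki", secret)
# The same IP gets a stable name within one project but unrelated names across projects.
```

Including the project in the hashed message limits a revealed association to one wiki, at the cost of making cross-wiki patrolling harder. It does not address the in-office scenario described above, where the association leaks within a single project.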
Reserving the naming convention of masked usernames
We currently don't allow someone to create an account named User:8.8.8.8 for good reason: that is an IP address.
I think, similarly, we should not allow users to create accounts with names that would look similar to the masked usernames used for IPs (regardless of whether the IP-based or the session-based approach is used). Perhaps masked usernames should look like prefix-[a-z0-9]{8,128} and, once the feature is turned on, no user would be allowed to create an account that matches that pattern. 01:35, 5 January 2022 (UTC)
- I suppose that's the intent. It might be useful to block registration of such usernames already when the prefix is suggested. –LPfi (talk) 13:03, 13 January 2022 (UTC)
- Yes, if we go this way, we’ll first be figuring out what the naming convention could be (shouldn’t have existing usernames either) and we’ll also not be allowing those names to be registered in the future. –– STei (WMF) (talk) 14:54, 13 January 2022 (UTC)
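A registration check for the pattern suggested at the top of this section might look like the following Python sketch. The prefix "Anon" is a placeholder, since the actual naming convention had not been decided at the time of this discussion.

```python
import re

# Placeholder prefix; the real naming convention was still undecided.
MASKED_PREFIX = "Anon"

# prefix-[a-z0-9]{8,128}, as suggested in the comment above.
RESERVED = re.compile(rf"{re.escape(MASKED_PREFIX)}-[a-z0-9]{{8,128}}")


def is_reserved(username):
    """Return True if the name matches the masked-username pattern exactly."""
    return RESERVED.fullmatch(username) is not None


print(is_reserved("Anon-bx7t7uv6h2"))  # True
print(is_reserved("Alice"))            # False
```

The same check could also be run against existing accounts to find names that would need renaming before the prefix is reserved.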
Anonymous editing restriction experience at fawiki
The Portuguese Wikipedia experience is mentioned here, but the (ongoing) experience at Persian Wikipedia (phab:T292781 and Dashboard) is not. I understand the latter is an ongoing experience, but it has some advantages over the ptwiki experience (namely, restriction is only in the main namespace, and several metrics are being monitored), and I think it is useful for others to be aware of it. Given that I am not impartial in this, I refrain from adding it myself, but suggest that someone else does so. Huji (talk) 01:38, 5 January 2022 (UTC)
- The phab ticket seems to say that 2/3 of IP edits were not reverted, and those were lost (no significant increase in edits by registered users), 18 % of all non-bot edits. Those are the raw statistics; I cannot judge the metrics, and long-term effects were not discussed. I assume that most large contributions are made by registered users, so many of the lost edits would probably have been corrections of spelling mistakes and the like. Whether this is a significant way of losing new users (who never start editing and never register) cannot be seen from those figures. –LPfi (talk) 13:17, 13 January 2022 (UTC)
Please don't use the cookie approach
Personally, I generally use session cookies, which are deleted after closing the browser, to avoid tracking while surfing the internet. Many others do too. And I know of several people who delete their cookies manually on a regular basis. In all of these cases it would be harder to identify recurring IP vandalism and attribute it to earlier cases. --Ghilt (talk) 13:52, 8 January 2022 (UTC)
- Yes, I'm also one of these habitual "cookie-deleters". I think that the cookie approach wouldn't work well, see also above. By the way, Johan, I suppose you don't want a "vote" on these two approaches, and I understand this, but I think, as this is the main choice we seem to have here, it would be easier to see what people prefer if we had a dedicated section/page where people could state which approach they prefer and why, instead of this general feedback page. Gestumblindi (talk) 15:24, 8 January 2022 (UTC)
- Yes, we should create an RFC in which we vote on which approach to implement. We would count the votes, assess consensus and determine the result. The talk page only covers what "volunteers" and "users" think. Thingofme (talk) 15:34, 8 January 2022 (UTC)
- Yeah, some browsers have that "auto delete cookies when closing" option. --魔琴 (talk) 16:52, 8 January 2022 (UTC)
- Including Firefox, not just some niche ones. And it also has the anonymous window feature, and a "clear cookies now" button in the configuration menu. Do we know how widely those are used, generally or when visiting WMF sites? –LPfi (talk) 14:00, 13 January 2022 (UTC)
- So cookies are terrible: they also raise data privacy concerns and can be deleted very often, so we can't block by cookie. Thingofme (talk) 04:21, 14 January 2022 (UTC)
Support
The session-based system does seem better, and would make it easier to communicate with anonymous editors. I'm an admin on English Wikipedia, and my main interaction with IP editors is reverting and warning them against vandalism. In several cases recently I haven't even bothered posting a warning, since it seems unlikely the right person would receive it. In one case I was trying to have a conversation about some proposed change, and I was talking to several different IP addresses, and it was unclear that it was actually the same person, and I had to keep asking them about that.
I do know some people who have corrections for certain articles, and instead of making the change themselves ask me to do it. Maybe not having their IP address exposed will help lower the barrier to participation? Or maybe there are just other factors which are more important?
In the long run, I'd lean toward banning IP editing completely on English Wikipedia. Yes, it might discourage a little casual editing, but the results from the Portuguese Wikipedia seem to show that effect is minimal. Maybe the vast majority of legitimate editors are simply motivated to edit Wikipedia because it's highly visible, enough to sign up for an account if they don't already have one. It would cut out a lot of vandalism and make communication even more reliable. The idea that IP-based editing provides more privacy is weird, considering that exposing one's IP address seems considerably less private, though session-based anonymous editing would improve the privacy situation a lot. I do feel like having less vandalism to deal with frees up legitimate editors to make more substantive contributions, often with a small number of larger or more thoughtful edits, in contrast to the larger number of tiny edits which casual editors seem to make. English Wikipedia is currently suffering huge fact-checking and neutralizing backlogs, which do need those deep, time-consuming edits to fix. -- Beland (talk) 01:42, 5 January 2022 (UTC)
- @Beland: A deliberate vandal will either browse in incognito mode or delete their cookies regularly, which would completely negate any benefit you perceive from the session-based system as currently proposed. I oppose it because it hasn't been well thought out. MAC addresses would work better than cookies. Anachronist (talk) 02:10, 5 January 2022 (UTC)
- @Anachronist: I don't think the session-based system is going to substantially reduce vandalism; I think it just makes it easier to communicate with no-account editors, both vandals and legitimate. Perhaps it will reduce the number of IP-based blocks admins would make (because the first step would be to block the session pseudo-account) though IP-based blocks would still be the way to deal with the sort of vandal you're thinking of. Based on the comments from the Portuguese Wikipedia, it does seem like banning no-account editors would significantly reduce vandalism. But this masking proposal is for privacy, not security; and it does seem to me the team has thought through carefully how to avoid reducing the current level of protection against vandalism, if that's what you're worried about.
- With regard to MAC addresses, aren't they only available on the local Ethernet or other data link? I don't think web servers have access to MAC addresses because IPv4 and IPv6 don't transmit them, and JavaScript engines in web browsers don't expose that information. -- Beland (talk) 07:47, 5 January 2022 (UTC)
- Whether the cookie based sessions will allow communication depends heavily on how many of those editors clear the cookies regularly, actively or through browser features. Do we have statistics on that? –LPfi (talk) 13:24, 13 January 2022 (UTC)
- Yes, I think this is something WMF could collect data on. Izno (talk) 04:45, 15 January 2022 (UTC)
Nickname considerations
When displayed, generated user names should contain one of #, @ or / to avoid collisions with registered user names.
I do expect a lengthy code.
- When displayed, code shall be grouped by three or four characters: anon#5F28-B73C-D218-6AE3
- This individual might be addressed in conversations as @5F28, this is an interesting aspect.
The word “anonymous” etc. should be displayed in project or even user language.
- The rendered nick shall be generated from the internal code.
- Internally it might be: 5F28B73CD2186AE3 or whatever.
- Visible in English perhaps as: anonymous#5F28-B73C-D218-6AE3
- Greek readers should see ανωνυμιών#5F28-B73C-D218-6AE3 and in Russian анонимный#5F28-B73C-D218-6AE3 etc.
- Digits are understood in almost every language and script. The keyword in the local language explains the meaning of this strange thing. Not all languages have a concept of letters, therefore Latin letters should remain. From URLs users might have learnt a few Latin letters, better not transferred into Alpha Beta Gamma Delta.
- Ponder on right-to-left scripts. Should work.
- There is a need to resolve generated nicks backward into internal code.
- If there is a (e.g.) # inside, followed by the supposed number of hex digits and perhaps some hyphens, or not, then drop the leading word and use the hex sequence only.
Greetings --PerfektesChaos (talk) 09:01, 6 January 2022 (UTC)
- Language localization is OK, provided translations are supplied for the "Anon" prefix, but we would have to rename accounts that already have the prefix Anon. Also, IP addresses can be recorded as normal, and IPs can only be checked by checkusers, even for unregistered users. The editing banner would say "Not logged in, create an account to customize your username and have more benefits." Thingofme (talk) 13:09, 6 January 2022 (UTC)
- The pound sign (#) cannot be used for obvious reasons. @ seems interesting. Izno (talk) 04:52, 15 January 2022 (UTC)
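The rendering and back-resolution steps described in this thread can be sketched as follows. This is only an illustration under stated assumptions: the function names `render_nick` and `parse_nick`, the 16-digit internal code and the default separator are hypothetical, not anything the project has specified. The separator is a parameter because # was objected to above.

```python
import re

# Hypothetical helpers: names, code length and separator are assumptions.
_NICK_RE = re.compile(r"[#@/]([0-9A-Fa-f-]+)$")


def render_nick(internal_code: str, keyword: str = "anonymous",
                sep: str = "#", group: int = 4) -> str:
    """Render an internal hex code as keyword + separator + hyphenated groups."""
    code = internal_code.upper()
    groups = [code[i:i + group] for i in range(0, len(code), group)]
    return f"{keyword}{sep}" + "-".join(groups)


def parse_nick(nick: str, expected_len: int = 16):
    """Drop the localized leading word and hyphens; return the hex code or None."""
    match = _NICK_RE.search(nick)
    if match is None:
        return None
    code = match.group(1).replace("-", "").upper()
    return code if len(code) == expected_len else None


print(render_nick("5F28B73CD2186AE3"))
print(parse_nick("ανωνυμιών#5F28-B73C-D218-6AE3"))
```

Because only the part after the separator is parsed, the localized keyword can differ per reader without affecting which internal code the nick resolves to.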
Signature
If an unregistered user contributes to a talk page, his IP is not only recorded in the edit history, but also in the signature within the page body. What will happen to older discussion statements with IPs in their signature? Will somebody build a bot, which will replace all old signatures (very, very many) with the new session-based identity?
But even if so, you can still see the old IP signatures if you look at older revisions in the history. To hide all editors' IPs, it would be necessary to delete all older revisions (Revision deletion). But this would not conform with our licence policy. --Indoor-Fanatiker (talk) 03:45, 7 January 2022 (UTC)
- @Indoor-Fanatiker, for now we will not be applying IP Masking approaches retrospectively. –– STei (WMF) (talk) 10:33, 18 January 2022 (UTC)
Expand the constituency
I have been notified by Johan of this proposal and asked to comment because I am an admin on the German Wikipedia. While I appreciate the effort of reaching out, I have a hard time providing comments, because:
- The proposal reads more like an essay than a proposal, it lacks a succinct summary and clear questions
- The whole thing feels like a foregone conclusion (Whatever the comments, the foundation wants to turn this into a tool for converting unregistered users into registered users, so...)
- Feedback mechanism is so unstructured that ignoring it or splitting it into a myriad sub-threads seems a given.
- The constituency for notification is both too narrow and too broad, see below.
On the last point, even though I am fairly active in de:WP, I almost exclusively deal with RfDs and DELREV as an admin. I don't do interventions against vandalism, and hardly block anyone. I believe that the notification should also go to non-Admins who are actively contributing to fighting vandalism. In de:WP, see de:Wikipedia:WikiProjekt Vandalismusbekämpfung/Ansprechpartner. Also, all past and current checkusers should be notified, not all of which are admins. See de:Wikipedia:Checkuser#Checkuser-Berechtigte. I have written a notice about this proposal and feedback process in the German version of the Signpost (Kurier). --Minderbinder (talk) 08:30, 7 January 2022 (UTC)
- Thanks! This is absolutely not a conversation where we're just looking for feedback from admins, nor do we think all admins are interested. But we sent out a reminder to all admins, because they're a group who have a high likelihood of being affected – both because we wanted them to be aware, and because picking one group who have a high chance of being interested also helps spreading awareness to others in the communities without pinging every content contributor. We're equally interested in all feedback here from anyone who feels affected in any way. /Johan (WMF) (talk) 13:44, 7 January 2022 (UTC)
- Hej Johan, thanks for the answer. Your approach of spreading awareness to others seems to be working, as following my summary there is a discussion on the German signpost talk page now. Check it out. --Minderbinder (talk) 15:36, 7 January 2022 (UTC)
- @Minderbinder, the project page of IP Masking does have more information (as an add-on to Johan's message), you can check it out and also share with other colleagues in your discussions if you want to. Your issues with the structure of the proposal and talk page have also been noted. Thanks for your feedback. –– STei (WMF) (talk) 10:28, 18 January 2022 (UTC)
auto-generated usernames should be distinct from all normal usernames
As an editor or reader I don't care about IP addresses as such but I do care about (1) being able to quickly distinguish edits from unregistered and registered editors and (2) being able to make a reasonable guess at which edits are from the same editor. As I understand the "session-based approach" both aspects would be jeopardized: (1), since autogenerated usernames might be indistinguishable from real ones and (2) because the session cookie may change more often than the IP. A drawback of both schemes is that the IP range (which typically is more stable than the IP) is hidden.
But I think one could combine full masking with maintaining uses (1) and (2) even better than possible currently by simply using three hashes (one for the IP range, one for the full IP and one for the cookie (or cookie+IP))? That would obscure all identifiable information, but maintain distinctions that are useful for seeing different edits as "belonging to each other" or communicating with the unregistered editor. --Qcomp (talk) 13:43, 7 January 2022 (UTC)
- Currently, MediaWiki prohibits usernames that look like IP addresses. That should also be implemented for whatever form the obfuscated identifiers take. AntiCompositeNumber (talk) 05:30, 8 January 2022 (UTC)
- Auto-generated usernames should have a fixed prefix pattern like Anon-..., so they are easily distinguished from normal usernames. IP masks are pseudo-random, so they look like gibberish and are easily told apart from users. Also, random-looking names are banned from creation because they violate the username policy. Thingofme (talk) 10:31, 8 January 2022 (UTC)
- @Thingofme, @AntiCompositeNumber, @Qcomp, there's some information here on what the auto-generated names will look like. Also +1 on distinguishing names. Any username generated will be distinct from the 'normal' ones. –– STei (WMF) (talk) 10:09, 18 January 2022 (UTC)
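Qcomp's three-hash idea can be sketched with a keyed hash (HMAC) from the Python standard library. Everything specific here is an assumption made for illustration: the function name `masked_identifiers`, the /24 and /64 prefix lengths used as "the range", the 12-character truncation, and the existence of a server-side secret. It is not the scheme the Foundation proposed.

```python
import hashlib
import hmac
import ipaddress


def _tag(secret: bytes, label: str, value: str, length: int = 12) -> str:
    """Keyed hash of a labelled value, truncated for display (length is an assumption)."""
    msg = f"{label}:{value}".encode("utf-8")
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()[:length]


def masked_identifiers(secret: bytes, ip: str, cookie_id: str,
                       v4_prefix: int = 24, v6_prefix: int = 64) -> dict:
    """Return three opaque tags: one for the IP range, one for the IP, one for cookie+IP."""
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us derive the enclosing network from a host address.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return {
        "range": _tag(secret, "range", str(network)),
        "ip": _tag(secret, "ip", str(addr)),
        "session": _tag(secret, "session", f"{cookie_id}|{addr}"),
    }


a = masked_identifiers(b"server-side-secret", "192.0.2.10", "cookie-A")
b = masked_identifiers(b"server-side-secret", "192.0.2.77", "cookie-A")
print(a["range"] == b["range"], a["ip"] == b["ip"])
```

Two edits from the same range share the "range" tag without exposing the addresses. Note that the IPv4 space is small enough to enumerate, so such tags only stay opaque as long as the secret key does.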
Cryptographic advice
At this point, the talk page has become too cumbersome for me to parse and make heads or tails out of. But if the Wikimedia Foundation does go with "hashing", feel free to reach out to me if you'd like any cryptographic advice. I used to do cryptography related work for Mozilla and Twitter, now at Dropbox, and would be happy to threat model it if that's the route that y'all go down. — Marumari (talk) 02:20, 9 January 2022 (UTC)
- @Marumari, thanks for offering. –– STei (WMF) (talk) 09:05, 18 January 2022 (UTC)
Concerns by an IP editor
I have been editing as an IP for a while (there are reasons for it), and I feel that this proposal may be for the worse. Because my IP changes over time, with the first approach it would be completely different each time it changed, and I couldn't be a "range", but a semi-random set of IPs, which would make it hard for non-admins to know all the IPs are me. Similarly, with the second approach, I clear my cookies very frequently, so that'd be worse. What can I do (apart from registering)? --67.183.136.85 03:01, 5 January 2022 (UTC)
- If you had an account, you could add your IPs to your user page – I don't know if the issue is about not registering or not logging in.
- There could be an interface for copying your session cookie and reusing it in the next session. Depending on how those cookies are stored and checked at the server side you could do it manually, but support could be included in the web interface, allowing any users to save and reuse the cookies.
- –LPfi (talk) 13:30, 13 January 2022 (UTC)
- Thank you for sharing your thoughts on this as an IP editor. With the encrypted IP method people won’t be able to see the range from which you might be coming; only those with the Admin or IPViewer role would see that. In the session approach, as User:LPfi mentioned there might be some ways to keep the same identity. Maintaining a list of all temporary user accounts that you’ve had in the past on your current temp user page would be cumbersome but one way to maintain identity. Browsers also support clearing only certain cookies and leaving others in place; that approach would make things easier for you. –– STei (WMF) (talk) 17:25, 19 January 2022 (UTC)
Clearly for IP based and new tool to identify when a registered user is disconnected.
Hi User:Johan (WMF), just to share as an admin that the option of cookies and so on seems to complicate the admin environment without resolving many things concerning bad social behavior on Wiki projects. As some have already explained, it will just change the rules of the game of trolls (not Game of Thrones ;-) without changing the situation. Maybe trolls will appreciate this change, which makes their occupation of disturbing projects a bit more exciting...
On the other hand, masking IP addresses is an excellent idea. It's simple, doesn't disturb the community's habits and gives the technical team a lot of time to think and develop new tools. For instance, with masked IPs, it would be very useful to develop a new tool that makes it possible to see when an IP is being used by a registered user who is not logged in. For a registered user, it's indeed a frequent practice to voluntarily log out before writing something unpleasant to another user, so as not to be identified. If they are identified even when they are not logged in, that could limit this practice, while forcing users to communicate in an identified way and therefore with more courtesy.
Best, Lionel Scheepmans ✉ Contact (Fr-N, En-3, Pt-3) 15:38, 5 January 2022 (UTC)
- Lionel Scheepmans, good to hear from you. Thanks for the feedback. ––STei (WMF) (talk) 17:12, 19 January 2022 (UTC)
Localization
Whatever you guys decide on is fine, just don't pet-peeve me by making those temp-accounts English. Create a Mediawiki page for localization. Thanks. Seb az86556 (talk) 17:06, 5 January 2022 (UTC)
- @Seb az86556, thank you, yes, we are considering this. –– STei (WMF) (talk) 17:04, 19 January 2022 (UTC)
Page by page revealing
Hello,
This may have got resolved in a discussion and I missed it, but last I recall the question of how to handle the WMF (more accurately, Legal) wanting to log cases of IPs being revealed (rather than them being revealed as default to those with that setting) and the issue that doing that one by one would be a massive pain and disruption to the workflow hadn't been resolved.
Several people had mooted the possibility of revealing a page at a time, and I think Johan said they'd consider it and ask Legal (apologies if I am incorrect about that final aspect). Did this get escalated and if so, what was the outcome? Nosebagbear (talk) 13:13, 6 January 2022 (UTC)
- @Nosebagbear, Legal has confirmed that page-by-page is good to go. –– STei (WMF) (talk) 16:59, 19 January 2022 (UTC)
- Tah muchly! Nosebagbear (talk) 17:05, 19 January 2022 (UTC)
Account age
The content page mentions that there will be restrictions based on "account age". This is a problematic metric. How many years old is my account? I recommend that you recommend a metric such as "at least 100 edits in 12 of the last 18 months". 力 (talk) 20:09, 6 January 2022 (UTC)
- @力 our most recent update does say "accounts over a certain age and with a minimum number of edits." We are even considering adding a community vetting for anti-vandalism fighters who want to opt in. So this user right would be handled like other user rights by the community. It'll require a minimum number of edits, days spent editing and vetting. –– STei (WMF) (talk) 16:57, 19 January 2022 (UTC)
Why choose one, if we can have both
Moving to a session-based identification for anonymous users seems in line with the 'assume good faith' philosophy, and could potentially expand the number and quality of the contributions from occasional users, perhaps even 'convert them' to logged-in users.
The downside of such a change would be that it would be harder to detect some of the most common types of vandalism.
Now, I understand that moving to a session-based identity for anonymous users will take some work on the platform. If we are already working on supporting cookie-based identities, why not keep some of the benefits that we have from the IP-based identification?
So basically, besides getting a handle (e.g. anon-1fe49afc7bc5), for which we can see the User contributions, we can also have a hashed-IP identifier for that user (which might be shared by different anonymous users), and direct access to any contributions done from that (still hidden) IP, be able to (temporarily) block anonymous contributions from said IP (even without seeing it!), or pass it forward to a checker for more information.
I understand there are also other ideas of how to make this and other frequently used tools possible under the new scheme.
In short, I agree with the session-based identification, but let's also work on improving the mechanisms available to mitigate vandalism.
MarianoC 21:57, 6 January 2022 (UTC)
- There's a discussion about this up-page: see § IP-based versus session-based masking: Why not both?. You might want to move your comment there. — Scott • talk 22:52, 6 January 2022 (UTC)
- Thanks! MarianoC 10:49, 7 January 2022 (UTC)
- MarianoC, Scott, I guess I will see you in the other thread. ––STei (WMF) (talk) 16:46, 19 January 2022 (UTC)
Proposal: Keep own IP address unmasked
This discussion page has been brought to my attention by a notice on German Wikipedia’s version of the Signpost.
I do not have the time nor am I in the mood for reading all the discussions on this page, so I do not know whether a remark like mine has already been put forward earlier; apologies if this is the case.
I would like to propose that logged-out users still be able to see their own IP addresses (unmasked) as well as retrieve them via the API (meta=userinfo). In terms of privacy this should be entirely unproblematic (since the users are only shown information about themselves, not about other users). Given that sysops, CheckUsers and others are able to see unmasked IP addresses (and potentially investigate information associated with them, such as geolocations), it would only be fair if (potential) logged-out contributors were able to check beforehand the information exposed about themselves to those users. (As an aside, this also applies to logged-in users, who can at the moment see their IP address before logging in, albeit in their case it would only be exposed to other users in case of a CheckUser investigation against them.)
To rule out any misunderstanding: When talking about “being able to see one’s own IP address” I mean somewhere in the UI, for example in Special:Contribs, not everywhere a masked version of the IP address would be shown instead to other users; not in signatures, for instance, which are obviously generated once an edit is made and not easily available for being altered on-the-fly by the UI.
I would expect this to be a piece of cake technically since the means of displaying an unmasked IP address in Special:Contribs are already there, they would just have to be made conditional based on whether the requested contributions are one’s own (as redirected by Special:MyContributions). The API query meta=userinfo would not even need to change at all since by design it only returns information about the calling user. --2A02:8108:50BF:C694:A1E3:B362:5009:4A0B 20:22, 8 January 2022 (UTC)
- Thanks for the suggestion. –– STei (WMF) (talk) 16:30, 19 January 2022 (UTC)
- The IP address is shown on the page w:Wikipedia:Get my IP address; or we can get it in many other ways, not just on Wikipedia. Thingofme (talk) 01:44, 9 January 2022 (UTC)
- Yes; as long as these ways of getting one’s own IP address are still there, there should be no problem. (As far as I can see, w:Wikipedia:Get my IP address is not a Special Page, and I don’t know whether there is an equivalent in every language version; for example, I don’t know of any in the German language Wikipedia, but I haven’t bothered looking for one, since de:Special:MyContributions aka de:Spezial:Meine Beiträge did the trick.) As for the API (meta=userinfo) it’s (probably) mostly a matter of avoiding breaking changes, in case there are any consumers out there relying on it returning an IP address (instead of a masked version thereof) as user name when queried without being logged in. --2A02:8108:50BF:C694:986:2570:11A4:4D82 12:05, 9 January 2022 (UTC)
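For readers unfamiliar with the API query mentioned in this thread, here is a minimal sketch of how a consumer might build a meta=userinfo request and read the caller's name from the reply. The helper names are hypothetical, and the sample response is hand-written in the shape the Action API is understood to return for a logged-out caller with formatversion=2; it is not captured output.

```python
import json
from urllib.parse import urlencode


def userinfo_url(api_endpoint: str) -> str:
    """Build the query URL for information about the calling user."""
    params = {
        "action": "query",
        "meta": "userinfo",
        "format": "json",
        "formatversion": "2",
    }
    return f"{api_endpoint}?{urlencode(params)}"


def describe_caller(response_text: str) -> str:
    """Summarize who the API says the caller is (field shape is an assumption)."""
    info = json.loads(response_text)["query"]["userinfo"]
    kind = "logged-out" if info.get("anon") else "logged-in"
    return f"{kind} user {info['name']}"


# Hand-written sample; 192.0.2.1 is a documentation-only address.
sample = '{"batchcomplete": true, "query": {"userinfo": {"id": 0, "name": "192.0.2.1", "anon": true}}}'
print(userinfo_url("https://de.wikipedia.org/w/api.php"))
print(describe_caller(sample))
```

A consumer written like this relies on the name field being the caller's IP address when logged out, which is the breaking-change concern raised above.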
Good when it comes to privacy protection, bad when it comes to LTA-users
In general, I do think that it is good to have more privacy protection for unregistered users. When it comes to privacy, this is definitely a good idea. However, couldn't that demotivate people from creating accounts? Also, smaller projects that struggle with constant attacks from LTA IP-offenders (e. g. Croatian Wikipedia), could take damage from that, at least in my personal opinion but I am not making claims here. In the past, the IP helped us sysops to identify banned users, who kept returning as IP-users. If the IP is no longer visible, it might be a problem to identify banned users who stopped creating accounts and decided to edit/troll as an IP (often in use of a VPN). --Koreanovsky (Ča–Kaj–Što?!) 13:23, 12 January 2022 (UTC)
- +1 -- Wutsje (talk) 21:28, 17 January 2022 (UTC)
- @Koreanovsky, @Wutsje, unregistered users will still not have access to some features like Watchlist and Preferences. Also sysops will have IP viewer rights. –– STei (WMF) (talk) 16:29, 19 January 2022 (UTC)
- I know, but I'm more worried about long term cross-wiki abuse. Wutsje (talk) 16:58, 19 January 2022 (UTC)
Both?
I wonder if it would be possible to use both ways to determine identity:
- cookie = true, ip = foo > Alice
- cookie = false, ip = foo > Alice (deleted cookies / incognito mode)
- cookie = true, ip = bar > Alice (other IP but same browser)
- cookie = false, ip = bar > Bob
That would make it even better than the current system. It will keep the identity of people under IPv6, but also many of the ones behind CG-NAT (in my country many ISPs use CG-NAT for at least half of their IPv4 ranges). Geraki TL 16:28, 5 January 2022 (UTC)
- @Geraki, there is no way for us to know that cookie=false, ip=foo is still Alice. The IP could’ve been assigned to another device. –– STei (WMF) (talk) 17:06, 19 January 2022 (UTC)
- @STei (WMF) Yes, but it is already the current (or future) IP-based identity path. There are 6 browsing devices in my home, I am already assigned one identity and if my kid edits, we already have the same identity. If I edit from another place with the same device, I will get another identity, and get back my older identity when I move back to home. So I already have two identities, and one of them is already shared with another person.
- Keeping track of the path of both the ip and devices will help people have a persistent identity (even if it is shared, which is not different from the current situation), and help with the workflow against vandals. There is no need to keep track of all the past IPs: only the last and current IP, so that multiple people will not get the same identity just by editing from an IP that was used a week before. Geraki TL 07:03, 20 January 2022 (UTC)
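Geraki's combined lookup can be sketched as a small resolver that prefers a known cookie, falls back to a known IP, and otherwise mints a new identity. The class name `IdentityResolver` and the `anon-N` labels are hypothetical. As a simplification of "only the last and current IP", this sketch remembers just the most recent IP per identity. It also inherits the caveat raised above: a request without a cookie from a remembered IP is attributed to the previous identity even if the address has since moved to another device.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass
class IdentityResolver:
    """Hypothetical resolver combining cookie-based and IP-based identity."""
    by_cookie: dict = field(default_factory=dict)
    by_ip: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def resolve(self, cookie: Optional[str], ip: str) -> str:
        if cookie is not None and cookie in self.by_cookie:
            identity = self.by_cookie[cookie]   # same browser, maybe a new IP
        elif ip in self.by_ip:
            identity = self.by_ip[ip]           # cookie cleared, same IP
        else:
            identity = f"anon-{next(self._ids)}"  # nothing matches: new identity
        if cookie is not None:
            self.by_cookie[cookie] = identity
        # Forget older IPs of this identity so a stale address is not reused.
        for old_ip, owner in list(self.by_ip.items()):
            if owner == identity and old_ip != ip:
                del self.by_ip[old_ip]
        self.by_ip[ip] = identity
        return identity


resolver = IdentityResolver()
alice = resolver.resolve("cookie-1", "foo")
print(resolver.resolve(None, "foo") == alice)        # cookie deleted, same IP
print(resolver.resolve("cookie-1", "bar") == alice)  # same browser, other IP
```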
Compartmentalization
Regardless of which approach is being pursued, I think it should be compartmentalized per wiki, so if you have permission to unmask on one wiki, you can't unmask based on information gathered from another wiki; and only people with some global unmask permission can unmask globally. This would prevent someone with nefarious intent who manages to get unmask permission on a smaller wiki from being able to unmask all IPs on all other wikis. AzaToth (talk) 13:17, 8 January 2022 (UTC)
- @AzaToth thank you, yes, this is what we planned. –– STei (WMF) (talk) 16:33, 19 January 2022 (UTC)
- And vice versa: it could make it impossible to see cross-wiki vandalism for local admins, and would make it harder to judge local edits' validity.
- Which makes it logical that we need a tool to find cross-wiki same-ip (or same subnet) edits without revealing the specific IP? grin ✎ 15:02, 20 January 2022 (UTC)
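The two requirements in this thread, per-wiki unmask rights and cross-wiki correlation without revealing the address, can be sketched together. All names here are hypothetical, including the "*" marker for a global right and the assumption that every edit already carries an opaque masked identifier that is the same across wikis for the same IP.

```python
GLOBAL = "*"  # hypothetical marker for a global unmask right


def can_unmask(grants: dict, user: str, wiki: str) -> bool:
    """A user may unmask only on wikis where they hold the right, unless it is global."""
    held = grants.get(user, set())
    return GLOBAL in held or wiki in held


def wikis_with_same_mask(edits, masked_id: str) -> list:
    """List the wikis where the same masked identifier edited, without exposing the IP."""
    return sorted({wiki for wiki, mask in edits if mask == masked_id})


grants = {"admin_small": {"smallwiki"}, "steward": {GLOBAL}}
edits = [("enwiki", "m1"), ("dewiki", "m2"), ("hrwiki", "m1")]
print(can_unmask(grants, "admin_small", "enwiki"))
print(wikis_with_same_mask(edits, "m1"))
```

The correlation function never touches the underlying address, so a local admin could see that an editor is active on other wikis without gaining the ability to unmask there.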
Some thoughts on how the media could perceive this
@Johan (WMF): The change needs to be communicated carefully to the general public, I think. In the past, there have been several articles by investigative journalists (such as here in Switzerland, or in Germany) about manipulation of Wikipedia that relied in part on the IP addresses in the version history. They then were able to show "this article has been edited anonymously from an IP that can be traced back to corporation X", or that it is an IP from the federal government of Y, and so on. Making the IP addresses no longer visible to the public could be perceived as an attempt to sweep attempts to manipulate Wikipedia under the rug, to make the project less transparent and making the journalists' work harder. To me, the privacy enhancement / legal reasons seem to be clear and convincing enough, but I wouldn't count on the media to see and to depict it that way! Try to avoid "Wikipedia hides manipulation!" headlines... Gestumblindi (talk) 19:02, 8 January 2022 (UTC)
- Gestumblindi, thank you for this insight. Our press team will work on this. –– STei (WMF) (talk) 06:53, 11 January 2022 (UTC)
- Also it is very true: this would make it almost impossible for investigative journalists to trace back malevolent edits to companies, governmental institutions and other dishonest organisations.
- Maybe there ought to be a researchers' access to IP data as well (which is a can of worms, I do know that). grin ✎ 14:59, 20 January 2022 (UTC)
IP rangeblock
As someone who fights school vandalism daily, the most important thing for me is to know the IP range. However, if different cipher keys are used for different IPs, how will I be able to differentiate between two unconnected IPs and two IPs from the same range without clicking through to a proposed window? A09090091 (talk) 14:43, 22 January 2022 (UTC)
They need to know their IP can be seen by somebody
Hi. In the current situation they get a banner to inform them that their IP can be seen by users. In the new banner they only get an invitation to make an account. The idea of masking is good but they still need to know their IP can be seen by some users (admins etc.) Gharouni 04:03, 5 January 2022 (UTC)
Gharouni makes a good point. The banner should include:
- your IP is ... and it is visible by several users of anti-vandalism teams
- your masked IP which will be used for your contributions will be ... and will be visible to all users
--FocalPoint (talk) 09:53, 5 January 2022 (UTC)
- And users who have the same IP will be able to see that your edits came from the same IP. –LPfi (talk) 13:33, 13 January 2022 (UTC)
- LPfi, Gharouni, we will inform editors that their IP addresses will be seen by admins as we've indicated in this screenshot on the project page. Thank you for your suggestions. –– STei (WMF) (talk) 17:22, 19 January 2022 (UTC)
- @STei (WMF): I think admins seeing them is a minor concern, compared to whether people they know (such as their boss) can see the connection, and whether the addresses are saved in logs that may be given away, such as in the case with penet.fi (Penet remailer). –LPfi (talk) 17:59, 19 January 2022 (UTC)
- I hope that the popup is not the final version. It doesn't mention the new IP reviewer group, and it's missing a link to a page or pages where it's explained who can see the IPs. "Administrators" probably means nothing to most people, and many might mistakenly believe that administrators are WMF employees. kyykaarme (talk) 09:35, 23 January 2022 (UTC)
Suggestions
I think anyone who has access to the new user right that allows them to see IPs should have to sign the nondisclosure agreement or some other form of a privacy agreement. I am an active vandalism fighter and a SWMT member and I intend to seek it. I also think there should be some form of a global user right that gives users access to the privilege. Bobherry (talk) 00:54, 18 January 2022 (UTC)
- I agree with the thrust of Bobherry's comment, noting though that a restriction of access with extensive exceptions is essentially no restriction at all, so that it's all in the details. Thank you also to STei (WMF) for the links provided below. Cheers. 2601:246:C700:558:4934:BD1D:2B41:D139 22:50, 23 January 2022 (UTC)
- Bobherry, we have some notes here and there with more information on editors who will be able to see IP addresses and what they are required to do. –– STei (WMF) (talk) 08:00, 18 January 2022 (UTC)
Preferences in the session-based approach
Would anon editors be able to use preferences in the session-based approach? It seems feasible because browser cookies are tied to a computer account (used by one user) instead of a modem (possibly used by several users). --67.183.136.85 04:41, 20 January 2022 (UTC)
- Without a registered account, an editor will not have access to Preferences. –– STei (WMF) (talk) 06:17, 24 January 2022 (UTC)
Suggestion about the new right of viewing IP addresses
I suggest giving the "IP viewing" right by default to the extended confirmed group where one exists, and creating a new right for sysops to assign where it does not. Also, give the community time to talk about this and determine the requirements for getting this role. This would allow anti-vandalism work to continue without any mess after IP masking is added.--Emojiwiki (talk) 05:43, 20 January 2022 (UTC)
- @Emojiwiki we have more information on the proposal about who sees IP addresses here. There's additional information regarding how we want to work with the community here also. –– STei (WMF) (talk) 06:51, 24 January 2022 (UTC)
Status?
The message that was sent to admins said "we will decide after 17 January". At least it was not straight after that date. Or is it extended? Stryn (talk) 08:08, 23 January 2022 (UTC)
- Stryn, status not available yet. We will do our best to spread information when there's an update. –– STei (WMF) (talk) 06:23, 24 January 2022 (UTC)
As a...
dedicated non-logging (IP) editor, committedly so in the context of the very longstanding commitment of WP to such freedoms, I would suggest you put out a further call, through registered editors, seeking contact with IP editors to include in the current "what to do with..." discussion. I say this as one of those, but one who has managed to engage a circle of editors who are sufficiently patient to ignore bot-edit warnings, read edit summaries, and engage edit texts for their constructive, quality-directed aims. (That is, to edit with few misinterpretations of work-as-vandalism, and so very few cursory reversions.) Whatever new guidelines and attitudes are engendered by these discussions, they should not force IP editors into positions of recognition here that they have reasons to avoid, nor should they feed destructive attitudes that already exist—that there is something suspicious, if not wrong with the lot of us. I volunteer as one, but I would suggest (something like) your picking your top 100 articles, asking a regular registered editor at each to skim histories for recent, regular or otherwise significant IP editors at each (because these might, like myself, be ones with academic or other expertise), and reaching out to those "good" IP editors for their views. Will look in again here for a response. Cheers. 2601:246:C700:558:4934:BD1D:2B41:D139 22:38, 23 January 2022 (UTC)
- Thank you for your suggestion –– STei (WMF) (talk) 09:03, 1 February 2022 (UTC)
Welcome
So, very soon we'll be sending out a notice to all admins. (This might include some people who very recently were admins, but no longer are, and exclude some who very recently became admins.) This is not because we're interested only in their opinion, but because it's one way we've identified to reach out to the communities and people who are likely to be active in vandal-fighting, without spamming every content contributor.
We're specifically interested in feedback on the solutions we have listed in IP Editing: Privacy Enhancement and Abuse Mitigation#IP Masking and how to protect the wikis (9 December 2021 Update). We're of course open to feedback on everything else, too. And I figure we'll get some "why are you doing this?" again, so I'll just point out again that we're doing this because the Wikimedia Foundation Legal department has told us it needs to happen, because of changing regulations and norms around privacy – the regulations are not the same today as they were in 2001. /Johan (WMF) (talk) 13:45, 4 January 2022 (UTC)
- @Johan (WMF): I got the message three times (admin on three different wikis). If you can't even tell that an admin is the same user across wikis, I'm not convinced that "Session based identity" will help at all with cross-wiki spam compared to sticking with IP addresses or a direct analogy. Thanks. Mike Peel (talk) 18:27, 4 January 2022 (UTC)
- The easiest way to make sure all administrators are notified is to notify all administrators on all wikis. Inventing a way to remove duplicates and to prioritize wikis (per edit count? per importance? Did you expect a notification on wikidata or enwiki?) is largely unnecessary. ToBeFree (talk) 18:33, 4 January 2022 (UTC)
- Ain't that one of the good things of SUL, that you know such stuff and just send it to the Homewiki? It's not even necessary that it's a Wiki where s/he is sysop, it's just home base. Grüße vom Sänger ♫(Reden) 21:47, 4 January 2022 (UTC)
- I'm sure the complaint would then have been "Why have I received this message on the wrong wiki? I [could] have turned off notifications there." ToBeFree (talk) 23:41, 4 January 2022 (UTC)
- This —TheDJ (talk • contribs) 10:51, 8 January 2022 (UTC)
- I'm an admin on several wikis, and I also administer single sign-on for a major university as my career, so I'm familiar with changing privacy norms around the web. I'm not at all surprised to see IP-based identification going away. While IP masking would work, I'd be in favor of switching to a session-based identity path. I think this is the more solid and user-comprehensible approach, and it opens better opportunities for anonymous editing in the future. I'm happy to see WMF dealing with this issue proactively.– Quadell 18:29, 4 January 2022 (UTC)
- That said, I'm not at all sure this measure would protect any contributor from government prosecution: a malicious government might create a user who gains admin privileges, rendering the IP privacy measure ineffective. -- Blackcat 18:31, 4 January 2022 (UTC)
- If I remember my experience working for a supplier for Deutsche Telekom correctly, governments in many countries outside the US can gain access to phone/internet details very easily, although it is far more difficult, if not impossible, for corporations to get this information. On the other hand, it is difficult for the US government, which must seek a court order, to gain access to those details, while it is very easy for corporations to get them. (And through these corporations, foreign countries can buy the information either directly or through a strawman intermediary. There's less privacy out there than we know, dammit.) -- Llywrch (talk) 21:23, 4 January 2022 (UTC)
- @Llywrch: I'd be less worried by a corporation rather than a malicious government, but my point was that you cannot declare "private" an IP address then let an admin disclose it. We do not have a policy for admins and admins are anonymous. To be consistent, WMF should issue a rule that whoever has sysop upwards privileges must be disclosed and cannot be anonymous. -- Blackcat 23:40, 4 January 2022 (UTC)
- I think that if people can't be trusted to see the whole IP address, then people should not be trusted to see the partial IP address. Only checkusers can see the IP address made by anons, and block IP ranges (IP range block log should be kept private to only checkusers). Also, the Confidentiality agreement for nonpublic information must be rewritten and existing functionaries should sign again. Thingofme (talk) 09:54, 8 January 2022 (UTC)
- I prefer the session-based approach. It provides more value in being able to identify and communicate with legitimate anonymous editors. However, at the same time, we need abuse filter options to be able to identify multiple new sessions from a single IP. These could be legitimate (from a school, for example), but will most likely represent abuse or bot activity. One feature I haven't seen mentioned yet. When a session user wants to create an account, it should default to renaming the existing session ID to the new name of their choice. We need to be able to see and/or associate the new named user with their previous session activity. -- Dave Braunschweig (talk) 18:37, 4 January 2022 (UTC)
- Hello Dave, when a session user creates an account we are planning to not carry over their edits. This is because we can’t be sure that the device was used by a single person (the one now creating the account) and therefore shouldn’t attribute all the edits to them. This could be common on public or family computers, for example. ––STei (WMF) (talk) 14:46, 13 January 2022 (UTC)
- The ability to perform purely session-based blocks in addition to the existing IP+session blocking would be an interesting upgrade. Being able to communicate with IPv6 users through their session instead of their repeatedly changing IP address would also be a benefit. ToBeFree (talk) 18:42, 4 January 2022 (UTC)
- @Johan (WMF): Not an admin but saw this on various talkpages I'm watching. I noticed the message said "There will also be a new user right for those who need to see the full IPs of unregistered users to fight vandalism, harassment and spam without being admins. Patrollers will also see part of the IP even without this user right". Who does "patrollers" refer to here? New page patrollers? Recent changes patrollers? Something else? I do think this right would be somewhat useful for the work I do patrolling. Elli (talk) 18:46, 4 January 2022 (UTC)
- Anyone is equally welcome to comment! ("[N]ot because we're interested only in their opinion", as I wrote above.) Who would need this user right would be largely for the local community to decide. Anyone involved in vandal-fighting (or who needs it for some other reason) who lives up to some simple requirements, and can be granted the right through some sort of simple community process. I would personally find it more useful when patrolling recent changes than when looking at new pages on my home wiki, but I don't imagine we know all potential needs. Johan (WMF) (talk) 18:54, 4 January 2022 (UTC)
- Including it in dewiki's "Aktiver Sichter" and enwiki's "rollbacker" could be an idea. The former is assigned automatically according to rather strict criteria. Is manual assignment a strict requirement? ToBeFree (talk) 18:58, 4 January 2022 (UTC)
- I remain a steadfast IP anti-masker, but I feel that my opinion probably won't change things at this point. I'm grateful as an enwiki admin for the ability to still see IPs to block malicious ones. This will probably speed up the already-inevitable path to mandatory registration on enwiki, I think. John M Wolfson (talk) 18:49, 4 January 2022 (UTC)
- @John M Wolfson: indeed, I agree almost completely. In order to fix a minor problem we are creating a bigger one. At this point, either a) allow open proxies so that IP addresses cannot be tracked, or b) impose mandatory registration. WMF has seemingly chosen to hide IP addresses from malicious eyes, even though the admins who can see those IPs are largely anonymous. -- Blackcat 23:45, 4 January 2022 (UTC)
- Anyone seriously interested in vandalizing or evading our controls will be using more than a single computer or single browser. So do many good-faith users (I currently use ≥3 at least occasionally). They are usually but not always working from the same router, and I almost always use the same browser, but not necessarily the same version. If I were an unregistered IP user, trying to link what I do via session-based cookies would seem to be useless. DGG (talk) 18:56, 4 January 2022 (UTC)
- There certainly would be an incentive for good-faith unregistered users to use an account. I'm not certain that's a bad thing. LtPowers (talk) 19:00, 4 January 2022 (UTC)
- Based on a quick perusal of the issues, session-based IDs seem like the best solution. LtPowers (talk) 19:00, 4 January 2022 (UTC)
- In the session based model, will we be able to see all sessions belonging to a given IP? Kusma (talk) 20:11, 4 January 2022 (UTC)
- @Kusma -- that's definitely possible to do by building a tool which exposes that to users who have IP-address access. It's also possible to do the reverse -- see all IPs associated with the same session which we are planning to expose with the help of the IP Info feature. -- NKohli (WMF) (talk) 12:21, 4 February 2022 (UTC)
- Hi Johan, I (and presumably all the other admins) just got the message. Since I've already discussed most of it with you, I was just wondering why it places "norms" before "regulations" in the reasoning for why. I thought we'd settled that, regardless of whether they have or haven't changed in this way, Legal have no right to change things on that basis, and using it as cover is not acceptable for a top-down imposed change. Nosebagbear (talk) 21:40, 4 January 2022 (UTC)
- Changing norms lead to changing regulations, is probably how my mind went at the time, but it was weeks ago, so honestly I couldn't tell you. This is me putting together a very brief explanation in a message I wanted short enough for people to feel they had the time to translate it. I wouldn't read too much into it. (: Johan (WMF) (talk) 23:20, 4 January 2022 (UTC)
- Considering that there has been near-unanimous opposition to this IP masking since day one, that "But WMF legal told us to!" is the only reason anyone is pushing this, and that more and more Wikimedia projects are adopting a required-registration policy, I think simply implementing that globally would be better than this absolute mess. Don't solve a tiny problem by creating a much larger one. Naleksuh (talk) 23:53, 4 January 2022 (UTC)
- I understand why this happens, but I think it is very close to seriously damaging our capacity to administrate wikis:
- If a user edits on two wikis simultaneously from the same device/browser (e.g. creates an article and adds a link to Wikidata), whatever the scheme, the identifier absolutely must be the same (otherwise it is just useless). If a user adds similar nonsense pages to all wikis, but they are Anonymous1235 on enwiki, Anonyme245 on frwiki and Анонім74 on ukwiki, we will need a steward to find this out, while today any bystander can detect this.
- For session-based identifiers, I would really like us to use a MAC address rather than a cookie. Cookies are very easy to circumvent, and I don't think cookie restrictions will be at all effective against vandals. It would really be a cat and mouse game: an admin blocks by cookie, a vandal deletes it and can edit again immediately. A vandal can literally change their identifier after every single edit, and this change will take just 1-2 sec (restarting a router to get a new IP address for a dynamic range usually takes 1-2 min), which makes circumventing restrictions unreasonably easy. Unless our vandal has zero IT skills, this would be completely useless, as modern browsers make it very easy to purge cookies (or offer quick access to private mode).
- If using MAC addresses is not possible, keeping blocks by IP addresses is the best option. This approach has lots of disadvantages, but at least we know how to deal with it. However, there are two important conditions.
- We absolutely need a replacement for range identifiers, notably for things like 3RR, bans and filters. Today we know that three anonymous reverts in a row made from the same range are almost surely the same person (the rarer the range, the more likely this is). In the future we would need to easily find out whether these three anonymous edits are coming from the same range or from completely unrelated ones. There are significantly more use cases; for instance, we have AbuseFilters targeting some very specific behaviour from specific ranges. The most common tools we need would be <do anons X, Y and Z come from the same range> or <check all edits from X's range>. We can do it in a smart way, e.g. automatically querying WHOIS to get the respective range of the provider, or automatically checking the provider's IDs to identify a different range of the same provider.
- We need to prevent extra burden on admins. Anonymous edits are already quite hated by our admins. If these changes mean that non-admins can do literally nothing with an IP edit (cannot check what other edits on this topic were made from this range, cannot check if they circumvent sanctions etc.), this would put an unreasonable burden on admins. I don't believe we would magically get more admins, so quite likely we will just end up doing what ptwiki has done and ban IP edits altogether.
- To sum up: session blocks are OK only if we use MAC addresses instead of cookies, IP bans are OK only if we have a good replacement for range tools for non-admins; both need to keep identifiers consistent cross-wiki. If not, banning IP editing altogether seems just the simplest option — NickK (talk) 00:00, 5 January 2022 (UTC)
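The <do anons X, Y and Z come from the same range> check mentioned above can be sketched in a few lines. This is only an illustration, not a proposed implementation: the function name `same_range` and the default prefix lengths (/24 for IPv4, /64 for IPv6) are assumptions, and a real tool would more likely take the provider's allocated range from WHOIS, as the comment suggests.

```python
import ipaddress


def same_range(addresses, v4_prefix=24, v6_prefix=64):
    """Return True if all addresses fall within one network of the given size.

    The prefix lengths are illustrative defaults, not values from the proposal.
    """
    networks = set()
    for text in addresses:
        ip = ipaddress.ip_address(text)
        prefix = v4_prefix if ip.version == 4 else v6_prefix
        # strict=False accepts a host address and returns its enclosing network
        networks.add(ipaddress.ip_network(f"{ip}/{prefix}", strict=False))
    return len(networks) == 1


# Addresses from documentation-only ranges, used purely as examples
print(same_range(["198.51.100.7", "198.51.100.200", "198.51.100.31"]))  # True
print(same_range(["198.51.100.7", "203.0.113.9"]))  # False
```

A tool like this could answer the question for users without IP access by returning only the yes/no result, never the addresses themselves.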
- There is fortunately no way for an internet website provider to obtain the MAC addresses of their clients. This would be a privacy nightmare. IPv6 users can use their MAC address to generate an address in their /64, but this method is usually disabled for privacy reasons. Instead, a random address is generated ("Privacy Extensions", RFC 4941). ToBeFree (talk) 02:17, 5 January 2022 (UTC)
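For readers unfamiliar with the MAC-derived method mentioned above: under modified EUI-64 (RFC 4291, Appendix A), the lower 64 bits of the address are computed directly from the MAC address, so anyone who sees the address can read the MAC back out of it. A minimal sketch follows; the function name, the documentation prefix and the example MAC are all made up for illustration.

```python
import ipaddress


def eui64_address(prefix, mac):
    """Build the IPv6 address that modified EUI-64 derives from a MAC address."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit of the first octet
    # Insert ff:fe between the two halves of the 48-bit MAC
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    network = ipaddress.IPv6Network(prefix)
    return network.network_address + int.from_bytes(interface_id, "big")


print(eui64_address("2001:db8::/64", "00:11:22:33:44:55"))
# 2001:db8::211:22ff:fe33:4455
```

With Privacy Extensions enabled, the interface identifier is random instead, so no such reverse mapping exists.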
- @ToBeFree: Thanks for this clarification, I did not do any research into it. Thus our only option to ban a specific device is setting a cookie that a vandal can immediately delete? Or do we have any better option? For instance, pairs <IP+port> might be good in many cases, although we are not using them now AFAIK. It would be good to have a brainstorming on what our options (even if not readily available now) are — NickK (talk) 20:10, 6 January 2022 (UTC)
- A new source port is used for each connection, to separate them from each other. Connection reuse for multiple requests to the same server is a thing, but there may be multiple connections to the same server to improve performance, and at very least closing the browser and re-opening it throws the connections away. TCP/UDP source ports are thus less useful than IP addresses and cookies for any kind of identification.
- The thing is, if there was something more identifying than cookies (Browser fingerprinting has been mentioned above), then using it for identification would contravene the idea behind the whole action: Improving users' privacy. So even if there is something that could be technically used in place of the IP address, it won't be used for exactly the same reasons. ToBeFree (talk) 23:26, 6 January 2022 (UTC)
- Ah, and to address your banning concern: We can still see and block IP addresses, they're just not public anymore (#Blocking unregistered editors). ToBeFree (talk) 23:29, 6 January 2022 (UTC)
- @ToBeFree: Well, my main concern is: we are deciding which identifiers most registered users will see. The problem is that there are two competing goals: we do want to protect the privacy of good-faith casual unregistered contributors, but we need to be able to fight annoying unregistered vandals. There are way more non-admin vandal fighters than admins (perhaps by at least a factor of 10). I want to have some meaningful approach that would allow non-admin vandal fighters to understand which vandal they are dealing with; otherwise we would have a real communication problem between admins (who do see IPs) and non-admins (who would have some weird identifiers). While I would definitely prefer a cookie identification for a 60-year-old lady contributing from time to time on the same topic but having no idea how IPs or registration work, I do want to have an IP (incl. range and provider) identification for a 16-year-old geeky vandal who knows how to delete cookies or get a new dynamic IP.
- Regarding ports, of course I mean IP + port pairs as a way of identifying distinct connections, particularly for dynamic ranges. We already have this data somewhere and it is quite an industry standard, thus I think it is an option that should be explored and might be helpful — NickK (talk) 21:48, 10 January 2022 (UTC)
- Fortunately, there will be a user right we can give to non-admin vandal fighters to let them view the IP address ("The IP address itself will be visible to administrators and patrollers"). Perhaps it can even be given to all members of existing antivandalism groups such as enwiki's "rollbackers", or even to all members of automatically assigned groups such as dewiki's "Aktive Sichter". We'll need details about this (18:58, 4 January 2022 (UTC)).
- IP+port pairs are being used to identify connections all the time; that's their technical purpose. Displaying source ports to users – randomly generated, meaningless numbers up to 65535 – doesn't provide any benefit, though. ToBeFree (talk) 22:57, 10 January 2022 (UTC)
- The session identity needs WAY more thought before rolling out. Cookie-based session identities? Come on now! That is way too easy to circumvent, which the deliberate miscreants will do anyway, negating any benefit that might have come from it. NickK has a good point that MAC addresses would be better. There is no harm in rolling out IP masking first. It doesn't change anything from the administration side, causes minimal disruption to existing workflows, and preserves privacy (particularly for editors in countries like China who fear repercussions for editing here). Anachronist (talk) 02:04, 5 January 2022 (UTC)
- As long as IP based blocking remains possible, it seems like the session-based approach has some advantages. I don't think session-based blocking will be useful, almost everyone knows how to clear cookies. -- hgzh 14:58, 5 January 2022 (UTC)
- Sorry, ToBeFree, but extraction and use of client MAC addresses is quite common. There are a number of open-source and commercial products available to do this (the IP stack is quite simple and open; it should never have seen the light of day, but that's another story). It would not be difficult to extract the MAC and IP and create a hash to be used for display purposes. This creates an almost unique handle identifying IP editors; it is obviously not as specific as a login, since it identifies only the device, but it is pretty close. Admins would need a lookup tool for comparison purposes ('is this the same IP and a different device, or perhaps a different IP and the same device?'), as that's the granularity that could be made available, while hiding actual values from most. It would make sockpuppetry, etc., even easier to detect, to some degree. This is nothing new; it's an operational requirement in network providers all over (they don't bother creating hashes though!). Neils51 (talk) 22:26, 3 February 2022 (UTC)
- Neils51, the client's MAC address is never transmitted to the HTTP(S) destination; it is removed by the client's gateway to the Internet. ToBeFree (talk) 07:40, 4 February 2022 (UTC)