Talk:CopyPatrol/Archives/2017

From Meta, a Wikimedia project coordination wiki

Remarks & ideas from French community

Hello Com Tech! The French community warmly welcomed this new tool. Here are the first remarks from users:

  • Jules78120 says that the tool has problems with renamed pages.
  • Alaspada explains that the action needed on the interface is not obvious (can we revert directly from here, what are the concrete consequences of the two buttons...) - with a good explanation by Antoniex. Perhaps write a quick popup for the first login on the tool?
  • Finally, Antoniex proposes adding a discussion system directly in the tool, to ask a question, share remarks with others, ... Perhaps a comment system per entry? Just a text field that is stored and displayed for others with the username.

I have two other questions:

  • About EranBot, @ערן: I've added a few French domains to the whitelist. Do you think that en:User:EranBot/Copyright/Blacklist should be on Meta instead of enwiki? I'll look into a PR for the code.
  • About CopyPatrol, I have an idea: my bot could send a copyvio warning to users whose edits are marked as "Done" in the tool. How can I access the database? In EranBot's source code I found a xxx__copyright_p db name, but it does not exist on the tool's SQL server....

Thanks for all your work! I'll let you create Phabricator tasks if necessary. --Framawiki (talk) 19:05, 5 December 2016 (UTC)

Framawiki, thank you for the comments.
  • Moving the whitelist to Meta - I think it is a good idea.
  • DB - Here is the DB connection plus a sample query: -h enwiki.labsdb s51306__copyright_p -e "select * from copyright_diffs where lang='fr' limit 5;"
(BTW: s51306 is not secured/private, but I didn't hard-code it, to make it easier to port the bot to other projects within Labs just in case...)
eranroz (talk) 19:21, 5 December 2016 (UTC)
Yes! SELECT * FROM s51306__copyright_p.copyright_diffs WHERE lang = 'fr' AND status_user = 'Framawiki' AND status = 'fixed'; --Framawiki (talk) 21:40, 5 December 2016 (UTC)
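For anyone wanting to script the query above, here is a minimal sketch in Python. It assumes only the table name and the lang, status and status_user columns mentioned in this thread; the diff and page_title columns, the sample rows, and the function names are hypothetical, and an in-memory SQLite database stands in for the real replica so the example runs anywhere.

```python
import sqlite3


def fixed_diffs_for(conn, lang, reviewer):
    """Return (diff, page_title) rows that `reviewer` marked as fixed on `lang`."""
    query = (
        "SELECT diff, page_title FROM copyright_diffs "
        "WHERE lang = ? AND status_user = ? AND status = 'fixed' "
        "ORDER BY diff"
    )
    # Parameters are bound by the driver, never pasted into the SQL string.
    return conn.execute(query, (lang, reviewer)).fetchall()


def demo_connection():
    """Build a throwaway database; the schema here is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE copyright_diffs ("
        "diff INTEGER, page_title TEXT, lang TEXT, status TEXT, status_user TEXT)"
    )
    conn.executemany(
        "INSERT INTO copyright_diffs VALUES (?, ?, ?, ?, ?)",
        [
            (101, "Exemple", "fr", "fixed", "Framawiki"),
            (102, "Autre", "fr", "false", "Framawiki"),
            (103, "Sample", "en", "fixed", "Framawiki"),
            (104, "Test", "fr", "fixed", "SomeoneElse"),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in fixed_diffs_for(demo_connection(), "fr", "Framawiki"):
        print(row)
```

A bot that sends warnings could loop over these rows; against the real database the connection would be a MySQL one and the placeholder style may differ.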


@Framawiki: Thanks for the feedback!

Jules78120 says that the tool has problems with renamed pages.

Will look into this

Alaspada explains that the action needed on the interface is not obvious (can we revert directly from here, what are the concrete consequences of the two buttons...) - with a good explanation by Antoniex. Perhaps write a quick popup for the first login on the tool?

If you hover over the review buttons, you should see a little caption explaining what they do. I do however like the idea of a more obvious pop up when you first login, explaining how the interface works. Perhaps you could also translate the instructions at CopyPatrol#On enwiki for your wiki? The last bullet point "please update the status in the interface..." is the information Alaspada needs.

There is no way to revert the edit from the interface, but there is a ticket to add this (T138058). Would you find this useful?

Finally, Antoniex proposes adding a discussion system directly in the tool, to ask a question, share remarks with others, ... Perhaps a comment system per entry? Just a text field that is stored and displayed for others with the username.

I will talk to my team about this. We have had similar requests for adding notes to entries (T139650, T135301). Would that help? So, when you do a review, you could enter a summary. MusikAnimal (WMF) (talk) 05:36, 6 December 2016 (UTC)

Yes MusikAnimal, being able to revert directly from the interface would be best. But a lot of the time, users edit the page after the copyvio to wikify it... --Framawiki (please notify !) (talk) 17:32, 6 December 2016 (UTC)


I've created fr:Wikipédia:CopyPatrol and a script, run by my bot, to help use the tool. It does: [1]

Do you think that enwiki or other wikis would be interested in this bot? --Framawiki (please notify !) (talk) 17:17, 25 December 2016 (UTC)

I'll look into porting this bot to other languages.
A lot of users' drafts are simply copyright violations, so @ערן: can you enable the bot for the User namespace too? On frwiki they are on pages named User:XX/Brouillon. --Framawiki (please notify !) (talk) 17:02, 5 January 2017 (UTC)

Framawiki, the bot runs either on the regular pagegenerators of pywikibot, or in "live" mode using the recent changes feed (using the IRC generator)[2]. Live mode is the regular mode run in production, and it doesn't support complex queries [e.g: (ns=0 or (ns=2 and "draft" in page_title))]. If you have a suggestion to improve page_filter in a more generic way, you are welcome to send a pull request :) eranroz (talk) 19:49, 5 January 2017 (UTC)
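To make the example condition above concrete, here is a rough sketch of what a more generic filter might look like. It is not EranBot's actual page_filter: the function signature, the draft_markers parameter, and the case-insensitive matching are all assumptions for illustration. Only the rule itself (namespace 0, or namespace 2 with a draft marker in the title) comes from the comment above.

```python
def page_filter(namespace, title, draft_markers=("/Brouillon",)):
    """Hypothetical filter: keep articles, plus user pages that look like drafts.

    namespace     -- numeric namespace id (0 = article, 2 = User)
    title         -- full page title, e.g. "User:XX/Brouillon"
    draft_markers -- per-wiki substrings that identify a draft (assumed config)
    """
    if namespace == 0:
        return True
    if namespace == 2:
        lowered = title.lower()
        return any(marker.lower() in lowered for marker in draft_markers)
    return False


if __name__ == "__main__":
    print(page_filter(0, "Paris"))
    print(page_filter(2, "User:XX/Brouillon"))
    print(page_filter(2, "User:XX"))
```

Passing the markers in as a parameter keeps the rule per-wiki, so another project could supply its own draft naming convention without changing the function.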

False positives

Hi, I just discovered this tool today, and saw that it is working for edits in the Portuguese Wikipedia so decided to try it out. I think it will turn out to be quite useful, but there are some issues that I would like to hear your opinion about.

  1. The diff presented (e.g. [3]) was not the one where the text identified as copyvio was added. Can the tool make use of the difference in text rather than the full article for detection?
  2. It is more likely that the site copied Wikipedia than the other way around (e.g. [4] seems to have copied the article after 2010, when the info about a concert by Elton John was added - see [5]). I don't think there is an easy way to see who copied who unless perhaps the first issue is resolved (if the edit is recent and the source site was already there, for example), but maybe there are other ideas. GoEThe (talk) 14:53, 28 February 2017 (UTC)

w:en:Philippine Atheists and Agnostics Society

Please lock this page; it has been vandalized several times by people in the Philippines.

Greetings from NYC, USA:

My name is Marissa Torres Langseth, RN, MSN. I am the true founder of PATAS. I have too many stalkers and haters, which makes PATAS very vulnerable, and the page has been vandalized since its inception. IP addresses in the Philippines have notably been the culprits; however, they are ghost writers. May I request that you lock that page, or remove it entirely, since I still have haters who want to smear my reputation, especially that scholar of mine who recently spread fabricated lies about me when I stopped supporting him. His name is Aljohn de Leon, with too many aliases, so it is moot and academic to look for his IP address.

Kindly lock the page: https://en.wikipedia.org/wiki/Philippine_Atheists_and_Agnostics_Society or remove it in its entirety if you cannot protect it. I am tired of these haters trying to bring me down by editing that page.

Thank you and kind regards, Marissa Torres Langseth, RN, MSM

Not seeing that much problem. But enough to protect for a week. Doc James (talk · contribs · email) 01:38, 16 September 2016 (UTC)

Hello there again: PATAS was vandalized again. May I request that you delete this page? The website is already nonexistent. It is too long a story, but it is no longer a viable entity. Some people just are evil; they come here to edit this wiki to vilify me. Thank you and kind regards, ms M Hapimarissa (talk) 14:09, 12 May 2017 (UTC)

CopyPatrol is not working

https://tools.wmflabs.org/copypatrol/en is not working today; I am getting a "Slim Application Error". Thanks, Diannaa (talk) 12:56, 13 April 2017 (UTC)

I'm getting a similar problem at Copypatrol today. The main page opens fine, but trying to review any of the entries returns a "There was an error in connecting to database." error. Likewise, trying to open the Leaderboard returns the slim application error. /wiae /tlk 14:03, 14 April 2017 (UTC)

The iThenticate links are failing to work at https://tools.wmflabs.org/copypatrol/en. I have tried several and I can't get them to load. Thanks, Diannaa (talk) 17:12, 22 May 2017 (UTC)

Working again Diannaa (talk) 17:41, 22 May 2017 (UTC)
And now it seems to have quit again. Diannaa (talk) 18:17, 22 May 2017 (UTC)
Sketchiness seems to have resolved itself, as I've had no issues for a few days. Diannaa (talk) 12:50, 26 May 2017 (UTC)

Copypatrol not working - Slim application error

I am getting a Slim application error when I try to access the copypatrol today. Any help getting this tool running again would be appreciated. Diannaa (talk) 11:43, 23 June 2017 (UTC)

Pinging @Niharika and Ryan Kaldari (WMF): The ORES link at CopyPatrol should link directly to mw:ORES rather than to Objective Revision Evaluation Service. Thanks! Winged Blades of Godric (talk) 09:46, 3 August 2017 (UTC)

Copypatrol is down

The copypatrol bot appears to be down, as there's been no new reports since 18:59 on August 28 (over 24 hours ago). If you could have a look that would be appreciated. I have also posted at en:User talk:ערן. Thanks, Diannaa (talk) 20:04, 29 August 2017 (UTC)

Now resolved Diannaa (talk) 12:54, 30 August 2017 (UTC)

Copypatrol is working only intermittently

Copypatrol is working only intermittently. Right now the page refuses to load. This problem has been occurring on and off for the last few days. If you could see if there's anything you can do to help that would be perfect. Thank you! Diannaa (talk) 12:28, 11 August 2017 (UTC)

As an update, it took over four minutes to load today.--Sphilbrick (talk) 13:29, 4 October 2017 (UTC)
Today, it is fine.--Sphilbrick (talk) 13:10, 5 October 2017 (UTC)
I think Labs was undergoing some upgrades or such. Let me know if it happens again and I'll look into it. Pinging me will be helpful. Thanks. -- NKohli (WMF) (talk) 22:08, 10 October 2017 (UTC)

Vandalism at CopyPatrol

Thanks to User:MER-C for detecting a problem: a new user (since checkuser blocked and globally locked) has vandalized at CopyPatrol by marking more than 150 reports as "no action needed". We are unable to undo these reviews to get them back on the board. MER-C has filed tickets (https://phabricator.wikimedia.org/T178682, https://phabricator.wikimedia.org/T178681) as it would be helpful in an instance like this to undo these reviews. My suggestion is that only highly trusted editors should be able to log in at CopyPatrol. For example editors with at least 30 days and 500 edits, or perhaps an even higher threshold. Adding: I am repairing the damage now, by re-evaluating each review by this user. Diannaa (talk) 14:21, 20 October 2017 (UTC)

Sorry about that! I have run a query to undo all of their reviews in the database. There were 85 cases that were reset. We are now going to work on phab:T178681. We will talk about making CopyPatrol only available to experienced users, too, but I think making it respect on-wiki blocks and being able to undo reviews is the first step. Thanks — MusikAnimal talk 14:31, 20 October 2017 (UTC)
@Diannaa: To make sure you see this. I know you've been manually going through and redoing their reviews — MusikAnimal talk 14:33, 20 October 2017 (UTC)
Thank you! I have repaired the Oct 18 reports, most of which had already been dealt with, and will now carry on with the work. In the future I will check daily to ensure that people who are assessing reports are doing it properly until such time as the suggested automatic control is in place. Diannaa (talk) 14:41, 20 October 2017 (UTC)
I fully support the actions being taken. We probably ought to debate, calmly and carefully, the criteria for access to this tool, and not make a hasty decision, but closing it off to blocked editors is certainly justifiable.--Sphilbrick (talk) 14:50, 20 October 2017 (UTC)
(I was initially surprised to see the queue so low this morning, but I guess it's because Diannaa did even more than usual.)--Sphilbrick (talk) 14:52, 20 October 2017 (UTC)
The user was blocked on the 18th and did the vandalism on the 20th, so yeah, ensuring that blocked users cannot log in/assess reports would be a good start. Diannaa (talk) 15:16, 20 October 2017 (UTC)
I have created phab:T178700. Feel free to comment there, but we'll be monitoring the discussion here too. Best — MusikAnimal talk 17:14, 20 October 2017 (UTC)
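As a talking point for the access criteria discussed in this thread, here is a small sketch of how such a check could be expressed. It is not CopyPatrol's implementation: the function name and parameters are hypothetical, and the defaults simply use the 30 days / 500 edits example suggested above, combined with the proposal to turn away blocked users.

```python
from datetime import datetime, timedelta, timezone


def may_review(registered, edit_count, blocked, now,
               min_age_days=30, min_edits=500):
    """Hypothetical access rule: not blocked, old enough, and enough edits."""
    if blocked:
        # Respecting on-wiki blocks comes first, whatever the thresholds are.
        return False
    old_enough = (now - registered) >= timedelta(days=min_age_days)
    return old_enough and edit_count >= min_edits


if __name__ == "__main__":
    now = datetime(2017, 10, 20, tzinfo=timezone.utc)
    veteran = datetime(2012, 1, 1, tzinfo=timezone.utc)
    newcomer = datetime(2017, 10, 15, tzinfo=timezone.utc)
    print(may_review(veteran, 5000, blocked=False, now=now))
    print(may_review(newcomer, 20, blocked=False, now=now))
    print(may_review(veteran, 5000, blocked=True, now=now))
```

Keeping the thresholds as parameters means the numbers can be changed once the criteria have been agreed, without touching the block check.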

Search in CopyPatrol

Hello all! CopyPatrol now lets you search by page title (similar or exact). This was requested in T171555 by TonyBallioni. I was using this feature just now for testing and here's something I found a little strange - link. Hopefully this feature will help catch users who make repeated copyvio edits, so that they can be contacted and have the problem explained. Trusted users found via this feature can be added to the whitelist to avoid their edits showing up in the feed. Thanks. — The preceding unsigned comment was added by NKohli (WMF) (talk) 21:45, 23 October 2017 (UTC)

Thank you; that looks useful. Diannaa (talk) 13:10, 29 October 2017 (UTC)
Just so you know, nothing strange with those reports, as the source document is a publication of the US Department of the Army, Office of the Chief of Military History, as is visible in this url, and is thus in the public domain. Diannaa (talk) 13:20, 29 October 2017 (UTC)
So much of what I deal with is related to Wiki mirrors or declared public domain sources. enL3X1 ¡‹delayed reaction›¡ 01:21, 31 October 2017 (UTC)

Is something wrong?

Copypatrol was unavailable this morning. While it is now available, the most recent entries are three days old, and some of them had previously been addressed and presumably marked as fixed. Normally, after the passage of a few hours, there will be some new recent items at the top of the page, but that's not the case. Is it possible that the system is in a pause mode and needs to be restarted?--Sphilbrick (talk) 18:40, 1 November 2017 (UTC)

ToolForge is experiencing some problems. See this announcement. I expect it will take a while before things get back to normal. -- NKohli (WMF) (talk) 18:45, 1 November 2017 (UTC)
Thanks for the prompt response.--Sphilbrick (talk) 18:57, 1 November 2017 (UTC)
Sphilbrick and everyone: The bot's running again now. We're going to investigate how we can make CopyPatrol more stable -- it's phab:T179537, if you want to follow the discussion. Thanks for your report. -- DannyH (WMF) (talk) 23:15, 1 November 2017 (UTC)
It looks like the content was restored from a back-up done on October 28, 3-4 days ago, which is not adequate in my opinion. Backups need to be done more frequently please, if that's the case! Luckily I had the unresolved reports from October 31 open in a tab, so all we missed is less than one day's worth of copyvios. - Diannaa (talk) 00:31, 2 November 2017 (UTC)

"Curiosity" for whoever planned to host something like Turnitin in EU: https://edri.org/is-anti-plagiarism-software-legal-under-eu-copyright-legislation/ --Nemo 18:14, 30 November 2017 (UTC)

Deleted diffs: Diffs with no editor found

I have several times run across cases in which "no editor was found" because the page had been deleted. Because of this, I cannot check whether it was a copyvio or not. While any potential problems are solved by deletion for the time being, if at any point the article is refunded a copyvio may exist. Should anything be done about this, or in the case of refunding will CopyPatrol pick it up again for manual review? Thanks. enL3X1 ¡‹delayed reaction›¡ 01:56, 4 December 2017 (UTC)

I have not tested it, but yes, I believe such restored revisions will show up again in CopyPatrol. Also, Community Tech bot should be auto-reviewing any reports where the page was deleted. Perhaps it was deleted after you loaded CopyPatrol? In other words, you should see no red links to articles in CopyPatrol. MusikAnimal (WMF) (talk) 05:06, 4 December 2017 (UTC)
This happens sometimes when a page is re-created under the same title before the bot removes the article from the list. Diannaa (talk) 15:18, 18 December 2017 (UTC)