Talk:Pageviews Analysis/Archives/2020/1

From Meta, a Wikimedia project coordination wiki

Query strings in the URL

It would be very useful to include query strings in the URL (or provide the option for a user to manually insert them) to exactly reproduce a chart. For example, I may construct a two-article chart that looks good as a bar graph; I copy the URL (either from the browser or using the "Permalink" button), but the resulting link doesn't have a parameter for chart type, so it produces a line graph (example). MANdARAX  XAЯAbИAM 20:55, 10 January 2020 (UTC)

Topviews bug report

"Error querying Yearly pageviews API" displays for 2015 and 2016. -Thibbs (talk) 17:22, 19 January 2020 (UTC)

2016 topview doesn't work

https://tools.wmflabs.org/topviews/?project=uk.wikipedia.org&platform=all-access&date=2016&excludes= A1 (talk) 16:55, 30 January 2020 (UTC)

The yearly stats are not provided by the APIs, rather computed using a script. I am not sure why, but this script doesn't work for 2015 and 2016. I will try to investigate more. MusikAnimal (WMF) (talk) 20:16, 2 February 2020 (UTC)

Typo

On the page https://tools.wmflabs.org/userviews/ the text reads "Visualizzazioni di tutte le patine create da un utente" instead of "Visualizzazioni di tutte le pagine create da un utente". It is worth checking whether the translations in other languages are also wrong. --Daniele Pugliesi (talk) 20:34, 2 February 2020 (UTC)

@Daniele Pugliesi: (using Google Translate) You can correct the translations on translatewiki.net. You will need to request permission to edit translations if you do not already have it. Alternatively, you can contact the translators directly and ask them to fix the message. Kind regards, MusikAnimal (WMF) (talk) 20:48, 2 February 2020 (UTC)

Yearly visits

Tracked in Phabricator:
Task T241765 resolved

Hi, it seems that the articles most viewed during 2019 are not visible due to a problem with the API. I tried ca.wiki and en.wiki and it is not working for either of them. Is there any way to solve it? Thank you! Xavi Dengra (MESSAGES) 12:02, 13 January 2020 (UTC)

The data should be available sometime later today :) I will ping you once it's available. MusikAnimal (WMF) (talk) 17:50, 13 January 2020 (UTC)
@Xavier Dengra: 2019 stats are here! Regards, MusikAnimal (WMF) (talk) 20:44, 13 January 2020 (UTC)
@MusikAnimal (WMF): That is a fast reply! Thank you! But I would like to check the mobile phone accesses, as the totals have been wrong for the last year and a half, full of false positives of Mr. Puigdemont and other bios that keep continuously showing unreal values of visits. Could you have a look at it? Very much appreciated. Xavi Dengra (MESSAGES) 16:56, 14 January 2020 (UTC)
@Xavier Dengra: The right-most column shows what percentage of the traffic was from mobile. Is this not what you're looking for? MusikAnimal (WMF) (talk) 22:47, 14 January 2020 (UTC)
@MusikAnimal (WMF): Nope, the problem is that there are lots of false positives and it is obvious (at least for Catalan Wikipedia) that the list does not reflect reality at all. Moreover, some of the names have shown unexpectedly high numbers for the past two years. Our community is already aware that the numbers are wrong, and on the main page we correct them by showing the daily visits via phone through a bot (Townie and Joutbis are managing it). Historically, we always had the chance to filter the most viewed articles by app, phone, or all. I would like to get this option back for the most viewed of the year as well, in order to assess and double-check numbers... Otherwise, if we send a press release we are somehow lying and putting ourselves in a bad position (especially as the top views are very much related to politics, which has a huge impact on the Catalan media). Thanks for understanding. Xavi Dengra (MESSAGES) 18:40, 30 January 2020 (UTC)
@Xavier Dengra: This is hard to explain, but basically the yearly stats are manually generated by running a query directly against the pageviews database. This is a very expensive query that requires a lot of disk space, too. This is why we computed the percentage of mobile traffic in the same query, so that you can easily filter out the false positives. I understand this isn't adequate for your wiki, though. It is possible to get datasets for each platform, which I will look into, but ideally we'd get the data in the API itself. That issue is tracked at phab:T154381. Best, MusikAnimal (WMF) (talk) 20:40, 2 February 2020 (UTC)

pageview analysis: nothing works today! Johann Nepomuk (talk) 20:49, 3 February 2020 (UTC)

Why ignoring redirects?

Hi, I've just noticed a significant problem with ignoring redirects when calculating stats. The results with and without redirects are sometimes so different that ignoring them leads to significant mistakes, as shown below. Topviews for the Ukrainian wiki in 2019 currently shows the following top 5:

rank article views
1 Volodymyr Zelenski 826343
2 Ukrainia 796055
3 Transition of Church Communities to the Orthodox Church of Ukraine 749946
4 President elections 2019 in Ukraine 641539
5 Taras Shevchenko 628464

Calculations that include redirects show the following:

rank article views
1 Transition of Church Communities to the Orthodox Church of Ukraine 1070534
2 Volodymyr Zelenski 857758
3 Ukrainia 837122
4 President elections 2019 in Ukraine 654693
5 Taras Shevchenko 654424

As you can see, the difference is very significant. If redirects are not counted, as is currently the case, our "champion" is Zelenski, the current President of Ukraine. In this case, newspapers will write that Zelenski is the most popular topic among readers of Ukrainian Wikipedia. If the counts are corrected for redirects, newspapers will instead write that the most popular topic is the newborn Orthodox Church of Ukraine. Do you see how significant the difference is? In fact, the second table is more accurate, because redirects lead the reader to the same article, and in the end the reader reads that article. Unfortunately, the programmers who developed the tool decided to ignore this issue, and as a result we have this misunderstanding. Please do something to resolve this problem. I think the best way is to add an option to count with or without redirects. --A1 (talk) 09:49, 1 February 2020 (UTC)

@A1: The underlying API used by Topviews does not account for redirects. Sorry! This is a larger issue with the pageviews API in general. phab:T121912 is the relevant task. In the meantime, you can run pages individually through Redirect Views, but we cannot automatically do this for every result in Topviews. Regards, MusikAnimal (WMF) (talk) 20:14, 2 February 2020 (UTC)
Please do something to fix it. It's possible to run pages individually to analyze a few pages, but it is very difficult to get a top 100, for example, because you have to check at least 100 pages manually. The same problem seems to occur when using Massviews to get a top list within a given category. A1 (talk) 11:15, 3 February 2020 (UTC)
Indeed. Massviews can be fixed, but it might mean it will crash for some users. I hope to add an option "include redirects", off by default. This is tracked at phab:T200256. As for Topviews, it's impossible to include redirects with assurance the rankings are correct. The underlying API gives only the top 1000 results. However there could be millions of pages. So, one of the pages below the top 1000 could end up having more pageviews than the others when redirects are included, but we wouldn't know about it. Hopefully that makes sense. I apologize for this caveat! MusikAnimal (WMF) (talk) 23:31, 3 February 2020 (UTC)
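The truncation problem described in the reply above can be illustrated with a small sketch. All titles, counts, and the redirect map below are made up for illustration, and a limit of 2 stands in for the API's limit of 1000; this is not how Topviews itself is implemented.

```python
from collections import Counter

def top_n(views, n):
    """Return the n (title, views) pairs with the most views, highest first."""
    return sorted(views.items(), key=lambda kv: kv[1], reverse=True)[:n]

def merge_redirects(views, redirects):
    """Add each redirect's views to its target page (hypothetical redirect map)."""
    merged = Counter()
    for title, count in views.items():
        merged[redirects.get(title, title)] += count
    return dict(merged)

# Hypothetical data: the full set of pages, which the API never exposes.
all_views = {"A": 900, "B": 800, "C": 500, "C_old_title": 450, "D": 300}
redirects = {"C_old_title": "C"}
LIMIT = 2  # stands in for the top-1000 limit

# What a client can do: merge redirects only within the truncated list.
truncated = dict(top_n(all_views, LIMIT))
ranking_from_truncated = top_n(merge_redirects(truncated, redirects), LIMIT)

# What the correct ranking would be with access to every page.
ranking_from_full = top_n(merge_redirects(all_views, redirects), LIMIT)

print(ranking_from_truncated)
print(ranking_from_full)
```

In this toy data, page "C" never appears in the truncated list, so a client has no way to learn that it overtakes "A" once its redirect is counted.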

Deleted pages in Topviews

Hi. Please look at this result. According to this, 4th most visited page on hywiki is hy:Մաշտոց Ա. Քհնյ. Արապաթլեան but that page does not exist since April 2019. --ԱշոտՏՆՂ (talk) 14:51, 4 February 2020 (UTC)

Feature request: top 1%, 10%

I use traffic reports to make all sorts of decisions.

There is symbolic value in editing anything that is top ranked. Having events and outreach to promote popular content is a big draw and a good way to set priorities. Without looking at traffic, people indiscriminately invest their time and resources in very popular content and almost unread content without distinction.

In English Wikipedia with 6 million articles, the top 10% is 600,000 and the top 1% is 60,000. These numbers are big enough that anyone, regardless of their interest, should be able to find something they like which is top-ranked by popularity.

I do not have an easy way to see how much traffic the #600,000 article is getting, but if I did, then I would know that anything with more traffic than that is top 10%. Similarly, I would like to know the amount of traffic to be editing top 1% content.

Historically, most outreach campaigns focused on getting anyone to make any edits, or to make new articles without considering how many people would read them. I think we could get new interest and more relevant outreach engagement if the community had easier access to some numbers.

Can pageviews report when an article is top 10% or 1%, and perhaps link to another page giving some general stats about broad ranking?

Thanks, I show Pageviews and Massviews off regularly in presentations. I showed Topviews off in a presentation today. These tools are a foundation in what I tell people about Wikipedia. Blue Rasberry (talk) 00:16, 6 February 2020 (UTC)

@Bluerasberry: Thank you for the kind words! Unfortunately, the underlying API used by Topviews only gives us the top 1,000 pages for a given project. You could write a program that uses the raw datasets to get the data you're seeking, but this is tedious and outside the scope of Pageviews Analysis, which relies entirely on the APIs provided by the Analytics team. As for editing, there is a list of the top-edited pages on Wikistats, though it does not have a namespace filter. Not a great answer to your question, I know. Sorry! MusikAnimal (WMF) (talk) 21:20, 6 February 2020 (UTC)
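As a rough sketch of the "write a program" route mentioned above: once per-article view counts have been extracted from the raw datasets (that extraction step is not shown, and the function name here is hypothetical), the traffic threshold for the top 10% or 1% is simply the view count at the corresponding rank.

```python
def threshold_for_top_percent(view_counts, percent):
    """Return the lowest view count that still ranks within the top `percent` of pages.

    `view_counts` is a list with one entry per article; `percent` is an integer
    from 1 to 100. Both names are illustrative assumptions.
    """
    if not view_counts:
        raise ValueError("view_counts must not be empty")
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ranked = sorted(view_counts, reverse=True)
    # Rank of the last page that is still inside the top `percent`.
    cutoff_rank = max(1, len(ranked) * percent // 100)
    return ranked[cutoff_rank - 1]

# Toy data: 100 articles with 1..100 views each.
counts = list(range(1, 101))
print(threshold_for_top_percent(counts, 10))  # views needed for the top 10%
print(threshold_for_top_percent(counts, 1))   # views needed for the top 1%
```

With 6 million articles, as in the request above, the same function would read off the counts at rank 600,000 and rank 60,000.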

Feature Request: Weekly numbers in date type

Screenshot of date range today

I would appreciate it if a weekly date type were added for viewing the data. Country differences might have to be considered (a calendar week in Germany, for example, runs Mon-Sun). --Frshmn (talk) 09:49, 31 January 2020 (UTC)

The Pageviews API only provides daily and monthly granularity. Sorry. MusikAnimal (WMF) (talk) 20:06, 2 February 2020 (UTC)
Ok valid argument. Where can I ask for configuration changes that allow weekly recording in the future? --Frshmn (talk) 09:43, 3 February 2020 (UTC)
I actually found a task at phab:T158901 that I forgot about. My proposal was to compute the weekly granularity clientside, meaning we wouldn't need any update to the underlying API. This is possible, but it will be a bit challenging to implement. The API itself would be the better place for this. I also found phab:T133575, which is about adding weekly granularity to the API for the most-viewed pages, specifically (I'm assuming your request is about the main Pageviews Analysis tool). MusikAnimal (WMF) (talk) 23:37, 3 February 2020 (UTC)
Thanks for digging in this topic. Is there anything I can do as a user to support the realization of either of these tickets? --Frshmn (talk) 15:48, 8 February 2020 (UTC)
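The idea of computing weekly granularity from daily data, as proposed above, can be sketched as follows. This is only an illustration of the aggregation step, assuming daily counts have already been fetched; it uses ISO weeks (Monday to Sunday), which matches the German convention mentioned in the request, and the function name is hypothetical.

```python
from collections import defaultdict
from datetime import date

def weekly_totals(daily_views):
    """Sum daily view counts into ISO weeks.

    `daily_views` maps datetime.date -> views for that day.
    Returns a dict mapping (iso_year, iso_week) -> total views.
    """
    totals = defaultdict(int)
    for day, views in daily_views.items():
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        totals[(iso[0], iso[1])] += views
    return dict(totals)

# Toy data: a Monday and Sunday of one week, then the following Monday.
sample = {
    date(2020, 1, 27): 10,
    date(2020, 2, 2): 5,
    date(2020, 2, 3): 7,
}
print(weekly_totals(sample))
```

Weeks that start on Sunday would need a different grouping key, which is part of why the country differences raised above matter.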
Resolved.

If I go to the following URL

https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&range=latest-90&pages=Dungeness_(headland)|Dungeness,_Washington|Dungeness_(album)|Dungeness_(Cumberland_Island,_Georgia)

and click "Permalink", I get the following copied into my clipboard:

https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2019-11-05&end=2020-02-03&pages=Dungeness_(headland)%7CDungeness,_Washington|Dungeness_(album)|Dungeness_(Cumberland_Island,_Georgia)

It looks like it's only replacing the first pipe with %7C. I guess this is just a regexp that's only replacing once when it should be replacing across the whole string. --Lord Belbury (talk) 14:59, 4 February 2020 (UTC)

Thanks for noticing this! Your guess is correct. Fortunately the permalink still worked. I think it's probably best to encode the pipes, only because some markdown (such as on Phabricator) uses pipes in the link syntax, which causes conflicts. Anyway I've got this fixed and it will go out with the next release. MusikAnimal (WMF) (talk) 04:56, 7 February 2020 (UTC)
Thanks for the fix. I was actually hitting some problems on a Wikipedia talk page yesterday where the unescaped version of the string broke when I tried to use it as a link (which led me to trying the Permalink button instead, and which I had to finish cleaning up manually), but I can't seem to replicate it this morning. --Lord Belbury (talk) 09:42, 7 February 2020 (UTC)
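For illustration, the "replace once" versus "replace everywhere" difference discussed above can be reproduced in a few lines. The tool's actual code is not shown in this thread, so this Python sketch only mimics the behavior: the optional count argument of `str.replace` stands in for a replacement that stops after the first match.

```python
# The `pages` parameter from the URL quoted above.
pages = (
    "Dungeness_(headland)|Dungeness,_Washington|"
    "Dungeness_(album)|Dungeness_(Cumberland_Island,_Georgia)"
)

# Buggy behavior: only the first pipe is encoded (count=1).
first_only = pages.replace("|", "%7C", 1)

# Intended behavior: every pipe is encoded.
all_pipes = pages.replace("|", "%7C")

print(first_only)
print(all_pipes)
```

The first result matches the string that "Permalink" copied to the clipboard in the report; the second is what a fully encoded permalink would contain.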

Pageviews not friendly to 4:3 monitors?

I was using Chrome 80 (Stable channel) on Windows 7 (no longer supported by Microsoft) on a 4:3 monitor. (My 16:9 monitor just broke internally.) Then I was checking out stats of just one page. Then suddenly, the vertical scrollbar just flickers, i.e. rapidly and cyclically appears and disappears every 0.001 seconds (estimate). Then the flickering stops when I adjust the browser (Chrome) window (or maximize the window).

I tried testing the same page out on IE11. I don't see the vertical scrollbar flickering, but then the daily view popup doesn't work there. I've not yet tested this out on other browsers, like Edge and Opera. Does anyone have 4:3 monitors still working? George Ho (talk) 19:41, 10 February 2020 (UTC); edited, 19:42, 10 February 2020 (UTC)

Massviews

Massviews consistently fails to process large categories over long time ranges, such as "living people" since 2015.--Maxaxax (talk) 09:35, 13 February 2020 (UTC)

I'm assuming you're referring to the English Wikipedia. w:Category:Living people has around one million category members, and you're asking for all-time stats, so unfortunately that query is indeed unlikely to finish :( I can try to find some ways to mitigate this, or at least give you some data, but I can't make any promises. The real solution I believe is introducing a full backend solution. That is being tracked at phab:T157830. Regards, MusikAnimal (WMF) (talk) 22:21, 17 February 2020 (UTC)

Can't see pageview statistics on emoji articles

Hi there,

When I try to run a report on the existing page 🍆, which I believe has more than zero visits (certainly my own), the query fails. A similar test for another single-character article, Ä, does yield results. The error I get is "🍆: Error querying Pageviews API - Not found." Queries for 🎉 and other emoji from this category seem to yield the same issue. Tested with an up-to-date Firefox and IE on an up-to-date Windows 10 system. Note that the examples are all Unicode v6.0, so not part of any new emoji set. Milliped (talk) 21:30, 14 February 2020 (UTC)

Further testing shows that I can't create a query for nl:Ş by entering the character in the search field but when I modify the URL manually it actually works. Milliped (talk) 12:28, 15 February 2020 (UTC)
@Milliped: Thanks for reporting this! It appears the underlying API stopped recording pageviews for emoji titles on April 23, 2019. For instance you see some data for 🍆 if you extend the date range. I have reported this bug at phab:T245468.

As for nl:Ş, it appears the wiki's search engine is using some heuristics to return results it thinks you want, when in this case they aren't. If I type in "Ş" into the search bar at nl.wikipedia.org, I don't see it in the result list there, either. Fortunately in Pageviews Analysis you can change the search method. Try going to the "Settings" and use the "No autocompletion" option under "Search method" (or even "Autocompletion including redirects" seems to work). As a side note, your wiki may wish to add a link to the Pageviews tool at nl:MediaWiki:Histlegend or nl:MediaWiki:Pageinfo-footer so that you will have a direct link. See the top of the URL structure documentation for some example code you can use on-wiki. Best, MusikAnimal (WMF) (talk) 22:17, 17 February 2020 (UTC)
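For anyone modifying the URL by hand, as described above, here is a minimal sketch of building a Pageviews Analysis link with non-ASCII titles percent-encoded. The parameter names (`project`, `platform`, `agent`, `range`, `pages`) are taken from URLs quoted elsewhere on this page; the helper function itself is hypothetical, and the URL structure documentation remains the authoritative reference.

```python
from urllib.parse import quote, urlencode

BASE = "https://tools.wmflabs.org/pageviews/"

def pageviews_url(project, pages, date_range="latest-20",
                  platform="all-access", agent="user"):
    """Build a Pageviews Analysis URL (hypothetical helper).

    `pages` is a list of page titles; they are joined with pipes and then
    percent-encoded as UTF-8, so "|" becomes "%7C".
    """
    params = {
        "project": project,
        "platform": platform,
        "agent": agent,
        "range": date_range,
        "pages": "|".join(pages),
    }
    return BASE + "?" + urlencode(params, quote_via=quote)

url = pageviews_url("nl.wikipedia.org", ["Ş"])
print(url)
print(quote("🍆"))  # UTF-8 percent-encoding of the emoji title
```

Whether the API has data for a given title is a separate question, as the emoji bug reported above shows.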

Feature request - visualize across renames / page moves / redirects

At WikiProject Medicine some of us are trying to report the traffic to various COVID-19 articles. The three main articles all have multiple odd names.

The pageviews tool is set up to only analyze one name at a time. For articles like these which have had renames, the analysis gives a view in error, because a search for the latest name returns a report only for that name and no views for previous names.

These articles and their views are interesting for off-wiki external review by epidemiologists and health reporters as Wikipedia's traffic has a place in the story of global health communication.

I would like to request a data visualization in the Pageviews system which includes the traffic for all previous names of the article and all redirect terms. Even if this is not possible for this disaster, there will be future disasters with the same need. Thanks. Blue Rasberry (talk) 20:25, 17 March 2020 (UTC)

@Bluerasberry: Easier said than done, unfortunately. Relevant task is phab:T141332. As an interim solution, you could use Redirect Views to include the pageviews of all redirects, e.g. [1]. Hope this helps, MusikAnimal (WMF) (talk) 22:55, 17 March 2020 (UTC)
Thanks, I will work with it. As I have described to you before, I see audience metrics as the chief justification for institutional partners to fund their Wikimedia collaborations. Most or all members of the Wikimedian in Residence Exchange Network justify their employer's engagements based on communication impact of the sort which these metrics suites provide.
I can work with the redirect scheme, and thanks for reminding me of it, but if you or your department ever needs a justification to do more metrics development then just ask and I will get it for you. Blue Rasberry (talk) 20:07, 18 March 2020 (UTC)
@Bluerasberry: There is now an "Include redirects" option on the main Pageviews app. Let me know if you run into any problems. Next is to add redirect support to the other apps such as Langviews and Massviews. Best, MusikAnimal (WMF) (talk) 16:37, 24 March 2020 (UTC)
I should also mention there is an "Always include redirects" option available in the Settings. Just keep in mind this can cause the tool to slow down, depending on how many redirects there are. Best, MusikAnimal (WMF) (talk) 19:02, 24 March 2020 (UTC)
@MusikAnimal (WMF): I was ready to accept not having this. Thanks for making the miracle. I presented this feature at en:Wikipedia_talk:WikiProject_Medicine#New_feature_-_pageviews_-_now_can_include_redirects and will also put it in the next issue of The Signpost. I am very grateful. Blue Rasberry (talk) 23:26, 24 March 2020 (UTC)

Pageviews incomplete (cf topviews)

I notice that pageviews and topviews give differing results for topics. Checking on the article "sw:Virusi vya Corona"

  1. Pageviews show something above 800 views
  2. topviews show 809 views for "Virusi vya Corona" but above it 2502 views for "sw:Virusi vya corona" (small c). We often have redirects because orthography and specially use of capital letters is very much in flux in Swahili (and probably in many African languages). The article had been started with "corona" and was then moved to "Corona".

So I guess any results obtained via Pageview Analysis do not give a correct result for topics if there are redirects around. Can we have a setting that Pageview Analysis accepts also redirects?

TOPVIEWS LINK: https://tools.wmflabs.org/topviews/?project=sw.wikipedia.org&platform=all-access&date=yesterday&excludes=xss

PAGEVIEWS LINK: https://tools.wmflabs.org/pageviews/?project=sw.wikipedia.org&platform=all-access&agent=user&range=latest-20&pages=Virusi_vya_Corona (all in Firefox) — The preceding unsigned comment was added by Kipala (talk)

@Kipala: Same as above; try using Redirect Views for now. I am working to get the main pageviews app to support redirects, too, but this will take a while, perhaps a week or two. MusikAnimal (WMF) (talk) 23:05, 19 March 2020 (UTC)
@Kipala: An "Include redirects" option is now available for Pageviews Analysis. See [2] for example. There is an "Always include redirects" option available in the "Settings". Unfortunately we won't be able to offer this feature for Topviews, due to technical limitations. Thanks and let me know if you have any questions or concerns, MusikAnimal (WMF) (talk) 19:01, 24 March 2020 (UTC)

Glitch in pageviews?

https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&redirects=0&range=this-year&pages=2b2t

221k views out of nowhere for no reason? This must be a glitch of some sort. Melofors (talk) 06:55, 29 March 2020 (UTC)

@Melofors: Not a glitch. See the FAQ :) In short, it is very likely bot-inflated traffic (compare mobile web versus desktop). Best, MusikAnimal (WMF) (talk) 20:02, 29 March 2020 (UTC)