
Consolidate, simplify and cleanup data collection relating to Special:MobileOptions
Closed, Resolved | Public | 3 Estimated Story Points

Description

We will clean up and clarify the information we log with regard to the beta mode.

Acceptance criteria

Done in https://gerrit.wikimedia.org/r/#/c/404495/.

Done in https://meta.wikimedia.org/w/index.php?title=Schema_talk%3AMobileOptionsTracking&type=revision&diff=17650781&oldid=13302734.

  • Notify Analytics Engineering that all MobileOptionsTracking data can be dropped.

See T185339: Archive and drop the MobileOptionsTracking EventLogging MySQL table.

Background

  1. We have Schema:MobileOptionsTracking. This is meant to track activity on the beta field (e.g. opt-in and opt-out) and lives inside Special:MobileOptions. It uses server-side event logging.
  2. We also log opt-ins to beta/stable via wfIncrStats (statsd) for mobile.opt_in_cookie_set and mobile.opt_in_cookie_unset. This lives inside MobileContext.
  3. When a user is logged in and opts into beta, we record this in their preferences (https://phabricator.wikimedia.org/T67079#3815127).

The current implementation of the new form in the specialpages branch introduces asynchronous saving via JavaScript. This means it cannot use server-side event logging, i.e. it breaks #1. If we want to continue using #1, we'll need to create some new client-side code to log this. It does not break #2 or #3.

Which of these do we actually need and what can be removed?

Consider the following questions that we want to answer about the mobile site's beta mode:

  • The changes in current (and historical) number of users in the beta feature over time
  • The ratio of pageviews with beta mode enabled to all pageviews of the mobile site 👍

Closed Questions

If we want to understand how people interact with the form pages differently after the redesign (T67079), we will likely need to track these interactions directly, as it may not be possible to isolate their impact on the beta pageview ratio and/or the number of beta users (for which we have data, see above).

We need to work out how to do this with statsd, EventLogging, or something new. The EventLogging instrumentation is broken and implemented on the server side. Fixing this would require fixing the instrumentation in the master branch AND rewriting it in the new branch. Statsd may be easier but may not measure exactly what we need, and it may be harder to debug (because it's not queryable by event).

@phuedx: This is answered/dismissed in T182235#3887715.

Event Timeline


@Tbayer can you take a look at this task and see if the strawman proposal makes sense?
Does statsd give you the sort of information you'd need or would it be more important to continue to support MobileOptionsTracking?

As a general note, we would need:

  • current (and historical) number of users in the beta feature
  • increase in usage over time
  • beta sessions/total sessions over time

ovasileva lowered the priority of this task from High to Medium. Dec 12 2017, 6:26 PM

I also wonder if we can use pageviews to do this ? Are we able to tell if a pageview has the optin cookie in its headers? (this would give us the most accurate picture of which of our users are in beta). That would need a talk with analytics.

> I also wonder if we can use pageviews to do this ? Are we able to tell if a pageview has the optin cookie in its headers? (this would give us the most accurate picture of which of our users are in beta). That would need a talk with analytics.

This is a good point. I don't think this information is available as part of the webrequest (or pageview) data currently. But it could conceivably be added to the X-Analytics header (somewhat similar to the existing loggedIn flag for logged-in pageviews) and then queried in the webrequest table.

Doing that would be considerably less effort than rewriting the event logging schema but a little more than relying on the existing statsd.

> I also wonder if we can use pageviews to do this ? Are we able to tell if a pageview has the optin cookie in its headers? (this would give us the most accurate picture of which of our users are in beta). That would need a talk with analytics.
>
> This is a good point. I don't think this information is available as part of the webrequest (or pageview) data currently. But it could conceivably be added to the X-Analytics header (somewhat similar to the existing loggedIn flag for logged-in pageviews) and then queried in the webrequest table.

According to that page, we already do. The link and the contact information are out of date but [there is a part of the MobileFrontend codebase that adds a mf-m=b to the X-Analytics header](https://phabricator.wikimedia.org/diffusion/EMFR/browse/master/includes/MobileContext.php;895d6cf2eadd7affa2e38ec2d8be3e789483f0a0$1114).

Update

I've been bold and updated the X-Analytics documentation on wikitech.
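
As a rough illustration of the flag discussed above: X-Analytics is a semicolon-separated list of key=value pairs, so the beta flag can be read by parsing the header instead of substring matching. This is only a sketch, not MobileFrontend or Analytics code; `parse_x_analytics` and `is_mobile_beta` are hypothetical helper names, the sample header values are made up, and it assumes the beta value is exactly `b`.

```python
def parse_x_analytics(header):
    """Split an X-Analytics style header ("k1=v1;k2=v2") into a dict."""
    pairs = {}
    for item in header.split(";"):
        item = item.strip()
        if not item or "=" not in item:
            continue  # skip empty or malformed segments
        key, value = item.split("=", 1)
        pairs[key] = value
    return pairs


def is_mobile_beta(header):
    """True when the header carries mf-m=b (assumed marker for beta mode)."""
    return parse_x_analytics(header).get("mf-m") == "b"


# Made-up sample values, for illustration only.
print(is_mobile_beta("ns=0;loggedIn=1;mf-m=b"))  # True
print(is_mobile_beta("ns=0;https=1"))            # False
```

Unlike a `like '%mf-m=b%'` filter, an exact key lookup would not match a hypothetical longer value that merely starts with `b`.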

Following on from the above, it does seem like the mf-m=b part of the X-Analytics header is making its way to the wmf.webrequest table:

[0]
select
  count(*) as n
from
  wmf.webrequest
where
  year = 2017
  and month = 12
  and day = 13
  and hour = 0
  
  and x_analytics like '%mf-m=b%'
;

+-------+
|   n   |
+-------+
| 3954  |
+-------+

Putting [0] in context:

[1]
select
  count(*) as n,
  count(case when x_analytics like '%mf-m=b%' then 1 else null end) as n_beta
from
  wmf.webrequest
where
  year = 2017
  and month = 12
  and day = 13
  and hour = 0
  
  and access_method = 'mobile web'
  
  and webrequest_source = 'text'
  and agent_type = 'user'
;

+-----------+---------+
|     n     | n_beta  |
+-----------+---------+
| 40139378  | 3954    |
+-----------+---------+

Roughly 0.01% (3,954 of 40,139,378) of requests to the mobile site in that hour were made by a user who had opted into the beta mode.
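
For reference, the percentage follows directly from the two counts returned by [1]. A minimal sketch of the arithmetic (the variable names are mine, the numbers are the query output above):

```python
n_requests = 40_139_378  # n from query [1]
n_beta = 3_954           # n_beta from query [1]

# Share of requests carrying mf-m=b, expressed as a percentage.
beta_share_pc = 100 * n_beta / n_requests
print(f"{beta_share_pc:.4f}%")  # 0.0099%
```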

@Tbayer: With this in mind, does it make sense to remove all of the other instrumentation? /cc @ovasileva

Looks like the task needs more discussion.

I think our event logging is broken anyway as on SpecialMobileOptions:244 we log the event, when there is an error. Later on lines 253/259 we change the beta to [on|off] - but we do not send the event.

Thanks for adding those questions, @ovasileva.

> changes in current (and historical) number of users in the beta feature over time

FTR @ovasileva and I plotted the metrics we send to StatsD in this graph: https://grafana.wikimedia.org/dashboard/db/mobile-dashboard?orgId=1&panelId=1&fullscreen

> ratio of pageviews to beta users over time

As @Jdlrobson and @Tbayer have suggested, this information is already available (and has been available for a long time!). We can rerun [1] in T182235#3833724 over whatever period we like with Analytics Engineering's blessing.

Given @pmiazga's additional information that the EventLogging instrumentation has been broken for some time, my recommendation would be to remove it, mark the schema as inactive, and ask Analytics Engineering to drop the table.

> @Tbayer can you take a look at this task and see if the strawman proposal makes sense?
> Does statsd give you the sort of information you'd need or would it be more important to continue to support MobileOptionsTracking?

We talked about this last week (in standup or grooming on Tuesday). IIRC @phuedx pointed out that statsd has the downside of not being queryable and in particular not retaining some dimensions that might often be of interest and which one gets for free in EventLogging, such as OS or browser version. That said, it doesn't sound like these will be too important in the context of the particular question that arose recently, i.e. how to track the impact of T67079.

> I also wonder if we can use pageviews to do this ? Are we able to tell if a pageview has the optin cookie in its headers? (this would give us the most accurate picture of which of our users are in beta). That would need a talk with analytics.
>
> This is a good point. I don't think this information is available as part of the webrequest (or pageview) data currently. But it could conceivably be added to the X-Analytics header (somewhat similar to the existing loggedIn flag for logged-in pageviews) and then queried in the webrequest table.
>
> According to that page, we already do. The link and the contact information are out of date but [there is a part of the MobileFrontend codebase that adds a mf-m=b to the X-Analytics header](https://phabricator.wikimedia.org/diffusion/EMFR/browse/master/includes/MobileContext.php;895d6cf2eadd7affa2e38ec2d8be3e789483f0a0$1114).
>
> Update
>
> I've been bold and updated the X-Analytics documentation on wikitech.

Ah, excellent. (I had looked for the string "beta" in the X-Analytics field for a sample of webrequests, but not for "mf-m=b" ;)

> Given @pmiazga's additional information that the EventLogging instrumentation has been broken for some time, my recommendation would be to remove it, mark the schema as inactive, and ask Analytics Engineering to drop the table.

Does that mean that the MobileOptionsTracking instrumentation has never (in its entire lifetime since spring 2014) produced valid data? Otherwise there might still be value in preserving the earlier data.

This comment was removed by phuedx.

I guess that's basically the same graph as mentioned last week in T182235#3837047, right? (Interesting BTW that there almost always seem to be more opt-outs than opt-ins, cf. last 90 days.)

> I guess that's basically the same graph as mentioned last week in T182235#3837047 , right?

I forgot that I'd noted that we'd created the dashboard!

> Interesting BTW that there seem almost always more opt-outs than opt-ins, cf. last 90 days.

IKR. I think it'd be worth sanity-checking that instrumentation too.

Change 399327 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/MobileFrontend@specialpages] Drop use of MobileOptionsTracking schema

https://gerrit.wikimedia.org/r/399327

...

> Interesting BTW that there seem almost always more opt-outs than opt-ins, cf. last 90 days.
>
> IKR. I think it'd be worth sanity-checking that instrumentation too.

What would be the best way to do this? (BTW, this kind of situation is actually where it would be better to be able to query the data, i.e. to use EventLogging instead of statsd, where we have found and fixed many bugs that way. Not insisting on that here though - given that we seem to agree that stakes are currently low regarding these data questions, I'm OK with us investing our instrumentation energies elsewhere ;)

Note that data source #3 from the task description, on user preferences, is not affected by the decision to stop using the EL schema and rely on statsd to monitor opt-ins and opt-outs. We could still decide to monitor it (i.e. in regular intervals retrieve and store the number of logged-in users who have activated mobile beta on a particular wiki, or with more effort on all wikis), giving a different answer to question 1 ("changes in current (and historical) number of users in the beta feature over time").

I just repeated the query from T67079#3815127; it looks like we gained around 1700 logged-in mobile beta users in the last two weeks on enwiki alone. (Of course not all of those 70k+ are active.)

mysql:research@analytics-store.eqiad.wmnet [(none)]> SELECT COUNT(DISTINCT up_user) FROM enwiki.user_properties WHERE up_property = 'mfMode' AND up_value = 'beta';
+-------------------------+
| COUNT(DISTINCT up_user) |
+-------------------------+
|                   73010 |
+-------------------------+
1 row in set (2 min 58.43 sec)
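
If we did decide to monitor this at regular intervals, one possible shape is to run the same count on a schedule and append the result to a small snapshot table. The sketch below uses an in-memory SQLite database as a stand-in for the real replica; the `beta_user_snapshots` table, the `snapshot_beta_users` function, the sample rows and the date are all hypothetical.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
# Stand-in for the wiki's user_properties table, with made-up rows.
conn.execute(
    "CREATE TABLE user_properties (up_user INTEGER, up_property TEXT, up_value TEXT)"
)
conn.executemany(
    "INSERT INTO user_properties VALUES (?, ?, ?)",
    [(1, "mfMode", "beta"), (2, "mfMode", "beta"), (3, "language", "de")],
)
# Hypothetical table that accumulates one row per snapshot.
conn.execute(
    "CREATE TABLE beta_user_snapshots (snapshot_date TEXT, wiki TEXT, n_users INTEGER)"
)


def snapshot_beta_users(conn, wiki, day):
    """Count users with mfMode=beta and append the count to the snapshot table."""
    (n_users,) = conn.execute(
        "SELECT COUNT(DISTINCT up_user) FROM user_properties "
        "WHERE up_property = 'mfMode' AND up_value = 'beta'"
    ).fetchone()
    conn.execute(
        "INSERT INTO beta_user_snapshots VALUES (?, ?, ?)",
        (day.isoformat(), wiki, n_users),
    )
    return n_users


print(snapshot_beta_users(conn, "enwiki", date(2017, 12, 19)))  # 2
```

Plotting `n_users` against `snapshot_date` would then give the historical series that a single count cannot.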

> Interesting BTW that there seem almost always more opt-outs than opt-ins, cf. last 90 days.
>
> IKR. I think it'd be worth sanity-checking that instrumentation too.

I think I can explain what's happening here.
The stats counter is incremented every time MobileContext::setMobileMode is called, regardless of whether the value has changed.

The only place this method will get called in production is when the Special:MobileOptions form is submitted (SpecialMobileOptions::submitSettingsForm). So if somebody submits the form at Special:MobileOptions but has not changed whether they are in beta, something will still be logged.

Assuming these counters mean opt-ins and opt-outs is thus a little risky, as all they tell you is whether someone who visits Special:MobileOptions and submits the form ends up in stable or beta.

e.g. if I am in beta and just submit the form without doing anything, this will trigger mobile.opt_in_cookie_set.

Likewise, in stable, if I just submit the form without doing anything, I will trigger mobile.opt_in_cookie_unset.

So what's probably happening here is that, for some reason, people are submitting that form without changing the checkbox. Why? I'm not sure. Maybe the cookie doesn't get set (e.g. browser has disabled settings), or maybe they just don't understand what the form is for.
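
To make the effect concrete, here is a small simulation of the behaviour described above. It is not the MobileFrontend code: the two `record_*` functions are hypothetical, and the list of submissions is made up. The first function counts on every submit (as setMobileMode is described as doing), the second only when the mode actually changes.

```python
from collections import Counter


def record_on_every_submit(stats, new_mode):
    """Mimics the described behaviour: count on every submit, changed or not."""
    if new_mode == "beta":
        stats["mobile.opt_in_cookie_set"] += 1
    else:
        stats["mobile.opt_in_cookie_unset"] += 1


def record_on_change_only(stats, old_mode, new_mode):
    """Alternative: count only when the mode actually changes."""
    if old_mode != new_mode:
        record_on_every_submit(stats, new_mode)


# (mode before submit, mode after submit); made-up submissions.
submissions = [
    ("stable", "stable"),
    ("stable", "beta"),
    ("stable", "stable"),
    ("beta", "stable"),
]

every, changed = Counter(), Counter()
for old, new in submissions:
    record_on_every_submit(every, new)
    record_on_change_only(changed, old, new)

print(dict(every))    # {'mobile.opt_in_cookie_unset': 3, 'mobile.opt_in_cookie_set': 1}
print(dict(changed))  # {'mobile.opt_in_cookie_set': 1, 'mobile.opt_in_cookie_unset': 1}
```

With unchanged submissions from stable users in the mix, the first counter shows more "opt-outs" than "opt-ins" even though real opt-ins and opt-outs are balanced.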

As @pmiazga mentions, the EventLogging schema is indeed broken as it only logs on errors.
As @Tbayer mentions, logged-in beta users will continue to be registered in Preferences.

The analytics headers seem to be the most reliable way to get this so I would recommend getting rid of the broken things. I've submitted a patch that does exactly that: https://gerrit.wikimedia.org/r/399327

@ovasileva we need your help to work out how to proceed with this ^

phuedx reassigned this task from Tbayer to ovasileva.
phuedx updated the task description.

Whoops!

I just made some further tweaks to the task description based on my recollection from our January 2 meeting. (In particular, it's ultimately not my call whether we want to do the work necessary to answer questions about the impact of T67079. I had just said that if we want to answer them, we may need to instrument the opt-ins/opt-outs directly, as looking at the time series of the beta pageview ratio and/or the total number of logged-in users who opted in may not be sufficient for isolating that impact.)
I also wasn't quite sure what was meant by "How do we want to see this data? Graph? SQL query?"
A query is just a tool to extract the data ;) it can be presented in various ways, including graphs. (BTW if it's about creating graphs without having to write any query, we could perhaps ask the Analytics Engineering team if they might be willing to include this schema in their experiment of making EL data available in Pivot, which they already did for the Popups schema recently.)

We talked through this and concluded that implementing instrumentation would delay the timeline by at least 1 month. We are happy to use page views and settings form submissions to get signals of the impact of our change, with the caveat that we cannot rely on statements such as "beta opt ins increased by X percent!" We just want data to reassure us that our change was good.

We also want to update the existing graph with an explanation of what it shows.

FYI I've added per-week and per-month graphs to that dashboard but, as @Jdlrobson notes, we need to make sure that we've added a "How to read these graphs" section that details any gotchas.

ovasileva set the point value for this task to 3. Jan 10 2018, 6:31 PM

Change 404495 had a related patch set uploaded (by Phuedx; owner: Jdlrobson):
[mediawiki/extensions/MobileFrontend@master] Drop use of MobileOptionsTracking schema

https://gerrit.wikimedia.org/r/404495

Change 404495 restored by Phuedx:
Drop use of MobileOptionsTracking schema

Reason:
Nope nope nope.

https://gerrit.wikimedia.org/r/404495

@Jdlrobson: I've tested and +1'd the rebase. Cool?

Change 404495 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Drop use of MobileOptionsTracking schema

https://gerrit.wikimedia.org/r/404495

Change 399327 abandoned by Phuedx:
Drop use of MobileOptionsTracking schema

Reason:
Superseded by I365a31a2.

https://gerrit.wikimedia.org/r/399327

phuedx updated the task description.

Change 404703 had a related patch set uploaded (by Phuedx; owner: Phuedx):
[mediawiki/extensions/MobileFrontend@master] Hygiene: Remove ExtMobileFrontend::eventLog

https://gerrit.wikimedia.org/r/404703

Change 404703 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Hygiene: Remove ExtMobileFrontend::eventLog

https://gerrit.wikimedia.org/r/404703

Does this need QA...? If so how?

I've just repeated the same tests that I performed on my development instance during code review on the Beta Cluster: I tested that I could opt in and out of the beta mode both logged in and out.

> I think our event logging is broken anyway as on SpecialMobileOptions:244 we log the event, when there is an error. Later on lines 253/259 we change the beta to [on|off] - but we do not send the event.

As I started to write the email to Analytics Engineering, I decided to dig into this.

Instrumentation was added to Special:MobileOptions in rEMFRff663541943d: Card 1757: Add EventLogging to Special:MobileOptions. After a bit of searching (and marvelling at the power of git log -L), I found that it was broken in rEMFRc89f371ea9e7: Drop the disable images feature. That is, the instrumentation has been broken since Thursday, 15th June 2017. I'll ask that we keep the MobileOptionsTracking data recorded prior to that date.

@alexhollender asked about the percentage of mobile beta pageviews, so I re-ran the calculation from above (T182235#3833702), correcting the queries a bit (in particular restricting it to webrequests that are pageviews):

There are currently around 130k mobile beta pageviews per day, corresponding to about 0.05% of all mobile web pageviews.

I'm also updating the documentation with this.

SELECT 
  SUM(IF(x_analytics LIKE '%mf-m=b%',1,0))/7 AS mobile_beta_views_per_day,
  ROUND( 100*SUM(IF(x_analytics LIKE '%mf-m=b%',1,0))/SUM(1), 4) AS mobile_beta_pc
FROM
  wmf.webrequest
WHERE
  year = 2018
  AND month = 11
  AND day <= 7
  AND is_pageview
  AND agent_type = 'user'
  AND access_method = 'mobile web'; 

mobile_beta_views_per_day	mobile_beta_pc
130664.14285714286	0.0453
Time taken: 894.843 seconds, Fetched: 1 row(s)
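
As a quick sanity check on this output: mobile_beta_pc is computed as 100 * beta / total in the query, so 0.0453 is already a percentage (0.0453%), not a fraction. A small sketch of what that implies for total mobile web pageviews per day (variable names are mine; the result is approximate because the percentage is rounded):

```python
beta_views_per_day = 130664.14285714286  # mobile_beta_views_per_day
beta_pc = 0.0453                          # mobile_beta_pc, already a percentage

# Implied total mobile web pageviews per day.
total_views_per_day = beta_views_per_day / (beta_pc / 100)
print(f"{total_views_per_day / 1e6:.0f} million")  # 288 million
```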

And the same for logged-in views:

There are currently around 60k logged-in mobile beta pageviews per day, corresponding to 7% of all logged-in mobile web pageviews. A bit over a half of all mobile beta pageviews are by anonymous users.

SELECT 
  SUM(IF(x_analytics_map['mf-m'] = 'b',1,0))/7 AS mobile_beta_views_per_day,
  ROUND( 100*SUM(IF(x_analytics_map['mf-m'] = 'b',1,0))/SUM(1), 4) AS mobile_beta_pc
FROM
  wmf.webrequest
WHERE
  year = 2018
  AND month = 11
  AND day <= 7
  AND x_analytics_map['loggedIn'] IS NOT NULL
  AND is_pageview
  AND agent_type = 'user'
  AND access_method = 'mobile web';
  
mobile_beta_views_per_day	mobile_beta_pc
58894.28571428572	7.34
Time taken: 1582.184 seconds, Fetched: 1 row(s)
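
The logged-in versus anonymous split quoted above can be derived from the two per-day figures. A minimal sketch, assuming that every beta pageview without a loggedIn key in X-Analytics is anonymous (variable names are mine):

```python
all_beta_per_day = 130664.14285714286       # from the all-pageviews query
logged_in_beta_per_day = 58894.28571428572  # from the logged-in query

logged_in_share = logged_in_beta_per_day / all_beta_per_day
anonymous_share = 1 - logged_in_share
print(f"logged-in: {logged_in_share:.1%}, anonymous: {anonymous_share:.1%}")
# logged-in: 45.1%, anonymous: 54.9%
```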

The same numbers broken down by project:

wiki	all beta views /day	% beta	logged in beta views /day	logged in % beta
en.wikipedia596700.0465269176.9017
ar.wikipedia116700.2229356318.6706
ja.wikipedia73430.034445707.5832
es.wikipedia72030.033228798.0482
de.wikipedia48700.0317306210.2721
zh.wikipedia46960.07933709.8615
ru.wikipedia45050.03515926.6541
fr.wikipedia35040.025611143.755
fa.wikipedia29630.079111867.2568
it.wikipedia25150.020114414.03
id.wikipedia19860.04693997.0751
en.wiktionary18430.14778559.7201
bn.wikipedia14270.32823609.019
pt.wikipedia13750.01856856.3104
pl.wikipedia12620.02714034.5361
hi.wikipedia10450.069823312.4287
th.wikipedia7830.051433.6785
vi.wikipedia7000.05252918.0016
he.wikipedia6240.05625115.2011
ur.wikipedia6071.180839445.5642
Commons5750.05015746.7866
ko.wikipedia5650.04033888.4317
hu.wikipedia4850.053238.5521
m.mediawiki4791.41139416.31
nl.wikipedia4540.01752615.0125
uk.wikipedia4220.036628211.2484
sr.wikipedia3640.067830219.8593
az.wikipedia2900.060122314.3093
cs.wikipedia2470.02371094.609
sv.wikipedia2410.0129721.9715
fi.wikipedia2340.02031263.8207
el.wikipedia2330.03915011.4261
tr.wiktionary2110.435820560.9911
ar.wikisource2100.15794727.6897
simple.wikipedia2090.0558579.1175
ro.wikipedia2000.0261564.6218
ms.wikipedia1720.068423.0014
tr.wikipedia1720.0395857.0348
no.wikipedia1650.0262634.4547
tl.wikipedia1620.10128358.4247
my.wikipedia1541.80784113.4328
Wikidata1320.0919677.2273
eu.wikipedia1300.771411734.7163
ca.wikipedia1200.0538434.7957
pt.wiktionary1130.270510380.0661
Meta-wiki1080.05871073.8568
mr.wikipedia1070.06754614.6885

Limited to wikis with >= 100 mobile beta views per day.

Data via

SELECT 
  CONCAT(normalized_host.project,'.',normalized_host.project_family) AS wiki,
  INT( SUM(IF(x_analytics_map['mf-m'] = 'b',1,0)) / 7 ) AS mobile_beta_views_per_day,
  ROUND( 100*SUM(IF(x_analytics_map['mf-m'] = 'b',1,0))/SUM(1), 4) AS mobile_beta_pc,
  INT( SUM(IF(x_analytics_map['mf-m'] = 'b' 
    AND x_analytics_map['loggedIn'] IS NOT NULL,1,0)) / 7 )
    AS loggedin_mobile_beta_views_per_day,
  ROUND( 100*SUM(IF(x_analytics_map['mf-m'] = 'b' 
    AND x_analytics_map['loggedIn'] IS NOT NULL,1,0))/
    SUM(IF(x_analytics_map['loggedIn'] IS NOT NULL,1,0)), 4) 
    AS loggedin_mobile_beta_pc
FROM
  wmf.webrequest
WHERE
  year = 2018
  AND month = 11
  AND day <= 7 
  AND is_pageview
  AND agent_type = 'user'
  AND access_method = 'mobile web'
GROUP BY normalized_host.project, normalized_host.project_family
HAVING SUM(IF(x_analytics_map['mf-m'] = 'b',1,0)) / 7 >= 100 -- = mobile_beta_views_per_day 
ORDER BY mobile_beta_views_per_day DESC
LIMIT 1000;