Talk:Spam blacklist/Archives/2018-02

From Meta, a Wikimedia project coordination wiki

Proposed additions

This section is for completed requests that a website be blacklisted

images.google.(tld)/imgres?



I'm suggesting adding the preview pages from Google Image Search, as they are similar to Google's normal web search redirector. Surprisingly, it seems this has not been discussed yet, though there are at least 800 such links on dewiki, 600 on enwiki and more than 1000 on commons. --Nenntmichruhigip (talk) 12:53, 15 September 2016 (UTC)

@Nenntmichruhigip: if there are that many links on those wikis, and the sites are neither blacklisted, nor the links removed, then it is not the place of meta to impose its own opinion. If you believe that we should continue the discussion then please raise the issue at enWP, deWP and Commons and point the users here to discuss.  — billinghurst sDrewth 16:26, 16 September 2016 (UTC)
I've been directed here from dewiki, and don't know where a suitable place on enwiki and commons would be. --Nenntmichruhigip (talk) 13:36, 17 September 2016 (UTC)
The occurrences on dewiki are cleaned up by now. On commons there has been an abuse filter catching new additions for two months, but afaict no ongoing cleanup of existing ones, despite quite a few copyvios. --Nenntmichruhigip (talk) 07:33, 27 September 2016 (UTC)
Until these other wikis address the matter here, it may be best to pursue a local blacklist addition at deWP.  — billinghurst sDrewth 10:19, 27 September 2016 (UTC)
Hi!
Those links can be used as sbl-circumvention.
As we normally blacklist all possible sbl-circumventions globally here at meta, in my opinion the first place to discuss blacklisting of that domain is here. -- seth (talk) 22:04, 27 September 2016 (UTC)
Agree, though we are not meant to be setting the overarching link policy without direct consultation. To further assist, I have put a bot clean up request to Commons. I have also copied over the filter from Commons to enWP to look at the new additions to monitor, and maybe start a conversation there.  — billinghurst sDrewth 23:11, 10 October 2016 (UTC)
@Nenntmichruhigip:  Declined at this point of time, stale request  — billinghurst sDrewth 07:36, 6 February 2018 (UTC)

g.co

See w:en:MediaWiki_talk:Spam-blacklist#google.co.in_shortener, reported by User:Ravensfire



This is ugly, this is one of google's link shorteners. Used across our projects a lot. However, it is prone to abuse as usual for shorteners. It is currently used in w:en:Akasa_Singh, where it redirects to a plain search result. If that is possible.... --Dirk Beetstra T C (en: U, T) 13:52, 29 November 2016 (UTC)

Standard linksearch does not give a lot .. but is informative. --Dirk Beetstra T C (en: U, T) 13:54, 29 November 2016 (UTC)
It appears that the url will indicate which google page/service the link is for. g.co/maps/ for maps, doodle for doodle, etc. g.co/kgs/ appears to be for searches, so if there is a need to keep g.co available for some services, blocking just the search may be useful. Ravensfire (talk) 15:37, 29 November 2016 (UTC)
For such a change of a commonly used url, I would have the expectation that (some|the) wikis would block it first, and we would follow. I do not see it blocked.  — billinghurst sDrewth 10:58, 3 December 2016 (UTC)
It is a url shortener .. they get blanket blacklisted on meta, not through local communities. We declined this earlier because people were arguing that it could only be used for google maps. It turns out that that is not the case, it has been (ab)used to link to search engine results (which is discouraged on some wikis, to say the least). --Dirk Beetstra T C (en: U, T) 03:32, 4 December 2016 (UTC)

Ping. Still think that this is/can be abuse. --Dirk Beetstra T C (en: U, T) 15:39, 5 July 2017 (UTC)

@Ravensfire:  Declined stale request  — billinghurst sDrewth 07:38, 6 February 2018 (UTC)
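
The path-scoped idea raised in this thread (blocking only g.co/kgs/ while leaving paths such as g.co/maps/ usable) can be sketched as a regular expression. This is a minimal illustration, not an entry that was ever added to the blacklist: the pattern, the function name `is_blocked`, and the sample short links are all assumptions made up for the example.

```python
import re

# Hypothetical pattern: target only the g.co search path, not the whole shortener.
# The leading \b keeps hosts that merely end in "g.co" (e.g. "blog.co") from matching.
GCO_SEARCH = re.compile(r"\bg\.co/kgs/", re.IGNORECASE)


def is_blocked(url: str) -> bool:
    """Return True if the URL contains the g.co/kgs/ search path."""
    return GCO_SEARCH.search(url) is not None


# Made-up sample links, for illustration only.
samples = [
    "https://g.co/kgs/abc123",  # search short link: matched
    "https://g.co/maps/xyz",    # maps short link: not matched
    "https://blog.co/kgs/abc",  # different host: not matched
]
results = {url: is_blocked(url) for url in samples}
```

A pattern like this would leave the remaining g.co services linkable, which is the trade-off Ravensfire describes; a blanket shortener block would simply drop the path component.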

webcache.googleusercontent.com/search



  • webcache.googleusercontent.com/search

Google Cache. Easy to use blacklist circumventer, links are also very short-lived. Incidence rate may grow as Google Chrome now provides an option to view cached pages when connecting to a site fails. Train2104 (talk) 05:30, 4 March 2017 (UTC)

Let us see what COIBot can dig up.  — billinghurst sDrewth 07:20, 4 March 2017 (UTC)
@Train2104:  Declined at this point of time; no evidence that there is abuse. Check back with the wikis what they want.  — billinghurst sDrewth 07:44, 6 February 2018 (UTC)

e-camm.ga





e-camm.ga redirects to wlpreview.dnxlive.com; both URLs host pornographic and morbid content. --VR0 (talk) 17:16, 27 March 2017 (UTC)

Only one addition to a (now deleted) page on es.wikipedia for the former of the two. But the editor was also on en.wikipedia, where they are blocked for spamming after creating one article with



Maybe a pattern. --Dirk Beetstra T C (en: U, T) 03:32, 28 March 2017 (UTC)

@VR0:  Declined at this time, stale request  — billinghurst sDrewth 07:38, 6 February 2018 (UTC)

Abbywinters.co



Sustained cross-wiki spam campaign of replacing abbywinters.com with abbywinters.co. See e.g. [1] and [2], [3]. Tgeorgescu (talk) 03:40, 26 October 2017 (UTC)

@Tgeorgescu: At this point, this looks as though it can be managed at enWP.  — billinghurst sDrewth 21:28, 26 October 2017 (UTC)
@Tgeorgescu:  Declined  — billinghurst sDrewth 07:39, 6 February 2018 (UTC)

twitter.com/search



  • Link/text requested to be blacklisted: twitter.com/search

Seeking opinion on the consequences of blocking the search function of twitter urls. I am starting to see more of it among the spambots, and if it isn't a needed url then we can wave away some of that rubbish a little more easily. [Noting that we already have a range of url search tweaks in the blacklist to stop this as an abuse means.] @Beetstra: do we have any decent means to identify the use of the search url only at the wikis to judge the consequence? Is this something we can determine from a query of the database?

@Billinghurst: I don't really see why we need this. This is one of the problems - search functions (of almost any website) are not suitable in content namespaces for anything, and IMHO could be blocked easily for that (especially if there is a significant background of abuse). Searching the LW-db is going to be heavy, it needs a search within a string, and is likely going to give a massive number of hits. Of course, an on-wiki search with Special:LinkSearch/*.twitter.com/search does not return only the /search links; it returns ALL twitter.com links ... --Dirk Beetstra T C (en: U, T) 05:40, 19 December 2017 (UTC)
I am presuming that you are meaning you are not certain why we need a check. I am just asking, and linking to or from twitter is not something in my interest or knowledge zone.  — billinghurst sDrewth 06:23, 19 December 2017 (UTC)
A check would be great, but currently I don't know of a quick way of doing this (what is current use of this link in content namespaces).
For twitter, we either link to the feed itself (as 'official website of a subject') or directly to a specific tweet (in case of a reference). I am not familiar with searching on twitter, but if the results resemble anything like linking to a google search .. then that has hardly any function (in mainspace) - neither as an external link ('here, use this search link to find her twitter feed'?) or as a reference ('look, if you search for this on twitter you can find her feed being mentioned here'?). En.wikipedia discourages use of search-results in any form (in mainspace), and I would favour blacklisting such links if it would not interfere with the very much needed use in the non-content namespaces (all deletion discussions, our LinkSummary, etc. etc.). It is one of the reasons why I want our blacklist functionality to be upgraded (google search results have been abused in the past, that is why we blacklisted the google/cse....). --Dirk Beetstra T C (en: U, T) 07:19, 19 December 2017 (UTC)
@Beetstra: Added to Spam blacklist.  — billinghurst sDrewth 07:49, 6 February 2018 (UTC)
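
Since a link search returns every twitter.com link rather than only the /search ones, one way to judge the impact would be to post-filter an exported list of links. The following is a minimal sketch under that assumption; the function name `is_twitter_search` and the sample URLs are hypothetical, and it does not query any Wikimedia database.

```python
from urllib.parse import urlsplit


def is_twitter_search(url: str) -> bool:
    """Return True for twitter.com (or a subdomain) URLs whose path is /search."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    on_twitter = host == "twitter.com" or host.endswith(".twitter.com")
    # Match "/search" exactly or as a path prefix, but not account names
    # that merely begin with "search".
    is_search = parts.path == "/search" or parts.path.startswith("/search/")
    return on_twitter and is_search


# Made-up links standing in for an exported link list.
links = [
    "https://twitter.com/search?q=example",
    "https://twitter.com/example_user/status/1",
    "https://mobile.twitter.com/search?q=example",
    "https://twitter.com/searchlight",
]
search_links = [url for url in links if is_twitter_search(url)]
```

Parsing the URL rather than substring-matching avoids counting profile or tweet links that happen to contain the word "search".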

saken.no



I have just removed 3 references to saken.no from a wiki page (https://en.wikipedia.org/wiki/Baneheia_murders). saken.no seems to be some kind of autogenerated web page where the content is taken from the norwegian wikipedia. Okay to add saken.no to the global blacklist? 84.211.118.128

Also, those references I removed did not make sense. It's not impossible that they had been auto-added by a bot, or at least manually added by someone that wanted to generate traffic to his web page. 84.211.118.128
 Declined no evidence of xwiki abuse. If it is problematic at enWP, then please request blacklisting there.  — billinghurst sDrewth 07:45, 6 February 2018 (UTC)

wikkimedia.com



Note, this is not Wikimedia. The name Wikkimedia is obviously intended to prey upon the similarity. Although I have just encountered one instance of this being added to Wikipedia (here), I think this is such egregious bad faith with potential for harm and confusion that it should be immediately blacklisted. Edgar181 (talk) 14:52, 21 February 2018 (UTC)

@Edgar181: typosquatting, Added to Spam blacklist. --Dirk Beetstra T C (en: U, T) 17:37, 21 February 2018 (UTC)

apyoth.com

Cross-wiki spamming targeting cryptocurrency articles. Deferred to here from enwiki (w:en:MediaWiki_talk:Spam-blacklist#apyoth.com). Edgar181 (talk) 13:51, 20 February 2018 (UTC)



Users
@Edgar181: Added to Spam blacklist. --Dirk Beetstra T C (en: U, T) 13:41, 21 February 2018 (UTC)

cafemom.com/search



  • Link/text requested to be blacklisted: cafemom.com/search

being used by spambots, unneeded search url for wikipedias.  — billinghurst sDrewth 10:18, 28 February 2018 (UTC)

Added  — billinghurst sDrewth 10:19, 28 February 2018 (UTC)


Proposed removals

This section is for archiving proposals that a website be unlisted.

tezukainenglish.com



Hi all. I wanted to write an article about an anime, but the tezukainenglish.com website from where I wish to use some references is on blacklist. Apparently it was added here. If the cause does not hold anymore, can someone please remove it? Thanks, Whitepixels (talk) 19:15, 14 June 2017 (UTC)

It is a fan site, and one that was spammed. The Wikipedias look sideways at fan sites; and we look sideways at spam. I would suggest asking at the wiki where you wish to have the article, and seeing whether they can whitelist the url that you wish to utilise.  — billinghurst sDrewth 06:15, 15 June 2017 (UTC)
Thanks for your reply. Whitepixels (talk) 11:36, 17 June 2017 (UTC)
@Billinghurst: I see no sign of spamming in the report; link additions are either made as part of major content expansions or done by experienced users. And most Wikipedias do not look sideways at fansites when those are the best available content for users who don't speak Japanese. You can see that from the link still being in the dewiki Tezuka article, for example (having been added more than 3 years ago). --Tgr (talk) 14:42, 17 June 2017 (UTC)
@Tgr: It would have been spammed as I quickcreated the report, and that would have been in response to spambots, and repeated spambots activity; and a review of whether it was being used at wikis. I don't remember anything particular about the particular case. Where a site has been added for spambot activity to sites it has been the practice to ask users to seek whitelisting as the means to graduate our way out of blacklisting. 1) It shows that communities want the link, and 2) that sites can have it and not be spammed with it. Re sideways, w:Wikipedia:External links is my guide; it is neither a ringing endorsement nor an invitation to add fan sites.  — billinghurst sDrewth 14:56, 17 June 2017 (UTC)
I Support the removal of this site from blacklist. It's a good information source for Tezuka-related articles, and its blocking is really frustrating. --Sasuke88 (talk) 13:52, 31 December 2017 (UTC)

Support Per Sasuke88, who is an expert on these themes. Bencemac (talk) 15:26, 1 January 2018 (UTC)

@Whitepixels: Removed from Spam blacklist.  — billinghurst sDrewth 07:45, 6 February 2018 (UTC)

daviscountyutah.gov



Hello. I would like to use that website as a reference, it's a county government website, but it has been blocked since 2015 because

has a redirect service that is being abused

Is the problem solved now? Thanks, October Ends (talk) 12:03, 20 February 2018 (UTC).

@October Ends: Removed from Spam blacklist. Also answered over at frWP.  — billinghurst sDrewth 12:25, 20 February 2018 (UTC)

Troubleshooting and problems

This section is for archiving Troubleshooting and problems.

Discussion

This section is for archiving Discussions.

coastcarevic.files.wordpress.com



In particular I want to reference: https://coastcarevic.files.wordpress.com/2015/09/coastline_32.pdf in https://en.wikipedia.org/wiki/Froggatt_Awards.

coastcarevic.files.wordpress.com is the official web site / blog for a state Government sponsored community engagement initiative.

It appears to be blocked due to the global block: \bfiles\.wordpress\.com\b, which I cannot find in the global block list ?

Aoziwe (talk) 11:39, 9 February 2018 (UTC)

@Aoziwe:  not globally blocked. The domain was blocked a while back, for a couple of weeks or so, though is no longer, and is explicitly whitelisted at enWP these days anyway. Noting that the domain is blacklisted locally due to spam efforts.  — billinghurst sDrewth 03:06, 10 February 2018 (UTC)
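
To see why a pattern such as \bfiles\.wordpress\.com\b would catch the URL in question, the match can be reproduced with a plain regular-expression search. This is only an illustration of the pattern quoted above using Python's `re` module; it makes no claim about how the blacklist extension compiles or applies its entries.

```python
import re

# The pattern quoted in the block message above.
PATTERN = re.compile(r"\bfiles\.wordpress\.com\b")

url = "https://coastcarevic.files.wordpress.com/2015/09/coastline_32.pdf"
match = PATTERN.search(url)

# The word boundary sits between "." and "files", so any
# <name>.files.wordpress.com host is caught by the same entry.
matched_text = match.group(0) if match else None

# A host where "files" is only the tail of a longer label is not matched,
# because there is no word boundary inside "profiles".
other = PATTERN.search("https://profiles.wordpress.com/")
```

Because the entry is not anchored to a particular subdomain, whitelisting a specific host or URL locally is the usual way around it, as the reply above notes for enWP.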

Beta Cluster spamming of 2018-02-18











MarcoAurelio (talk) 11:34, 18 February 2018 (UTC)

I'm adding all except medium.com. —MarcoAurelio (talk) 22:59, 21 February 2018 (UTC)
This section was archived on a request by: —MarcoAurelio (talk) 11:07, 26 February 2018 (UTC)

Beta Cluster spamming of today









MarcoAurelio (talk) 23:26, 15 February 2018 (UTC)

Added. —MarcoAurelio (talk) 23:26, 15 February 2018 (UTC)
This section was archived on a request by: —MarcoAurelio (talk) 11:07, 26 February 2018 (UTC)

Thoughts about blacklisting .club/ .space/ and .website/







Call me a curmudgeon, however, the repeated and prolific spambotting for .club, and now the propagating spam for .space domains, has me thinking that for the vast bulk of wikis these two top level domains are just junk that we should block early, and forcefully. I have yet to see a useful link in either of those domains, however, that could be me living a sheltered life. What are people's thoughts, as we may just be seeing early opportunistic spam from the same set of spambots as they migrate their activities?  — billinghurst sDrewth 23:03, 13 February 2018 (UTC)

Agree. All domains with .club and .space should be blacklisted. All these fake accounts already spam at id.wikisource. RaymondSutanto (talk) 17:06, 15 February 2018 (UTC)
I did that some days ago on the Beta Cluster (\.space and \.club) due to the enormous amount of spam there. I see nothing positive coming from those. +1 from me. Specific links can be whitelisted if useful. —MarcoAurelio (talk) 18:31, 15 February 2018 (UTC)
Also .website seems to be getting spammed. See the Beta Cluster section above. —MarcoAurelio (talk) 23:27, 15 February 2018 (UTC)
Adding .website/ to subject line. Also noting that I blocked these top level domains a while back at enWS, though it is easier to achieve at such a site.  — billinghurst sDrewth 01:04, 16 February 2018 (UTC)
To note that I am flagging this discussion to enWP, deWP, frWP and Commons.  — billinghurst sDrewth 01:20, 16 February 2018 (UTC)
Test first (I read here about a beta environment where this is being tested now) if blocking helps. My prediction is that the spambots will need less than a week to switch from club to com. Three weeks tops. If I can be proven wrong, go ahead. Alexis Jazz (talk) 01:49, 16 February 2018 (UTC)
I think we should use a global abusefilter instead (which disallows only non-established users).--GZWDer (talk) 08:06, 16 February 2018 (UTC)
Hi!
AFAIK there hasn't been any major recent spamming in deWP via .club, .website or .space domains. Maybe at the weekend I can spare some time to search the replica database at wmflabs for those TLDs, just to see how many links would be affected. Or is there anybody else who has the time for this? -- seth (talk) 08:15, 16 February 2018 (UTC)
@Lustiger seth: that would be useful in both seeing how many useful links are present, as well as spam that is present. Maybe it is the case that we are catching most of the spam with existing filters, and only a little bit of tidy up. It may be that the spam is only at certain wikis, that is not unknown. There is nothing urgent with this request, eyes wide open is far better with these cases.  — billinghurst sDrewth 11:07, 16 February 2018 (UTC)
quarry:query/24816 gives an overview of links to those TLDs in dewiki. Only 151 links in total for all three (http+https). —MisterSynergy (talk) 11:29, 16 February 2018 (UTC)
Thanks. :-)
So with these stats it's quite obvious that .club should not be banned. And even the other two TLDs seem to be useful and it would be too much work to use the white list here, see quarry:query/24829 (enwiki: .space and .website). IMHO global blacklisting would not be a benefit. GZWDer suggested using the edit filter. I guess this would be less destructive than global blacklisting. -- seth (talk) 12:30, 18 February 2018 (UTC)
Some (to all?) of the current spam is being caught by a global filter, which is why the question arises. So unless we start to get more successful breakthrough, there is no urgent requirement for an additional filter. If it becomes more problematic (i.e. greater breakthrough of the existing filter) we can look to target it with a new-user spam global filter. Thanks for the feedback.  — billinghurst sDrewth 13:03, 18 February 2018 (UTC)
question to all: are these genuine websites, or are they hidden redirect or front-ends. --Dirk Beetstra T C (en: U, T) 07:25, 19 February 2018 (UTC)
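
The kind of per-TLD tally discussed in this thread can be sketched on a plain list of URLs. This is not the Quarry query referenced above (its SQL is not shown here); the regular expression, the function name `count_by_tld`, and the sample links are assumptions for illustration only.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical pattern: the three TLDs under discussion, anchored to the end of the host.
TLD_RE = re.compile(r"\.(club|space|website)$", re.IGNORECASE)


def count_by_tld(urls):
    """Count links whose hostname ends in one of the discussed TLDs."""
    counts = Counter()
    for url in urls:
        host = urlsplit(url).hostname or ""
        found = TLD_RE.search(host)
        if found:
            counts[found.group(1).lower()] += 1
    return counts


# Made-up links standing in for a wiki's external-link list.
links = [
    "http://example.club/page",
    "https://news.example.space/",
    "https://example.com/club/",  # "club" appears only in the path: not counted
    "http://shop.example.website/a",
    "https://EXAMPLE.CLUB/",
]
tld_counts = count_by_tld(links)
```

Matching against the parsed hostname rather than the whole URL is what keeps ordinary .com links with "club" or "space" in the path out of the count, which matters when weighing how many legitimate links a TLD-wide block would affect.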

vivirasturias.com



vivirasturias.com is a travel and tourism agency that has a biographies section on its website (vivirasturias.com/biografias-asturias). Each page carries a copyright notice that says: "© EuroWeb Media SL. EuroSeb Media, SL es el único titular de los derechos de propiedad intelectual de textos y fotografías de vivirasturias.com" (in English: "EuroSeb Media, SL is the sole owner of the intellectual property rights to the texts and photographs of vivirasturias.com"). However, on Spanish Wikipedia we have verified that its articles are copies (plagiarism) of preexisting Wikipedia articles (see here), mainly from the Spanish, Catalan and Asturian editions (in many cases from the December 2016 versions), sometimes run through automatic translators and sometimes with a few words changed. Despite the massive plagiarism, within a year the site was linked +600 times in es.wiki, +230 times in ca.wiki, +50 times in en.wiki, +30 times in ast.wiki, and it is also linked in other Wikipedias, in some cases linking directly to the home page. The site is using Wikipedia dishonestly to attract visitors to its website and then offer its tourist services on each page. Instead of making the same request at each affected Wikipedia, I request its inclusion in the global blacklist, especially because any honest editor (of any language) who does not know about the plagiarism could use it as a reference, or even think that it was Wikipedia that plagiarized the website. --Metrónomo-Goldwyn-Mayer 05:35, 8 February 2018 (UTC)

@Metrónomo: why do I have the feeling that we saw something like this before - is there a related domain blacklisted? Is it asturiasenimagenes.com ?? --Dirk Beetstra T C (en: U, T) 08:41, 25 February 2018 (UTC)
The two sites are not related. vivirasturias.com is the property of EuroWeb Media (a tourism agency) and asturiasenimagenes.com is the property of Hotel Hernán Cortés de Gijón (a hotel). By the way, I want to rectify my request. Apparently, in June 2016 when they renewed the domain they changed the technical contact and then the website was redone. All the plagiarism of Wikipedia is listed in vivirasturias.com/biografias-asturias, but these pages do not share the same subdomain. I cannot find a pattern that separates the new pages from the old ones... I am confused. --Metrónomo-Goldwyn-Mayer 11:15, 25 February 2018 (UTC)
It is possible that this feeling arises because spamming by tourism agencies is very common. They create a page with basic information about a locality and then add it as an external link in Wikipedia, thus attracting visitors to their site, where they then offer their services. It is also linked as a reference by honest users, because it usually appears near the top of search engine results when you enter the name of a locality. This site has existed since 2001, which is why we link to it so much in Wikipedia; we have been directing our readers towards their business (un)consciously for years. --Metrónomo-Goldwyn-Mayer 11:31, 25 February 2018 (UTC)
@Metrónomo: OK, I must have had something else in mind. I'll try to remember.
So, is it enough to blacklist:
  • Link/text requested to be blacklisted: vivirasturias.com/biografias-asturias
? The rest of the site is OK, but a significant part of the biographies are copied from Wikipedia? --Dirk Beetstra T C (en: U, T) 11:04, 26 February 2018 (UTC)

href.li



Apparently a redirect site. —MarcoAurelio (talk) 22:02, 26 February 2018 (UTC)

Added. —MarcoAurelio (talk) 12:25, 27 February 2018 (UTC)

Spambots 2018-02-27



Spambot fest at id.wiktionary. —MarcoAurelio (talk) 22:03, 27 February 2018 (UTC)

Added. —MarcoAurelio (talk) 22:03, 27 February 2018 (UTC)