Forum Moderators: phranque

Using <Location> but excluding some sites / pages

csdude55

5:06 am on Dec 20, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm playing around with mod_substitute, as suggested by @phranque. This is my sandbox script:

<Location "/">
AddOutputFilterByType SUBSTITUTE text/html
Substitute "s|(<body.*?>)|<script>var foo = 1, bar = 14;</script><script src='//example.com/lorem.js'></script>\n$1|iq"
</Location>

Or possibly reversing that and placing $1 before <script>. Either way, my plan is to place this in the /etc/apache2/conf.d/userdata directory, so that it applies to all of the sites on the server.

Now I have 2 modifications that I'd like to consider:

1. Let's say that I want to explicitly exclude certain websites. This would be easier than manually including sites, since I expect 99 out of 100 of them to use this. How would I modify <Location> to exclude them?

2. Let's say that I have a website that I want to use the substitution, but I have certain pages in that site that I want to exclude. Those pages are written in PHP, and already have a custom header:

header('X-Foo: true');

How would I exclude that from <Location>?

phranque

11:53 am on Dec 20, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



if those PHP-generated page requests eventually resolve to a .php file, you could try using a FilesMatch container with a regex to exclude those requests:
<Location "/">
<FilesMatch "\.php$">
...
</FilesMatch>
</Location>
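
A minimal, untested sketch of the "exclude" idea, assuming the goal is to run the substitution on everything except files whose names end in .php. The negative lookahead is only one way to express an exclusion in a FilesMatch regex. The block is shown at server or virtual-host level because the Apache documentation describes <Files>-type containers as nesting inside <Directory>, and nesting one inside <Location> is likely to be rejected at startup. The Substitute rule is simplified from the sandbox script above.

```apache
# Sketch only: apply SUBSTITUTE to HTML served from any file whose
# name does NOT end in .php. (?!...) is a PCRE negative lookahead;
# FilesMatch tests the file's base name, not the full URL path.
<FilesMatch "^(?!.*\.php$)">
    AddOutputFilterByType SUBSTITUTE text/html
    # Single-quoted HTML attributes avoid clashing with the double
    # quotes that delimit the directive argument.
    Substitute "s|(<body[^>]*>)|<script src='//example.com/lorem.js'></script>$1|iq"
</FilesMatch>
```

This excludes every .php file, so it only helps if all of the pages to be skipped are PHP and no PHP page needs the script.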

csdude55

1:24 am on Dec 21, 2023 (gmt 0)

I'm afraid that it's more complicated than that :-( The PHP script reads MySQL data that will be shown on the page, and if it contains what Google would consider "objectionable content" then I set a header that tells Ezoic to exclude the page from their ads.

I'd like to be able to use that same header to exclude this <script> for the same pages with objectionable content. So it wouldn't simply be a .php script, it would be a .php script that meets certain conditions.

If I set a custom header in PHP of header('X-Foo: true'), does that show up in Apache as %{X-Foo:true}? If so:

<Location "/">
# excluding lorem.com, ipsum.com, and if there's a header of X-Foo
<If "%{HTTP_HOST} ne 'lorem.com' && %{HTTP_HOST} ne 'ipsum.com' && -z req('X-Foo')">
AddOutputFilterByType SUBSTITUTE text/html
Substitute "s|(<body.*?>)|<script>var foo = 1, bar = 14;</script><script src='//example.com/lorem.js'></script>\n$1|iq"
</If>
</Location>

phranque

2:34 am on Dec 21, 2023 (gmt 0)

the If directive will not work for you here because it is evaluated very early in request processing, whereas you are looking at headers in the response, or possibly environment variables that would be set by your script (i.e., too late)

again, you need a solution that is an output filter here.
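
One possible shape for such a solution is mod_filter's FilterProvider, whose expression is evaluated when the output filter is first invoked rather than at <If> time. The following is an untested sketch, assuming Apache 2.4 with mod_filter and mod_substitute loaded, and assuming the X-Foo header set by PHP is already in Apache's response headers when the first chunk of output reaches the filter chain. BANNER is a hypothetical filter name chosen for illustration.

```apache
<Location "/">
    # Declare a "smart" filter; the name BANNER is made up for this sketch.
    FilterDeclare BANNER
    # Attach SUBSTITUTE only when the response is HTML and carries no
    # X-Foo header. resp() reads a response header; -z tests for empty.
    FilterProvider BANNER SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html| && -z resp('X-Foo')"
    # Insert the smart filter into the output chain for this location.
    FilterChain BANNER
    # Same idea as the sandbox rule, with single-quoted HTML attributes.
    Substitute "s|(<body[^>]*>)|<script>var foo = 1, bar = 14;</script><script src='//example.com/lorem.js'></script>$1|iq"
</Location>
```

If this behaves as hoped, a page that sends header('X-Foo: true') would pass through unmodified while every other text/html response gets the script. It is worth verifying on a test host before relying on it.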

phranque

2:41 am on Dec 21, 2023 (gmt 0)

also, wouldn't it be far easier to solve this in your PHP script?

csdude55

5:57 am on Dec 21, 2023 (gmt 0)

Blerg.

Real-world usage: I'm trying to offer free website hosting to locals in exchange for placing a single sticky banner at the bottom of the page. But I need them to NOT be able to remove it, which is why I'm using mod_substitute (per the other thread where you suggested the module).

BUT!

1. I have clients that are paying for hosting, too, and there wouldn't be an ad for them. Which is why I need to be able to exclude the substitution on a per-domain basis.

2. My own sites have sections that Google doesn't like, so I would also like to exclude those pages from showing the ad. I already send a PHP header to exclude them, so I was hoping to use the same header here.
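
For point 1, the host name is part of the request, so it should be available when <If> is evaluated even though response headers are not. A minimal, untested sketch, assuming Apache 2.4 expression syntax, with lorem.com and ipsum.com standing in for the paying clients' domains:

```apache
<Location "/">
    # Skip the substitution for the listed hosts (with or without "www."
    # and an optional port); every other host gets the script.
    <If "%{HTTP_HOST} !~ /^(www\.)?(lorem|ipsum)\.com(:\d+)?$/i">
        AddOutputFilterByType SUBSTITUTE text/html
        Substitute "s|(<body[^>]*>)|<script>var foo = 1, bar = 14;</script><script src='//example.com/lorem.js'></script>$1|iq"
    </If>
</Location>
```

The regex alternation keeps the exclusion list on one line; with many excluded domains it would get unwieldy, and a per-virtual-host override may be easier to maintain.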

I could manually place the <script> tag on MY sites, of course, but then I would have to exclude 100+ domains in the .conf AND remember to add new ones that I buy over time. So possible, yes, but it would be better if I could figure out a way to exclude them in a more automated way.

Can you think of anything else I can do to the PHP script that the Apache .conf could read here? Or does this run before PHP entirely?

phranque

8:54 am on Dec 21, 2023 (gmt 0)

if you want to examine any results of the PHP script in the apache environment you would have to implement an output filter.

csdude55

6:16 pm on Dec 21, 2023 (gmt 0)

Can you think of a way for Apache to read a DNS record before <Location>? In theory, I could set a TXT record that would determine whether to use mod_substitute.

phranque

11:00 pm on Dec 21, 2023 (gmt 0)

there is no way that i can think of to do this using an Apache directive but it should be trivial to check a DNS record in your PHP script.

csdude55

1:11 am on Dec 22, 2023 (gmt 0)

Unfortunately, that wouldn't help :-( There has to be SOME way of doing it since Ezoic is using the header to exclude things, but maybe it's proprietary.