Appendix E

Migration of Apache mod_rewrite Rules to Advanced Policies

RewriteRule ^MyPage\.html$ MyPage.20.html [L]

RewriteRule ^MyPage\.html$ MyPage.32.html [L]

NetScaler solution for browser-specific settings

add patset pat1

bind patset pat1 Mozilla/1

bind patset pat1 Mozilla/2

bind patset pat1 Lynx

bind patset pat1 Mozilla/3

add rewrite action act1 insert_before 'HTTP.REQ.URL.SUFFIX' '"NS."'

add rewrite action act2 insert_before 'HTTP.REQ.URL.SUFFIX' '"20."'

add rewrite action act3 insert_before 'HTTP.REQ.URL.SUFFIX' '"32."'

add rewrite policy pol1 'HTTP.REQ.HEADER("User-Agent").STARTSWITH_INDEX("pat1").EQ(4)' act1

add rewrite policy pol2 'HTTP.REQ.HEADER("User-Agent").STARTSWITH_INDEX("pat1").BETWEEN(1,3)' act2

add rewrite policy pol3 '!HTTP.REQ.HEADER("User-Agent").STARTSWITH_ANY("pat1")' act3

bind rewrite global pol1 101 END

bind rewrite global pol2 102 END

bind rewrite global pol3 103 END
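
The following Python sketch models how the three policies above select a page variant. It is an illustration only, not NetScaler code. The function names are hypothetical, and the sketch assumes that the patterns in pat1 receive indexes 1 through 4 in the order in which they were bound and that a User-Agent matching no pattern yields index 0.

```python
# Patterns in the order they were bound to pat1.
# Assumption: indexes are assigned in bind order, starting at 1.
PAT1 = ["Mozilla/1", "Mozilla/2", "Lynx", "Mozilla/3"]


def startswith_index(user_agent: str) -> int:
    """Return the 1-based index of the first pattern that prefixes
    user_agent, or 0 if none does (assumed stand-in for STARTSWITH_INDEX)."""
    for index, pattern in enumerate(PAT1, start=1):
        if user_agent.startswith(pattern):
            return index
    return 0


def insert_before_suffix(url: str, marker: str) -> str:
    """Insert marker in front of the file extension, like insert_before
    on HTTP.REQ.URL.SUFFIX. URLs without a dot are returned unchanged."""
    stem, dot, suffix = url.rpartition(".")
    if not dot:
        return url
    return f"{stem}.{marker}{suffix}"


def rewrite_url(url: str, user_agent: str) -> str:
    """Apply the first matching policy, in priority order."""
    index = startswith_index(user_agent)
    if index == 4:                            # pol1: Mozilla/3
        return insert_before_suffix(url, "NS.")
    if 1 <= index <= 3:                       # pol2: Mozilla/1, Mozilla/2, Lynx
        return insert_before_suffix(url, "20.")
    return insert_before_suffix(url, "32.")   # pol3: no pattern matched


if __name__ == "__main__":
    for agent in ("Mozilla/3.0", "Lynx/2.8", "Opera/9.0"):
        print(agent, "->", rewrite_url("/MyPage.html", agent))
```

Unlike the Apache rules, which match only MyPage.html, the NetScaler policies as written contain no URL condition, so the sketch rewrites any URL that has a suffix.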

Blocking Access by Robots

You can block a robot from retrieving pages from a specific directory or set of directories to reduce the traffic to and from those directories. You can restrict access based on the requested location, or you can block requests based on information in the User-Agent HTTP header.

In the following examples, the Web location to be blocked is /~quux/foo/arc/, the IP addresses to be blocked are 123.45.67.8 and 123.45.67.9, and the robot’s name is NameOfBadRobot.

Apache mod_rewrite solution for blocking a path and a User-Agent header

RewriteCond %{HTTP_USER_AGENT} ^NameOfBadRobot.*

RewriteCond %{REMOTE_ADDR} ^123\.45\.67\.[8-9]$

RewriteRule ^/~quux/foo/arc/.+ - [F]
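
The following Python sketch illustrates how the Apache rule above evaluates a request. It is an illustration only; the function name is hypothetical. Consecutive RewriteCond directives without an [OR] flag are combined with a logical AND, so the request is refused (the [F] flag returns 403 Forbidden) only when the path, the User-Agent header, and the client address all match.

```python
import re

# The same regular expressions that appear in the Apache directives.
USER_AGENT_RE = re.compile(r"^NameOfBadRobot.*")
REMOTE_ADDR_RE = re.compile(r"^123\.45\.67\.[8-9]$")
PATH_RE = re.compile(r"^/~quux/foo/arc/.+")


def is_forbidden(path: str, user_agent: str, remote_addr: str) -> bool:
    """Return True when the rule would answer the request with 403."""
    return bool(
        PATH_RE.search(path)                       # RewriteRule pattern
        and USER_AGENT_RE.search(user_agent)       # first RewriteCond
        and REMOTE_ADDR_RE.search(remote_addr)     # second RewriteCond
    )


if __name__ == "__main__":
    print(is_forbidden("/~quux/foo/arc/page.html", "NameOfBadRobot/1.0", "123.45.67.8"))
    print(is_forbidden("/~quux/foo/arc/page.html", "NameOfBadRobot/1.0", "123.45.67.10"))
```

Because the rule pattern ends in .+, a request for the directory URL /~quux/foo/arc/ itself does not match; only paths below the directory are blocked.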

Citrix Systems 9.2 manual Blocking Access by Robots, 267