ProxySG & Advanced Secure Gateway

  • 1.  Bulk Lookup of URL Categorisation

    Posted Dec 05, 2017 10:57 AM
    I'm trying to tidy up a messy filtering policy on ProxySG, including implementing a consistent filtering policy across all countries. At present we have a large number of individual domains/URLs listed as exceptions because of our "unblock a web site" service request process. Many of the ones I've checked fall into categories I'm not planning to block going forward, such as advertising, so I would like to clear them out. I'd like to avoid searching for every address on the WebPulse Site Review page to check how it is currently categorised. Does anyone know a way to do a bulk lookup of these?


  • 2.  RE: Bulk Lookup of URL Categorisation

    Posted Dec 05, 2017 12:42 PM

    Hi Simon,

     

    The easiest way I can think of is to use a Linux script that queries the proxy and parses the results. I tested the script below against my lab proxy and it was able to pull the categories. The steps are below.

     

    Create a file called "urls.txt" with all the URLs, one per line.

    • Enable HTTP-Console in ProxySG
    • Create a bash script "script.sh" with the content below. Change the IP address to that of your proxy.
    #!/bin/bash
    
    url="http://192.168.1.10:8081/ContentFilter/TestUrl"
    for i in $(cat urls.txt); do
        content="$(curl -u admin:admin -s "$url/$i" | grep Blue |sed -n -e 's/^.*Coat: //p')")"
        echo "$i" "$content" >> Output.txt
    done
    
    • Put script.sh and urls.txt in the same directory.
    • Execute the script with ./script.sh (assuming chmod has been used to set the executable permission).
    • The output will be appended to a file called "Output.txt".

     

    My lab testing gave output in the format below, with one URL per line:

    [root@localhost ~]# cat Output.txt
    http://www.google.com Search Engines/Portals
    https://www.yahoo.com Search Engines/Portals
    https://www.symantec.com Technology/Internet
    http://www.mathrubhumi.com News/Media
    http://edition.cnn.com News/Media
    http://www.playboy.com Adult/Mature Content; Entertainment
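
    Once the results are in that format, pulling out the exceptions that fall into a given category is a small filtering job. The following is only a sketch: the function name urls_in_category is made up for illustration, and it assumes every line is "URL category[; category...]" exactly as in the sample above, with no leading whitespace.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): print the URLs from a
# results file whose category list contains the given category exactly.
# Assumes lines look like "URL category[; category...]", as in Output.txt.
urls_in_category() {
    local wanted="$1" file="${2:-Output.txt}"
    awk -v wanted="$wanted" '
        {
            url = $1
            # Everything after the first space is the category list.
            cats = substr($0, length(url) + 2)
            # Multiple categories are separated by "; ".
            n = split(cats, parts, /; */)
            for (k = 1; k <= n; k++) {
                if (parts[k] == wanted) { print url; break }
            }
        }
    ' "$file"
}
```

    For example, urls_in_category "News/Media" would list the URLs in Output.txt categorised as News/Media. The match is on the whole category name, so "Adult" would not match "Adult/Mature Content".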

     

    Note: I put this script together after a fair amount of searching on Google and with my minimal knowledge of bash, so it may or may not work as expected.



  • 3.  RE: Bulk Lookup of URL Categorisation

    Posted Dec 06, 2017 01:39 AM

    There is a small typo in the script: a stray )" at the end of the content= line. I'm unable to edit the comment, so the corrected version is below.

     

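    #!/bin/bash
    
    # TestUrl page on the proxy (change the IP address to that of your proxy)
    url="http://192.168.1.10:8081/ContentFilter/TestUrl"
    # Query each URL listed in urls.txt, one per line
    for i in $(cat urls.txt); do
        # Keep only the category text that follows "Coat: " on the line containing "Blue"
        content="$(curl -u admin:admin -s "$url/$i" | grep Blue |sed -n -e 's/^.*Coat: //p')"
        # Append "URL category" to the results file
        echo "$i" "$content" >> Output.txt
    done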
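
    If the URL list is long or comes from a spreadsheet export, a slightly more defensive variant of the same loop may be useful. This is a sketch under the same assumptions as the script above (same lab address, same admin:admin lab credentials, and a response line containing "Blue Coat: <categories>"). The names PROXY_TEST_URL, extract_category and lookup_all are made up for illustration. It reads the file line by line instead of word-splitting $(cat urls.txt), skips blank lines, and folds the grep and sed steps into one sed call.

```shell
#!/bin/bash
# Base URL of the proxy's TestUrl page, as in the script above (lab address; adjust).
PROXY_TEST_URL="${PROXY_TEST_URL:-http://192.168.1.10:8081/ContentFilter/TestUrl}"

# Pull the category text out of a TestUrl response read from stdin.
# Assumption: the response has a line of the form "... Blue Coat: <categories>".
# Mirrors the original "grep Blue | sed" pipeline in a single sed call.
extract_category() {
    sed -n -e '/Blue/s/^.*Coat: //p'
}

# Look up every non-empty line of the given file and print "URL categories".
lookup_all() {
    local file="$1" line category
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue    # skip blank lines
        # admin:admin are the lab credentials from the original script; replace them.
        category="$(curl -u admin:admin -s "$PROXY_TEST_URL/$line" | extract_category)"
        printf '%s %s\n' "$line" "$category"
    done < "$file"
}
```

    After sourcing these definitions, running lookup_all urls.txt > Output.txt should produce the same "URL category" lines as before, overwriting the file rather than appending to it.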

     

    Info which needs editing

     



  • 4.  RE: Bulk Lookup of URL Categorisation

    Posted Dec 06, 2017 08:01 AM

    Excellent, many thanks Arivind.

     



  • 5.  RE: Bulk Lookup of URL Categorisation

    Posted Dec 06, 2017 10:15 AM

    Hi Simon,

     

    Please share your feedback after trying this.