Extracting links from a URL to Maltego
Another recipe in this book illustrates how to use the BeautifulSoup library to programmatically get domain names. This recipe shows you how to create a local Maltego transform, which you can then use within Maltego itself to generate information in an easy-to-use, graphical way. The links gathered by this transform can also be used as part of a larger spidering or crawling solution.
How to do it…
The following code shows how you can create a script that will output the enumerated information into the correct format for Maltego:
import urllib2
from bs4 import BeautifulSoup
import sys

tarurl = sys.argv[1]
if tarurl[-1] == "/":
    tarurl = tarurl[:-1]
print "<MaltegoMessage>"
print "<MaltegoTransformResponseMessage>"
print " <Entities>"
url = urllib2.urlopen(tarurl).read()
soup = BeautifulSoup(url)
for line in soup.find_all('a'):
    newline = line.get('href')
    if newline[:4] == "http":
        print "<Entity...
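The listing above is cut off partway through the entity output, so the following is only a minimal Python 3 sketch of the same idea, not the recipe's original code. It swaps BeautifulSoup for the standard library's html.parser so that it runs without third-party packages. It also skips anchors that have no href, a case in which the slice in the listing above would raise an error. The names LinkCollector and build_maltego_response are hypothetical, and the maltego.Domain entity type is an assumption, because the source listing ends before the entity type is shown.

```python
import sys
from html.parser import HTMLParser
from urllib.request import urlopen
from xml.sax.saxutils import escape


class LinkCollector(HTMLParser):
    """Collect href values that start with "http" from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Anchors without an href give None, so check before testing the prefix.
        if href and href.startswith("http"):
            self.links.append(href)


def build_maltego_response(html, entity_type="maltego.Domain"):
    """Return the transform response XML for the links found in html.

    entity_type is an assumption made for this sketch.
    """
    collector = LinkCollector()
    collector.feed(html)
    lines = [
        "<MaltegoMessage>",
        "<MaltegoTransformResponseMessage>",
        "  <Entities>",
    ]
    for link in collector.links:
        lines.append('    <Entity Type="%s">' % entity_type)
        # Escape &, < and > so the response stays well-formed XML.
        lines.append("      <Value>%s</Value>" % escape(link))
        lines.append("    </Entity>")
    lines += [
        "  </Entities>",
        "</MaltegoTransformResponseMessage>",
        "</MaltegoMessage>",
    ]
    return "\n".join(lines)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Maltego passes the value of the selected entity as the first argument.
    target = sys.argv[1].rstrip("/")
    page = urlopen(target).read().decode("utf-8", errors="replace")
    print(build_maltego_response(page))
```

Keeping the XML generation in its own function means the output can be checked against a fixed HTML string without fetching a live page. Only the final block touches the network.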