View Full Version : Extracting links from webpages



Midavalo
18-04-2005, 10:00 AM
Is there a tool (an add-on to IE6 perhaps) I can get from the net somewhere that I can use to extract page links into a txt or html file, along with the description (or whatever it's called) of the link?

Thanks,

Midavalo.

Speedy Gonzales
18-04-2005, 10:07 AM
Select the link and right mouse / copy shortcut?

Midavalo
18-04-2005, 10:13 AM
Select the link and right mouse / copy shortcut?
Thanks :) I meant all links on a page, at once. If there were only 3 or 4 links on a page then that method would be fine, but not when there are 100+ links per page, plus numerous pages.

Cheers,
M.

tweak'e
18-04-2005, 10:21 AM
Depending on what you're wanting to do with the links... I know some download managers can extract all the links in a page (handy for bulk downloading).

vinref
18-04-2005, 10:23 AM
The Lynx text browser (runs in a DOS window) can do exactly what you want except give you a description of the linked site.

Midavalo
18-04-2005, 10:58 AM
The Lynx text browser (runs in a DOS window) can do exactly what you want except give you a description of the linked site.
What I mean by "description" is where in the html you include the link URL, but then just provide a word or two as the link... like <a href="http://www.pressf1.co.nz">Press F1</a> or however it works. I'd call the "Press F1" bit the description.

M.
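
The "description" described above is the text between the opening and closing <a> tags. As a rough sketch of how a tool could pull out both the URL and that text, assuming Python and its standard html.parser module (the names LinkCollector and extract_links are made up for this illustration, not part of any tool mentioned in the thread):

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (url, link text) pairs from <a href=...> elements."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (url, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased from HTMLParser.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def extract_links(html):
    """Return a list of (url, link text) pairs found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

For example, extract_links('<a href="http://www.pressf1.co.nz"> Press F1 </a>') returns [('http://www.pressf1.co.nz', 'Press F1')].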

Midavalo
18-04-2005, 10:59 AM
Depending on what you're wanting to do with the links... I know some download managers can extract all the links in a page (handy for bulk downloading).
They're not download links that I want - just links to other pages/sites or wherever. I've got a download manager that can extract all the links, but it only outputs to the download manager, which is no use to me :)

cheers,
M.

Rob99
18-04-2005, 02:17 PM
File - Save Page As - Web page

sal
18-04-2005, 02:21 PM
Is there a tool (an add-on to IE6 perhaps) I can get from the net somewhere that I can use to extract page links into a txt or html file, along with the description (or whatever it's called) of the link?

Thanks,

Midavalo.
What format would you like the links output in? e.g.

http://site.com/link1.html
http://site.com/link2.html
http://site.com/link3.html
http://site.com/link4.html

or more advanced than that?

KiwiTT_NZ
18-04-2005, 02:27 PM
Save the webpage to a temp dir, then run the following at the command prompt

e.g.

c:
cd \temp
find /i "http://" c:\temp\webpage.html.* >links.txt
notepad links.txt


This writes each line containing "http://" to a file called links.txt
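
The find command above prints whole lines that contain the search string, so the result is a list of matching lines rather than a list of links. A small sketch of the same filter, assuming Python (the function name lines_containing and the sample HTML are invented for illustration), makes that behaviour easier to see:

```python
def lines_containing(text, needle="http://"):
    """Return the lines of text that contain needle, ignoring case.

    This mirrors a case-insensitive line search: each match is the
    whole line, not just the URL inside it.
    """
    needle = needle.lower()
    return [line for line in text.splitlines() if needle in line.lower()]


# Hypothetical saved page, three lines long.
sample = (
    "<p>Intro</p>\n"
    '<a HREF="HTTP://site.com/link1.html">One</a> and some other text\n'
    '<a href="/relative.html">Two</a>'
)

matches = lines_containing(sample)
```

Here matches holds only the second line, complete with the surrounding markup and text. The relative link on the third line is missed because it never contains "http://".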

sal
18-04-2005, 03:26 PM
Save the webpage to temp dir then at the command prompt...
Doesn't appear very effective. Let me know how the following fares, Midavalo:

ImageF1 > Ext > Get Links (http://www.sal.neoburn.net/imagef1/ext/getlinks/)

Midavalo
18-04-2005, 06:56 PM
File - Save Page As - Web page
That still leaves me with the same problem - how to extract the links :)

M.

Midavalo
18-04-2005, 06:57 PM
What format would you like the links output? eg

http://site.com/link1.html
http://site.com/link2.html
http://site.com/link3.html
http://site.com/link4.html

or more advanced than that?
I'd also want the text that's displayed on the link (if you know what I mean?) if possible...

M.

Midavalo
18-04-2005, 06:59 PM
Doesn't appear very effective. Let me know how the following fares, Midavalo:

ImageF1 > Ext > Get Links (http://www.sal.neoburn.net/imagef1/ext/getlinks/)
Thanks sal - that's almost what I want :D What I'm hunting for is something where I can right-click and select from a menu, or click a button from the toolbar, while already viewing a webpage wherever it might be, and it just finds all the links and saves them to file or something. If I can't find anything though, your little page will definitely be the next best thing :)

Cheers,
M.

sal
19-04-2005, 01:13 AM
May I ask what you require this functionality for?

Midavalo
19-04-2005, 08:06 AM
May I ask what you require this functionality for?
Sure :)

I'm using a website that seems to be updated regularly, so I want to keep track of the links (that's 90% of what is on the pages on the site) so I can check them out before the page is updated and some of the links might disappear. Then I can keep my own record of the links, and browse to them in my own time rather than having to check them all out and putting them in my favorites or wherever before they disappear from the website. There are a lot of pages with a lot of links, that's why I'm hoping to find a tool that I can stick on my toolbar that I can just use to extract the links from whatever page I might be on.

Cheers,
M.
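
Keeping a personal record that survives page updates comes down to adding newly seen links to a stored list without dropping the ones that have since vanished from the site. A minimal sketch of that idea, assuming Python and assuming the links have already been extracted as (URL, link text) pairs (merge_links, format_record and the link texts are hypothetical; the URLs reuse the site.com examples from earlier in the thread):

```python
def merge_links(record, found):
    """Add newly found (url, text) pairs to record, keyed by URL.

    Entries already in the record are kept, even if the page
    no longer lists them.
    """
    for url, text in found:
        record.setdefault(url, text)
    return record


def format_record(record):
    """Format one link per line: URL, a tab, then the link text."""
    return "".join(f"{url}\t{text}\n" for url, text in record.items())


record = {}

# First visit: the page lists link1 and link2.
merge_links(record, [
    ("http://site.com/link1.html", "One"),
    ("http://site.com/link2.html", "Two"),
])

# Later visit: link1 has disappeared and link3 is new.
merge_links(record, [
    ("http://site.com/link2.html", "Two"),
    ("http://site.com/link3.html", "Three"),
])

report = format_record(record)
```

After both visits the record still contains link1 alongside link2 and link3, and report can be written to a text file with an ordinary open(...).write(report).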