  1. #1
    morgenmuffel
    Join Date
    Dec 2004
    Location
    Putaruru
    Posts
    2,905

    Regular Expression help - only preserve contents of Anchor tags

    Hi all

    I am currently trying to get info out of a former FrontPage site, which is a mess, to put it bluntly.

    Basically, all I want out of the pages is the anchor links, like the one below:

    <a href="/files/sopwith/camel.htm">Biggles and Algie</a>

    While finding them should be easy, the sheer amount of extraneous tags is making the going painful, and I can't get my regular expressions working.
    Morgenmuffel - This word needs to be a part of the english language, in fact you should use it in everyday conversation

  2. #2
    morgenmuffel

    Re: Regular Expression help - only preserve contents of Anchor tags

    This works to find the links, but what I want it to do is remove everything else, and I am blowed if I can figure it out. I also can't get the code below to work in Notepad++, but it works in an elderly version of Dreamweaver.

    Code:
    <a\b[^>]*>(.*?)</a>
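
For anyone wanting to try the pattern outside an editor, here is a minimal sketch in Python (my choice of language, not something used in the thread). The sample markup is invented for illustration. Group 0 of each match is the whole anchor tag and group 1 is just the link text.

```python
import re

# The pattern from the post: an opening <a ...> tag, the lazily
# captured link text, then the closing </a>.
ANCHOR = re.compile(r"<a\b[^>]*>(.*?)</a>")

# Hypothetical FrontPage-style markup, made up for this example.
sample = (
    '<p><font size="2">'
    '<a href="/files/sopwith/camel.htm">Biggles and Algie</a>'
    "</font></p>"
)

# group(0) is the complete tag, group(1) is only the text between the tags.
whole_tags = [m.group(0) for m in ANCHOR.finditer(sample)]
link_texts = [m.group(1) for m in ANCHOR.finditer(sample)]

print(whole_tags)
print(link_texts)
```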

  3. #3
    morgenmuffel

    Re: Regular Expression help - only preserve contents of Anchor tags

    I take that back: the above code is only finding some of the links, not all of them, as it isn't finding any that have line breaks in them, e.g.
    <a href="/files/sopwith/camel.htm">Biggles and Algie
    </a>
    Dammit, my brain is now officially hurting.
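
The likely cause is that `.` does not match a line break by default in most regex engines, so `(.*?)` cannot reach a `</a>` that sits on the next line. A small sketch in Python (an assumed language, not one from the thread) shows the miss, and shows that a "dot matches newline" mode, `re.DOTALL` in Python, is one way around it.

```python
import re

# The anchor from the post, with the line break before the closing tag.
multiline = '<a href="/files/sopwith/camel.htm">Biggles and Algie\n</a>'

# Same pattern twice: once with default flags, once with DOTALL.
plain = re.compile(r"<a\b[^>]*>(.*?)</a>")
dotall = re.compile(r"<a\b[^>]*>(.*?)</a>", re.DOTALL)

# By default "." stops at the newline, so nothing is found.
missed = plain.findall(multiline)

# With DOTALL "." also matches the newline, so the link text is captured.
found = dotall.findall(multiline)

print(missed)
print(found)
```

Note that the captured text still contains the line break, so it may need stripping before use.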

  4. #4
    morgenmuffel
    Join Date
    Dec 2004
    Location
    Putaruru
    Posts
    2,905

    Re: Regular Expression help - only preserve contents of Anchor tags

    Eureka-ish
    Code:
    <a\b[^>]*>([\s\S]+?)</a>
    Probably not the most elegant, and I still can't work out how to get rid of all the other text on the page, or pipe the result into a new file on Windows.
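
One way to drop everything else is to stop thinking of it as search and replace: collect the matches and write only those to a new file. The sketch below does that in Python, which runs on Windows. It is an illustration under assumptions, not a tested fix for the site in question: the function names and file names are hypothetical, UTF-8 is an assumed encoding, and the case-insensitive flag is my addition in case the pages use `<A HREF=...>`.

```python
import re
from pathlib import Path

# The pattern from the post, with the opening tag captured as well so the
# anchor can be rebuilt on a single line. IGNORECASE is an added assumption.
ANCHOR = re.compile(r"(<a\b[^>]*>)([\s\S]+?)</a>", re.IGNORECASE)


def keep_only_anchors(html: str) -> str:
    """Return only the anchor tags found in html, one per line."""
    anchors = []
    for match in ANCHOR.finditer(html):
        # Collapse line breaks and repeated whitespace to single spaces.
        open_tag = " ".join(match.group(1).split())
        text = " ".join(match.group(2).split())
        anchors.append(f"{open_tag}{text}</a>")
    return "\n".join(anchors)


def convert_file(source: Path, target: Path) -> None:
    """Read one page and write just its anchors to a new file."""
    # The encoding is a guess; undecodable bytes are replaced, not fatal.
    html = source.read_text(encoding="utf-8", errors="replace")
    target.write_text(keep_only_anchors(html) + "\n", encoding="utf-8")


# Example call with hypothetical file names:
# convert_file(Path("camel_page.htm"), Path("links.txt"))
```

One thing to watch: because the pattern uses `+?`, an empty anchor such as `<a name="top"></a>` cannot match on its own, and the match will instead run on to the next `</a>` in the page.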
